Describe dataset availability via distribution/file download
We have provided quite a bit of information about the dataset, but we have not yet described how others can access and download the data. Since we offer the data as a file from GitHub, we can describe this as a Distribution and add information about it.
Note: We distinguish between Dataset, Distribution, and Data Service/API to clarify that a dataset can be made available in multiple ways. In this context, dataset is a fairly abstract concept, while the concrete files and endpoints from which you can retrieve the data are called distribution and data service.
We first need to state that the dataset has a distribution. For this, we use dcat:distribution
, which should point to a Distribution Resource described below.
<https://data.digdir.no/datasets/ai_projects_norwegian_state_dataset> rdf:type dcat:Dataset ; # ... dcat:distribution <https://data.digdir.no/datasets/ai_projects_norwegian_state_distribution> ; . <https://data.digdir.no/datasets/ai_projects_norwegian_state_distribution> rdf:type dcat:Distribution .
Here we have defined a new resource with URI https://data.digdir.no/datasets/ai_projects_norwegian_state_distribution
and state that it is of type dcat:Distribution
.
Access URL and Download URL
The dataset is available as a file download from Digdir's GitHub page, https://github.com/Informasjonsforvaltning/ai-project-service/blob/main/ai_projects_norwegian_state%20-%20Oversatt_v1.csv
, and we can point to this with the property dcat:accessURL
. This is the only mandatory property for a distribution description.
However, we also want to add more information, for example, the direct link to the file: https://raw.githubusercontent.com/Informasjonsforvaltning/ai-project-service/main/ai_projects_norwegian_state%20-%20Oversatt_v1.csv
, which we can point to with dcat:downloadURL
. The distribution description will then look like this:
<https://data.digdir.no/datasets/ai_projects_norwegian_state_distribution> rdf:type dcat:Distribution ; dcat:accessURL <https://github.com/Informasjonsforvaltning/ai-project-service/blob/main/ai_projects_norwegian_state%20-%20Oversatt_v1.csv> ; dcat:downloadURL <https://raw.githubusercontent.com/Informasjonsforvaltning/ai-project-service/main/ai_projects_norwegian_state%20-%20Oversatt_v1.csv> ; .
Textual Description and Publication Date
We add a textual description of the distribution and when it was published, using the fields dct:description
and dct:issued
for this:
<https://data.digdir.no/datasets/ai_projects_norwegian_state_distribution> rdf:type dcat:Distribution ; dct:description "CSV-fil med oversikt over kunstig intelligens-prosjekter i offentlig sektor"@nb ; dct:issued "2023-02-23"^^xsd:date ; .
License
The distribution has a CC0 license, which we specify with the property dct:license
. This property must point to a controlled vocabulary/code list from the EU.
<https://data.digdir.no/datasets/ai_projects_norwegian_state_distribution> rdf:type dcat:Distribution ; dct:license <http://publications.europa.eu/resource/authority/licence/CC0> ; .
Examples of some other licenses from the EU code list are:
http://publications.europa.eu/resource/authority/licence/APACHE_2_0
http://publications.europa.eu/resource/authority/licence/CC0
http://publications.europa.eu/resource/authority/licence/CC_BY_4_0
Format and Language
We will specify the format of the file, which in our case is a CSV file; for this, we use the property dct:format
and point to a code in the EU vocabulary "File Type" that represents CSV.
Additionally, the content of the dataset is in Norwegian - Bokmål. We specify this with the property dct:language
and point to the code from the EU vocabulary that indicates Norwegian - Bokmål:
<https://data.digdir.no/datasets/ai_projects_norwegian_state_distribution> rdf:type dcat:Distribution ; dct:format <http://publications.europa.eu/resource/authority/file-type/CSV> ; dct:language <http://publications.europa.eu/resource/authority/language/NOB> ; .
Complete Distribution Description
The complete distribution description will then look like this:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> . @prefix foaf: <http://xmlns.com/foaf/0.1/> . @prefix dcat: <http://www.w3.org/ns/dcat#> . @prefix dct: <http://purl.org/dc/terms/> . @prefix prov: <http://www.w3.org/ns/prov#> . @prefix vcard: <http://www.w3.org/2006/vcard/ns#> . <https://data.digdir.no/datasets/ai_projects_norwegian_state_dataset> rdf:type dcat:Dataset ; # ... dcat:distribution <https://data.digdir.no/datasets/ai_projects_norwegian_state_distribution> ; . <https://data.digdir.no/datasets/ai_projects_norwegian_state_distribution> rdf:type dcat:Distribution ; dcat:accessURL <https://github.com/Informasjonsforvaltning/ai-project-service/blob/main/ai_projects_norwegian_state%20-%20Oversatt_v1.csv> ; dcat:downloadURL <https://raw.githubusercontent.com/Informasjonsforvaltning/ai-project-service/main/ai_projects_norwegian_state%20-%20Oversatt_v1.csv> ; dct:description "CSV-fil med oversikt over kunstig intelligens-prosjekter i offentlig sektor"@nb ; dct:issued "2023-02-23"^^xsd:date ; dct:license <http://publications.europa.eu/resource/authority/licence/CC0> ; dct:format <http://publications.europa.eu/resource/authority/file-type/CSV> ; dct:language <http://publications.europa.eu/resource/authority/language/NOB> ; .