Describe dataset availability via distribution/file download
We have provided quite a bit of information about the dataset, but we have not yet described how others can access and download the data. Since we offer the data as a file from GitHub, we can describe this as a Distribution and add information about it.
Note: We distinguish between Dataset, Distribution, and Data Service/API to clarify that a dataset can be made available in multiple ways. In this context, dataset is a fairly abstract concept, while the concrete files and endpoints from which you can retrieve the data are called distribution and data service.
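To make the distinction concrete, here is a minimal sketch of one dataset offered through two distributions and one data service. All example.org URIs are hypothetical and used only for illustration. dcat:DataService, dcat:endpointURL, and dcat:servesDataset are standard DCAT terms that are not otherwise used in this guide.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .

# Hypothetical URIs under example.org, for illustration only.
<https://example.org/datasets/sample_dataset>
    rdf:type dcat:Dataset ;
    # The same abstract dataset is offered as two concrete files.
    dcat:distribution <https://example.org/datasets/sample_distribution_csv> ,
                      <https://example.org/datasets/sample_distribution_json> .

<https://example.org/datasets/sample_distribution_csv>
    rdf:type dcat:Distribution ;
    dcat:accessURL <https://example.org/files/sample.csv> .

<https://example.org/datasets/sample_distribution_json>
    rdf:type dcat:Distribution ;
    dcat:accessURL <https://example.org/files/sample.json> .

# An API that serves the same dataset is described as a data service.
<https://example.org/dataservices/sample_api>
    rdf:type dcat:DataService ;
    dcat:endpointURL <https://example.org/api/v1/> ;
    dcat:servesDataset <https://example.org/datasets/sample_dataset> .
```

The dataset in this guide has only a single file distribution, so the rest of the section describes just that one resource.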
We first need to state that the dataset has a distribution. For this, we use dcat:distribution, which should point to a Distribution Resource described below.
<https://data.digdir.no/datasets/ai_projects_norwegian_state_dataset>
    rdf:type dcat:Dataset ;
    # ...
    dcat:distribution <https://data.digdir.no/datasets/ai_projects_norwegian_state_distribution> ;
    .

<https://data.digdir.no/datasets/ai_projects_norwegian_state_distribution>
    rdf:type dcat:Distribution .
Here we have defined a new resource with URI https://data.digdir.no/datasets/ai_projects_norwegian_state_distribution and state that it is of type dcat:Distribution.
Access URL and Download URL
The dataset is available as a file download from Digdir's GitHub page, https://github.com/Informasjonsforvaltning/ai-project-service/blob/main/ai_projects_norwegian_state%20-%20Oversatt_v1.csv. We point to this page with the property dcat:accessURL, which is the only mandatory property in a distribution description.
We also want to give the direct link to the file itself, https://raw.githubusercontent.com/Informasjonsforvaltning/ai-project-service/main/ai_projects_norwegian_state%20-%20Oversatt_v1.csv, which we point to with dcat:downloadURL. The distribution description then looks like this:
<https://data.digdir.no/datasets/ai_projects_norwegian_state_distribution>
    rdf:type dcat:Distribution ;
    dcat:accessURL <https://github.com/Informasjonsforvaltning/ai-project-service/blob/main/ai_projects_norwegian_state%20-%20Oversatt_v1.csv> ;
    dcat:downloadURL <https://raw.githubusercontent.com/Informasjonsforvaltning/ai-project-service/main/ai_projects_norwegian_state%20-%20Oversatt_v1.csv> ;
    .
Textual Description and Publication Date
We add a textual description of the distribution with dct:description and state when it was published with dct:issued:
<https://data.digdir.no/datasets/ai_projects_norwegian_state_distribution>
    rdf:type dcat:Distribution ;
    dct:description "CSV-fil med oversikt over kunstig intelligens-prosjekter i offentlig sektor"@nb ;
    dct:issued "2023-02-23"^^xsd:date ;
    .
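Because RDF literals carry language tags, the same property can be repeated to give the description in more than one language. The following is a minimal sketch, assuming an English description is also wanted. The English text is our own translation of the Norwegian literal and is not part of the published description.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .

<https://data.digdir.no/datasets/ai_projects_norwegian_state_distribution>
    rdf:type dcat:Distribution ;
    # One dct:description value per language, separated by a comma.
    # The @en literal is an assumed translation, added for illustration.
    dct:description "CSV-fil med oversikt over kunstig intelligens-prosjekter i offentlig sektor"@nb ,
                    "CSV file with an overview of artificial intelligence projects in the public sector"@en ;
    .
```

A consumer can then select the literal whose language tag matches the user's preferred language.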
License
The distribution is released under a CC0 license, which we state with the property dct:license. The value must be a license URI taken from the EU's controlled vocabulary (code list) of licenses.
<https://data.digdir.no/datasets/ai_projects_norwegian_state_distribution>
    rdf:type dcat:Distribution ;
    dct:license <http://publications.europa.eu/resource/authority/licence/CC0> ;
    .
Examples of some other licenses from the EU code list are:
http://publications.europa.eu/resource/authority/licence/APACHE_2_0
http://publications.europa.eu/resource/authority/licence/CC0
http://publications.europa.eu/resource/authority/licence/CC_BY_4_0
Format and Language
The file is a CSV file. We state this with the property dct:format, pointing to the code for CSV in the EU vocabulary "File Type".
The content of the dataset is in Norwegian Bokmål. We state this with the property dct:language, pointing to the code for Norwegian Bokmål in the EU language vocabulary:
<https://data.digdir.no/datasets/ai_projects_norwegian_state_distribution>
    rdf:type dcat:Distribution ;
    dct:format <http://publications.europa.eu/resource/authority/file-type/CSV> ;
    dct:language <http://publications.europa.eu/resource/authority/language/NOB> ;
    .
Complete Distribution Description
The complete distribution description will then look like this:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix vcard: <http://www.w3.org/2006/vcard/ns#> .
# xsd is needed for the "..."^^xsd:date literal below.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<https://data.digdir.no/datasets/ai_projects_norwegian_state_dataset>
    rdf:type dcat:Dataset ;
    # ...
    dcat:distribution <https://data.digdir.no/datasets/ai_projects_norwegian_state_distribution> ;
    .

<https://data.digdir.no/datasets/ai_projects_norwegian_state_distribution>
    rdf:type dcat:Distribution ;
    dcat:accessURL <https://github.com/Informasjonsforvaltning/ai-project-service/blob/main/ai_projects_norwegian_state%20-%20Oversatt_v1.csv> ;
    dcat:downloadURL <https://raw.githubusercontent.com/Informasjonsforvaltning/ai-project-service/main/ai_projects_norwegian_state%20-%20Oversatt_v1.csv> ;
    dct:description "CSV-fil med oversikt over kunstig intelligens-prosjekter i offentlig sektor"@nb ;
    dct:issued "2023-02-23"^^xsd:date ;
    dct:license <http://publications.europa.eu/resource/authority/licence/CC0> ;
    dct:format <http://publications.europa.eu/resource/authority/file-type/CSV> ;
    dct:language <http://publications.europa.eu/resource/authority/language/NOB> ;
    .