Nasjonalbiblioteket

Norwegian Anaphora Resolution Corpus

Description

Norwegian-BokmaalNARC and Norwegian-NynorskNARC are conversions of the Bokmål and Nynorsk parts of the Norwegian Anaphora Resolution Corpus (NARC), respectively. This is the first publicly available corpus annotated with anaphoric relations between noun phrases for Norwegian.

The anaphora annotation is layered on top of the existing annotation in the Norwegian Dependency Treebank (NDT) and enriches it. The resulting corpus contains a total of 15,742 sentences and 245,515 tokens for Norwegian Bokmål, and 12,481 sentences and 206,660 tokens for Norwegian Nynorsk.
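
Sentence and token totals like those above can be checked against a local copy of the data. The following is a minimal sketch, assuming the files are in CoNLL-U format (tab-separated columns, blank lines between sentences, comment lines starting with `#`); this page does not state the file format, so that is an assumption. The function name `count_conllu` and the sample lines are hypothetical and only for illustration.

```python
def count_conllu(lines):
    """Count sentences and tokens in an iterable of CoNLL-U lines.

    Assumption: the input follows CoNLL-U conventions. Multiword-token
    ranges (IDs like "1-2") and empty nodes (IDs like "1.1") are skipped,
    so only regular word lines are counted as tokens.
    """
    sentences = 0
    tokens = 0
    in_sentence = False
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            in_sentence = False
            continue
        if line.startswith("#"):
            # Sentence-level comment or metadata line.
            continue
        token_id = line.split("\t", 1)[0]
        if "-" in token_id or "." in token_id:
            continue
        if not in_sentence:
            sentences += 1
            in_sentence = True
        tokens += 1
    return sentences, tokens


# Tiny made-up sample (not taken from the corpus).
sample = [
    "# sent_id = 1\n",
    "1\tHun\t_\t_\t_\t_\t_\t_\t_\t_\n",
    "2\tsov\t_\t_\t_\t_\t_\t_\t_\t_\n",
    "\n",
    "# sent_id = 2\n",
    "1\tJa\t_\t_\t_\t_\t_\t_\t_\t_\n",
    "\n",
]
print(count_conllu(sample))
```

For a real file, an open file handle can be passed directly, for example `count_conllu(open(path, encoding="utf-8"))`, where `path` is wherever the extracted file lives. Whether the result matches the published totals depends on how tokens were counted for those figures.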

The accompanying paper by Mæhlum et al. (CRAC 2022) describes the annotation effort in more detail.

Distributions
1

Download
Description: Not provided
Access URL: https://hdl.handle.net/21.11146/82
Direct download: https://www.nb.no/sbfil/tekst/NARC_1_1.zip
API: Not provided
Documentation: Not provided
License:
Conforms to: Not provided
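
Since no documentation is listed for the distribution, a quick way to see what the archive contains is to download it and list its members. The following is a rough sketch using only the Python standard library and the direct download URL given above. The helper names `fetch_bytes`, `list_archive` and `main` are hypothetical, and nothing is assumed about the file names inside the archive.

```python
import io
import urllib.request
import zipfile

# Direct download URL as listed on this page.
NARC_URL = "https://www.nb.no/sbfil/tekst/NARC_1_1.zip"


def fetch_bytes(url, timeout=60):
    """Download a URL and return the response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def list_archive(data):
    """Return (member name, uncompressed size) pairs for a zip held in memory."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]


def main():
    # Needs network access; prints one line per archive member.
    for name, size in list_archive(fetch_bytes(NARC_URL)):
        print(f"{size:>12}  {name}")
```

Calling `main()` performs the download and prints the listing. Holding the whole archive in memory keeps the sketch short; for repeated use it may be preferable to save the file to disk first.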

APIs providing this dataset
0

No registered APIs provide this dataset.

Similar datasets

Norsk Ordbank - Norwegian Nynorsk 2005-2012 (Nasjonalbiblioteket): Public access
Translation Memories from Semantix AS (Nasjonalbiblioteket): Public access
NST Pronunciation Lexicon for Swedish (Nasjonalbiblioteket): Public access
Grapheme-to-Phoneme Models for Norwegian (Nasjonalbiblioteket): Public access
spaCy for Norwegian Nynorsk (Nasjonalbiblioteket): Public access