Nasjonalbiblioteket

N-gram - Norwegian Nynorsk

Description

This corpus is a collection of n-grams (n = 1 to 6) based on approximately 60 million words of running text from the Nynorsk part of the Norwegian Newspaper Corpus and the text corpus of Nordic Language Technology AS.

This version contains all the n-grams in two orderings: sorted by frequency and sorted alphabetically.

A version listing only the 1000 most frequent n-grams can also be downloaded. Frequency lists (unigrams) are also available for download separately.
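To make the description concrete, the following is a minimal sketch of how n-gram lists of this kind could be produced: counts for n = 1 to 6, one ordering by frequency, one alphabetical ordering, and a top-k cut analogous to the 1000 most frequent n-grams. It is not the procedure used to build this corpus. The function name `count_ngrams`, the whitespace tokenization, the toy sentence, and the alphabetical tie-break among equally frequent n-grams are all assumptions made for illustration.

```python
from collections import Counter


def count_ngrams(tokens, max_n=6):
    """Count every n-gram for n = 1..max_n in a list of tokens (hypothetical helper)."""
    counts = Counter()
    for n in range(1, max_n + 1):
        # Slide a window of length n across the token list.
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


# Toy input; plain whitespace splitting is an assumption, not the corpus tokenizer.
tokens = "eg veit ikkje kva eg veit".split()
counts = count_ngrams(tokens)

# Ordering 1: most frequent first; ties broken alphabetically (an assumption).
by_frequency = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

# Ordering 2: alphabetical by the n-gram's tokens.
alphabetical = sorted(counts.items(), key=lambda kv: kv[0])

# A "top-k" list, analogous to the version with only the 1000 most frequent n-grams.
top_k = by_frequency[:3]

for ngram, freq in top_k:
    print(freq, " ".join(ngram))
```

On a real corpus the same counting would normally be done per n and streamed to disk, since holding all 1- to 6-grams of 60 million words in memory at once may not be practical.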

Distributions
1

Download
Description:
Not provided
Access URL:
https://hdl.handle.net/21.11146/8
Direct download:
  1. https://www.nb.no/sbfil/dok/ngram_nno.pdf
API:
Not provided
Documentation:
Not provided
License:
Conforms to:
Not provided

APIs providing this dataset
0

No registered APIs provide this dataset.

Similar datasets

Norsk Ordbank - Norwegian Nynorsk 2005-2012 (Nasjonalbiblioteket)
Public access
Translation Memories from Semantix AS (Nasjonalbiblioteket)
Public access
NST Pronunciation Lexicon for Swedish (Nasjonalbiblioteket)
Public access
Grapheme-to-Phoneme Models for Norwegian (Nasjonalbiblioteket)
Public access
spaCy for Norwegian Nynorsk (Nasjonalbiblioteket)
Public access