N-gram - Norwegian Bokmål

Description

These n-grams (n = 1 to 6) were built from the texts in the Norwegian Newspaper Corpus and the news texts in the text corpus from Nordic Language Technology AS (NST). In total, the source material comprises 1175 million words of running text.

The n-grams are provided in two orderings: sorted alphabetically and sorted by frequency. Frequency lists (unigrams) are published in a separate download. A simplified version listing the 1000 most frequent n-grams is also available for download.
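
As a rough illustration of what n-gram lists of this kind contain, the following minimal Python sketch counts n-grams for n = 1 to 6 over a tokenized text and produces both an alphabetical and a frequency ordering. It is not the procedure used to build this dataset: the whitespace tokenization, the function names, the tie-breaking rule, and the toy sentence are all assumptions made for the example, and the file format of the actual downloads is not described here.

```python
from collections import Counter


def count_ngrams(tokens, max_n=6):
    """Count every n-gram of length 1..max_n in a list of tokens."""
    counts = {n: Counter() for n in range(1, max_n + 1)}
    for n in range(1, max_n + 1):
        # Slide a window of width n across the token list.
        for i in range(len(tokens) - n + 1):
            counts[n][tuple(tokens[i:i + n])] += 1
    return counts


def sorted_alphabetically(counter):
    """Return (ngram, count) pairs ordered by the n-gram itself."""
    return sorted(counter.items(), key=lambda item: item[0])


def sorted_by_frequency(counter, top=None):
    """Return (ngram, count) pairs, most frequent first.

    Ties are broken alphabetically (an assumption of this sketch).
    Pass top=1000 for a shortened list of the most frequent n-grams.
    """
    ranked = sorted(counter.items(), key=lambda item: (-item[1], item[0]))
    return ranked if top is None else ranked[:top]


# Toy input, tokenized naively on whitespace (hypothetical example text).
tokens = "det er en fin dag og det er en god dag".split()
counts = count_ngrams(tokens)

for ngram, freq in sorted_by_frequency(counts[2], top=3):
    print(" ".join(ngram), freq)
```

A real corpus of this size would need streaming input and on-disk aggregation rather than in-memory counters; the sketch only shows the counting and the two sort orders.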

Distributions
1

Download
Description:
Not provided
Access URL:
https://hdl.handle.net/21.11146/12
Direct download:
  1. https://www.nb.no/sbfil/dok/ngram_nob.pdf
API:
Not provided
Documentation:
Not provided
License:
Conforms to:
Not provided

APIs providing this dataset
0

No registered APIs provide this dataset.