Nasjonalbiblioteket

NST N-gram - Swedish

Description

This collection of n-grams (n = 1 to 6) was produced from approximately 400 million words of running text in the Swedish text corpus of Nordic Language Technology AS. The collection contains all the n-grams, sorted alphabetically and by frequency, respectively. A second format is also available that makes it possible to select text types; this version contains more texts and is based on approximately 437 million words. A simplified version, listing the 1,000 most frequent n-grams, is also available separately.
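
For readers unfamiliar with the terms, the following is a minimal sketch of what counting n-grams for n = 1 to 6 and producing an alphabetical and a frequency-sorted listing can look like. It is not the procedure used to build this dataset, which the description does not specify. The function name `count_ngrams`, the toy sentence, and the whitespace tokenization are all assumptions made for illustration.

```python
from collections import Counter


def count_ngrams(tokens, max_n=6):
    """Count every n-gram of length 1..max_n in a list of tokens.

    Hypothetical helper for illustration; not part of the NST distribution.
    """
    counts = Counter()
    for n in range(1, max_n + 1):
        # Slide a window of width n across the token list.
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


# Toy input; splitting on whitespace is an assumption, real corpora
# usually need proper tokenization and normalization.
tokens = "det var en gång en katt och en hund".split()
counts = count_ngrams(tokens)

# Two listings of the same counts: one alphabetical, one by frequency.
alphabetical = sorted(counts.items())
by_frequency = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

# A "simplified" listing keeps only the most frequent entries.
top_1000 = by_frequency[:1000]
```

Storing each n-gram as a tuple of tokens keeps n-grams of different lengths in a single counter and lets Python's default tuple ordering produce the alphabetical listing.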

Distributions
1

Download
Description:
Not provided
Access URL:
https://hdl.handle.net/21.11146/11
Direct download:
  1. https://www.nb.no/sbfil/dok/ngram_swe.pdf
API:
Not provided
Documentation:
Not provided
License:
Conforms to:
Not provided

APIs providing this dataset
0

No registered APIs provide this dataset.

Similar datasets

Norsk Ordbank - Norwegian Nynorsk 2005-2012
Nasjonalbiblioteket
Public access
Translation Memories from Semantix AS
Nasjonalbiblioteket
Public access
NST Pronunciation Lexicon for Swedish
Nasjonalbiblioteket
Public access
Grapheme-to-Phoneme Models for Norwegian
Nasjonalbiblioteket
Public access
spaCy for Norwegian Nynorsk
Nasjonalbiblioteket
Public access