Nasjonalbiblioteket

Norwegian-English Parallel Corpus from Public Web Sites

Description

This is a sentence-aligned parallel corpus built from the public web sites www.nav.no, www.nyinorge.no and skatteetaten.no. These web sites provide information in both Norwegian Bokmål and Nynorsk, and parts of this material are translated into English. The material is split into two corpora, one for Norwegian Bokmål-English and one for Norwegian Nynorsk-English. Only sentences with a corresponding translation are included in the corpora.

The corpora were made by Paul Meurer and Andrew Salway at the University of Bergen for the Language Bank. See the attached report for a description of how this was done.

The corpora are also available through Corpuscle, the Clarino Bergen Centre's corpus management and analysis system (https://clarino.uib.no/korpuskel/).

Distributions
1

Download
Description:
Not provided
Access URL:
https://hdl.handle.net/21.11146/68
Direct download:
API:
Not provided
Documentation:
Not provided
License:
Conforms to:
Not provided

APIs providing this dataset
0

No registered APIs provide this dataset.

Similar datasets

Norsk Ordbank - Norwegian Nynorsk 2005-2012 (Nasjonalbiblioteket)
Public access
SCARRIE Lexicon (Nasjonalbiblioteket)
Public access
ONOMASTICA Pronunciation Lexicon 2 (Nasjonalbiblioteket)
Public access
Texts from Norwegian Wikipedia (Nasjonalbiblioteket)
Public access
N-gram - Norwegian Nynorsk (Nasjonalbiblioteket)
Public access