CLDF dataset derived from Ugarte et al.'s "NorthPeruLex - A Lexical Dataset of Small Language Families and Isolates from Northern Peru" (forthcoming).
If you use these data, please cite:
- the original source
Ugarte, Carlos and Blum, Frederic and Ingunza, Adriano and Gonzales, Rosa and Peña, Jaime. Forthcoming. NorthPeruLex - A Lexical Dataset of Small Language Families and Isolates from Northern Peru.
- the derived dataset, using the DOI of the specific released version you used
This dataset brings together lexical data from isolates and small language families of northern Peru in order to investigate their historical relations.
This dataset is licensed under a CC-BY-4.0 license.
Conceptlists in Concepticon:
work in progress
- Varieties: 35 (linked to 34 different Glottocodes)
- Concepts: 200 (linked to 200 different Concepticon concept sets)
- Lexemes: 4,804
- Sources: 16
- Synonymy: 1.11
- Invalid lexemes: 0
- Tokens: 27,168
- Segments: 342 (0 BIPA errors, 0 CLTS sound class errors, 337 CLTS modified)
- Inventory size (avg): 37.43
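As a rough illustration of how a synonymy figure like the one above can be read, the sketch below treats synonymy as the average number of lexemes per attested (variety, concept) pair. This definition is an assumption made for illustration: the published value may be computed differently (for example, averaged per variety first), and the function name `synonymy_index` is hypothetical.

```python
from collections import Counter


def synonymy_index(pairs):
    """Average number of lexemes per attested (variety, concept) pair.

    `pairs` is an iterable of (variety_id, concept_id) tuples, one tuple
    per lexeme. This is an assumed definition, not necessarily the one
    used to produce the statistics in this README.
    """
    counts = Counter(pairs)
    if not counts:
        raise ValueError("no lexemes given")
    # Total lexemes divided by the number of distinct pairs.
    return sum(counts.values()) / len(counts)
```

Under this reading, a value of 1.0 would mean exactly one lexeme for every attested variety-concept pair, and values above 1.0 indicate that some concepts have several lexemes in the same variety.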
Name | GitHub user | Description | Role |
---|---|---|---|
Carlos Ugarte | @MuffinLinwist | Data collector, CLDF conversion and annotation | Author, Editor |
Frederic Blum | @FredericBlum | CLDF conversion and annotation | Author, Editor |
Adriano Ingunza | @BadBatched | Data collector and annotation | Author |
Rosa Gonzales | @rosalgm | Data collector and annotation | Author |
Jaime Peña | @JaimePenat | Data collector and annotation | Author |
The following CLDF datasets are available in cldf:
- CLDF Wordlist at cldf/cldf-metadata.json
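For readers who want to inspect the wordlist without extra dependencies, here is a minimal sketch using only the Python standard library. It assumes that the metadata file declares a table conforming to the CLDF `FormTable` component and that the form table uses the default CLDF column name `Language_ID`; the function name `forms_per_variety` is hypothetical. Dedicated CLDF tooling is likely the more robust choice for real work.

```python
import csv
import json
from collections import Counter
from pathlib import Path

# Term URI that identifies the form table in CLDF metadata.
FORM_TABLE = "http://cldf.clld.org/v1.0/terms.rdf#FormTable"


def forms_per_variety(metadata_path):
    """Count lexemes per variety in a CLDF Wordlist.

    Assumes the default column name `Language_ID` in the form table.
    """
    metadata_path = Path(metadata_path)
    with metadata_path.open(encoding="utf-8") as fh:
        metadata = json.load(fh)

    # Find the file name of the table declared as the FormTable.
    url = next(
        table["url"]
        for table in metadata["tables"]
        if table.get("dc:conformsTo") == FORM_TABLE
    )

    counts = Counter()
    # Table URLs are resolved relative to the metadata file's directory.
    with (metadata_path.parent / url).open(encoding="utf-8", newline="") as fh:
        for row in csv.DictReader(fh):
            counts[row["Language_ID"]] += 1
    return counts
```

From the repository root, a call such as `forms_per_variety("cldf/cldf-metadata.json")` would return a `Counter` mapping each variety ID to its number of lexemes.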