Skip to content

Generator for the spelling game that is published on Dutch newspaper websites

License

Notifications You must be signed in to change notification settings

mevdschee/spelwijze-generator

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

39 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Spelwijze generator

"Spelwijze" is spelling game that is published on Dutch newspaper websites. You get 1 mandatory letter and 6 optional letters. The goal is to make as many 4 or more letter words with these 7 letters containing at least the mandatory letter once. This repository contains a tool to generate these puzzles.

Sources

Download Dutch words from:

https://www.opentaal.org/bestanden/file/2-woordenlijst-v-2-10g-bronbestanden

To filter out non-letter characters (go from 164313 to 156280 words) execute:

cat 'OpenTaal-210G-basis-gekeurd.txt' | grep -vP '[^a-z]' | sort | uniq | gzip > words.txt.gz

Download Dutch word freqencies from:

https://wortschatz.uni-leipzig.de/en/download/Dutch

To filter the word frequency list (go from 1000000 to 515630 words) execute:

cat 'nld_mixed_2012_1M-words.txt' | cut -f 2,3 | tr A-Z a-z | grep -P '^[a-z]+\t' | gzip > wordfreq.txt.gz

The text files are gzipped to reduce space.

Optional: Download another great list from:

https://kaikki.org/dictionary/Dutch/words/index.html

To filter the verbs:

cat kaikki.org-dictionary-Dutch.jsonl | grep '"pos": "verb"' | grep -o '"head_templates": \[{"name": "nl-verb", "args": {}, "expansion": "[a-z]\{4,\}"' | sort | uniq | cut -d\" -f 12 | gzip > verbs.txt.gz

To filter the nouns:

cat kaikki.org-dictionary-Dutch.jsonl | grep '"pos": "noun"' | grep -v '"plural"' | grep -o '"word": "[a-z]\+"' | cut -d: -f2 | cut -d\" -f 2 | sort | uniq | gzip > nouns.txt.gz

To add and combine these:

mv words.txt.gz words1.txt.gz
zcat words1.txt.gz verbs.txt.gz nouns.txt.gz | sort | uniq | gzip > words.txt.gz
rm words1.txt.gz verbs.txt.gz nouns.txt.gz

Now the extra words are added.

Running

Now run pick a length for your seeding word (a word with 7 different letters):

go run . 16

Showing all 16 letter words consisting of exactly 7 different letters:

begijnenbeweging
binnenduingebied
bloembollenteelt
concernonderdeel
engineeringgroep
espressoapparaat
exercitieterrein
geestesgestoorde
herinterpreteren
intentionaliteit
...(23 more)...

Now pick a seeding word and run:

go run . bloembollenteelt

Resulting in 7 different 7 letter combinations (points based on word frequency):

tbelmno: 728
eblmnot: 725
nbelmot: 711
obelmnt: 534
belmnot: 194
mbelnot: 164
lbemnot: 119

Now if we chose "mbelnot" (where "m" is the mandatory letter) we can run:

go run . mbelnot

To find all 110 words with minimum length 4 that contain the letter "m" and one or more of the other 6 letters:

beetnemen
bemeten
benemen
benoemen
betomen
betonelement
betonmolen
bloem
bloembol
bloembollenteelt
...(100 more)...

Enjoy!

About

Generator for the spelling game that is published on Dutch newspaper websites

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages