"Spelwijze" is a spelling game published on Dutch newspaper websites. You get 1 mandatory letter and 6 optional letters. The goal is to make as many words of 4 or more letters as possible from these 7 letters, where each word contains the mandatory letter at least once. This repository contains a tool to generate these puzzles.
Download Dutch words from:
https://www.opentaal.org/bestanden/file/2-woordenlijst-v-2-10g-bronbestanden
To drop every word that contains a character other than a-z (reducing the list from 164313 to 156280 words), execute:
cat 'OpenTaal-210G-basis-gekeurd.txt' | grep -vP '[^a-z]' | sort | uniq | gzip > words.txt.gz
Download Dutch word frequencies from:
https://wortschatz.uni-leipzig.de/en/download/Dutch
To lowercase the word frequency list and keep only entries made up of the letters a-z (reducing it from 1000000 to 515630 words), execute:
cat 'nld_mixed_2012_1M-words.txt' | cut -f 2,3 | tr A-Z a-z | grep -P '^[a-z]+\t' | gzip > wordfreq.txt.gz
The text files are gzipped to save disk space.
Optional: Download another great list from:
https://kaikki.org/dictionary/Dutch/words/index.html
To filter the verbs:
cat kaikki.org-dictionary-Dutch.jsonl | grep '"pos": "verb"' | grep -o '"head_templates": \[{"name": "nl-verb", "args": {}, "expansion": "[a-z]\{4,\}"' | sort | uniq | cut -d\" -f 12 | gzip > verbs.txt.gz
To filter the nouns:
cat kaikki.org-dictionary-Dutch.jsonl | grep '"pos": "noun"' | grep -v '"plural"' | grep -o '"word": "[a-z]\+"' | cut -d: -f2 | cut -d\" -f 2 | sort | uniq | gzip > nouns.txt.gz
To add and combine these:
mv words.txt.gz words1.txt.gz
zcat words1.txt.gz verbs.txt.gz nouns.txt.gz | sort | uniq | gzip > words.txt.gz
rm words1.txt.gz verbs.txt.gz nouns.txt.gz
Now the extra words are added.
Now pick a length for your seeding word (a word with exactly 7 different letters) and run:
go run . 16
Showing all 16 letter words consisting of exactly 7 different letters:
begijnenbeweging
binnenduingebied
bloembollenteelt
concernonderdeel
engineeringgroep
espressoapparaat
exercitieterrein
geestesgestoorde
herinterpreteren
intentionaliteit
...(23 more)...
Now pick a seeding word and run:
go run . bloembollenteelt
This results in 7 different 7-letter combinations, each starting with its mandatory letter (points are based on word frequency):
tbelmno: 728
eblmnot: 725
nbelmot: 711
obelmnt: 534
belmnot: 194
mbelnot: 164
lbemnot: 119
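How the points are computed is not spelled out here. As a rough sketch under a stated assumption, one could score each combination by summing the frequencies of all words that are valid for it. The type `candidate`, the function `rank`, and the scoring rule itself are hypothetical, so the numbers this produces will not necessarily match the output above.

```go
package main

import (
	"fmt"
	"sort"
)

// candidate is a hypothetical result type: the mandatory letter comes
// first in Letters, followed by the six optional letters.
type candidate struct {
	Letters string
	Points  int
}

// letterMask returns a bitmask of the letters a-z used in word.
func letterMask(word string) uint32 {
	var mask uint32
	for i := 0; i < len(word); i++ {
		mask |= 1 << (word[i] - 'a')
	}
	return mask
}

// rank tries each letter of seed as the mandatory letter and scores the
// combination by the summed frequency of its valid words (an assumption).
func rank(seed string, freq map[string]int) []candidate {
	seedMask := letterMask(seed)
	var out []candidate
	for i := 0; i < 26; i++ {
		bit := uint32(1) << i
		if seedMask&bit == 0 {
			continue
		}
		letters := string(rune('a' + i))
		for j := 0; j < 26; j++ {
			if j != i && seedMask&(1<<j) != 0 {
				letters += string(rune('a' + j))
			}
		}
		points := 0
		for w, f := range freq {
			m := letterMask(w)
			// Valid: long enough, has the mandatory letter, no foreign letters.
			if len(w) >= 4 && m&bit != 0 && m&^seedMask == 0 {
				points += f
			}
		}
		out = append(out, candidate{letters, points})
	}
	// Highest score first; ties are broken alphabetically for stable output.
	sort.Slice(out, func(a, b int) bool {
		if out[a].Points != out[b].Points {
			return out[a].Points > out[b].Points
		}
		return out[a].Letters < out[b].Letters
	})
	return out
}

func main() {
	// Made-up frequencies, for illustration only.
	freq := map[string]int{"bloem": 10, "beton": 5, "bloembol": 3, "mol": 100}
	for _, c := range rank("bloembollenteelt", freq) {
		fmt.Printf("%s: %d\n", c.Letters, c.Points)
	}
}
```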
Now if we choose "mbelnot" (where "m" is the mandatory letter) we can run:
go run . mbelnot
This finds all 110 words of at least 4 letters that contain the letter "m" and otherwise use only the other 6 letters:
beetnemen
bemeten
benemen
benoemen
betomen
betonelement
betonmolen
bloem
bloembol
bloembollenteelt
...(100 more)...
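The rule behind this list can be written as a small check. This is a minimal sketch, not necessarily how the tool implements it. The function name `isValidWord` is hypothetical, and it follows the convention used above that the first letter of the combination is the mandatory one.

```go
package main

import (
	"fmt"
	"strings"
)

// isValidWord (hypothetical helper) reports whether word is an answer for
// the combination letters, whose first letter is mandatory. A word must
// have at least 4 letters, contain the mandatory letter, and use no
// letters outside the combination.
func isValidWord(word, letters string) bool {
	if len(word) < 4 || letters == "" {
		return false
	}
	if !strings.Contains(word, letters[:1]) {
		return false
	}
	for _, r := range word {
		if !strings.ContainsRune(letters, r) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isValidWord("betonmolen", "mbelnot")) // true
	fmt.Println(isValidWord("beton", "mbelnot"))      // false: no "m"
	fmt.Println(isValidWord("mol", "mbelnot"))        // false: too short
}
```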
Enjoy!