forked from neocl/jamdict

Jamdict - A Python library Japanese dictionary empowered by JMDict & KanjiDic2 by Jim Breen

Jamdict

Jamdict is a Python 3 library for manipulating Jim Breen's JMdict, KanjiDic2, JMnedict and kanji-radical mappings.

Documentation: https://jamdict.readthedocs.io/

Main features

  • Support querying different Japanese language resources
    • Japanese-English dictionary JMDict
    • Kanji dictionary KanjiDic2
    • Kanji-radical and radical-kanji maps KRADFILE/RADKFILE
    • Japanese Proper Names Dictionary (JMnedict)
  • Fast lookup (dictionaries are stored in SQLite databases)
  • Command-line lookup tool (Example)

Contributors are welcome! 🙇 If you want to help, please see the Contributing page.

Try Jamdict out

Jamdict is used in Jamdict-web - a web-based free and open-source Japanese reading assistant software. Please try out the demo instance online at:

https://jamdict.herokuapp.com/

There is also a demo Jamdict virtual machine on Repl.it for trying out Jamdict Python code online:

https://replit.com/@tuananhle/jamdict-demo

Installation

Jamdict and the Jamdict database are both available on PyPI and can be installed using pip:

pip install --upgrade jamdict jamdict-data

Sample jamdict Python code

from jamdict import Jamdict
jam = Jamdict()

# use wildcard matching to find anything that starts with 食べ and ends with る
result = jam.lookup('食べ%る')

# print all word entries
for entry in result.entries:
    print(entry)

# [id#1358280] たべる (食べる) : 1. to eat ((Ichidan verb|transitive verb)) 2. to live on (e.g. a salary)/to live off/to subsist on
# [id#1358300] たべすぎる (食べ過ぎる) : to overeat ((Ichidan verb|transitive verb))
# [id#1852290] たべつける (食べ付ける) : to be used to eating ((Ichidan verb|transitive verb))
# [id#2145280] たべはじめる (食べ始める) : to start eating ((Ichidan verb))
# [id#2449430] たべかける (食べ掛ける) : to start eating ((Ichidan verb))
# [id#2671010] たべなれる (食べ慣れる) : to be used to eating/to become used to eating/to be accustomed to eating/to acquire a taste for ((Ichidan verb))
# [id#2765050] たべられる (食べられる) : 1. to be able to eat ((Ichidan verb|intransitive verb)) 2. to be edible/to be good to eat ((pre-noun adjectival (rentaishi)))
# [id#2795790] たべくらべる (食べ比べる) : to taste and compare several dishes (or foods) of the same type ((Ichidan verb|transitive verb))
# [id#2807470] たべあわせる (食べ合わせる) : to eat together (various foods) ((Ichidan verb))

# print all related characters
for c in result.chars:
    print(repr(c))

# 食:9:eat,food
# 喰:12:eat,drink,receive (a blow),(kokuji)
# 過:12:overdo,exceed,go beyond,error
# 付:5:adhere,attach,refer to,append
# 始:8:commence,begin
# 掛:11:hang,suspend,depend,arrive at,tax,pour
# 慣:14:accustomed,get used to,become experienced
# 比:4:compare,race,ratio,Philippines
# 合:6:fit,suit,join,0.1
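
The `%` in the lookup pattern reads like the SQL `LIKE` wildcard, which matches any run of characters. The following is a minimal sketch of that matching rule only, assuming the pattern is interpreted with `LIKE` semantics. The table `words` and its contents are hypothetical toy data, not Jamdict's actual schema, and the sketch needs only the standard library.

```python
import sqlite3

# Toy in-memory table; the name `words` and its single column are
# illustrative assumptions, not Jamdict's actual database schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (text TEXT)")
conn.executemany(
    "INSERT INTO words VALUES (?)",
    [("食べる",), ("食べ過ぎる",), ("食べ物",), ("飲む",)],
)

# '%' matches any run of characters (including none), so this pattern
# selects rows that start with 食べ and end with る.
pattern = "食べ%る"
rows = conn.execute(
    "SELECT text FROM words WHERE text LIKE ? ORDER BY rowid", (pattern,)
).fetchall()
matches = [text for (text,) in rows]
print(matches)  # ['食べる', '食べ過ぎる']
conn.close()
```

食べ物 is skipped because it does not end with る, and 飲む because it does not start with 食べ.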

Command line tools

To make sure that jamdict is configured properly, try looking up a word from the command line:

python3 -m jamdict lookup 言語学
========================================
Found entries
========================================
Entry: 1264430 | Kj:  言語学 | Kn: げんごがく
--------------------
1. linguistics ((noun (common) (futsuumeishi)))

========================================
Found characters
========================================
Char: 言 | Strokes: 7
--------------------
Readings: yan2, eon, 언, Ngôn, Ngân, ゲン, ゴン, い.う, こと
Meanings: say, word
Char: 語 | Strokes: 14
--------------------
Readings: yu3, yu4, eo, 어, Ngữ, Ngứ, ゴ, かた.る, かた.らう
Meanings: word, speech, language
Char: 学 | Strokes: 8
--------------------
Readings: xue2, hag, 학, Học, ガク, まな.ぶ
Meanings: study, learning, science

No name was found.
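
The same configuration check can be scripted. This is a minimal sketch, assuming only the `python3 -m jamdict lookup <word>` invocation shown above. The helper name `build_lookup_command` is hypothetical, and the script reports a missing installation rather than raising an error.

```python
import importlib.util
import subprocess
import sys

def build_lookup_command(word):
    """Return the argv list for `python -m jamdict lookup <word>` (hypothetical helper)."""
    return [sys.executable, "-m", "jamdict", "lookup", word]

command = build_lookup_command("言語学")

if importlib.util.find_spec("jamdict") is None:
    # Nothing to run yet: point at the install command from the Installation section.
    print("jamdict is not installed; run: pip install --upgrade jamdict jamdict-data")
else:
    # Capture the output instead of raising, so a misconfigured install is reported.
    completed = subprocess.run(
        command, capture_output=True, text=True, encoding="utf-8", errors="replace"
    )
    print("exit code:", completed.returncode)
    print(completed.stdout or completed.stderr)
```

Using `sys.executable` runs the lookup with the same interpreter that executes the script, which avoids checking a different Python environment by accident.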

Using KRAD/RADK mapping

Jamdict has built-in support for KRAD/RADK (i.e. kanji-radical and radical-kanji mapping). The terminology of radicals/components used by Jamdict may differ from that used elsewhere.

  • A radical in Jamdict is a principal component, each character has only one radical.
  • A character may be decomposed into several writing components.

By default jamdict provides two maps:

  • jam.krad is a Python dict that maps each character to a list of its components.
  • jam.radk is a Python dict that maps each available component to the set of characters that contain it.

# Find all writing components (often called "radicals") of the character 雲
print(jam.krad['雲'])
# ['一', '雨', '二', '厶']

# Find all characters with the component 鼎
chars = jam.radk['鼎']
print(chars)
# {'鼏', '鼒', '鼐', '鼎', '鼑'}

# look up the characters info
result = jam.lookup(''.join(chars))
for c in result.chars:
    print(c, c.meanings())
# 鼏 ['cover of tripod cauldron']
# 鼒 ['large tripod cauldron with small']
# 鼐 ['incense tripod']
# 鼎 ['three legged kettle']
# 鼑 []
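
The two maps point in opposite directions: one goes from a character to its components, the other from a component to the characters that contain it. The following is a minimal sketch of that relationship using a hand-written toy dict, not the data shipped with Jamdict. The helper `invert_krad` is hypothetical, the 雲 row is based on the example above, and the 雪 row is an illustrative assumption.

```python
from collections import defaultdict

# Hand-written toy data in the shape of jam.krad (character -> components).
# The 雲 row is based on the example above; the 雪 row is an illustrative
# assumption and was not read from KRADFILE.
toy_krad = {
    "雲": ["一", "雨", "二", "厶"],
    "雪": ["雨", "ヨ"],
}

def invert_krad(krad):
    """Build a component -> set-of-characters map from a krad-style dict."""
    radk = defaultdict(set)
    for char, components in krad.items():
        for component in components:
            radk[component].add(char)
    return dict(radk)

toy_radk = invert_krad(toy_krad)
print(toy_radk["雨"])  # both toy characters share this component
print(toy_radk["二"])  # only 雲 lists this component
```

A set is used for the values because the order of characters that share a component carries no meaning here.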

Finding name entities

# Find all names with 鈴木 inside
result = jam.lookup('%鈴木%')
for name in result.names:
    print(name)

# [id#5025685] キューティーすずき (キューティー鈴木) : Kyu-ti- Suzuki (1969.10-) (full name of a particular person)
# [id#5064867] パパイヤすずき (パパイヤ鈴木) : Papaiya Suzuki (full name of a particular person)
# [id#5089076] ラジカルすずき (ラジカル鈴木) : Rajikaru Suzuki (full name of a particular person)
# [id#5259356] きつねざきすずきひなた (狐崎鈴木日向) : Kitsunezakisuzukihinata (place name)
# [id#5379158] こすずき (小鈴木) : Kosuzuki (family or surname)
# [id#5398812] かみすずき (上鈴木) : Kamisuzuki (family or surname)
# [id#5465787] かわすずき (川鈴木) : Kawasuzuki (family or surname)
# [id#5499409] おおすずき (大鈴木) : Oosuzuki (family or surname)
# [id#5711308] すすき (鈴木) : Susuki (family or surname)
# ...

Exact matching

Use exact matching for faster search.

Find the word 花火 by idseq (1194580)

>>> result = jam.lookup('id#1194580')
>>> print(result.entries[0])
[id#1194580] はなび (花火) : fireworks ((noun (common) (futsuumeishi)))

Find an exact name 花火 by idseq (5170462)

>>> result = jam.lookup('id#5170462')
>>> print(result.names[0])
[id#5170462] はなび (花火) : Hanabi (female given name or forename)

See jamdict_demo.py and jamdict/tools.py for more information.
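
One plausible reason exact matching is faster in an SQLite-backed dictionary is that an equality test can use an index, while a pattern with a leading `%` usually forces a scan of every row. The following is a minimal sketch of that general SQLite behaviour. The table `entries`, the index name and the helper `query_plan` are hypothetical, and Jamdict's real schema and queries may differ.

```python
import sqlite3

# Toy in-memory table and index; all names here are illustrative
# assumptions, not Jamdict's actual schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (idseq INTEGER PRIMARY KEY, text TEXT)")
conn.execute("CREATE INDEX idx_entries_text ON entries (text)")
conn.executemany(
    "INSERT INTO entries VALUES (?, ?)",
    [(1194580, "花火"), (1358280, "食べる"), (1264430, "言語学")],
)

def query_plan(sql, params):
    """Return SQLite's human-readable plan steps for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # the 4th column holds the description

exact_sql = "SELECT idseq FROM entries WHERE text = ?"
wildcard_sql = "SELECT idseq FROM entries WHERE text LIKE ?"

# The exact wording of the plan varies between SQLite versions.
exact_plan = query_plan(exact_sql, ("花火",))
wildcard_plan = query_plan(wildcard_sql, ("%花火%",))
print("exact:   ", exact_plan)
print("wildcard:", wildcard_plan)

# Both queries return the same row on this toy data.
exact_ids = [row[0] for row in conn.execute(exact_sql, ("花火",))]
wildcard_ids = [row[0] for row in conn.execute(wildcard_sql, ("%花火%",))]
print(exact_ids, wildcard_ids)
```

On recent SQLite builds the first plan typically reports an index search and the second a scan. On a three-row table the difference is invisible, but it grows with the size of the dictionary.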

Useful links

Contributors
