I developed TED for my Magister Artium thesis, submitted in 2008, "Zur Erweiterungsfähigkeit bestehender OCR Verfahren auf den Bereich extrem früher Drucke" ("On the Extensibility of Existing OCR Methods to the Domain of Extremely Early Prints"), in which I enabled Optical Character Recognition (OCR) on digital images of incunables from the project "Verteilte Digitale Inkunabelbibliothek" ("Distributed Digital Incunabula Library").
The character recognition process is based on a self-organizing map (SOM, also known as a Kohonen map). It works on digital images that are first extensively preprocessed by the following operations:
- Image conversion
- Binarization (several algorithms, ranging from simple fixed-threshold binarization to Otsu's method)
- Median and kFill filtering
- Automatic cropping and deskewing of the image
- Edge detection
- Object / glyph isolation and recognition
- Clustering of the isolated glyphs with the self-organizing map
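
Two of the steps above, Otsu binarization and SOM clustering, can be illustrated with a minimal pure-Python sketch. This is not TED's actual code: the function names (`otsu_threshold`, `binarize`, `train_som`, `best_matching_unit`), the flat-list image representation, the 2x2 map size, and the linear decay of learning rate and neighborhood radius are all assumptions chosen for illustration.

```python
import math
import random


def otsu_threshold(pixels):
    """Return the gray level t (0..255) that maximizes between-class variance.

    `pixels` is a flat list of integer gray values (an assumed representation).
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with value <= t
    sum0 = 0    # sum of gray values of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2  # between-class variance (unnormalized)
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def binarize(pixels, t):
    """Map pixels to 1 (ink) or 0 (paper), assuming dark ink on a light page."""
    return [1 if p <= t else 0 for p in pixels]


def best_matching_unit(weights, x):
    """Return the grid position whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(samples, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight_vector}."""
    rng = random.Random(seed)
    dim = len(samples[0])
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                    # assumed linear decay
        sigma = max(sigma0 * (1.0 - frac), 0.5)    # assumed linear decay, floored
        for x in samples:
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))  # Gaussian neighborhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights
```

In a pipeline like the one listed, each isolated glyph would be scaled to a fixed-size bitmap, flattened into a vector, and passed to `train_som`; glyphs that share a best-matching unit then form one cluster. A real implementation would more likely use an array library than plain lists.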