Version 0.2

@kermitt2 kermitt2 released this 17 Oct 09:29
· 99 commits to master since this release
dea10fd

New in version 0.2:

  • support Unicode composition of characters
  • generalize reading order to all blocks (it was limited to the blocks of the first page)
  • use subscript/superscript text font style attribute
  • use SVG as the format for vector graphics
  • propagate the Unicode value of unresolved characters (free Unicode range for embedded fonts) as an encoded special character in ALTO (the so-called "placeholder" approach)
  • generate metadata information in a separate XML file (as the ALTO schema does not support it)
  • use the latest version of xpdf, version 4.00
  • add CMake build support
  • ALTO output replaces the custom Xerox XML format
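
As an illustration of the Unicode composition item at the top of this list, here is a minimal sketch of what composing characters means, assuming standard NFC normalization. It is written in Python for brevity and is not the tool's actual implementation; the helper name `compose` is hypothetical.

```python
import unicodedata


def compose(text: str) -> str:
    """Merge base letters with the combining marks that follow them (NFC).

    Hypothetical helper for illustration only, not the tool's own code.
    """
    return unicodedata.normalize("NFC", text)


# PDF text extraction often yields a base letter followed by a separate
# combining diacritic, here 'e' + U+0301 COMBINING ACUTE ACCENT.
decomposed = "e\u0301cole"
composed = compose(decomposed)

print(len(decomposed), len(composed))  # 6 5
print(composed == "\u00e9cole")        # True
```

After composition the accented letter is a single code point, which is usually what downstream text processing expects.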

Note: this release was used for Grobid release 0.5.6