Problem with python -m fitz extract fonts #2849
Fonts embedded in a PDF are usually subset fonts (if they are derived from some font file). This is also the case here. You can recognize this by the six uppercase letters and a "+" sign prefixing the original name. Building the subset must of course be done by some software, which may contain errors. A subset also contains only the glyphs the document actually uses, so you cannot use such a font to write characters that have been deleted from the original font. All in all, there are quite a number of possible reasons why things may not work as you would like. The file name used by MuPDF to store the font on disk has nothing to do with any of the above considerations and is not the reason for the problems.
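As a rough illustration of the naming convention described above, here is a minimal sketch that splits a font name into its subset tag and base name. The helper name `split_subset_name` is made up for this example and is not part of PyMuPDF; it only looks at the name string and says nothing about whether the font data itself is usable.

```python
import re

# Subset tag: exactly six uppercase ASCII letters followed by "+".
SUBSET_TAG = re.compile(r"^([A-Z]{6})\+(.+)$")


def split_subset_name(font_name):
    """Return (tag, base_name); tag is None when there is no subset prefix.

    Hypothetical helper for illustration only.
    """
    m = SUBSET_TAG.match(font_name)
    if m is None:
        return None, font_name
    return m.group(1), m.group(2)


if __name__ == "__main__":
    print(split_subset_name("ANCHJC+ChineseLiShu1"))
    print(split_subset_name("Helvetica"))
```

A name that yields a non-`None` tag is very likely a subset, so missing glyphs should be expected when reusing it elsewhere.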
Shouldn't happen. Have an example?
When I extract fonts from a document and try to embed them in HTML, sometimes I can't make them work. I don't know if there is something I need to do or if there is something wrong with the extracted font. See ANCHJC+ChineseLiShu1-29.ttf in the attached file for an example. FontForge shows the font name to be UntitledTTF.
Also, fitz appends a number to the file name, e.g. -29 above, which means I need extra steps to associate the file name with the font name in the document. Can this be disabled?
test.zip
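Until there is an answer on whether the number can be switched off, one possible workaround is to strip it again after extraction. The sketch below assumes the number is always appended as `-<digits>` right before the extension, as in `ANCHJC+ChineseLiShu1-29.ttf`; that pattern is inferred from the single example above, and `font_name_from_file` is a made-up helper, not a PyMuPDF function.

```python
import os
import re

# Assumed pattern: "<font name>-<digits>" once the extension is removed.
TRAILING_NUMBER = re.compile(r"^(.*)-(\d+)$")


def font_name_from_file(path):
    """Return (font_name, number) recovered from an extracted font file name.

    Hypothetical helper; number is None if the stem has no "-<digits>" suffix.
    """
    stem = os.path.splitext(os.path.basename(path))[0]
    m = TRAILING_NUMBER.match(stem)
    if m is None:
        return stem, None
    return m.group(1), int(m.group(2))


if __name__ == "__main__":
    print(font_name_from_file("ANCHJC+ChineseLiShu1-29.ttf"))
```

The first element of the result can then be compared with the font name reported for the document. A font whose own name ends in a hyphen followed by digits would still parse correctly as long as the extra number is always present, which is the assumption made here.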