Convert tab-defined tables into real tables #16
Comments
Probably best to deal with the paragraphs that use tabs for indentation as a separate issue, and come back to this more complex step once those tabs have been converted into paragraph indent formatting and tidied away.
Is the "tab alignment" value of the features facet supposed to capture these?
Yes, the "tab alignment" facet value is supposed to identify letters that contain a sequence of two or more paragraphs with tab characters, where the tabs are not at the start of the paragraph. Where a tab occurs only at the start of the paragraph, I've assumed it's not there to align text into columns, but is just a paragraph indentation. Maybe that's not a foolproof test, but it's the best I could come up with. Any suggestions for improvement? Also, if you can point to an example of a table which is constructed with tab characters (rather than a Word table), but which doesn't belong to that facet, please post a link. Cheers!
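For discussion, the test described above could be sketched in Python roughly as follows. This is only an illustration of the heuristic as stated, not the converter's actual code: the function names, the `min_run` parameter, and the idea of passing each paragraph in as a plain string are all assumptions made for the example.

```python
def has_alignment_tab(paragraph: str) -> bool:
    """True if a tab occurs anywhere other than at the start of the paragraph.

    Leading tabs are treated as paragraph indentation and ignored.
    """
    return "\t" in paragraph.lstrip("\t")


def has_tab_alignment(paragraphs: list[str], min_run: int = 2) -> bool:
    """True if at least `min_run` consecutive paragraphs contain alignment tabs.

    `min_run` is a hypothetical knob; the comment above specifies two or more.
    """
    run = 0
    for paragraph in paragraphs:
        # Extend the current run, or reset it when a paragraph has no alignment tab.
        run = run + 1 if has_alignment_tab(paragraph) else 0
        if run >= min_run:
            return True
    return False
```

A letter whose only tabs are paragraph indents would not match, while two adjacent lines such as `"1\tsome text"` and `"2\tmore text"` would.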
This still causes problems in XProc, e.g. see 85-08-15a.
Here are the documents referred to, both of which have a sequence of paragraphs containing tabs, which aren't converted to tables:
Unfortunately, Word's list capability appears limited to strictly defined presets (lots of them), and the only options are for styling fonts and so on.
@LucasHorseshoeBend yes, I agree that Word's "lists" aren't adequate for capturing these lists, and that comment about converting them to lists was more of a note to myself; I meant that the Word-to-TEI converter could convert them to a TEI list instead of a TEI table. But the difference between a TEI list and a TEI table with just two columns is not huge. I'd rather just fix this bug and get them converted to a table, and put off converting them to lists until later on, or never.
I did amend "most of" the cases (I missed one block in one of the letters). Where the XProc display said, e.g., "dodo", meaning two dittos under a previous entry, these have now been separated as "do do"; and where a number is followed by a tab and then text, it now reads, e.g., "1 some text" instead of "1sometext". Many of the cases would not work as tables, because the line above, say, was set out with spaces rather than tabs, so a table would give a more misleading representation than characters that are separated but not aligned vertically. Making a meaningful table in those cases would require editorial intervention anyway, so it's not worth getting rid of the bug, which would risk not picking up the resulting problem cases.
Tables that are defined using tabs ("tabular sections") should be converted into TEI tables.
Need to clearly distinguish tabular sections from other uses of tabs, such as to indent paragraphs.
Some tabular sections will be unevenly tabulated, because a variable number of tabs will be used to achieve the same alignment on different paragraphs, depending on the width of the text in those paragraphs. These tabular sections should be excluded from automatic conversion to tables, and left as a residual to be manually edited.