Grab latest from dev

TEIC · Jan 21, 2025 · a046176
2 parents 713a042 + b0a5fd5

Showing 48 changed files with 441 additions and 642 deletions.
diff --git a/P5/Source/Guidelines/en/CH-LanguagesCharacterSets.xml b/P5/Source/Guidelines/en/CH-LanguagesCharacterSets.xml
@@ -410,7 +410,7 @@ dates, and predefined value lists.</note></p></div>
    depends largely on the purpose, external requirements, local
    equipment and so forth, it is thus outside the scope of coverage
    for these Guidelines. </p>
-   <p>It might however nevertheless be helpful to put some of the
+   <p>It might nevertheless be helpful to put some of the
    terminology used for the rendering process in the context of the
    discussion of this chapter.  As was mentioned above, Unicode
    encodes abstract characters, not specific glyphs.  For any
@@ -421,12 +421,12 @@ dates, and predefined value lists.</note></p></div>
    and which areas have to be left blank.  If we want to print a character
    from the Latin script, besides the selection of
    the overall glyph shape, this process also requires that a
-   specific weight of the font has been selected, a specific size
+   specific weight and size of the font has been selected,
    and to what degree the shape should be slanted.  Beyond
    individual characters, the overall typesetting process also
-   follows specific rules of how to calculate the distance between
-   characters, how much whitespace occurs between words, at which
-   points line breaks might occur and so forth.  </p>
+   follows specific rules for calculating the distance between
+   characters, for determining how much whitespace occurs between any two words, and how long each line should be (and thus at which
+   points a new line begins), and so forth.  </p>
    <p>If we concern ourselves only with the rendering process of the
    characters themselves, leaving out all these other parameters, we
    will realize that of all the information required for this process, only a small

diff --git a/P5/Source/Guidelines/en/CMC-ComputerMediatedCommunication.xml b/P5/Source/Guidelines/en/CMC-ComputerMediatedCommunication.xml
@@ -951,7 +951,7 @@ See the file COPYING.txt for details.
         icon-based emoji.</p>
       <p>Alternatively, e.g. when <gi>w</gi> is not regularly used to encode tokens in the TEI
         document, <gi>c</gi> may be used to mark an emoji. For example, the source post <q>Da kostet
-          ein Haarschnitt 50 € 😱</q> (from the corpus <ptr target="#CMC_Mocoda2"/>, in
+          ein Haarschnitt 50 € 😱</q> (from the corpus <ptr target="#BIB_MoCoDa2"/>, in
         English <q>A haircut there costs 50 € 😱</q>) might be encoded as follows: <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:id="CMCemos-egXML-rt">
           <post xml:lang="de">Da kostet ein Haarschnitt 50 € <c type="emoji" ana="#fsif" generatedBy="template">😱</c></post>
         </egXML>

diff --git a/P5/Source/Guidelines/en/CO-CoreElements.xml b/P5/Source/Guidelines/en/CO-CoreElements.xml
@@ -329,7 +329,7 @@ problem for text encoders. Suppose, for example, that we wish to
 investigate a diachronic English corpus for occurrences of
 <mentioned>tea-pot</mentioned> and <mentioned>teapot</mentioned>, to
 find evidence for the point at which this compound becomes
-lexicalized. Any case where the word is hyphenated across a linebreak,
+lexicalized. Any case where the word is hyphenated across a line division,
 like this: <eg xml:space="preserve"><![CDATA[tea-
 pot]]></eg> is
 ambiguous: there is no simple way of deciding which of the two
@@ -381,11 +381,11 @@ whitespace. </p>
 
 <p> The <gi>lb</gi>, <gi>pb</gi>, and <gi>cb</gi> elements are notable
 exceptions to this general rule, since their function is precisely to
-represent (or replace) line, page, or column breaks, which, as noted
+represent (or replace) line, page, or column beginnings, which, as noted
 above, are generally considered to be equivalent to whitespace. These
 elements provide a more reliable way of preserving the lineation,
-pagination, etc of a source document, since the encoder should not
-assume that (untagged) line breaks etc. in an XML source file will
+pagination, etc. of a source document, since the encoder should not
+assume that (untagged) line beginnings etc. in an XML source file will
 necessarily be preserved. </p>
 
 <p>To control the intended tokenization, the encoder may use the
@@ -2696,7 +2696,7 @@ appropriate value for the <att>rend</att> attribute. Suggested values
   for <att>rend</att> include:
   <list rend="bulleted">
     <item><term>bulleted</term> (items preceded by bullets or similar markings)</item>
-    <item><term>inline</term> (items rendered within continuous prose, with no linebreaks)</item>
+    <item><term>inline</term> (items rendered within continuous prose, with no line divisions)</item>
     <item><term>numbered</term> (items preceded by numbers or letters)</item>
     <item><term>simple</term> (items rendered as blocks, but with no bullet or number)</item>
   </list>
@@ -3492,11 +3492,11 @@ section <ptr target="#CORS6"/> and in section <ptr target="#SACR"/>.
 <p>When a text has no pre-existing associated reference system of any
 kind, these Guidelines recommend as a minimum that at least the page
 boundaries of the source text be marked using one of the methods
-outlined in this section.  Retaining page breaks in the markup is also
+outlined in this section.  Retaining page boundaries in the markup is also
 recommended for texts which have a detailed reference system of their
-own. Line breaks in prose texts may be, but need not be, tagged.<note place="bottom">Many encoders find it convenient to retain the line
-breaks of the original during data entry, to simplify proofreading,
-but this may be done without inserting a tag for each line break of
+own. Line divisions in prose texts may be, but need not be, encoded.<note place="bottom">Many encoders find it convenient to retain the line
+  divisions of the original during data entry, to simplify proofreading,
+  but this may be done without inserting an element for the beginning of each line in
 the original.</note></p>
 <div type="div3" xml:id="CORS1"><head>Using the <att>xml:id</att> and <att>n</att> Attributes</head>
 <p>When traditional reference schemes represent a hierarchical
@@ -3900,7 +3900,7 @@ treated as a single word, a tagging such as the following is recommended:
   <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:id="CORS5-egXML-hq" source="#NONE">
 ...sed imp<lb break="no"/>erator dixit...
 </egXML>
-Where hyphenation appears before a line or page break, the encoder may
+  Where hyphenation appears at the end of a topographic line, a column, or a page, the encoder may
 or may not choose to record the fact, either explicitly using an
 appropriate Unicode character, or descriptively for example by means
 of the <att>rend</att> attribute; see further <ptr target="#COPU-2"/>.</p>
@@ -4201,6 +4201,17 @@ encoding of short-form references such as <mentioned>Baxter,
 1983</mentioned> is not as <gi>bibl</gi> elements but as
 cross-references to such elements; see section <ptr target="#COBIXR"/>
 below.  </p> 
+<p>
+  Bibliographic references encoded with <gi>bibl</gi>, <gi>biblFull</gi>, or <gi>biblStruct</gi> may link to another, possibly more detailed, bibliographic reference to the same resource using either the attribute <att>ref</att> or <att>key</att> defined in <ident type="class">att.canonical</ident>. The <att>ref</att> attribute performs this linking function using a URI that may point either within or outside the TEI document, as shown in the following example:
+  <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:id="COBITY-egXML-bibref" source="#NONE">
+    <bibl ref="https://search.worldcat.org/title/song-of-achilles/oclc/911117343"><title>The song of Achilles</title> by <author>Madeline Miller</author>.</bibl>
+  </egXML>
+  The <att>key</att> attribute, on the other hand, associates a bibliographical reference with an external or project-defined identifier as demonstrated here:
+  <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:id="COBITY-egXML-bibkey" source="#NONE">
+    <bibl key="Homer_Iliad.16">Book 16 of <author>Homer</author>'s <title>Iliad</title>.</bibl>
+  </egXML>
+  These attributes should not be used for encoding bibliographic pointers (short-form citations); for that purpose, see section <ptr target="#COBIXR"/> below.
+</p>
 <p>In cases where the encoder wishes to impose more structure on the
 bibliographic information, for example to make sure it conforms to a
 particular stylesheet or retrieval processor, the <gi>biblStruct</gi>
@@ -5476,8 +5487,8 @@ following example:
   normalize bibliographic references: <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:id="COBIXR-egXML-cc" source="#NONE">Nelson claims (<ref target="#NEL80">Nelson [1980]</ref> pages 13–37)
     ...</egXML> 
   If it is desired to capture additional information like this in a short-form
-  reference, then <gi>bibl</gi> may be used with the <att>corresp</att> attribute pointing to
-  the full bibliographic reference: <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:id="COBIXR-egXML-hi" source="#NONE">Nelson claims (<bibl corresp="#NEL80">Nelson [1980] pages <biblScope unit="page" from="13" to="37">13–37</biblScope></bibl>) ...</egXML>
+  reference, then <gi>bibl</gi> may be used with the <att>ref</att> attribute pointing to
+  the full bibliographic reference: <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:id="COBIXR-egXML-hi" source="#NONE">Nelson claims (<bibl ref="#NEL80">Nelson [1980] pages <biblScope unit="page" from="13" to="37">13–37</biblScope></bibl>) ...</egXML>
 </p>
 <p>The <gi>ref</gi> element may also be used to provide a reference to a copy of the bibliographic item itself, particularly if this is available online, as in the following example: 
   <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:id="COBIXR-egXML-ad" source="#NONE">
@@ -5645,13 +5656,13 @@ metrical rather than typographic lines.  In some modern or free verse,
 it may be hard to decide whether the typographic line is to be
 regarded as a verse line or not, but the distinction is quite clear
 for verse following regular metrical patterns. Where a metrical line is
-interrupted by a typographic line break, the encoder may choose to
-ignore the fact entirely or to use the empty <gi>lb</gi> (line break)
+interrupted by the start of a new typographic line, the encoder may choose to
+ignore the fact entirely or to use the empty <gi>lb</gi> (line beginning)
 element discussed in <ptr target="#CORS"/>.  By convention, the start
 of a metrical line implies the start of a typographic line; hence
 there is no need to introduce an <gi>lb</gi> tag at the start of every
 <gi>l</gi> element, but only at places where a new typographic line
-starts  within a metrical line, as in the following example:
+starts within a metrical line, as in the following example:
 
 <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:id="COVE-egXML-vm" source="#CO-eg-06">
 <l>Of Mans First Disobedience, and<lb/> the Fruit</l>
@@ -5664,14 +5675,13 @@ starts  within a metrical line, as in the following example:
 In the original copy text, the presence of an ornamental capital at
 the start of the poem means that the measure is not wide enough to
 print the first four lines on four lines; instead each metrical line occupies
-two typographic lines, with a break at the point indicated. Note that
+two typographic lines, with the second beginning at the point indicated. Note that
 this encoding makes no attempt to preserve information about the
 whitespace or indentation associated with either kind of line; if regarded
 as essential, this information would be recorded using the
 <att>rend</att> or <att>rendition</att> attributes discussed in <ptr target="#STGA"/>. </p>
   <p>The <gi>l</gi> element should not be used to represent typographic
-lines in non-verse materials: if the line-breaking points in a prose
-text are considered important for analysis, they should be marked with
+    lines in non-verse materials: if the lineation of a prose text is considered important for analysis, the beginning of each line should be marked with
 the <gi>lb</gi> element. Alternatively, a neutral segmentation element
 such as <gi>seg</gi> or <gi>ab</gi> may be used; see further
 discussion of these elements in chapter <ptr target="#SA"/>. The

diff --git a/P5/Source/Guidelines/en/DI-PrintDictionaries.xml b/P5/Source/Guidelines/en/DI-PrintDictionaries.xml
@@ -2227,8 +2227,7 @@ following three, which help to clarify some issues raised with particular urgenc
 dictionaries, on account of the complexity of both their typography and their
 information structure.<list rend="bulleted">
 <item>(a) the <term>typographic view</term>—the
-two-dimensional printed page, including information about line and page breaks
-and other features of layout </item>
+  two-dimensional printed page, including information about lineation, pagination, and other features of layout </item>
 <item>(b) the <term>editorial view</term>—the one-dimensional sequence of tokens
 which can be seen as the input to the typesetting process; the wording and
 punctuation of the text and the sequencing of items are visible in this view,
@@ -2243,7 +2242,7 @@ therefore hyphenated (<q>naut-</q>
 <q>ical</q>); the typographic view of the dictionary preserves this information. In a
 purely editorial view, the particular form in which the domain name is given in the
 particular dictionary (as <q>nautical</q>, rather than <q>naut.</q>, <q>Naut.</q>, etc.)
-would be preserved, but the fact of the line break would not. Font shifts might
+  would be preserved, but the fact that the word was split across two lines with a soft hyphen would not. Font shifts might
 plausibly be included in either a strictly typographic or an editorial view. In the
 lexical view, the only information preserved concerning domain would be some standard
 symbol or string representing the nautical domain (e.g. <q>naut.</q>) regardless of the

diff --git a/P5/Source/Guidelines/en/DS-DefaultTextStructure.xml b/P5/Source/Guidelines/en/DS-DefaultTextStructure.xml
@@ -1506,7 +1506,7 @@ the <gi>div</gi> elements containing chapters of the text itself. (For the
 <!-- ... -->
 </div>
 </egXML>
-Alternatively, the pointers in the index might link to the page breaks
+  Alternatively, the pointers in the index might link to the page beginnings
 at which a chapter begins, assuming that these have been included in
 the markup:
 <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:id="DSFRONT-egXML-xd"><!-- ... -->
@@ -1629,7 +1629,7 @@ the work discussed earlier in this section: <egXML xmlns="http://www.tei-c.org/n
     </docImprint>
   </titlePage></front></egXML></p>
 <p>Second, a characteristically verbose 17th century example. Note the
-use of the <gi>lb</gi> tag to mark the line breaks of the original
+  use of the <gi>lb</gi> tag to mark the line beginnings of the original
 where necessary:
 <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:id="DSTITL-egXML-dk" source="#DS-eg-06"><titlePage>
   <docTitle>
@@ -1731,7 +1731,7 @@ transcription):
 </div>
 </back></egXML>
 <!-- Smith, Wealth of Nations, 1776; index to vol 1 -->
-Note that if the page breaks in the original source have also been
+  Note that if the pagination in the original source have also been
 explicitly encoded, and given identifiers, the references to them in the
 above index can more usefully be recorded as links. For example,
 assuming that the encoding of page 461 of the original source starts

diff --git a/P5/Source/Guidelines/en/FT-TablesFormulaeGraphics.xml b/P5/Source/Guidelines/en/FT-TablesFormulaeGraphics.xml
@@ -255,7 +255,7 @@ this is rarely if ever done in practice.</note-->
       </p>
       <p>The content of table elements is not limited to <gi>head</gi> and
           <gi>row</gi>. Milestone elements such as <gi>cb</gi> and <gi>lb</gi>
-        allow breaks to be signalled inside tables; <gi>figure</gi> provides an
+        allow new columns or lines to be signalled inside tables; <gi>figure</gi> provides an
         option for including data which is not amenable to normal row and cell
         analysis; and other elements such as <gi>epigraph</gi> and
           <gi>trailer</gi> provide options for including text which is clearly

diff --git a/P5/Source/Guidelines/en/HD-Header.xml b/P5/Source/Guidelines/en/HD-Header.xml
@@ -1410,7 +1410,7 @@ it to mark italicised English words only.</p>
 <p>The <att>withId</att> attribute may optionally be used to specify
 how many of the occurrences of the element in question bear a value
 for the global <att>xml:id</att> attribute, as in the following
-example: <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:id="HD57-2-egXML-kf"><tagUsage gi="pb" occurs="321" withId="321"> Marks page breaks in the York
+  example: <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:id="HD57-2-egXML-kf"><tagUsage gi="pb" occurs="321" withId="321"> Marks page beginnings in the York
 (1734) edition only </tagUsage></egXML> This indicates that the
 <gi>pb</gi> element occurs 321 times, on each of which an identifier
 is provided.</p>
@@ -1518,7 +1518,7 @@ not recommended for automatic processing.</p>
   text. The <att>n</att> attribute on each <gi>div1</gi> and
   <gi>div2</gi> contains the canonical reference for each such
   division, in the form 'XX.yyy', where XX is the book number in Roman
-  numerals, and yyy the section number in arabic. Line breaks are
+  numerals, and yyy the section number in arabic. Line beginnings are
   marked by empty <gi>lb</gi> elements, each of which includes the
   through line number in Casaubon's edition as the value of its
   <gi>n</gi> attribute.</p>

diff --git a/P5/Source/Guidelines/en/NH-Non-hierarchical.xml b/P5/Source/Guidelines/en/NH-Non-hierarchical.xml
@@ -220,7 +220,7 @@ wound.</seg></p></egXML>
 	typographical and metrical line divisions correspond,
 	<gi>lb</gi> does not itself make a metrical claim: in encoding
 	verse from sources, such as Old English manuscripts, where
-	physical line breaks are not used to indicate metrical
+	physical line beginnings are not used to indicate metrical
 	lineation, the correspondence would break down entirely.</p>
 
 

diff --git a/P5/Source/Guidelines/en/PH-PrimarySources.xml b/P5/Source/Guidelines/en/PH-PrimarySources.xml
@@ -749,8 +749,8 @@ used to define a polygon of any shape using this coordinate system:-->
          chapter provides ways of encoding such information: <list rend="bulleted">
             <item>methods of recording editorial or other alterations to the text, such as expansion
                of abbreviations, corrections, conjectures, etc. (section <ptr target="#PHCH"/>)</item>
-            <item>methods of describing important extra-linguistic phenomena in the source: unusual
-               spaces, lines, page and line breaks, changes of manuscript hand, etc. (section <ptr target="#PHPH"/>)</item>
+            <item>methods of describing important extra-linguistic phenomena in the source: pagination, lineation, unusual
+               spaces, changes of manuscript hand, etc. (section <ptr target="#PHPH"/>)</item>
             <item>methods of representing aspects of layout such as spacing or lines <ptr target="#PHLAY"/>
             </item>
             <item>methods of representing material such as running heads, catch-words, and the like
@@ -2549,9 +2549,9 @@ referring to the zone marked in purple on the right
 -->
       <p>This approach assumes that the transcription will primarily be organized in the same way as
          the physical layout of the source, using embedded transcription elements. Alternatively,
-         where the a non-embedded transcription has been provided, using the <gi>text</gi> element,
-         it is still possible to record gathering breaks, page breaks, column breaks, line breaks
-         etc in the source, using the elements described in section <ptr target="#CORS"/>. Detailed
+         where a non-embedded transcription has been provided, using the <gi>text</gi> element,
+         it is still possible to record gathering beginnings, page beginnings, column beginnings, line beginnings
+         etc. in the source, using the elements described in section <ptr target="#CORS"/>. Detailed
          metadata about the physical make-up of a source will usually be summarized by the
             <gi>physDesc</gi> component of an <gi>msDesc</gi> element discussed in <ptr target="#msph"/>. </p>