Commit

resolves asciidoctor#2485 convert string with non-ASCII characters to NFD when applying smallcaps transformation
mojavelinux committed Feb 2, 2024
1 parent 360de47 commit 7e8943b
Showing 4 changed files with 12 additions and 34 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.adoc
@@ -31,6 +31,7 @@ Enhancements::
 * add symbols for shift, command, option, and return keys to the fallback font
 * add support for `nowrap` and `nobreak` roles on formatted text
 * do not insert break opportunities into bare URL with `nobreak` role
+* convert string with non-ASCII characters to NFD when applying smallcaps transformation to support diacritics (#2485)

 Improvements::

39 changes: 5 additions & 34 deletions docs/modules/theme/pages/text.adoc
@@ -144,44 +144,15 @@ It accepts the following keywords:
 capitalize:: Transforms the first letter of each word to a capital letter.
 lowercase:: Transforms all the text to lowercase letters.
 none:: Clears an inherited value and no case transformation is applied to the text.
-smallcaps:: Replaces lowercase ASCII letters (a-z) with their small capital variant.
-Lowercase letters outside the ASCII range are not transformed.
+smallcaps:: Normalizes text as NFD (decomposed normalized form) and replaces lowercase ASCII letters (a-z) with their small capital variant.
+The diacritic mark will be applied to the small capital letter using a combining character as a result of the text being normalized as NFD.
 uppercase:: Transforms all the text to capital letters.

 The `text-transform` key can't be set on the xref:base.adoc[base category].

-If you want the smallcaps transformation to support letters beyond the a-z range, you can do so by overridding the `smallcaps` method in an extended converter.
-
-[,ruby]
-----
-class MyPDFConverter < (Asciidoctor::Converter.for 'pdf')
-  register_for 'pdf'
-
-  def smallcaps string
-    string = super
-    string = string.gsub 'é', %(\u1d07\u0301)
-    string
-  end
-end
-----
-This transformation can be automated using `String#unicode_normalize` with the `:nfd` form.
-This method will rewrite all characters with diacritical marks so that the diacritical mark is added using a combining character (i.e., a two graphene form).
-[,ruby]
-----
-class MyPDFConverter < (Asciidoctor::Converter.for 'pdf')
-  register_for 'pdf'
-
-  def smallcaps string
-    string = string.unicode_normalize :nfd unless string.ascii_only?
-    super
-  end
-end
-----
-The smallcaps transformation for extended Latin characters (e.g., characters that include an accent) typically requires the addition of a combining character, such as the combining acute accent in the example above).
-Therefore, you must ensure that the font you're using supports these combining characters (meaning it provides the necessary glyphs).
+The smallcaps transformation for extended Latin characters (e.g., characters outside the a-z range that include an accent) require the use of a combining character, such as the combining acute accent.
+Therefore, if you want the smallcaps transformation to support letters beyond the a-z range, you need to ensure the font you're using provides the required https://en.wikipedia.org/wiki/Combining_character[combining characters^] to support diacritics (meaning it provides the necessary glyphs).
+Otherwise, these characters will appear as missing glyph boxes in the transformed text.

 [#border-background-color]
 == Border and background color
1 change: 1 addition & 0 deletions lib/asciidoctor/pdf/text_transformer.rb
@@ -66,6 +66,7 @@ def smallcaps_pcdata string
 end

 def smallcaps string
+  string = string.unicode_normalize :nfd unless string.ascii_only?
   string.tr LowerAlphaChars, SmallCapsChars
 end

5 changes: 5 additions & 0 deletions spec/formatted_text_formatter_spec.rb
@@ -1024,6 +1024,11 @@
   (expect pdf.lines).to eql ['HTML stands for HʏᴘᴇʀTᴇxᴛ Mᴀʀᴋᴜᴘ Lᴀɴɢᴜᴀɢᴇ']
 end

+it 'should decompose non-ASCII characters when applying smallcaps text transform' do
+  pdf = to_pdf '== Références', pdf_theme: { heading_text_transform: 'smallcaps' }, analyze: true
+  (expect pdf.lines).to eql [%(R\u1d07\u0301ғ\u1d07\u0301ʀᴇɴᴄᴇs)]
+end
+
 it 'should allow custom role to specify relative font size' do
   pdf_theme = {
     heading_h2_font_size: 24,
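The effect of the change to `smallcaps` can be tried outside the converter. The following is a minimal standalone sketch, not the converter's actual code: `LOWER_CHARS` and `SMALL_CAPS_CHARS` are hypothetical stand-ins for `LowerAlphaChars` and `SmallCapsChars`, and they map only the five letters this example needs. It shows that a precomposed é is first split into `e` plus a combining acute accent (U+0301), after which the base letter is replaced with its small capital and the combining mark passes through untouched.

```ruby
# Hypothetical stand-ins for the converter's LowerAlphaChars and SmallCapsChars
# constants; only c, e, f, n, and r are mapped here (an assumption for brevity).
LOWER_CHARS = 'cefnr'
SMALL_CAPS_CHARS = %(\u1d04\u1d07\u0493\u0274\u0280)

# Same two steps as the patched method: decompose to NFD (skipped when the
# string is pure ASCII), then translate base letters to small capitals.
def smallcaps string
  string = string.unicode_normalize :nfd unless string.ascii_only?
  string.tr LOWER_CHARS, SMALL_CAPS_CHARS
end

input = %(R\u00e9f\u00e9rences) # "Références" with precomposed (NFC) é
result = smallcaps input

# Print the code points so the combining accent after each small capital E is visible.
puts result.codepoints.map {|cp| format 'U+%04X', cp }.join ' '
```

Without the normalization step, `tr` would leave the precomposed é (U+00E9) as is, because it is a single character outside the a-z range.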
