diff --git a/manual/source/datarules.rst b/manual/source/datarules.rst
index 1972d9a227..029f49ccb2 100644
--- a/manual/source/datarules.rst
+++ b/manual/source/datarules.rst
@@ -238,19 +238,30 @@ matches the data type specifier.
 .. index:: ! strings
 
 **strings**
-    All strings are to be encoded in UTF-8. Since most strings in a
-    NeXus file are restricted to a small set of characters and the first 128 characters are standard across encodings,
-    the encoding of most of the strings in a NeXus file will be a moot point.
-    Where encoding in UTF-8 will be important is when recording people's names in ``NXuser``
-    and text notes in ``NXnotes``.
-
-    .. https://github.com/nexusformat/NIAC/issues/23#issuecomment-308773465
-
-    .. index:: strings; variable-length
-    .. index:: strings; fixed-length
-
-    NeXus accepts both variable and fixed length strings,
-    software that reads NeXus data files should support both.
+    ``NX_CHAR``:
+        All strings are to be encoded in UTF-8. Since most strings in a
+        NeXus file are restricted to a small set of characters
+        and the first 128 characters are standard across encodings,
+        the encoding of most of the strings in a NeXus file will be a moot point.
+        Encoding in UTF-8 will be important when recording people's names
+        in ``NXuser`` and text notes in ``NXnotes``.
+
+        .. https://github.com/nexusformat/NIAC/issues/23#issuecomment-308773465
+
+        .. index:: strings; variable-length
+        .. index:: strings; fixed-length
+        .. index:: strings; arrays
+
+        .. https://github.com/nexusformat/definitions/issues/281
+
+        NeXus accepts both variable and fixed length strings,
+        as well as arrays of strings.
+        Software that reads NeXus data files should support
+        all of these.
+
+        Some file writers write strings as a string array
+        of rank 1 and length 1.
+        Clients should be prepared to handle such strings.
 
 .. index:: binary data
 
diff --git a/nxdlTypes.xsd b/nxdlTypes.xsd
index 4ce9e1e8ea..79d28b9795 100644
--- a/nxdlTypes.xsd
+++ b/nxdlTypes.xsd
@@ -434,7 +434,18 @@
-    any string representation
+
+    any string representation
+
+    All strings are to be encoded in UTF-8.
+
+    Includes fixed-length strings, variable-length strings,
+    and string arrays.
+
+    Some file writers write strings as a string array
+    of rank 1 and length 1.
+    Clients should be prepared to handle such strings.
+
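
The new wording asks clients to cope with a scalar string that arrives as a string array of rank 1 and length 1. As a rough illustration only, a reader might normalize such values as sketched below. The function name `normalize_nx_char` is hypothetical and not part of any NeXus API; the sketch assumes the reading library hands back `bytes`, `str`, a list or tuple, or an array-like object with a `tolist()` method (as NumPy arrays have).

```python
def normalize_nx_char(value, encoding="utf-8"):
    """Return a str for a scalar NX_CHAR value, or a list of str for an array.

    Hypothetical helper for illustration; not part of any NeXus library.
    """
    # Array-like objects (for example NumPy arrays) expose tolist();
    # convert them to plain Python lists first.
    if hasattr(value, "tolist"):
        value = value.tolist()

    if isinstance(value, (list, tuple)):
        items = [normalize_nx_char(item, encoding) for item in value]
        # A rank-1, length-1 string array is treated as a plain scalar string.
        return items[0] if len(items) == 1 else items

    if isinstance(value, (bytes, bytearray)):
        # The text above requires UTF-8, so decode with that by default.
        return bytes(value).decode(encoding)

    if isinstance(value, str):
        return value

    raise TypeError(f"unsupported NX_CHAR value of type {type(value).__name__}")


print(normalize_nx_char([b"J\xc3\xbcrgen"]))
```

Collapsing only length-1 containers keeps genuine string arrays intact, at the cost of not distinguishing a deliberate one-element array from a scalar; whether that trade-off is acceptable depends on the client.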