diff --git a/README.md b/README.md
new file mode 100644
index 0000000..eb81d38
--- /dev/null
+++ b/README.md
@@ -0,0 +1,5 @@
+# Confluence schema
+
+XML schema for the Atlassian Confluence storage format.
+
+See the [documentation](https://grahamhannington.github.io/confluence-schema).
\ No newline at end of file
diff --git a/catalog.xml b/catalog.xml
new file mode 100644
index 0000000..464bf1a
--- /dev/null
+++ b/catalog.xml
@@ -0,0 +1,8 @@
+
+ Paragraph of text. Paragraph of text. Paragraph of text. Paragraph of text. Paragraph of text.
+ Paragraph of text. Roughly in order of appearance on the Confluence rich text editor toolbar, from left to right. Paragraph Headings are a special type of block element. Quote
+ Bold
+
+ Italic
+
+ Underline
+ Colour
+
+ Subscript
+ Superscript
+
+ Bold Italic Underline Colour Paragraph nested in list item Paragraph nested in list item Paragraph (indented once) Paragraph (indented twice) Paragraph (align left) Paragraph (align center) Paragraph (align right)
+ Link to Google
+ Symbols: ♣ (clubs) Line Link to page:
+ Link to user:
+ Link to attachment:
+ Link to external page: Image on web:
+ Image attached to this page:
+
+
+
+
+ Left column Right column Example document for validation with a catalog
+— via the DOCTYPE public identifier and XML namespace URIs —
+rather than containing explicit references to a DTD and a schema
+— via the DOCTYPE system identifier and xsi:schemaLocation attribute.
+
+ Heading 1
+Heading 2
+Heading 3
+Heading 4
+Heading 5
+Heading 6
+Markup created using the rich text editor
+Markup that uses the same names as XHTML
+Markup: Headings
+Heading 1
+Heading 2
+Heading 3
+Heading 4
+Heading 5
+Heading 6
+Markup: Block elements
+Preformatted line 1
+
+
+Preformatted line 2
+
+Preformatted line 3
+
+Markup: Inline elements
+StrikethroughMonospace
+Strikethrough Subscript Superscript Monospace
+Markup: Lists
+
+
+
+
+Markup: Indent
+
+
+
+
+ Heading 5 (indented once)
+Markup: Alignment
+Heading 5 (align center)
+
+
+Markup: Links to web page
+Markup: Tables
+
+
+
+
+
+ Column heading 1
+ Column heading 2
+
+
+ Cell 1 Row 1 (spans two rows)
+ Cell 2 Row 1 (highlighted)
+
+
+ Cell 2 Row 2
+
+
+
+Cell 1 Row 3 (spans two columns)
+ Markup: Symbols
+Markup: Horizontal line
+
+Markup: Line break
+
+ break
+Proprietary Confluence markup (ac:* and ri:* elements)
+Markup: Links
+Markup: Images
+Markup: Emoticons
+Markup: Code Block macro
+Markup: Section and Column macros
+
+
This repository contains:
+You will need a validating XML editor.
+For example (there are many alternatives):
+The following procedure assumes that you have extracted all of the files in this package into one directory.
+Open confluence-page-example.xml in your XML editor.
+Explore the DTD/XSD-aware features of your XML editor, such as (depending on your particular editor):
+The Confluence Source Editor plugin ("advanced editor") displays the source of a page as an XML snippet: a collection of XML elements, without a single root element.
+To edit the source as a document in a validating XML editor, you need to wrap the snippet in a root element.
+This package supplies example XML documents that use the root element name ac:confluence.
You will need:
+confluence-page-template.xml.
+<!-- Replace this comment with your page source -->
+<ac:confluence> start tag and the </ac:confluence> end tag.
+Do not select <ac:confluence> or </ac:confluence>.
+The Confluence WebDAV plugin serves page source in the same manner as the Confluence Source Editor plugin: as an XML snippet (without a root element), rather than an XML document; and with the file extension .txt, rather than, say, .xml.
+Similar to the previous procedure for working with the Confluence Source Editor plugin, you need to wrap the snippet in a root element, and then unwrap it before saving it back to Confluence.
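The wrap-and-unwrap round trip can be sketched in Python. This is only an illustration, not part of the package: the function names `wrap_snippet` and `unwrap_document` are hypothetical, and the root element, DOCTYPE, and namespace URIs are copied from the example documents described in this readme. The sketch uses plain string handling rather than an XML parser, so character entity references in the snippet are left untouched.

```python
# Hypothetical helpers (not part of the package): wrap a Confluence page
# source snippet in the ac:confluence root element, and unwrap it again.
HEADER = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<!DOCTYPE ac:confluence SYSTEM "confluence.dtd">\n'
    '<ac:confluence\n'
    '  xmlns:ac="http://www.atlassian.com/schema/confluence/4/ac/"\n'
    '  xmlns:ri="http://www.atlassian.com/schema/confluence/4/ri/"\n'
    '  xmlns="http://www.atlassian.com/schema/confluence/4/">\n'
)
FOOTER = '\n</ac:confluence>\n'


def wrap_snippet(snippet: str) -> str:
    """Return a complete XML document that contains the snippet."""
    return HEADER + snippet + FOOTER


def unwrap_document(document: str) -> str:
    """Return everything between the root start tag and the root end tag."""
    # "<ac:confluence" does not match the DOCTYPE line, which has no "<"
    # immediately before the element name.
    start_tag = document.index('<ac:confluence')
    body_start = document.index('>', start_tag) + 1
    body_end = document.rindex('</ac:confluence>')
    # Drop only the newlines that wrapping added around the snippet.
    return document[body_start:body_end].strip('\n')


snippet = '<p>Contents of page body</p>'
document = wrap_snippet(snippet)
print(unwrap_document(document))
```

The unwrapped text is what you would paste back into the Confluence Source Editor or save back via WebDAV.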
+I have added a comment to the Confluence Storage Format page requesting a change to the behavior of the WebDAV plugin, so far without response from Atlassian.
+To validate an XML document, an XML editor needs to know where to find the DTD/XSD files.
+The supplied file confluence-page-template.xml contains explicit references to confluence.dtd and confluence.xsd:
+<!DOCTYPE ac:confluence SYSTEM "confluence.dtd">
+<ac:confluence ...
+  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+  xsi:schemaLocation="http://www.atlassian.com/schema/confluence/4/ac/ confluence.xsd">
+This technique requires the DTD/XSDs to be in the same directory as the document (.xml). (You can also use relative or absolute references to another directory.)
+However, if your XML editor supports catalogs, then your documents do not need to explicitly refer to the correct location of the DTD/XSDs. Instead, the XML editor uses a catalog to locate these files.
+A catalog is a file that maps:
+(This specific mapping behavior applies to files that you are accessing via a file system, which is typical of a document editing environment, rather than via the web.)
+The supplied file confluence-page-template-for-catalog.xml is an example of a document that you can use with a catalog.
+The schemaLocation attribute is the same as before; the only difference is the DOCTYPE, which contains an FPI (formal public identifier) followed by a system identifier:
+<!DOCTYPE ac:confluence PUBLIC "-//Atlassian//Confluence 4 Page//EN" "http://www.atlassian.com/schema/confluence/4/confluence.dtd">
+
+(The system identifier does not, in practice, need to point to an actual resource; the XML editor will, by preference, attempt to locate the DTD via the catalog, using the FPI.)
+A catalog is supplied in the file catalog.xml.
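To make the mapping idea concrete, here is a minimal Python sketch of a catalog lookup. It is a simplification, not how any particular XML editor resolves identifiers: the catalog content in `SAMPLE_CATALOG` is an assumption written for illustration (the supplied catalog.xml may differ), and the function names `resolve_public` and `resolve_uri` are hypothetical. The only names taken from outside this readme are the `public` and `uri` entry types and the namespace URI, which come from the OASIS XML Catalogs specification.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the OASIS XML Catalogs specification.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Illustrative catalog content (an assumption; the supplied catalog.xml may differ).
SAMPLE_CATALOG = """
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <public publicId="-//Atlassian//Confluence 4 Page//EN" uri="confluence.dtd"/>
  <uri name="http://www.atlassian.com/schema/confluence/4/ac/" uri="confluence.xsd"/>
</catalog>
"""


def resolve_public(catalog_root, public_id):
    """Map a DOCTYPE public identifier (FPI) to a file reference, or None."""
    for entry in catalog_root.iter(f"{{{CATALOG_NS}}}public"):
        if entry.get("publicId") == public_id:
            return entry.get("uri")
    return None


def resolve_uri(catalog_root, name):
    """Map a URI (for example, a namespace URI) to a file reference, or None."""
    for entry in catalog_root.iter(f"{{{CATALOG_NS}}}uri"):
        if entry.get("name") == name:
            return entry.get("uri")
    return None


catalog = ET.fromstring(SAMPLE_CATALOG)
print(resolve_public(catalog, "-//Atlassian//Confluence 4 Page//EN"))
print(resolve_uri(catalog, "http://www.atlassian.com/schema/confluence/4/ac/"))
```

A real resolver does more than this sketch, such as resolving relative file references against the location of the catalog and following nextCatalog entries.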
The method for making a catalog available to an editor depends on the particular editor.
+Edit RootCatalog.xml in the XMLSpy installation folder (for example, C:\Program Files\Altova\XMLSpy2012\), and insert the following element before the </catalog> end tag:
+<nextCatalog catalog="drive letter:/directory path to Confluence schema package/catalog.xml"/>
+Restart XMLSpy.
+Click Plugins ► Plugin Options... ► XML ► Catalogs, click the + (plus sign) button, and then select the supplied catalog.xml.
+Restart jEdit.
+Tip: In my experience, clicking Plugins ► XML ► Clear Resource Cache is not always effective.
+| File | Description |
+|---|---|
+| catalog.xml | OASIS XML catalog |
+| confluence.dtd | Document type definition (DTD) |
+| confluence.xsd | Master XSD (W3C XML 1.0 Schema document) |
+| confluence2xhtml.xsl | XSLT stylesheet: transforms Confluence storage format into XHTML (more like the rich text editor display than a "preview") |
+| confluence-page-example.xml | Example Confluence page source XML document |
+| confluence-page-example-with-xslt.xml | Example Confluence page source XML document containing a reference to the XSLT stylesheet confluence2xhtml.xsl (tip: open this in Firefox) |
+| confluence-page-example-with-xslt-wiki.xml | Example Confluence page source XML document containing a reference to the XSLT stylesheet wikifier/confluence2wiki.xsl (tip: open this in Firefox) |
+| confluence-page-template.xml | Example Confluence page source XML document with empty body |
+| confluence-page-template-for-catalog.xml | Example Confluence page source XML document for use with catalog (no explicit reference to local copy of DTD/XSD) |
+| confluence-ri.xsd, confluence-xhtml.xsd, xml.xsd | Other XSD files used by the master XSD |
+| index.html | The file you are reading now |
+| wikifier/* | Wikifier source files |
+| xhtml1-lat1.ent, xhtml1-special.ent, xhtml1-symbol.ent | XHTML character entity definitions (used in the Confluence DTD) |
+Wikifier is a web-based test harness for the XSLT stylesheet confluence2wiki.xsl (supplied in the wikifier directory) that transforms Confluence XML into wiki markup.
+To convert Confluence XML to wiki markup:
+To copy the wiki markup from Wikifier to your clipboard:
+Wikifier does not send your Confluence XML to a server; all processing of your Confluence XML is done client-side.
+I have tested Wikifier in the following web browsers: IE9, and current "production" versions of Chrome, Firefox, and Safari (all on Windows).
+Tip: Instead of using Wikifier, you can paste your XML into the supplied file confluence-page-example-with-xslt-wiki.xml, and then open the file in Firefox to see the converted wiki markup.
+Wikifier is a minimal test harness for the XSLT stylesheet I have developed to convert Confluence XML to wiki markup.
+The XSLT stylesheet is by no means complete. I welcome your feedback. If Wikifier does not correctly convert some Confluence XML, please let me know, and I will do what I can (no promises, though).
+Wikifier is not a replacement for the Confluence 3 wiki markup editor view.
+Wikifier is only a test harness; it is not intended to be a fully fledged application. The XSLT stylesheet took me about a day and a half to develop; same again for Wikifier (my cross-browser JavaScript coding skills are both rudimentary and extremely rusty!).
+I developed the XSLT stylesheet for the following use case: to copy relatively simple content from the current version of Confluence (4) to the current version of JIRA (4).
+I did not develop the XSLT stylesheet to bring wiki markup back to Confluence. However, if you want to, you can paste the wiki markup from Wikifier into:
+The XSLT stylesheet could be used as the "heart" of a plugin, although I have no immediate plan to do that.
+Bear in mind the following comment from Paul Curren (Atlassian):
+"Wiki markup can only represent a subset of what can be represented in XHTML."
+What Paul says is true. For example, if you paste Confluence XML table markup (which is, in this specific case, XHTML markup) with merged cells into Wikifier, the resulting wiki markup will retain the table cell contents, but will not retain the merged cell formatting.
+Also from Paul, also true:
+"just about anything is possible with appropriate development effort"
+I can imagine that it might, perhaps, be possible to develop new Confluence 3 macros to match new capabilities in Confluence 4, and have an XSLT stylesheet transform such Confluence 4 syntax into these new macros (or even, say, as the contents of the existing Confluence 3 HTML macro). For me, though, this is a purely academic issue. I can now copy content from Confluence to JIRA, which is what I was after.
+To validate Confluence page source, you need either:
+You cannot validate Confluence page source with only the XSD, because Confluence page source can contain references to character entities (for example, &mdash;) that can only be defined in a DTD. If you attempt to validate Confluence page source that contains character entity references, but you do not refer to the DTD, you will get an XML parsing error.
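The parsing error is easy to reproduce. The following Python sketch uses the standard library parser, which is not a validating parser and is used here only to show the well-formedness failure; the function name `parses_without_dtd` is hypothetical. A named entity such as `&mdash;` is rejected when no DTD declares it, whereas a numeric character reference and the predefined XML entities are accepted.

```python
import xml.etree.ElementTree as ET


def parses_without_dtd(xml_text: str) -> bool:
    """Return True if the text can be parsed with no DTD available."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # For an undeclared named entity, the parser reports an
        # "undefined entity" error.
        return False


# Named entity with no DTD to define it: parsing fails.
print(parses_without_dtd("<p>Before &mdash; after</p>"))
# Numeric character reference: no DTD required.
print(parses_without_dtd("<p>Before &#8212; after</p>"))
# Predefined XML entity: no DTD required.
print(parses_without_dtd("<p>Fish &amp; chips</p>"))
```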
The DTD/XSDs have been tested using the following Confluence page source:
+A Confluence export consists of a .zip file containing a single entities.xml file, which contains page source inside <property name="body"> elements.
+I wrote a script to extract the contents of these elements into individual XML files, which I then validated using a Windows batch (.bat) file that calls xmllint, like this:
+for %%f in ("*.xml") do "%_xmllint%" --noent --nowarning --noout --loaddtd --schema "%_schema%" "%%f" >> c:\temp\log.txt 2>&1
+where:
+_xmllint contains the path of the xmllint executable
+_schema contains the path of confluence.xsd (xmllint does not look for schemaLocation attributes)
+<!DOCTYPE ac:confluence SYSTEM "confluence.dtd">
+for the --loaddtd option.
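For readers not on Windows, the batch loop could be approximated in Python as follows. This is a sketch under assumptions, not the script used for the testing described here: the function names `build_xmllint_command` and `validate_directory` and their parameters are hypothetical, while the xmllint options are the same ones used in the batch file above.

```python
import subprocess
from pathlib import Path


def build_xmllint_command(xmllint: str, schema: str, xml_file: str) -> list:
    """Build the same xmllint invocation as the batch file, as an argument list."""
    return [
        xmllint,
        "--noent", "--nowarning", "--noout", "--loaddtd",
        "--schema", schema,
        xml_file,
    ]


def validate_directory(xmllint: str, schema: str, directory: str, log_path: str) -> None:
    """Run xmllint on each .xml file in a directory, appending output to a log."""
    with open(log_path, "a", encoding="utf-8") as log:
        for xml_file in sorted(Path(directory).glob("*.xml")):
            command = build_xmllint_command(xmllint, schema, str(xml_file))
            # Send both stdout and stderr to the log, like ">> log 2>&1".
            subprocess.run(command, stdout=log, stderr=subprocess.STDOUT, check=False)


print(build_xmllint_command("xmllint", "confluence.xsd", "page.xml"))
```

Calling `validate_directory` requires xmllint to be installed; the argument-list form avoids shell quoting problems with paths that contain spaces.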
+Tip: The <property name="body"> elements in entities.xml wrap the page contents in a CDATA section. The page contents can also contain CDATA sections. However, nested CDATA sections are not allowed in XML, so, to avoid this issue, the ]]> terminators of the CDATA sections in the page contents contain a space (]] >). When extracting the page contents into individual XML files, you need to remove these spaces.
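The extraction step and the CDATA repair can be sketched in Python. This is not the script mentioned above, only an illustration: the function name `extract_bodies` is hypothetical, and the sample export is an assumption that contains just enough structure to show the technique (a real entities.xml has much more, and the `export` and `object` element names here are placeholders).

```python
import xml.etree.ElementTree as ET

# Illustrative stand-in for entities.xml (an assumption, heavily simplified).
# The inner CDATA terminator is written "]] >" so that it does not end the
# outer CDATA section early.
SAMPLE_EXPORT = """<export>
  <object>
    <property name="body"><![CDATA[<p>Intro</p><ac:plain-text-body><![CDATA[print("hi")]] ></ac:plain-text-body>]]></property>
  </object>
</export>
"""


def extract_bodies(export_xml: str) -> list:
    """Return the page source of each body property, with CDATA terminators restored."""
    root = ET.fromstring(export_xml)
    bodies = []
    for prop in root.iter("property"):
        if prop.get("name") == "body":
            text = prop.text or ""
            # Remove the space inserted into the nested CDATA terminators.
            bodies.append(text.replace("]] >", "]]>"))
    return bodies


for body in extract_bodies(SAMPLE_EXPORT):
    print(body)
```

One caveat of this simple replacement: page content that genuinely contains the characters `]] >` would also be altered.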
Highlighted items in the following listing are my own coinage, not approved by Atlassian:
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE ac:confluence PUBLIC "-//Atlassian//Confluence 4 Page//EN" "http://www.atlassian.com/schema/confluence/4/confluence.dtd">
+<ac:confluence
+  xmlns:ac="http://www.atlassian.com/schema/confluence/4/ac/"
+  xmlns:ri="http://www.atlassian.com/schema/confluence/4/ri/"
+  xmlns="http://www.atlassian.com/schema/confluence/4/">
+<p>Contents of page body</p>
+</ac:confluence>
+Value: ac:confluence
+Notes:
+ac:page might be a better choice of name for this root element (more specific).
+Values:
+http://www.atlassian.com/schema/confluence/4/ac/
+http://www.atlassian.com/schema/confluence/4/ri/
+http://www.atlassian.com/schema/confluence/4/
+Notes:
+--nowarning option to suppress warnings about loading a schema that has already been loaded. Suggestions welcome.
+<p>Text inside a paragraph element.</p>
+illegal text
I wish to thank the following people for their assistance and/or encouragement in developing this package:
+Most recent changes first:
+| Date (yyyy-mm-dd) | Description |
+|---|---|
+| 2024-09-26 | Stored these files in GitHub, published as a GitHub Pages site. |
+| 2012-06-06 | Prettified this readme. |
+| 2012-05-01 | |
+| 2012-04-23 | |
+| 2012-04-18 | |
+| 2012-04-13 | |
+| 2012-04-12 | First draft of DTD. |
This package and its contents are distributed under the BSD 2-Clause license (also known as the Simplified BSD license):
+Copyright © 2012, Fundi Software
+All rights reserved.
+Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+This schema document describes the XML namespace, in a form suitable for import by other schema documents.
+See http://www.w3.org/XML/1998/namespace.html and http://www.w3.org/TR/REC-xml for information about this namespace.
+Note that local names in this namespace are intended to be defined only by the World Wide Web Consortium or its subgroups. The names currently defined in this namespace are listed below. They should not be used with conflicting semantics by any Working Group, specification, or document instance.
+See further below in this document for more information about how to refer to this schema document from your own XSD schema documents and about the namespace-versioning policy governing this schema document.
+xml:lang denotes an attribute whose value is a language code for the natural language of the content of any element; its value is inherited. This name is reserved by virtue of its definition in the XML specification.
+Attempting to install the relevant ISO 2- and 3-letter codes as the enumerated possible values is probably never going to be a realistic possibility.
+See BCP 47 at http://www.rfc-editor.org/rfc/bcp/bcp47.txt and the IANA language subtag registry at http://www.iana.org/assignments/language-subtag-registry for further information.
+The union allows for the 'un-declaration' of xml:lang with the empty string.
+xml:space denotes an attribute whose value is a keyword indicating what whitespace processing discipline is intended for the content of the element; its value is inherited. This name is reserved by virtue of its definition in the XML specification.
+xml:base denotes an attribute whose value provides a URI to be used as the base for interpreting any relative URIs in the scope of the element on which it appears; its value is inherited. This name is reserved by virtue of its definition in the XML Base specification.
+See http://www.w3.org/TR/xmlbase/ for information about this attribute.
+xml:id denotes an attribute whose value should be interpreted as if declared to be of type ID. This name is reserved by virtue of its definition in the xml:id specification.
+See http://www.w3.org/TR/xml-id/ for information about this attribute.
+xml:Father denotes Jon Bosak, the chair of the original XML Working Group. This name is reserved by the following decision of the W3C XML Plenary and XML Coordination groups:
+In appreciation for his vision, leadership and dedication the W3C XML Plenary on this 10th day of February, 2000, reserves for Jon Bosak in perpetuity the XML name "xml:Father".
+
+This schema defines attributes and an attribute group suitable for use by schemas wishing to allow xml:base, xml:lang, xml:space or xml:id attributes on elements they define.
+
+To enable this, such a schema must import this schema for the XML namespace, e.g. as follows:
+<schema . . .>
+  . . .
+  <import namespace="http://www.w3.org/XML/1998/namespace"
+          schemaLocation="http://www.w3.org/2001/xml.xsd"/>
+or
+  <import namespace="http://www.w3.org/XML/1998/namespace"
+          schemaLocation="http://www.w3.org/2009/01/xml.xsd"/>
+Subsequently, qualified reference to any of the attributes or the group defined below will have the desired effect, e.g.
+<type . . .>
+  . . .
+  <attributeGroup ref="xml:specialAttrs"/>
+will define a type which will schema-validate an instance element with any of those attributes.
+In keeping with the XML Schema WG's standard versioning policy, this schema document will persist at http://www.w3.org/2009/01/xml.xsd.
+At the date of issue it can also be found at http://www.w3.org/2001/xml.xsd.
+The schema document at that URI may however change in the future, in order to remain compatible with the latest version of XML Schema itself, or with the XML namespace itself. In other words, if the XML Schema or XML namespaces change, the version of this document at http://www.w3.org/2001/xml.xsd will change accordingly; the version at http://www.w3.org/2009/01/xml.xsd will not change.
+Previous dated (and unchanging) versions of this schema document are at:
+