XML is about elements and attributes that are arranged hierarchically.
You can mark up any data with any tag you wish, following this element example:
<tag>this text here is text</tag>
The data is between an opening tag and a closing tag. If there is no data, the element can be written either as <tag></tag> or as a single tag: <tag/>
You can also add comments:
<!-- This is a comment -->
Tags can have child tags
In this example the <car> tag is the root element. Every XML file needs exactly one root element.
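The parent/child structure can also be explored from a program. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The <car> document and its child tags (engine, wheel) are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <car> is the single root element, the children are made up.
CAR_XML = """<car>
  <engine>diesel</engine>
  <wheel>front left</wheel>
  <wheel>front right</wheel>
</car>"""

root = ET.fromstring(CAR_XML)

# Tags of the direct children, in document order.
child_tags = [child.tag for child in root]

print(root.tag)     # tag of the root element
print(child_tags)   # tags of its child elements
```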
It can happen that a child tag sits in the middle of the parent's text. This is quite common in DocBook when a URL link is inserted inside the text. This splits the text in two: the text and the tail text. The tail text needs special handling. It would be logical to link the tail text to the parent's text, but processing software usually does not do this. The reason is that the parent's text could be split multiple times, so the parent would need an array of tail texts. Instead, a single optional tail text is added to the child tag.
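Python's standard xml.etree.ElementTree module is one example of processing software that stores the tail text on the child. The sketch below assumes a DocBook-like paragraph; the para and ulink tags and the URL are only illustrative.

```python
import xml.etree.ElementTree as ET

# Hypothetical DocBook-like paragraph with a link in the middle of the text.
PARA_XML = '<para>See <ulink url="http://example.org/">this page</ulink> for details.</para>'

para = ET.fromstring(PARA_XML)
link = para[0]              # the only child element

parent_text = para.text     # text before the child tag
link_text = link.text       # text inside the child tag
tail_text = link.tail       # text after the child's closing tag, stored on the child

# itertext() walks text and tail texts in document order, so joining
# the pieces gives back the complete paragraph text.
full_text = "".join(para.itertext())

print(repr(parent_text), repr(link_text), repr(tail_text))
print(full_text)
```

If the paragraph contained several links, each child would carry its own tail, which is why the parent does not need an array of tail texts.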
Tags can have attributes:
<tag attribute_name="attribute value">
But it should be considered that the same thing can be achieved using child elements.
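As a small sketch of that trade-off, the same value can be read from either form with Python's standard xml.etree.ElementTree module. The car and colour names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Two hypothetical ways to store the same information.
with_attribute = ET.fromstring('<car colour="red"/>')
with_child = ET.fromstring('<car><colour>red</colour></car>')

colour_a = with_attribute.get("colour")     # value of the attribute
colour_b = with_child.findtext("colour")    # text of the child element

print(colour_a, colour_b)
```

One difference: an attribute name can appear only once per element and holds plain text, while a child element can be repeated and can contain further tags.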
For documents, the original XML should just define the meaning of the data (semantics) and should avoid defining how it is visually or audibly presented. Using stylesheets, the XML can then be translated into XML in which the tags defining the semantics are replaced by tags describing how to print it.
To check if the sequence of tags is correct (after having installed expat), type xmlwf <filename>.xml
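A similar well-formedness check can be sketched in Python, whose standard xml.etree.ElementTree parser is built on expat in CPython. The helper name is_well_formed and the sample strings are made up for illustration. This is not a replacement for xmlwf, just the same idea in script form.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, None

ok, _ = is_well_formed("<car><wheel/></car>")
bad, message = is_well_formed("<car><wheel></car>")   # <wheel> is never closed
two_roots, _ = is_well_formed("<car/><car/>")         # more than one root element

print(ok, bad, two_roots)
print(message)   # parser error with line and column of the problem
```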
Another program is xmllint (emerge libxml2).
Some tools create XML files with everything on a single line, without any line breaks. To clean up such XML, the program tidy (package htmltidy) can be used. Type man tidy and use it as tidy <filename>. To check and format XML files: tidy -xml -i -w 80 <filename>
If you are sure, you can write the result back to the file instead of just showing it on the screen by using the -m option: tidy -xml -i -w 80 -m <filename>
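Where tidy is not available, re-indenting a one-line file can be sketched with Python's standard xml.dom.minidom module. This is a stand-in for the idea, not what tidy does internally, and the input string is hypothetical.

```python
from xml.dom import minidom

# Hypothetical one-line input, as some tools produce it.
ONE_LINE = "<car><engine>diesel</engine><wheel>front left</wheel></car>"

# Parse the document and serialise it again with two-space indentation.
pretty = minidom.parseString(ONE_LINE).toprettyxml(indent="  ")

print(pretty)
```

Unlike tidy -w 80, this sketch does not wrap long lines. It only puts each element on its own indented line.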
XML editors such as screem can start tidy from their GUI.
To validate that the XML has correct syntax, use the online tool http://validator.w3.org/
or xmllint from dev-libs/libxml2.
A DTD (Document Type Definition) defines the rules for the tags in an XML file. It can be included in the XML file itself or be a separate file that the XML file links to. To validate against the DTD that the XML file declares:
xmllint --valid --noout <my>.xml
or name the DTD explicitly:
xmllint --valid --dtdvalid <mywww>.dtd --noout <my>.xml