Help on Structure node naming strategy :
This strategy drastically reduces the size of the structure nodes, and hence of the stream itself. It addresses several performance issues, including stream size and parsing speed.
By definition, a node is either an element or an attribute.
Element and attribute names are usually chosen to be self-descriptive. While this looks like an advantage over binary formats, it carries a size overhead : even in English, the keywords enclosing content take up a statistically significant amount of space and contribute greatly to the overall stream size. This can be avoided by enforcing the naming strategy described below.
The process of choosing names for all nodes of an Xml structure is based on what the W3C Xml recommendation itself allows. In other words, an element or attribute name can be any combination of letters and digits, provided it starts with a letter or an underscore. With that in hand, why not make these names as short as possible ? Let us take an example with the now well-known bookshop Xml sample :
Let's build a map of name pairs:
 Bookstore  becomes A
 Book       becomes B
 Genre      becomes C
 In_Stock   becomes D
 Title      becomes E
So we get the following equivalent Xml document :
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE A SYSTEM "bookshop_A.dtd">
<A>
   <!-- J&R Booksellers Database -->
   <B C="Thriller" D="Yes">
      <E>The Round Door</E>
   </B>
</A>
which reduces this simple structure by 41 bytes, or 33%.
Again, this transform requires changing the structure of the Xml stream, as explained for other potential gains such as those obtained by flattening patterns (see above in this report). This gain can be combined with the other gains described elsewhere in this report.
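The renaming described in this section can be sketched in Python as follows. This is only an illustration under stated assumptions, not the way the report tool works : the long-named document is reconstructed by applying the name map in reverse to the shortened document shown above (prolog left out), and `shorten`, `TAG` and `ATTR` are hypothetical names. The regular expressions assume well-formed input without namespaces, CDATA sections, or `>` inside attribute values.

```python
import re

# Name map taken from the text above.
NAME_MAP = {"Bookstore": "A", "Book": "B", "Genre": "C",
            "In_Stock": "D", "Title": "E"}

# Assumption: reconstructed from the shortened document by reversing the map.
ORIGINAL = """<Bookstore>
   <!-- J&R Booksellers Database -->
   <Book Genre="Thriller" In_Stock="Yes">
      <Title>The Round Door</Title>
   </Book>
</Bookstore>
"""

# A start, end or empty-element tag: optional slash, name, rest of the tag.
# Comments and the prolog start with "<!" or "<?" and are therefore skipped.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.\-]*)([^<>]*)>")
# An attribute: name, equals sign, quoted value.
ATTR = re.compile(r"([A-Za-z_][\w.\-]*)(\s*=\s*)(\"[^\"]*\"|'[^']*')")


def shorten(xml_text, name_map):
    """Rename element and attribute names, leaving content untouched."""
    def rename_attr(match):
        name, equals, value = match.groups()
        return name_map.get(name, name) + equals + value

    def rename_tag(match):
        slash, name, rest = match.groups()
        return ("<" + slash + name_map.get(name, name)
                + ATTR.sub(rename_attr, rest) + ">")

    return TAG.sub(rename_tag, xml_text)


SHORT = shorten(ORIGINAL, NAME_MAP)
SAVED = len(ORIGINAL.encode("utf-8")) - len(SHORT.encode("utf-8"))
print(SHORT)
print(SAVED)  # 41
```

An element with an end tag saves its name twice, which is why Bookstore alone accounts for 16 of the 41 bytes.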
Help on Nodename minimum size :
This indicator shows the size of the shortest element or attribute name, in bytes (UTF-8 encoding), regardless of the content it encloses. If the value is high, the names are likely to be intuitive and human-readable, but the designer didn't care much about the overhead they add to the overall stream.
Hint : do as much as you can to keep this value low. The node naming strategy can help you stick with a minimum of 1 byte, and hence reduce the stream size.
Help on Nodename maximum size :
This indicator shows the size of the longest element or attribute name, in bytes (UTF-8 encoding), regardless of the content it encloses. If the value is high, the names are likely to be intuitive and human-readable, but the designer didn't care much about the overhead they add to the overall stream.
Hint : do as much as you can to keep this value low. The node naming strategy can help you stick with a low maximum, and hence reduce the stream size.
Help on Nodename Mean size :
This indicator shows the average size of element and attribute names in the Xml stream. It's a key descriptor of the nodename distribution, along with the standard deviation. If the value is high, bad luck : your Xml stream carries a large overhead.
Hint : do as much as you can to keep this value low. The node naming strategy can help you stick with a low mean, and hence reduce the stream size.
Help on Nodename Standard Deviation :
This indicator shows how the sizes of element and attribute names are distributed across the Xml structure. It's a key descriptor of the nodename distribution, along with the mean. If the standard deviation is low, say below 1, the structure is compact and stays around the mean. In other words, names are of similar size throughout the structure. On the contrary, if the standard deviation is high, the distribution is spread out and flat, which means that names have quite different sizes. Questionable.
NB: mathematically speaking, the standard deviation is the square root of the variance, which is the second central moment of the name size distribution.
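The four name size indicators (minimum, maximum, mean and standard deviation) can be sketched in Python as follows. This is a minimal illustration, not the report tool's implementation. It assumes that every element and every attribute counts once, that no namespaces are used, and that the population standard deviation is meant; `name_sizes` is a hypothetical helper, and the sample is the Book fragment used elsewhere in this help.

```python
import statistics
import xml.etree.ElementTree as ET

SAMPLE = """<Book Genre="Thriller" In_Stock="Yes">
  <Title>The Round Door</Title>
</Book>"""


def name_sizes(xml_text):
    """Return the UTF-8 byte size of every element and attribute name."""
    root = ET.fromstring(xml_text)
    sizes = []
    for element in root.iter():
        # One entry per element, whether or not it has an end tag.
        sizes.append(len(element.tag.encode("utf-8")))
        # One entry per attribute of that element.
        sizes.extend(len(name.encode("utf-8")) for name in element.attrib)
    return sizes


SIZES = name_sizes(SAMPLE)            # Book, Genre, In_Stock, Title
MINIMUM = min(SIZES)                  # 4
MAXIMUM = max(SIZES)                  # 8
MEAN = statistics.mean(SIZES)         # 5.5
STD_DEV = statistics.pstdev(SIZES)    # 1.5
print(MINIMUM, MAXIMUM, MEAN, STD_DEV)
```

With a standard deviation of 1.5 around a mean of 5.5, this fragment already shows names of noticeably different sizes. After the node naming strategy is applied, every name is 1 byte long and the standard deviation drops to 0.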
Help on Occurences of Nodenames :
That's how many times elements or attributes whose names have the given length appear in the Xml stream. For instance, that may be how many times the <book> element and the year attribute appear. In this case, we should look for 'Occurences of nodenames 4-byte sized' in the report.
Next to it is the resulting share of the structure size, also given as a percentage. The size in bytes is not always the number of occurrences times the name size : an element with an end tag has its name written twice, so its contribution must be doubled in that case.
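The counting rule above, including the doubling for end tags, can be sketched in Python as follows. This is an illustration under assumptions : the sample document is hypothetical (it only reuses the <book> and year names from the text), `occurrences_by_name_size` is an invented helper, and the regular expressions assume well-formed input without namespaces, CDATA sections, or `>` inside attribute values.

```python
import re
from collections import defaultdict

# Hypothetical sample: one <book> with an end tag, one empty <book/>.
SAMPLE = """<bookstore>
  <book year="2001"><title>The Round Door</title></book>
  <book year="1999"/>
</bookstore>"""

TAG = re.compile(r"<(/?)([A-Za-z_][\w.\-]*)([^<>]*)>")
ATTR = re.compile(r"([A-Za-z_][\w.\-]*)\s*=\s*(?:\"[^\"]*\"|'[^']*')")


def occurrences_by_name_size(xml_text):
    """Map each name size to its node count and the bytes its names use."""
    stats = defaultdict(lambda: {"occurrences": 0, "bytes": 0})
    for slash, name, rest in TAG.findall(xml_text):
        size = len(name.encode("utf-8"))
        stats[size]["bytes"] += size       # every written tag name costs bytes
        if slash:
            continue                       # end tag: more bytes, no new node
        stats[size]["occurrences"] += 1
        for attr_name in ATTR.findall(rest):
            attr_size = len(attr_name.encode("utf-8"))
            stats[attr_size]["occurrences"] += 1
            stats[attr_size]["bytes"] += attr_size
    return dict(stats)


STATS = occurrences_by_name_size(SAMPLE)
for size in sorted(STATS):
    print(size, STATS[size])
```

In this sample the 4-byte names occur four times (two book elements, two year attributes) but take 20 bytes rather than 16, because the first book element also writes its name in an end tag.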
Help on Listing of elements and attributes with a given name size :
The node names enclosed in ( and ) are those that appear in the Xml stream with the given name size.
Help on Total size of structure :
The total size of the structure is the number of bytes in the Xml stream taken up by element and attribute markup, regardless of the content itself. In other words, the total size of this structure :
<Book Genre="Thriller" In_Stock="Yes">
  <Title>The Round Door</Title>
</Book>
is the same as that of this one :
<Book Genre="" In_Stock="">
  <Title></Title>
</Book>
Next to the size in bytes is the equivalent percentage of the overall stream size.
Hint : if this percentage is significant, say above 30%, you may well question the structure of the Xml stream with respect to its ratio of real information to garbage.
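The structure size and its share of the stream can be sketched in Python as follows. This is a rough illustration, not the report tool's method. It assumes that whitespace between tags, comments and the prolog are not counted as structure, and that the input is well-formed with no `>` inside attribute values; `structure_size` is a hypothetical helper.

```python
import re

FULL = ('<Book Genre="Thriller" In_Stock="Yes">\n'
        '  <Title>The Round Door</Title>\n'
        '</Book>')
EMPTY = ('<Book Genre="" In_Stock="">\n'
         '  <Title></Title>\n'
         '</Book>')

# Start, end and empty-element tags; "<!" and "<?" constructs are skipped.
TAG = re.compile(r"</?[A-Za-z_][^<>]*>")
# A quoted attribute value, to be emptied before measuring the tag.
ATTR_VALUE = re.compile(r"\"[^\"]*\"|'[^']*'")


def structure_size(xml_text):
    """Bytes used by tags once attribute values are emptied."""
    total = 0
    for match in TAG.finditer(xml_text):
        bare_tag = ATTR_VALUE.sub('""', match.group(0))
        total += len(bare_tag.encode("utf-8"))
    return total


SIZE_FULL = structure_size(FULL)      # 49
SIZE_EMPTY = structure_size(EMPTY)    # 49, same structure without content
SHARE = 100 * SIZE_FULL / len(FULL.encode("utf-8"))
print(SIZE_FULL, SIZE_EMPTY, round(SHARE, 1))
```

Under these assumptions the structure of this small fragment takes well over the 30% mentioned in the hint, which is typical of short documents with long names and little content.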
Help on Nodename Gain :
This is the gain, in percent, obtained by enforcing the node naming strategy described above. Next to it is the resulting size in bytes of the structure alone once updated.
This gain can be combined with the other gains described elsewhere in this report.