Empty elements
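An empty element is an element without content, written as a start tag immediately followed by an end tag:
<value></value>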
Or the (equivalent) shorthand notation:
<value/>
XML parsers will treat these fragments the same way, and the difference is considered merely syntactical.
However, the XML specification states that the shorthand notation SHOULD be used only for elements that are declared empty (i.e. elements that must be empty, as opposed to those that can be empty). It is safe to assume that this subtlety is lost on the best of us, so you should not draw conclusions from either notation.
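To see why neither notation tells you anything, here is a minimal sketch using Python's standard xml.etree.ElementTree module (the choice of parser is an assumption made for illustration; other parsers should behave the same way):

```python
import xml.etree.ElementTree as ET

# Parse both notations of an empty element.
long_form = ET.fromstring("<value></value>")
short_form = ET.fromstring("<value/>")

# Neither element carries text or children, so the parsed results are the same.
print(long_form.text, short_form.text)   # None None
print(len(long_form), len(short_form))   # 0 0

# Serializing either one yields the same output; the original notation is lost.
print(ET.tostring(long_form) == ET.tostring(short_form))  # True
```

Once the document has been parsed, there is no way to tell which notation the producer used.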
You may wonder what empty elements are good for – especially the ones that must be empty. Why not just omit them?
One of the reasons is that empty elements are often used as ‘markers’ or ‘Booleans’ that need to be present for effect, but have no content. For example, in XHTML markup, the <br/> element inserts a line break, and the <hr/> element displays a horizontal rule (on visual display units anyway). For these applications, emptiness will obviously not work out the same as absence.
More surprisingly though, empty is not always empty. When the XML is governed by a schema declaring that the element has default or fixed content, the parser will actually insert that content when the element is empty.
For the purpose of data exchange, you will seldom encounter marker elements or default content in a schema (the idea of implicit values is somewhat alarming anyway). So usually, empty elements will be exactly that: empty.
Unfortunately, not all content models support empty elements. For an element that is defined as a string, empty content is just fine. But “empty numbers” or “empty dates” are not valid, and a validating XML parser will reject them.
So it gets better (or worse, depending on your point of view). Enter nillable elements.
Nillable elements
To fill this gap, XML Schema allows us to define “nillable” elements, which can be nil. Nil elements are easily spotted in XML instances, as they carry an attribute indicating nil content, like so (the namespace declaration has been omitted):
<value xsi:nil="true"></value>
or
<value xsi:nil="true"/>
Note that “nillability” short-circuits all limitations on the content model, i.e. it applies to all types (not just strings), simple or complex, even if the content model explicitly forbids empty content.
Also note that nil elements must be empty (as opposed to empty elements, which we now know could be anything, including nil). In other words, nil is empty, but empty is not always nil.
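The distinction between nil and empty can be sketched in a few lines of Python using xml.etree.ElementTree. The helper names is_nil and is_empty are hypothetical, and the sketch only inspects the instance; it does not check whether a schema actually declares the element nillable:

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
NIL_ATTR = f"{{{XSI_NS}}}nil"

def is_nil(elem):
    """Return True if the element carries xsi:nil with a true value."""
    # xs:boolean allows the lexical forms "true" and "1" for true.
    return elem.get(NIL_ATTR, "false").strip() in ("true", "1")

def is_empty(elem):
    """Return True if the element has no child elements and no text at all."""
    # Whitespace-only text is deliberately NOT treated as empty here.
    return len(elem) == 0 and not (elem.text or "")

doc = ET.fromstring(
    f'<root xmlns:xsi="{XSI_NS}">'
    '<a xsi:nil="true"/>'
    '<b/>'
    '<c>42</c>'
    '</root>'
)
for child in doc:
    print(child.tag, is_nil(child), is_empty(child))
# a True True
# b False True
# c False False
```

Element a is both nil and empty, while b is empty without being nil.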
Exercise for the reader: are the following valid?
<value xsi:nil="false">42</value>
and
<value xsi:nil="false"/>
Now that you know what empty (including nil) content can be used for, you may think it is quite rare in application integration. After all, how often do we actually need to exchange the “empty value”?
Reality check
In fact, empty content is rather common. And it may cause you a lot of trouble if you are not prepared for it. Here are a few reasons for the (often unexpected) occurrence of empty elements, in addition to legitimate ones.
Design consequence
When this happens a lot, it may be advisable to review the design, e.g. consider making these elements optional. If you have no control over this, you should be prepared to deal with empty content in the proper way (which depends on the application).
Implementation side-effects or carelessness
Also, the tooling that is used to create the XML may add empty elements when optional content is “mapped” from one source to the other. Again, TIBCO BusinessWorks has caught me off guard with an unfortunate mapping mode on more than one occasion.
Misconception
What can we do?
This is unfortunate, as it places an additional burden on the consumer. Most of the time, these elements have to be ignored – and as a result applications need to check not only for presence but also for non-emptiness (or non-nilness if that is even a word).
This is easily overlooked by developers and may cause you no end of problems. You will appreciate this when the unsuccessful conversion of an empty string to a number type causes your process to fault, or when the empty value, unknowingly propagated by you, wreaks havoc in an application downstream.
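A consumer can guard against this with a small helper that treats absent, nil and empty elements alike. The following is only a sketch in Python with xml.etree.ElementTree; the function name read_int and the sample element names are hypothetical, and whether "no value" is the right interpretation of an empty element depends on the application:

```python
import xml.etree.ElementTree as ET
from typing import Optional

XSI_NIL = "{http://www.w3.org/2001/XMLSchema-instance}nil"

def read_int(parent: ET.Element, tag: str) -> Optional[int]:
    """Return the integer content of a child element, or None when there is no value."""
    elem = parent.find(tag)
    if elem is None:                                   # element is absent
        return None
    if elem.get(XSI_NIL, "false") in ("true", "1"):    # element is explicitly nil
        return None
    text = (elem.text or "").strip()
    if not text:                                       # element is present but empty
        return None
    return int(text)  # still raises ValueError for genuinely malformed numbers

order = ET.fromstring(
    '<order xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">'
    '<quantity>3</quantity>'
    '<discount/>'
    '<weight xsi:nil="true"/>'
    '</order>'
)
print(read_int(order, "quantity"))  # 3
print(read_int(order, "discount"))  # None
print(read_int(order, "weight"))    # None
print(read_int(order, "price"))     # None
```

Calling int() directly on the text of the discount element would raise an error instead, which is exactly the kind of fault described above.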
As always, be liberal in what you accept and conservative in what you produce. Use schema to validate output as well as input (a frightening number of applications fail to do this). Do not create empty content just because you can.
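On the producing side, the simplest way to avoid empty content is to skip the element altogether when there is nothing to put in it. Here is a minimal sketch in Python with xml.etree.ElementTree, assuming the schema declares the elements in question optional (the helper add_optional and the element names are hypothetical):

```python
import xml.etree.ElementTree as ET

def add_optional(parent, tag, value):
    """Append a child element only when there is a value to put in it."""
    if value is None or value == "":
        return None  # nothing to say, so say nothing
    child = ET.SubElement(parent, tag)
    child.text = str(value)
    return child

order = ET.Element("order")
add_optional(order, "quantity", 3)
add_optional(order, "discount", None)  # omitted rather than emitted as <discount/>
print(ET.tostring(order, encoding="unicode"))
# <order><quantity>3</quantity></order>
```

Note that a value of 0 is still written out, since zero is a value and not the absence of one.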
And educate those that do.