Friday, May 27, 2022

The advent of SDA

A long time ago, sometime in 2008, I was convinced that XML was over-complicated and too bloated for simple data transfer or application integration, and that there ought to be a simpler alternative.

At that time, JSON was not as widespread as it is nowadays, and I entirely overlooked it. So rather than seeing it as the obvious alternative and going about my day, I designed my own "language".

After completing the specification, I realized I had reinvented the wheel, but by that time it was already too late and SDA had been conceived. Not being content with a specification alone, I then set out to write a parser to process SDA content.

Incidentally, SDA is short for Structured (or Simple) DAta, and it looks like this:


addressbook {
    contact {
        firstname "Alice"
        phonenumber "06-11111111"
        phonenumber "06-22222222"
    }
    contact {
        firstname "Bob"
        phonenumber "06-33333333"
        phonenumber "06-44444444"
    }
}

This might represent your contacts in a format suitable to some hypothetical application, and it would be trivial to extend this with addresses and all kinds of other information. The SDA format doesn't get more complicated than this, so I am sure you get the idea.
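To give an impression of how little it takes to read this format, here is a minimal sketch of a parser in Python. It is not the actual SDA parser (that one lives in the Java sources linked at the end of this post), and it only covers what the example above shows: a node is a name, optionally followed by a quoted value, optionally followed by a block of child nodes in curly braces. Escape sequences, comments and anything else the real specification may define are assumed away, and the names `tokenize`, `parse_nodes` and `parse_sda` are made up for this illustration.

```python
import re

# A token is a brace, a double-quoted string, or a bare name.
# Assumption: strings contain no escaped quotes.
TOKEN = re.compile(r'\s*(?:([{}])|"([^"]*)"|([^\s{}"]+))')


def tokenize(text):
    """Yield (kind, value) pairs; kind is 'brace', 'string' or 'name'."""
    text = text.rstrip()
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected input at offset {pos}")
        brace, string, name = match.groups()
        if brace is not None:
            yield ("brace", brace)
        elif string is not None:
            yield ("string", string)
        else:
            yield ("name", name)
        pos = match.end()


def parse_nodes(tokens, index=0, nested=False):
    """Parse sibling nodes from tokens[index]; return (nodes, next_index)."""
    nodes = []
    while index < len(tokens):
        kind, value = tokens[index]
        if (kind, value) == ("brace", "}"):
            if not nested:
                raise ValueError("unbalanced '}'")
            return nodes, index + 1
        if kind != "name":
            raise ValueError(f"expected a node name, got {value!r}")
        node = {"name": value, "value": None, "children": []}
        index += 1
        # Optional quoted value directly after the name.
        if index < len(tokens) and tokens[index][0] == "string":
            node["value"] = tokens[index][1]
            index += 1
        # Optional block of child nodes.
        if index < len(tokens) and tokens[index] == ("brace", "{"):
            node["children"], index = parse_nodes(tokens, index + 1, nested=True)
        nodes.append(node)
    if nested:
        raise ValueError("missing '}'")
    return nodes, index


def parse_sda(text):
    """Parse SDA-like text into a list of nested dictionaries."""
    nodes, _ = parse_nodes(list(tokenize(text)))
    return nodes


sample = '''
addressbook {
    contact {
        firstname "Alice"
        phonenumber "06-11111111"
        phonenumber "06-22222222"
    }
}
'''
book = parse_sda(sample)[0]
contact = book["children"][0]
print(contact["children"][0]["value"])  # Alice
```

Each node ends up as a dictionary with a name, an optional value and a list of children, so repeated nodes such as phonenumber simply appear more than once in that list.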


My inner programmer especially likes the curly braces. It is certainly easier to read than XML, and less bloated when compared to:


<addressbook>
    <contact>
        <firstname>Alice</firstname>
        <phonenumber>06-11111111</phonenumber>
        <phonenumber>06-22222222</phonenumber>
    </contact>
    <contact>
        <firstname>Bob</firstname>
        <phonenumber>06-33333333</phonenumber>
        <phonenumber>06-44444444</phonenumber>
    </contact>
</addressbook>


In fact, I think it looks better than JSON, too:


{
  "addressbook": {
    "contact": [
      {
        "firstname": "Alice",
        "phonenumber": [
          "06-11111111",
          "06-22222222"
        ]
      },
      {
        "firstname": "Bob",
        "phonenumber": [
          "06-33333333",
          "06-44444444"
        ]
      }
    ]
  }
}

Thumbs up for the braces. The array notation is nifty, though in large and/or nested arrays it is hard to keep track of what you are looking at, without the object names. And what's with all those punctuation marks?

So there you have it, SDA. I expect nothing short of world domination somewhere around the year 2032.

But seriously, at that time it didn't feel like SDA was going to make a difference anywhere, and it still does not. It might have been somewhat useful for human-readable configuration files but nowadays YAML seems to be more than appropriate for that. So what I was left with was a pet project and an exercise in writing a parser.

I more or less forgot about the whole thing until 2020, when I came across the Java sources in a back-up. And before I knew it, I found myself designing a schema language to describe SDA content, which I called SDS - Simple Data Schema. 

Which looks like this:


schema {
    node {
        name "addressbook"
        node {
            name "contact" occurs "0..*"
            node { name "firstname" type "string" }
            node { name "phonenumber" type "string" occurs "1..*" }
        }
    }
}


This formally describes our hypothetical address book, and as you can see, SDS itself is written in SDA, very much like XSD is written in XML and JSON Schema in JSON.

If you think I was taking this thing way too seriously, you are probably right. But this was during the COVID-19 pandemic, and I really needed to do something.

Once the SDS specification was complete I wrote a parser for it, as well as a validator to check the syntactical correctness of SDA data against a corresponding SDS.

Is all this useful to you? Probably not.

But it's out there.


https://github.com/hclbaur/sda-core

https://github.com/hclbaur/sds-core


Sunday, April 25, 2021

Configuring SSL for EMS connections in TIBCO off-the-shelf adapters

When I started working with TIBCO products in 2004, the running gag was that on every project there should be a consultant hired directly from TIBCO, to help with the more advanced - and conspicuously undocumented - features of the software. Although the software was deceptively easy to install, configure and use, snags and gotchas were never far away, and you needed said consultant to help you out (or to call the helpline available only to insiders). After hours of fruitless trial and error it would usually turn out that some "secret" property in a TRA file was enough to make things work (and make you tear out your hair in frustration).

I was reminded of this when I was trying to configure SSL for an SAP/R3 Adapter with EMS connections in a BW 5.14 project. Although the adapter configuration does not depend on a shared EMS connection resource, configuration looks similar enough. So I approached it in the same way: by referring to a Trusted Certificates directory internal to the project, and creating the infamous global variable BW_GLOBAL_TRUSTED_CA_STORE, which we all know will be used at run-time to locate the external directory holding the trusted CA certificates.

Except, of course, that this won't work. 
The adapter SDK pre-dates BW by at least 5 years, and doesn't know this trick. If you missed this, don't feel bad - it wasn't included in the documentation prior to version 7.2.0, which is the second to last version compatible with BW5. Anyway, the documentation clearly states that you need to (surprise!) add a property to a TRA file, and create a special global variable. Specifically, it tells you to

  • add this property to the designer.tra: RuntimeExternalCertificatesFeature true,
  • then add a global variable named RuntimeCertificatesDirectory, and 
  • refer to %%RuntimeCertificatesDirectory%% in the Trusted Certificates Folder field.

Except, of course, that this won't work either. 
Because it should be java.property.RuntimeExternalCertificatesFeature. Without the prefix, Designer will remain blissfully ignorant of your efforts. When in doubt, open the AdapterDefinitions/<adapter>.adr3 file in your project, and look for a line reading:

<AEService:runtimeCertificateDir>%%RuntimeCertificatesDirectory%%</AEService:runtimeCertificateDir>

which is ultimately what will make the adapter read the external CA certificates at run-time.

Except, of course, when it still doesn't.

Maybe you assumed that (like BW_GLOBAL_TRUSTED_CA_STORE) the global variable must contain a URL, like file:///<directory>? If so, you should change that to a regular directory path.

If at this point the adapter still refuses to start, or - when run on the command line - it throws an unhelpful exception at you, don't tear out your hair just yet. 

This is where the aforementioned consultant would step in and reveal - grinning - that trusted certificates MUST have a filename extension of .pem, for SDK adapters to find them...
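If you would rather not wait for the consultant, a quick check of the certificate directory is easy to script. The sketch below is plain Python and has nothing to do with TIBCO tooling; the function name is made up, and it assumes the extension has to be exactly .pem in lower case (whether the SDK is case-sensitive about this is something I have not verified).

```python
from pathlib import Path


def certificates_without_pem_extension(directory):
    """Return names of regular files in `directory` that do not end in .pem.

    Assumption: the match is case-sensitive, so 'ca.PEM' is reported too.
    """
    return sorted(
        path.name
        for path in Path(directory).iterdir()
        if path.is_file() and path.suffix != ".pem"
    )
```

Point it at the directory that RuntimeCertificatesDirectory refers to; an empty list means every file at least has the extension the adapter is looking for.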

Man, those were the days.





Thursday, June 27, 2019

Connectivity and port forwarding for Kafka clients


When I started with Kafka, I had some questions about connectivity, and ran into seemingly inexplicable issues when trying to set up port forwarding for clients that have no direct access to the brokers or zookeeper. During my search I noticed that I wasn’t the only one struggling with this, so I decided to dedicate a short blog to this topic after I figured it out.

The picture below shows a small Kafka cluster of 2 brokers and a zookeeper, all with default port settings; both brokers listen on port 9092 and zookeeper on port 2181. Also we have a client that wishes to connect to the cluster.


Connectivity-wise you should be aware that:

-    brokers need to be able to access each other, not just zookeeper.
-    clients need to be able to access all brokers, not just the one they bootstrap from.
-    clients do not have to use zookeeper (at least not since API release 0.9.0).

If you run Kafka in an environment that is heavily firewalled, this picture should tell you which connectivity to arrange.

So what if a client has no direct access to the cluster? This may arise when you want to use some Kafka GUI or the kafkacat command line tool on your laptop from an office network. If you are allowed access by means of a bastion server (jump host, stepping stone, etc.) you can forward a local port to a remote one over a secure SSH connection.

However, from the picture it should be obvious that this will never work when both brokers are running on port 9092 (on different servers). After all, you cannot just forward one local port to two remote destinations. And even if you could (with a load balancer) the client would have no way to specify which broker to connect to. Incidentally, this is why you should not use a load balancer to access a Kafka cluster.

One solution that may work for you is to put the brokers on different ports. This is shown in the picture below:


If – for instance – you tell broker2 to listen on port 9093, you can set up your port-forwarding so that local port 9092 relays to broker1:9092 and local port 9093 to broker2:9093, avoiding a conflict.

This can be arranged simply by changing just one property in the server.properties of broker2 (this example uses SASL over SSL, but your mileage may vary):

listeners=SASL_SSL://:9093

With this set-up, your local client can connect to either localhost:9092 or localhost:9093 or even localhost:2181 if you are port-forwarding to zookeeper as well. Remember that broker1 needs connectivity to broker2 on port 9093, otherwise the whole thing won’t work.

Note:
  • you will need to modify your local hosts file so that broker1 and broker2 both resolve to localhost (127.0.0.1). This is because by default, brokers will not advertise their listeners on localhost.
  • clients with direct access need to use the correct bootstrapping endpoint (or use zookeeper) to obtain a valid broker endpoint. So they must be configured to use broker1:9092 and/or broker2:9093 (or zookeeper:2181).

Since this trick involves changing the hosts file (which may be OK on your laptop but not elsewhere) it is considered more of a hack than a solution.

There is an alternative that uses a standard feature of the broker to advertise its listeners. In this scenario, you keep the default port 9092 for clients with direct access and for inter-broker communication, and you configure additional ports for clients that use port forwarding.

This can be arranged in the server.properties of broker1 (the relevant changes are the PORTFWD entries):

listeners=SASL_SSL://:9092,PORTFWD://:9093

advertised.listeners=SASL_SSL://:9092,PORTFWD://localhost:9093


listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL,PORTFWD:SASL_SSL


Make similar changes for broker2, but use port 9094 instead of 9093. Afterwards, your set-up will look like this:

  
Arrange your port-forwarding so that local port 9093 relays to broker1:9093 and local port 9094 to broker2:9094. Your forwarded clients can connect to either localhost:9093 or localhost:9094, while clients with direct access can still use broker1:9092 or broker2:9092. Again, keep in mind that the brokers need connectivity to their peers on ports 9093 and 9094.
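As an illustration of that port-forwarding, the sketch below builds the corresponding OpenSSH command line in Python. The bastion account `user@bastion.example.com` is a placeholder and the helper name is made up; the only real-world parts are the standard `ssh` options `-L local_port:host:port` (forward a local port) and `-N` (do not run a remote command).

```python
import shlex


def ssh_forward_command(bastion, forwards):
    """Build an ssh argument list that forwards local ports via `bastion`.

    `forwards` maps a local port to a (remote_host, remote_port) pair.
    """
    command = ["ssh", "-N"]  # -N: only forward ports, run no remote command
    for local_port, (host, port) in sorted(forwards.items()):
        command += ["-L", f"{local_port}:{host}:{port}"]
    command.append(bastion)
    return command


command = ssh_forward_command(
    "user@bastion.example.com",  # hypothetical bastion account
    {9093: ("broker1", 9093), 9094: ("broker2", 9094)},
)
print(shlex.join(command))
# ssh -N -L 9093:broker1:9093 -L 9094:broker2:9094 user@bastion.example.com
```

The host names broker1 and broker2 are resolved on the bastion, not on your laptop, which is why they do not need to be known locally.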

The upside of this scenario is that you do not need to modify your hosts file, because ports 9093 and 9094 are advertised on localhost by the brokers.

The downside is that zookeeper will dutifully report all broker endpoints (listener groups), but the clients may have no way of knowing which one to use. So you will probably lose the option to bootstrap a connection through zookeeper.

If you have remarks or find a mistake, let me know!

Wednesday, July 3, 2013

Using XSLT to transform a WSDL (so SAP will understand).

If you ever created a SOAP service in TIBCO BusinessWorks and used the WSDL that can be exported from the service agent to generate a service consumer in SAP, you will be familiar with this issue: SAP is very sensitive to the order of the elements in the WSDL - and as it happens, TIBCO BW puts the elements in the "wrong" order.

Specifically, SAP expects the order to be: types, message, portType, binding and service. Anything else will result in an error. The topic of this post is not who's wrong and who's right, but I'm pretty sure SAP is wrong in assuming that the order of the elements after types is relevant.

Obviously you can just edit the WSDL and change the order of the elements manually, but if you have to do this often... So, here is an XSLT that I routinely use to re-arrange the WSDL and keep SAP happy (with the element order that is; it may complain about a million other things):

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/" xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/">

    <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
    <xsl:template match="/wsdl:definitions">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:copy-of select="wsdl:types"/>
            <xsl:copy-of select="wsdl:message"/>
            <xsl:copy-of select="wsdl:portType"/>
            <xsl:copy-of select="wsdl:binding"/>
            <xsl:copy-of select="wsdl:service"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>


I use free tools like XSLT Test and Firstobject XML to do my work, but you should be fine using your favorite tools (caveat emptor).  

Make sure the encoding of the output WSDL is actually UTF-8 without BOM, or you will upset SAP again!
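If your tool of choice insists on writing a byte order mark, removing it afterwards is simple. The following is a small Python sketch (the function names are made up for this example) that verifies a file decodes as UTF-8 and rewrites it without a leading BOM.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return `data` without a leading UTF-8 byte order mark, if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def rewrite_without_bom(path):
    """Rewrite the file at `path` in place, minus any UTF-8 BOM."""
    with open(path, "rb") as handle:
        data = handle.read()
    data.decode("utf-8")  # raises UnicodeDecodeError if this is not UTF-8
    with open(path, "wb") as handle:
        handle.write(strip_utf8_bom(data))
```

Note that this only looks at the bytes; it does not check that the encoding named in the XML declaration matches.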

So far, so good. Now let's try something more rewarding. On many occasions I had a requirement to secure the service with a WSS signing policy. That is, the service will accept only signed requests. But how do we get SAP to sign the request? As it turns out, you must provide the policy bindings in the WSDL, so SAP will add them during import (you will probably have to tweak some settings in SOA manager as well, but I'm no SAP expert).

TIBCO BusinessWorks does not add these policies when you export the WSDL, and adding them manually in the format that SAP expects is rather cumbersome and prone to error. So for this I use an XSLT as well, which I'll include below for your convenience.

Have fun!

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
xmlns:wsp="http://schemas.xmlsoap.org/ws/2004/09/policy"
xmlns:wsse="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd"
xmlns:wsu="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd">

    <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
    <xsl:template match="/wsdl:definitions">
        <xsl:copy>
      
            <xsl:copy-of select="@*"/>                      
            <wsp:UsingPolicy wsdl:required="true"/>
          
            <!-- binding policies -->
            <xsl:for-each select="wsdl:binding">
                <wsp:Policy>
                    <xsl:attribute name="wsu:Id">
                        <xsl:value-of select="concat('BN_BN_',@name)"/>
                    </xsl:attribute>
                    <saptrnbnd:OptimizedXMLTransfer uri="http://xml.sap.com/2006/11/esi/esp/binxml" xmlns:saptrnbnd="http://www.sap.com/webas/710/soap/features/transportbinding/" wsp:Optional="true"/>
                    <saptrnbnd:OptimizedXMLTransfer uri="http://www.w3.org/2004/08/soap/features/http-optimization" xmlns:saptrnbnd="http://www.sap.com/webas/710/soap/features/transportbinding/" wsp:Optional="true"/>
                    <wsp:ExactlyOne xmlns:wsp="http://schemas.xmlsoap.org/ws/2004/09/policy" xmlns:sapsp="http://www.sap.com/webas/630/soap/features/security/policy" xmlns:sp="http://docs.oasis-open.org/ws-sx/ws-securitypolicy/200702" xmlns:wsa="http://www.w3.org/2005/08/addressing" xmlns:wst="http://docs.oasis-open.org/ws-sx/ws-trust/200512" xmlns:wsu="http://schemas.xmlsoap.org/ws/2002/07/utility" xmlns:wsx="http://schemas.xmlsoap.org/ws/2004/09/mex">
                        <wsp:All>
                            <sp:AsymmetricBinding>
                                <wsp:Policy>
                                    <sp:InitiatorSignatureToken>
                                        <wsp:Policy>
                                            <sp:X509Token sp:IncludeToken="http://docs.oasis-open.org/ws-sx/ws-securitypolicy/200702/IncludeToken/Never">
                                                <wsp:Policy>
                                                    <sp:WssX509V3Token10/>
                                                </wsp:Policy>
                                            </sp:X509Token>
                                        </wsp:Policy>
                                    </sp:InitiatorSignatureToken>
                                    <sp:AlgorithmSuite>
                                        <wsp:Policy>
                                            <sp:Basic128Rsa15/>
                                        </wsp:Policy>
                                    </sp:AlgorithmSuite>
                                    <sp:Layout>
                                        <wsp:Policy>
                                            <sp:Strict/>
                                        </wsp:Policy>
                                    </sp:Layout>
                                    <sp:IncludeTimestamp/>
                                    <sp:OnlySignEntireHeadersAndBody/>
                                </wsp:Policy>
                            </sp:AsymmetricBinding>
                            <sp:Wss10>
                                <wsp:Policy>
                                    <sp:MustSupportRefKeyIdentifier/>
                                </wsp:Policy>
                            </sp:Wss10>
                            <sp:SignedParts>
                                <sp:Body/>
                                <sp:Header Name="Trace" Namespace="http://www.sap.com/webas/630/soap/features/runtime/tracing/"/>
                                <sp:Header Name="messageId" Namespace="http://www.sap.com/webas/640/soap/features/messageId/"/>
                                <sp:Header Name="CallerInformation" Namespace="http://www.sap.com/webas/712/soap/features/runtime/metering/"/>
                                <sp:Header Name="Session" Namespace="http://www.sap.com/webas/630/soap/features/session/"/>
                                <sp:Header Name="To" Namespace="http://schemas.xmlsoap.org/ws/2004/08/addressing"/>
                                <sp:Header Name="ReplyTo" Namespace="http://schemas.xmlsoap.org/ws/2004/08/addressing"/>
                                <sp:Header Name="From" Namespace="http://schemas.xmlsoap.org/ws/2004/08/addressing"/>
                                <sp:Header Name="Action" Namespace="http://schemas.xmlsoap.org/ws/2004/08/addressing"/>
                                <sp:Header Name="FaultTo" Namespace="http://schemas.xmlsoap.org/ws/2004/08/addressing"/>
                                <sp:Header Name="MessageID" Namespace="http://schemas.xmlsoap.org/ws/2004/08/addressing"/>
                                <sp:Header Name="RelatesTo" Namespace="http://schemas.xmlsoap.org/ws/2004/08/addressing"/>
                                <sp:Header Name="To" Namespace="http://www.w3.org/2005/08/addressing"/>
                                <sp:Header Name="ReplyTo" Namespace="http://www.w3.org/2005/08/addressing"/>
                                <sp:Header Name="From" Namespace="http://www.w3.org/2005/08/addressing"/>
                                <sp:Header Name="Action" Namespace="http://www.w3.org/2005/08/addressing"/>
                                <sp:Header Name="FaultTo" Namespace="http://www.w3.org/2005/08/addressing"/>
                                <sp:Header Name="MessageID" Namespace="http://www.w3.org/2005/08/addressing"/>
                                <sp:Header Name="RelatesTo" Namespace="http://www.w3.org/2005/08/addressing"/>
                                <sp:Header Name="ReferenceParameters" Namespace="http://www.w3.org/2005/08/addressing"/>
                                <sp:Header Name="Sequence" Namespace="http://schemas.xmlsoap.org/ws/2005/02/rm"/>
                                <sp:Header Name="SequenceAcknowledgement" Namespace="http://schemas.xmlsoap.org/ws/2005/02/rm"/>
                                <sp:Header Name="AckRequested" Namespace="http://schemas.xmlsoap.org/ws/2005/02/rm"/>
                                <sp:Header Name="SequenceFault" Namespace="http://schemas.xmlsoap.org/ws/2005/02/rm"/>
                                <sp:Header Name="Sequence" Namespace="http://docs.oasis-open.org/ws-rx/wsrm/200702"/>
                                <sp:Header Name="AckRequested" Namespace="http://docs.oasis-open.org/ws-rx/wsrm/200702"/>
                                <sp:Header Name="SequenceAcknowledgement" Namespace="http://docs.oasis-open.org/ws-rx/wsrm/200702"/>
                                <sp:Header Name="SequenceFault" Namespace="http://docs.oasis-open.org/ws-rx/wsrm/200702"/>
                                <sp:Header Name="UsesSequenceSTR" Namespace="http://docs.oasis-open.org/ws-rx/wsrm/200702"/>
                                <sp:Header Name="UsesSequenceSSL" Namespace="http://docs.oasis-open.org/ws-rx/wsrm/200702"/>
                            </sp:SignedParts>
                        </wsp:All>
                    </wsp:ExactlyOne>
                </wsp:Policy>
            </xsl:for-each>          
          
            <!-- portType policies -->
            <xsl:for-each select="wsdl:portType">
                <wsp:Policy>
                    <xsl:attribute name="wsu:Id">
                        <xsl:value-of select="concat('IF_IF_',@name)"/>
                    </xsl:attribute>
                    <sapsession:Session xmlns:sapsession="http://www.sap.com/webas/630/soap/features/session/">
                        <sapsession:enableSession>false</sapsession:enableSession>
                    </sapsession:Session>
                </wsp:Policy>
            </xsl:for-each>
          
            <!-- operation policies -->
            <xsl:for-each select="wsdl:portType/wsdl:operation">
                <wsp:Policy>
                    <xsl:attribute name="wsu:Id">
                        <xsl:value-of select="concat('OP_IF_OP_',@name)"/>
                    </xsl:attribute>
                    <sapcomhnd:enableCommit xmlns:sapcomhnd="http://www.sap.com/NW05/soap/features/commit/">false</sapcomhnd:enableCommit>
                    <sapblock:enableBlocking xmlns:sapblock="http://www.sap.com/NW05/soap/features/blocking/">true</sapblock:enableBlocking>
                    <saptrhnw05:required xmlns:saptrhnw05="http://www.sap.com/NW05/soap/features/transaction/">no</saptrhnw05:required>
                    <saprmnw05:enableWSRM xmlns:saprmnw05="http://www.sap.com/NW05/soap/features/wsrm/">false</saprmnw05:enableWSRM>
                </wsp:Policy>
            </xsl:for-each>
                      
            <!-- and the rest -->          
            <xsl:copy-of select="wsdl:types"/>
            <xsl:copy-of select="wsdl:message"/>
            <xsl:apply-templates select="wsdl:portType"/>
            <xsl:apply-templates select="wsdl:binding"/>
            <xsl:copy-of select="wsdl:service"/>
          
        </xsl:copy>
    </xsl:template>

    <xsl:template match="wsdl:portType">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <wsp:Policy>
                <wsp:PolicyReference>
                    <xsl:attribute name="URI">
                        <xsl:value-of select="concat('#IF_IF_',@name)"/>
                    </xsl:attribute>
                </wsp:PolicyReference>
            </wsp:Policy>
            <xsl:apply-templates select="wsdl:operation"/>
        </xsl:copy>
    </xsl:template>
   
    <xsl:template match="wsdl:operation">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <wsp:Policy>
                <wsp:PolicyReference>
                    <xsl:attribute name="URI">
                        <xsl:value-of select="concat('#OP_IF_OP_',@name)"/>
                    </xsl:attribute>
                </wsp:PolicyReference>
            </wsp:Policy>
            <xsl:copy-of select="*"/>
        </xsl:copy>
    </xsl:template>
   
    <xsl:template match="wsdl:binding">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <wsp:Policy>
                <wsp:PolicyReference>
                    <xsl:attribute name="URI">
                        <xsl:value-of select="concat('#BN_BN_',@name)"/>
                    </xsl:attribute>
                </wsp:PolicyReference>
            </wsp:Policy>
            <xsl:copy-of select="*"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

 



Monday, April 9, 2012

Empty elements in XML (much ado about nothing)

Most XML tutorials will explicitly mention that empty elements in XML are allowed. And then surprisingly few of them will tell you what they are for. At first glance, empty elements are a concept too trivial to discuss. But now that many vendors support XML-based integration with their products, you may be in for a surprise or two.

Empty elements


As most of you will be aware, it is syntactically valid to produce XML elements that have no content, such as:

<value></value>

Or the (equivalent) shorthand notation:

<value/>

XML parsers will treat these fragments the same way, and the difference is considered merely syntactical.
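You can convince yourself of this with a few lines of Python and the ElementTree parser from the standard library (just one parser, picked for convenience):

```python
import xml.etree.ElementTree as ET

long_form = ET.fromstring("<value></value>")
short_form = ET.fromstring("<value/>")

# Both notations yield an element without text and without children.
print(long_form.text, short_form.text)  # None None
print(len(long_form), len(short_form))  # 0 0

# Once parsed, the original notation is gone: both serialize identically.
print(ET.tostring(long_form) == ET.tostring(short_form))  # True
```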

However, the actual XML specification specifies that the shorthand notation SHOULD be used only for elements that are declared empty (i.e. elements that must be empty as opposed to those that can be empty). It is safe to assume that this subtlety is lost on the best of us, so you should not draw conclusions from either notation.

You may wonder what empty elements are good for – especially the ones that must be empty. Why not just omit them?

One of the reasons is that empty elements are often used as ‘markers’ or ‘Booleans’ that need to be present for effect, but have no content. For example, in XHTML mark-up, the <br/> element inserts a line break, and the <hr/> element displays a horizontal rule (on visual display units anyway). For these applications, emptiness will obviously not work out the same as absence.

More surprisingly though, empty is not always empty. When the XML is governed by a schema declaring that the element has default or fixed content, the parser will actually insert that content when the element is empty.

For the purpose of data exchange, you will seldom encounter marker elements or default content in schema (the idea of implicit values is somewhat alarming anyway). So usually, empty elements will be exactly that: empty.

Unfortunately, not all content models support empty elements. For an element that is defined as a string, empty content is just fine. But “empty numbers” or “empty dates” are indeed frowned upon by an XML parser.

So it gets better (or worse, depending on your point of view). Enter nillable elements.

Nillable elements


Sometimes, empty content is the data. If we need to indicate that a value should be (re)set to some undefined value (like the ‘null’ in databases or many programming languages) we require a way to convey that the value is “explicitly empty” rather than unspecified (for whatever reasons).

Therefore, XML schema allows us to define “nillable” elements, which can contain nil. Nil elements are easily spotted in XML instances, as they contain an attribute indicating nil content, like so (namespace declaration has been omitted):

<value xsi:nil="true"></value>
or
<value xsi:nil="true"/>

Note that “nillability” short-circuits all limitations on the content model, i.e. it applies to all types (not just strings), simple or complex, even if the content model explicitly forbids empty content.

Also note that nil elements must be empty (as opposed to empty elements, which we now know could be anything, including nil). In other words, nil is empty, but empty is not always nil.
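Telling a nil element from a merely empty one comes down to inspecting that attribute. Here is a minimal sketch in Python with ElementTree; the helper name `is_nil` is made up, and it accepts both lexical forms of a true xsd:boolean ("true" and "1").

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"


def is_nil(element):
    """True if the element carries xsi:nil with the value 'true' or '1'."""
    return element.get(f"{{{XSI}}}nil", "false").strip() in ("true", "1")


nil_value = ET.fromstring(f'<value xmlns:xsi="{XSI}" xsi:nil="true"/>')
empty_value = ET.fromstring("<value/>")

print(is_nil(nil_value), is_nil(empty_value))  # True False
```

Note that this only reads the attribute; whether the element was allowed to be nil in the first place is for schema validation to decide.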

Exercise for the reader: are the following valid?

<value xsi:nil="false">42</value>
and
<value xsi:nil="false"/>

Reality check


Now that you know what empty (including nil) content can be used for, you may think it is quite rare in application integration. After all, how often do we actually need to exchange the “empty value”?

In fact, empty content is rather common. And it may cause you a lot of trouble if you are not prepared for it. Here are a few reasons for the - often unexpected - occurrence of empty elements (in addition to legitimate ones).

Design consequence


If the schema declares an element as a mandatory string (or nillable) type, but there is no value available in the integration layer, the only way to escape a run-time validation error is to create empty content. Those familiar with TIBCO BusinessWorks (should) know that this is the default behaviour when you map a non-existing optional element to a required one. In this case it silently inserts an empty element.

When this happens a lot, it may be advisable to review the design, e.g. consider making these elements optional. If you have no control over this, you should be prepared to deal with empty content in the proper way (which depends on the application).

Implementation side-effects or carelessness


Even if elements are declared optional, empty content is often created “by accident”, in particular when not creating it takes additional effort. For example, consider an operation (a function, a subroutine, a method) that is designed to return a string value. If the return value is not explicitly checked, an empty element might be inserted whenever an empty string is returned.

Also, the tooling that is used to create the XML may add empty elements when optional content is “mapped” from one source to the other. Again, TIBCO BusinessWorks has caught me off guard with an unfortunate mapping mode on more than one occasion.

Misconception


Some producers of XML create empty elements because they think they are supposed to - or that providing optional empty elements is in some way superior to omitting them. They might do this as “living proof” that the elements were considered in the process of creation (and not accidentally overlooked). Or they believe that by supplying empty elements they are doing you a favour; now you can easily spot them. The point is - there is not always a point.

What can we do?


It is my experience that due to implementation side-effects and misconception, empty elements are abundant in real-life XML-enabled applications. Those that create the XML are often unprepared to “fix” this because they are not violating any rules – after all, empty elements are allowed.

This is unfortunate, as it places an additional burden on the consumer. Most of the time, these elements have to be ignored – and as a result applications need to check not only for presence but also for non-emptiness (or non-nilness if that is even a word).

This is easily overlooked by developers and may cause you no end of problems. You will appreciate this when the unsuccessful conversion of an empty string to a number type causes your process to fault or when it - unknowingly propagated by you - wreaks havoc in an application downstream.
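On the consuming side, that check can be wrapped in a small helper. The Python sketch below (ElementTree again; the helper name and the order/quantity/price document are invented for the example) treats an absent, nil or empty element as "no value" before any type conversion is attempted. Stripping whitespace is an assumption that suits numbers and dates, but not necessarily free text.

```python
import xml.etree.ElementTree as ET

XSI_NIL = "{http://www.w3.org/2001/XMLSchema-instance}nil"


def usable_text(parent, tag):
    """Return the stripped text of child `tag`, or None if absent, nil or empty."""
    child = parent.find(tag)
    if child is None:
        return None  # element is absent
    if child.get(XSI_NIL, "false").strip() in ("true", "1"):
        return None  # element is explicitly nil
    text = (child.text or "").strip()
    return text or None  # an empty element counts as "no value" as well


order = ET.fromstring("<order><quantity/><price>9.95</price></order>")
quantity = usable_text(order, "quantity")
price = usable_text(order, "price")

# Convert only when there is something to convert.
print(int(quantity) if quantity is not None else None)  # None
print(float(price) if price is not None else None)      # 9.95
```

Calling int() on the empty quantity directly would raise a ValueError, which is exactly the kind of fault described above.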

As always, be liberal in what you accept and conservative in what you produce. Use schema to validate output as well as input (a frightening number of applications fail to do this). Do not create empty content just because you can.

And educate those that do.