It’s pretty popular these days to kick XML. All the cool kids are doing it, and they don’t seem to discriminate between its ‘good’ and ‘bad’ uses.
The general comment seems to be: "XML is bad m’kay?"
Well, I disagree.
Sure, it’s a crappy programming language and a verbose wire protocol, it’s got a stack of super complicated standards, and it’s been perverted into doing things you only find in the darker parts of the net. But it’s pretty good at one thing: representing structured documents.
You know, what it actually was designed for.
So if you want to store (and/or exchange) Invoices, Insurance Policies, Application Forms, basically anything where structure is essential but variation inherent – XML is going to be your best friend.
But it’s not all smooth sailing, and over the years of doing business-to-business XML-based messaging across the Retail, Supply Chain, Transport and Insurance industries I’ve learned a few things:
Things not to worry about
Put concerns about performance and tooling to one side – performance isn’t an issue (most of the time) and no matter what you choose, all the tooling still sucks (sadly). Learn to work with it.
Things you should worry about
Don’t neglect your information design
Too often tools like XML Spy give people the impression that schema design is a task that can be undertaken without too much thought. (And in some cases, that can be true.)
But it’s not true for any document of any complexity. Just as you shouldn’t let random developers make uncontrolled additions to your domain model or database schema, you must protect your schemas with governance and review processes.
If you can get a single person to be your "Schema Lord", do it.
Version control your schema from the get go
You haven’t got it right and it will have to change. Count it in from the beginning.
Personally, I like the month/year style of dating the XSD files (myschema_092009.xsd). And remember, you only need to change versions if it’s a breaking change.
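One wrinkle with the month/year naming above is that the file names don’t sort chronologically as plain strings (092009 sorts after 012010). Here is a minimal sketch of how you might parse the date out of the file name and pick the newest schema. The function names and the exact `<name>_<MMYYYY>.xsd` pattern are my assumptions for illustration, not a convention any tool enforces.

```python
import re
from datetime import date

# Assumed naming convention: <name>_<MMYYYY>.xsd, e.g. myschema_092009.xsd
VERSIONED_XSD = re.compile(r"^(?P<name>.+)_(?P<month>\d{2})(?P<year>\d{4})\.xsd$")


def schema_version(filename):
    """Return (name, first day of the version month), or None if the name does not match."""
    match = VERSIONED_XSD.match(filename)
    if match is None:
        return None
    month = int(match.group("month"))
    if not 1 <= month <= 12:
        return None
    return match.group("name"), date(int(match.group("year")), month, 1)


def latest_schema(filenames, name):
    """Pick the most recently dated file for the given schema name, or None if there is none."""
    candidates = []
    for filename in filenames:
        version = schema_version(filename)
        if version is not None and version[0] == name:
            candidates.append((version[1], filename))
    # Compare on the parsed date, not on the raw file name.
    return max(candidates)[1] if candidates else None
```

If you would rather have names that sort correctly on their own, a year-first variant (myschema_200909.xsd) avoids the parsing step entirely.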
All industry ‘standards’ suck
In my world, suck is defined as designed by committee. As such they will be large, complicated and likely to be so abstract they do not relate to your problem.
Tooling often chokes on them as well. With all those imports and custom types, you will often hit breakages in your tooling. Don’t trust it until you see it working yourself.
Furthermore, with the standards being so big, there is a very strong chance that not all parties will have the same implementation of the ‘standard’.
Approach adoption of industry ‘standards’ with a pinch of salt and a box of aspirin – and be ready to cater for differences between implementations with any of your trading partners.
XML schema validation is only the beginning of your validation requirements
XML schema validation only validates a limited number of things; it doesn’t cover all of your problems. Furthermore, most schema validation feedback is total garbage – you can’t show it to an end user, and you likely can’t even send it to a technical party. Keep this in mind.
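To make that concrete, here is a rough sketch of a second validation pass that runs after schema validation and returns messages a human can actually read. It checks the kind of cross-field rule that XSD 1.0 cannot express (the total has to match the sum of the lines). The invoice layout, the element names (`line`, `quantity`, `unitPrice`, `total`) and the function name are all made up for the example, and the code assumes the document has already passed schema validation, so those elements exist and hold numbers.

```python
from decimal import Decimal
import xml.etree.ElementTree as ET


def check_invoice_rules(xml_text):
    """Return a list of plain-language problems; an empty list means the rules passed.

    Assumes the document is already schema-valid (hypothetical invoice layout).
    """
    problems = []
    root = ET.fromstring(xml_text)
    line_sum = Decimal("0")
    for number, line in enumerate(root.findall("line"), start=1):
        quantity = Decimal(line.findtext("quantity"))
        unit_price = Decimal(line.findtext("unitPrice"))
        if quantity <= 0:
            problems.append(f"Line {number}: quantity must be greater than zero.")
        line_sum += quantity * unit_price
    total = Decimal(root.findtext("total"))
    if total != line_sum:
        problems.append(
            f"Invoice total {total} does not match the sum of the lines ({line_sum})."
        )
    return problems
```

The point is less the specific rule and more the shape: business rules live in your own code, and the messages are written for whoever has to fix the document.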
You might need to deal with stuff that doesn’t validate
Got that fancy schema? Lots of mandatory elements because you want to be super awesome? Great. I need to save the document as ‘draft’ – it’s a form that the user hasn’t finished filling in yet.
Ah, that tooling you’ve got where the first thing it does is validate against the schema, yeah, it breaks now….
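One way around this, sketched below, is to treat completeness as a separate question from well-formedness: a draft only has to parse, while a submitted document also has to have its mandatory fields filled in. The field paths, the status values and the function names are hypothetical, and in a real system the ‘submitted’ branch would probably be your full schema validation rather than this simple presence check.

```python
import xml.etree.ElementTree as ET

# Hypothetical mandatory fields for an application form.
REQUIRED_PATHS = ["applicant/name", "applicant/dateOfBirth", "policy/startDate"]


def missing_fields(xml_text):
    """List mandatory fields that are absent or empty.

    Raises ET.ParseError if the document is not even well-formed.
    """
    root = ET.fromstring(xml_text)
    missing = []
    for path in REQUIRED_PATHS:
        value = root.findtext(path)
        if value is None or not value.strip():
            missing.append(path)
    return missing


def can_save(xml_text, status):
    """Drafts only need to be well-formed; anything else must also be complete."""
    missing = missing_fields(xml_text)  # still parses drafts, so garbage is rejected
    if status == "draft":
        return True
    return not missing
```

The list returned by `missing_fields` doubles as a ‘what’s left to fill in’ hint for the user, which is more useful than a schema validator’s complaint about the first missing element.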
The people you deal with might not ‘get it’
There are a lot of people out there for whom XML is just a bunch of < and > signs and nothing else. CDATA? Validation? Attributes? Root elements? Encoding? Meh, it’s not important.
So before you design your super format, check that the other party knows the basics and isn’t just treating XML as a CSV format with lots of extra crud in it.
Sadly, your righteous indignation over their inability to conform to "the right way" may be morally correct, but it tends not to convince others, especially if they are paying the bills.
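If you need a quick way to show a trading partner what ‘the basics’ means, escaping is a good place to start. The sketch below contrasts gluing strings together (the XML-as-CSV approach) with letting a library build the document. The `supplier`/`name` elements and the function names are invented for the example.

```python
import xml.etree.ElementTree as ET


def naive_xml(name):
    """String concatenation: roughly what 'CSV with angle brackets' looks like."""
    return "<supplier><name>" + name + "</name></supplier>"


def proper_xml(name):
    """Let the library build the document so special characters get escaped."""
    root = ET.Element("supplier")
    ET.SubElement(root, "name").text = name
    return ET.tostring(root, encoding="unicode")


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False
```

Feed both functions a name like `Smith & Sons <Retail>`: the concatenated version stops being well-formed because of the bare `&`, while the library version escapes it and the original text survives a round trip through the parser.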
Conclusion
This is all pretty off-the-cuff stuff on my part. If you feel the need to have me elaborate, leave a comment.
One Response
You may want to look at vtd-xml as the state of the art in XML processing, consuming far less memory than DOM
http://vtd-xml.sf.net