childish toys

I count religion but a childish toy, and hold there is no sin but ignorance.
Jew of Malta, Christopher Marlowe

Occasionally, indeed almost cyclically, a big theoretical war erupts on some of the mailing lists I’m on, where someone declares “XML is DEAD: We should all move to using $Thing”. Though to be honest, it could be any format or technology, not just XML.

Sometimes these are well-meaning hunters of the new and shiny: Someone has heard about this brand new shiny $Thing technology and heard that it is the replacement for XML technologies (or whatever existing technologies) and that we should all start using it. With little or no critical examination of their sources, perhaps a shiny YouTube promotional video, they start a long and usually fruitless discussion. One of the reasons that $Thing technology is quicker, shinier, and much more fun is that it has dropped lots of the baggage of the old technology. Eventually people will realise that baggage was there for a reason and slowly add it back, but this time to a framework not designed to incorporate it. People chip in from both sides but the status quo remains.

Sometimes it is naively theoretically-based: Someone notices, or reads about, the inherent problems in XML (or whatever existing technologies) and sees that using $Thing technology doesn’t have those problems (and either doesn’t notice the other problems it does have, or they don’t apply to their narrow use-case). The poster in this case wants to know “Is this really the next big thing?” but is, or should be at least, open to the reasons why it isn’t. This usually brings up discussion by posters on both sides picking flaws in one technology or the other, or recycling long-dead myths. (“XML has a problem with overlapping hierarchies, $Thing doesn’t! Ha!”, “There are lots of solutions to overlapping hierarchies in XML which enable you to use all these nice tools.”, “Ah, but you can’t do stand-off markup in XML or represent a graph!”, “Erm, yes, you can. Honestly, URI-based pointing, out-of-line markup, linking multiple disparate resources by various taxonomies, all common in XML”, etc.) This sniping back and forth is hardly productive and just makes people think there is a problem where there isn’t. People chip in from both sides and the status quo remains.

Sometimes it is sophisticatedly theoretically-based: Some philosophical guru has been studying the various technologies for quite some time and expresses that the problems inherent in one, from their point of view, are dealt with more elegantly in $Thing technology. This is probably true, but is mostly done as a theoretical exercise of trying to perfect the ideal technology and express it in a form that is elegant, beautiful, and rational. More often than not this results in a particular instance of $Thing technology that solves problems that most people didn’t really care about. Although it may be elegant, it is not human readable, and the only software that reads it is the guru’s personal implementation, which works for their use-case. While potentially useful, it is not pragmatic for the majority of people to care about it until it has reached mass adoption. It will never reach mass adoption because this guru, let’s say, isn’t interested in community building. People will gently comfort the technological genius who doesn’t understand why we persist with the well-supported but suboptimal, and the status quo remains.

Sometimes it is religiously-based: A devotee of $Thing technology, or a die-hard opponent of XML (or whatever existing technology), finds some news article or development which they can use to claim the superiority and mass adoption of $Thing technology. The use of $Thing technology in this instance is then cast as a slow but measurable demise of XML (or whatever existing technology). The increase in use of one technology is not necessarily related to the demise of another technology, and this may be misleading for people viewing this exchange. In my opinion it is usually intellectually dishonest to present such a news article or development as the death knell for another technology, especially when both can happily co-exist, and especially when it is done consciously as a technique by the devotee to discredit the existing technology. Dislike of any particular technology because of its flaws is reasonable, but blind dislike is not what users should be basing their technological decisions on. Users of the existing technology defend their conscious decision not to be trendy, and inexperienced users choose $Thing technology because of the hype and then contribute to that hype. People chip in from both sides and try to patiently convert the masses or correct the fallacies of the devotee, but the status quo remains.

Sometimes it is implementation-based: A programmer needing to process lots of XML (or whatever existing technology) runs into a problem, often a limitation caused by a poor implementation of the libraries they are using, and either bemoans it or is advised that $Thing technology doesn’t have these problems and, look, comes with a wonderful library of tools. People counter by showing how the problem would be easier to solve if the programmer had been using the appropriate tools. Others point to the growing code base for $Thing technology and get shown the huge number of tools for the existing technology. The code base might be growing because people have seen that $Thing technology is missing support for all their special cases, and thus it agglomerates bits and pieces of new areas of support. People chip in from both sides with examples of how their chosen technology does one thing better, or how they are all bad, but the status quo remains.

There are of course other ways this arises and plays out, and different actors playing many parts. In my case I find almost any of these discussions pathetically juvenile. How many times do we have to say it:

IT ISN’T ABOUT THE FORMAT, IT IS ABOUT GRANULARITY OF INFORMATION AND APPROPRIATE TECHNOLOGIES FOR APPROPRIATE USES!

Instead, let’s help each other do good and useful things rather than needlessly wasting spare cycles proclaiming death or triumph of one useful format or technology over another. To do otherwise is tiring, pathetic, and just a waste of everyone’s time. Sure, any new project needs to get good and sensible advice on what formats, technologies, and methodologies are suitable for their project. These are rarely determined by abstract considerations of the inherent properties of the format, technology, or methodology, however, and instead are determined by what the staff already know, what the local infrastructure will support, and what will give the most useful answers to the research questions with the least amount of investment. The childish toys alluded to by my appropriation of Marlowe here aren’t the formats themselves, but the arguments people have about them. Sure, geek out and enjoy the intricacies of your chosen technologies, but if you find yourself posting to a mailing list about how your $Thing technology is better than some other technology, please have a long hard look in the mirror and go do something more useful with your life.

Although I spend a lot of my time immersed in the world of one particular technology, XML, that doesn’t mean I need to believe it is the right and true answer for all situations. If I was designing a mobile phone app, at the time of writing I’d almost certainly be using JSON or an SQLite DB for data storage. If I was constructing an ontology then RDF would be the way to go. If I want to structurally query a large number of documents I’d use a NoSQL Document Database like eXist-db. If I’m encoding dearly held and deeply nested semantics in the text of a medieval manuscript … I would have to be a complete lunatic to sit down and hand-encode this in JSON or RDF. In that case I’d use TEI XML, because of the power of schema constraints and validation to enforce consistency, its human-readable nature, and its resilience for long-term preservation. I’d do this knowing that I could convert my work to any format I needed based on the granularity of the markup I provided. They are all appropriate at different times and places; what the base storage format is depends a lot on your project’s needs, the sources of information, and the technology stack you have available to you.

The growth in one of these or other technologies doesn’t ipso facto indicate in any way the ‘death’ of any other technology. Technology will always change, things will always move on. But we should never celebrate even the perception of the marginalisation of widely adopted formats: migrating existing legacy resources in a useful way, no matter what the format, takes time and effort. Some technologies will eventually become less supported and the mainstream will be using one new $Thing technology or other. This has happened before and will happen again.

I’m all for pointing out the technologies chosen by good and interesting projects, and for learning from their successes and, even more importantly, their failures. But this should be done honestly with a desire for education, not blindly with trolling attempts to start a war where there really isn’t any argument.

More people are using $Thing technology? This well-known project has adopted $Thing technology as one of their outputs? Great! Isn’t it good that people are using all these wonderful technologies… what is even more important is what they are doing with them! Maybe we should ask them why they chose to do that rather than making assumptions about the lifecycle of technologies? In fact, one thing that contributes to the strength and power of modern information systems design is the ability to work between multiple formats simultaneously and sometimes even automatically. For example, to store something as XML, but auto-generate a subset of that as JSON metadata to then use in a web frontend to link to some PDFs and EPUBs generated from the same XML. To say that “if you want to use JSON you shouldn’t be using XML” is like saying “if you want to play with a Princess Elsa Doll, then you shouldn’t play with a Batman Action Figure”. It is nonsensical. Anyone who thinks you can’t play with both just doesn’t deserve the oxygen of being listened to.
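
To make the XML-to-JSON example above a little more concrete, here is a minimal sketch in Python using only the standard library. Everything specific in it is an assumption made up for illustration: the element names in the sample document, the `metadata_as_json` function, and the derived PDF and EPUB file names are hypothetical, not taken from any real project or schema.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical source document: the element and attribute names are
# invented for this sketch and do not follow any particular schema.
SOURCE_XML = """\
<document id="ms-001">
  <header>
    <title>A Medieval Manuscript</title>
    <author>Anonymous</author>
    <date when="1350"/>
  </header>
  <body>
    <p>Deeply nested text mentioning <name type="person">Barabas</name>.</p>
  </body>
</document>
"""


def metadata_as_json(xml_text: str) -> str:
    """Extract a small metadata subset from the XML and serialise it as JSON."""
    root = ET.fromstring(xml_text)
    header = root.find("header")
    doc_id = root.get("id")
    metadata = {
        "id": doc_id,
        "title": header.findtext("title"),
        "author": header.findtext("author"),
        "date": header.find("date").get("when"),
        # Collect every person name marked up anywhere in the document.
        "people": [n.text for n in root.iter("name") if n.get("type") == "person"],
        # Assumed naming convention for derivatives generated from the same XML.
        "formats": {"pdf": f"{doc_id}.pdf", "epub": f"{doc_id}.epub"},
    }
    return json.dumps(metadata, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(metadata_as_json(SOURCE_XML))
```

The XML stays the single source of record; the JSON is a disposable view of it that a web frontend could load to list documents and link to their other outputs. How much you can pull out this way depends on the granularity of the markup you put in.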
