What’s that smell?

There is a school of thought that XML is a be all and end all file format, suitable for anything requiring any kind of structured text file. This is OK when the text to be marked up is, well, text. Like this document. What bothers me is that there is a very very common class of file that stores essentially key-value pairs. Typically this is configuration information for a program. Crucially, it’s also typically information that requires hand editing.

An example is assigning values to variables. In a programming language, it might be done like this:

Value = 1;
Colour = "red";

where you don’t actually need the equals signs, the quotes or the semi-colons. In XML, this tends to come out like this:

<Value>1</Value>
<Colour>red</Colour>

or this

<entry Value="1" Colour="red" />

or this

<entry key="Value" value="1" />
<entry key="Colour" value="red" />

or, even

<entry>
  <key>Value</key>
  <value>1</value>
</entry>
<entry>
  <key>Colour</key>
  <value>red</value>
</entry>

In fact, if you take the view that the content should still be there when the markup is removed then that last one is the “right” one.

Alternatives

There are a whole bunch of far more suitable formats for key-value pairs. One of the most persuasive (to me) is the .ini format that used to come with Windows:

# This is a comment
[Section]
Value = 1
Colour = red

In fact, it’s this format that’s also used by Torvalds in git:

[core]
        repositoryformatversion = 0
        filemode = true
[remote "origin"]
        url = git+ssh://www.site.org/blah.git
        fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
        remote = origin
        merge = refs/heads/master

ALSA has a nice enough format:

pcm.!default {
   type             plug
   slave.pcm       "softvol"
}

pcm.softvol {
   type            softvol
   slave {
       pcm         hw:UA25EX
   }
   control {
       name        "SoftMaster"
       card        1
   }
}

Not even Timber Nerdsley uses XML for key-value pairs. CSS is key-value pairs and looks like this:

body {
        margin-left: 3em;
        margin-right: 3em;
        margin-top: 3em;
        margin-bottom: 3em;
        font-family: sans-serif;
}
img.mugshot {
        margin-right: 1ex;
}

Even the guys who write XML Schema acknowledge that XML is not the thing to use to write it: http://en.wikipedia.org/wiki/RELAX_NG

Edit: (years later) JSON is the thing, it’s awesome. Lube speaks JSON natively.