HTML Made Special

Naïve Versioning

Versioning, as explained previously, is important enough that it deserves to be done right. Yet some languages have taken a rather naïve approach to it, typically a version or data-foo-version attribute on the root element, or some other simplistic scheme built on the mere presence of a version indicator. That is fine if the purpose is to fail immediately when a given version is not supported (which is a bit rude and only really justified in extreme cases), but it produces no useful effect if the intent is to allow processors to work across versions.

Indeed, what is such a processor to do if it sees a version attribute with a value greater than the version it supports? Nothing useful comes to mind, short of warning the user that there may be rendering issues, a message that the user will either ignore or find alarming, and that helps in neither case. Conversely, if the version attribute points to an earlier version, should features from later versions that nevertheless appear in the content be ignored? Should older bugs be emulated? Either would make implementations unduly complex.

When producing content, it is readily accepted that using the smallest possible version that includes all of the needed functionality is good practice, as it enables the widest use by older implementations. But doing so properly requires authors to know, for a given list of language constructs, the lowest version number that comprises them all. That is asking a lot, and in practice authors will likely fall back to using the highest version number that they can get away with. Either way, version information will often be out of sync with the actual content (through error, bit rot, organic growth, or because the content is composed from multiple sources). This tendency is strong enough that relying on version metadata is largely useless, unless it is applied to content that is exclusively produced under tight control by programmatic means, a situation that is exceedingly rare for web formats.
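To make concrete what is being asked of authors, here is a minimal sketch in Python. The table of constructs and version numbers is entirely hypothetical and stands in for knowledge that an author would have to carry around (or look up) for a real language. The lowest sufficient version is simply the latest "introduced in" version among the constructs actually used.

```python
# Hypothetical table: the version in which each construct first appeared.
# A real language would need such a table to be published and maintained.
INTRODUCED_IN = {
    "para": 1,
    "list": 1,
    "table": 2,
    "video": 3,
}


def lowest_sufficient_version(constructs_used):
    """Return the smallest version that includes every construct used."""
    used = list(constructs_used)
    unknown = [name for name in used if name not in INTRODUCED_IN]
    if unknown:
        raise KeyError(f"no version information for: {unknown}")
    # The answer is the latest introduction among the constructs used;
    # content that uses nothing at all is valid in version 1.
    return max((INTRODUCED_IN[name] for name in used), default=1)
```

Adding a single "video" to a document made of paragraphs and lists silently moves the correct answer from 1 to 3, which is exactly the kind of change that leaves a hand-written version attribute stale.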

But even setting aside hard-to-demonstrate assumptions about how putative users behave, the following very short decision tree settles whether a version indicator would be useful:

Are processors expected to process content across version boundaries?

No. Then each version is actually a different language (i.e. they are not mutually intelligible). Just change the language completely. You don’t need a version indicator.
Yes. Then the processors will have to be defined so that they can apply default values, error handling, and other similar rules intended to render unknown constructs sufficiently intelligible. There is nothing that a version indicator could add on top of what they already do. You don’t need a version indicator.
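A processor that does work across versions might look like the following minimal sketch, written in Python with the standard xml.etree.ElementTree module. The vocabulary (p, em, an align attribute) and the fallback rules are assumptions made up for illustration, not taken from any real language. What matters is that unknown elements and unknown attribute values are handled by fixed rules, and that the version attribute is never consulted.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of align values that this processor understands.
KNOWN_ALIGN = {"left", "right", "center"}


def render(element):
    """Render an element to plain text, tolerating unknown constructs."""
    # Collect the element's own text, then each child followed by its tail.
    body = element.text or ""
    for child in element:
        body += render(child) + (child.tail or "")

    if element.tag == "em":
        return f"*{body}*"
    if element.tag == "p":
        align = element.get("align", "left")  # default when absent
        if align not in KNOWN_ALIGN:
            align = "left"  # unknown value: fall back to the default
        return f"[{align}] {body}\n"
    # Unknown element: drop the wrapper but keep processing its content.
    return body
```

Given content such as `<doc version="7"><p align="justify">Hello <blink>new</blink> <em>world</em></p></doc>`, this sketch renders the paragraph left-aligned and keeps the text inside the unrecognised blink element. Reading version="7" first would not have changed any of those decisions.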

As simple as it seems, a zombie of this debate will almost always arise. Kill it early, kill it often.
