This generality comes at a cost, however. When your goal is the interchange of specialised information within a given domain, the full generality of HTML can get in the way. Your applications should not have to sift through some vague information soup in order to extract the data that they require, and equally they need an obvious way of encoding their data in such a way that other applications will reliably understand it.
That is the traditional purview of domain-specific data formats. These are often relatively simple (at least compared to the fullness of HTML) data formats, typically encoded in XML or JSON, with predictable rules that (one hopes) enable the interoperable interchange of information.
Domain-specific data formats are, however, not a panacea. They require conversion to an output format (usually HTML, sometimes PDF) in order to be consumed, which is often a lossy operation. This can make it harder for users to interact with them; it also means that round-tripping them to and from user-friendly formats is an error-prone process. And because they are not directly published on the Web, their content tends to be less discoverable.
“Vernacular HTML” is the growing practice of creating domain-specific data formats using the HTML platform. Its goal is to enable interoperable, specialised data interchange while retaining as much of HTML’s full power as is sensible for the given domain. It is not a technology so much as an incipient body of good practices.
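As a rough illustration of the idea, the sketch below treats an ordinary HTML fragment as a small domain-specific format. The event vocabulary (the "event", "title" and "start" class names) is invented for this example and is not taken from any existing vernacular. The extraction relies only on Python's standard html.parser module, and it assumes that no void elements such as a bare br appear inside the title.

```python
from html.parser import HTMLParser

# Hypothetical "event" vernacular: the class names are assumptions for this sketch.
DOC = """
<article class="event">
  <h1 class="title">Spec editing <em>workshop</em></h1>
  <p>Starts <time class="start" datetime="2024-05-01T09:00">1 May at 9am</time>.</p>
</article>
"""


class EventExtractor(HTMLParser):
    """Collects the title text and the machine-readable start time."""

    def __init__(self):
        super().__init__()
        self.event = {"title": "", "start": None}
        self._title_depth = 0  # > 0 while inside the title element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if self._title_depth:
            self._title_depth += 1  # nested markup inside the title, e.g. <em>
        elif "title" in classes:
            self._title_depth = 1
        if tag == "time" and "start" in classes:
            self.event["start"] = attrs.get("datetime")

    def handle_endtag(self, tag):
        if self._title_depth:
            self._title_depth -= 1

    def handle_data(self, data):
        if self._title_depth:
            self.event["title"] += data


def extract_event(html):
    parser = EventExtractor()
    parser.feed(html)
    parser.close()
    return parser.event


print(extract_event(DOC))
```

The same fragment remains readable in a browser as it stands, emphasis and all, while an application that knows the (made-up) vocabulary can pull out just the fields it cares about.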
Bodies of good practices are suspicious things. Why not do and advocate through example rather than talk about doing? In some cases, though, it is worth sitting down and writing up a few tips and tricks acquired through experience that might not be immediately obvious from just looking at what others are doing.
HTML already has several extensibility mechanisms, ranging from well-known ones such as data-* to less obvious parts. It also has (sadly, several) ways of overlaying additional semantics atop its markup, including RDFa and Microformats. Its elements and attributes can at times have non-obvious meanings, a fact made worse by the occasionally cryptic language in which they are described in the specification. Additionally, HTML’s processing rules make it possible to mint your own elements and attributes and expect a reliable DOM to come out at the other end. With all of these options it is no surprise that people could find creating their own vernacular daunting.
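The claim that minted elements and attributes come out of the parser in a predictable shape can be sketched as follows. This is only an approximation: Python's standard html.parser is a tokenizer, not an implementation of the HTML specification's tree-construction algorithm, so it stands in for what a browser would do rather than reproducing it. The x-reading element and the data-unit attribute are invented for the example.

```python
from html.parser import HTMLParser

# Made-up markup: neither <x-reading> nor data-unit is defined by HTML.
DOC = '<p>Temperature: <x-reading data-unit="celsius">21.5</x-reading></p>'


class OutlineBuilder(HTMLParser):
    """Records each start tag with its nesting depth and attributes."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []

    def handle_starttag(self, tag, attrs):
        self.outline.append((self.depth, tag, dict(attrs)))
        self.depth += 1

    def handle_endtag(self, tag):
        self.depth -= 1


def outline(html):
    builder = OutlineBuilder()
    builder.feed(html)
    builder.close()
    return builder.outline


for depth, tag, attrs in outline(DOC):
    print("  " * depth + tag, attrs)
```

The unknown element is reported like any other, nested inside the paragraph with its attribute intact, which is the property a vernacular depends on.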
The HTML specification does contain some notes about how to use this or that extensibility feature, but it is worth noting that standards authors can at times suffer from self-important notions of right and wrong. Not that their advice should be ignored wholesale, but it needs to be appraised through the sieve of your detailed knowledge of the specific problem you are solving.
Our approach here is of a decidedly pragmatic bent. We simply wish to exchange good ideas and good examples; never-ending debates about the significance of semantic meaning are very much out of scope.
Much of the time, if you sit down and start typing out examples, you will quickly get a good feel for the kind of markup you want for your specific use case. Modulo some syntactical variation, the overall shape of it is likely to be obvious. What may not be so obvious are the bits that can trip you farther down the line. As a result, many of the practices listed here are actually bad practices, things you should probably not be doing. I say “probably” because the philosophy at play here is that, ultimately, you know what you’re doing (if not, no amount of advice will save you). You should have no qualms about ignoring a red flag found here; all we’re saying is that it is likely a good idea to think about it.
The guidelines below commonly make use of examples drawn from XML vocabularies. That is because the problems one is likely to encounter in the creation of XML and HTML languages have a fair amount of overlap, and the XML examples tend to be more outrageously wrong. Don’t let that lead you to thinking that using HTML makes you safe from such mistakes: if the problem is listed here, someone smart has hit that problem before.
Vernacular HTML is a practice that existed long before it was given a name, as well it should! Here is a short list of some examples (with varying quality of documentation). If you wish to expand it, please simply file a pull request.
data-ng-* attributes, and a few other options) to add interactivity to HTML.