HTML Made Special

Ignore What You Don’t Need to Enforce

In most domain-specific data formats, the default expectation is that a processor will process the entirety of the content and understand how to meaningfully operate on all of it. For HTML vernaculars, this expectation is less useful.

If you are defining a vernacular that concerns itself with ancillary parts of a document, that much should be obvious. For instance, a vernacular describing how to capture EXIF metadata inside a figure element would not be very useful if it forbade all manner of content outside that figure. You expect it to be embedded in a larger document.

But this consideration holds even for vernaculars intended to capture the primary content of a given page. For example, the Scholarly HTML vernacular intends to describe the entirety of a scientific article, which will typically be most of the content in a given page. Nevertheless, it clearly states that it concerns itself solely with the content of the first article element found in the document, and deliberately ignores the rest. This allows people to produce content that can be usefully interpreted as Scholarly HTML while fully retaining the ability to surround that content with all manner of other markup, such as navigation, a header and footer, and whatever wrappers the article needs for stylistic purposes.
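To make the "first article element only" rule concrete, here is a minimal sketch of a processor that reads the first article element and ignores everything around it. It uses Python's standard html.parser module; the names FirstArticleExtractor and first_article_text are hypothetical, chosen for illustration. This is not the Scholarly HTML reference processing model, only one way such a rule could be applied.

```python
from html.parser import HTMLParser


class FirstArticleExtractor(HTMLParser):
    """Collects text from the first <article> element; ignores all other content.

    Hypothetical helper for illustration, not part of any vernacular's spec.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.depth = 0      # nesting depth of <article> while inside the first one
        self.done = False   # True once the first <article> has been closed
        self.chunks = []    # text nodes found inside the first <article>

    def handle_starttag(self, tag, attrs):
        # Count nested <article> elements so the matching close tag is found.
        if not self.done and tag == "article":
            self.depth += 1

    def handle_endtag(self, tag):
        if self.done or self.depth == 0:
            return
        if tag == "article":
            self.depth -= 1
            if self.depth == 0:
                self.done = True  # later <article> elements are ignored

    def handle_data(self, data):
        # Navigation, footers, and wrappers never reach this branch.
        if self.depth > 0 and not self.done:
            self.chunks.append(data)


def first_article_text(html):
    """Return the whitespace-normalised text of the first <article>, or ''."""
    parser = FirstArticleExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())
```

Fed a page whose article sits inside a navigation bar, a styling wrapper, and a footer, the function returns only the article's text. The surrounding markup is neither validated nor rejected; it is simply out of scope.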

Ignoring what you don't need to enforce not only makes vernaculars more useful (since it gives authors a lot of leeway in organising their HTML around their content), it also makes them composable. There is no reason to stick to using just the one vernacular. For instance, one could easily use Scholarly HTML together with html-version-spec.
