Speculative HTML parser

Google

Standardizing the Speculative HTML Parser

Challenge

In 2020, Bocoup specified and wrote conformance tests for the speculative HTML parser, but the specification pull request was still pending review. In February 2021, Henri Sivonen, implementer of the HTML parser in Gecko, left a few review comments on it. Our objective was to address those comments and get the speculative HTML parser spec change merged into the HTML standard.

Solution

There were two material review comments.

The first requested that speculative fetches be allowed during normal parsing as well as during speculative parsing. The potential performance gain is smaller in the normal parsing case, but if parsing happens in parallel with JavaScript execution, it can still make a difference, and at least Gecko already does this. We changed the specification to allow (but not require) speculative fetches during normal parsing, at the point where the tree builder creates an element.

The second review comment was that a given URL should not be speculatively fetched more than once. We changed the specification to maintain a list of speculative fetch URLs, which matches Gecko’s implementation strategy.
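The deduplication idea can be illustrated with a minimal sketch. This is not the algorithm text from the HTML standard, and it is not Gecko's code: the class name `SpeculativeFetcher`, the method `speculatively_fetch`, and the example URL are all hypothetical, and the "fetch" is only recorded in a list rather than sent over the network.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SpeculativeFetcher:
    """Hypothetical helper; names are illustrative, not from the HTML standard."""

    # URLs that have already been speculatively fetched. The same list is
    # consulted whether the request comes from speculative or normal parsing.
    speculative_fetch_urls: List[str] = field(default_factory=list)
    # Stand-in for real network requests: every fetch actually started.
    started: List[str] = field(default_factory=list)

    def speculatively_fetch(self, url: str) -> bool:
        """Start a fetch unless this URL was already speculatively fetched."""
        if url in self.speculative_fetch_urls:
            return False  # already requested once; do nothing
        self.speculative_fetch_urls.append(url)
        self.started.append(url)
        return True


fetcher = SpeculativeFetcher()
# The speculative parser finds a script reference while the real parser is blocked.
first = fetcher.speculatively_fetch("https://example.com/app.js")
# Later, the tree builder creates the same element during normal parsing.
second = fetcher.speculatively_fetch("https://example.com/app.js")
```

In this sketch the second call is a no-op, so the URL is requested only once regardless of which parsing mode discovers it first.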

Impact

The spec PR and the tests were reviewed and merged in September 2021. Browser engines now have a specification of the correct behavior for the speculative HTML parser optimization, along with a test suite that highlights interoperability issues. Web developers can (in theory) reason about and predict what should be fetched for their HTML markup.

In addition, specifying the speculative HTML parser made it possible to specify meta charset scanning in terms of the speculative HTML parser. That approach is more closely aligned with the behavior of WebKit and Chromium than the current specification, which only Gecko had implemented as specified.
