Modeling DOM Events

Tags: Browsers, JavaScript, Semantics

Posted on 17 July 2012.

In previous posts, we’ve talked about our group’s work on providing an operational semantics for JavaScript, including the newer features of the language. While that work is useful for understanding the language, most JavaScript programs don’t run in a vacuum: they run in a browser, with a rich API to access the contents of the page.

That API, known as the Document Object Model (or DOM), consists of several parts:

  • A graph of objects encoding the structure of the page (optimistically called a "tree" since the HTML markup is indeed tree-shaped, although the graph has extra pointers between objects),
  • Methods to manipulate the HTML tree structure,
  • A sophisticated event model to allow scripts to react to user interactions.

These three parts of the DOM interact with one another, making it challenging to reason about any one of them in isolation. Moreover, the specs describing them are long, heavily self-referential, and difficult to understand incrementally. So what to do?

What makes this event programming so special?

To a first approximation, the execution of every web page looks roughly like: load the markup of the page, load scripts, set up lots of event handlers … and wait. For events. To fire. Accordingly, to understand the control flow of a page, we have to understand what happens when events fire.

Let’s start with this:

  <div id="d1">
    In outer div
    <p id="p1">
      In paragraph in div.
      <span id="s1" style="background:white;">
        In span in paragraph in div.
      </span>
    </p>
  </div>
  <script>
    document.getElementById("s1").addEventListener("click",
      function() { this.style.color = "red"; });
  </script>

If you click on the text "In span in paragraph in div", the event listener added to span#s1 is triggered by the click and turns that text red. But consider the slightly more complicated example:

  <div id="d2">
    In outer div
    <p id="p2">
      In paragraph in div.
      <span id="s2" style="background:white;">
        In span in paragraph in div.
      </span>
    </p>
  </div>
  <script>
    document.getElementById("d2").addEventListener("click",
      function() { this.style.color = "red"; });
    document.getElementById("s2").addEventListener("click",
      function() { this.style.color = "blue"; });
  </script>

Now, clicking anywhere in the box will turn all the text red. That makes sense: we just clicked on the <div> element, so its listener fires. But clicking on the <span> will turn it blue and still turn the rest red. Why? We didn’t click on the <div>! Well, not directly…

The key feature of event dispatch, as implemented for the DOM, is that it takes advantage of the page structure. Clicking on an element of the page (or typing into a text box, moving the mouse over an element, etc.) will cause an event to fire "at" that element: the element is the target of the event, and any event listener installed for that event on that target node will be called. But the event will also trigger event listeners on the ancestors of the target node: the chain of nodes running from the target up through its ancestors is called the dispatch path. So in the example above, because div#d2 is an ancestor of span#s2, its event listener is also invoked, turning the text red.
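
To make the idea of a dispatch path concrete, here is a minimal sketch that runs outside a browser. The helpers `makeNode`, `dispatchPath`, and `fireClick` are hypothetical names invented for this illustration, not DOM APIs, and the sketch models only the outward ("bubbling") walk from the target to its ancestors.

```javascript
// Minimal stand-in for a DOM node: a name, a parent pointer, click listeners,
// and a color slot. (Hypothetical helper; real DOM nodes carry far more state.)
function makeNode(name, parent = null) {
  return { name, parent, listeners: [], color: null };
}

// Collect the target and all of its ancestors, innermost first.
function dispatchPath(target) {
  const path = [];
  for (let node = target; node !== null; node = node.parent) {
    path.push(node);
  }
  return path;
}

// Bubble-only dispatch: call each listener with `this` bound to the node it
// was registered on, starting at the target and walking outward.
function fireClick(target) {
  for (const node of dispatchPath(target)) {
    for (const listener of node.listeners) {
      listener.call(node);
    }
  }
}

// Mirror the second example: div#d2 > p#p2 > span#s2.
const d2 = makeNode("div#d2");
const p2 = makeNode("p#p2", d2);
const s2 = makeNode("span#s2", p2);
d2.listeners.push(function () { this.color = "red"; });
s2.listeners.push(function () { this.color = "blue"; });

fireClick(s2);
console.log(dispatchPath(s2).map((n) => n.name).join(" -> "));
// prints "span#s2 -> p#p2 -> div#d2"
console.log(s2.color, p2.color, d2.color);
// prints "blue null red"
```

The sketch only records which nodes had a listener run. In a real page the paragraph text also turns red, because the `color` set on the div is inherited by its descendants through CSS.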

What Could Possibly Go Wrong?

In a word: mutation. The functions called as event listeners are arbitrary JavaScript code, which can do anything they want to the state of the page, including modifying the DOM. So what might happen?

  • The event listener might move the current target in the page. What happens to the dispatch path?
  • The event listener might add (or remove) other listeners for the event being dispatched. Should newly installed listeners be invoked before or after existing ones? Should they be called at all?
  • The event listener might try to cancel event dispatch. Can it do so?
  • The event listener might try to (programmatically) fire another event while the current one is active. Is event dispatch reentrant?
  • There are legacy mechanisms to add event "handlers" as well as listeners. How should they interact with listeners?

Modeling Event Dispatch

Continuing our group’s theme of reducing a complicated, real-world system to a simpler operational model, we developed an idealized version of event dispatch in PLT Redex, a domain-specific language embedded in Racket for specifying operational semantics. Because we are focusing on exactly how event dispatch works, our model does not include all of JavaScript, nor does it need to—instead, it includes a miniature statement language containing the handful of DOM APIs that manipulate events. Our model does not include all the thousands of DOM properties and methods, instead including just a simplified tree-structured heap of nodes: this is all the structure we need to faithfully model the dispatch path of an event.

Our model is based on the DOM Level 3 Events specification. It expresses the key behaviors of event dispatch, and does so far more compactly than the spec: roughly 1000 lines of commented Redex code replace several pages’ worth of (at times self-contradictory!) requirements that are spread throughout a spec over a hundred pages long. From this concise model, for example, we can easily extract a state machine describing the key stages of dispatch:

From this state machine, it’s much easier to answer the questions raised above, precisely and formally. For example, if an event listener moves the event target in the page, nothing happens to the dispatch path: only the first state of the machine constructs the dispatch path, while all the others just read from it. Done! It’s unfortunate that this state machine isn’t sketched in the spec anywhere…
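
The "build the path once, then only read it" behavior can be sketched in a few lines of plain JavaScript. This is an illustrative approximation, not the Redex model or its state machine: `makeNode`, `listen`, `buildPath`, and `dispatch` are hypothetical helpers, and the three phases (capture, target, bubble) follow the event flow described in DOM Level 3 Events.

```javascript
// Hypothetical node model: a name, a parent pointer, and a listener list.
function makeNode(name, parent = null) {
  return { name, parent, listeners: [] };
}

// Register a listener; useCapture selects the capture or bubble phase.
function listen(node, useCapture, fn) {
  node.listeners.push({ useCapture, fn });
}

// First step: build the dispatch path once, outermost ancestor first.
function buildPath(target) {
  const path = [];
  for (let node = target; node !== null; node = node.parent) {
    path.unshift(node);
  }
  return path;
}

// Run the listeners on one node that match the current phase.
function run(node, phase, log, matches) {
  for (const l of node.listeners.filter(matches)) l.fn(node, phase, log);
}

// Later steps only read `path`; parent pointers are never consulted again.
function dispatch(target, log) {
  const path = buildPath(target);
  const ancestors = path.slice(0, -1);
  for (const node of ancestors) run(node, "capture", log, (l) => l.useCapture);
  run(target, "target", log, () => true);
  for (const node of [...ancestors].reverse()) {
    run(node, "bubble", log, (l) => !l.useCapture);
  }
}

// Tree: root > a > target, plus a sibling b under root.
const root = makeNode("root");
const a = makeNode("a", root);
const b = makeNode("b", root);
const target = makeNode("target", a);

const record = (node, phase, log) => log.push(`${phase}:${node.name}`);
listen(root, true, record);
listen(root, false, record);
listen(a, false, record);
listen(b, false, record);
listen(target, false, (node, phase, log) => {
  record(node, phase, log);
  node.parent = b; // move the target while dispatch is in flight
});

const log = [];
dispatch(target, log);
console.log(log.join(", "));
// prints "capture:root, target:target, bubble:a, bubble:root"
```

Even though the target listener reparents the target under `b`, the bubble phase still visits `a` and `root`, and `b` is never notified, because the path was fixed before any listener ran.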

Moreover, the model is executable: Redex allows us to construct test cases—randomly, systematically, or ad-hoc, as we choose—and then run them through our model and see what output it produces. Even better, we can export our tests to HTML and JavaScript, and run them in real browsers and compare results:

[Figure: Comparing a test model (tree structure, event listeners, and an event to be fired) in our semantics and in various browsers.]

Most importantly, our model agrees with all browsers on most test cases: this gives us confidence that our model is faithful to the intent of the spec. But not all test cases—not too surprisingly, we identified examples where real-world browsers differ in their behavior. Under our reading of the spec, at least one of these browsers is wrong—but since the spec is so intricate, it is easy to see why browsers have a hard time agreeing in all cases!
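
As a rough illustration of what exporting a test to HTML and JavaScript might involve, the sketch below turns a small test description into a page that installs logging listeners and fires a click. It is not the authors' exporter: the test-case shape (`tree`, `listens`, `target`) and the functions `renderNode`, `collectListeners`, and `exportTest` are assumptions made up for this example.

```javascript
// Hypothetical test-case shape: a tree of { id, children, listens } nodes
// plus the id of the node at which the event is fired.
function renderNode(node, indent = "") {
  const kids = (node.children || [])
    .map((child) => renderNode(child, indent + "  "))
    .join("");
  return `${indent}<div id="${node.id}">\n${kids}${indent}</div>\n`;
}

// Gather the ids of all nodes that should get a click listener.
function collectListeners(node, out = []) {
  if (node.listens) out.push(node.id);
  (node.children || []).forEach((child) => collectListeners(child, out));
  return out;
}

// Emit markup plus a script that records the order in which listeners run.
function exportTest(test) {
  const listeners = collectListeners(test.tree)
    .map((id) =>
      `  document.getElementById("${id}").addEventListener("click",\n` +
      `    function () { trace.push("${id}"); });\n`)
    .join("");
  return (
    renderNode(test.tree) +
    "<script>\n" +
    "  var trace = [];\n" +
    listeners +
    `  document.getElementById("${test.target}").click();\n` +
    "  document.title = trace.join(\",\");\n" +
    "</script>\n"
  );
}

const sample = {
  tree: {
    id: "a", listens: true,
    children: [{ id: "b", children: [{ id: "c", listens: true }] }],
  },
  target: "c",
};
console.log(exportTest(sample));
```

Loading the generated page in several browsers and reading back the recorded trace gives one value per browser to set against the model's own output for the same test.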

What’s Done

Here’s what we’ve got so far:

What’s Next

Since our original JavaScript semantics was also written in Redex, we can combine our model of event dispatch with the JavaScript one, for a much higher-fidelity model of what event listeners can do in a browser setting. Then of course there are further applications, such as building a precise control-flow analysis of web pages and analyzing their code. And other uses? If you’re interested in using our model, let us know!