Modeling DOM Events
Tags: Browsers, JavaScript, Semantics
Posted on 17 July 2012.
In previous posts, we’ve talked about our group’s work on providing an operational semantics for JavaScript, including the newer features of the language. While that work is useful for understanding the language, most JavaScript programs don’t run in a vacuum: they run in a browser, with a rich API to access the contents of the page.
That API, known as the Document Object Model (or DOM), consists of several parts:
- A graph of objects encoding the structure of the page (this graph is optimistically called a "tree" since the HTML markup is indeed tree-shaped, but it has extra pointers between objects),
- Methods to manipulate the HTML tree structure, and
- A sophisticated event model to allow scripts to react to user interactions.
What makes this event programming so special?
To a first approximation, the execution of every web page looks roughly like: load the markup of the page, load scripts, set up lots of event handlers … and wait. For events. To fire. Accordingly, to understand the control flow of a page, we have to understand what happens when events fire.
Let’s start with this:
    <div id="d1">
      In outer div
      <p id="p1">
        In paragraph in div.
        <span id="s1" style="background:white;">
          In span in paragraph in div.
        </span>
      </p>
    </div>
    <script>
      document.getElementById("s1").addEventListener("click", function() {
        this.style.color = "red";
      });
    </script>
If you click on the text "In span in paragraph in div", the event listener that gets added to element span#s1 is triggered by the click, and turns the text red. But consider the slightly more complicated example:
    <div id="d2">
      In outer div
      <p id="p2">
        In paragraph in div.
        <span id="s2" style="background:white;">
          In span in paragraph in div.
        </span>
      </p>
    </div>
    <script>
      document.getElementById("d2").addEventListener("click", function() {
        this.style.color = "red";
      });
      document.getElementById("s2").addEventListener("click", function() {
        this.style.color = "blue";
      });
    </script>
Now, clicking anywhere in the box will turn all the text red. That makes sense: we just clicked on the <div> element, so its listener fires. But clicking on the <span> will turn it blue and still turn the rest red. Why? We didn’t click on the <div>! Well, not directly…
The key feature of event dispatch, as implemented for the DOM, is that it takes advantage of the page structure. Clicking on an element of the page (or typing into a text box, moving the mouse over an element, etc.) will cause an event to fire "at" that element: the element is the target of the event, and any event listener installed for that event on that target node will be called. In addition, the event will trigger event listeners on the ancestors of the target node: this chain of nodes is called the dispatch path. So in the example above, because div#d2 is an ancestor of span#s2, its event listener is also invoked, turning the text red.
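The idea can be sketched in a few lines of plain JavaScript. This is a minimal illustration, not the browser's actual algorithm: `makeNode`, `dispatchPath`, and `dispatch` are hypothetical helpers, nodes are bare objects with a parent pointer, and only the target-then-ancestors order described above is modeled.

```javascript
// A toy node: a name, a parent pointer, and a list of click listeners.
// (Hypothetical helper; real DOM nodes carry far more state.)
function makeNode(name, parent = null) {
  return { name, parent, listeners: [] };
}

// Collect the target followed by each of its ancestors, innermost first.
function dispatchPath(target) {
  const path = [];
  for (let node = target; node !== null; node = node.parent) {
    path.push(node);
  }
  return path;
}

// Fire an event "at" target: run the target's listeners, then each ancestor's.
function dispatch(target) {
  const log = [];
  for (const node of dispatchPath(target)) {
    for (const listener of node.listeners) {
      listener.call(node, log); // `this` is the node the listener was added to
    }
  }
  return log;
}

// Mirror the second example: div#d2 > p#p2 > span#s2.
const d2 = makeNode("div#d2");
const p2 = makeNode("p#p2", d2);
const s2 = makeNode("span#s2", p2);
d2.listeners.push(function (log) { log.push(this.name + " -> red"); });
s2.listeners.push(function (log) { log.push(this.name + " -> blue"); });

const clickOnSpan = dispatch(s2);
const clickOnDiv = dispatch(d2);
console.log(clickOnSpan); // [ 'span#s2 -> blue', 'div#d2 -> red' ]
console.log(clickOnDiv);  // [ 'div#d2 -> red' ]
```

Clicking the span reaches both listeners because div#d2 is on the span's dispatch path, while clicking the div reaches only its own.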
What Could Possibly Go Wrong?
In a word: mutation. The functions called as event listeners are arbitrary JavaScript code, which can do anything it wants to the state of the page, including modifying the DOM. So what might happen?
- The event listener might move the current target in the page. What happens to the dispatch path?
- The event listener might add (or remove) other listeners for the event being dispatched. Should newly installed listeners be invoked before or after existing ones? Should those listeners even be called?
- The event listener might try to cancel event dispatch. Can it do so?
- The event listener might try to (programmatically) fire another event while the current one is active. Is event dispatch reentrant?
- There are legacy mechanisms to add event "handlers" as well as listeners. How should they interact with listeners?
Modeling Event Dispatch
Continuing our group’s theme of reducing a complicated, real-world system to a simpler operational model, we developed an idealized version of event dispatch in PLT Redex, a domain-specific language embedded in Racket for specifying operational semantics. Because we are focusing on exactly how event dispatch works, our model does not include all of JavaScript, nor does it need to—instead, it includes a miniature statement language containing the handful of DOM APIs that manipulate events. Our model does not include all the thousands of DOM properties and methods, instead including just a simplified tree-structured heap of nodes: this is all the structure we need to faithfully model the dispatch path of an event.
Our model is based on the DOM Level 3 Events specification. It expresses the key behaviors of event dispatch, and does so far more compactly than the spec: roughly 1000 lines of commented Redex code replace several pages’ worth of (at times self-contradictory!) requirements that are spread throughout a spec over a hundred pages long. From this concise model, for example, we can easily extract a state machine describing the key stages of dispatch.
From this state machine, it’s much easier to answer the questions raised above, precisely and formally. For example, if an event listener moves the event target in the page, nothing happens to the dispatch path: only the first state of the machine constructs the dispatch path, while all the others just read from it. Done! It’s unfortunate that this state machine isn’t sketched in the spec anywhere…
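To make the point about moving the event target concrete, here is a small JavaScript sketch. It is not our Redex model: `makeNode` and `dispatch` are hypothetical helpers, nodes are bare objects, and the capture, target, and bubble phases are assumed to run in that order over a path that is built exactly once.

```javascript
// Hypothetical toy node standing in for a DOM node: a name, a parent pointer,
// and listeners that receive the name of the current phase.
function makeNode(name, parent = null) {
  return { name, parent, listeners: [] };
}

function dispatch(target) {
  // First state: build the path (root first) exactly once.
  const path = [];
  for (let node = target; node !== null; node = node.parent) {
    path.unshift(node);
  }
  const ancestors = path.slice(0, -1);

  // Every later state only reads from the saved path.
  const visited = [];
  const visit = (node, phase) => {
    visited.push(`${phase}:${node.name}`);
    for (const listener of node.listeners) {
      listener(phase);
    }
  };

  for (const node of ancestors) visit(node, "capture");
  visit(target, "target");
  for (const node of [...ancestors].reverse()) visit(node, "bubble");
  return visited;
}

// Tree: root > a > t, and root > b.
const root = makeNode("root");
const a = makeNode("a", root);
const b = makeNode("b", root);
const t = makeNode("t", a);

// While the event is still on its way down, move the target under b.
a.listeners.push((phase) => {
  if (phase === "capture") t.parent = b;
});

const first = dispatch(t);
const second = dispatch(t);
console.log(first.join(" "));  // capture:root capture:a target:t bubble:a bubble:root
console.log(second.join(" ")); // capture:root capture:b target:t bubble:b bubble:root
```

The first dispatch still visits `a` on the way back up even though the target has already moved, because the path was fixed before any listener ran. Only the next dispatch sees the new parent.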
Moreover, the model is executable: Redex allows us to construct test cases (randomly, systematically, or ad hoc, as we choose) and then run them through our model to see what output it produces. Even better, we can export our tests to HTML and JavaScript, run them in real browsers, and compare the results.
Most importantly, our model agrees with all browsers on most test cases: this gives us confidence that our model is faithful to the intent of the spec. But not on all test cases: not too surprisingly, we identified examples where real-world browsers differ in their behavior. Under our reading of the spec, at least one of these browsers is wrong, but since the spec is so intricate, it is easy to see why browsers have a hard time agreeing in all cases!
What’s Done
Here’s what we’ve got so far:
- A PLT Redex model of event dispatch,
- An annotated copy of the DOM Level 3 Events spec, showing exactly which lines of our model correspond to which text in the spec, and
- A paper describing the model (and some applications of it) in greater detail.
What’s Next
Since our original JavaScript semantics was also written in Redex, we can combine our model of event dispatch with the JavaScript one, for a much higher-fidelity model of what event listeners can do in a browser setting. Then of course there are further applications, such as building a precise control-flow analysis of web pages and analyzing their code. And other uses? If you’re interested in using our model, let us know!