Typechecking Uses of the jQuery Language
Tags: Browsers, JavaScript, Types
Posted on 17 January 2014.Manipulating HTML via JavaScript is often a frustrating task: the APIs for doing so, known as the Document Object Model (or DOM), are implemented to varying degrees of fidelity across different browsers, leading to browser-specific workarounds that contort the code. Worse, the APIs are too low-level: they amount to an “assembly language for trees”, allowing programmers to navigate from a node to its parent, children or adjacent siblings; add, remove, or modify a node at a time; etc. And like assembly programming, these low-level operations are imperative and often error-prone.
Fortunately, developers have created libraries that abstract away from this low-level tedium to provide a powerful, higher-level API. The best-known example of these is jQuery, and its API is so markedly different from the DOM API that it might as well be its own language — a domain-specific language for HTML manipulation.
With great powerlanguage comes great responsibility
Every language has its own idiomatic forms, and therefore has its own characteristic failure modes and error cases. Our group has been studying various (sub-)languages of JavaScript— JS itself, ADsafety, private browsing violations— so the natural question to ask is, what can we do to analyze jQuery as a language?
This post takes a tour of the solution we have constructed to this problem. In it, we will see how a sophisticated technique, a dependent type system, can be specialized for the jQuery language in a way that:
- Retains decidable type-inference,
- Requires little programmer annotation overhead,
- Hides most of the complexity from the programmer, and yet
- Yields non-trivial results.
Engineering such a system is challenging, and is aided by the fact that jQuery’s APIs are well-structured enough that we can extract a workable typed description of them.
What does a jQuery program look like?
Let’s get a feel for those idiomatic pitfalls of jQuery by looking at a few tiny examples. We’ll start with a (seemingly) simple question: what does this one line of code do?
$(".tweet span").next().html()With a little knowledge of the API, jQuery’s code is quite
  readable: get all the ".tweet span" nodes, get
  their next() siblings, and…get the HTML code of
  the first one of them.  In general, a jQuery expression consists of
  three parts: a initial query to select some nodes, a
  sequence of navigational steps that select nearby nodes
  related to the currently selected set, and finally
  some manipulation to retrieve or set data on those node(s).
This glib explanation hides a crucial assumption: that there exist nodes in the page that match the initial selector in the first place! So jQuery programs’ behavior depends crucially on the structure of the pages in which they run.
Given the example above, the following line of code should behave analogously:
$(".tweet span").next().text()But it doesn’t—it actually returns the concatenated text content of all the selected nodes, rather than just the first. So jQuery APIs have behavior that depends heavily on how many nodes are currently selected.
Finally, we might try to manipulate the nodes by mapping a function
  across them via jQuery’s each() method, and here we
  have a classic problem that appears in any language: we must ensure
  that the mapped function is only applied to the correct sorts of
  HTML nodes.
Our approach
The examples above give a quick flavor of what can go wrong:
- The final manipulation might not be appropriate for the type of nodes actually in the query-set.
- The navigational steps might “fall off the edge of the tree”, navigating by too many steps and resulting in a vacuous query-set.
- The initial query might not match any nodes in the document, or match too many nodes.
Our tool of choice for resolving all these issues is a type system for JavaScript. We’ve used such a system before to analyze the security of Firefox browser extensions. The type system is quite sophisticated to handle JavaScript idioms precisely, and while we take advantage of that sophistication, we also completely hide the bulk of it from the casual jQuery programmer.
To adapt our prior system for jQuery, we need two technical insights: we define multiplicities, a new kind that allows us to approximate the size of a query set in its type and ensure that APIs are only applied to correctly-sized sets, and we define local structure, which allows developers to inform the type system about the query-related parts of the page.
Technical details
Multiplicities
Typically, when type system designers are confronted with the need to keep track of the size of a collection, they turn to a dependently typed system, where the type of a collection can depend on, say, a numeric value. So “a list of five booleans” has a distinct type from “a list of six booleans”. This is very precise, allowing some APIs to be expressed with exactitude. It does come at a heavy cost: most dependent type systems lose the ability to infer types, requiring hefty programmer annotations. But is all this precision necessary for jQuery?
Examining jQuery’s APIs reveals that its behavior can be broken down into five cases: methods might require their invocant query-set contain
- Zero elements of any type T, writen0<T>
- One element of some type T, written1<T>
- Zero or one elements of some type T, written01<T>
- One or more elements of some type T, written1+<T>
- Any number (zero or more) elements of some type T, written0+<T>
These cases effectively abstract the exact size of a collection into a simple, finite approximation. None of jQuery’s APIs actually care if a query set contains five or fifty elements; all the matters is that it contains one-or-more. And therefore our system can avoid the very heavyweight onus of a typical dependent type system, instead using this simpler abstraction without any loss of necessary precision.
Moreover, we can
  manipulate these cases using interval arithmetic: for
  example, combining one collection of zero-or-one elements with
  another collection containing exactly one element yields a
  collection of one-or-more elements: 01<T> + 1<T> =
  1+<T>.  This is just interval addition.
jQuery’s APIs all effectively map some behavior across a queryset.
  Consider the next() method: given a collection of at
  least one element, it transforms each element into at most one
  element (because some element might not have any next sibling).  The
  result is a collection of zero-or-more elements (if every element
  does not have a next sibling).
  Symbolically: 1+<01<T>> = 0+<T>.  This
  is just interval multiplication.
We can use these two operations to describe the types of all of
  jQuery’s APIs. For example, the next() method can be
  given the type Forall T, 1+<T> ->
  0+<@next<T>>.
Local structure
Wait — what’s @next?  Recall that we need to
connect the type system to the content of the page.  So we ask
programmers to define local structures that describe the
portions of their pages that they intend to query.  Think of them as
“schemas for page fragments”: while it is not reasonable
to ask programmers to schematize parts of the page they neither control
nor need to manipulate (such as 3rd-party ads), they
certainly must know the structure of those parts of the page that they
  query!  A local structure for, say, a Twitter stream might say:
(Tweet : Div classes = {tweet} optional = {starred}
   (Author : Span classes = {span})
   (Time : Span classes = {time})
   (Content : Span classes = {content})
Read this as “A Tweet is a Div
  that definitely has class tweet and might have
  class starred.  It contains three children elements in order:
  an Author, a Time, and
  a Content.  An Author is
  a Span with class span…”
(If you are familiar with templating libraries like mustache.js, these local structures might look familiar: they are just the widgets you would define for template expansion.)
From these definitions, we can compute crucial relationships: for
  instance, the @next sibling of an Author
  is a Time.  And that in turn completes our
  understanding of the type for next() above: for
  any local structure
  type T, next() can be called on a
  collection of at least one T and will
  return collection of zero or more elements that are
  the @next siblings of Ts.
Note crucially that while the uses of @next
  are baked into our system, the function itself is not
  defined once-and-for-all for all pages: it is computed from the
  local structures that the programmer defined.  In this way,
  we’ve parameterized our type system.  We imbue it with
  knowledge of jQuery’s fixed API, but leave a hole for programmers to
  provide their page-specific information, and
  thereby specialize our system for their code.
Of course, @next is not the only function we compute
  over the local structure.  We can
  compute @parents, @siblings, @ancestors,
  and more.  But all of these functions are readily deducible from the
  local structure.
Ok, so what does this all do for me?
Continuing with our Tweet example above, our system provides the following output for these next queries:// Mistake made: one too many calls to .next(), so the query set is empty
$(".tweet").children().next().next().next().css("color", "red")
// ==> ERROR: 'css' expects 1+<Element>, got 0<Element>
// Mistake made: one too few calls to .next(), so the query set is too big
$(".tweet #myTweet").children().next().css("color")
// ==>; ERROR: 'css' expects 1<Element>, got 1+<Author+Time>
// Correct query
$(".tweet #myTweet").children().next().next().css("color")
// ==> Typechecks successfullyThe big picture
We have defined a set of types for the jQuery APIs that captures
  their intended behavior.  These types are expressed using helper
  functions such as @next, whose behavior is specific to
  each page.  We ask programmers merely to define the local structures
  of their page, and from that we compute the helper functions we
  need.  And finally, from that, our system can produce type errors
  whenever it encounters the problematic situations we listed above.
  No additional programmer effort is needed, and the type errors
  produced are typically local and pertinent to fixing buggy code.
Further reading
Obviously we have elided many of the nitty-gritty details that make our system work. We’ve written up our full system, with more formal definitions of the types and worked examples of finding errors in buggy queries and successfully typechecking correct ones. The writeup also explains some surprising subtleties of the type environment, and proposes some API enhancements to jQuery that were suggested by difficulties in engineering and using the types we defined.
