S5: Engineering Eval

Tags: JavaScript, Programming Languages, Semantics

Posted on 21 October 2012.

In an earlier post, we introduced S5, our semantics for ECMAScript 5.1 (ES5). S5 is no toy, but strives to correctly model JavaScript's messy details.

One such messy detail of JavaScript is eval. The ES5 specification updated eval to make its behavior less surprising and to give programmers more control. However, the old behavior was left intact for backwards compatibility. The result is a language construct with a number of subtle behaviors. Today, we're going to explore JavaScript's eval, explain its several modes, and describe our approach to engineering an implementation of it.

Quiz Time!

We've put together a short quiz to give you a tour of the various types of eval in JavaScript. How many can you get right on the first try?

Question 1

function f(x) {
  eval("var x = 2;");
  return x;
}
f(1) === ?;

f(1) === 2
This example returns 2 because the var declaration in the eval actually refers to the same variables as the body of the function. So, the eval body overwrites the x parameter and returns the new value.

Question 2

function f(x) {
  eval("'use strict'; var x = 2;");
  return x;
}
f(1) === ?;

f(1) === 1
The 'use strict'; directive creates a new scope for variables defined inside the eval. So, the var x = 2; still evaluates, but doesn't affect the x that is the function's parameter. These first two examples show that strict mode changes the scope that eval affects. We might ask, now that we've seen these, what scope does eval see?

Question 3

function f(x) {
  eval("var x = y;");
  return x;
}
f(1) === ?;

f(1) === ReferenceError: y is not defined
OK, that was sort of a trick question. This program throws an exception saying that y is unbound. But it serves to remind us of an important JavaScript feature: if a variable isn't defined in any enclosing scope, trying to read it throws a ReferenceError. Now we can ask the obvious question: can we see y if we define it outside the eval?

Question 4

function f(x) {
  var y = 2;
  eval("var x = y;");
  return x;
}
f(1) === ?;

f(1) === 2
OK, here's our real answer. The y is certainly visible inside the eval, which can both see and affect the outer scope. What if the eval is strict?

Question 5

function f(x) {
  var y = 2;
  eval("'use strict'; var x = y;");
  return x;
}
f(1) === ?;

f(1) === 1
Interestingly, we don't get an error here, so it seems like y was visible to the eval even in strict mode. However, as before the assignment doesn't escape. New topic next.
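
Before moving on, here is a small sketch that makes the visibility of y observable by returning the value computed inside a strict eval. The names g, seen, z, and strictEvalResult are ours for illustration and are not part of the quiz; the sketch also assumes no global variable named z exists.

```javascript
function g() {
  var y = 2;
  // The strict eval can read the enclosing y. The value of eval(...) is the
  // value of the last statement in the string, here z.
  var seen = eval("'use strict'; var z = y + 1; z;");
  // z was declared inside the strict eval, so it is not defined out here.
  return [seen, typeof z];
}

const strictEvalResult = g();
console.log(strictEvalResult); // [ 3, 'undefined' ]
```

So a strict direct eval reads from the local scope but keeps its own declarations to itself.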

Question 6

function f(x) {
  var avel = eval;
  avel("var x = y;");
  return x;
}
f(1) === ?;

f(1) === ReferenceError: y is not defined
OK, that was a gimme. Let's add the variable declaration we need.

Question 7

function f(x) {
  var avel = eval;
  var y = 2;
  avel("var x = y;");
  return x;
}
f(1) === ?;

f(1) === ReferenceError: y is not defined
What's going on here? We defined a variable and it isn't visible like it was before, and all we did was rename eval. Let's try a simpler example.

Question 8

function f(x) {
  var avel = eval;
  avel("var x = 2;");
  return x;
}
f(1) === ?;

f(1) === 1
OK, so somehow we aren't seeing the assignment to x either... Let's try making one more observation:

Question 9

function f(x) {
  var avel = eval;
  avel("var x = 2;");
  return x;
}
f(1);
x === ?;

x === 2
Whoa! So that eval changed the x in the global scope. This is what the specification refers to as an indirect eval: a call to eval that doesn't use a direct reference to the variable eval.
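
Renaming eval is only one way to get an indirect call. The sketch below shows three common spellings, all of which evaluate their string in the global scope. The names h, viaAlias, viaComma, viaProperty, and leaked are illustrative choices of ours, and the sketch assumes an environment that provides globalThis (for example, a recent Node.js) and has no global variable named y.

```javascript
function h(x) {
  var y = 2;                               // local to h
  var avel = eval;
  avel("var viaAlias = 1;");               // call through an alias
  (0, eval)("var viaComma = 2;");          // callee is a value, not the name eval
  globalThis.eval("var viaProperty = 3;"); // callee is a property access
  // Indirect eval runs in the global scope, so the local y is not visible.
  var typeOfY = (0, eval)("typeof y;");
  return [x, typeOfY];
}

const indirectResult = h(1);
// Each var declared by an indirect eval became a global variable.
const leaked = [globalThis.viaAlias, globalThis.viaComma, globalThis.viaProperty];
console.log(indirectResult, leaked); // [ 1, 'undefined' ] [ 1, 2, 3 ]
```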

Question 10 (On the home stretch!)

function f(x) {
  "use strict";
  eval("var x = 2;");
  return x;
}
f(1) === ?;
x === ?;

f(1) === 1
Before, when we had "use strict"; inside the eval, we saw that the variable declarations did not escape. Here, the "use strict"; is outside, but we see the same thing: the value of 1 simply flows through to the return statement unaffected. We also know that we aren't doing the same thing as the indirect eval from the previous question, because we didn't affect the global scope.

Question 11 (last one!)

function f(x) {
  "use strict";
  var avel = eval;
  avel("var x = 2;");
  return x;
}
f(1) === ?;
x === ?;

f(1) === 1
x === 2
Unlike in the previous question, this indirect eval has the same behavior as before: it affects the global scope. The presence of a "use strict"; appears to mean something different to an indirect versus a direct eval.

Capturing all the Evals

We saw three factors that could affect the behavior of eval above:

Whether the code passed to eval was in strict mode;
Whether the code surrounding the eval was in strict mode; and
Whether the eval was direct or indirect.

Each of these is a binary choice, so there are eight potential configurations for an eval. Each of the eight cases specifies both:

Whether the eval sees the current scope or the global one;
Whether variables introduced in the eval are seen outside of it.

We can crisply describe all of these choices in a table:

Strict outside?	Strict inside?	Direct or Indirect?	Local or global scope?	Affects scope?
Yes	Yes	Indirect	Global	No
No	Yes	Indirect	Global	No
Yes	No	Indirect	Global	Yes
No	No	Indirect	Global	Yes
Yes	Yes	Direct	Local	No
No	Yes	Direct	Local	No
Yes	No	Direct	Local	No
No	No	Direct	Local	Yes

The last column shows whether eval can affect some scope, and the second column shows whether the string passed to eval is strict mode code. Some patterns emerge here that make some of the design decisions of eval clear. For example:

If the eval is indirect it always uses global scope; if direct it always uses local scope.
If the string passed to eval is strict mode code, then variable declarations will not be seen outside the eval.
An indirect eval behaves the same regardless of the strictness of its context, while direct eval is sensitive to it.
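
These patterns are regular enough that the whole table can be written as a few lines of code. The following is a minimal sketch, not part of S5: classifyEval and its field names are hypothetical, and it only restates the table rather than performing any evaluation.

```javascript
// Hypothetical helper that encodes the eight rows of the table.
function classifyEval({ strictOutside, strictInside, direct }) {
  // A direct eval inherits strictness from the code around it;
  // an indirect eval is strict only if its own string says so.
  const runsAsStrict = strictInside || (direct && strictOutside);
  return {
    scope: direct ? "local" : "global",
    affectsScope: !runsAsStrict,
  };
}

// Enumerate the configurations in the same order as the table.
const rows = [];
for (const direct of [false, true]) {
  for (const strictInside of [true, false]) {
    for (const strictOutside of [true, false]) {
      const config = { strictOutside, strictInside, direct };
      rows.push({ ...config, ...classifyEval(config) });
    }
  }
}
console.log(rows.map((row) => row.affectsScope));
// [ false, false, true, true, false, false, false, true ]
```

Writing it this way highlights that only two questions matter: whether the call is direct, and whether the evaluated code ends up running as strict mode code.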

Engineering `eval`

To specify eval, we need to somehow both detect these different configurations, and evaluate code with the right combination of visible environment and effects. To do so, we start with a flexible primitive that lets us evaluate code in an environment expressed as an object:

internal-eval(string, env-object)

This internal-eval expects env-object to be an object whose fields represent the environment to evaluate in. No identifiers other than those in the passed-in environment are bound. For example, a call like:

internal-eval("x + y", { "x" : 2, "y" : 5 })

would evaluate to 7, using the values of the "x" and "y" fields from the environment object as the bindings for the identifiers x and y. With this core primitive, we have the control we need to implement all the different versions of eval.
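
To get a feel for this primitive, here is a rough approximation in plain JavaScript. It is only a sketch under loud assumptions: internalEvalSketch is a hypothetical name, it handles a single expression rather than a whole program, and unlike the real internal-eval it cannot hide global variables, because code built with the Function constructor still sees the global scope.

```javascript
// Hypothetical approximation of internal-eval, not the S5 primitive itself.
function internalEvalSketch(source, envObject) {
  const names = Object.keys(envObject);
  const values = names.map((name) => envObject[name]);
  // Each field of envObject becomes a parameter, so it is bound as an
  // identifier while the expression in source is evaluated.
  const run = new Function(...names, "return (" + source + ");");
  return run(...values);
}

const sum = internalEvalSketch("x + y", { x: 2, y: 5 });
console.log(sum); // 7
```

Local variables of the caller are not visible to the evaluated expression, which is the property that lets the caller decide exactly which bindings the code may see.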

In previous posts, we talked about the overall strategy of our evaluator for JavaScript. The relevant high-level point for this discussion is that we define a core language, dubbed S5, that contains only the essential features of JavaScript. Then, we define a source-to-source transformer, called desugar, that converts JavaScript programs to S5 programs. Since our evaluator is defined only over S5, we need to use desugar in our interpreter to perform the evaluation step. Semantically, the evaluation of internal-eval is then:

internal-eval(string, env-object) -> desugar(string)[x₁ / v₁, ..., xₙ / vₙ]
  for each field xᵢ : vᵢ in env-object
  (where [x / v] indicates substitution of the value v for the identifier x)

It is the combination of desugar and the customizable environment argument to internal-eval that lets us implement all of JavaScript's eval forms. We actually desugar all calls to JavaScript's eval into a call to a function defined in S5 called maybeDirectEval, which performs all the necessary checks to construct the correct environment for the eval.

Leveraging S5's Eval

With our implementation of eval, we have made progress on a few fronts.

Analyzing more JavaScript: We can now tackle more programs than any of our prior formal semantics for JavaScript. For example, we can actually run all of the complicated evals in Secure ECMAScript, and print the heap inside a use of a sandboxed eval. This enables new kinds of analyses that we haven't been able to perform before.

Understanding scripting languages' eval: Other scripting languages, like Ruby and Python, also have eval. Their implementations are closer to our internal-eval, in that they take dictionary arguments that specify the bindings that are available inside the evaluation. Is something like internal-eval, which was inspired by well-known semantic considerations, a useful underlying mechanism to use to describe all of these?

The implementation of S5 is open-source, and a detailed report of our strategy and test results is appearing at the Dynamic Languages Symposium. Check them out if you'd like to learn more!

The Brown PLT Blog