S5: Engineering Eval
Tags: JavaScript, Programming Languages, Semantics
Posted on 21 October 2012.
In an earlier post, we introduced S5, our semantics for ECMAScript 5.1 (ES5). S5 is no toy, but strives to correctly model JavaScript's messy details.
One such messy detail of JavaScript is eval. The behavior of eval was
updated in the ES5 specification to make it less surprising and give
more control to programmers. However, the old behavior was left intact
for backwards compatibility. This has led to a language construct with
a number of subtle behaviors. Today, we're going to explore JavaScript's
eval, explain its several modes, and describe our approach to
engineering an implementation of it.
Quiz Time!
We've put together a short quiz to give you a tour of the various
types of eval in JavaScript. How many can you get right on the first
try?
Question 1
f(1) === 2
This example returns 2 because the var declaration in the eval
actually refers to the same variables as the body of the function. So,
the eval body overwrites the x parameter, and the function returns the
new value.
Question 2
f(1) === 1
The 'use strict'; directive creates a new scope for variables defined
inside the eval. So, the var x = 2; still evaluates, but doesn't affect
the x that is the function's parameter. These first two examples show
that strict mode changes the scope that eval affects. We might ask, now
that we've seen these, what scope does eval see?
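For reference, the strict case from Question 2 can be sketched as follows. The original snippet is not shown in this text, so the function body is an assumption chosen to match the stated answer.

```javascript
// Hypothetical reconstruction, not the original quiz code.
function f(x) {
  // The directive makes the eval'd code strict, so "var x" lives in a
  // fresh scope that belongs to the eval and disappears afterwards.
  eval("'use strict'; var x = 2;");
  return x;
}

console.log(f(1)); // 1
```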
Question 3
f(1) === ReferenceError: y is not defined
OK, that was sort of a trick question. This program throws an
exception saying that y is unbound. But it serves to remind us of an
important JavaScript feature: if a variable isn't defined in a scope,
trying to access it throws an exception. Now we can ask the obvious
question: can we see y if we define it outside the eval?
Question 4
f(1) === 2
OK, here's our real answer. The y is certainly visible inside the
eval, which can both see and affect the outer scope. What if the eval
is strict?
Question 5
f(1) === 1
Interestingly, we don't get an error here, so it seems like y was
visible to the eval even in strict mode. However, as before the
assignment doesn't escape. New topic next.
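A sketch of the strict variant from Question 5, again with an assumed function body since the quiz code is not reproduced in this text:

```javascript
// Hypothetical reconstruction, not the original quiz code.
function f(x) {
  var y = 2;
  // The strict eval can still read the outer y, but its own "var x"
  // stays inside the eval and never reaches the parameter.
  eval("'use strict'; var x = y;");
  return x;
}

console.log(f(1)); // 1
```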
Question 6
f(1) === ReferenceError: y is not defined
OK, that was a gimme. Let's add the variable declaration we need.
Question 7
f(1) === ReferenceError: y is not defined
What's going on here? We defined a variable and it isn't visible
like it was before, and all we did was rename eval.
Let's try a simpler example.
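The renaming effect from Question 7 can be sketched like this. The variable name indirect and the function body are assumptions (the quiz code is not reproduced here), and the sketch assumes no global y exists.

```javascript
// Hypothetical reconstruction, not the original quiz code.
// Assumes that no global variable y exists.
function f(x) {
  var y = 2;
  var indirect = eval;    // calling eval through another name
  indirect("var x = y;"); // runs in the global scope, where y is unbound
  return x;
}

let caught = null;
try {
  f(1);
} catch (e) {
  caught = e;
}
console.log(caught.name); // "ReferenceError"
```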
Question 8
f(1) === 1
OK, so somehow we aren't seeing the assignment to x either... Let's
try making one more observation:
Question 9
x === 2
Whoa! So that eval changed the x in the global scope. This is what the
specification refers to as an indirect eval: a call to eval that
doesn't use a direct reference to the variable eval.
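Questions 8 and 9 can be observed together in one sketch. The function body is an assumption, and globalThis.x is used here to make the read of the global x explicit; the original quiz simply checked x.

```javascript
// Hypothetical reconstruction, not the original quiz code.
function f(x) {
  var indirect = eval;
  indirect("var x = 2;"); // indirect eval: declares x in the global scope
  return x;               // the parameter is untouched
}

const result = f(1);
console.log(result);       // 1
console.log(globalThis.x); // 2
```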
Question 10 (On the home stretch!)
f(1) === 1
Before, when we had "use strict"; inside the eval, we saw that the
variable declarations did not escape. Here, the "use strict"; is
outside, but we see the same thing: the value of 1 simply flows through
to the return statement unaffected. We also know that we aren't doing
the same thing as the indirect eval from the previous question, because
we didn't affect the global scope.
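A sketch of Question 10 follows. Placing the directive at the top of the function body, rather than at the top of the file, is an assumption made so the example works wherever it is pasted; the original quiz code is not reproduced here.

```javascript
// Hypothetical reconstruction, not the original quiz code.
function f(x) {
  "use strict";       // the code surrounding the eval is strict
  eval("var x = 2;"); // a direct eval called from strict code is itself strict
  return x;
}

console.log(f(1)); // 1
```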
Question 11 (last one!)
f(1) === 1
x === 2
Unlike in the previous question, this indirect eval has the same
behavior as before: it affects the global scope. The presence of a
"use strict"; appears to mean something different to an indirect versus
a direct eval.
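Question 11 can be sketched the same way, with the same caveats: the function body and the name indirect are assumptions, and the directive is placed inside the function for portability.

```javascript
// Hypothetical reconstruction, not the original quiz code.
function f(x) {
  "use strict";
  var indirect = eval;    // reading eval is allowed in strict mode
  indirect("var x = 2;"); // indirect eval ignores the caller's strictness
  return x;
}

const result = f(1);
console.log(result);       // 1
console.log(globalThis.x); // 2
```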
Capturing all the Evals
We saw three factors that could affect the behavior of eval above:
- Whether the code passed to eval was in strict mode;
- Whether the code surrounding the eval was in strict mode; and
- Whether the eval was direct or indirect.
Each of these is a binary choice, so there are eight potential
configurations for an eval. Each of the eight cases specifies both:
- Whether the eval sees the current scope or the global one; and
- Whether variables introduced in the eval are seen outside of it.
We can crisply describe all of these choices in a table:
Strict outside? | Strict inside? | Direct or Indirect? | Local or global scope? | Affects scope? |
---|---|---|---|---|
Yes | Yes | Indirect | Global | No |
No | Yes | Indirect | Global | No |
Yes | No | Indirect | Global | Yes |
No | No | Indirect | Global | Yes |
Yes | Yes | Direct | Local | No |
No | Yes | Direct | Local | No |
Yes | No | Direct | Local | No |
No | No | Direct | Local | Yes |
Rows where eval can affect some scope are shown in red (rows where it
cannot are in blue), and rows where the string passed to eval is strict
mode code are in bold.
Some patterns emerge here that make some of the design decisions of
eval clear. For example:
- If the eval is indirect it always uses global scope; if direct it always uses local scope.
- If the string passed to eval is strict mode code, then variable declarations will not be seen outside the eval.
- An indirect eval behaves the same regardless of the strictness of its context, while direct eval is sensitive to it.
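These patterns are regular enough to compress the whole table into a few lines. The function below is only an illustration of the table, not part of S5; the name evalBehavior and its return shape are hypothetical.

```javascript
// Illustrative encoding of the table above (hypothetical helper, not S5 code).
function evalBehavior(strictOutside, strictInside, direct) {
  // Direct evals use the local scope; indirect evals use the global one.
  const scope = direct ? "local" : "global";
  // The eval'd code is effectively strict if the string itself is strict,
  // or if a direct eval inherits strictness from its surrounding code.
  const effectivelyStrict = strictInside || (direct && strictOutside);
  // Only non-strict eval code can introduce variables into that scope.
  return { scope: scope, affectsScope: !effectivelyStrict };
}

console.log(evalBehavior(true, false, true));  // { scope: 'local', affectsScope: false }
console.log(evalBehavior(true, false, false)); // { scope: 'global', affectsScope: true }
```

Writing it this way makes the third pattern visible in the code: strictOutside only matters when direct is true.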
Engineering eval
To specify eval, we need to somehow both detect these different
configurations, and evaluate code with the right combination of visible
environment and effects. To do so, we start with a flexible primitive
that lets us evaluate code in an environment expressed as an object:
internal-eval(string, env-object)
This internal-eval expects env-object to be an object whose fields
represent the environment to evaluate in. No identifiers other than
those in the passed-in environment are bound. For example, a call like:
internal-eval("x + y", { "x" : 2, "y" : 5 })
would evaluate to 7, using the values of the "x" and "y" fields from
the environment object as the bindings for the identifiers x and y.
With this core primitive, we have the control we need to implement all
the different versions of eval.
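To get a feel for this interface, here is a rough approximation in plain JavaScript. It is a sketch under stated assumptions, not S5's implementation: the name internalEval is hypothetical, it only handles expression strings, and unlike the real internal-eval it does not hide global identifiers, because code compiled with the Function constructor can still see the global scope.

```javascript
// Rough approximation of the internal-eval interface (hypothetical helper).
// Limitations: expression strings only, and globals remain visible.
function internalEval(source, envObject) {
  const names = Object.keys(envObject);
  const values = names.map((name) => envObject[name]);
  // Each field of envObject becomes a parameter of a freshly compiled
  // function, so the field names act as the bound identifiers.
  const compiled = new Function(...names, "return (" + source + ");");
  return compiled(...values);
}

console.log(internalEval("x + y", { x: 2, y: 5 })); // 7
```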
In previous posts, we talked about the overall strategy of our
evaluator for JavaScript. The relevant high-level point for this
discussion is that we define a core language, dubbed S5, that contains
only the essential features of JavaScript. Then, we define a
source-to-source transformer, called desugar, that converts JavaScript
programs to S5 programs. Since our evaluator is defined only over S5,
we need to use desugar in our interpreter to perform the evaluation
step. Semantically, the evaluation of internal-eval is then:
internal-eval(string, env-object) -> desugar(string)[x1 / v1, ...] for each x1 : v1 in env-object (where [x / v] indicates substitution)
It is the combination of desugar and the customizable environment
argument to internal-eval that lets us implement all of JavaScript's
eval forms. We actually desugar all calls to JavaScript's eval into a
call to a function defined in S5 called maybeDirectEval, which performs
all the necessary checks to construct the correct environment for the
eval.
Leveraging S5's Eval
With our implementation of eval, we have made progress on a few
fronts.
Analyzing more JavaScript: We can now tackle more programs than any of
our prior formal semantics for JavaScript. For example, we can actually
run all of the complicated evals in Secure ECMAScript, and print the
heap inside a use of a sandboxed eval. This enables new kinds of
analyses that we haven't been able to perform before.
Understanding scripting languages' eval: Other scripting languages,
like Ruby and Python, also have eval. Their implementations are closer
to our internal-eval, in that they take dictionary arguments that
specify the bindings that are available inside the evaluation. Is
something like internal-eval, which was inspired by well-known semantic
considerations, a useful underlying mechanism to use to describe all of
these?
The implementation of S5 is open-source, and a detailed report of our strategy and test results is appearing at the Dynamic Languages Symposium. Check them out if you'd like to learn more!