(Sub)Typing First Class Field Names
Tags: Programming Languages, Semantics, Types
Posted on 10 December 2012.This post picks up where a previous post left off, and jumps back into the action with subtyping.
Updating Subtyping
There are two traditional rules for subtyping records, historically known as width and depth subtyping. Width subtyping allows generalization by dropping fields; one record is a supertype of another if it contains a subset of the sub-record's fields at the same types. Depth subtyping allows generalization within a field; one record is a supertype of another if it is identical aside from generalizing one field to a supertype of the type in the same field in the sub-record.
We would like to understand both of these kinds of subtyping in the context of our first-class field names. With traditional record types, fields are either mentioned in the record or not. Thus, for each possible field in both types, there are four combinations to consider. We can describe width and depth subtyping in a table:
T1 |
T2 |
T1 <: T2 if... |
---|---|---|
f: S |
f: T |
S <: T |
f: - |
f: T |
Never |
f: S |
f: - |
Always |
f: - |
f: - |
Always |
We read f: S
as saying that T1
has
the field f
with type S
, and we read f:
-
as saying that the corresponding type doesn't mention the field
f
. The first row of the table corresponds to
depth subtyping, where the field f
is still
present, but at a more general type in T2
. The
second row is a failure to subtype, when T2
has
a field that isn't mentioned at all in T1
. The
third row corresponds to width subtyping, where a field is
dropped and not mentioned in the supertype. The last row is a trivial
case, where neither type mentions the field.
For records with string patterns, we can extend this table with new combinations to account for ○ and ↓ annotations. The old rows remain, and become the ↓ cases, and new rows are added for ○ annotations:
T1 |
T2 |
T1 <: T2 if... |
---|---|---|
f↓: S |
f↓: T |
S <: T |
f○: S |
f↓: T |
Never |
f: - |
f↓: T |
Never |
f↓: S |
f○: T |
S <: T |
f○: S |
f○: T |
S <: T |
f: - |
f○: T |
Never |
f↓: S |
f: - |
Always |
f○: S |
f: - |
Always |
f: - |
f: - |
Always |
Here, we see that it is safe to treat a definitely present field as a
possibly-present one, in the case where we compare
f↓:S
to f○:T
). The dual
of this case, treating a possibly-present field as definitely-present,
is unsafe, as the comparison of f○:S
to
f↓:T
shows. Possibly present annotations do not
allow us to invent fields, as having f: -
on the
left-hand-side is still only permissible if the right-hand-side also
doesn't mention f
.
Giving Types to Values
In order to ascribe these rich types to object values, we need rules for typing basic objects, and then we need to apply these subtyping rules to generalize them. As a working example, one place where objects with field patterns come up every day in practice is in JavaScript arrays. Arrays in JavaScript hold their elements in fields named by stringified numbers. Thus, a simplified type for a JavaScript array of booleans is roughly:
BoolArrayFull: { [0-9]+○: Bool }
That is, each field made up of a sequence of digits is possibly present, and if it is there, it has a boolean value. For simplicity, let's consider a slightly neutered version of this type, where only single digit fields are allowed:
BoolArray: { [0-9]○: Bool }
Let's think about how we'd go about typing a value that should
clearly have this type: the array [true, false]
. We can
think of this array literal as desugaring into an object like (indeed,
this is what λJS
does):
We would like to be able to state that this object is a member of the
BoolArray
type above. The traditional rule for record
typing would ascribe a type mapping the names that are present to the
types of their right hand side. Since the fields are certainly present,
in our notation we can write:
{"0"↓: Bool, "1"↓: Bool}
This type should certainly be generalizable to
BoolArray
. That is, it should hold (using the rules in the
table above) that:
{"0"↓: Bool, "1"↓: Bool} <: { [0-9]○: Bool }
Let's see what happens when we instantiate the table for these two types:
T1 |
T2 |
T1 <: T2 if... |
---|---|---|
0↓: Bool |
0○: Bool |
Bool <: Bool |
1↓: Bool |
1○: Bool |
Bool <: Bool |
3: - |
3○: Bool |
Fail! |
4: - |
4○: Bool |
Fail! |
... | ... | ... |
(We cut off the table for 5-9, which are the same as the cases for 3 and 4). Our subtyping fails to hold for these types, which don't let us reflect the fact that the fields 3 and 4 are actually absent, and we should be allowed to consider them as possibly present at the boolean type. In fact, our straightforward rule for typing records is in fact responsible for throwing away this information! The type that it ascribed,
{"0"↓: Bool, "1"↓: Bool}
is actually the type of many objects, including those that happen to
have fields like "banana" : 42
. Traditional record typing
drops fields when it doesn't care if they are present or absent, which
loses information about definitive absence.
We extend our type language once more to keep track of this information. We add an explicit piece of a record type that tracks a description of the fields that are definitely absent on an object, and use this for typing object literals:
p = ○ | ↓ T = ... | { Lp : T ..., LA: abs }
Thus, the new type for ["0": true, "1": false]
would be:
{"0"↓: Bool, "1"↓: Bool, ("0"|"1"): abs}
Here, the overbar denotes
regular-expression complement, and this type is expressing that all
fields other than "0"
and "1"
are definitely
absent.
Adding another type of field annotation requires that we again extend our table of subtyping options, so we now have a complete description with 16 cases:
T1 |
T2 |
T1 <: T2 if... |
---|---|---|
f↓: S |
f↓: T |
S <: T |
f○: S |
f↓: T |
Never |
f: abs |
f↓: T |
Never |
f: - |
f↓: T |
Never |
f↓: S |
f○: T |
S <: T |
f○: S |
f○: T |
S <: T |
f: abs |
f○: T |
Always |
f: - |
f○: T |
Never |
f↓: S |
f: abs |
Never |
f○: S |
f: abs |
Never |
f: abs |
f: abs |
Always |
f: - |
f: abs |
Never |
f↓: S |
f: - |
Always |
f○: S |
f: - |
Always |
f: abs |
f: - |
Always |
f: - |
f: - |
Always |
We see that absent fields cannot be generalized to be definitely present
(the abs
to f↓
case), but
they can be generalized to be possibly present at any type.
This is expressed in the case that compares f : abs
to f○: T
, which always holds for any
T
. To see these rules in action, we can instantiate them
for the array example we've been working with to ask a new question:
{"0"↓: Bool, "1"↓: Bool, ("0"|"1"): abs} <: { [0-9]○: Bool }
And the table:
T1 |
T2 |
T1 <: T2 if... |
---|---|---|
0↓: Bool |
0○: Bool |
Bool <: Bool |
1↓: Bool |
1○: Bool |
Bool <: Bool |
3: abs |
3○: Bool |
OK! |
4: abs |
4○: Bool |
OK! |
... | ... | ... |
9: abs |
9○: Bool |
OK! |
foo: abs |
foo: - |
OK! |
bar: abs |
bar: - |
OK! |
... | ... | ... |
There's two things that make this possible. First, it is sound to generalize the absent fields that are possibly present on the array type, because the larger type doesn't guarantee their presence either. Second, it is sound to generalize absent fields that aren't mentioned on the array type, because unmentioned fields can be present or absent with any type. The combination of these two features of our subtyping relation lets us generalize from particular array instances to the more general type for arrays.
Capturing the Whole Table
The tables above present subtyping on a field-by-field basis, and the patterns we considered at first were finite. In the last case, however, the pattern of “fields other than 0 and 1” was in fact infinite, and we cannot actually construct that infinite table to describe subtyping. The writeup and its associated proof document lay out an algorithmic version of the rules presented in the tables above, and also provides a proof of their soundness.
The writeup also discusses another interesting problem, which is the
interaction between these pattern types and inheritance, where patterns
on the child and parent objects may overlap in subtle ways. It goes
further and discusses what happens in cases like JavaScript, where the
field "__proto__"
is an accessible member that has
inheritance semantics. Check it all out here!