3.6 Tables

Tables consist of a sequence of zero or more rows, which each contain an equal amount of entries in named columns.

3.6.1 Creating Tables

Tables are created with a table: expression, which lists any number of columns, with optional annotations, and then any number of rows. For example, this expression creates a table with three columns, name, age, and favorite-color, and three rows:

Examples:

my-table = table: name :: String, age :: Number, favorite-color :: String
  row: "Bob", 12, "blue"
  row: "Alice", 17, "green"
  row: "Eve", 13, "red"
end

Note that my-table is referred to in many of the following examples.

3.6.2 Loading Tables

Table loading expressions allow for the importing of tables from the outside world into Pyret. Currently, only Google Sheets is supported. In addition to data sources, the notion of sanitizers is used. These are used to properly load each entry of the table as the correct type; for example, the string-sanitizer in the data-source module causes each item in its column to be loaded as a string (if it is not a string, it is first converted to one). This is illustrated by the following example:

Examples:

import data-source as DS
import gdrive-sheets as GS
music-ssheet = GS.load-spreadsheet("<some-spreadsheet-id>")

music = load-table: artist :: String, title :: String, year, sales :: Number
  source: music-ssheet.sheet-by-name("Sales", false)
  sanitize artist using DS.string-sanitizer
  sanitize title using DS.string-sanitizer
  sanitize year using DS.strict-num-sanitizer
  sanitize sales using DS.strict-num-sanitizer
end

In general, it is safest to sanitize every input column, since it is the only way to guarantee that the data source will not guess the column’s type incorrectly.

Data sources are currently an internal concept to Pyret, so no public interface for creating them is supported.

While the data-source library provides sanitizers which should cover most use cases, there may be times when one would like to create a custom data sanitizer. To do so, one must simply create a function which conforms to the Sanitizer<A,B> type in the data-source module.

3.6.3 Selecting Columns

The select expression can be used to create a new table from a subset of the columns of an existing one. For example, we can get just the names and ages from my-table above:

names-and-ages = select name, age from my-table end
check:
  names-and-ages is table: name, age
    row: "Bob", 12
    row: "Alice", 17
    row: "Eve", 13
  end
end

3.6.4 Filtering Tables

The sieve mechanism allows for filtering out rows of tables based on some criteria. The using keyword specifies which columns may be used in the body of the sieve expression.

For instance, we can find the individuals in my-table who are old enough to drive in the United States.

can-drive = sieve my-table using age:
  age >= 16
end
check:
  can-drive is table: name, age
    row: "Alice", 17
  end
end

Note that the sieve block must explicitly list the columns used to filter out values with using. The following would signal an undefined name error for age, because names being used in the expression body must be listed:

can-drive = sieve my-table using name:
  # age is not visible inside of this expression
  age >= 16
end

3.6.5 Ordering Tables

Since a table consists of a sequence of rows, one may desire to arrange those rows in some particular order. This can be done with any column whose type supports the use of < and > by using an order expression:

age-ordered = order my-table:
  age descending
end
check:
  age-ordered is table: name, age
    row: "Alice", 17
    row: "Eve", 13
    row: "Bob", 12
  end
end

3.6.6 Transforming Tables

The transform expression allows the changing of columns within a table, similar to the map function over lists (and, just like map, transform expressions do not mutate the table, but instead return a new one).

Suppose we find out that my-table is wrong and everyone is actually a year older than it says they are. We can fix our data as follows:

age-fixed = update my-table using age:
  age: age + 1
end
check:
  age-fixed is table: name, age
    row: "Bob", 13
    row: "Alice", 18
    row: "Eve", 14
  end
end

3.6.7 Extracting Columns from Tables

A large number of Pyret modules work on lists instead of tables, so it may be desired to pull the contents of one column of a table as a list to use it elsewhere. The extract mechanism allows this ability, and serves as the primary "link" between processing tabular data and non-tabular Pyret functions.

Suppose, for example, we wanted just the names of each person in my-table. We could pull those names out as follows:

name-list = extract name from my-table end
check:
  name-list is [list: "Bob", "Alice", "Eve"]
end

3.6.8 Extending Tables

There are two types of extensions which can be made to tables: "mapping" and "reducing" columns. A "mapping" column is one whose contents depend only on the row it is being added to. An example of this would be a column which tells whether the age field of a given row in my-table indicates that the person in that row can drive in the United States or not (i.e. whether that person is at least 16):

can-drive-col = extend my-table using age:
  can-drive: age >= 16
end
check:
  can-drive-col is table: name, age, can-drive
    row: "Bob", 12, false
    row: "Alice", 17, true
    row: "Eve", 13, false
  end
end

Note that just like in transform, it is required to specify which columns will be used in the body of the extend expression using the using keyword.

Conversely, a "reducing" column is one whose information is computed from the row it is being added to and the rows above that row. For example, given can-drive-col from the previous example, suppose we would like to keep a running total of how many people are able to drive. Importing the tables module allows us to do this:

import tables as TS
num-can-drive-col = extend can-drive-col using can-drive:
  count-true = TS.running-fold(0, {(acc, cur): acc + (if cur: 1 else: 0 end)})
  num-can-drive: count-true of can-drive
end
check:
  num-can-drive-col = table: name, age, can-drive, num-can-drive
    row: "Bob", 12, false, 0
    row: "Alice", 17, true, 1
    row: "Eve", 13, false, 1
  end
end

While the reducers found in the tables module should cover most all use cases, there may be times when one would like to create a reducer of their own. To do so, one must construct an object of the following type:

Reducer<Acc,InVal,OutVal>

.one :: (value-from-column :: InVal) -> {Acc; OutVal}

.reduce :: (accumulator :: Acc, value-from-column :: InVal) -> {Acc; OutVal}

Reducers are essentially descriptions of folds (in the list fold sense) over table columns. The way reducers are called by the language runtime is as follows: the value(s) from the first row are passed to the reducer’s .one method, which should return a tuple containing both any accumulated information needed for the fold and the value which should be placed in the new column in that row. The remaining rows are then sequentially populated using the reducer’s .reduce method, which is identical to the .one method except that it receives an additional argument which is the previously mentioned accumulated information from the previous row.

To illustrate, a running-mean reducer which is equivalent to the one provided by the tables module could be implemented as follows:

import tables as TS
running-mean :: TS.Reducer<{Number; Number}, Number, Number> = {
  one: lam(n): {{n; 1}; n} end,
  reduce: lam({sum; count}, n):
    { {sum + n; count + 1}; (sum + n) / (count + 1) }
  end
}

3.6.8.1 Reducers

The following reducers are provided:

running-mean :: Reducer<Number, Number, Number>

difference-from :: (start-value :: Number) -> Reducer<Number, Number, Number>

running-fold :: (start-value :: Result, combiner :: (Result, Col -> Result)) -> Reducer<Result, Col, Result>

check:
  count-if-driver = TS.running-fold(0, {(sum, col): if col >= 16: 1 + sum else: sum end})
  t = table: name, age
    row: "Bob", 17
    row: "Mary", 22
    row: "Jane", 6
    row: "Jim", 15
    row: "Barbara", 30
  end
  with-driver-count = extend t using age:
    total-drivers: count-if-driver of age
  end

  with-drive-count is table: name, age, total-drivers
    row: "Bob", 17, 1
    row: "Mary", 22, 2
    row: "Jane", 6, 2
    row: "Jim", 15, 2
    row: "Barbara", 30, 3
  end
end

running-reduce :: (combiner :: (Col, Col -> Col)) -> Reducer<Col, Col, Col>

Creates a reducer that combines the first value in the column with the second, then the result of that combination with the third, then the result of that combination with the fourth, and so on.

check:
  running-product = TS.running-reduce(lam(x, y): x * y end)

  t = table: outcome, probability
    row: "H-T", 0.5
    row: "T-T", 0.25
    row: "T-T", 0.25
    row: "T-T", 0.25
    row: "H-T", 0.5
    row: "H-T", 0.5
    row: "H-H", 0.25
  end

  with-cumulative = extend t using probability:
    cumulative: running-product of probability
  end

  extract cumulative from with-cumulative end
    is [list: 0.5, 0.125, 0.03125, 0.0078125, 0.00390625, 0.001953125, 0.00048828125]

end

difference :: Reducer<Number, Number, Number>

running-max :: Reducer<Number, Number, Number>

running-min :: Reducer<Number, Number, Number>

running-sum :: Reducer<Number, Number, Number>

3.6.9 Comparing Tables

The order of both rows and columns are part of a table value. To be considered equal, tables need to have all the same rows and columns, with the rows and columns appearing in the same order.

← prev up next →

1	Getting Started
2	Language Concepts
3	Builtins and Libraries
4	Internals

3.1	Global Utilities
3.2	Numbers
3.3	Strings
3.4	Booleans
3.5	Raw Array
3.6	Tables
3.7	lists
3.8	sets
3.9	arrays
3.10	string-dict
3.11	option
3.12	pick
3.13	either
3.14	srcloc
3.15	pprint
3.16	s-exp
3.17	s-exp-structs
3.18	image-structs
3.19	image
3.20	world
3.21	reactors
3.22	plot

3.6.1	Creating Tables
3.6.2	Loading Tables
3.6.3	Selecting Columns
3.6.4	Filtering Tables
3.6.5	Ordering Tables
3.6.6	Transforming Tables
3.6.7	Extracting Columns from Tables
3.6.8	Extending Tables
3.6.9	Comparing Tables