3.6 Tables
Tables consist of a sequence of zero or more rows, which each contain an equal amount of entries in named columns.
3.6.1 Creating Tables
Tables are created with a table: expression, which lists any number of columns, with optional annotations, and then any number of rows. For example, this expression creates a table with three columns, name, age, and favorite-color, and three rows:
my-table = table: name :: String, age :: Number, favorite-color :: String row: "Bob", 12, "blue" row: "Alice", 17, "green" row: "Eve", 13, "red" end
Note that my-table is referred to in many of the following examples.
3.6.2 Loading Tables
Table loading expressions allow for the importing of tables from the outside world into Pyret. Currently, only Google Sheets is supported. In addition to data sources, the notion of sanitizers is used. These are used to properly load each entry of the table as the correct type; for example, the string-sanitizer in the data-source module causes each item in its column to be loaded as a string (if it is not a string, it is first converted to one). This is illustrated by the following example:
import data-source as DS import gdrive-sheets as GS music-ssheet = GS.load-spreadsheet("<some-spreadsheet-id>") music = load-table: artist :: String, title :: String, year, sales :: Number source: music-ssheet.sheet-by-name("Sales", false) sanitize artist using DS.string-sanitizer sanitize title using DS.string-sanitizer sanitize year using DS.strict-num-sanitizer sanitize sales using DS.strict-num-sanitizer end
In general, it is safest to sanitize every input column, since it is the only way to guarantee that the data source will not guess the column’s type incorrectly.
Data sources are currently an internal concept to Pyret, so no public interface for creating them is supported.
While the data-source library provides sanitizers which should cover most use cases, there may be times when one would like to create a custom data sanitizer. To do so, one must simply create a function which conforms to the Sanitizer<A,B> type in the data-source module.
3.6.3 Selecting Columns
The select expression can be used to create a new table from a subset of the columns of an existing one. For example, we can get just the names and ages from my-table above:
names-and-ages = select name, age from my-table end check: names-and-ages is table: name, age row: "Bob", 12 row: "Alice", 17 row: "Eve", 13 end end
3.6.4 Filtering Tables
The sieve mechanism allows for filtering out rows of tables based on some criteria. The using keyword specifies which columns may be used in the body of the sieve expression.
For instance, we can find the individuals in my-table who are old enough to drive in the United States.
can-drive = sieve my-table using age: age >= 16 end check: can-drive is table: name, age row: "Alice", 17 end end
Note that the sieve block must explicitly list the columns used to filter out values with using. The following would signal an undefined name error for age, because names being used in the expression body must be listed:
can-drive = sieve my-table using name: # age is not visible inside of this expression age >= 16 end
3.6.5 Ordering Tables
Since a table consists of a sequence of rows, one may desire to arrange those rows in some particular order. This can be done with any column whose type supports the use of < and > by using an order expression:
age-ordered = order my-table: age descending end check: age-ordered is table: name, age row: "Alice", 17 row: "Eve", 13 row: "Bob", 12 end end
3.6.6 Transforming Tables
The transform expression allows the changing of columns within a table, similar to the map function over lists (and, just like map, transform expressions do not mutate the table, but instead return a new one).
age-fixed = update my-table using age: age: age + 1 end check: age-fixed is table: name, age row: "Bob", 13 row: "Alice", 18 row: "Eve", 14 end end
3.6.7 Extracting Columns from Tables
A large number of Pyret modules work on lists instead of tables, so it may be desired to pull the contents of one column of a table as a list to use it elsewhere. The extract mechanism allows this ability, and serves as the primary "link" between processing tabular data and non-tabular Pyret functions.
name-list = extract name from my-table end check: name-list is [list: "Bob", "Alice", "Eve"] end
3.6.8 Extending Tables
can-drive-col = extend my-table using age: can-drive: age >= 16 end check: can-drive-col is table: name, age, can-drive row: "Bob", 12, false row: "Alice", 17, true row: "Eve", 13, false end end
Note that just like in transform, it is required to specify which columns will be used in the body of the extend expression using the using keyword.
import tables as TS num-can-drive-col = extend can-drive-col using can-drive: count-true = TS.running-fold(0, {(acc, cur): acc + (if cur: 1 else: 0 end)}) num-can-drive: count-true of can-drive end check: num-can-drive-col = table: name, age, can-drive, num-can-drive row: "Bob", 12, false, 0 row: "Alice", 17, true, 1 row: "Eve", 13, false, 1 end end
While the reducers found in the tables module should cover most all use cases, there may be times when one would like to create a reducer of their own. To do so, one must construct an object of the following type:
Reducers are essentially descriptions of folds (in the list fold sense) over table columns. The way reducers are called by the language runtime is as follows: the value(s) from the first row are passed to the reducer’s .one method, which should return a tuple containing both any accumulated information needed for the fold and the value which should be placed in the new column in that row. The remaining rows are then sequentially populated using the reducer’s .reduce method, which is identical to the .one method except that it receives an additional argument which is the previously mentioned accumulated information from the previous row.
import tables as TS running-mean :: TS.Reducer<{Number; Number}, Number, Number> = { one: lam(n): {{n; 1}; n} end, reduce: lam({sum; count}, n): { {sum + n; count + 1}; (sum + n) / (count + 1) } end }
3.6.8.1 Reducers
The following reducers are provided:
check: count-if-driver = TS.running-fold(0, {(sum, col): if col >= 16: 1 + sum else: sum end}) t = table: name, age row: "Bob", 17 row: "Mary", 22 row: "Jane", 6 row: "Jim", 15 row: "Barbara", 30 end with-driver-count = extend t using age: total-drivers: count-if-driver of age end with-drive-count is table: name, age, total-drivers row: "Bob", 17, 1 row: "Mary", 22, 2 row: "Jane", 6, 2 row: "Jim", 15, 2 row: "Barbara", 30, 3 end end
Creates a reducer that combines the first value in the column with the second, then the result of that combination with the third, then the result of that combination with the fourth, and so on.
check: running-product = TS.running-reduce(lam(x, y): x * y end) t = table: outcome, probability row: "H-T", 0.5 row: "T-T", 0.25 row: "T-T", 0.25 row: "T-T", 0.25 row: "H-T", 0.5 row: "H-T", 0.5 row: "H-H", 0.25 end with-cumulative = extend t using probability: cumulative: running-product of probability end extract cumulative from with-cumulative end is [list: 0.5, 0.125, 0.03125, 0.0078125, 0.00390625, 0.001953125, 0.00048828125] end
3.6.9 Comparing Tables
The order of both rows and columns are part of a table value. To be considered equal, tables need to have all the same rows and columns, with the rows and columns appearing in the same order.