diff --git a/doc/Tablicious.qch b/doc/Tablicious.qch index f21c712d..5a0647f3 100644 Binary files a/doc/Tablicious.qch and b/doc/Tablicious.qch differ diff --git a/doc/html/API-Alphabetically.html b/doc/html/API-Alphabetically.html index cc612ceb..04b990af 100644 --- a/doc/html/API-Alphabetically.html +++ b/doc/html/API-Alphabetically.html @@ -1,6 +1,6 @@ - +
+ + + + + + + + + + + + + + + + + + + + + + -out =
colon (lo, hi)
¶out =
colon (lo, inc, hi)
¶Generate a sequence of uniformly-spaced values. +
+This method implements the behavior for the colon operator (lo:hi
or
+lo:inc:hi
calls) for the datetime type.
+
"Uniformly-spaced" means uniform in terms of the duration or calendarDuration +value used as the increment. Calendar durations are not necessarily equal-sized in +terms of the amount of actual time contained in them, so when using a +calendarDuration as the increment, the resulting vector may not be, and often will +not be, uniformly spaced in terms of actual (non-"calendar") time. +
+The inc argument may be a duration, calendarDuration, or numeric. Numerics
+are taken to be a number of days (uniform-size days, not calendar days), and are
+converted to a duration object with duration.ofDays (inc)
. The default value
+for inc, used in the two-arg lo:hi
form is 1, that is, 1 day of exactly 24
+hours.
+
Returns a datetime vector. +
+WARNING: There are issues with negative-direction sequences. When hi is less than +lo, this will always produce an empty array, even if inc is a negative value. +And there are cases with calendarDurations that have both Months, Days and/or Times +with mixed signs in which values may move in the "wrong" direction, or produce an +infinite loop. If these problem cases can be correctly identified, but not +corrected, they may raise an error in future releases of Tablicious. +
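The colon forms described above can be sketched as follows. This is a minimal, untested illustration that assumes a Tablicious version which includes this `colon` method and that the package is installed under the name `tablicious`. The variable names are made up for the example, and only positive-direction sequences are used because of the warning above.

```octave
pkg load tablicious

lo = datetime ('2011-03-04');
hi = datetime ('2011-03-10');

% Two-argument form: the increment defaults to 1 day of exactly 24 hours
daily = lo:hi;

% Numeric increment: taken as a number of fixed-length days
every_other_day = lo:2:hi;

% duration increment: fixed-length steps of 12 hours
half_days = lo:hours (12):hi;

% calendarDuration increment: steps of one calendar month, which are
% not necessarily equal amounts of actual time
hi_later = datetime ('2011-09-04');
monthly = lo:calmonths (1):hi_later;
```

Each result is expected to be a datetime vector, as stated in the description above.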
+Tabular data array containing multiple columnar variables. -
-See table. -
Convert an array to a table. -
-See array2table. -
Convert a cell array to a table. -
-See cell2table. -
Convert struct to a table. -
-See struct2table. -
See tableOuterFillValue. -
Filter by variable type for use in subscripting. -
-See vartype. -
True if input is a ‘table’ array or other table-like type, false otherwise. -
-See istable. -
True if input is a ‘timetable’ array or other timetable-like type, false otherwise. -
-See istimetable. -
True if input is either a ‘table’ or ‘timetable’ array, or an object like them. -
-See istabular. -
Evaluate an expression against a table array’s variables. -
-See tblish.evalWithTableVars. -
Statistics by group for a table array. -
-See tblish.table.grpstats. -
A string array of Unicode strings. -
-See string. -
“Not-a-String”. -
-See NaS. -
Test if strings contain a pattern. -
-See contains. -
Display strings for array. -
-See dispstrs. -
Categorical variable array. -
-See categorical. -
True if input is a ‘categorical’ array, false otherwise. -
-See iscategorical. -
“Not-a-Categorical”. -
-See NaC. -
Group data into discrete bins or categories. -
-See discretize. -
Represents points in time using the Gregorian calendar. -
-See datetime. -
“Not-a-Time”. -
-See NaT. -
Convert input to a Tablicious datetime array, with convenient interface. -
-See todatetime. -
Represents a complete day using the Gregorian calendar. -
-See localdate. -
True if input is a ‘datetime’ array, false otherwise. -
-See isdatetime. -
Durations of time using variable-length calendar periods, such as days, months, and years, which may vary in length over time. -
-See calendarDuration. -
True if input is a ‘calendarDuration’ array, false otherwise. -
-See iscalendarduration. -
Create a ‘calendarDuration’ that is a given number of calendar months long. -
-See calmonths. -
Construct a ‘calendarDuration’ a given number of years long. -
-See calyears. -
Duration in days. -
-See days. -
Represents durations or periods of time as an amount of fixed-length time (i.e. -
-See duration. -
Create a ‘duration’ X hours long, or get the hours in a ‘duration’ X. -
-See hours. -
True if input is a ‘duration’ array, false otherwise. -
-See isduration. -
Create a ‘duration’ X milliseconds long, or get the milliseconds in a ‘duration’ X. -
-See milliseconds. -
Create a ‘duration’ X minutes long, or get the minutes in a ‘duration’ X. -
-See minutes. -
Create a ‘duration’ X seconds long, or get the seconds in a ‘duration’ X. -
-See seconds. -
List all the time zones defined on this system. -
-See timezones. -
Create a ‘duration’ X years long, or get the years in a ‘duration’ X. -
-See years. -
See mustBeA. -
See mustBeCellstr. -
See mustBeCharvec. -
See mustBeFinite. -
See mustBeInteger. -
See mustBeMember. -
See mustBeNonempty. -
See mustBeNumeric. -
See mustBeReal. -
See mustBeSameSize. -
See mustBeScalar. -
See mustBeScalarLogical. -
See mustBeVector. -
Apply a function to column vectors in array. -
-See colvecfun. -
Display strings for array. -
-See dispstrs. -
Get first K rows of an array. -
-See head. -
See isfile. -
See isfolder. -
Alias for prettyprint, for interactive use. -
-See pp. -
Expand scalar inputs to match size of non-scalar inputs. -
-See scalarexpand. -
Format an array size for display. -
-See size2str. -
Split data into groups and apply function. -
-See splitapply. -
Get last K rows of an array. -
-See tail. -
Apply function to vectors in array along arbitrary dimension. -
-See vecfun. -
Approximate size of an array in bytes, with object support. -
-See tblish.sizeof2. -
Example dataset collection. -
-See tblish.datasets. -
The ‘tblish.dataset’ class provides convenient access to the various datasets included with Tablicious. -
-See tblish.dataset. -
Conditioning plot. -
-See tblish.examples.coplot. -
Plot pairs of variables against each other. -
-See tblish.examples.plot_pairs. -
The classic Suppliers-Parts example database. -
-See tblish.examples.SpDb. -
Tablicious for GNU Octave is covered by the GNU GPLv3 and other Free and Open Source Software licenses. -
-The main code of Tablicious is licensed under the GNU GPL version 3. -
-The date/time portion of Tablicious includes some Unicode data files licensed under the Unicode License Agreement - Data Files and Software license. -
-The Tablicious test suite contains some files, specifically some table-related tests using MP-Test like t/t_01_table.m
, which are BSD 3-Clause licensed, and are adapted from MATPOWER written by Ray Zimmerman.
-
The Fisher Iris dataset is Public Domain. -
-This manual is for Tablicious, version 0.4.4-SNAPSHOT. -
-Copyright © 2019, 2023, 2024 Andrew Janke -
--- -Permission is granted to make and distribute verbatim copies of -this manual provided the copyright notice and this permission notice -are preserved on all copies. -
-Permission is granted to copy and distribute modified versions of this -manual under the conditions for verbatim copying, provided that the entire -resulting derived work is distributed under the terms of a permission -notice identical to this one. -
-Permission is granted to copy and distribute translations of this manual -into another language, under the same conditions as for modified versions. -
Many of Tablicious’ example data sets are based on the example datasets
-found in R’s datasets
package. R can be found at
-https://www.r-project.org/, and documentation for its datasets
-is at https://rdrr.io/r/datasets/datasets-package.html.
-Thanks to the R developers for producing the original data sets here.
-
Tablicious’ examples’ code tries to replicate the R examples, so it can -be useful to compare the two of them if you are moving from one language to -another. -
-Core Octave currently lacks some of the plotting features found in the R -examples, such as LOWESS smoothing and linear model characteristic plots, so -you will just find “TODO” placeholders for these in Tablicious’ example code. -
-Tablicious provides the datetime
class for representing points in time.
-
There’s also duration
and calendarDuration
for representing
-periods or durations of time. These are like vector quantities along the time line,
-as opposed to datetime
being a point along the time line.
-
While the underlying data representation of datetime
is compatible with
-(in fact, identical to) that of datenums, you cannot directly combine them
-via assignment, concatenation, or most arithmetic operations.
-
This is because of the signature of the datetime
constructor. When combining
-objects and primitive types like double
, the primitive type is promoted to an
-object by calling the other object’s one-argument constructor on it. However, the
-one-argument numeric-input constructor for datetime
does not accept datenums:
-it interprets its input as datevecs instead. This is due to a design decision on
-Matlab’s part; for compatibility, Octave does not alter that interface.
-
To combine datetime
s with datenums, you can convert the datenums to datetime
s
-by calling datetime.ofDatenum
or datetime(x, 'ConvertFrom', 'datenum')
, or you
-can convert the datetime
s to datenums by accessing their dnums
field with
-x.dnums
.
-
Examples: -
-dt = datetime('2011-03-04') -dn = datenum('2017-01-01') -[dt dn] - ⇒ error: datenum: expected date vector containing [YEAR, MONTH, DAY, HOUR, MINUTE, SECOND] -[dt datetime.ofDatenum(dn)] - ⇒ 04-Mar-2011 01-Jan-2017 -
Also, if you have a zoned datetime
, you can’t combine it with a datenum, because datenums
-do not carry time zone information.
-
Tablicious’s time zone data is drawn from the IANA Time Zone Database, also known as the “Olson Database”. Tablicious includes a -copy of this database in its distribution so it can work on Windows, which does -not supply it like Unix systems do. -
-You can use the timezones
function to list the time zones known to Tablicious. These will be
-all the time zones in the IANA database on your system (for Linux and macOS) or in the IANA
-time zone database redistributed with Tablicious (for Windows).
-
-- -Note: The IANA Time Zone Database only covers dates from about the year 1880 to 2038. Converting -time zones for
datetime
s outside that range is currently unimplemented. (Tablicious -needs to add support for proleptic POSIX time zone rules, which are used to govern -behavior outside that date range.) -
Tablicious comes with several example data sets that you can use to explore how
-its functions and objects work. These are accessed through the
-tblish.datasets
and tblish.dataset
classes.
-
To see a list of the available data sets, run tblish.datasets.list()
.
-Then to load one of the example data sets, run
-tblish.datasets.load('examplename')
. For example:
-
tblish.datasets.list -t = tblish.datasets.load('cupcake') -
You can also load it by calling tblish.dataset.<name>
. This does
-the same thing. For example:
-
t = tblish.dataset.cupcake -
When you load a data set, it either returns all its data in a single variable -(if you capture it), or loads its data into one or more variables in your -workspace (if you call it with no outputs). -
-Each example data set comes with help text that describes the data set and
-provides examples of how to work with it. This help is found using the doc
-command on tblish.dataset.<name>
, where <name> is the name of
-the data set.
-
For example: -
-doc tblish.dataset.cupcake -
(The command help tblish.dataset.<name>
ought to work too, but it
-currently doesn’t. This may be due to an issue with Octave’s help
-command.)
-
The easiest way to obtain Tablicious is by using Octave’s pkg
package manager.
-To install the development prerelease of Tablicious, run this in Octave:
-
pkg install https://github.com/apjanke/octave-tablicious/releases/download/v0.4.4-SNAPSHOT/tablicious-0.4.4-SNAPSHOT.tar.gz -
(Check the releases page at https://github.com/apjanke/octave-tablicious/releases to -find out what the actual latest release number is.) -
-For development, you can obtain the source code for Tablicious from the project repo on GitHub at -https://github.com/apjanke/octave-tablicious. Make a local clone of the repo. -Then add the inst directory in the repo to your Octave path. -
- - ---Time is an illusion. Lunchtime doubly so. -
-
This is the manual for the Tablicious package version 0.4.4-SNAPSHOT for GNU Octave. -
-Tablicious provides somewhat-Matlab-compatible tabular data and date/time support for
-GNU Octave.
-This includes a table
class with support for filtering and join operations;
-datetime
, duration
, and related classes;
-Missing Data support; string
and categorical
data types;
-and other miscellaneous things.
-
This document is a work in progress. You are invited to help improve it and -submit patches. -
-Tablicious’s classes are designed to be convenient to use while still being efficient. -The data representations used by Tablicious are designed to be efficient and suitable -for working with large-ish data sets. A “large-ish” data set is one that can have -millions of elements or rows, but still fits in main computer memory. Tablicious’s main -relational and arithmetic operations are all implemented using vectorized -operations on primitive Octave data types. -
-Tablicious was written by Andrew Janke <floss@apjanke.net>. Support can be -found on the Tablicious project -GitHub page. -
- -Tablicious is based on Matlab’s table and date/time APIs and supports some of -their major functionality. -But not all of it is implemented yet. The missing parts are currently: -
-readtable()
and writetable()
-summary()
categorical
-.
-indexing
-timetable
-'ConvertFrom'
forms for datetime
and duration
constructors
-datetime
-between
-caldiff
-dateshift
-week
-isdst
, isweekend
-calendarDuration.split
-duration.Format
support
-fillmissing
-UTCOffset
and DSTOffset
fields in the output of timezones()
-It is the author’s hope that many of these will be implemented some day. -
-These areas of missing functionality are tracked on the Tablicious issue -tracker at https://github.com/apjanke/octave-tablicious/issues and -https://github.com/users/apjanke/projects/3. -
- -out =
NaC ()
¶out =
NaC (sz)
¶“Not-a-Categorical”. Creates missing-valued categorical arrays. -
-Returns a new categorical
array of all missing values of
-the given size. If no input sz is given, the result is a scalar missing
-categorical.
-
NaC
is the categorical
equivalent of NaN
or NaT
. It
-represents a missing, invalid, or null value. NaC
values never compare
-equal to any value, including other NaC
s.
-
NaC
is a convenience function which is strictly a wrapper around
-categorical.undefined
and returns the same results, but may be more convenient
-to type and/or more readable, especially in array expressions with several values.
-
See also: categorical.undefined -
-out =
NaS ()
¶out =
NaS (sz)
¶“Not-a-String”. Creates missing-valued string arrays. -
-Returns a new string
array of all missing values of
-the given size. If no input sz is given, the result is a scalar missing
-string.
-
NaS
is the string
equivalent of NaN
or NaT
. It
-represents a missing, invalid, or null value. NaS
values never compare
-equal to any value, including other NaS
s.
-
NaS
is a convenience function which is strictly a wrapper around
-string.missing
and returns the same results, but may be more convenient
-to type and/or more readable, especially in array expressions with several values.
-
See also: string.missing -
-out =
NaT ()
¶out =
NaT (sz)
¶“Not-a-Time”. Creates missing-valued datetime arrays. -
-Constructs a new datetime
array of all NaT
values of
-the given size. If no input sz is given, the result is a scalar NaT
.
-
NaT
is the datetime
equivalent of NaN
. It represents a missing
-or invalid value. NaT
values never compare equal to, greater than, or less
-than any value, including other NaT
s. Doing arithmetic with a NaT
and
-any other value results in a NaT
.
-
NaT
currently cannot create NaT arrays of type localdate
. To do that,
-use localdate.NaT instead.
-
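The behavior described for `NaT` can be sketched like this. It is an untested illustration that assumes the package is installed as `tablicious`; the variable names are hypothetical, and no particular display output is claimed.

```octave
pkg load tablicious

% Scalar NaT, and a 2-by-3 array of NaT values
t = NaT ();
ts = NaT ([2 3]);

% NaT never compares equal to anything, including another NaT,
% so this is expected to be false per the description above
same = (t == t);

% Arithmetic with a NaT is described as producing a NaT
shifted = t + hours (1);
```

`NaC` and `NaS` are documented with the same calling forms, so the same pattern should apply to categorical and string arrays.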
There are two main ways to construct a table
array: build one up by combining
-multiple variables together, or convert an existing tabular-organized array into a
-table
.
-
To build an array from multiple variables, use the table(…)
constructor, passing
-in all of your variables as separate inputs. It takes any number of inputs. Each input
-becomes a table variable in the new table
object. If you pass your constructor
-inputs directly from variables, it automatically picks up their names and uses them
-as the table variable names. Otherwise, if you’re using more complex expressions, you’ll
-need to supply the 'VariableNames'
option.
-
To convert a tabular-organized array of another type into a table
, use the
-conversion functions like array2table
, struct2table
and cell2table
.
-array2table
and cell2table
take each column of the input array and turn
-it into a separate table variable in the resulting table
. struct2table
takes
-the fields of a struct and puts them into table variables.
-
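The two construction routes described above might look like this in practice. This is a sketch only, assuming the package is installed as `tablicious`; the variable and field names are invented for the example.

```octave
pkg load tablicious

% Route 1: build from separate variables.
% Names are picked up automatically from the input variables.
id = [1; 2; 3];
name = {'alpha'; 'beta'; 'gamma'};
t1 = table (id, name);

% With expressions as inputs there are no names to pick up,
% so supply them with the 'VariableNames' option.
t2 = table ([1; 2; 3], {'alpha'; 'beta'; 'gamma'}, ...
  'VariableNames', {'id', 'name'});

% Route 2: convert an existing tabular-organized array.
% Each column of the matrix becomes one table variable.
t3 = array2table (magic (3), 'VariableNames', {'x', 'y', 'z'});

% Each field of the struct becomes one table variable.
s.id = [1; 2; 3];
s.score = [0.5; 0.7; 0.9];
t4 = struct2table (s);
```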
Tablicious provides the table
class for representing tabular data.
-
A table
is an array object that represents a tabular data structure. It holds
-multiple named “variables”, each of which is a column vector, or a 2-D matrix whose
-rows are read as records.
-
A table
is composed of multiple “variables”, each with a name, which all have
-the same number of rows. (A table
variable is like a “column” in SQL tables
-or in R or Python/pandas dataframes. Whenever you read “variable” here, think
-“column”.) Taken together, the i-th element or row of each variable compose
-a single record or observation.
-
Tables are good ways of arranging data if you have data that would otherwise be stored
-in a few separate variables which all need to be kept in the same shape and order,
-especially if you might want to do element-wise comparisons involving two or more of
-those variables. That’s basically all a table
is: it holds a collection of
-variables, and makes sure they are all kept aligned and ordered in the same way.
-
Tables are a lot like SQL tables or result sets, and are based on the same relational
-algebra theory that SQL is. Many common, even powerful, SQL operations can be done
-in Octave using table
arrays. It’s like having your own in-memory SQL engine.
-
Here’s a table (ha!) of what SQL and relational algebra operations correspond to
-what Octave table
operations.
-
In this table, t
is a variable holding a table
array, and ix
is
-some indexing expression.
-
SQL | Relational | Octave table |
---|---|---|
SELECT | PROJECT | subsetvars , t(:,ix) |
WHERE | RESTRICT | subsetrows , t(ix,:) |
INNER JOIN | JOIN | innerjoin |
OUTER JOIN | OUTER JOIN | outerjoin |
FROM table1, table2, … | Cartesian product | cartesian |
GROUP BY | SUMMARIZE | groupby |
DISTINCT | (automatic) | unique(t) |
Note that there is one big difference between relational algebra and SQL & Octave
-table
: Relations in relational algebra are sets, not lists.
-There are no duplicate rows in relational algebra, and there is no ordering.
-So every operation there does an implicit DISTINCT
/unique()
on its
-results, and there’s no ORDER BY
/sort()
. This is not the case in SQL
-or Octave table
.
-
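A few rows of the correspondence table can be sketched as below. This is an untested illustration with made-up data. It assumes the package is installed as `tablicious`, and it assumes that `innerjoin` called with two tables joins on the variable they share (here `id`), which the text above does not confirm.

```octave
pkg load tablicious

id = [1; 2; 3; 3];
qty = [10; 20; 30; 30];
orders = table (id, qty);

% WHERE / RESTRICT: keep only the rows matching a condition
big = orders(orders.qty > 15, :);

% SELECT / PROJECT: keep only some of the variables (columns)
just_ids = orders(:, 1);

% DISTINCT: unlike relational algebra, duplicates are kept
% unless you remove them explicitly
distinct_orders = unique (orders);

% INNER JOIN (assumption: joins on the shared variable "id")
parts = table ([1; 2; 3], {'bolt'; 'nut'; 'screw'}, ...
  'VariableNames', {'id', 'name'});
joined = innerjoin (orders, parts);
```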
Note for users coming from Matlab: Matlab does not provide a general groupby
-function. Instead, you have to variously use rowfun
, grpstats
,
-groupsummary
, and manual code to accomplish “group by” operations.
-
Note: I wrote this based on my understanding of relational algebra from reading -C. J. Date books. Other people’s understanding and terminology may differ. - apjanke -
- - -Tablicious has support for representing dates in time zones and for converting between time zones. -
-A datetime
may be "zoned" or "zoneless". A zoneless datetime
does not have a time zone
-associated with it. This is represented by an empty TimeZone
property on the datetime
-object. A zoneless datetime
represents the local time in some unknown time zone, and assumes a
-continuous time scale (no DST shifts).
-
A zoned datetime
is associated with a time zone. It is represented by having the time zone’s
-IANA zone identifier (e.g. 'UTC'
or 'America/New_York'
) in its TimeZone
-property. A zoned datetime
represents the local time in that time zone.
-
By default, the datetime
constructor creates unzoned datetime
s. To
-make a zoned datetime
, either pass the 'TimeZone'
option to the constructor,
-or set the TimeZone
property after object creation. Setting the TimeZone
-property on a zoneless datetime
declares that it’s a local time in that time zone.
-Clearing the TimeZone
property on a zoned datetime
turns it back into a
-zoneless datetime
without changing the local time it represents.
-
You can tell a zoned from a zoneless datetime in the object display because the time zone
-is included for zoned datetime
s.
-
% Create an unzoned datetime -d = datetime('2011-03-04 06:00:00') - ⇒ 04-Mar-2011 06:00:00 - -% Create a zoned datetime -d_ny = datetime('2011-03-04 06:00:00', 'TimeZone', 'America/New_York') - ⇒ 04-Mar-2011 06:00:00 America/New_York -% This is equivalent -d_ny = datetime('2011-03-04 06:00:00'); -d_ny.TimeZone = 'America/New_York' - ⇒ 04-Mar-2011 06:00:00 America/New_York - -% Convert it to Chicago time -d_chi = d_ny; -d_chi.TimeZone = 'America/Chicago' - ⇒ 04-Mar-2011 05:00:00 America/Chicago -
When you combine two zoned datetime
s via concatenation, assignment, or
-arithmetic, if their time zones differ, they are converted to the time zone of
-the left-hand input.
-
d_ny = datetime('2011-03-04 06:00:00', 'TimeZone', 'America/New_York') -d_la = datetime('2011-03-04 06:00:00', 'TimeZone', 'America/Los_Angeles') -d_la - d_ny - ⇒ 03:00:00 -
You cannot combine a zoned and an unzoned datetime
. This results in an error
-being raised.
-
-- - - -Warning: Normalization of "nonexistent" times (like between 02:00 and 03:00 on a "spring forward" -DST change day) is not implemented yet. The results of converting a zoneless local time -into a time zone where that local time did not exist are currently undefined. -
Tablicious provides several validation functions which can be used to check properties -of function arguments, variables, object properties, and other expressions. These can -be used to express invariants in your program and catch problems due to input errors, -incorrect function usage, or other bugs. -
-These validation functions are named following the pattern mustBeXxx
, where Xxx
-is some property of the input it is testing. Validation functions may check the type,
-size, or other aspects of their inputs.
-
The most common place for validation functions to be used will probably be at the -beginning of functions, to check the input arguments and ensure that the contract of -the function is not being violated. If in the future Octave gains the ability to -declaratively express object property constraints, they will also be of use there. -
-Be careful not to get too aggressive with the use of validation functions: while using -them can make sure invariants are followed and your program is correct, they also reduce -the code’s ability to make use of duck typing, reducing its flexibility. Whether you want -to make this trade-off is a design decision you will have to consider. -
-When a validation function’s condition is violated, it raises an error that includes a
-description of the violation in the error message. This message will include a label for
-the input that describes what is being tested. By default, this label is initialized
-with inputname()
, so when you are calling a validator on a function argument or
-variable, you will generally not need to supply a label. But if you’re calling it on
-an object property or an expression more complex than a simple variable reference, the
-validator cannot automatically detect the input name for use in the label. In this case,
-make use of the optional trailing argument(s) to the functions to manually supply a
-label for the value being tested.
-
% Validation of a simple variable does not need a label -mustBeScalar (x); -% Validation of a field or property reference does need a label -mustBeScalar (this.foo, 'this.foo'); -
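To see what a violated condition looks like, the error can be caught and inspected. This is a small sketch that assumes the package is installed as `tablicious`. The struct `s` and its field are hypothetical, and the exact wording of the message is not shown because it is not specified above.

```octave
pkg load tablicious

% A struct field is not a simple variable reference, so the label
% is passed explicitly as the trailing argument
s.count = [1 2 3];

msg = '';
try
  mustBeScalar (s.count, 's.count');
catch err
  % The message is described as including the label and the violation
  msg = err.message;
end
disp (msg);
```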
out =
array2table (c)
¶out =
array2table (…, 'VariableNames'
, VariableNames)
¶out =
array2table (…, 'RowNames'
, RowNames)
¶Convert an array to a table. -
-Converts a 2-D array to a table, with columns in the array becoming variables in -the output table. This is typically used on numeric arrays, but it can -be applied to any type of array. -
-You may not want to use this on cell arrays, though, because you will
-end up with a table that has all its variables of type cell. If you use
-cell2table
instead, columns of the cell array which can be
-condensed into primitive arrays will be. With array2table
, they
-won’t be.
-
See also: cell2table, table, struct2table -
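The difference between the two converters on cell input can be sketched as follows. This is an untested illustration that assumes the package is installed as `tablicious` and that `cell2table` accepts the same `'VariableNames'` option as `array2table`; the data and names are made up.

```octave
pkg load tablicious

c = {1, 'a'; 2, 'b'; 3, 'c'};

% array2table keeps every variable as a cell column
t_arr = array2table (c, 'VariableNames', {'n', 's'});

% cell2table condenses columns into primitive arrays where it can,
% so the numeric column is expected to come out as a double vector
t_cell = cell2table (c, 'VariableNames', {'n', 's'});

class (t_arr.n)
class (t_cell.n)
```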
-calendarDuration
Class ¶A calendarDuration
represents a period of time in variable-length calendar
-components. For example, years and months can have varying numbers of days, and days
-in time zones with Daylight Saving Time have varying numbers of hours. A
-calendarDuration
does arithmetic with "whole" calendar periods.
-
calendarDuration
s and duration
s cannot be directly combined, because
-they are not semantically equivalent. (This may be relaxed in the future to allow
-duration
s to be interpreted as numbers of days when combined with
-calendarDuration
s.)
-
d = datetime('2011-03-04 00:00:00') - ⇒ 04-Mar-2011 -cdur = calendarDuration(1, 3, 0) - ⇒ 1y 3mo -d2 = d + cdur - ⇒ 04-Jun-2012 -
Durations of time using variable-length calendar periods, such as days, -months, and years, which may vary in length over time. (For example, a -calendar month may have 28, 30, or 31 days.) -
-calendarDuration
: char
Sign ¶The sign (1 or -1) of this duration, which indicates whether it is a -positive or negative span of time. -
-calendarDuration
: char
Years ¶The number of whole calendar years in this duration. Must be integer-valued. -
-calendarDuration
: char
Months ¶The number of whole calendar months in this duration. Must be integer-valued. -
-calendarDuration
: char
Days ¶The number of whole calendar days in this duration. Must be integer-valued. -
-calendarDuration
: char
Hours ¶The number of whole hours in this duration. Must be integer-valued. -
-calendarDuration
: char
Minutes ¶The number of whole minutes in this duration. Must be integer-valued. -
-calendarDuration
: char
Seconds ¶The number of seconds in this duration. May contain fractional values. -
-calendarDuration
: char
Format ¶The format to display this calendarDuration
in. Currently unsupported.
-
This is a single value that applies to the whole array. -
-obj =
calendarDuration ()
¶Constructs a new scalar calendarDuration
of zero elapsed time.
-
out =
dispstrs (obj)
¶Get display strings for each element of obj. -
-Returns a cellstr the same size as obj. -
-out =
ismissing (obj)
¶True if input elements are missing. -
This is equivalent to isnan
.
-
Returns logical array the same size as obj. -
-out =
isnan (obj)
¶True if input elements are NaN. -
-This is equivalent to ismissing
, and is provided for compatibility
-and polymorphic programming purposes.
-
Returns logical array the same size as obj. -
-out =
minus (A, B)
¶Subtraction: Subtracts one calendarDuration
from another.
-
Returns a calendarDuration
.
-
out =
plus (A, B)
¶Addition: add two calendarDuration
s.
-
All the calendar elements (properties) of the two inputs are added -together. No normalization is done across the elements, aside from -the normalization of NaNs. -
-If B is numeric, it is converted to a calendarDuration
-using calendarDuration.ofDays
.
-
Returns a calendarDuration
.
-
out =
uminus (obj)
¶Unary minus. Negates the sign of obj. -
-out =
calyears (x)
¶Construct a calendarDuration
a given number of years long.
-
This is a shorthand for calling calendarDuration(x, 0, 0)
.
-
See calendarDuration. -
-Categorical variable array. -
-A categorical
array represents an array of values of a categorical
-variable. Each categorical
array stores the element values along
-with a list of the categories, and indicators of whether the categories
-are ordinal (that is, they have a meaningful mathematical ordering), and
-whether the set of categories is protected (preventing new categories
-from being added to the array).
-
In addition to the categories defined in the array, a categorical array
-may have elements of "undefined" value. This is not considered a
-category; rather, it is the absence of any known value. It is
-analogous to a NaN
value.
-
This class is not fully implemented yet. Missing stuff: -
-categorical
: uint16
code ¶The numeric codes of the array element values. These are indexes into the
-cats
category list.
-
This is a planar property. -
-categorical
: logical
tfMissing ¶A logical mask indicating whether each element of the array is missing -(that is, undefined). -
-This is a planar property. -
-categorical
: cellstr
cats ¶The names of the categories in this array. This is the list into which
-the code
values are indexes.
-
categorical
: scalar_logical
isOrdinal ¶A scalar logical indicating whether the categories in this array have an -ordinal relationship. -
-out =
addcats (obj, newcats)
¶Add categories to categorical array. -
-Adds the specified categories to obj, without changing any of -its values. -
-newcats is a cellstr listing the category names to add to -obj. -
-obj =
categorical ()
¶Constructs a new scalar categorical whose value is undefined. -
-obj =
categorical (vals)
¶obj =
categorical (vals, valueset)
¶obj =
categorical (vals, valueset, category_names)
¶obj =
categorical (…, 'Ordinal'
, Ordinal)
¶obj =
categorical (…, 'Protected'
, Protected)
¶Constructs a new categorical array from the given values. -
-vals is the array of values to convert to categoricals. -
-valueset is the set of all values from which vals is drawn. -If omitted, it defaults to the unique values in vals. -
-category_names is a list of category names corresponding to -valueset. If omitted, it defaults to valueset, converted -to strings. -
-Ordinal is a logical indicating whether the category values in -obj have a numeric ordering relationship. Defaults to false. -
-Protected indicates whether obj should be protected, which -prevents the addition of new categories to the array. Defaults to -false. -
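The constructor forms above can be sketched like this. It is an untested illustration that assumes the package is installed as `tablicious`. The data and category names are invented, and since the class is described as not fully implemented, some calls may behave differently in a given release.

```octave
pkg load tablicious

% One-argument form: the value set defaults to the unique input values
colors = categorical ({'red', 'green', 'red', 'blue'});
color_names = categories (colors);

% Explicit value set and category names, marked as ordinal
sizes = categorical ([1 3 2 1], [1 2 3], ...
  {'small', 'medium', 'large'}, 'Ordinal', true);

% Convert back to strings; elements use their category names
size_strs = cellstr (sizes);
```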
-out =
categories (obj)
¶Get a list of the categories in obj. -
-Gets a list of the categories in obj, identified by their -category names. -
-Returns a cellstr column vector. -
-out =
cellstr (obj)
¶Convert to cellstr. -
-Converts obj to a cellstr array. The strings will be the
-category names for corresponding values, or ''
for undefined
-values.
-
Returns a cellstr array the same size as obj. -
-out =
dispstrs (obj)
¶Display strings. -
-Gets display strings for each element in obj. The display strings are
-either the category string, or '<undefined>'
for undefined values.
-
Returns a cellstr array the same size as obj. -
-out =
double (obj)
¶Convert to double array, by getting the underlying code values. -
Converts obj to a double array. The doubles will be the
-underlying numeric code values of obj, or NaN
for
-undefined values.
-
The numeric code values of two different categorical arrays do -*not* necessarily correspond to the same string values, and can -*not* be meaningfully compared for equality or ordering. -
-Returns a double
array the same size as obj.
-
out =
iscategory (obj, catnames)
¶Test whether input is a category on a categorical array. -
-catnames is a cellstr listing the category names to check against -obj. -
-Returns a logical array the same size as catnames. -
-out =
ismissing (obj)
¶Test whether elements are missing. -
-For categorical arrays, undefined elements are considered to be -missing. -
-Returns a logical array the same size as obj. -
-out =
isnanny (obj)
¶Test whether elements are NaN-ish. -
Checks whether each element in obj is NaN-ish. For categorical -arrays, undefined values are considered NaN-ish; any other -value is not. -
-Returns a logical array the same size as obj. -
-out =
isordinal (obj)
¶Whether obj is ordinal. -
-Returns true if obj is ordinal (as determined by its
-IsOrdinal
property), and false otherwise.
-
out =
isundefined (obj)
¶Test whether elements are undefined. -
-Checks whether each element in obj is undefined. "Undefined" is
-a special value defined by categorical
. It is equivalent to
-a NaN
or a missing
value.
-
Returns a logical array the same size as obj. -
-out =
mergecats (obj, oldcats)
¶out =
mergecats (obj, oldcats, newcat)
¶Merge multiple categories. -
-Merges the categories oldcats into a single category. If newcat -is specified, that new category is added if necessary, and all of oldcats -are merged into it. newcat must be an existing category in obj if -obj is ordinal. -
-If newcat is not provided, all of oldcats are merged into
-oldcats{1}
.
-
out =
categorical.missing ()
¶out =
categorical.missing (sz)
¶Create an array of missing (undefined) categoricals. -
-Creates a categorical array whose elements are all missing (<undefined>). -
-This is a convenience alias for categorical.undefined, so you can call -it generically. It returns strictly the same results as calling -categorical.undefined with the same arguments. -
-Returns a categorical array. -
-See also: categorical.undefined -
-out =
removecats (obj)
¶Removes all unused categories from obj. This is equivalent to
-out = squeezecats (obj)
.
-
out =
removecats (obj, oldcats)
¶Remove categories from categorical array. -
-Removes the specified categories from obj. Elements of obj -whose values belonged to those categories are replaced with undefined. -
-oldcats is a cellstr listing the category names to remove from -obj. -
-out =
reordercats (obj)
¶out =
reordercats (obj, newcats)
¶Reorder categories. -
-Reorders the categories in obj to match newcats. -
-newcats is a cellstr that must be a reordering of obj’s existing -category list. If newcats is not supplied, sorts the categories -in alphabetical order. -
-out =
setcats (obj, newcats)
¶Set categories for categorical array. -
-Sets the categories to use for obj. If any current categories -are absent from the newcats list, current values of those -categories become undefined. -
-out =
squeezecats (obj)
¶Remove unused categories. -
-Removes all categories which have no corresponding values in obj’s -elements. -
-This is currently unimplemented. -
-out =
string (obj)
¶Convert to string array. -
-Converts obj to a string array. The strings will be the -category names for corresponding values, or <missing> for undefined -values. -
-Returns a string
array the same size as obj.
-
(obj)
¶Display summary of array’s values. -
-Displays a summary of the values in this categorical array. The output -may contain info like the number of categories, number of undefined values, -and frequency of each category. -
-out =
categorical.undefined ()
¶out =
categorical.undefined (sz)
¶Create an array of undefined categoricals. -
-Creates a categorical array whose elements are all <undefined>. -
-sz is the size of the array to create. If omitted or empty, creates -a scalar. -
-Returns a categorical array. -
-See also: categorical.missing -
-out =
cell2table (c)
¶out =
cell2table (…, 'VariableNames'
, VariableNames)
¶out =
cell2table (…, 'RowNames'
, RowNames)
¶Convert a cell array to a table. -
-Converts a 2-dimensional cell matrix into a table. Each column in the
-input c becomes a variable in out. For columns that contain
-all scalar values of cat
-compatible types, they are “popped out”
-of their cells and condensed into a homogeneous array of the contained
-type.
-
See also: array2table, table, struct2table -
-out =
colvecfun (fcn, x)
¶Apply a function to column vectors in array. -
-Applies the given function fcn to each column vector in the -array x, by iterating over the indexes along all dimensions except -dimension 1. Collects the function return values in an output array. -
-fcn must be a function which takes a column vector and returns a column -vector of the same size. It does not have to return the same type as -x. -
-Returns the result of applying fcn to each column in x, all concatenated -together in the same shape as x. -
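The column-wise iteration described above can be sketched in core Octave as follows. This is only an illustration of the idea, not the package's implementation; the variable names are made up, and the preallocation assumes fcn returns doubles.

```octave
% Sketch: apply a function to every column vector of an N-d array.
% Assumes fcn maps a column vector to a column vector of the same size.
x = rand (4, 3, 2);
fcn = @(col) cumsum (col);

nrows = size (x, 1);
ncols = numel (x) / nrows;          % columns across all trailing dimensions
x2 = reshape (x, nrows, ncols);     % view x as a 2-D stack of columns
out2 = zeros (nrows, ncols);        % assumes double output from fcn
for j = 1:ncols
  out2(:, j) = fcn (x2(:, j));
end
out = reshape (out2, size (x));     % restore the original shape
```

Reshaping to 2-D first avoids having to loop over each trailing dimension separately.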
-out =
contains (str, pattern)
¶out =
contains (…, 'IgnoreCase'
, IgnoreCase)
¶Test if strings contain a pattern. -
-Tests whether the given strings contain the given pattern(s). -
-str (char, cellstr, or string) is a list of strings to compare against -pattern. -
-pattern (char, cellstr, or string) is a list of patterns to match. These are -literal plain string patterns, not regex patterns. If more than one pattern -is supplied, the return value is true if the string matched any of them. -
-Returns a logical array of the same size as the string array represented by -str. -
-See also: startsWith, endsWith -
-description
(datasetName) ¶out =
description (datasetName)
¶Get or display the description for a dataset. -
-Gets the description for the named dataset. If the output is captured, -it is returned as a charvec containing plain text suitable for human display. -If the output is not captured, displays the description to the console. -
-()
¶out =
list ()
¶List all datasets. -
-Lists all the example datasets known to this class. If the output is -captured, returns the list as a table. If the output is not captured, -displays the list. -
-Returns a table with variables Name, Description, and possibly more. -
-datetime
Class ¶A datetime
is an array object that represents points in time in the familiar
-Gregorian calendar.
-
This is an attempt to reproduce the functionality of Matlab’s datetime
. It
-also contains some Octave-specific extensions.
-
The underlying representation is that of a datenum (a double
-containing the number of days since the Matlab epoch), but encapsulating it in an
-object provides several benefits: friendly human-readable display, type safety,
-automatic type conversion, and time zone support. In addition to the underlying
-datenum array, a datetime
includes an optional TimeZone
property
-indicating what time zone the datetimes are in.
-
So, basically, a datetime
is an object wrapper around a datenum array,
-plus time zone support.
-
Represents points in time using the Gregorian calendar. -
-The underlying values are doubles representing the number of days since the -Matlab epoch of "January 0, year 0". This has a precision of around nanoseconds -for typical times. -
-A datetime
array is an array of date/time values, with each element
-holding a complete date/time. The overall array may also have a TimeZone and a
-Format associated with it, which apply to all elements in the array.
-
This is an attempt to reproduce the functionality of Matlab’s datetime
. It
-also contains some Octave-specific extensions.
-
datetime
: double
dnums ¶The underlying datenums that represent the points in time. These are always in UTC. -
-This is a planar property: the size of dnums
is the same size as the
-containing datetime
array object.
-
datetime
: char
TimeZone ¶The time zone this datetime
array is in. Empty if this does not have a
-time zone associated with it (“unzoned”). The name of an IANA time zone if
-this does.
-
Setting the TimeZone
of a datetime
array changes the time zone it
-is presented in for strings and broken-down times, but does not change the
-underlying UTC times that its elements represent.
-
datetime
: char
Format ¶The format to display this datetime
in. Currently unsupported.
-
out =
datetime.NaT ()
¶out =
datetime.NaT (sz)
¶“Not-a-Time”: Creates NaT-valued arrays. -
-Constructs a new datetime
array of all NaT
values of
-the given size. If no input sz is given, the result is a scalar NaT
.
-
NaT
is the datetime
equivalent of NaN
. It represents a missing
-or invalid value. NaT
values never compare equal to, greater than, or less
-than any value, including other NaT
s. Doing arithmetic with a NaT
and
-any other value results in a NaT
.
-
out =
datetime.convertDatenumTimeZone (dnum, fromZoneId, toZoneId)
¶Convert a datenum from one time zone to another. -
-dnum is a datenum array to convert. -
-fromZoneId is a charvec containing the IANA Time Zone identifier for -the time zone to convert from. -
-toZoneId is a charvec containing the IANA Time Zone identifier for -the time zone to convert to. -
-Returns a datenum array the same size as dnum. -
-out =
datetime.datenum2posix (dnums)
¶Converts Octave datenums to Unix dates. -
-The input datenums are assumed to be in UTC. -
-Returns a double, which may have fractional seconds. -
-out =
datestruct (obj)
¶Converts this to a "datestruct" broken-down time structure. -
-A "datestruct" is a format of struct that Tablicious came up with. It is a scalar -struct with fields Year, Month, Day, Hour, Minute, and Second, each containing -a double array the same size as the date array it represents. -
-The values in the returned broken-down time are those of the local time -in obj’s defined time zone, if it has one. -
-Returns a struct with fields Year, Month, Day, Hour, Minute, and Second. -Each field contains a double array of the same size as this. -
-obj =
datetime ()
¶Constructs a new scalar datetime
containing the current local time, with
-no time zone attached.
-
obj =
datetime (datevec)
¶obj =
datetime (datestrs)
¶obj =
datetime (in, 'ConvertFrom'
, inType)
¶obj =
datetime (Y, M, D, H, MI, S)
¶obj =
datetime (Y, M, D, H, MI, MS)
¶obj =
datetime (…, 'Format'
, Format, 'InputFormat'
, InputFormat, 'Locale'
, InputLocale, 'PivotYear'
, PivotYear, 'TimeZone'
, TimeZone)
¶Constructs a new datetime
array based on input values.
-
out =
diff (obj)
¶Differences between elements. -
-Computes the difference between each successive element in obj, as a
-duration
.
-
Returns a duration
array the same size as obj.
-
out =
dispstrs (obj)
¶Get display strings for each element of obj. -
-Returns a cellstr the same size as obj. -
-out =
eq (A, B)
¶True if A is equal to B. This defines the ==
operator
-for datetime
s.
-
Inputs are implicitly converted to datetime
using the one-arg
-constructor or conversion method.
-
Returns logical array the same size as obj. -
-out =
ge (A, B)
¶True if A is greater than or equal to B. This defines the >=
operator
-for datetime
s.
-
Inputs are implicitly converted to datetime
using the one-arg
-constructor or conversion method.
-
Returns logical array the same size as obj. -
-out =
gmtime (obj)
¶Convert to TM_STRUCT structure in UTC time. -
-Converts obj to a TM_STRUCT style structure array. The result is in -UTC time. If obj is unzoned, it is assumed to be in UTC time. -
-Returns a struct array in TM_STRUCT style. -
-out =
gt (A, B)
¶True if A is greater than B. This defines the >
operator
-for datetime
s.
-
Inputs are implicitly converted to datetime
using the one-arg
-constructor or conversion method.
-
Returns logical array the same size as obj. -
-[h, m, s] =
hms (obj)
¶Get the Hour, Minute, and Second components of obj. -
-For zoned datetime
s, these will be local times in the associated time zone.
-
Returns double arrays the same size as obj
.
-
out =
isbetween (obj, lower, upper)
¶Tests whether the elements of obj are between lower and -upper. -
-All inputs are implicitly converted to datetime
arrays, and are subject
-to scalar expansion.
-
Returns a logical array the same size as the scalar expansion of the inputs. -
-out =
isnan (obj)
¶True if input elements are NaT. This is an alias for isnat
-to support type compatibility and polymorphic programming.
-
Returns logical array the same size as obj. -
-out =
isnat (obj)
¶True if input elements are NaT. -
-Returns logical array the same size as obj. -
-out =
le (A, B)
¶True if A is less than or equal to B. This defines the <=
operator
-for datetime
s.
-
Inputs are implicitly converted to datetime
using the one-arg
-constructor or conversion method.
-
Returns logical array the same size as obj. -
-out =
linspace (from, to, n)
¶Linearly-spaced values in date/time space. -
-Constructs a vector of datetime
s that represent linearly spaced points
-starting at from and going up to to, with n points in the
-vector.
-
from and to are implicitly converted to datetime
s.
-
n is how many points to use. If omitted, defaults to 100. -
-Returns an n-long datetime
vector.
-
out =
localtime (obj)
¶Convert to TM_STRUCT structure in local time. -
-Converts obj to a TM_STRUCT style structure array. The result is a -local time in the system default time zone. Note that the system default -time zone is always used, regardless of what TimeZone is set on obj. -
-If obj is unzoned, it is assumed to be in UTC time. -
-Returns a struct array in TM_STRUCT style. -
-Example: -
dt = datetime; -dt.TimeZone = datetime.SystemTimeZone; -tm_struct = localtime (dt); -
out =
lt (A, B)
¶True if A is less than B. This defines the <
operator
-for datetime
s.
-
Inputs are implicitly converted to datetime
using the one-arg
-constructor or conversion method.
-
Returns logical array the same size as obj. -
-out =
minus (A, B)
¶Subtraction (-
operator). Subtracts a duration
,
-calendarDuration
or numeric B from a datetime
A,
-or subtracts two datetime
s from each other.
-
If both inputs are datetime
, then the output is a duration
.
-Otherwise, the output is a datetime
.
-
Numeric B inputs are implicitly converted to duration
using
-duration.ofDays
.
-
Returns an array the same size as A. -
-out =
ne (A, B)
¶True if A is not equal to B. This defines the !=
operator
-for datetime
s.
-
Inputs are implicitly converted to datetime
using the one-arg
-constructor or conversion method.
-
Returns logical array the same size as obj. -
-obj =
datetime.ofDatenum (dnums)
¶Converts a datenum array to a datetime array. -
-Returns an unzoned datetime
array of the same size as the input.
-
obj =
datetime.ofDatestruct (dstruct)
¶Converts a datestruct to a datetime array. -
-A datestruct is a special struct format used by Tablicious that has fields -Year, Month, Day, Hour, Minute, and Second. It is not a standard Octave datatype. -
-Returns an unzoned datetime
array.
-
out =
plus (A, B)
¶Addition (+
operator). Adds a duration
, calendarDuration
,
-or numeric B to a datetime
A.
-
A must be a datetime
.
-
Numeric B inputs are implicitly converted to duration
using
-duration.ofDays
.
-
Returns datetime
array the same size as A.
-
dnums =
datetime.posix2datenum (pdates)
¶Converts POSIX (Unix) times to datenums -
-pdates (numeric) is an array of POSIX dates. A POSIX date is the number -of seconds since January 1, 1970 UTC, excluding leap seconds. The output -is implicitly in UTC. -
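The relationship between POSIX times and datenums comes down to a change of epoch and unit. A minimal sketch of that arithmetic in core Octave, not necessarily how the package computes it (the constant and variable names here are illustrative):

```octave
% Sketch: datenum <-> POSIX arithmetic, both taken to be in UTC.
SECONDS_PER_DAY = 86400;
unix_epoch_dnum = datenum (1970, 1, 1);   % datenum of 1970-01-01 00:00

pdates = [0 86400 1.5e9];                 % example POSIX times, in seconds
dnums = unix_epoch_dnum + pdates / SECONDS_PER_DAY;

% Inverse conversion: datenums back to POSIX seconds
pdates2 = (dnums - unix_epoch_dnum) * SECONDS_PER_DAY;
```

Because datenums are doubles counting days, the round trip is only accurate to within floating-point error, which is well below a millisecond for dates near the present.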
-out =
posixtime (obj)
¶Converts this to POSIX time values (seconds since the Unix epoch) -
-Converts this to POSIX time values that represent the same time. The -returned values will be doubles that may include fractional second values. -POSIX times are, by definition, in UTC. -
-Returns double array of same size as this. -
-[keysA, keysB] =
proxyKeys (a, b)
¶Computes proxy key values for two datetime arrays. Proxy keys are numeric -values whose rows have the same equivalence relationships as the elements of -the inputs. -
-This is primarily for Tablicious’s internal use; users will typically not need to call -it or know how it works. -
-Returns two 2-D numeric matrices of size n-by-k, where n is the number of elements -in the corresponding input. -
-out =
timeofday (obj)
¶Get the time of day (elapsed time since midnight). -
-For zoned datetime
s, these will be local times in the associated time zone.
-
Returns a duration
array the same size as obj
.
-
out =
week (obj)
¶Get the week of the year. -
-This method is unimplemented. -
-[y, m, d] =
ymd (obj)
¶Get the Year, Month, and Day components of obj. -
-For zoned datetime
s, these will be local times in the associated time zone.
-
Returns double arrays the same size as obj
.
-
[y, m, d, h, mi, s] =
ymdhms (obj)
¶Get the Year, Month, Day, Hour, Minute, and Second components of obj. -
-For zoned datetime
s, these will be local times in the associated time zone.
-
Returns double arrays the same size as obj
.
-
out =
days (x)
¶Duration in days. -
-If x is numeric, then out is a duration
array in units
-of fixed-length 24-hour days, with the same size as x.
-
If x is a duration
, then returns a double
array the same
-size as x indicating the number of fixed-length days that each duration
-is.
-
[Y, E] =
discretize (X, n)
¶[Y, E] =
discretize (X, edges)
¶[Y, E] =
discretize (X, dur)
¶[Y, E] =
discretize (…, 'categorical'
)
¶[Y, E] =
discretize (…, 'IncludedEdge'
, IncludedEdge)
¶Group data into discrete bins or categories. -
-n is the number of bins to group the values into. -
-edges is an array of edge values defining the bins. -
-dur is a duration
value indicating the length of time of each
-bin.
-
If 'categorical'
is specified, the resulting values are a categorical
-array instead of a numeric array of bin indexes.
-
Returns: - Y - the bin index or category of each value from X - E - the list of bin edge values -
-out =
dispstrs (x)
¶Display strings for array. -
-Gets the display strings for each element of x. The display strings -should be short, one-line, human-presentable strings describing the -value of that element. -
-The default implementation of dispstrs
can accept input of any
-type, and has decent implementations for Octave’s standard built-in types,
-but will have opaque displays for most user-defined objects.
-
This is a polymorphic method that user-defined classes may override -with their own custom display that is more informative. -
-Returns a cell array the same size as x. -
-duration
Class ¶A duration
represents a period of time in fixed-length seconds (or minutes, hours,
-or whatever you want to measure it in.)
-
A duration
has a resolution of about a nanosecond for typical dates. The underlying
-representation is a double
representing the number of days elapsed, similar to a
-datenum, except it’s interpreted as relative to some other reference point you provide,
-instead of being relative to the Matlab/Octave epoch.
-
You can add or subtract a duration
to a datetime
to get another datetime
.
-You can also add or subtract durations
to each other.
-
Represents durations or periods of time as an amount of fixed-length -time (i.e. fixed-length seconds). It does not care about calendar things -like months and days that vary in length over time. -
-This is an attempt to reproduce the functionality of Matlab’s duration
. It
-also contains some Octave-specific extensions.
-
Duration values are stored as double numbers of days, so they are an -approximate type. In display functions, by default, they are displayed with -millisecond precision, but their actual precision is closer to nanoseconds -for typical times. -
-duration
: double
days ¶The underlying datenums that represent the durations, as number of (whole and -fractional) days. These are uniform 24-hour days, not calendar days. -
-This is a planar property: the size of days
is the same size as the
-containing duration
array object.
-
duration
: char
Format ¶The format to display this duration
in. Currently unsupported.
-
out =
char (obj)
¶Convert to char. The contents of the strings will be the same as
-returned by dispstrs
.
-
This is primarily a convenience method for use on scalar objs. -
-Returns a 2-D char array with one row per element in obj. -
-out =
dispstrs (obj)
¶Get display strings for each element of obj. -
-Returns a cellstr the same size as obj. -
-out =
hours (obj)
¶Equivalent number of hours. -
-Gets the number of fixed-length 60-minute hours that is equivalent -to this duration. -
-Returns double array the same size as obj. -
-out =
linspace (from, to, n)
¶Linearly-spaced values in time duration space. -
-Constructs a vector of duration
s that represent linearly spaced points
-starting at from and going up to to, with n points in the
-vector.
-
from and to are implicitly converted to duration
s.
-
n is how many points to use. If omitted, defaults to 100. -
-Returns an n-long duration
vector.
-
out =
milliseconds (obj)
¶Equivalent number of milliseconds. -
-Gets the number of milliseconds that is equivalent -to this duration. -
-Returns double array the same size as obj. -
-out =
minutes (obj)
¶Equivalent number of minutes. -
-Gets the number of fixed-length 60-second minutes that is equivalent -to this duration. -
-Returns double array the same size as obj. -
-obj =
duration.ofDays (dnums)
¶Converts a double array representing durations in whole and fractional days
-to a duration
array. This is the method that is used for implicit conversion
-of numerics in many cases.
-
Returns a duration
array of the same size as the input.
-
out =
seconds (obj)
¶Equivalent number of seconds. -
-Gets the number of seconds that is equivalent -to this duration. -
-Returns double array the same size as obj. -
-out =
years (obj)
¶Equivalent number of years. -
-Gets the number of fixed-length 365.2425-day years that is equivalent -to this duration. -
-Returns double array the same size as obj. -
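Since durations are stored as a double count of fixed-length days, the unit conversions above reduce to multiplying or dividing by constants. A sketch in core Octave, using a plain double in place of a duration object (the variable names are illustrative):

```octave
% Sketch: unit conversions for a duration stored as fractional days.
d = 1.5;                       % 1.5 fixed-length 24-hour days

hrs  = d * 24;                 % fixed-length 60-minute hours
mins = d * 24 * 60;            % fixed-length 60-second minutes
secs = d * 24 * 60 * 60;
msec = secs * 1000;
yrs  = d / 365.2425;           % fixed-length 365.2425-day years
```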
-out =
eqn (A, B)
¶Determine element-wise equality, treating NaNs as equal -
-out = eqn (A, B) -
-eqn
is just like eq
(the function that implements the
-==
operator), except
-that it considers NaN and NaN-like values to be equal. This is the element-wise
-equivalent of isequaln
.
-
eqn
uses isnanny
to test for NaN and NaN-like values,
-which means that NaNs and NaTs are considered to be NaN-like, and
-string arrays’ “missing” and categorical objects’ “undefined” values
-are considered equal, because they are NaN-ish.
-
Developer’s note: the name “eqn
” is a little unfortunate,
-because “eqn” could also be an abbreviation for “equation”. But this
-name follows the isequaln
pattern of appending an “n” to the
-corresponding non-NaN-equivocating function.
-
See also: eq
, isequaln
, isnanny
-
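For plain doubles, the NaN-equal comparison described above can be sketched with isnan standing in for isnanny. This is an illustration only; the real function also handles NaT, missing strings, and undefined categoricals.

```octave
% Sketch: element-wise equality that treats NaN as equal to NaN.
A = [1 NaN 3 NaN];
B = [1 NaN 4 5];

plain = (A == B);                                  % NaN == NaN is false
nan_equal = (A == B) | (isnan (A) & isnan (B));    % NaNs count as equal
```

Here plain is true only for the first element, while nan_equal is also true for the second, where both inputs are NaN.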
out =
head (A)
¶out =
head (A, k)
¶Get first K rows of an array. -
-Returns the array A, subsetted to its first k rows. This means
-subsetting it to the first (min (k, size (A, 1)))
elements along
-dimension 1, and leaving all other dimensions unrestricted.
-
A is the array to subset. -
-k is the number of rows to get. k defaults to 8 if it is omitted -or empty. -
-If there are fewer than k rows in A, returns all rows. -
-Returns an array of the same type as A, unless ()-indexing A -produces an array of a different type, in which case it returns that type. -
-See also: tail -
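The subsetting rule above (first min (k, size (A, 1)) elements along dimension 1, all other dimensions unrestricted) can be sketched with generic ()-indexing. This is an assumption about one way to do it, not the package's code:

```octave
% Sketch: take the first k rows of an array of any dimensionality.
A = magic (10);
k = 8;

n = min (k, size (A, 1));               % never ask for more rows than exist
idx = repmat ({':'}, 1, ndims (A));     % ':' for every dimension
idx{1} = 1:n;                           % restrict only dimension 1
out = A(idx{:});
```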
-out =
hours (x)
¶Create a duration
x hours long, or get the hours in a duration
-x.
-
If input is numeric, returns a duration
array that is that many hours in
-time.
-
If input is a duration
, converts the duration
to a number of hours.
-
Returns an array the same size as x. -
This manual is for Tablicious, version 0.4.4-SNAPSHOT. -
- - -out =
iscalendarduration (x)
¶True if input is a calendarDuration
array, false otherwise.
-
Respects iscalendarduration
override methods on user-defined classes, even if
-they do not inherit from calendarDuration
or were not known to Tablicious at
-authoring time.
-
Returns a scalar logical. -
-out =
iscategorical (x)
¶True if input is a categorical
array, false otherwise.
-
Respects iscategorical
override methods on user-defined classes, even if
-they do not inherit from categorical
or were not known to Tablicious at
-authoring time.
-
Returns a scalar logical. -
-out =
isdatetime (x)
¶True if input is a datetime
array, false otherwise.
-
Respects isdatetime
override methods on user-defined classes, even if
-they do not inherit from datetime
or were not known to Tablicious at
-authoring time.
-
Returns a scalar logical. -
-out =
isduration (x)
¶True if input is a duration
array, false otherwise.
-
Respects isduration
override methods on user-defined classes, even if
-they do not inherit from duration
or were not known to Tablicious at
-authoring time.
-
Returns a scalar logical. -
-out =
isnanny (X)
¶Test if elements are NaN or NaN-like -
-Tests if input elements are NaN, NaT, or otherwise NaN-like. This is true
-if isnan()
or isnat()
returns true, and is false for types that do not support
-isnan()
or isnat()
.
-
This function only exists because: -
-isnanny()
smooths over those differences so you can call it polymorphically on
-any input type. Hopefully.
-
Under normal operation, isnanny()
should not throw an error for any type or
-value of input.
-
See also: ismissing, isnan
, isnat
, eqn, isequaln
-
out =
istable (x)
¶True if input is a table
array or other table-like type, false
-otherwise.
-
Respects istable
override methods on user-defined classes, even if
-they do not inherit from table
or were not known to Tablicious at
-authoring time.
-
User-defined classes should only override istable
to return true if
-they conform to the table
public interface. That interface is not
-well-defined or documented yet, so maybe you don’t want to do that yet.
-
Returns a scalar logical. -
-out =
istabular (x)
¶True if input is either a table
or timetable
array, or an object
-like them.
-
Respects istable
and istimetable
override methods on user-defined
-classes, even if they do not inherit from table
or were not known to Tablicious
-at authoring time.
-
Returns a scalar logical. -
-out =
istimetable (x)
¶True if input is a timetable
array or other timetable-like type, false
-otherwise.
-
Respects istimetable
override methods on user-defined classes, even if
-they do not inherit from table
or were not known to Tablicious at
-authoring time.
-
User-defined classes should only override istimetable
to return true if
-they conform to the table
public interface. That interface is not
-well-defined or documented yet, so maybe you don’t want to do that yet.
-
Returns a scalar logical. -
-Represents a complete day using the Gregorian calendar. -
-This class is useful for indexing daily-granularity data or representing -time periods that cover an entire day in local time somewhere. The major -purpose of this class is "type safety", to prevent time-of-day values -from sneaking in to data sets that should be daily only. As a secondary -benefit, this uses less memory than datetimes. -
-localdate
: double
dnums ¶The underlying datenum values that represent the days. The datenums are at -the midnight that is at the start of the day it represents. -
-These are doubles, but -they are restricted to be integer-valued, so they represent complete days, with -no time-of-day component. -
-localdate
: char
Format ¶The format to display this localdate
in. Currently unsupported.
-
out =
localdate.NaT ()
¶out =
localdate.NaT (sz)
¶“Not-a-Time”: Creates NaT-valued arrays. -
-Constructs a new localdate
array of all NaT
values of
-the given size. If no input sz is given, the result is a scalar NaT
.
-
NaT
is the datetime
equivalent of NaN
. It represents a missing
-or invalid value. NaT
values never compare equal to, greater than, or less
-than any value, including other NaT
s. Doing arithmetic with a NaT
and
-any other value results in a NaT
.
-
This static method is provided because the global NaT
function creates
-datetime
s, not localdate
s.
-
out =
datenum (obj)
¶Convert this to datenums that represent midnight on obj’s days. -
-Returns double array of same size as this. -
-out =
datestruct (obj)
¶Converts this to a “datestruct” broken-down time structure. -
-A “datestruct” is a format of struct that Tablicious came up with. It is a scalar
-struct with fields Year, Month, and Day, each containing
-a double array the same size as the date array it represents. This format
-differs from the “datestruct” used by datetime
in that it lacks
-Hour, Minute, and Second components. This is done for efficiency.
-
The values in the returned broken-down time are those of the local time -in obj’s defined time zone, if it has one. -
-Returns a struct with fields Year, Month, and Day. -Each field contains a double array of the same size as this. -
-out =
dispstrs (obj)
¶Get display strings for each element of obj. -
-Returns a cellstr the same size as obj. -
-out =
isnan (obj)
¶True if input elements are NaT. This is an alias for isnat
-to support type compatibility and polymorphic programming.
-
Returns logical array the same size as obj. -
-out =
isnat (obj)
¶True if input elements are NaT. -
-Returns logical array the same size as obj. -
-obj =
localdate ()
¶Constructs a new scalar localdate
containing the current local date.
-
out =
posixtime (obj)
¶Converts this to POSIX time values for midnight of obj’s days. -
-Converts this to POSIX time values that represent the same date. The -returned values will be doubles that will not include fractional second values. -The times returned are those of midnight UTC on obj’s days. -
-Returns double array of same size as this. -
-[y, m, d] =
ymd (obj)
¶Get the Year, Month, and Day components of obj. -
-Returns double arrays the same size as obj
.
-
out =
milliseconds (x)
¶Create a duration
x milliseconds long, or get the milliseconds in a duration
-x.
-
If input is numeric, returns a duration
array that is that many milliseconds in
-time.
-
If input is a duration
, converts the duration
to a number of milliseconds.
-
Returns an array the same size as x. -
out =
hours (x)
¶Create a duration
x hours long, or get the hours in a duration
-x.
-
Generic auto-converting missing value. -
-missing
is a generic missing value that auto-converts to other
-types.
-
A missing
array indicates a missing value, of no particular type. It auto-
-converts to other types when it is combined with them via concatenation or
-other array combination operations.
-
This class is currently EXPERIMENTAL. Use at your own risk. -
-Note: This class does not actually work for assignment. If you do this: -
-x = 1:5 - x(3) = missing -
It’s supposed to work, but I can’t figure out how to do this in a normal -classdef object, because there doesn’t seem to be any function that’s implicitly -called for type conversion in that assignment. Darn it. -
-out =
dispstrs (obj)
¶Display strings. -
-Gets display strings for each element in obj. -
-For missing
, the display strings are always '<missing>'
.
-
Returns a cellstr the same size as obj. -
-out =
ismissing (obj)
¶Test whether elements are missing values. -
-ismissing
is always true for missing
arrays.
-
Returns a logical array the same size as obj. -
-out =
isnan (obj)
¶Test whether elements are NaN. -
-isnan
is always true for missing
arrays.
-
Returns a logical array the same size as obj. -
-out =
isnanny (obj)
¶Test whether elements are NaN-like. -
-isnanny
is always true for missing
arrays.
-
Returns a logical array the same size as obj. -
-obj =
missing ()
¶Constructs a scalar missing
array.
-
The constructor takes no arguments, since there’s only one
-missing
value.
-
(X)
¶(A, B, C, …)
¶('A'
, 'B'
, 'C'
, …)
¶A
B
C
…
¶Alias for prettyprint, for interactive use. -
-This is an alias for prettyprint(), with additional name-conversion magic. -
-If you pass in a char, instead of pretty-printing that directly, it will -grab and pretty-print the variable of that name from the caller’s workspace. -This is so you can conveniently run it from the command line. -
-[out1, out2, …, outN] =
scalarexpand (x1, x2, …, xN)
¶Expand scalar inputs to match size of non-scalar inputs. -
-Expands each scalar input argument to match the size of the non-scalar
-input arguments, and returns the expanded values in the corresponding
-output arguments. repmat
is used to do the expansion.
-
Works on any input types that support size
, isscalar
, and
-repmat
.
-
It is an error if any of the non-scalar inputs are not the same size as -all of the other non-scalar inputs. -
-Returns as many output arguments as there were input arguments. -
-Examples: -
-x1 = rand(3); -x2 = 42; -x3 = magic(3); -[x1, x2, x3] = scalarexpand (x1, x2, x3) -
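The expansion logic described above can be sketched in core Octave using only size, isscalar, and repmat. The cell array args and the other names are illustrative; this is not the package's implementation.

```octave
% Sketch: expand scalar inputs to the common size of the non-scalar inputs.
args = {rand(3), 42, magic(3)};

is_scalar = cellfun (@isscalar, args);
nonscalar = args(~is_scalar);

sz = [1 1];                              % all-scalar case: nothing to expand
if ~isempty (nonscalar)
  sz = size (nonscalar{1});
  for i = 2:numel (nonscalar)
    if ~isequal (size (nonscalar{i}), sz)
      error ('non-scalar inputs must all be the same size');
    end
  end
end

out = args;
for i = find (is_scalar)
  out{i} = repmat (args{i}, sz);         % scalar becomes a full-size array
end
```

After this runs, out{2} is a 3-by-3 array filled with 42, and the other two elements are unchanged.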
out =
seconds (x)
¶Create a duration
x seconds long, or get the seconds in a duration
-x.
-
If input is numeric, returns a duration
array that is that many seconds in
-time.
-
If input is a duration
, converts the duration
to a number of seconds.
-
Returns an array the same size as x. -
out =
size2str (sz)
¶Format an array size for display. -
-Formats the given array size sz as a string for human-readable -display. It will be in the format “d1-by-d2-...-by-dN”, for the N -dimensions represented by sz. -
-sz is an array of dimension sizes, in the format returned by
-the size
function.
-
Returns a charvec. -
-Examples: -
str = size2str (size (magic (4))) - ⇒ str = 4-by-4 -
out =
splitapply (func, X, G)
¶out =
splitapply (func, X1, …, XN, G)
¶[Y1, …, YM] =
splitapply (…)
¶Split data into groups and apply function. -
-func is a function handle to call on each group of inputs in turn. -
-X, X1, …, XN are the input variables that are split into
-groups for the function calls. If X is a table
, then its contained
-variables are “popped out” and considered to be the X1 … XN
-input variables.
-
G is the grouping variable vector. It contains a list of integers that -identify which group each element of the X input variables belongs to. -NaNs in G mean that element is ignored. -
-Vertically concatenates the function outputs for each of the groups and returns them in -as many variables as you capture. -
-Returns the concatenated outputs of applying func to each group. -
-See also: table.groupby, table.splitapply -
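The grouping semantics described above can be sketched in a few lines. This is a minimal, language-neutral illustration written in Python, not Tablicious code: the helper name `split_apply` is hypothetical, and returning results in ascending group-number order is an assumption of this sketch.

```python
import math

def split_apply(func, x, g):
    """Apply func to the elements of x that share a group number in g."""
    groups = {}
    for value, gid in zip(x, g):
        # A NaN group number means the element is ignored.
        if isinstance(gid, float) and math.isnan(gid):
            continue
        groups.setdefault(int(gid), []).append(value)
    # One result per group, ordered by group number (assumed here).
    return [func(groups[k]) for k in sorted(groups)]

totals = split_apply(sum, [1, 2, 3, 4, 5], [1, 2, 1, float("nan"), 2])
print(totals)
```

The fourth element has a NaN group number, so it contributes to neither group.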
-A string array of Unicode strings. -
-A string array is an array of strings, where each array element is a single -string. -
-The string class represents strings, where: -
This should correspond pretty well to what people think of as strings, and -is pretty compatible with people’s typical notion of strings in Octave. -
-String arrays also have a special “missing” value, that is like the string -equivalent of NaN for doubles or “undefined” for categoricals, or SQL NULL. -
-This is a slightly higher-level and more strongly-typed way of representing -strings than cellstrs are. (A cellstr array is of type cell, not a text- -specific type, and allows assignment of non-string data into it.) -
-Be aware that while string arrays interconvert with Octave chars and cellstrs, -Octave char elements represent 8-bit UTF-8 code units, not Unicode code points. -
-This class really serves three roles: -
-It is not clear whether it’s a good fit to have the Unicode support wrapped -up in this. Maybe it should just be a simple object -wrapper, and defer Unicode semantics to when core Octave adopts them for -char and cellstr. On the other hand, because Octave chars are UTF-8, not UCS-2, -some methods like strlength() and reverse() are just going to be wrong if -they delegate straight to chars. -
-“Missing” string values work like NaNs. They are never considered equal to, -less than, or greater than any other string, including other missing strings. -This applies to set membership and other equivalence tests. -
-TODO: Need to decide how far to go with Unicode semantics, and how much to -just make this an object wrapper over cellstr and defer to Octave’s existing -char/string-handling functions. -
-TODO: demote_strings should probably be static or global, so that other -functions can use it to hack themselves into being string-aware. -
-out =
cell (obj)
¶Convert to cell array. -
-Converts this to a cell, which will be a cellstr. Missing values are
-converted to ''
.
-
This method returns the same values as cellstr(obj)
; it is just provided
-for interface compatibility purposes.
-
Returns a cell array of the same size as obj. -
-out =
cellstr (obj)
¶Convert to cellstr. -
-Converts obj to a cellstr. Missing values are converted to ''
.
-
Returns a cellstr array of the same size as obj. -
-out =
char (obj)
¶Convert to char array. -
-Converts obj to a 2-D char array. It will have as many rows -as obj has elements. -
-It is an error to convert missing-valued string
arrays to
-char. (NOTE: This may change in the future; it may be more appropriate
-to convert them to space-padded empty strings.)
-
Returns 2-D char array. -
-[out, outA, outB] =
cmp (A, B)
¶Value ordering comparison, returning -1/0/+1. -
-Compares each element of A and B, returning for
-each element i
whether A(i)
was less than (-1),
-equal to (0), or greater than (1) the corresponding B(i)
.
-
TODO: What to do about missing values? Should missings sort to the end -(preserving total ordering over the full domain), or should their comparisons -result in a fourth "null"/"undef" return value, probably represented by NaN? -FIXME: The current implementation does not handle missings. -
-Returns a numeric array out of the same size as the scalar expansion -of A and B. Each value in it will be -1, 0, or 1. -
-Also returns scalar-expanded copies of A and B as outA and -outB, as a programming convenience. -
-out =
string.decode (bytes, charsetName)
¶Decode encoded text from bytes. -
-Decodes the given encoded text in bytes according to the specified -encoding, given by charsetName. -
-Returns a scalar string. -
-See also: string.encode -
-out =
dispstrs (obj)
¶Display strings for array elements. -
-Gets display strings for all the elements in obj. These display strings
-will either be the string contents of the element, enclosed in "..."
,
-and with CR/LF characters replaced with '\r'
and '\n'
escape sequences,
-or "<missing>"
for missing values.
-
Returns a cellstr of the same size as obj. -
-out =
empty (sz)
¶Get an empty string array of a specified size. -
-The argument sz is optional. If supplied, it is a numeric size -array whose product must be zero. If omitted, it defaults to [0 0]. -
-The size may also be supplied as multiple arguments containing -scalar numerics. -
-Returns an empty string array of the requested size. -
-out =
encode (obj, charsetName)
¶Encode string in a given character encoding. -
-obj must be scalar. -
-charsetName (charvec) is the name of a character encoding. -(TODO: Document what determines the set of valid encoding names.) -
-Returns the encoded string as a uint8
vector.
-
See also: string.decode. -
-out =
erase (obj, match)
¶Erase matching substring. -
-Erases the substrings in obj which match the match input. -
-Returns a string array of the same size as obj. -
-out =
ismissing (obj)
¶Test whether array elements are missing. -
-For string
arrays, only the special “missing” value is
-considered missing. Empty strings are not considered missing,
-the way they are with cellstrs.
-
Returns a logical array the same size as obj
.
-
out =
isnanny (obj)
¶Test whether array elements are NaN-like. -
-Missing values are considered nannish; any other string value is not. -
-Returns a logical array of the same size as obj. -
-out =
isstring (obj)
¶Test if input is a string array. -
-isstring
is always true for string
inputs.
-
Returns a scalar logical. -
-out =
lower (obj)
¶Convert to lower case. -
-Converts all the characters in all the strings in obj to lower case. -
-This currently delegates to Octave’s own lower()
function to
-do the conversion, so whatever character class handling it has, this
-has.
-
Returns a string array of the same size as obj. -
-out =
string.missing (sz)
¶Missing string value. -
-Creates a string array of all-missing values of the specified size sz. -If sz is omitted, creates a scalar missing string. -
-Returns a string array of size sz or [1 1]. -
-See also: NaS -
-out =
plus (a, b)
¶String concatenation via plus operator. -
-Concatenates the two input arrays, string-wise. Inputs that are -not string arrays are converted to string arrays. -
-The concatenation is done by calling ‘strcat‘ on the inputs, and has the -same behavior. -
-Returns a string array the same size as the scalar expansion of its -inputs. -
-See also: string.strcat -
-out =
reverse (obj)
¶Reverse string, character-wise. -
-Reverses the characters in each string in obj. This operates on -Unicode characters (code points), not on bytes, so it is guaranteed -to produce valid UTF-8 as its output. -
-Returns a string array the same size as obj. -
-out =
reverse_bytes (obj)
¶Reverse string, byte-wise. -
-Reverses the bytes in each string in obj. This operates on bytes -(Unicode code units), not characters. -
-This may well produce invalid strings as a result, because reversing a -UTF-8 byte sequence does not necessarily produce another valid UTF-8 -byte sequence. -
-You probably do not want to use this method. You probably want to use
-string.reverse
instead.
-
Returns a string array the same size as obj. -
-See also: string.reverse -
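To show why byte-wise reversal can produce invalid strings while code-point reversal cannot, here is a small sketch in Python (chosen only as a neutral illustration language; none of these helper names are part of Tablicious).

```python
def reverse_code_points(s: str) -> str:
    # Reverses Unicode code points; the result always encodes to valid UTF-8.
    return s[::-1]

def reverse_utf8_bytes(s: str) -> bytes:
    # Reverses the raw UTF-8 bytes; multi-byte sequences end up back to front.
    return s.encode("utf-8")[::-1]

def is_valid_utf8(data: bytes) -> bool:
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

text = "caf\u00e9"  # the last character takes two bytes in UTF-8
print(is_valid_utf8(reverse_code_points(text).encode("utf-8")))
print(is_valid_utf8(reverse_utf8_bytes(text)))
```

For pure-ASCII input the two approaches agree, which is why the problem is easy to miss in testing.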
-out =
strcat (varargin)
¶String concatenation. -
-Concatenates the corresponding elements of all the input arrays, -string-wise. Inputs that are not string arrays are converted to -string arrays. -
-The semantics of concatenating missing strings with non-missing -strings has not been determined yet. -
-Returns a string array the same size as the scalar expansion of its -inputs. -
-out =
strcmp (A, B)
¶String comparison. -
-Tests whether each element in A is exactly equal to the corresponding -element in B. Missing values are not considered equal to each other. -
-This does the same comparison as A == B
, but is not polymorphic.
-Generally, there is no reason to use strcmp
instead of ==
-or eq
on string arrays, unless you want to be compatible with
-cellstr inputs as well.
-
Returns logical array the size of the scalar expansion of A and B. -
-out =
strfind (obj, pattern)
¶out =
strfind (…, varargin)
¶Find pattern in string. -
-Finds the locations where pattern occurs in the strings of obj. -
-TODO: It’s ambiguous whether a scalar obj should result in a numeric -out or a cell array out. -
-Returns either an index vector, or a cell array of index vectors. -
-obj =
string ()
¶obj =
string (in)
¶Construct a new string array. -
-The zero-argument constructor creates a new scalar string array -whose value is the empty string. -
-The other constructors construct a new string array by converting -various types of inputs. -
-out =
strlength (obj)
¶String length in characters (actually, UTF-16 code units). -
-Gets the length of each string, counted in UTF-16 code units. In most -cases, this is the same as the number of characters. The exception is for -characters outside the Unicode Basic Multilingual Plane, which are -represented with UTF-16 surrogate pairs, and thus will count as 2 characters -each. -
-The reason this method counts UTF-16 code units, instead of Unicode code -points (true characters), is for Matlab compatibility. -
-This is the string length method you probably want to use,
-not strlength_bytes
.
-
Returns double array of the same size as obj. Returns NaNs for missing -strings. -
-See also: string.strlength_bytes -
-out =
strlength_bytes (obj)
¶String length in bytes. -
-Gets the length of each string in obj, counted in Unicode UTF-8
-code units (bytes). This is the same as numel(str)
for the corresponding
-Octave char vector for each string, but may not be what you
-actually want to use. You may want strlength
instead.
-
Returns double array of the same size as obj. Returns NaNs for missing -strings. -
-See also: string.strlength -
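The difference between the two length measures, and between both of them and a count of code points, can be checked with a short sketch. This is Python used purely for illustration; the helper names are hypothetical and are not Tablicious functions.

```python
def utf16_code_units(s: str) -> int:
    # Each UTF-16 code unit is 2 bytes; characters outside the BMP use two units.
    return len(s.encode("utf-16-le")) // 2

def utf8_bytes(s: str) -> int:
    return len(s.encode("utf-8"))

samples = ["abc", "na\u00efve", "\U0001D11E"]  # ASCII, Latin-1 letter, non-BMP symbol
for s in samples:
    # Columns: code points, UTF-16 code units, UTF-8 bytes
    print(len(s), utf16_code_units(s), utf8_bytes(s))
```

Only the last sample lies outside the Basic Multilingual Plane, so it is the only one where the UTF-16 count differs from the number of code points.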
-out =
strrep (obj, match, replacement)
¶out =
strrep (…, varargin)
¶Replace occurrences of pattern with other string. -
-Replaces matching substrings in obj with a given replacement string. -
-varargin is passed along to the core Octave strrep
function. This
-supports whatever options it does.
-TODO: Maybe document what those options are.
-
Returns a string array of the same size as obj. -
-out =
upper (obj)
¶Convert to upper case. -
-Converts all the characters in all the strings in obj to upper case. -
-This currently delegates to Octave’s own upper()
function to
-do the conversion, so whatever character class handling it has, this
-has.
-
Returns a string array of the same size as obj. -
-Tabular data array containing multiple columnar variables. -
-A table
is a tabular data structure that collects multiple parallel
-named variables.
-Each variable is treated like a column. (Possibly a multi-columned column, if
-that makes sense.)
-The types of variables may be heterogeneous.
-
A table object is like an SQL table or resultset, or a relation, or a -DataFrame in R or Pandas. -
-A table is an array in itself: its size is nrows-by-nvariables, -and you can index along the rows and variables by indexing into the table -along dimensions 1 and 2. -
-A note on accessing properties of a table
array: Because .-indexing is
-used to access the variables inside the array, it can’t also be directly used
-to access properties as well. Instead, do t.Properties.<property>
for
-a table t
. That will give you a property instead of a variable.
-(And due to this mechanism, it will cause problems if you have a table
-with a variable named Properties
. Try to avoid that.)
-
WARNING ABOUT HANDLE CLASSES IN TABLE VARIABLES -
-Using a handle class in a table variable (column) value may lead to unpredictable -and buggy behavior! A handle class array is a reference type, and it holds shared -mutable state, which may be shared with references to it in other table arrays or -outside the table array. The table class makes no guarantees about what it will -or will not do internally with arrays that are held in table variables, and any -operation on a table holding handle arrays may have unpredictable and undesirable -side effects. These side effects may change between versions of Tablicious. -
-We currently recommend that you do not use handle classes in table variables. It -may be okay to use handle classes *inside* cells or other non-handle composite types -that are used in table variables, but this hasn’t been fully thought through or -tested. -
-See also: tblish.table.grpstats, tblish.evalWithTableVars, tblish.examples.SpDb -
-table
: cellstr
VariableNames ¶The names of the variables in the table, as a cellstr row vector. -
-table
: cell
VariableValues ¶A cell vector containing the values for each of the variables.
-VariableValues(i)
corresponds to VariableNames(i)
.
-
table
: cellstr
RowNames ¶An optional list of row names that identify each row in the table. This -is a cellstr column vector, if present. -
-table
: cellstr
DimensionNames ¶Names for the two dimensions of the table array, as a cellstr row vector. Always
-exactly 2-long, because tables are always exactly 2-D. Defaults to
-{"Row", "Variables"}
. (I feel the singular "Row" and plural "Variables" here
-are inconsistent, but that’s what Matlab uses, so Tablicious uses it too, for
-Matlab compatibility.)
-
[outA, ixA, outB, ixB] =
antijoin (A, B)
¶Natural antijoin (AKA “semidifference”). -
-Computes the anti-join of A and B. The anti-join is defined as all the -rows from one input which do not have matching rows in the other input. -
-Returns: - outA - all the rows in A with no matching row in B - ixA - the row indexes into A which produced outA - outB - all the rows in B with no matching row in A - ixB - the row indexes into B which produced outB -
-This is a Tablicious/Octave extension, not defined in the Matlab table interface. -
-[out, ixs] =
cartesian (A, B)
¶Cartesian product of two tables. -
-Computes the Cartesian product of two tables. The Cartesian product is -each row in A combined with each row in B. -
-Due to the definition and structural constraints of table, the two inputs -must have no variable names in common. It is an error if they do. -
-The Cartesian product is seldom used in practice. If you find yourself -calling this method, you should step back and re-evaluate what you are -doing, asking yourself if that is really what you want to happen. If nothing -else, writing a function that calls cartesian() is usually much less -efficient than alternate ways of arriving at the same result. -
-This implementation does not remove duplicate values. -TODO: Determine whether this duplicate-retaining behavior is correct. -
-The ordering of the rows in the output is not specified, and may be implementation- -dependent. TODO: Determine if we can lock this behavior down to a fixed, -defined ordering, without killing performance. -
-This is a Tablicious/Octave extension, not defined in the Matlab table interface. -
-out =
convertvars (obj, vars, dataType)
¶Convert variables to specified data type. -
-Converts the variables in obj specified by vars to the specified data type. -
-vars is a cellstr or numeric vector specifying which variables to convert. -
-dataType specifies the data type to convert those variables to. It is either -a char holding the name of the data type, or a function handle which will -perform the conversion. If it is the name of the data type, there must -either be a one-arg constructor of that type which accepts the specified -variables’ current types as input, or a conversion method of that name -defined on the specified variables’ current type. -
-Returns a table with the same variable names as obj, but with converted -types. -
-[G, TID] =
findgroups (obj)
¶Find groups within a table’s row values. -
-Finds groups within a table’s row values and gets group numbers. A group -is a set of rows that have the same values in all their variable elements. -
-Returns: - G - A double column vector of group numbers created from obj. - TID - A table containing the row values corresponding to the group numbers. -
-[out, name]
= getvar (obj, varRef)
¶Get value and name for single table variable. -
-varRef is a variable reference. It may be a name or an index. It -may only specify a single table variable. -
-Returns: - out – the value of the referenced table variable - name – the name of the referenced table variable -
-[out1, …]
= getvars (obj, varRef)
¶Get values for one or more table variables. -
-varRef is a variable reference in the form of variable names or -indexes. -
-Returns as many outputs as there are variables referenced by varRef. Each output -contains the contents of the corresponding table variable. -
-[out] =
groupby (obj, groupvars, aggcalcs)
¶Find groups in table data and apply functions to variables within groups. -
-This works like an SQL "SELECT ... GROUP BY ..."
statement.
-
groupvars (cellstr, numeric) is a list of the grouping variables, -identified by name or index. -
-aggcalcs is a specification of the aggregate calculations to perform
-on them, in the form {
out_var,
fcn,
in_vars; ...}
, where:
- out_var (char) is the name of the output variable
- fcn (function handle) is the function to apply to produce it
- in_vars (cellstr) is a list of the input variables to pass to fcn
-
Returns a table. -
-This is a Tablicious/Octave extension, not defined in the Matlab table interface. -
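The shape of the aggcalcs specification may be easier to see in a small model. The following Python sketch mimics the described behavior on a list of row dictionaries; it is not Tablicious code, the sample data is made up, and the function name `groupby` here refers only to this illustrative helper.

```python
def groupby(rows, groupvars, aggcalcs):
    """rows: list of dicts; aggcalcs: list of (out_var, fcn, in_vars) triples."""
    groups = {}
    for row in rows:
        key = tuple(row[g] for g in groupvars)
        groups.setdefault(key, []).append(row)
    out = []
    for key, members in groups.items():
        result = dict(zip(groupvars, key))
        for out_var, fcn, in_vars in aggcalcs:
            # Each input variable is passed to fcn as one column of values.
            columns = [[m[v] for m in members] for v in in_vars]
            result[out_var] = fcn(*columns)
        out.append(result)
    return out

parts = [
    {"Color": "Red", "Weight": 12},
    {"Color": "Blue", "Weight": 17},
    {"Color": "Red", "Weight": 14},
]
summary = groupby(parts, ["Color"],
                  [("TotalWeight", sum, ["Weight"]), ("N", len, ["Weight"])])
print(summary)
```

Each triple plays the role of one `{out_var, fcn, in_vars}` row of aggcalcs, much like one aggregate expression in an SQL `SELECT ... GROUP BY` list.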
-out =
height (obj)
¶Number of rows in table. -
-For a zero-variable table, this currently always returns 0. This is a bug, -and will change in the future. It should be possible for zero-variable table -arrays to have any number of rows. -
-out =
horzcat (varargin)
¶Horizontal concatenation. -
-Combines tables by horizontally concatenating them. -Inputs that are not tables are automatically converted to tables by calling -table() on them. Inputs must have all distinct variable names. -
-Output has the same RowNames as varargin{1}
. The variable names and values
-are the result of the concatenation of the variable names and values lists
-from the inputs.
-
[out, ixa, ixb] =
innerjoin (A, B)
¶[…] =
innerjoin (A, B, …)
¶Combine two tables by rows using key variables. -
-Computes the relational inner join between two tables. “Inner” means that -only rows which had matching rows in the other input are kept in the -output. -
-TODO: Document options. -
-Returns: - out - A table that is the result of joining A and B - ixa - Indexes into A for each row in out - ixb - Indexes into B for each row in out -
-[C, ia, ib] =
intersect (A, B)
¶Set intersection. -
-Computes the intersection of two tables. The intersection is defined to be the unique -row values which are present in both of the two input tables. -
-Returns: - C - A table containing all the unique row values present in both A and B. - ia - Row indexes into A of the rows from A included in C. - ib - Row indexes into B of the rows from B included in C. -
-out =
isempty (obj)
¶Test whether array is empty. -
-For tables, isempty
is true if the number of rows is 0 or the number
-of variables is 0.
-
[tf, loc] =
ismember (A, B)
¶Set membership. -
-Finds rows in A that are members of B. -
-Returns: - tf - A logical vector indicating whether each A(i,:) was present in B. - loc - Indexes into B of rows that were found. -
-out =
ismissing (obj)
¶out =
ismissing (obj, indicator)
¶Find missing values. -
-Finds missing values in obj’s variables. -
-If indicator is not supplied, uses the standard missing values for each -variable’s data type. If indicator is supplied, the same indicator list is -applied across all variables. -
-All variables in obj must be vectors. (This is due to the requirement
-that size(out) == size(obj)
.)
-
Returns a logical array the same size as obj. -
-[C, ib] =
join (A, B)
¶[C, ib] =
join (A, B, …)
¶Combine two tables by rows using key variables, in a restricted form. -
-This is not a "real" relational join operation. It has the restrictions -that: - 1) The key values in B must be unique. - 2) Every key value in A must map to a key value in B. -These are restrictions inherited from the Matlab definition of table.join. -
-You probably don’t want to use this method. You probably want to use -innerjoin or outerjoin instead. -
-See also: table.innerjoin, table.outerjoin -
-out =
movevars (obj, vars, relLocation, location)
¶Move around variables in a table. -
-vars is a list of variables to move, specified by name or index. -
-relLocation is 'Before'
or 'After'
.
-
location indicates a single variable to use as the target location, -specified by name or index. If it is specified by index, it is the index -into the list of *unmoved* variables from obj, not the original full -list of variables in obj. -
-Returns a table with the same variables as obj, but in a different order. -
-out =
ndims (obj)
¶Number of dimensions -
-For tables, ndims(obj)
is always 2, because table arrays are always
-2-D (rows-by-columns).
-
out =
numel (obj)
¶Total number of elements in table (actually 1). -
-For compatibility reasons with Octave’s OOP interface and subsasgn behavior, -table’s numel is defined to always return 1. It is not useful for client -code to query a table’s size using numel. This is an incompatibility with -Matlab. -
-out =
outerfillvals (obj)
¶Get fill values for outer join. -
-Returns a table with the same variables as this, but containing only -a single row whose variable values are the values to use as fill values -when doing an outer join. -
-[out, ixa, ixb] =
outerjoin (A, B)
¶[…] =
outerjoin (A, B, …)
¶Combine two tables by rows using key variables, retaining unmatched rows. -
-Computes the relational outer join of tables A and B. This is like a -regular join, but also includes rows in each input which did not have -matching rows in the other input; the columns from the missing side are -filled in with placeholder values. -
-TODO: Document options. -
-Returns: - out - A table that is the result of the outer join of A and B - ixa - indexes into A for each row in out - ixb - indexes into B for each row in out -
-(obj)
¶Display table’s values in tabular format. This prints the contents -of the table in human-readable, tabular form. -
-Variables which contain objects are displayed using the strings
-returned by their dispstrs
method, if they define one.
-
[out, ixs] =
realjoin (A, B)
¶[…] =
realjoin (A, B, …)
¶"Real" relational inner join, without key restrictions -
-Performs a "real" relational natural inner join between two tables, -without the key restrictions that JOIN imposes. -
-Currently does not support tables which have RowNames. This may be -added in the future. -
-This is a Tablicious/Octave extension, not defined in the Matlab table interface. -
-Name/value option arguments are: Keys, LeftKeys, RightKeys, -LeftVariables, RightVariables. -
-FIXME: Document those options. -
-Returns: - out - A table that is the result of joining A and B - ixs - Indexes into A for each row in out -
-out =
removevars (obj, vars)
¶Remove variables from table. -
-Deletes the variables specified by vars from obj. -
-vars may be a char, cellstr, numeric index vector, or logical -index vector. -
-out =
renamevars (obj, renameMap)
¶Rename variables in a table. -
-Renames selected variables in the table obj based on the mapping -provided in renameMap. -
-renameMap is an n-by-2 cellstr array, with the old variable names -in the first column, and the corresponding new variable names in the -second column. -
-Variables which are not included in renameMap are not modified. -
-It is an error if any variables named in the first column of renameMap -are not present in obj. -
-Renames -
out =
repmat (obj, sz)
¶Replicate matrix. -
-Repmats a table by repmatting each of its variables vertically. -
-For tables, repmatting is only supported along dimension 1. That is, the -values of sz(2:end) must all be exactly 1. This behavior may change in the -future to support repmatting horizontally, with the added variable names being -automatically changed to maintain uniqueness of variable names within the -resulting table. -
-Returns a new table with the same variable names and types as obj, but -with a possibly different row count. -
-out =
restrict (obj, expr)
¶out =
restrict (obj, ix)
¶Subset rows using variable expression or index. -
-Subsets a table row-wise, using either an index vector or an expression -involving obj’s variables. -
-If the argument is a numeric or logical vector, it is interpreted as an -index into the rows of this. (Just as with ‘subsetrows (this, index)‘.) -
-If the argument is a char, then it is evaluated as an M-code expression,
-with all of this’ variables available as workspace variables, as with
-tblish.evalWithTableVars
. The output of expr must be a numeric or logical index
-vector. (This form is a shorthand for
-out = subsetrows (this, tblish.evalWithTableVars (this, expr))
.)
-
-TODO: Decide whether to rename this to "where" to be more like SQL instead -of relational algebra. -
-Examples: -
[s,p,sp] = tblish.examples.SpDb; -prettyprint (restrict (p, 'Weight >= 14 & strcmp(Color, "Red")')) -
This is a Tablicious/Octave extension, not defined in the Matlab table interface. -
-See also: tblish.evalWithTableVars -
-out =
rowfun (func, obj)
¶out =
rowfun (…, 'OptionName'
, OptionValue, …)
¶Apply function to rows in table and collect outputs. -
-This applies the function func to the elements of each row of -obj’s variables, and collects the concatenated output(s) into the -variable(s) of a new table. -
-func is a function handle. It should take as many inputs as there
-are variables in obj. Or, it can take a single input, and you must
-specify 'SeparateInputs', false
to have the input variables
-concatenated before being passed to func. It may return multiple
-argouts, but to capture those past the first one, you must explicitly
-specify the 'NumOutputs'
or 'OutputVariableNames'
options.
-
Supported name/value options: -
'OutputVariableNames'
Names of table variables to store combined function output arguments in. -
'NumOutputs'
Number of output arguments to call function with. If omitted, defaults to -number of items in OutputVariableNames if it is supplied, otherwise -defaults to 1. -
'SeparateInputs'
If true, input variables are passed as separate input arguments to func. -If false, they are concatenated together into a row vector and passed as -a single argument. Defaults to true. -
'ErrorHandler'
A function to call as a fallback when calling func results in an error. -It is passed the caught exception, along with the original inputs passed -to func, and it has a “second chance” to compute replacement values -for that row. This is useful for converting raised errors to missing-value -fill values, or logging warnings. -
'ExtractCellContents'
Whether to “pop out” the contents of the elements of cell variables in -obj, or to leave them as cells. True/false; default is false. If -you specify this option, then obj may not have any multi-column -cell-valued variables. -
'InputVariables'
If specified, only these variables from obj are used as the function -inputs, instead of using all variables. -
'GroupingVariables'
Not yet implemented. -
'OutputFormat'
The format of the output. May be 'table'
(the default),
-'uniform'
, or 'cell'
. If it is 'uniform'
or 'cell'
,
-the output variables are returned in multiple output arguments from
-'rowfun'
.
-
Returns a table
whose variables are the collected output arguments
-of func if OutputFormat is 'table'
. Otherwise, returns
-multiple output arguments of whatever type func returned (if
-OutputFormat is 'uniform'
) or cells (if OutputFormat
-is 'cell'
).
-
out =
rows2vars (obj)
¶out =
rows2vars (obj, 'VariableNamesSource'
, VariableNamesSource)
¶out =
rows2vars (…, 'DataVariables'
, DataVariables)
¶Reorient table, swapping rows and variables dimensions. -
-This flips the dimensions of the given table obj, swapping the -orientation of the contained data, and swapping the row names/labels -and variable names. -
-The variable names become a new variable named “OriginalVariableNames”. -
-The new variable names are drawn from the column VariableNamesSource if it -is specified. Otherwise, if obj has row names, they are used. -Otherwise, new variable names in the form “VarN” are generated. -
-If all the variables in obj are of the same type, they are concatenated -and then sliced to create the new variable values. Otherwise, they are -converted to cells, and the new table has cell variable values. -
-[outA, ixA, outB, ixB] =
semijoin (A, B)
¶Natural semijoin. -
-Computes the natural semijoin of tables A and B. The semi-join of tables -A and B is the set of all rows in A which have matching rows in B, based -on comparing the values of variables with the same names. -
-This method also computes the semijoin of B and A, for convenience. -
-Returns: - outA - all the rows in A with matching row(s) in B - ixA - the row indexes into A which produced outA - outB - all the rows in B with matching row(s) in A - ixB - the row indexes into B which produced outB -
-This is a Tablicious/Octave extension, not defined in the Matlab table interface. -
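The relationship between the natural semijoin and the antijoin can be sketched on plain row dictionaries. This is an illustrative Python model under stated assumptions, not Tablicious code: rows are matched on the variable names the two inputs share, the sample data is made up, and the returned indexes are 0-based (Octave indexes are 1-based).

```python
def _common_names(a, b):
    # Variable names present in both inputs (assumes all rows share one schema).
    if not a or not b:
        return []
    return [name for name in a[0] if name in b[0]]

def _matches(a, b):
    names = _common_names(a, b)
    b_keys = {tuple(row[n] for n in names) for row in b}
    return [tuple(row[n] for n in names) in b_keys for row in a]

def semijoin(a, b):
    ix = [i for i, hit in enumerate(_matches(a, b)) if hit]
    return [a[i] for i in ix], ix

def antijoin(a, b):
    ix = [i for i, hit in enumerate(_matches(a, b)) if not hit]
    return [a[i] for i in ix], ix

suppliers = [
    {"SNum": "S1", "City": "London"},
    {"SNum": "S2", "City": "Paris"},
    {"SNum": "S3", "City": "Paris"},
]
shipments = [{"SNum": "S1", "Qty": 300}, {"SNum": "S3", "Qty": 200}]

print(semijoin(suppliers, shipments))
print(antijoin(suppliers, shipments))
```

Every row of the first input lands in exactly one of the two results, so the semijoin and antijoin indexes together partition its rows.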
-out =
setDimensionNames (obj, names)
¶out =
setDimensionNames (obj, ix, names)
¶Set dimension names. -
-Sets the DimensionNames
for this table to a new list of names.
-
names is a char or cellstr vector. It must have the same number of elements -as the number of dimension names being assigned. -
-ix is an index vector indicating which dimension names to set. If -omitted, it sets both of them. Since there are always two dimensions, -the indexes in ix may never be higher than 2. -
-This method exists because the obj.Properties.DimensionNames = …
-assignment form did not originally work, possibly due to an Octave bug, or more
-likely due to a bug in Tablicious prior to the early 0.4.x versions. That was
-fixed around 0.4.4. This method may be deprecated and removed at some point, since
-it is not part of the standard Matlab table interface, and is now redundant with
-the obj.Properties.DimensionNames = …
assignment form.
-
out =
setRowNames (obj, names)
¶Set row names. -
-Sets the row names on obj to names. -
-names is a cellstr column vector, with the same number of rows as -obj has. -
-out =
setVariableNames (obj, names)
¶out =
setVariableNames (obj, ix, names)
¶Set variable names. -
-Sets the VariableNames
for this table to a new list of names.
-
names is a char or cellstr vector. It must have the same number of elements -as the number of variable names being assigned. -
-ix is an index vector indicating which variable names to set. If -omitted, it sets all of them present in obj. -
-This method exists because the obj.Properties.VariableNames = …
-assignment form does not work, possibly due to an Octave bug.
-
[C, ia] =
setdiff (A, B)
¶Set difference. -
-Computes the set difference of two tables. The set difference is defined to be -the unique row values which are present in table A that are not in table B. -
-Returns: - C - A table containing the unique row values in A that were not in B. - ia - Row indexes into A of the rows from A included in C. -
-out =
setvar (obj, varRef, value)
¶Set value for a variable in table. -
-This sets (adds or replaces) the value for a variable in obj. It -may be used to change the value of an existing variable, or add a new -variable. -
-This method exists primarily because I cannot get obj.foo = value
to work,
-apparently due to an issue with Octave’s subsasgn support.
-
varRef is a variable reference, either the index or name of a variable. -If you are adding a new variable, it must be a name, and not an index. -
-value is the value to set the variable to. If it is scalar or -a single string as charvec, it is scalar-expanded to match the number -of rows in obj. -
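The add-or-replace behavior and the scalar expansion described above can be sketched as follows. This is a minimal illustration, not taken from the Tablicious sources: it assumes the package is installed, and that the table constructor accepts a trailing 'VariableNames' option (not shown in the constructor signatures on this page). The variable names x and flag are made up for the example.

```octave
pkg load tablicious

# Build a 3-row table with a single variable.
# Assumption: the constructor accepts a trailing 'VariableNames' option.
t = table ([1; 2; 3], 'VariableNames', {'x'});

# Add a new variable by name; the scalar value is expanded to 3 rows.
t = setvar (t, 'flag', 0);

# Replace the values of an existing variable, referenced by name.
t = setvar (t, 'x', [10; 20; 30]);
```

A new variable has to be referenced by name, as noted above; an index is only meaningful for a variable that already exists.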
-[C, ia, ib] =
setxor (A, B)
¶Set exclusive OR. -
-Computes the setwise exclusive OR of two tables. The set XOR is defined to be -the unique row values which are present in one or the other of the two input -tables, but not in both. -
-Returns: - C - A table containing all the unique row values in the set XOR of A and B. - ia - Row indexes into A of the rows from A included in C. - ib - Row indexes into B of the rows from B included in C. -
-out =
splitapply (func, obj, G)
¶[Y1, …, YM] =
splitapply (func, obj, G)
¶Split table data into groups and apply function. -
-Performs a splitapply, using the variables in obj as the input X variables
-to the splitapply
function call.
-
See also: splitapply, table.groupby, tblish.table.grpstats -
-out =
splitvars (obj)
¶out =
splitvars (obj, vars)
¶out =
splitvars (…, 'NewVariableNames'
, NewVariableNames)
¶Split multicolumn table variables. -
-Splits multicolumn table variables into new single-column variables. -If vars is supplied, splits only those variables. If vars -is not supplied, splits all multicolumn variables. -
-obj =
squeeze (obj)
¶Remove singleton dimensions. -
-For tables, this is always a no-op that returns the input unmodified, -because tables always have exactly 2 dimensions, and 2-D arrays are unaffected -by squeeze. -
-summary
(obj) ¶Display a summary of a table’s data. -
-Displays a summary of data in the input table. This will contain some -statistical information on each of its variables. The output is printed -to the Octave console (command window, stdout, or the like in your current -session), in a format suited for human consumption. The output format is -not fixed or formally defined, and may change over time. It is only -suitable for human display, and not for parsing or programmatic use. -
-This method supports, to some degree, extension by other packages. If your -Octave session has loaded other packages which supply extension implementations -of ‘summary‘, Tablicious will use those in preference to its own internal -implementation, and you will get different, and hopefully better, output. -
-obj =
table ()
¶Constructs a new empty (0 rows by 0 variables) table. -
-obj =
table (var1, var2, …, varN)
¶Constructs a new table from the given variables. The variables passed as -inputs to this constructor become the variables of the table. Their names -are automatically detected from the input variable names that you used. -
-Note: If you call the constructor with exactly three arguments, and the first -argument is exactly the value ’__tblish_backdoor__’, that will trigger a special internal-use -backdoor calling form, and you will get incorrect results. This is a bug in -Tablicious. -
-obj =
table ('Size'
, sz, 'VariableTypes'
, varTypes)
¶Constructs a new table of the given size, and with the given variable types. -The variables will contain the default value for elements of that type. -
-s =
table2array (obj)
¶Converts obj to a homogeneous array. -
-c =
table2cell (obj)
¶Converts table to a cell array. Each variable in obj becomes -one or more columns in the output, depending on how many columns -that variable has. -
-Returns a cell array with the same number of rows as obj, and -with as many or more columns as obj has variables. -
-s =
table2struct (obj)
¶s =
table2struct (…, 'ToScalar'
, trueOrFalse)
¶Converts obj to a scalar structure or structure array. -
-Row names are not included in the output struct. To include them, you -must add them manually: - s = table2struct (tbl, ’ToScalar’, true); - s.RowNames = tbl.Properties.RowNames; -
-Returns a scalar struct or struct array, depending on the value of the
-ToScalar
option.
-
[C, ia, ib] =
union (A, B)
¶Set union. -
-Computes the union of two tables. The union is defined to be the unique -row values which are present in either of the two input tables. -
-Returns: - C - A table containing all the unique row values present in A or B. - ia - Row indexes into A of the rows from A included in C. - ib - Row indexes into B of the rows from B included in C. -
-out =
varfun (fcn, obj)
¶out =
varfun (…, 'OutputFormat'
, outputFormat)
¶out =
varfun (…, 'InputVariables'
, vars)
¶out =
varfun (…, 'ErrorHandler'
, errorFcn)
¶Apply function to table variables. -
-Applies the given function fcn to each variable in obj, -collecting the output in a table, cell array, or array of another type. -
-out =
vertcat (varargin)
¶Vertical concatenation. -
-Combines tables by vertically concatenating them. -
-Inputs that are not tables are automatically converted to tables by calling -table() on them. -
-The inputs must have the same number and names of variables, and their -variable value types and sizes must be cat-compatible. The types of the resulting -variables are the types that result from doing a ‘vertcat()‘ on the variables -from the corresponding input tables, in the order they were input in. -
-out =
width (obj)
¶Number of variables in table. -
-Note that this is not the sum of the number of columns in each variable. -It is just the number of variables. -
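The distinction between variables and columns can be seen with a table holding one multicolumn variable. This is a hedged sketch: it assumes the Tablicious package is installed and that the table constructor accepts a trailing 'VariableNames' option; the names a and b are illustrative only.

```octave
pkg load tablicious

# Two variables: 'a' has two columns, 'b' has one.
# Assumption: the constructor accepts a trailing 'VariableNames' option.
t = table ([1 2; 3 4], [5; 6], 'VariableNames', {'a', 'b'});

nvars = width (t);                   # number of variables in the table
ncols = size (table2array (t), 2);   # number of columns across all variables
```

Here nvars counts the two variables, while ncols counts the columns of the concatenated data, so the two values differ whenever a variable is multicolumn.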
-out =
tail (A)
¶out =
tail (A, k)
¶Get last K rows of an array. -
-Returns the array A, subsetted to its last k rows. This means
-subsetting it to the last (min (k, size (A, 1)))
elements along
-dimension 1, and leaving all other dimensions unrestricted.
-
A is the array to subset. -
-k is the number of rows to get. k defaults to 8 if it is omitted -or empty. -
-If there are fewer than k rows in A, returns all rows. -
-Returns an array of the same type as A, unless ()-indexing A -produces an array of a different type, in which case it returns that type. -
-See also: head -
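The three cases described above (explicit k, default k, and k larger than the number of rows) can be sketched on a plain numeric column vector. This assumes the Tablicious package is installed so that its tail function is on the path; the variable names are illustrative.

```octave
pkg load tablicious

A = (1:10)';

last3 = tail (A, 3);         # explicit k: the last three rows
lastDefault = tail (A);      # k omitted: the documented default of 8 rows
all5 = tail ((1:5)', 8);     # fewer than k rows available: all rows returned
```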
-The tblish.dataset
class provides convenient access to the various
-datasets included with Tablicious.
-
This class just contains a bunch of static methods, each of which loads -the dataset of that name. It is provided as a convenience so you can use tab -completion or other run-time introspection on the dataset list. -
-out =
AirPassengers ()
¶Monthly Airline Passenger Numbers 1949-1960 -
-The classic Box & Jenkins airline data. Monthly totals of international -airline passengers, 1949 to 1960. -
-Box, G. E. P., Jenkins, G. M. and Reinsel, G. C. (1976). Time Series -Analysis, Forecasting and Control. Third Edition. San Francisco: Holden-Day. -Series G. -
-## TODO: This example needs to be ported from R. - -
out =
BJsales ()
¶Sales Data with Leading Indicator -
-Sales Data with Leading Indicator -
-record
Index of the record. -
lead
Leading indicator. -
sales
Sales volume. -
The data are given in Box & Jenkins (1976). Obtained from the Time Series Data -Library at http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/. -
-Box, G. E. P. and Jenkins, G. M. (1976). Time Series Analysis, Forecasting and -Control. San Francisco: Holden-Day. p. 537. -
-Brockwell, P. J. and Davis, R. A. (1991). Time Series: Theory and Methods, -Second edition. New York: Springer-Verlag. p. 414. -
-# TODO: Come up with example code here - -
out =
BOD ()
¶Biochemical Oxygen Demand -
-Contains biochemical oxygen demand versus time in an evaluation of water quality. -
-Time
Time of the measurement (in days). -
demand
Biochemical oxygen demand (mg/l). -
Bates, D.M. and Watts, D.G. (1988). Nonlinear Regression Analysis and Its -Applications. New York: John Wiley & Sons. Appendix A1.4. -
-Originally from: Marske (1967). Biochemical Oxygen Demand Data -Interpretation Using Sum of Squares Surface, M.Sc. Thesis, University of -Wisconsin – Madison. -
-# TODO: Port this example from R - -
out =
ChickWeight ()
¶Weight versus age of chicks on different diets -
-weight
a numeric vector giving the body weight of the chick (gm). -
Time
a numeric vector giving the number of days since birth when the -measurement was made. -
Chick
an ordered factor with levels 18 < ... < 48 giving a unique -identifier for the chick. The ordering of the levels groups chicks on the same -diet together and orders them according to their final weight (lightest to -heaviest) within diet. -
Diet
a factor with levels 1, ..., 4 indicating which experimental diet -the chick received. -
Crowder, M. and Hand, D. (1990). Analysis of Repeated Measures. London: Chapman and -Hall. (example 5.3) -
-Hand, D. and Crowder, M. (1996), Practical Longitudinal Data Analysis. London: Chapman -and Hall. (table A.2) -
-Pinheiro, J. C. and Bates, D. M. (2000) Mixed-effects Models in S and S-PLUS. -New York: Springer. -
-t = tblish.dataset.ChickWeight - -tblish.examples.coplot (t, "Time", "weight", "Chick"); - -
out =
DNase ()
¶ELISA assay of DNase -
-Data obtained during development of an ELISA assay for the recombinant protein DNase in rat serum. -
-Run
Ordered categorical
indicating the assay run.
-
conc
Known concentration of the protein (ng/ml). -
density
Measured optical density in the assay (dimensionless). -
Davidian, M. and Giltinan, D. M. (1995). Nonlinear Models for Repeated -Measurement Data. London: Chapman & Hall. (section 5.2.4, p. 134) -
-Pinheiro, J. C. and Bates, D. M. (2000). Mixed-effects Models in S and -S-PLUS. New York: Springer. -
-t = tblish.dataset.DNase; - -# TODO: Port this from R - -tblish.examples.coplot (t, "conc", "density", "Run", "PlotFcn", @scatter); -tblish.examples.coplot (t, "conc", "density", "Run", "PlotFcn", @loglog, ... - "PlotArgs", {"o"}); - -
out =
EuStockMarkets ()
¶Daily Closing Prices of Major European Stock Indices -
-Contains the daily closing prices of major European stock indices: Germany DAX -(Ibis), Switzerland SMI, France CAC, and UK FTSE. The data are sampled in -business time, i.e., weekends and holidays are omitted. -
-A multivariate time series with 1860 observations on 4 variables. -
-The starting date is the 130th day of 1991, with a frequency of 260 observations -per year. -
-The data were kindly provided by Erste Bank AG, Vienna, Austria. -
-- -t = tblish.dataset.EuStockMarkets; - -# The fact that we're doing this munging means that table might have -# been the wrong structure for this data in the first place - -t2 = removevars (t, "day"); -index_names = t2.Properties.VariableNames; -day = 1:height (t2); -price = table2array (t2); - -price0 = price(1,:); - -rel_price = price ./ repmat (price0, [size(price, 1) 1]); - -figure; -plot (day, rel_price); -legend (index_names); -xlabel ("Business day"); -ylabel ("Relative price"); - - - -
out =
Formaldehyde ()
¶Determination of Formaldehyde -
-These data are from a chemical experiment to prepare a standard curve for the -determination of formaldehyde by the addition of chromotropic acid and -concentrated sulphuric acid and the reading of the resulting purple color on -a spectrophotometer. -
-record
Observation record number. -
carb
Carbohydrate (ml). -
optden
Optical Density -
Bennett, N. A. and N. L. Franklin (1954). Statistical Analysis in -Chemistry and the Chemical Industry. New York: Wiley. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-t = tblish.dataset.Formaldehyde; - -figure -scatter (t.carb, t.optden) -# TODO: Add a linear model line -xlabel ("Carbohydrate (ml)") -ylabel ("Optical Density") -title ("Formaldehyde data") - -# TODO: Add linear model summary output -# TODO: Add linear model summary plot - -
out =
HairEyeColor ()
¶Hair and Eye Color of Statistics Students -
-Distribution of hair and eye color and sex in 592 statistics students. -
-This data set comes in multiple variables -
-n
A 3-dimensional array containing the counts of students in each bucket. It -is arranged as hair-by-eye-by-sex. -
hair
Hair colors for the indexes along dimension 1. -
eye
Eye colors for the indexes along dimension 2. -
sex
Sexes for the indexes along dimension 3. -
The Hair x Eye table comes from a survey of students at the University of -Delaware reported by Snee (1974). The split by Sex was added by Friendly -(1992a) for didactic purposes. -
-This data set is useful for illustrating various techniques for the analysis -of contingency tables, such as the standard chi-squared test or, more -generally, log-linear modelling, and graphical methods such as mosaic plots, -sieve diagrams or association plots. -
-http://euclid.psych.yorku.ca/ftp/sas/vcd/catdata/haireye.sas -
-Snee (1974) gives the two-way table aggregated over Sex. The Sex split of -the ‘Brown hair, Brown eye’ cell was changed to agree with that used by -Friendly (2000). -
-Snee, R. D. (1974). Graphical display of two-way contingency tables. -The American Statistician, 28, 9–12. -
-Friendly, M. (1992a). Graphical methods for categorical data. SAS User -Group International Conference Proceedings, 17, 190–200. -http://www.math.yorku.ca/SCS/sugi/sugi17-paper.html -
-Friendly, M. (1992b). Mosaic displays for loglinear models. Proceedings -of the Statistical Graphics Section, American Statistical Association, pp. -61–68. http://www.math.yorku.ca/SCS/Papers/asa92.html -
-Friendly, M. (2000). Visualizing Categorical Data. SAS Institute, -ISBN 1-58025-660-0. -
-tblish.dataset.HairEyeColor - -# TODO: Aggregate over sex and display a table of counts - -# TODO: Port mosaic plot to Octave - -
out =
Harman23cor ()
¶Harman Example 2.3 -
-A correlation matrix of eight physical measurements on 305 girls between -ages seven and seventeen. -
-cov
An 8-by-8 correlation matrix. -
names
Names of the variables corresponding to the indexes of the correlation matrix’s -dimensions. -
Harman, H. H. (1976). Modern Factor Analysis, Third Edition Revised. -Chicago: University of Chicago Press. Table 2.3. -
-tblish.dataset.Harman23cor; - -# TODO: Port factanal to Octave - -
out =
Harman74cor ()
¶Harman Example 7.4 -
-A correlation matrix of 24 psychological tests given to 145 seventh and -eighth-grade children in a Chicago suburb by Holzinger and Swineford. -
-cov
A 2-dimensional correlation matrix. -
vars
Names of the variables corresponding to the indexes along the dimensions of
-cov
.
-
Harman, H. H. (1976). Modern Factor Analysis, Third Edition -Revised. Chicago: University of Chicago Press. Table 7.4. -
-tblish.dataset.Harman74cor; - -# TODO: Port factanal to Octave - -
out =
Indometh ()
¶Pharmacokinetics of Indomethacin -
-Data on the pharmacokinetics of indometacin (or, older spelling, -‘indomethacin’). -
-Subject
Subject identifier. -
time
Time since drug administration at which samples were drawn (hours). -
conc
Plasma concentration of indomethacin (mcg/ml). -
Each of the six subjects was given an intravenous injection of indometacin. -
-Kwan, Breault, Umbenhauer, McMahon and Duggan (1976). Kinetics of -Indomethacin absorption, elimination, and enterohepatic circulation in man. -Journal of Pharmacokinetics and Biopharmaceutics 4, 255–280. -
-Davidian, M. and Giltinan, D. M. (1995). Nonlinear Models for Repeated -Measurement Data. London: Chapman & Hall. (section 5.2.4, p. 129) -
-Pinheiro, J. C. and Bates, D. M. (2000). Mixed-effects Models in S and -S-PLUS. New York: Springer. -
- -out =
InsectSprays ()
¶Effectiveness of Insect Sprays -
-The counts of insects in agricultural experimental units treated with different -insecticides. -
-spray
The type of spray. -
count
Insect count. -
Beall, G., (1942). The Transformation of data from entomological field -experiments. Biometrika, 29, 243–262. -
-McNeil, D. (1977). Interactive Data Analysis. New York: Wiley. -
-t = tblish.dataset.InsectSprays; - -# TODO: boxplot - -# TODO: AOV plots - -
out =
JohnsonJohnson ()
¶Quarterly Earnings per Johnson & Johnson Share -
-Quarterly earnings (dollars) per Johnson & Johnson share 1960–80. -
-date
Start date of the quarter. -
earnings
Earnings per share (USD). -
Shumway, R. H. and Stoffer, D. S. (2000). Time Series Analysis and its -Applications. Second Edition. New York: Springer. Example 1.1. -
-t = tblish.dataset.JohnsonJohnson - -# TODO: Yikes, look at all those plots. Port them to Octave. - -
out =
LakeHuron ()
¶Level of Lake Huron 1875-1972 -
-Annual measurements of the level, in feet, of Lake Huron 1875–1972. -
-year
Year of the measurement -
level
Lake level (ft). -
Brockwell, P. J. and Davis, R. A. (1991). Time Series and Forecasting -Methods. Second edition. New York: Springer. Series A, page 555. -
-Brockwell, P. J. and Davis, R. A. (1996). Introduction to Time Series -and Forecasting. New York: Springer. Sections 5.1 and 7.6. -
-t = tblish.dataset.LakeHuron; - -plot (t.year, t.level) -xlabel ("Year") -ylabel ("Lake level (ft)") -title ("Level of Lake Huron") - -
out =
LifeCycleSavings ()
¶Intercountry Life-Cycle Savings Data -
-Data on the savings ratio 1960–1970. -
-country
Name of the country. -
sr
Aggregate personal savings. -
pop15
Percentage of population under 15. -
pop75
Percentage of population over 75. -
dpi
Real per-capita disposable income. -
ddpi
Percent growth rate of dpi. -
Under the life-cycle savings hypothesis as developed by Franco Modigliani, the -savings ratio (aggregate personal saving divided by disposable income) is -explained by per-capita disposable income, the percentage rate of change in -per-capita disposable income, and two demographic variables: the percentage -of population less than 15 years old and the percentage of the population over -75 years old. The data are averaged over the decade 1960–1970 to remove the -business cycle or other short-term fluctuations. -
-The data were obtained from Belsley, Kuh and Welsch (1980). They in turn -obtained the data from Sterling (1977). -
-Sterling, Arnie (1977). Unpublished BS Thesis. Massachusetts Institute of -Technology. -
-Belsley, D. A., Kuh. E. and Welsch, R. E. (1980). Regression Diagnostics. -New York: Wiley. -
-t = tblish.dataset.LifeCycleSavings; - -# TODO: linear model - -# TODO: pairs plot with Lowess smoothed line - -
out =
Loblolly ()
¶Growth of Loblolly pine trees -
-Records of the growth of Loblolly pine trees. -
-height
Tree height (ft). -
age
Tree age (years). -
Seed
Seed source for the tree. Ordering is according to increasing maximum height. -
Kung, F. H. (1986). Fitting logistic growth curve with predetermined carrying -capacity. Proceedings of the Statistical Computing Section, American -Statistical Association, 340–343. -
-Pinheiro, J. C. and Bates, D. M. (2000). Mixed-effects Models in S and -S-PLUS. New York: Springer. -
-t = tblish.dataset.Loblolly; - -t2 = t(t.Seed == "329",:); -scatter (t2.age, t2.height) -xlabel ("Tree age (yr)"); -ylabel ("Tree height (ft)"); -title ("Loblolly data and fitted curve (Seed 329 only)") - -# TODO: Compute and plot fitted curve - -
out =
Nile ()
¶Flow of the River Nile -
-Measurements of the annual flow of the river Nile at Aswan (formerly Assuan), -1871–1970, in m^3, “with apparent changepoint near 1898” -(Cobb(1978), Table 1, p.249). -
-year
Year of the record. -
flow
Annual flow (cubic meters). -
Durbin, J. and Koopman, S. J. (2001). Time Series Analysis by State -Space Methods. Oxford: Oxford University Press. http://www.ssfpack.com/DKbook.html -
-Balke, N. S. (1993). Detecting level shifts in time series. Journal of -Business and Economic Statistics, 11, 81–92. -
-Cobb, G. W. (1978). The problem of the Nile: conditional solution to a -change-point problem. Biometrika 65, 243–51. -
-t = tblish.dataset.Nile; - -figure -plot (t.year, t.flow); - -# TODO: Port the rest of the example to Octave - -
out =
Orange ()
¶Growth of Orange Trees -
-Records of the growth of orange trees. -
-Tree
A categorical indicating on which tree the measurement is made. -Ordering is according to increasing maximum diameter. -
age
Age of the tree (days since 1968-12-31). -
circumference
Trunk circumference (mm). -This is probably “circumference at breast height”, a standard measurement in forestry. -
The data are given in Box & Jenkins (1976). Obtained from the Time Series Data -Library at http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/. -
-Draper, N. R. and Smith, H. (1998). Applied Regression Analysis (3rd ed). -New York: Wiley. (exercise 24.N). -
-Pinheiro, J. C. and Bates, D. M. (2000). Mixed-effects Models in S and -S-PLUS. New York: Springer. -
-t = tblish.dataset.Orange; - -# TODO: Port coplot to Octave - -# TODO: Linear model - -
out =
OrchardSprays ()
¶Potency of Orchard Sprays -
-An experiment was conducted to assess the potency of various constituents -of orchard sprays in repelling honeybees, using a Latin square design. -
-rowpos
Row of the design. -
colpos
Column of the design -
treatment
Treatment level. -
decrease
Response. -
Individual cells of dry comb were filled with measured amounts of lime -sulphur emulsion in sucrose solution. Seven different concentrations of lime -sulphur ranging from a concentration of 1/100 to 1/1,562,500 in successive -factors of 1/5 were used as well as a solution containing no lime sulphur. -
-The responses for the different solutions were obtained by releasing 100 -bees into the chamber for two hours, and then measuring the decrease in volume -of the solutions in the various cells. -
-An 8 x 8 Latin square design was used and the treatments were coded as follows: -
-A – highest level of lime sulphur -B – next highest level of lime sulphur -… -G – lowest level of lime sulphur -H – no lime sulphur -
-Finney, D. J. (1947). Probit Analysis. Cambridge. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-t = tblish.dataset.OrchardSprays; - -tblish.examples.plot_pairs (t); - -
out =
PlantGrowth ()
¶Results from an Experiment on Plant Growth -
-Results from an experiment to compare yields (as measured by dried weight of -plants) obtained under a control and two different treatment conditions. -
-group
Treatment condition group. -
weight
Weight of plants. -
Dobson, A. J. (1983). An Introduction to Statistical Modelling. -London: Chapman and Hall. -
-t = tblish.dataset.PlantGrowth; - -# TODO: Port anova to Octave - -
out =
Puromycin ()
¶Reaction Velocity of an Enzymatic Reaction -
-Reaction velocity versus substrate concentration in an enzymatic reaction -involving untreated cells or cells treated with Puromycin. -
-state
Whether the cell was treated. -
conc
Substrate concentrations (ppm). -
rate
Instantaneous reaction rates (counts/min/min). -
Data on the velocity of an enzymatic reaction were obtained by Treloar -(1974). The number of counts per minute of radioactive product from the -reaction was measured as a function of substrate concentration in parts per -million (ppm) and from these counts the initial rate (or velocity) of the -reaction was calculated (counts/min/min). The experiment was conducted once -with the enzyme treated with Puromycin, and once with the enzyme untreated. -
-The data are given in Box & Jenkins (1976). Obtained from the Time Series Data -Library at http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/. -
-Bates, D.M. and Watts, D.G. (1988). Nonlinear Regression Analysis and -Its Applications. New York: Wiley. Appendix A1.3. -
-Treloar, M. A. (1974). Effects of Puromycin on Galactosyltransferase -in Golgi Membranes. M.Sc. Thesis, U. of Toronto. -
-t = tblish.dataset.Puromycin; - -# TODO: Port example to Octave - -
out =
Theoph ()
¶Pharmacokinetics of Theophylline -
-An experiment on the pharmacokinetics of theophylline. -
-Subject
Categorical identifying the subject on whom the observation was made. The -ordering is by increasing maximum concentration of theophylline observed. -
Wt
Weight of the subject (kg). -
Dose
Dose of theophylline administered orally to the subject (mg/kg). -
Time
Time since drug administration when the sample was drawn (hr). -
conc
Theophylline concentration in the sample (mg/L). -
Boeckmann, Sheiner and Beal (1994) report data from a study by Dr. Robert -Upton of the kinetics of the anti-asthmatic drug theophylline. Twelve subjects -were given oral doses of theophylline then serum concentrations were measured -at 11 time points over the next 25 hours. -
-These data are analyzed in Davidian and Giltinan (1995) and Pinheiro and Bates -(2000) using a two-compartment open pharmacokinetic model, for which a -self-starting model function, SSfol, is available. -
-The data are given in Box & Jenkins (1976). Obtained from the Time Series Data -Library at http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/. -
-Boeckmann, A. J., Sheiner, L. B. and Beal, S. L. (1994). NONMEM Users -Guide: Part V. NONMEM Project Group, University of California, San Francisco. -
-Davidian, M. and Giltinan, D. M. (1995). Nonlinear Models for Repeated -Measurement Data. London: Chapman & Hall. (section 5.5, p. 145 and section 6.6, p. 176) -
-Pinheiro, J. C. and Bates, D. M. (2000). Mixed-effects Models in -S and S-PLUS. New York: Springer. (Appendix A.29) -
-t = tblish.dataset.Theoph; - -# TODO: Coplot -# TODO: Yet another linear model to port to Octave - -
out =
Titanic ()
¶Survival of passengers on the Titanic -
-This data set provides information on the fate of passengers on the fatal -maiden voyage of the ocean liner ‘Titanic’, summarized according to -economic status (class), sex, age and survival. -
-n
is a 4-dimensional array resulting from cross-tabulating 2201 observations
-on 4 variables. The dimensions of the array correspond to the following variables:
-
Class
1st, 2nd, 3rd, Crew. -
Sex
Male, Female. -
Age
Child, Adult. -
Survived
No, Yes. -
The sinking of the Titanic is a famous event, and new books are still being -published about it. Many well-known facts—from the proportions of first-class -passengers to the ‘women and children first’ policy, and the fact that that -policy was not entirely successful in saving the women and children in the -third class—are reflected in the survival rates for various classes of -passenger. -
-These data were originally collected by the British Board of Trade in their -investigation of the sinking. Note that there is not complete agreement among -primary sources as to the exact numbers on board, rescued, or lost. -
-Due in particular to the very successful film ‘Titanic’, the last years saw a -rise in public interest in the Titanic. Very detailed data about the passengers -is now available on the Internet, at sites such as Encyclopedia Titanica -(https://www.encyclopedia-titanica.org/). -
-Dawson, Robert J. MacG. (1995). The ‘Unusual Episode’ Data Revisited. -Journal of Statistics Education, 3. -
-The source provides a data set recording class, sex, age, and survival status -for each person on board of the Titanic, and is based on data originally -collected by the British Board of Trade and reprinted in: -
-British Board of Trade (1990). Report on the Loss of the ‘Titanic’ -(S.S.). British Board of Trade Inquiry Report (reprint). Gloucester, -UK: Allan Sutton Publishing. -
-tblish.dataset.Titanic; - -# TODO: Port mosaic plot to Octave - -# TODO: Check for higher survival rates in children and females - -
out =
ToothGrowth ()
¶The Effect of Vitamin C on Tooth Growth in Guinea Pigs -
-The response is the length of odontoblasts (cells responsible for tooth growth)
-in 60 guinea pigs. Each animal received one of three dose levels of vitamin C
-(0.5, 1, and 2 mg/day) by one of two delivery methods, orange juice or
-ascorbic acid (a form of vitamin C and coded as VC
).
-
supp
Supplement type. -
dose
Dose (mg/day). -
len
Tooth length. -
C. I. Bliss (1952). The Statistics of Bioassay. Academic Press. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-Crampton, E. W. (1947). The growth of the odontoblast of the incisor -teeth as a criterion of vitamin C intake of the guinea pig. The -Journal of Nutrition, 33(5), 491–504. -
-t = tblish.dataset.ToothGrowth; - -tblish.examples.coplot (t, "dose", "len", "supp"); - -# TODO: Port Lowess smoothing to Octave - -
out =
UCBAdmissions ()
¶Student Admissions at UC Berkeley -
-Aggregate data on applicants to graduate school at Berkeley for the six -largest departments in 1973 classified by admission and sex. -
-A 3-dimensional array resulting from cross-tabulating 4526 observations on -3 variables. The variables and their levels are as follows: -
-Admit
Admitted, Rejected. -
Gender
Male, Female. -
Dept
A, B, C, D, E, F. -
This data set is frequently used for illustrating Simpson’s paradox, see -Bickel et al (1975). At issue is whether the data show evidence of sex bias -in admission practices. There were 2691 male applicants, of whom 1198 (44.5%) -were admitted, compared with 1835 female applicants of whom 557 (30.4%) were -admitted. This gives a sample odds ratio of 1.83, indicating that males were -almost twice as likely to be admitted. In fact, graphical methods (as in the -example below) or log-linear modelling show that the apparent association -between admission and sex stems from differences in the tendency of males -and females to apply to the individual departments (females used to apply -more to departments with higher rejection rates). -
-The data are given in Box & Jenkins (1976). Obtained from the Time Series Data -Library at http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/. -
-Bickel, P. J., Hammel, E. A., and O’Connell, J. W. (1975). Sex bias in -graduate admissions: Data from Berkeley. Science, 187, 398–403. -http://www.jstor.org/stable/1739581. -
-tblish.dataset.UCBAdmissions; - -# TODO: Port mosaic plot to Octave - -
out =
UKDriverDeaths ()
¶Road Casualties in Great Britain 1969-84 -
-UKDriverDeaths
is a time series giving the monthly totals of car drivers in Great Britain killed
-or seriously injured Jan 1969 to Dec 1984. Compulsory wearing of seat belts
-was introduced on 31 Jan 1983.
-
Seatbelts
is more information on the same problem.
-
UKDriverDeaths
is a table with the following variables:
-
month
Month of the observation. -
deaths
Number of deaths. -
Seatbelts
is a table with the following variables:
-
month
Month of the observation. -
DriversKilled
Car drivers killed. -
drivers
Same as UKDriverDeaths
deaths
count.
-
front
Front-seat passengers killed or seriously injured. -
rear
Rear-seat passengers killed or seriously injured. -
kms
Distance driven. -
PetrolPrice
Petrol price. -
VanKilled
Number of van (“light goods vehicle”) drivers killed. -
law
0/1: was the seatbelt law in effect that month? -
Harvey, A.C. (1989). Forecasting, Structural Time Series Models and -the Kalman Filter. Cambridge: Cambridge University Press. pp. 519–523. -
-Durbin, J. and Koopman, S. J. (2001). Time Series Analysis by State -Space Methods. Oxford: Oxford University Press. http://www.ssfpack.com/dkbook/ -
-Harvey, A. C. and Durbin, J. (1986). The effects of seat belt legislation -on British road casualties: A case study in structural time series -modelling. Journal of the Royal Statistical Society series A, 149, 187–227. -
-tblish.dataset.UKDriverDeaths; -d = UKDriverDeaths; -s = Seatbelts; - -# TODO: Port the model and plots to Octave - -
out =
UKLungDeaths ()
¶Monthly Deaths from Lung Diseases in the UK -
-Three time series giving the monthly deaths from bronchitis, emphysema and -asthma in the UK, 1974–1979. -
-date
Month of the observation. -
ldeaths
Total lung deaths. -
fdeaths
Lung deaths among females. -
mdeaths
Lung deaths among males. -
P. J. Diggle (1990). Time Series: A Biostatistical Introduction. Oxford. table A.3 -
-t = tblish.dataset.UKLungDeaths; - -figure -plot (datenum (t.date), t.ldeaths); -title ("Total UK Lung Deaths") -xlabel ("Month") -ylabel ("Deaths") - -figure -plot (datenum (t.date), [t.fdeaths t.mdeaths]); -title ("UK Lung Deaths by sex") -legend ({"Female", "Male"}) -xlabel ("Month") -ylabel ("Deaths") - -
out =
UKgas ()
¶UK Quarterly Gas Consumption -
-Quarterly UK gas consumption from 1960Q1 to 1986Q4, in millions of therms. -
-date
Quarter of the observation -
gas
Gas consumption (MM therms). -
Durbin, J. and Koopman, S. J. (2001). Time Series Analysis by State -Space Methods. Oxford: Oxford University Press. http://www.ssfpack.com/dkbook/. -
-t = tblish.dataset.UKgas; - -plot (datenum (t.date), t.gas); -datetick ("x") -xlabel ("Month") -ylabel ("Gas consumption (MM therms)") - -
out =
USAccDeaths ()
¶Accidental Deaths in the US 1973-1978 -
-A time series giving the monthly totals of accidental deaths in the USA. -
-month
Month of the observation. -
deaths
Accidental deaths. -
Brockwell, P. J. and Davis, R. A. (1991). Time Series: Theory and Methods. -New York: Springer. -
-t = tblish.dataset.USAccDeaths; - -
out =
USArrests ()
¶Violent Crime Rates by US State -
-This data set contains statistics, in arrests per 100,000 residents for -assault, murder, and rape in each of the 50 US states in 1973. Also given -is the percent of the population living in urban areas. -
-State
State name. -
Murder
Murder arrests (per 100,000). -
Assault
Assault arrests (per 100,000). -
UrbanPop
Percent urban population. -
Rape
Rape arrests (per 100,000). -
USArrests
contains the data as in McNeil’s monograph. For the
-UrbanPop
percentages, a review of the table (No. 21) in the
-Statistical Abstracts 1975 reveals a transcription error for Maryland
-(and that McNeil used the same “round to even” rule), as found by
-Daniel S Coven (Arizona).
-
See the example below on how to correct the error and improve accuracy -for the ‘<n>.5’ percentages. -
-World Almanac and Book of Facts 1975. (Crime rates). -
-Statistical Abstracts of the United States 1975, p.20, (Urban rates), -possibly available as https://books.google.ch/books?id=zl9qAAAAMAAJ&pg=PA20. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-t = tblish.dataset.USArrests; - -summary (t); - -tblish.examples.plot_pairs (t(:,2:end)); - -# TODO: Difference between USArrests and its correction - -# TODO: +/- 0.5 to restore the original <n>.5 percentages - -
out =
USJudgeRatings ()
¶Lawyers’ Ratings of State Judges in the US Superior Court -
-Lawyers’ ratings of state judges in the US Superior Court. -
-CONT
Number of contacts of lawyer with judge. -
INTG
Judicial integrity. -
DMNR
Demeanor. -
DILG
Diligence. -
CFMG
Case flow managing. -
DECI
Prompt decisions. -
PREP
Preparation for trial. -
FAMI
Familiarity with law. -
ORAL
Sound oral rulings. -
WRIT
Sound written rulings. -
PHYS
Physical ability. -
RTEN
Worthy of retention. -
New Haven Register, 14 January, 1977 (from John Hartigan). -
-t = tblish.dataset.USJudgeRatings; - -figure -tblish.examples.plot_pairs (t(:,2:end)); -title ("USJudgeRatings data") - -
out =
USPersonalExpenditure ()
¶Personal Expenditure Data -
-This data set consists of United States personal expenditures (in billions -of dollars) in the categories: food and tobacco, household operation, -medical and health, personal care, and private education for the years 1940, -1945, 1950, 1955 and 1960. -
-A 2-dimensional matrix x
with Category along dimension 1 and Year along dimension 2.
-
The World Almanac and Book of Facts, 1962, page 756. -
-Tukey, J. W. (1977). Exploratory Data Analysis. Reading, Mass: Addison-Wesley. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-tblish.dataset.USPersonalExpenditure; - -# TODO: Port medpolish() from R, whatever that is. - -
out =
VADeaths ()
¶Death Rates in Virginia (1940) -
-Death rates per 1000 in Virginia in 1940. -
-A 2-dimensional matrix deaths
, with age group along dimension 1 and
-demographic group along dimension 2.
-
The death rates are measured per 1000 population per year. They are -cross-classified by age group (rows) and population group (columns). The -age groups are: 50–54, 55–59, 60–64, 65–69, 70–74 and the population groups -are Rural/Male, Rural/Female, Urban/Male and Urban/Female. -
-This provides a rather nice 3-way analysis of variance example. -
-Molyneaux, L., Gilliam, S. K., and Florant, L. C. (1947). Differences -in Virginia death rates by color, sex, age, and rural or urban -residence. American Sociological Review, 12, 525–535. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-tblish.dataset.VADeaths; - -# TODO: Port to Octave - -
out =
WWWusage ()
¶WWWusage -
-A time series of the numbers of users connected to the Internet through -a server every minute. -
-A time series of length 100. -
-Durbin, J. and Koopman, S. J. (2001). Time Series Analysis by State -Space Methods. Oxford: Oxford University Press. http://www.ssfpack.com/dkbook/ -
-Makridakis, S., Wheelwright, S. C. and Hyndman, R. J. (1998). Forecasting: -Methods and Applications. New York: Wiley. -
-# TODO: Come up with example code here - -
out =
WorldPhones ()
¶The World’s Telephones -
-The number of telephones in various regions of the world (in thousands). -
-A matrix with 7 rows and 7 columns. The columns of the matrix give the -figures for a given region, and the rows the figures for a year. -
-The regions are: North America, Europe, Asia, South America, Oceania, -Africa, Central America. -
-The years are: 1951, 1956, 1957, 1958, 1959, 1960, 1961. -
-AT&T (1961) The World’s Telephones. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-tblish.dataset.WorldPhones; - -# TODO: Port matplot() to Octave - -
out =
airmiles ()
¶Passenger Miles on Commercial US Airlines, 1937-1960 -
-The revenue passenger miles flown by commercial airlines in the -United States for each year from 1937 to 1960. -
-F.A.A. Statistical Handbook of Aviation. -
-t = tblish.dataset.airmiles; -plot (t.year, t.miles); -title ("airmiles data"); -xlabel ("Year"); -ylabel ("Passenger-miles flown by U.S. commercial airlines"); - -
out =
airquality ()
¶New York Air Quality Measurements from 1973 -
-Daily air quality measurements in New York, May to September 1973. -
-Ozone
Ozone concentration (ppb) -
SolarR
Solar R (lang) -
Wind
Wind (mph) -
Temp
Temperature (degrees F) -
Month
Month (1-12) -
Day
Day of month (1-31) -
New York State Department of Conservation (ozone data) and the National -Weather Service (meteorological data). -
-Chambers, J. M., Cleveland, W. S., Kleiner, B. and Tukey, P. A. (1983). -Graphical Methods for Data Analysis. Belmont, CA: Wadsworth. -
-t = tblish.dataset.airquality; -# Plot a scatter-plot plus a fitted line, for each combination of measurements -vars = {"Ozone", "SolarR", "Wind", "Temp", "Month", "Day"}; -n_vars = numel (vars); -figure; -for i = 1:n_vars - for j = 1:n_vars - if (i == j) - continue - endif - ix_subplot = (n_vars * (j - 1) + i); - hax = subplot (n_vars, n_vars, ix_subplot); - var_x = vars{i}; - var_y = vars{j}; - x = t.(var_x); - y = t.(var_y); - scatter (hax, x, y, 10); - # Fit a cubic polynomial to these points - # TODO: Find out exactly what kind of fitted line R's example is using, and - # port that. - hold on - p = polyfit (x, y, 3); - x_hat = unique(x); - p_y = polyval (p, x_hat); - plot (hax, x_hat, p_y, "r"); - endfor -endfor - -
out =
anscombe ()
¶Anscombe’s Quartet of “Identical” Simple Linear Regressions -
-Four sets of x/y pairs which have the same statistical properties, but are -very different. -
-The data comes in an array of 4 structs, each with fields as follows: -
-x
The X values for this pair. -
y
The Y values for this pair. -
Tufte, Edward R. (1989). The Visual Display of Quantitative Information. -13–14. Cheshire, CT: Graphics Press. -
-Anscombe, Francis J. (1973). Graphs in statistical analysis. The -American Statistician, 27, 17–21. -
-data = tblish.dataset.anscombe - -# Pick good limits for the plots -all_x = [data.x]; -all_y = [data.y]; -x_limits = [min(0, min(all_x)) max(all_x)*1.2]; -y_limits = [min(0, min(all_y)) max(all_y)*1.2]; - -# Do regression on each pair and plot the input and results -figure; -haxs = NaN (1, 4); -for i_pair = 1:4 - x = data(i_pair).x; - y = data(i_pair).y; - # TODO: Port the anova and other characterizations from the R code - # TODO: Do a linear regression and plot its line - hax = subplot (2, 2, i_pair); - haxs(i_pair) = hax; - scatter (x, y, "r"); - # Label after plotting, since scatter() resets the axes properties - xlabel (sprintf ("x%d", i_pair)); - ylabel (sprintf ("y%d", i_pair)); -endfor - -# Fiddle with the plot axes parameters -linkaxes (haxs); -xlim (haxs(1), x_limits); -ylim (haxs(1), y_limits); - -
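The linear-regression TODO in the loop above could be filled in along the following lines. This is only a sketch: it fits a straight line to the first pair with core Octave's `polyfit`, and the x/y values are typed in from Anscombe (1973) so that it runs without Tablicious. It is an assumption that `data(1).x` and `data(1).y` hold these same values.

```octave
# First pair of Anscombe's quartet, entered by hand (assumed to match
# data(1).x and data(1).y in the Tablicious data set).
x = [10 8 13 9 11 14 6 4 12 7 5];
y = [8.04 6.95 7.58 8.81 8.33 9.96 7.24 4.26 10.84 4.82 5.68];

# Least-squares fit of a degree-1 polynomial: y ~ p(1)*x + p(2)
p = polyfit (x, y, 1);

# Evaluate the fitted line over the observed x range
x_fit = linspace (min (x), max (x), 50);
y_fit = polyval (p, x_fit);

printf ("slope = %.3f, intercept = %.3f\n", p(1), p(2));
```

Inside the plotting loop, the fitted line could then be overlaid on each scatter plot with `hold on` followed by `plot (x_fit, y_fit)`.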
out =
attenu ()
¶Joyner-Boore Earthquake Attenuation Data -
-Event data for 23 earthquakes in California, showing peak accelerations. -
-event
Event number -
mag
Moment magnitude -
station
Station identifier -
dist
Station-hypocenter distance (km) -
accel
Peak acceleration (g) -
Joyner, W.B., D.M. Boore and R.D. Porcella (1981). Peak horizontal acceleration -and velocity from strong-motion records including records from the 1979 -Imperial Valley, California earthquake. USGS Open File report 81-365. Menlo -Park, CA. -
-Boore, D. M. and Joyner, W. B. (1982). The empirical prediction of ground -motion. Bulletin of the Seismological Society of America, 72, S269–S268. -
-# TODO: Port the example code from R -# It does coplot() and pairs(), which are higher-level plotting tools -# than core Octave provides. This could turn into a long example if we -# just use base Octave here. -
out =
attitude ()
¶The Chatterjee-Price Attitude Data -
-Aggregated data from a survey of clerical employees at a large financial -organization. -
-rating
Overall rating. -
complaints
Handling of employee complaints. -
privileges
Does not allow special privileges. -
learning
Opportunity to learn. -
raises
Raises based on performance. -
critical
Too critical. -
advance
Advancement. -
Chatterjee, S. and Price, B. (1977). Regression Analysis by Example. New York: -Wiley. (Section 3.7, p.68ff of 2nd ed.(1991).) -
-t = tblish.dataset.attitude - -tblish.examples.plot_pairs (t); - -# TODO: Display table summary - -# TODO: Whatever those statistical linear-model plots are that R is doing - - -
out =
austres ()
¶Australian Population -
-Numbers of Australian residents measured quarterly from March 1971 to March 1994. -
-date
The month of the observation. -
residents
The number of residents. -
Brockwell, P. J. and Davis, R. A. (1996). Introduction to Time Series and -Forecasting. New York: Springer-Verlag. -
-t = tblish.dataset.austres - -plot (datenum (t.date), t.residents); -datetick x -xlabel ("Month"); ylabel ("Residents"); title ("Australian Residents"); - -
out =
beavers ()
¶Body Temperature Series of Two Beavers -
-Body temperature readings for two beavers. -
-day
Day of observation (in days since the beginning of 1990), December 12–13 (beaver1) -and November 3–4 (beaver2). -
time
Time of observation, in the form 0330 for 3:30am -
temp
Measured body temperature in degrees Celsius. -
activ
Indicator of activity outside the retreat. -
P. S. Reynolds (1994) Time-series analyses of beaver body temperatures. -Chapter 11 of Lange, N., Ryan, L., Billard, L., Brillinger, D., Conquest, -L. and Greenhouse, J. (Eds.) (1994) Case Studies in Biometry. New York: John Wiley -and Sons. -
-# TODO: This example needs to be ported from R. -
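Until the R example is ported, the `time` encoding described above can be unpacked with a little arithmetic. The sketch below assumes that `time` is stored as a number in HHMM form (so 0330 is the value 330); the sample values are made up for illustration and are not taken from the data set.

```octave
# Hypothetical observation times in HHMM form (330 means 3:30am)
time_hhmm = [330 840 2250];

# Split each value into its hour and minute parts
hours = floor (time_hhmm / 100);
minutes = mod (time_hhmm, 100);

# Minutes elapsed since midnight, handy as a plotting axis
minutes_since_midnight = 60 * hours + minutes;

for i = 1:numel (time_hhmm)
  printf ("%04d -> %02d:%02d\n", time_hhmm(i), hours(i), minutes(i));
endfor
```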
out =
cars ()
¶Speed and Stopping Distances of Cars -
-Speed of cars and distances taken to stop. Note that the data were recorded in the 1920s. -
-speed
Speed (mph). -
dist
Stopping distance (ft). -
Ezekiel, M. (1930). Methods of Correlation Analysis. New York: Wiley. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-- -t = tblish.dataset.cars; - - -# TODO: Add Lowess smoothed lines to the plots - -figure; -plot (t.speed, t.dist, "o"); -xlabel ("Speed (mph)"); ylabel ("Stopping distance (ft)"); -title ("cars data"); - -figure; -loglog (t.speed, t.dist, "o"); -xlabel ("Speed (mph)"); ylabel ("Stopping distance (ft)"); -title ("cars data (logarithmic scales)"); - -# TODO: Do the linear model plot - -# Polynomial regression -figure; -plot (t.speed, t.dist, "o"); -xlabel ("Speed (mph)"); ylabel ("Stopping distance (ft)"); -title ("cars polynomial regressions"); -hold on -xlim ([0 25]); -x2 = linspace (0, 25, 200); -for degree = 1:4 - [P, S, mu] = polyfit (t.speed, t.dist, degree); - y2 = polyval(P, x2, [], mu); - plot (x2, y2); -endfor - - -
out =
chickwts ()
¶Chicken Weights by Feed Type -
-An experiment was conducted to measure and compare the effectiveness of various -feed supplements on the growth rate of chickens. -
-Newly hatched chicks were randomly allocated into six groups, and each group -was given a different feed supplement. Their weights in grams after six weeks -are given along with feed types. -
-weight
Chick weight at six weeks (gm). -
feed
Feed type. -
Anonymous (1948) Biometrika, 35, 214. -
-McNeil, D. R. (1977). Interactive Data Analysis
. New York: Wiley.
-
# This example requires the statistics package from Octave Forge - -t = tblish.dataset.chickwts - -# Boxplot by group -figure -g = groupby (t, "feed", { - "weight", @(x) {x}, "weight" -}); -boxplot (g.weight, 1); -xlabel ("feed"); ylabel ("Weight at six weeks (gm)"); -xticklabels ([{""} cellstr(g.feed')]); - -# Linear model -# TODO: This linear model thing and anova - -
out =
co2 ()
¶Mauna Loa Atmospheric CO2 Concentration -
-Atmospheric concentrations of CO2 are expressed in parts per million (ppm) and -reported in the preliminary 1997 SIO manometric mole fraction scale. Contains -monthly observations from 1959 to 1997. -
-date
Date of the month of the observation, as datetime. -
co2
CO2 concentration (ppm). -
The values for February, March and April of 1964 were missing and have -been obtained by interpolating linearly between the values for January -and May of 1964. -
-Keeling, C. D. and Whorf, T. P., Scripps Institution of Oceanography -(SIO), University of California, La Jolla, California USA 92093-0220. -
-ftp://cdiac.esd.ornl.gov/pub/maunaloa-co2/maunaloa.co2. -
-Cleveland, W. S. (1993). Visualizing Data
. New Jersey: Summit Press.
-
t = tblish.dataset.co2; - -plot (datenum (t.date), t.co2); -datetick ("x"); -xlabel ("Time"); ylabel ("Atmospheric concentration of CO2"); -title ("co2 data set"); - -
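The description above says the February, March and April 1964 values were filled in by linear interpolation between January and May. A minimal sketch of that kind of fill, using core Octave's `interp1`, is shown here. The two endpoint concentrations are hypothetical round numbers, not the real measurements, and interpolating over the month index (rather than over exact dates) is an assumption.

```octave
# Month indices within 1964 for the two known observations (Jan, May)
known_months = [1 5];
# Hypothetical CO2 concentrations (ppm) for those months
known_co2 = [319.0 323.0];

# Months whose values are missing (Feb, Mar, Apr)
missing_months = [2 3 4];

# Linear interpolation between the two known points
filled_co2 = interp1 (known_months, known_co2, missing_months, "linear");

disp (filled_co2);
```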
out =
crimtab ()
¶Student’s 3000 Criminals Data -
-Data of 3000 male criminals over 20 years old undergoing their sentences in the -chief prisons of England and Wales. -
-This dataset contains three separate variables. The finger_length
and
-body_height
variables correspond to the rows and columns of the
-count
matrix.
-
finger_length
Midpoints of intervals of finger lengths (cm). -
body_height
Body heights (cm). -
count
Number of prisoners in this bin. -
Student is the pseudonym of William Sealy Gosset. In his 1908 paper he wrote -(on page 13) at the beginning of section VI entitled Practical Test of the -foregoing Equations: -
-“Before I had succeeded in solving my problem analytically, I had endeavoured -to do so empirically. The material used was a correlation table containing -the height and left middle finger measurements of 3000 criminals, from a -paper by W. R. MacDonell (Biometrika, Vol. I., p. 219). The measurements -were written out on 3000 pieces of cardboard, which were then very thoroughly -shuffled and drawn at random. As each card was drawn its numbers were written -down in a book, which thus contains the measurements of 3000 criminals in a -random order. Finally, each consecutive set of 4 was taken as a sample—750 -in all—and the mean, standard deviation, and correlation of each sample -determined. The difference between the mean of each sample and the mean of -the population was then divided by the standard deviation of the sample, giving -us the z of Section III.” -
-The table is in fact page 216 and not page 219 in MacDonell (1902). In the -MacDonell table, the middle finger lengths were given in mm and the heights -in feet/inches intervals; both are converted into cm here. The midpoints -of intervals were used, e.g., where MacDonell has “4’ 7"9/16 – 8"9/16”, we -have 142.24 which is 2.54*56 = 2.54*(4’ 8"). -
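The height conversion quoted above can be reproduced in a few lines of Octave. This sketch only restates the arithmetic from the text (4 feet 8 inches is 56 inches, and 56 inches times 2.54 is 142.24 cm); the variable names are illustrative and are not part of the data set.

```octave
# Height used in the text: 4 feet 8 inches
feet = 4;
inches = 8;

# 12 inches per foot, 2.54 cm per inch
total_inches = 12 * feet + inches;
height_cm = 2.54 * total_inches;

printf ("%d ft %d in = %d in = %.2f cm\n", feet, inches, total_inches, height_cm);
```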
-MacDonell credited the source of data (page 178) as follows: “The data on which -the memoir is based were obtained, through the kindness of Dr Garson, from the -Central Metric Office, New Scotland Yard...” He pointed out on page 179 that: -“The forms were drawn at random from the mass on the office shelves; we are -therefore dealing with a random sampling.” -
-http://pbil.univ-lyon1.fr/R/donnees/criminals1902.txt thanks to Jean R. -Lobry and Anne-Béatrice Dufour. -
-Garson, J.G. (1900). The metric system of identification of criminals, as used -in Great Britain and Ireland. The Journal of the Anthropological -Institute of Great Britain and Ireland, 30, 161–198. -
-MacDonell, W.R. (1902). On criminal anthropometry and the identification of -criminals. Biometrika, 1(2), 177–227. -
-Student (1908). The probable error of a mean. Biometrika
, 6, 1–25.
-
# TODO: Port this from R - -
out =
cupcake ()
¶Google Search popularity for "cupcake", 2004-2019 -
-Monthly popularity of worldwide Google search results for "cupcake", 2004-2019. -
-Month
Month when searches took place -
Cupcake
An indicator of search volume, in unknown units -
Google Trends, https://trends.google.com/trends/explore?q=%2Fm%2F03p1r4&date=all, -retrieved 2019-05-04 by Andrew Janke. -
-t = tblish.dataset.cupcake -plot (datenum (t.Month), t.Cupcake) -title ('“Cupcake” Google Searches'); xlabel ("Year"); ylabel ("Unknown popularity metric"); - -
out =
discoveries ()
¶Yearly Numbers of Important Discoveries -
-The numbers of “great” inventions and scientific discoveries in each year from 1860 to 1959. -
-year
Year. -
discoveries
Number of “great” discoveries that year. -
The World Almanac and Book of Facts, 1975 Edition, pages 315–318. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-t = tblish.dataset.discoveries; - -plot (t.year, t.discoveries); -xlabel ("Time"); ylabel ("Number of important discoveries"); -title ("discoveries data set"); - -
out =
esoph ()
¶Smoking, Alcohol and Esophageal Cancer -
-Data from a case-control study of (o)esophageal cancer in Ille-et-Vilaine, France. -
-item
Age group (years). -
alcgp
Alcohol consumption (gm/day). -
tobgp
Tobacco consumption (gm/day). -
ncases
Number of cases. -
ncontrols
Number of controls -
Breslow, N. E. and Day, N. E. (1980) Statistical Methods in Cancer Research. -Volume 1: The Analysis of Case-Control Studies. Oxford: IARC Lyon / Oxford University Press. -
-# TODO: Port this from R - -# TODO: Port the anova output - -# TODO: Port the fancy plot -# This involves a "mosaic plot", which is not supported by Octave, so this will -# take some work. - -
out =
euro ()
¶Conversion Rates of Euro Currencies -
-Conversion rates between the various Euro currencies. -
-This data comes in two separate variables. -
-euro
An 11-long vector of the value of 1 Euro in all participating currencies. -
euro_cross
An 11-by-11 matrix of conversion rates between various Euro currencies. -
euro_date
The date upon which these Euro conversion rates were fixed. -
The data set euro contains the value of 1 Euro in all currencies participating -in the European monetary union (Austrian Schilling ATS, Belgian Franc BEF, -German Mark DEM, Spanish Peseta ESP, Finnish Markka FIM, French Franc FRF, -Irish Punt IEP, Italian Lira ITL, Luxembourg Franc LUF, Dutch Guilder NLG and -Portuguese Escudo PTE). These conversion rates were fixed by the European -Union on December 31, 1998. To convert old prices to Euro prices, divide by the -respective rate and round to 2 digits. -
-Unknown. -
-This example data set was derived from the R 3.6.0 example datasets, and they -do not specify a source. -
-# TODO: Port this from R - -# TODO: Example conversion - -# TODO: "dot chart" showing euro-to-whatever conversion rates and vice versa - -
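As a stand-in for the example-conversion TODO, the rule given in the description (divide by the respective rate and round to 2 digits) can be sketched as follows. The German Mark rate of 1.95583 per Euro is the officially fixed rate; it is hard-coded here so the sketch runs without Tablicious, whereas in practice it would presumably be read from the `euro` vector. The price and the variable names are hypothetical.

```octave
# Fixed conversion rate: 1 Euro = 1.95583 German Marks (DEM).
# Hard-coded here; assumed to correspond to one element of the euro vector.
dem_per_euro = 1.95583;

# Hypothetical old price, in German Marks
price_dem = 100;

# Divide by the rate, then round to 2 decimal digits
price_eur = round (100 * price_dem / dem_per_euro) / 100;

printf ("%.2f DEM = %.2f EUR\n", price_dem, price_eur);
```

Scaling by 100 before calling `round` is used because core `round` works on whole numbers only.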
out =
eurodist ()
¶Distances Between European Cities and Between US Cities -
-eurodist
gives road distances (in km) between 21 cities in Europe. The
-data are taken from a table in The Cambridge Encyclopaedia.
-
UScitiesD
gives “straight line” distances between 10 cities in the US.
-
eurodist
????? -
TODO: Finish this. -
-Crystal, D. Ed. (1990). The Cambridge Encyclopaedia. Cambridge: -Cambridge University Press. -
-The US cities distances were provided by Pierre Legendre. -
-out =
faithful ()
¶Old Faithful Geyser Data -
-Waiting time between eruptions and the duration of the eruption for the Old -Faithful geyser in Yellowstone National Park, Wyoming, USA. -
-eruptions
Eruption time (mins). -
waiting
Waiting time to next eruption (mins). -
W. Härdle. -
-Härdle, W. (1991). Smoothing Techniques with Implementation in S. New York: -Springer. -
-Azzalini, A. and Bowman, A. W. (1990). A look at some data on the Old -Faithful geyser. Applied Statistics, 39, 357–365. -
-t = tblish.dataset.faithful; - -# Munge the data, rounding eruption time to the second -e60 = 60 * t.eruptions; -ne60 = round (e60); -# TODO: Port zapsmall to Octave -eruptions = ne60 / 60; -# TODO: Display mean relative difference and bins summary - -# Histogram of rounded eruption times -figure -hist (ne60, max (ne60)) -xlabel ("Eruption time (sec)") -ylabel ("n") -title ("faithful data: Eruptions of Old Faithful") - -# Scatter plot of eruption time vs waiting time -figure -scatter (t.eruptions, t.waiting) -xlabel ("Eruption time (min)") -ylabel ("Waiting time to next eruption (min)") -title ("faithful data: Eruptions of Old Faithful") -# TODO: Port Lowess smoothing to Octave - -
out =
freeny ()
¶Freeny’s Revenue Data -
-Freeny’s data on quarterly revenue and explanatory variables. -
-Freeny’s dataset consists of one observed dependent variable -(revenue) and four explanatory variables (lagged quarterly -revenue, price index, income level, and market potential). -
-date
Start date of the quarter for the observation. -
y
Observed quarterly revenue. -TODO: Determine units (probably millions of USD?) -
lag_quarterly_revenue
Quarterly revenue (y
), lagged 1 quarter.
-
price_index
A price index -
income_level
??? TODO: Fill this in -
market_potential
??? TODO: Fill this in -
Freeny, A. E. (1977). A Portable Linear Regression Package with Test -Programs. Bell Laboratories memorandum. -
-Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S -Language. Monterey: Wadsworth & Brooks/Cole. -
-t = tblish.dataset.freeny; - -summary (t) - -tblish.examples.plot_pairs (removevars (t, "date")) - -# TODO: Create linear model and print summary - -# TODO: Linear model plot - -
out =
infert ()
¶Infertility after Spontaneous and Induced Abortion -
-This is a matched case-control study dating from before the availability of -conditional logistic regression. -
-education
Index of the record. -
age
Age in years of case. -
parity
Count. -
induced
Number of prior induced abortions, grouped into “0”, “1”, or “2 or more”. -
case_status
0 = control, 1 = case. -
spontaneous
Number of prior spontaneous abortions, grouped into “0”, “1”, or “2 or more”. -
stratum
Matched set number. -
pooled_stratum
Stratum number. -
One case with two prior spontaneous abortions and two prior induced abortions is omitted. -
-Trichopoulos et al (1976). Br. J. of Obst. and Gynaec. 83, 645–650. -
-t = tblish.dataset.infert; - -# TODO: Port glm() (generalized linear model) stuff to Octave - -
out =
iris ()
¶The Fisher Iris dataset: measurements of various flowers -
-This is the classic Fisher Iris dataset. -
-Species
The species of flower being measured. -
SepalLength
Length of sepals, in centimeters. -
SepalWidth
Width of sepals, in centimeters. -
PetalLength
Length of petals, in centimeters. -
PetalWidth
Width of petals, in centimeters. -
http://archive.ics.uci.edu/ml/datasets/Iris -
-https://en.wikipedia.org/wiki/Iris_flower_data_set -
-Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. -Annals of Eugenics, 7, Part II, 179-188. also in Contributions -to Mathematical Statistics (John Wiley, NY, 1950). -
-Duda, R.O., & Hart, P.E. (1973). Pattern Classification and Scene Analysis. -(Q327.D83) New York: John Wiley & Sons. ISBN 0-471-22361-1. See page 218. -
-The data were collected by Anderson, Edgar (1935). The irises of the Gaspe -Peninsula. Bulletin of the American Iris Society, 59, 2–5. -
-# TODO: Port this example from R - -
out =
islands ()
¶Areas of the World’s Major Landmasses -
-The areas in thousands of square miles of the landmasses which exceed 10,000 -square miles. -
-name
The name of the island. -
area
The area, in thousands of square miles. -
The World Almanac and Book of Facts, 1975, page 406. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-t = tblish.dataset.islands; - -# TODO: Port dot chart to Octave - -
out =
lh ()
¶Luteinizing Hormone in Blood Samples -
-A regular time series giving the luteinizing hormone in blood samples at 10 -minute intervals from a human female, 48 samples. -
-sample
The number of the observation. -
lh
Level of luteinizing hormone. -
P.J. Diggle (1990). Time Series: A Biostatistical Introduction. Oxford. -Table A.1, series 3. -
-t = tblish.dataset.lh; - -plot (t.sample, t.lh); -xlabel ("Sample Number"); -ylabel ("lh level"); - -
out =
longley ()
¶Longley’s Economic Regression Data -
-A macroeconomic data set which provides a well-known example for a highly -collinear regression. -
-Year
The year. -
GNP_deflator
GNP implicit price deflator (1954=100). -
GNP
Gross National Product. -
Unemployed
Number of unemployed. -
Armed_Forces
Number of people in the armed forces. -
Population
“Noninstitutionalized” population ≥ 14 years of age. -
Employed
Number of people employed. -
J. W. Longley (1967). An appraisal of least-squares programs from the point of -view of the user. Journal of the American Statistical Association, 62, -819–841. -
-Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S -Language. Monterey: Wadsworth & Brooks/Cole. -
-t = tblish.dataset.longley; - -# TODO: Linear model -# TODO: opar plot - -
out =
lynx ()
¶Annual Canadian Lynx trappings 1821-1934 -
-Annual numbers of lynx trappings for 1821–1934 in Canada. Taken from Brockwell -& Davis (1991), this appears to be the series considered by Campbell & Walker -(1977). -
-year
Year of the record. -
lynx
Number of lynx trapped. -
Brockwell, P. J. and Davis, R. A. (1991). Time Series and Forecasting -Methods. Second edition. New York: Springer. Series G (page 557). -
-Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S -Language. Monterey: Wadsworth & Brooks/Cole. -
-Campbell, M. J. and Walker, A. M. (1977). A Survey of statistical work on -the Mackenzie River series of annual Canadian lynx trappings for the years -1821–1934 and a new analysis. Journal of the Royal Statistical Society -series A, 140, 411–431. -
-t = tblish.dataset.lynx; - -plot (t.year, t.lynx); -xlabel ("Year"); -ylabel ("Lynx Trapped"); - -
out =
morley ()
¶Michelson Speed of Light Data -
-A classical data set of Michelson (but not this one with Morley) on measurements -done in 1879 on the speed of light. The data consists of five experiments, -each consisting of 20 consecutive ‘runs’. The response is the speed of -light measurement, suitably coded (km/sec, with 299000 subtracted). -
-Expt
The experiment number, from 1 to 5. -
Run
The run number within each experiment. -
Speed
Speed-of-light measurement. -
The data is here viewed as a randomized block experiment with experiment
-and run
as the factors. run
may also be considered a quantitative
-variate to account for linear (or polynomial) changes in the measurement over
-the course of a single experiment.
-
A. J. Weekes (1986). A Genstat Primer. London: Edward Arnold. -
-S. M. Stigler (1977). Do robust estimators work with real data? Annals -of Statistics 5, 1055–1098. (See Table 6.) -
-A. A. Michelson (1882). Experimental determination of the velocity of -light made at the United States Naval Academy, Annapolis. Astronomic -Papers, 1, 135–8. U.S. Nautical Almanac Office. (See Table 24.). -
-t = tblish.dataset.morley; - -# TODO: Port to Octave - -
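Because the `Speed` values are coded with 299000 subtracted, recovering measurements in km/sec is a single addition. The sketch below uses a few illustrative coded values typed in by hand rather than loading the data set; with Tablicious available, the same arithmetic would presumably apply to `t.Speed`.

```octave
# Illustrative coded speed values (km/sec with 299000 subtracted)
coded_speed = [850 740 900];

# Undo the coding to get the speed-of-light measurements in km/sec
speed_km_s = coded_speed + 299000;

printf ("%d km/sec\n", speed_km_s);
```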
out =
mtcars ()
¶Motor Trend 1974 Car Road Tests -
-The data was extracted from the 1974 Motor Trend US magazine, and -comprises fuel consumption and 10 aspects of automobile design and -performance for 32 automobiles (1973–74 models). -
-mpg
Fuel efficiency in miles/gallon -
cyl
Number of cylinders -
disp
Displacement (cu. in.) -
hp
Gross horsepower -
drat
Rear axle ratio -
wt
Weight (1,000 lbs) -
qsec
1/4 mile time -
vs
Engine type (0 = V-shaped, 1 = straight) -
am
Transmission type (0 = automatic, 1 = manual) -
gear
Number of forward gears -
carb
Number of carburetors -
Henderson and Velleman (1981) comment in a footnote to Table 1: “Hocking -[original transcriber]’s noncrucial coding of the Mazda’s rotary engine -as a straight six-cylinder engine and the Porsche’s flat engine as a V -engine, as well as the inclusion of the diesel Mercedes 240D, have been -retained to enable direct comparisons to be made with previous analyses.” -
-Henderson and Velleman (1981). Building multiple regression models -interactively. Biometrics, 37, 391–411. -
-# TODO: Port this example from R -
out =
nhtemp ()
¶Average Yearly Temperatures in New Haven -
-The mean annual temperature in degrees Fahrenheit in New Haven, Connecticut, -from 1912 to 1971. -
-year
Year of the observation. -
temp
Mean annual temperature (degrees F). -
Vaux, J. E. and Brinker, N. B. (1972) Cycles, 1972, 117–121. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-t = tblish.dataset.nhtemp; - -plot (t.year, t.temp); -title ("nhtemp data"); -xlabel ("Year"); -ylabel ("Mean annual temperature in New Haven, CT (deg. F)"); - -
out =
nottem ()
¶Average Monthly Temperatures at Nottingham, 1920-1939 -
-A time series object containing average air temperatures at -Nottingham Castle in degrees Fahrenheit for 20 years. -
-record
Index of the record. -
lead
Leading indicator. -
sales
Sales volume. -
Anderson, O. D. (1976). Time Series Analysis and Forecasting: -The Box-Jenkins approach. London: Butterworths. Series R. -
-# TODO: Come up with example code here - -
out =
npk ()
¶Classical N, P, K Factorial Experiment -
-A classical N, P, K (nitrogen, phosphate, potassium) factorial experiment -on the growth of peas conducted on 6 blocks. Each half of a fractional -factorial design confounding the NPK interaction was used on 3 of the plots. -
-block
Which block (1 to 6). -
N
Indicator (0/1) for the application of nitrogen. -
P
Indicator (0/1) for the application of phosphate. -
K
Indicator (0/1) for the application of potassium. -
yield
Yield of peas, in pounds/plot. Plots were 1/70 acre. -
Imperial College, London, M.Sc. exercise sheet. -
-Venables, W. N. and Ripley, B. D. (2002). Modern Applied Statistics -with S. Fourth edition. New York: Springer. -
-t = tblish.dataset.npk; - -# TODO: Port aov() and LM to Octave - -
out =
occupationalStatus ()
¶Occupational Status of Fathers and their Sons -
-Cross-classification of a sample of British males according to each subject’s -occupational status and his father’s occupational status. -
-An 8-by-8 matrix of counts, with classifying factors origin
(father’s
-occupational status, levels 1:8) and destination
(son’s
-occupational status, levels 1:8).
-
Goodman, L. A. (1979). Simple Models for the Analysis of Association in -Cross-Classifications having Ordered Categories. J. Am. Stat. -Assoc., 74 (367), 537–552. -
-# TODO: Come up with example code here - -
out =
precip ()
¶Annual Precipitation in US Cities -
-The average amount of precipitation (rainfall) in inches for each of 70 United -States (and Puerto Rico) cities. -
-city
City observed. -
precip
Annual precipitation (in). -
Statistical Abstracts of the United States, 1975. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-t = tblish.dataset.precip; - -# TODO: Port dot plot to Octave - -
out =
presidents ()
¶Quarterly Approval Ratings of US Presidents -
-The (approximately) quarterly approval rating for the President of the United -States from the first quarter of 1945 to the last quarter of 1974. -
-date
Approximate date of the observation. -
approval
Approval rating (%). -
The data are actually a fudged version of the approval ratings. See McNeil’s book -for details. -
-The Gallup Organisation. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-t = tblish.dataset.presidents; - -figure -plot (datenum (t.date), t.approval) -datetick ("x") -xlabel ("Date") -ylabel ("Approval rating (%)") -title ("presidents data") - -
out =
pressure ()
¶Vapor Pressure of Mercury as a Function of Temperature -
-Data on the relation between temperature in degrees Celsius and vapor pressure -of mercury in millimeters (of mercury). -
-temperature
Temperature (deg C). -
pressure
Pressure (mm Hg). -
Weast, R. C., ed. (1973). Handbook of Chemistry and Physics. Cleveland: CRC Press. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-t = tblish.dataset.pressure; - -figure -plot (t.temperature, t.pressure) -xlabel ("Temperature (deg C)") -ylabel ("Pressure (mm of Hg)") -title ("pressure data: Vapor Pressure of Mercury") - -figure -semilogy (t.temperature, t.pressure) -xlabel ("Temperature (deg C)") -ylabel ("Pressure (mm of Hg)") -title ("pressure data: Vapor Pressure of Mercury") - - -
out =
quakes ()
¶Locations of Earthquakes off Fiji -
-The data set gives the locations of 1000 seismic events of MB > 4.0. The events -occurred in a cube near Fiji since 1964. -
-lat
Latitude of event. -
long
Longitude of event. -
depth
Depth (km). -
mag
Richter magnitude. -
stations
Number of stations reporting. -
There are two clear planes of seismic activity. One is a major plate junction; -the other is the Tonga trench off New Zealand. These data constitute a subsample -from a larger dataset containing 5000 observations. -
-This is one of the Harvard PRIM-H project data sets. They in turn obtained it -from Dr. John Woodhouse, Dept. of Geophysics, Harvard University. -
-G. E. P. Box and G. M. Jenkins (1976). Time Series Analysis, Forecasting and -Control. San Francisco: Holden-Day. p. 537. -
-P. J. Brockwell and R. A. Davis (1991). Time Series: Theory and Methods. -Second edition. New York: Springer-Verlag. p. 414. -
-# TODO: Come up with example code here - -
out =
randu ()
¶Random Numbers from Congruential Generator RANDU -
-400 triples of successive random numbers were taken from the VAX FORTRAN -function RANDU running under VMS 1.5. -
-record
Index of the record. -
x
X value of the triple. -
y
Y value of the triple. -
z
Z value of the triple. -
In three dimensional displays it is evident that the triples fall on 15 -parallel planes in 3-space. This can be shown theoretically to be true -for all triples from the RANDU generator. -
-These particular 400 triples start 5 apart in the sequence, that is they -are ((U[5i+1], U[5i+2], U[5i+3]), i= 0, ..., 399), and they are rounded -to 6 decimal places. -
-Under VMS versions 2.0 and higher, this problem has been fixed. -
-David Donoho -
-t = tblish.dataset.randu; - - -
out =
rivers ()
¶Lengths of Major North American Rivers -
-This data set gives the lengths (in miles) of 141 “major” rivers in North -America, as compiled by the US Geological Survey. -
-rivers
A vector containing 141 observations. -
World Almanac and Book of Facts, 1975, page 406. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-tblish.dataset.rivers; - -longest_river = max (rivers) -shortest_river = min (rivers) - -
out =
rock ()
¶Measurements on Petroleum Rock Samples -
-Measurements on 48 rock samples from a petroleum reservoir. -
-area
Area of pores space, in pixels out of 256 by 256. -
peri
Perimeter in pixels. -
shape
Perimeter/sqrt(area). -
perm
Permeability in milli-Darcies. -
Twelve core samples from petroleum reservoirs were sampled by 4 -cross-sections. Each core sample was measured for permeability, and each -cross-section has total area of pores, total perimeter of pores, and shape. -
-Data from BP Research, image analysis by Ronit Katz, U. Oxford. -
-t = tblish.dataset.rock; - -figure -scatter (t.area, t.perm) -xlabel ("Area of pores space (pixels out of 256x256)") -ylabel ("Permeability (milli-Darcies)") - -
out =
sleep ()
¶Student’s Sleep Data -
-Data which show the effect of two soporific drugs (increase in hours of sleep -compared to control) on 10 patients. -
-id
Patient ID. -
group
Drug given. -
extra
Increase in hours of sleep. -
The group
variable name may be misleading about the data: They
-represent measurements on 10 persons, not in groups.
-
Cushny, A. R. and Peebles, A. R. (1905). The action of optical isomers: -II hyoscines. The Journal of Physiology, 32, 501–510. -
-Student (1908). The probable error of the mean. Biometrika, 6, 20. -
-Scheffé, Henry (1959). The Analysis of Variance. New York, NY: Wiley. -
-t = tblish.dataset.sleep; - -# TODO: Port to Octave - -
out =
stackloss ()
¶Brownlee’s Stack Loss Plant Data -
-Operational data of a plant for the oxidation of ammonia to nitric acid. -
-AirFlow
Flow of cooling air. -
WaterTemp
Cooling Water Inlet temperature. -
AcidConc
Concentration of acid (per 1000, minus 500). -
StackLoss
Stack loss -
“Obtained from 21 days of operation of a plant for the oxidation of ammonia -(NH3) to nitric acid (HNO3). The nitric oxides produced are absorbed in a -countercurrent absorption tower”. (Brownlee, cited by Dodge, slightly reformatted by MM.) -
-AirFlow
represents the rate of operation of the plant. WaterTemp
is the
-temperature of cooling water circulated through coils in the absorption tower.
-AcidConc
is the concentration of the acid circulating, minus 50, times 10:
-that is, 89 corresponds to 58.9 per cent acid. StackLoss
(the dependent variable)
-is 10 times the percentage of the ingoing ammonia to the plant that escapes from
-the absorption column unabsorbed; that is, an (inverse) measure of the over-all
-efficiency of the plant.
-
Brownlee, K. A. (1960, 2nd ed. 1965). Statistical Theory and Methodology -in Science and Engineering. New York: Wiley. pp. 491–500. -
-Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S -Language. Monterey: Wadsworth & Brooks/Cole. -
-Dodge, Y. (1996). The guinea pig of multiple regression. In: Robust -Statistics, Data Analysis, and Computer Intensive Methods; In Honor of -Peter Huber’s 60th Birthday, 1996, Lecture Notes in Statistics -109, Springer-Verlag, New York. -
-t = tblish.dataset.stackloss; - -# TODO: Create linear model and print summary - -
out =
state ()
¶US State Facts and Figures -
-Data related to the 50 states of the United States of America. -
-abb
State abbreviation. -
name
State name. -
area
Area (sq mi). -
lat
Approximate center (latitude). -
lon
Approximate center (longitude). -
division
State division. -
region
State region. -
Population
Population estimate as of July 1, 1975. -
Income
Per capita income (1974). -
Illiteracy
Illiteracy as of 1970 (percent of population). -
LifeExp
Life expectancy in years (1969-71). -
Murder
Murder and non-negligent manslaughter rate per 100,000 population (1976). -
HSGrad
Percent high-school graduates (1970). -
Frost
Mean number of days with minimum temperature below freezing (1931-1960) -in capital or large city. -
U.S. Department of Commerce, Bureau of the Census (1977) Statistical -Abstract of the United States. -
-U.S. Department of Commerce, Bureau of the Census (1977) County -and City Data Book. -
-Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S -Language. Monterey: Wadsworth & Brooks/Cole. -
-t = tblish.dataset.state; - -
out =
sunspot_month ()
¶Monthly Sunspot Data, from 1749 to “Present” -
-Monthly numbers of sunspots, as from the World Data Center, aka SIDC. This -is the version of the data that may occasionally be updated when new counts -become available. -
-month
Month of the observation. -
sunspots
Number of sunspots. -
WDC-SILSO, Solar Influences Data Analysis Center (SIDC), Royal Observatory -of Belgium, Av. Circulaire, 3, B-1180 BRUSSELS. -Currently at http://www.sidc.be/silso/datafiles. -
-t = tblish.dataset.sunspot_month; - - -
out =
sunspot_year ()
¶Yearly Sunspot Data, 1700-1988 -
-Yearly numbers of sunspots from 1700 to 1988 (rounded to one digit). -
-year
Year of the observation. -
sunspots
Number of sunspots. -
H. Tong (1996) Non-Linear Time Series. Clarendon Press, Oxford, p. 471. -
-t = tblish.dataset.sunspot_year; - -figure -plot (t.year, t.sunspots) -xlabel ("Year") -ylabel ("Sunspots") - -
out =
sunspots ()
¶Monthly Sunspot Numbers, 1749-1983 -
-Monthly mean relative sunspot numbers from 1749 to 1983. Collected at Swiss -Federal Observatory, Zurich until 1960, then Tokyo Astronomical Observatory. -
-month
Month of the observation. -
sunspots
Number of observed sunspots. -
Andrews, D. F. and Herzberg, A. M. (1985) Data: A Collection -of Problems from Many Fields for the Student and Research Worker. -New York: Springer-Verlag. -
-t = tblish.dataset.sunspots; - -figure -plot (datenum (t.month), t.sunspots) -datetick ("x") -xlabel ("Date") -ylabel ("Monthly sunspot numbers") -title ("sunspots data") - - -
out =
swiss ()
¶Swiss Fertility and Socioeconomic Indicators (1888) Data -
-Standardized fertility measure and socio-economic indicators for each of 47 -French-speaking provinces of Switzerland at about 1888. -
-Fertility
Ig, ‘common standardized fertility measure’. -
Agriculture
% of males involved in agriculture as occupation. -
Examination
% draftees receiving highest mark on army examination. -
Education
% education beyond primary school for draftees. -
Catholic
% ‘Catholic’ (as opposed to ‘Protestant’). -
InfantMortality
Live births who live less than 1 year. -
All variables but ‘Fertility’ give proportions of the population. -
-(paraphrasing Mosteller and Tukey): -
-Switzerland, in 1888, was entering a period known as the demographic transition; -i.e., its fertility was beginning to fall from the high level typical of -underdeveloped countries. -
-The data collected are for 47 French-speaking “provinces” at about 1888. -
-Here, all variables are scaled to [0, 100], where in the original, all but
-Catholic
were scaled to [0, 1].
-
Files for all 182 districts in 1888 and other years have been available at -https://opr.princeton.edu/archive/pefp/switz.aspx. -
-They state that variables Examination
and Education
are averages
-for 1887, 1888 and 1889.
-
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S -Language. Monterey: Wadsworth & Brooks/Cole. -
-t = tblish.dataset.swiss; - -# TODO: Port linear model to Octave - -
out =
treering ()
¶Yearly Treering Data, -6000-1979 -
-Contains normalized tree-ring widths in dimensionless units. -
-A univariate time series with 7981 observations. -
-Each tree ring corresponds to one year. -
-The data were recorded by Donald A. Graybill, 1980, from Gt Basin -Bristlecone Pine 2805M, 3726-11810 in Methuselah Walk, California. -
-Time Series Data Library: http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/, -series ‘CA535.DAT’. -
-For some photos of Methuselah Walk see -https://web.archive.org/web/20110523225828/http://www.ltrr.arizona.edu/~hallman/sitephotos/meth.html. -
-t = tblish.dataset.treering; - -
out =
trees ()
¶Diameter, Height and Volume for Black Cherry Trees -
-This data set provides measurements of the diameter, height and volume of -timber in 31 felled black cherry trees. Note that the diameter (in inches) -is erroneously labelled Girth in the data. It is measured at 4 ft 6 in -above the ground. -
-Girth
Tree diameter (rather than girth, actually) in inches. -
Height
Height in ft. -
Volume
Volume of timber in cubic feet. -
Ryan, T. A., Joiner, B. L. and Ryan, B. F. (1976). The Minitab -Student Handbook. Duxbury Press. -
-Atkinson, A. C. (1985). Plots, Transformations and Regression. -Oxford: Oxford University Press. -
-t = tblish.dataset.trees; - -figure -tblish.examples.plot_pairs (t); - -figure -loglog (t.Girth, t.Volume) -xlabel ("Girth") -ylabel ("Volume") - -# TODO: Transform to log space for the coplot - -# TODO: Linear model - -
out =
uspop ()
¶Populations Recorded by the US Census -
-This data set gives the population of the United States -(in millions) as recorded by the decennial census for the period 1790–1970. -
-year
Year of the census. -
population
Population, in millions. -
McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-t = tblish.dataset.uspop; - -figure -semilogy (t.year, t.population) -xlabel ("Year") -ylabel ("U.S. Population (millions)") - -
out =
volcano ()
¶Topographic Information on Auckland’s Maunga Whau Volcano -
-Maunga Whau (Mt Eden) is one of about 50 volcanos in the Auckland volcanic -field. This data set gives topographic information for Maunga Whau on a -10m by 10m grid. -
-A matrix volcano
with 87 rows and 61 columns, rows corresponding
-to grid lines running east to west and columns to grid lines running south
-to north.
-
Digitized from a topographic map by Ross Ihaka. These data should not be regarded as accurate. -
-Box, G. E. P. and Jenkins, G. M. (1976). Time Series Analysis, Forecasting and -Control. San Francisco: Holden-Day. p. 537. -
-Brockwell, P. J. and Davis, R. A. (1991). Time Series: Theory and Methods. -Second edition. New York: Springer-Verlag. p. 414. -
-tblish.dataset.volcano; - -# TODO: Figure out how to do a topo map in Octave. Just a gridded color plot -# should be fine. And then maybe do a 3-d mesh plot. - -
out =
warpbreaks ()
¶The Number of Breaks in Yarn during Weaving -
-This data set gives the number of warp breaks per loom, where a loom -corresponds to a fixed length of yarn. -
-wool
Type of wool (A or B). -
tension
The level of tension (L, M, H). -
breaks
Number of breaks. -
There are measurements on 9 looms for each of the six types of warp (AL, AM, AH, BL, BM, BH). -
-Tippett, L. H. C. (1950). Technological Applications of Statistics. -New York: Wiley. Page 106. -
-Tukey, J. W. (1977). Exploratory Data Analysis. Reading, Mass: Addison-Wesley. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-t = tblish.dataset.warpbreaks; - -summary (t) - -# TODO: Port the plotting code and OPAR to Octave - -
out =
women ()
¶Average Heights and Weights for American Women -
-This data set gives the average heights and weights for American women aged 30–39. -
-height
Height (in). -
weight
Weight (lbs). -
The data set appears to have been taken from the American Society of Actuaries -Build and Blood Pressure Study for some (unknown to us) earlier year. -
-The World Almanac notes: “The figures represent weights in ordinary indoor -clothing and shoes, and heights with shoes”. -
-The World Almanac and Book of Facts, 1975. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-t = tblish.dataset.women; - -figure -scatter (t.height, t.weight) -xlabel ("Height (in)") -ylabel ("Weight (lb)") -title ("women data: American women aged 30-39") - -
out =
zCO2 ()
¶Carbon Dioxide Uptake in Grass Plants -
-The CO2
data set has 84 rows and 5 columns of data from an experiment
-on the cold tolerance of the grass species Echinochloa crus-galli.
-
The CO2 uptake of six plants from Quebec and six plants from Mississippi was -measured at several levels of ambient CO2 concentration. Half the plants of -each type were chilled overnight before the experiment was conducted. -
-Potvin, C., Lechowicz, M. J. and Tardif, S. (1990). The statistical -analysis of ecophysiological response curves obtained from experiments -involving repeated measures. Ecology, 71, 1389–1400. -
-Pinheiro, J. C. and Bates, D. M. (2000). Mixed-effects Models -in S and S-PLUS. New York: Springer. -
-t = tblish.dataset.zCO2; - -# TODO: Coplot -# TODO: Port the linear model to Octave - -
Example dataset collection. -
-tblish.datasets
is a collection of example datasets to go with the
-Tablicious package.
-
The tblish.datasets
class provides methods for listing and loading
-the example datasets.
-
out =
tblish.evalWithTableVars (tbl, expr)
¶Evaluate an expression against a table array’s variables. -
-Evaluates the M-code expression expr in a workspace where all of tbl’s -variables have been assigned to workspace variables. -
-expr is a charvec containing an Octave expression. -
-As an implementation detail, the workspace will also contain some variables -that are prefixed and suffixed with "__". So try to avoid those in your -table variable names. -
-Returns the result of the evaluation. -
-Examples: -
-[s,p,sp] = tblish.examples.SpDb -tmp = join (sp, p); -shipment_weight = tblish.evalWithTableVars (tmp, "Qty .* Weight") -
See also: table.restrict -
-spdb =
tblish.examples.SpDb ()
¶[s, p, sp] =
tblish.examples.SpDb ()
¶The classic Suppliers-Parts example database. -
-Constructs the classic C. J. Date Suppliers-Parts ("SP") example database as tables. -This database is the one used as an example throughout Date’s "An Introduction to -Database Systems" textbook. -
-Returns the database as a set of three table arrays. If one argout is captured, the -tables are returned in the fields of a single struct. If multiple argouts are captured, the -tables are returned as three argouts with a single table in each, in the order (s, -p, sp). -
-[fig, hax] =
tblish.examples.coplot (tbl, xvar, yvar, gvar)
¶[fig, hax] =
tblish.examples.coplot (fig, tbl, xvar, yvar, gvar)
¶[fig, hax] =
tblish.examples.coplot (…, OptionName, OptionValue, …)
¶Conditioning plot. -
-tblish.examples.coplot
produces conditioning plots. This is a kind of plot that breaks up the
-data into groups based on one or two grouping variables, and plots each group of data
-in a separate subplot.
-
tbl is a table
containing the data to plot.
-
xvar is the name of the table variable within tbl to use as the X values. -May be a variable name or index. -
-yvar is the name of the table variable within tbl to use as the Y values. -May be a variable name or index. -
-gvar is the name of the table variable or variables within tbl to use as -the grouping variable(s). The grouping variables split the data into groups based on -the distinct values in those variables. gvar may specify either one or two -grouping variables (but not more). It can be provided as a charvec, cellstr, or index -array. Records with a missing value for their grouping variable(s) are ignored. -
-fig is the figure handle to plot into. If fig is not provided, a new figure -is created. -
-Name/Value options: -
-PlotFcn
The plotting function to use, supplied as a function handle. Defaults to @plot
.
-It must be a function that provides the signature fcn(hax, X, Y, …)
.
-
PlotArgs
A cell array of arguments to pass in to the plotting function, following the hax, -x, and y arguments. -
Returns: - fig – the figure handle it plotted into - hax – array of axes handles to all the axes for the subplots -
-out =
tblish.examples.plot_pairs (data)
¶out =
tblish.examples.plot_pairs (data, plot_type)
¶out =
tblish.examples.plot_pairs (fig, …)
¶Plot pairs of variables against each other. -
-data is the data holding the variables to plot. It may be either a
-table
or a struct. Each variable or field in the table
-or struct is considered to be one variable. Each must hold a vector, and
-all the vectors of all the variables must be the same size.
-
plot_type is a charvec indicating what plot type to do in each subplot.
-("scatter"
is the default.) Valid plot_type values are:
-
"scatter"
A plain scatter plot. -
"smooth"
A scatter plot + fitted line, like R’s panel.smooth
does.
-
fig is an optional figure handle to plot into. If omitted, a new -figure is created. -
-Returns the created figure, if the output is captured. -
-out =
tblish.sizeof2 (x)
¶Approximate size of an array in bytes, with object support. -
-This is an alternative to Octave’s sizeof
function that tries to provide
-meaningful support for objects, including the classes defined in Tablicious. It is
-named "sizeof2" instead of "sizeof" to avoid a "shadowing core function" warning
-when loading Tablicious, because it seems that Octave does not consider packages
-(namespaces) when detecting shadowed functions.
-
This may be supplemented or replaced by sizeof
override methods on Tablicious’s
-classes. I’m not sure whether Octave’s sizeof
supports extension by method
-overrides, so I’m not doing that yet. If that happens, this sizeof2
function
-will stick around in a deprecated state for a while, and it will respect those override
-methods.
-
For tables, this returns the sum of sizeof
for all of its variables’
-arrays, plus the size of the VariableNames and any other metadata stored in the table object.
-
This is currently broken for some types, because its implementation is in transition -from overridden methods on Tablicious’s objects to a separate function. -
-This is not supported, fully or at all, for all input types, but it has support -for the types defined in Tablicious, plus some Octave built-in types, and makes a -best effort at figuring out user-defined classdef objects. It currently does not -have extensibility support for customization by classdef classes, but that may be -added in the future, in which case its output may change significantly for classdef -objects in future releases. -
-x is an array of any type. -
-Returns a scalar numeric. Returns NaN for types that are known to not be supported, -instead of raising an error. Raises an error if it fails to determine the size of an -input of a type that it thought was supported. -
-See also: sizeof -
-[out] =
tblish.table.grpstats (tbl, groupvar)
¶[out] =
tblish.table.grpstats (…, 'DataVars'
, DataVars)
¶Statistics by group for a table array. -
-This is a table-specific implementation of grpstats
that works on table arrays.
-It is supplied as a function in the +tblish package to avoid colliding with
-the global grpstats
function supplied by the Statistics Octave Forge package.
-Depending on which version of the Statistics OF package you are using, it may or may
-not support table inputs to its grpstats
function. This function is supplied
-as an alternative you can use in an environment where table
arrays are not
-supported by the grpstats
that you have, though you need to make code changes
-and call it as tblish.table.grpstats(tbl)
instead of with a plain
-grpstats(tbl)
.
-
See also: table.groupby, table.findgroups, table.splitapply -
-out =
timezones ()
¶out =
timezones (area)
¶List all the time zones defined on this system. -
-This lists all the time zones that are defined in the IANA time zone database -used by this Octave. (On Linux and macOS, that will generally be the system -time zone database from /usr/share/zoneinfo. On Windows, it will be -the database redistributed with the Tablicious package.) -
-If the return is captured, the output is returned as a table if your Octave -has table support, or a struct if it does not. It will have fields/variables -containing column vectors: -
-Name
The IANA zone name, as cellstr. -
Area
The geographical area the zone is in, as cellstr. -
Compatibility note: Matlab also includes UTCOffset and DSTOffset fields in -the output; these are currently unimplemented. -
-out =
todatetime (x)
¶Convert input to a Tablicious datetime array, with convenient interface. -
-This is an alternative to the regular datetime constructor, with a signature -and conversion logic that Tablicious’s author likes better. -
-This mainly exists because datetime’s constructor signature does not accept -datenums, and instead treats one-arg numeric inputs as datevecs. (For compatibility -with Matlab’s interface.) I think that’s less convenient: datenums seem to be -more common than datevecs in M-code, and it returns an object array that’s not the -same size as the input. -
-Returns a datetime array whose size depends on the size and type of the input -array, but will generally be the same size as the array of strings or numerics -the input array "represents". -
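The difference from the plain constructor can be sketched as follows. This is a minimal illustration, assuming Tablicious is installed and loadable with `pkg load tablicious`; the date values are made up, and `dnums` is the underlying datenum field described in the datetime section of this manual.

```octave
pkg load tablicious

% A plain datenum (days since the Matlab epoch), hypothetical value
dn = datenum (2017, 1, 1);

% todatetime treats a numeric input as datenums, not as a datevec
d = todatetime (dn);

% The explicit conversion named elsewhere in this manual should agree
d2 = datetime.ofDatenum (dn);

% Both objects wrap the same underlying datenum
same_instant = isequal (d.dnums, d2.dnums)
```

The result has one element per input datenum, which is the size-preserving behavior the text above describes.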
-out =
vartype (type)
¶Filter by variable type for use in subscripting. -
-Creates an object that can be used for subscripting into the variables -dimension of a table and filtering on variable type. -
-type is the name of a type as charvec. This may be anything that
-the isa
function accepts, or 'cellstr'
to select cellstrs,
-as determined by iscellstr
.
-
Returns an object of an opaque type. Don’t worry about what type it is;
-just pass it into the second argument of a subscript into a table
-object.
-
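A short sketch of the subscripting pattern described above, assuming Tablicious is loaded; the variable names `n`, `s`, and `m` and their contents are hypothetical.

```octave
pkg load tablicious

% Three same-height variables of two different types (made-up data)
n = [1; 2; 3];
s = {"x"; "y"; "z"};
m = [0.5; 1.5; 2.5];
t = table (n, s, m);

% Pass the vartype object as the second (variables) subscript
numeric_part = t(:, vartype ("double"));   % keeps the double variables
text_part = t(:, vartype ("cellstr"));     % keeps the cellstr variable
```

Only the variables dimension is filtered here; the rows subscript is left as `:` so every record is kept.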
out =
vecfun (fcn, x, dim)
¶Apply function to vectors in array along arbitrary dimension. -
-This function is not implemented yet. -
-Applies a given function to the vector slices of an N-dimensional array, where -those slices are along a given dimension. -
-fcn is a function handle to apply. -
-x is an array of arbitrary type which is to be sliced and passed -in to fcn. -
-dim is the dimension along which the vector slices lie. -
-Returns the collected output of the fcn calls, which will be -the same size as x, but not necessarily the same type. -
-out =
years (x)
¶Create a duration
x years long, or get the years in a duration
-x.
-
If input is numeric, returns a duration
array in units of fixed-length
-years of 365.2425 days each.
-
If input is a duration
, converts the duration
to a number of fixed-length
-years as double.
-
Note: years
creates fixed-length years, which may not be what you want.
-To create a duration of calendar years (which account for actual leap days),
-use calyears
.
-
See calyears. -
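The two directions of `years` can be sketched like this, assuming Tablicious is loaded. The figure 365.2425 is the fixed year length stated above; the count of 2 years is an arbitrary illustration.

```octave
pkg load tablicious

% Numeric input: build a duration of 2 fixed-length years
d = years (2);

% duration input: convert back to a number of fixed-length years
n = years (d);

% The same span in days, using the fixed-length definition (plain arithmetic)
n_days = 2 * 365.2425;
```

Because the year length is fixed, this span ignores where leap days actually fall; `calyears` is the calendar-aware alternative mentioned above.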
This manual is for Tablicious, version 0.4.4-SNAPSHOT. -
- - ---Time is an illusion. Lunchtime doubly so. -
-
This is the manual for the Tablicious package version 0.4.4-SNAPSHOT for GNU Octave. -
-Tablicious provides somewhat-Matlab-compatible tabular data and date/time support for
-GNU Octave.
-This includes a table
class with support for filtering and join operations;
-datetime
, duration
, and related classes;
-Missing Data support; string
and categorical
data types;
-and other miscellaneous things.
-
This document is a work in progress. You are invited to help improve it and -submit patches. -
-Tablicious’s classes are designed to be convenient to use while still being efficient. -The data representations used by Tablicious are designed to be efficient and suitable -for working with large-ish data sets. A “large-ish” data set is one that can have -millions of elements or rows, but still fits in main computer memory. Tablicious’s main -relational and arithmetic operations are all implemented using vectorized -operations on primitive Octave data types. -
-Tablicious was written by Andrew Janke <floss@apjanke.net>. Support can be -found on the Tablicious project -GitHub page. -
- -The easiest way to obtain Tablicious is by using Octave’s pkg
package manager.
-To install the development prerelease of Tablicious, run this in Octave:
-
pkg install https://github.com/apjanke/octave-tablicious/releases/download/v0.4.4-SNAPSHOT/tablicious-0.4.4-SNAPSHOT.tar.gz -
(Check the releases page at https://github.com/apjanke/octave-tablicious/releases to -find out what the actual latest release number is.) -
-For development, you can obtain the source code for Tablicious from the project repo on GitHub at -https://github.com/apjanke/octave-tablicious. Make a local clone of the repo. -Then add the inst directory in the repo to your Octave path. -
- - -Tablicious provides the table
class for representing tabular data.
-
A table
is an array object that represents a tabular data structure. It holds
-multiple named “variables”, each of which is a column vector, or a 2-D matrix whose
-rows are read as records.
-
A table
is composed of multiple “variables”, each with a name, which all have
-the same number of rows. (A table
variable is like a “column” in SQL tables
-or in R or Python/pandas dataframes. Whenever you read “variable” here, think
-“column”.) Taken together, the i-th element or row of each variable compose
-a single record or observation.
-
Tables are good ways of arranging data if you have data that would otherwise be stored
-in a few separate variables which all need to be kept in the same shape and order,
-especially if you might want to do element-wise comparisons involving two or more of
-those variables. That’s basically all a table
is: it holds a collection of
-variables, and makes sure they are all kept aligned and ordered in the same way.
-
Tables are a lot like SQL tables or result sets, and are based on the same relational
-algebra theory that SQL is. Many common, even powerful, SQL operations can be done
-in Octave using table
arrays. It’s like having your own in-memory SQL engine.
-
There are two main ways to construct a table
array: build one up by combining
-multiple variables together, or convert an existing tabular-organized array into a
-table
.
-
To build an array from multiple variables, use the table(…)
constructor, passing
-in all of your variables as separate inputs. It takes any number of inputs. Each input
-becomes a table variable in the new table
object. If you pass your constructor
-inputs directly from variables, it automatically picks up their names and uses them
-as the table variable names. Otherwise, if you’re using more complex expressions, you’ll
-need to supply the 'VariableNames'
option.
-
To convert a tabular-organized array of another type into a table
, use the
-conversion functions like array2table
, struct2table
and cell2table
.
-array2table
and cell2table
take each column of the input array and turn
-it into a separate table variable in the resulting table
. struct2table
takes
-the fields of a struct and puts them into table variables.
-
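Both construction routes can be sketched as below. This is an illustration only, assuming Tablicious is loaded; the data and the variable names (`name`, `age`, `age_months`, `A`, `B`, `C`) are hypothetical.

```octave
pkg load tablicious

% Route 1: build from separate variables; names are picked up automatically
name = {"Alice"; "Bob"; "Carol"};
age = [30; 25; 41];
t1 = table (name, age);

% An expression has no variable name to pick up, so supply one explicitly
t2 = table (age * 12, 'VariableNames', {'age_months'});

% Route 2: convert an existing tabular array; each column becomes a variable
t3 = array2table (magic (3), 'VariableNames', {'A', 'B', 'C'});
```

Passing `'VariableNames'` to `array2table` is a choice made here to keep the resulting names predictable.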
-Here’s a table (ha!) of what SQL and relational algebra operations correspond to
-what Octave table
operations.
-
In this table, t
is a variable holding a table
array, and ix
is
-some indexing expression.
-
SQL | Relational | Octave table |
---|---|---|
SELECT | PROJECT | subsetvars , t(:,ix) |
WHERE | RESTRICT | subsetrows , t(ix,:) |
INNER JOIN | JOIN | innerjoin |
OUTER JOIN | OUTER JOIN | outerjoin |
FROM table1, table2, … | Cartesian product | cartesian |
GROUP BY | SUMMARIZE | groupby |
DISTINCT | (automatic) | unique(t) |
Note that there is one big difference between relational algebra and SQL & Octave
-table
: Relations in relational algebra are sets, not lists.
-There are no duplicate rows in relational algebra, and there is no ordering.
-So every operation there does an implicit DISTINCT
/unique()
on its
-results, and there’s no ORDER BY
/sort()
. This is not the case in SQL
-or Octave table
.
-
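A few rows of the correspondence table can be tried out as follows. This is a small sketch, assuming Tablicious is loaded; the table contents are made up and deliberately include one duplicated row.

```octave
pkg load tablicious

% Small table with a duplicated row (hypothetical data)
id = [1; 2; 2; 3];
label = {"a"; "b"; "b"; "c"};
t = table (id, label);

restricted = t(t.id > 1, :);   % WHERE / RESTRICT: keep rows with id > 1
projected = t(:, 2);           % SELECT / PROJECT: keep only the second variable
deduped = unique (t);          % DISTINCT: drop the duplicated row
```

Note that `projected` still has four rows: unlike relational algebra, the projection does not remove duplicates unless `unique` is called explicitly.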
Note for users coming from Matlab: Matlab does not provide a general groupby
-function. Instead, you have to variously use rowfun
, grpstats
,
-groupsummary
, and manual code to accomplish “group by” operations.
-
Note: I wrote this based on my understanding of relational algebra from reading -C. J. Date books. Other people’s understanding and terminology may differ. - apjanke -
- - -Tablicious provides the datetime
class for representing points in time.
-
There’s also duration
and calendarDuration
for representing
-periods or durations of time. Like vector quantities along the time line,
-as opposed to datetime
being a point along the time line.
-
datetime
Class ¶A datetime
is an array object that represents points in time in the familiar
-Gregorian calendar.
-
This is an attempt to reproduce the functionality of Matlab’s datetime
. It
-also contains some Octave-specific extensions.
-
The underlying representation is that of a datenum (a double
-containing the number of days since the Matlab epoch), but encapsulating it in an
-object provides several benefits: friendly human-readable display, type safety,
-automatic type conversion, and time zone support. In addition to the underlying
-datenum array, a datetime
includes an optional TimeZone
property
-indicating what time zone the datetimes are in.
-
So, basically, a datetime
is an object wrapper around a datenum array,
-plus time zone support.
-
While the underlying data representation of datetime
is compatible with
-(in fact, identical to) that of datenums, you cannot directly combine them
-via assignment, concatenation, or most arithmetic operations.
-
This is because of the signature of the datetime
constructor. When combining
-objects and primitive types like double
, the primitive type is promoted to an
-object by calling the other object’s one-argument constructor on it. However, the
-one-argument numeric-input constructor for datetime
does not accept datenums:
-it interprets its input as datevecs instead. This is due to a design decision on
-Matlab’s part; for compatibility, Octave does not alter that interface.
-
To combine datetime
s with datenums, you can convert the datenums to datetime
s
-by calling datetime.ofDatenum
or datetime(x, 'ConvertFrom', 'datenum')
, or you
-can convert the datetime
s to datenums by accessing their dnums
field with
-x.dnums
.
-
Examples: -
-dt = datetime('2011-03-04')
-dn = datenum('2017-01-01')
-[dt dn]
- ⇒ error: datenum: expected date vector containing [YEAR, MONTH, DAY, HOUR, MINUTE, SECOND]
-[dt datetime.ofDatenum(dn)]
- ⇒ 04-Mar-2011 01-Jan-2017
-
Also, if you have a zoned datetime
, you can’t combine it with a datenum, because datenums
-do not carry time zone information.
-
Tablicious has support for representing dates in time zones and for converting between time zones. -
-A datetime
may be "zoned" or "zoneless". A zoneless datetime
does not have a time zone
-associated with it. This is represented by an empty TimeZone
property on the datetime
-object. A zoneless datetime
represents the local time in some unknown time zone, and assumes a
-continuous time scale (no DST shifts).
-
A zoned datetime
is associated with a time zone. It is represented by having the time zone’s
-IANA zone identifier (e.g. 'UTC'
or 'America/New_York'
) in its TimeZone
-property. A zoned datetime
represents the local time in that time zone.
-
By default, the datetime
constructor creates unzoned datetime
s. To
-make a zoned datetime
, either pass the 'TimeZone'
option to the constructor,
-or set the TimeZone
property after object creation. Setting the TimeZone
-property on a zoneless datetime
declares that it’s a local time in that time zone.
-Setting the TimeZone
property on a zoned datetime
 to a different zone converts it to the local time in that zone; clearing the property turns it back into a
-zoneless datetime
without changing the local time it represents.
-
You can tell a zoned datetime from a zoneless one in the object display because the time zone
-is included for zoned datetime
s.
-
% Create an unzoned datetime
-d = datetime('2011-03-04 06:00:00')
- ⇒ 04-Mar-2011 06:00:00
-
-% Create a zoned datetime
-d_ny = datetime('2011-03-04 06:00:00', 'TimeZone', 'America/New_York')
- ⇒ 04-Mar-2011 06:00:00 America/New_York
-% This is equivalent
-d_ny = datetime('2011-03-04 06:00:00');
-d_ny.TimeZone = 'America/New_York'
- ⇒ 04-Mar-2011 06:00:00 America/New_York
-
-% Convert it to Chicago time
-d_chi = d_ny;
-d_chi.TimeZone = 'America/Chicago'
- ⇒ 04-Mar-2011 05:00:00 America/Chicago
-
When you combine two zoned datetime
s via concatenation, assignment, or
-arithmetic, if their time zones differ, they are converted to the time zone of
-the left-hand input.
-
d_ny = datetime('2011-03-04 06:00:00', 'TimeZone', 'America/New_York')
-d_la = datetime('2011-03-04 06:00:00', 'TimeZone', 'America/Los_Angeles')
-d_la - d_ny
- ⇒ 03:00:00
-
You cannot combine a zoned and an unzoned datetime
. This results in an error
-being raised.
-
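A minimal sketch of the zoned/unzoned rule described above, assuming Tablicious is installed and loaded with `pkg load tablicious`. The variable names are illustrative, and the exact error text is not documented here, so the sketch only reports whatever message is raised.

```octave
pkg load tablicious

% One unzoned and one zoned datetime for the same wall-clock time
d_plain = datetime ('2011-03-04 06:00:00');
d_ny = datetime ('2011-03-04 06:00:00', 'TimeZone', 'America/New_York');

% Concatenating the two is expected to raise an error
raised = false;
try
  mixed = [d_plain d_ny];
catch err
  raised = true;
  printf ("Could not combine: %s\n", err.message);
end
```

To combine the two values, one option is to declare a zone on the unzoned value first by setting its TimeZone property.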
-- - - -Warning: Normalization of "nonexistent" times (like between 02:00 and 03:00 on a "spring forward" -DST change day) is not implemented yet. The results of converting a zoneless local time -into a time zone where that local time did not exist are currently undefined. -
Tablicious’s time zone data is drawn from the IANA Time Zone Database, also known as the “Olson Database”. Tablicious includes a -copy of this database in its distribution so it can work on Windows, which does -not supply it like Unix systems do. -
-You can use the timezones
function to list the time zones known to Tablicious. These will be
-all the time zones in the IANA database on your system (for Linux and macOS) or in the IANA
-time zone database redistributed with Tablicious (for Windows).
-
-- -Note: The IANA Time Zone Database only covers dates from about the year 1880 to 2038. Converting -time zones for
datetime
s outside that range is currently unimplemented. (Tablicious -needs to add support for proleptic POSIX time zone rules, which are used to govern -behavior outside that date range.) -
duration
Class ¶A duration
represents a period of time in fixed-length seconds (or minutes, hours,
-or whatever you want to measure it in.)
-
A duration
has a resolution of about a nanosecond for typical dates. The underlying
-representation is a double
representing the number of days elapsed, similar to a
-datenum, except it’s interpreted as relative to some other reference point you provide,
-instead of being relative to the Matlab/Octave epoch.
-
You can add or subtract a duration
to a datetime
to get another datetime
.
-You can also add or subtract durations
to each other.
-
calendarDuration
Class ¶A calendarDuration
represents a period of time in variable-length calendar
-components. For example, years and months can have varying numbers of days, and days
-in time zones with Daylight Saving Time have varying numbers of hours. A
-calendarDuration
does arithmetic with "whole" calendar periods.
-
calendarDuration
s and duration
s cannot be directly combined, because
-they are not semantically equivalent. (This may be relaxed in the future to allow
-duration
s to be interpreted as numbers of days when combined with
-calendarDuration
s.)
-
d = datetime('2011-03-04 00:00:00')
- ⇒ 04-Mar-2011
-cdur = calendarDuration(1, 3, 0)
- ⇒ 1y 3mo
-d2 = d + cdur
- ⇒ 04-Jun-2012
-
Tablicious provides several validation functions which can be used to check properties -of function arguments, variables, object properties, and other expressions. These can -be used to express invariants in your program and catch problems due to input errors, -incorrect function usage, or other bugs. -
-These validation functions are named following the pattern mustBeXxx
, where Xxx
-is some property of the input it is testing. Validation functions may check the type,
-size, or other aspects of their inputs.
-
The most common place for validation functions to be used will probably be at the -beginning of functions, to check the input arguments and ensure that the contract of -the function is not being violated. If in the future Octave gains the ability to -declaratively express object property constraints, they will also be of use there. -
-Be careful not to get too aggressive with the use of validation functions: while using -them can make sure invariants are followed and your program is correct, they also reduce -the code’s ability to make use of duck typing, reducing its flexibility. Whether you want -to make this trade-off is a design decision you will have to consider. -
-When a validation function’s condition is violated, it raises an error that includes a
-description of the violation in the error message. This message will include a label for
-the input that describes what is being tested. By default, this label is initialized
-with inputname()
, so when you are calling a validator on a function argument or
-variable, you will generally not need to supply a label. But if you’re calling it on
-an object property or an expression more complex than a simple variable reference, the
-validator cannot automatically detect the input name for use in the label. In this case,
-make use of the optional trailing argument(s) to the functions to manually supply a
-label for the value being tested.
-
% Validation of a simple variable does not need a label
-mustBeScalar (x);
-% Validation of a field or property reference does need a label
-mustBeScalar (this.foo, 'this.foo');
-
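As a sketch of the "validate at the beginning of a function" pattern described above, the following assumes Tablicious is loaded. The function `scale_values` is a hypothetical example and not part of Tablicious; only `mustBeNumeric` and `mustBeScalar` come from the package.

```octave
pkg load tablicious

% Hypothetical helper: multiplies x by a scalar factor
function out = scale_values (x, factor)
  % Check the function's contract before doing any work
  mustBeNumeric (x);
  mustBeNumeric (factor);
  mustBeScalar (factor);
  out = x * factor;
endfunction

disp (scale_values ([1 2 3], 2));

% A non-scalar factor violates the contract and raises an error
try
  scale_values ([1 2 3], [2 3]);
catch err
  disp (err.message);
end
```

Because the arguments are plain variables, no labels need to be passed; the error message should name the offending input on its own.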
Tablicious comes with several example data sets that you can use to explore how
-its functions and objects work. These are accessed through the
-tblish.datasets
and tblish.dataset
classes.
-
To see a list of the available data sets, run tblish.datasets.list()
.
-Then to load one of the example data sets, run
-tblish.datasets.load('examplename')
. For example:
-
tblish.datasets.list
-t = tblish.datasets.load('cupcake')
-
You can also load it by calling tblish.dataset.<name>
. This does
-the same thing. For example:
-
t = tblish.dataset.cupcake -
When you load a data set, it either returns all its data in a single variable -(if you capture it), or loads its data into one or more variables in your -workspace (if you call it with no outputs). -
-Each example data set comes with help text that describes the data set and
-provides examples of how to work with it. This help is found using the doc
-command on tblish.dataset.<name>
, where <name> is the name of
-the data set.
-
For example: -
-doc tblish.dataset.cupcake -
(The command help tblish.dataset.<name>
ought to work too, but it
-currently doesn’t. This may be due to an issue with Octave’s help
-command.)
-
Many of Tablicious’ example data sets are based on the example datasets
-found in R’s datasets
package. R can be found at
-https://www.r-project.org/, and documentation for its datasets
-is at https://rdrr.io/r/datasets/datasets-package.html.
-Thanks to the R developers for producing the original data sets here.
-
Tablicious’ examples’ code tries to replicate the R examples, so it can -be useful to compare the two of them if you are moving from one language to -another. -
-Core Octave currently lacks some of the plotting features found in the R -examples, such as LOWESS smoothing and linear model characteristic plots, so -you will just find “TODO” placeholders for these in Tablicious’ example code. -
-Tablicious is based on Matlab’s table and date/time APIs and supports some of -their major functionality. -But not all of it is implemented yet. The missing parts are currently: -
-readtable()
and writetable()
-summary()
categorical
-.
-indexing
-timetable
-'ConvertFrom'
forms for datetime
and duration
constructors
-datetime
-between
-caldiff
-dateshift
-week
-isdst
, isweekend
-calendarDuration.split
-duration.Format
support
-fillmissing
-UTCOffset
and DSTOffset
fields in the output of timezones()
-It is the author’s hope that many of these will be implemented some day. -
-These areas of missing functionality are tracked on the Tablicious issue -tracker at https://github.com/apjanke/octave-tablicious/issues and -https://github.com/users/apjanke/projects/3. -
- -Tabular data array containing multiple columnar variables. -
-See table. -
Convert an array to a table. -
-See array2table. -
Convert a cell array to a table. -
-See cell2table. -
Convert struct to a table. -
-See struct2table. -
See tableOuterFillValue. -
Filter by variable type for use in subscripting. -
-See vartype. -
True if input is a ‘table’ array or other table-like type, false otherwise. -
-See istable. -
True if input is a ‘timetable’ array or other timetable-like type, false otherwise. -
-See istimetable. -
True if input is either a ‘table’ or ‘timetable’ array, or an object like them. -
-See istabular. -
Evaluate an expression against a table array’s variables. -
-See tblish.evalWithTableVars. -
Statistics by group for a table array. -
-See tblish.table.grpstats. -
A string array of Unicode strings. -
-See string. -
“Not-a-String”. -
-See NaS. -
Test if strings contain a pattern. -
-See contains. -
Display strings for array. -
-See dispstrs. -
Categorical variable array. -
-See categorical. -
True if input is a ‘categorical’ array, false otherwise. -
-See iscategorical. -
“Not-a-Categorical”. -
-See NaC. -
Group data into discrete bins or categories. -
-See discretize. -
Represents points in time using the Gregorian calendar. -
-See datetime. -
“Not-a-Time”. -
-See NaT. -
Convert input to a Tablicious datetime array, with convenient interface. -
-See todatetime. -
Represents a complete day using the Gregorian calendar. -
-See localdate. -
True if input is a ‘datetime’ array, false otherwise. -
-See isdatetime. -
Durations of time using variable-length calendar periods, such as days, months, and years, which may vary in length over time. -
-See calendarDuration. -
True if input is a ‘calendarDuration’ array, false otherwise. -
-See iscalendarduration. -
Create a ‘calendarDuration’ that is a given number of calendar months long. -
-See calmonths. -
Construct a ‘calendarDuration’ a given number of years long. -
-See calyears. -
Duration in days. -
-See days. -
Represents durations or periods of time as an amount of fixed-length time (i.e. -
-See duration. -
Create a ‘duration’ X hours long, or get the hours in a ‘duration’ X. -
-See hours. -
True if input is a ‘duration’ array, false otherwise. -
-See isduration. -
Create a ‘duration’ X milliseconds long, or get the milliseconds in a ‘duration’ X. -
-See milliseconds. -
Create a ‘duration’ X minutes long, or get the minutes in a ‘duration’ X. -
-See minutes. -
Create a ‘duration’ X seconds long, or get the seconds in a ‘duration’ X. -
-See seconds. -
List all the time zones defined on this system. -
-See timezones. -
Create a ‘duration’ X years long, or get the years in a ‘duration’ X. -
-See years. -
See mustBeA. -
See mustBeCellstr. -
See mustBeCharvec. -
See mustBeFinite. -
See mustBeInteger. -
See mustBeMember. -
See mustBeNonempty. -
See mustBeNumeric. -
See mustBeReal. -
See mustBeSameSize. -
See mustBeScalar. -
See mustBeScalarLogical. -
See mustBeVector. -
Apply a function to column vectors in array. -
-See colvecfun. -
Display strings for array. -
-See dispstrs. -
Get first K rows of an array. -
-See head. -
See isfile. -
See isfolder. -
Alias for prettyprint, for interactive use. -
-See pp. -
Expand scalar inputs to match size of non-scalar inputs. -
-See scalarexpand. -
Format an array size for display. -
-See size2str. -
Split data into groups and apply function. -
-See splitapply. -
Get last K rows of an array. -
-See tail. -
Apply function to vectors in array along arbitrary dimension. -
-See vecfun. -
Approximate size of an array in bytes, with object support. -
-See tblish.sizeof2. -
Example dataset collection. -
-See tblish.datasets. -
The ‘tblish.dataset’ class provides convenient access to the various datasets included with Tablicious. -
-See tblish.dataset. -
Conditioning plot. -
-See tblish.examples.coplot. -
Plot pairs of variables against each other. -
-See tblish.examples.plot_pairs. -
The classic Suppliers-Parts example database. -
-See tblish.examples.SpDb. -
out =
array2table (c)
¶out =
array2table (…, 'VariableNames'
, VariableNames)
¶out =
array2table (…, 'RowNames'
, RowNames)
¶Convert an array to a table. -
-Converts a 2-D array to a table, with columns in the array becoming variables in -the output table. This is typically used on numeric arrays, but it can -be applied to any type of array. -
-You may not want to use this on cell arrays, though, because you will
-end up with a table that has all its variables of type cell. If you use
-cell2table
instead, columns of the cell array which can be
-condensed into primitive arrays will be. With array2table
, they
-won’t be.
-
See also: cell2table, table, struct2table -
-Durations of time using variable-length calendar periods, such as days, -months, and years, which may vary in length over time. (For example, a -calendar month may have 28, 30, or 31 days.) -
-calendarDuration
: char
Sign ¶The sign (1 or -1) of this duration, which indicates whether it is a -positive or negative span of time. -
-calendarDuration
: char
Years ¶The number of whole calendar years in this duration. Must be integer-valued. -
-calendarDuration
: char
Months ¶The number of whole calendar months in this duration. Must be integer-valued. -
-calendarDuration
: char
Days ¶The number of whole calendar days in this duration. Must be integer-valued. -
-calendarDuration
: char
Hours ¶The number of whole hours in this duration. Must be integer-valued. -
-calendarDuration
: char
Minutes ¶The number of whole minutes in this duration. Must be integer-valued. -
-calendarDuration
: char
Seconds ¶The number of seconds in this duration. May contain fractional values. -
-calendarDuration
: char
Format ¶The format to display this calendarDuration
in. Currently unsupported.
-
This is a single value that applies to the whole array. -
-obj =
calendarDuration ()
¶Constructs a new scalar calendarDuration
of zero elapsed time.
-
obj =
calendarDuration (Y, M, D)
¶obj =
calendarDuration (Y, M, D, H, MI, S)
¶Constructs new calendarDuration
arrays based on input values.
-
out =
dispstrs (obj)
¶Get display strings for each element of obj. -
-Returns a cellstr the same size as obj. -
-out =
ismissing (obj)
¶True if input elements are missing. -
This is equivalent to isnan
.
-
Returns logical array the same size as obj. -
-out =
isnan (obj)
¶True if input elements are NaN. -
-This is equivalent to ismissing
, and is provided for compatibility
-and polymorphic programming purposes.
-
Returns logical array the same size as obj. -
-out =
minus (A, B)
¶Subtraction: Subtracts one calendarDuration
from another.
-
Returns a calendarDuration
.
-
out =
plus (A, B)
¶Addition: add two calendarDuration
s.
-
All the calendar elements (properties) of the two inputs are added -together. No normalization is done across the elements, aside from -the normalization of NaNs. -
-If B is numeric, it is converted to a calendarDuration
-using calendarDuration.ofDays
.
-
Returns a calendarDuration
.
-
out =
calmonths (x)
¶Create a calendarDuration
that is a given number of calendar months
-long.
-
Input x is a numeric array specifying the number of calendar months. -
-This is a shorthand alternative to calling the calendarDuration
-constructor with calendarDuration(0, x, 0)
.
-
Returns a new calendarDuration
object of the same size as x.
-
See calendarDuration. -
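The shorthand can be compared against the full constructor call. This is a small sketch that assumes Tablicious is loaded; the start date and variable names are illustrative.

```octave
pkg load tablicious

d = datetime ('2011-03-04 00:00:00');

% Three calendar months, written two ways
a = d + calmonths (3);
b = d + calendarDuration (0, 3, 0);

% Show both results as display strings
disp (dispstrs ([a b]));
```

If the two forms are equivalent as documented, both results should display the same date.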
-out =
calyears (x)
¶Construct a calendarDuration
a given number of years long.
-
This is a shorthand for calling calendarDuration(x, 0, 0)
.
-
See calendarDuration. -
-Categorical variable array. -
-A categorical
array represents an array of values of a categorical
-variable. Each categorical
array stores the element values along
-with a list of the categories, and indicators of whether the categories
-are ordinal (that is, they have a meaningful mathematical ordering), and
-whether the set of categories is protected (preventing new categories
-from being added to the array).
-
In addition to the categories defined in the array, a categorical array
-may have elements of "undefined" value. This is not considered a
-category; rather, it is the absence of any known value. It is
-analogous to a NaN
value.
-
This class is not fully implemented yet. Missing stuff: -
-categorical
: uint16
code ¶The numeric codes of the array element values. These are indexes into the
-cats
category list.
-
This is a planar property. -
-categorical
: logical
tfMissing ¶A logical mask indicating whether each element of the array is missing -(that is, undefined). -
-This is a planar property. -
-categorical
: cellstr
cats ¶The names of the categories in this array. This is the list into which
-the code
values are indexes.
-
categorical
: scalar_logical
isOrdinal ¶A scalar logical indicating whether the categories in this array have an -ordinal relationship. -
-out =
addcats (obj, newcats)
¶Add categories to categorical array. -
-Adds the specified categories to obj, without changing any of -its values. -
-newcats is a cellstr listing the category names to add to -obj. -
-obj =
categorical ()
¶Constructs a new scalar categorical whose value is undefined. -
-obj =
categorical (vals)
¶obj =
categorical (vals, valueset)
¶obj =
categorical (vals, valueset, category_names)
¶obj =
categorical (…, 'Ordinal'
, Ordinal)
¶obj =
categorical (…, 'Protected'
, Protected)
¶Constructs a new categorical array from the given values. -
-vals is the array of values to convert to categoricals. -
-valueset is the set of all values from which vals is drawn. -If omitted, it defaults to the unique values in vals. -
-category_names is a list of category names corresponding to -valueset. If omitted, it defaults to valueset, converted -to strings. -
-Ordinal is a logical indicating whether the category values in -obj have a numeric ordering relationship. Defaults to false. -
-Protected indicates whether obj should be protected, which -prevents the addition of new categories to the array. Defaults to -false. -
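A short sketch of the one-argument constructor form and a few of the accessor methods documented in this section, assuming Tablicious is loaded. The sample values are made up for illustration.

```octave
pkg load tablicious

% With no valueset given, categories default to the unique input values
c = categorical ({'small', 'large', 'small', 'medium'});

disp (categories (c));                        % category names, as a cellstr column
disp (cellstr (c));                           % element values as strings
disp (iscategory (c, {'small', 'huge'}));     % which names are categories of c
disp (any (isundefined (c)));                 % whether any element is undefined
```

Passing valueset explicitly is the way to define categories that do not occur in the data yet.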
-out =
categories (obj)
¶Get a list of the categories in obj. -
-Gets a list of the categories in obj, identified by their -category names. -
-Returns a cellstr column vector. -
-out =
cellstr (obj)
¶Convert to cellstr. -
-Converts obj to a cellstr array. The strings will be the
-category names for corresponding values, or ''
for undefined
-values.
-
Returns a cellstr array the same size as obj. -
-out =
dispstrs (obj)
¶Display strings. -
-Gets display strings for each element in obj. The display strings are
-either the category string, or '<undefined>'
for undefined values.
-
Returns a cellstr array the same size as obj. -
-out =
double (obj)
¶Convert to double array, by getting the underlying code values. -
Converts obj to a double array. The doubles will be the
-underlying numeric code values of obj, or NaN
for
-undefined values.
-
The numeric code values of two different categorical arrays do -*not* necessarily correspond to the same string values, and can -*not* be meaningfully compared for equality or ordering. -
-Returns a double
array the same size as obj.
-
out =
iscategory (obj, catnames)
¶Test whether input is a category on a categorical array. -
-catnames is a cellstr listing the category names to check against -obj. -
-Returns a logical array the same size as catnames. -
-out =
ismissing (obj)
¶Test whether elements are missing. -
-For categorical arrays, undefined elements are considered to be -missing. -
-Returns a logical array the same size as obj. -
-out =
isnanny (obj)
Test whether elements are NaN-ish. -
-Checks whether each element in obj is NaN-ish. For categorical -arrays, undefined values are considered NaN-ish; any other -value is not. -
-Returns a logical array the same size as obj. -
-out =
isordinal (obj)
¶Whether obj is ordinal. -
-Returns true if obj is ordinal (as determined by its
-isOrdinal
property), and false otherwise.
-
out =
isundefined (obj)
¶Test whether elements are undefined. -
-Checks whether each element in obj is undefined. "Undefined" is
-a special value defined by categorical
. It is equivalent to
-a NaN
or a missing
value.
-
Returns a logical array the same size as obj. -
-out =
mergecats (obj, oldcats)
¶out =
mergecats (obj, oldcats, newcat)
¶Merge multiple categories. -
-Merges the categories oldcats into a single category. If newcat -is specified, that new category is added if necessary, and all of oldcats -are merged into it. newcat must be an existing category in obj if -obj is ordinal. -
If newcat is not provided, all of oldcats are merged into
-oldcats{1}
.
-
out =
categorical.missing ()
¶out =
categorical.missing (sz)
¶Create an array of missing (undefined) categoricals. -
-Creates a categorical array whose elements are all missing (<undefined>). -
-This is a convenience alias for categorical.undefined, so you can call -it generically. It returns strictly the same results as calling -categorical.undefined with the same arguments. -
-Returns a categorical array. -
-See also: categorical.undefined -
-out =
removecats (obj)
¶Removes all unused categories from obj. This is equivalent to
-out = squeezecats (obj)
.
-
out =
removecats (obj, oldcats)
¶Remove categories from categorical array. -
-Removes the specified categories from obj. Elements of obj -whose values belonged to those categories are replaced with undefined. -
-oldcats is a cellstr listing the category names to remove from -obj. -
-out =
renamecats (obj, newcats)
¶out =
renamecats (obj, oldcats, newcats)
¶Rename categories. -
-Renames some or all of the categories in obj, without changing -any of its values. -
-out =
reordercats (obj)
¶out =
reordercats (obj, newcats)
¶Reorder categories. -
-Reorders the categories in obj to match newcats. -
-newcats is a cellstr that must be a reordering of obj’s existing -category list. If newcats is not supplied, sorts the categories -in alphabetical order. -
-out =
setcats (obj, newcats)
¶Set categories for categorical array. -
-Sets the categories to use for obj. If any current categories -are absent from the newcats list, current values of those -categories become undefined. -
-out =
squeezecats (obj)
¶Remove unused categories. -
-Removes all categories which have no corresponding values in obj’s -elements. -
-This is currently unimplemented. -
-out =
string (obj)
¶Convert to string array. -
-Converts obj to a string array. The strings will be the -category names for corresponding values, or <missing> for undefined -values. -
-Returns a string
array the same size as obj.
-
summary (obj)
¶Display summary of array’s values. -
-Displays a summary of the values in this categorical array. The output -may contain info like the number of categories, number of undefined values, -and frequency of each category. -
-out =
categorical.undefined ()
¶out =
categorical.undefined (sz)
¶Create an array of undefined categoricals. -
-Creates a categorical array whose elements are all <undefined>. -
-sz is the size of the array to create. If omitted or empty, creates -a scalar. -
-Returns a categorical array. -
-See also: categorical.missing -
-out =
cell2table (c)
¶out =
cell2table (…, 'VariableNames'
, VariableNames)
¶out =
cell2table (…, 'RowNames'
, RowNames)
¶Convert a cell array to a table. -
-Converts a 2-dimensional cell matrix into a table. Each column in the
-input c becomes a variable in out. For columns that contain
-all scalar values of cat
-compatible types, they are “popped out”
-of their cells and condensed into a homogeneous array of the contained
-type.
-
See also: array2table, table, struct2table -
-out =
colvecfun (fcn, x)
¶Apply a function to column vectors in array. -
-Applies the given function fcn to each column vector in the -array x, by iterating over the indexes along all dimensions except -dimension 1. Collects the function return values in an output array. -
-fcn must be a function which takes a column vector and returns a column -vector of the same size. It does not have to return the same type as -x. -
-Returns the result of applying fcn to each column in x, all concatenated -together in the same shape as x. -
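For instance, a function that centers a column vector can be applied to every column of a matrix. This sketch assumes Tablicious is loaded; the anonymous function and the input matrix are illustrative.

```octave
pkg load tablicious

x = [1 2; 3 4];

% Subtract each column's mean from that column; the function maps a
% column vector to a column vector of the same size, as colvecfun requires
centered = colvecfun (@(v) v - mean (v), x);
disp (centered);
```

Under the behavior described above, the result has the same shape as x, with each column centered independently.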
-out =
contains (str, pattern)
out =
contains (…, 'IgnoreCase'
, IgnoreCase)
¶Test if strings contain a pattern. -
-Tests whether the given strings contain the given pattern(s). -
-str (char, cellstr, or string) is a list of strings to compare against -pattern. -
-pattern (char, cellstr, or string) is a list of patterns to match. These are -literal plain string patterns, not regex patterns. If more than one pattern -is supplied, the return value is true if the string matched any of them. -
-Returns a logical array of the same size as the string array represented by -str. -
-See also: startsWith, endsWith -
-Represents points in time using the Gregorian calendar. -
-The underlying values are doubles representing the number of days since the -Matlab epoch of "January 0, year 0". Because doubles near present-day datenums are spaced about 1e-5 seconds apart, this gives a resolution of roughly 10 microseconds -for typical times. -
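The spacing between adjacent representable values depends on the magnitude of the datenum. A quick check for a particular date can be done in core Octave, without Tablicious; the date chosen is only an example.

```octave
% Datenum for 1 January 2020 (days since the Matlab epoch)
dn = datenum (2020, 1, 1);

% Gap to the next representable double, in days and then in seconds
step_days = eps (dn);
step_seconds = step_days * 86400;

printf ("datenum %d: adjacent doubles are %.3g seconds apart\n", dn, step_seconds);
```

For this date the gap is 2^-33 days, which is about 1e-5 seconds.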
-A datetime
array is an array of date/time values, with each element
-holding a complete date/time. The overall array may also have a TimeZone and a
-Format associated with it, which apply to all elements in the array.
-
This is an attempt to reproduce the functionality of Matlab’s datetime
. It
-also contains some Octave-specific extensions.
-
datetime
: double
dnums ¶The underlying datenums that represent the points in time. These are always in UTC. -
-This is a planar property: the size of dnums
is the same size as the
-containing datetime
array object.
-
datetime
: char
TimeZone ¶The time zone this datetime
array is in. Empty if this does not have a
-time zone associated with it (“unzoned”). The name of an IANA time zone if
-this does.
-
Setting the TimeZone
of a datetime
array changes the time zone it
-is presented in for strings and broken-down times, but does not change the
-underlying UTC times that its elements represent.
-
datetime
: char
Format ¶The format to display this datetime
in. Currently unsupported.
-
out =
datetime.convertDatenumTimeZone (dnum, fromZoneId, toZoneId)
¶Convert a datenum from one time zone to another. -
-dnum is a datenum array to convert. -
-fromZoneId is a charvec containing the IANA Time Zone identifier for -the time zone to convert from. -
-toZoneId is a charvec containing the IANA Time Zone identifier for -the time zone to convert to. -
-Returns a datenum array the same size as dnum. -
-out =
datenum (obj)
¶Convert this to datenums that represent the same local time -
-Returns double array of same size as this. -
-out =
datetime.datenum2posix (dnums)
¶Converts Octave datenums to Unix dates. -
-The input datenums are assumed to be in UTC. -
-Returns a double, which may have fractional seconds. -
-out =
datestr (obj)
¶out =
datestr (obj, …)
¶Format obj as date strings. Supports all arguments that core Octave’s
-datestr
does.
-
Returns date strings as a 2-D char array. -
-out =
datestrs (obj)
¶out =
datestrs (obj, …)
¶Format obj as date strings, returning cellstr.
-Supports all arguments that core Octave’s datestr
does.
-
Returns a cellstr array the same size as obj. -
-out =
datestruct (obj)
¶Converts this to a "datestruct" broken-down time structure. -
-A "datestruct" is a format of struct that Tablicious came up with. It is a scalar -struct with fields Year, Month, Day, Hour, Minute, and Second, each containing -a double array the same size as the date array it represents. -
-The values in the returned broken-down time are those of the local time -in this’ defined time zone, if it has one. -
-Returns a struct with fields Year, Month, Day, Hour, Minute, and Second. -Each field contains a double array of the same size as this. -
-obj =
datetime ()
¶Constructs a new scalar datetime
containing the current local time, with
-no time zone attached.
-
obj =
datetime (datevec)
¶obj =
datetime (datestrs)
¶obj =
datetime (in, 'ConvertFrom'
, inType)
¶obj =
datetime (Y, M, D, H, MI, S)
¶obj =
datetime (Y, M, D, H, MI, MS)
¶obj =
datetime (…, 'Format'
, Format, 'InputFormat'
, InputFormat, 'Locale'
, InputLocale, 'PivotYear'
, PivotYear, 'TimeZone'
, TimeZone)
¶Constructs a new datetime
array based on input values.
-
out =
diff (obj)
¶Differences between elements. -
-Computes the difference between each successive element in obj, as a
-duration
.
-
Returns a duration
array the same size as obj.
-
out =
dispstrs (obj)
¶Get display strings for each element of obj. -
-Returns a cellstr the same size as obj. -
-out =
eq (A, B)
¶True if A is equal to B. This defines the ==
operator
-for datetime
s.
-
Inputs are implicitly converted to datetime
using the one-arg
-constructor or conversion method.
-
Returns logical array the same size as obj. -
-out =
ge (A, B)
¶True if A is greater than or equal to B. This defines the >=
operator
-for datetime
s.
-
Inputs are implicitly converted to datetime
using the one-arg
-constructor or conversion method.
-
Returns logical array the same size as obj. -
-out =
gmtime (obj)
¶Convert to TM_STRUCT structure in UTC time. -
-Converts obj to a TM_STRUCT style structure array. The result is in -UTC time. If obj is unzoned, it is assumed to be in UTC time. -
-Returns a struct array in TM_STRUCT style. -
-out =
gt (A, B)
¶True if A is greater than B. This defines the >
operator
-for datetime
s.
-
Inputs are implicitly converted to datetime
using the one-arg
-constructor or conversion method.
-
Returns logical array the same size as obj. -
-[h, m, s] =
hms (obj)
Get the Hour, Minute, and Second components of obj. -
-For zoned datetime
s, these will be local times in the associated time zone.
-
Returns double arrays the same size as obj
.
-
out =
isbetween (obj, lower, upper)
¶Tests whether the elements of obj are between lower and -upper. -
-All inputs are implicitly converted to datetime
arrays, and are subject
-to scalar expansion.
-
Returns a logical array the same size as the scalar expansion of the inputs. -
-out =
isnan (obj)
¶True if input elements are NaT. This is an alias for isnat
-to support type compatibility and polymorphic programming.
-
Returns logical array the same size as obj. -
-out =
isnat (obj)
¶True if input elements are NaT. -
-Returns logical array the same size as obj. -
-out =
le (A, B)
¶True if A is less than or equal to B. This defines the <=
operator
-for datetime
s.
-
Inputs are implicitly converted to datetime
using the one-arg
-constructor or conversion method.
-
Returns logical array the same size as obj. -
-out =
linspace (from, to, n)
¶Linearly-spaced values in date/time space. -
-Constructs a vector of datetime
s that represent linearly spaced points
-starting at from and going up to to, with n points in the
-vector.
-
from and to are implicitly converted to datetime
s.
-
n is how many points to use. If omitted, defaults to 100. -
-Returns an n-long datetime
vector.
-
out =
localtime (obj)
¶Convert to TM_STRUCT structure in local time. -
-Converts obj to a TM_STRUCT style structure array. The result is a -local time in the system default time zone. Note that the system default -time zone is always used, regardless of what TimeZone is set on obj. -
-If obj is unzoned, it is assumed to be in UTC time. -
-Returns a struct array in TM_STRUCT style. -
-Example: -
dt = datetime; -dt.TimeZone = datetime.SystemTimeZone; -tm_struct = localtime (dt); -
out =
lt (A, B)
¶True if A is less than B. This defines the <
operator
-for datetime
s.
-
Inputs are implicitly converted to datetime
using the one-arg
-constructor or conversion method.
-
Returns logical array the same size as obj. -
-out =
minus (A, B)
¶Subtraction (-
operator). Subtracts a duration
,
-calendarDuration
or numeric B from a datetime
A,
-or subtracts two datetime
s from each other.
-
If both inputs are datetime
, then the output is a duration
.
-Otherwise, the output is a datetime
.
-
Numeric B inputs are implicitly converted to duration
using
-duration.ofDays
.
-
Returns an array the same size as A. -
-out =
datetime.NaT ()
¶out =
datetime.NaT (sz)
¶“Not-a-Time”: Creates NaT-valued arrays. -
-Constructs a new datetime
array of all NaT
values of
-the given size. If no input sz is given, the result is a scalar NaT
.
-
NaT
is the datetime
equivalent of NaN
. It represents a missing
-or invalid value. NaT
values never compare equal to, greater than, or less
-than any value, including other NaT
s. Doing arithmetic with a NaT
and
-any other value results in a NaT
.
-
out =
ne (A, B)
¶True if A is not equal to B. This defines the !=
operator
-for datetime
s.
-
Inputs are implicitly converted to datetime
using the one-arg
-constructor or conversion method.
-
Returns logical array the same size as obj. -
-obj =
datetime.ofDatenum (dnums)
¶Converts a datenum array to a datetime array. -
-Returns an unzoned datetime
array of the same size as the input.
-
obj =
datetime.ofDatestruct (dstruct)
¶Converts a datestruct to a datetime array. -
-A datestruct is a special struct format used by Tablicious that has fields -Year, Month, Day, Hour, Minute, and Second. It is not a standard Octave datatype. -
-Returns an unzoned datetime
array.
-
out =
plus (A, B)
¶Addition (+
operator). Adds a duration
, calendarDuration
,
-or numeric B to a datetime
A.
-
A must be a datetime
.
-
Numeric B inputs are implicitly converted to duration
using
-duration.ofDays
.
-
Returns datetime
array the same size as A.
-
dnums =
datetime.posix2datenum (pdates)
¶Converts POSIX (Unix) times to datenums -
-Pdates (numeric) is an array of POSIX dates. A POSIX date is the number -of seconds since January 1, 1970 UTC, excluding leap seconds. The output -is implicitly in UTC. -
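The conversion described above comes down to arithmetic on the epoch offset. The following is a minimal sketch using only core Octave's datenum; it is not the Tablicious implementation, and the variable names are illustrative assumptions.

```octave
% Sketch only: convert POSIX seconds to datenums with core Octave arithmetic.
pdates = [0 86400 43200];             % seconds since 1970-01-01 00:00:00 UTC
epoch_dnum = datenum (1970, 1, 1);    % datenum of the Unix epoch
dnums = epoch_dnum + pdates / 86400;  % 86400 seconds per fixed-length day
```

Because POSIX time excludes leap seconds, every day is treated as exactly 86400 seconds long, so a plain division is enough.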
-out =
posixtime (obj)
¶Converts this to POSIX time values (seconds since the Unix epoch) -
-Converts this to POSIX time values that represent the same time. The -returned values will be doubles that may include fractional second values. -POSIX times are, by definition, in UTC. -
-Returns double array of same size as this. -
-[keysA, keysB] =
proxyKeys (a, b)
¶Computes proxy key values for two datetime arrays. Proxy keys are numeric -values whose rows have the same equivalence relationships as the elements of -the inputs. -
-This is primarily for Tablicious’s internal use; users will typically not need to call -it or know how it works. -
-Returns two 2-D numeric matrices of size n-by-k, where n is the number of elements -in the corresponding input. -
-out =
timeofday (obj)
¶Get the time of day (elapsed time since midnight). -
-For zoned datetime
s, these will be local times in the associated time zone.
-
Returns a duration
array the same size as obj
.
-
out =
week (obj)
¶Get the week of the year. -
-This method is unimplemented. -
-out =
days (x)
¶Duration in days. -
-If x is numeric, then out is a duration
array in units
-of fixed-length 24-hour days, with the same size as x.
-
If x is a duration
, then returns a double
array the same
-size as x indicating the number of fixed-length days that each duration
-is.
-
[Y, E] =
discretize (X, n)
¶[Y, E] =
discretize (X, edges)
¶[Y, E] =
discretize (X, dur)
¶[Y, E] =
discretize (…, 'categorical'
)
¶[Y, E] =
discretize (…, 'IncludedEdge'
, IncludedEdge)
¶Group data into discrete bins or categories. -
-n is the number of bins to group the values into. -
-edges is an array of edge values defining the bins. -
-dur is a duration
value indicating the length of time of each
-bin.
-
If 'categorical'
is specified, the resulting values are a categorical
-array instead of a numeric array of bin indexes.
-
Returns: - Y - the bin index or category of each value from X - E - the list of bin edge values -
-out =
dispstrs (x)
¶Display strings for array. -
-Gets the display strings for each element of x. The display strings -should be short, one-line, human-presentable strings describing the -value of that element. -
-The default implementation of dispstrs
can accept input of any
-type, and has decent implementations for Octave’s standard built-in types,
-but will have opaque displays for most user-defined objects.
-
This is a polymorphic method that user-defined classes may override -with their own custom display that is more informative. -
-Returns a cell array the same size as x. -
-Represents durations or periods of time as an amount of fixed-length -time (i.e. fixed-length seconds). It does not care about calendar things -like months and days that vary in length over time. -
-This is an attempt to reproduce the functionality of Matlab’s duration
. It
-also contains some Octave-specific extensions.
-
Duration values are stored as double numbers of days, so they are an -approximate type. In display functions, by default, they are displayed with -millisecond precision, but their actual precision is closer to nanoseconds -for typical times. -
-duration
: double
days ¶The underlying datenums that represent the durations, as number of (whole and -fractional) days. These are uniform 24-hour days, not calendar days. -
-This is a planar property: the size of days
is the same size as the
-containing duration
array object.
-
duration
: char
Format ¶The format to display this duration
in. Currently unsupported.
-
out =
char (obj)
¶Convert to char. The contents of the strings will be the same as
-returned by dispstrs
.
-
This is primarily a convenience method for use on scalar objs. -
-Returns a 2-D char array with one row per element in obj. -
-out =
duration (obj)
¶Get display strings for each element of obj. -
-Returns a cellstr the same size as obj. -
-out =
hours (obj)
¶Equivalent number of hours. -
-Gets the number of fixed-length 60-minute hours that is equivalent -to this duration. -
-Returns double array the same size as obj. -
-out =
linspace (from, to, n)
¶Linearly-spaced values in time duration space. -
-Constructs a vector of duration
s that represent linearly spaced points
-starting at from and going up to to, with n points in the
-vector.
-
from and to are implicitly converted to duration
s.
-
n is how many points to use. If omitted, defaults to 100. -
-Returns an n-long duration
vector.
-
out =
milliseconds (obj)
¶Equivalent number of milliseconds. -
-Gets the number of milliseconds that is equivalent -to this duration. -
-Returns double array the same size as obj. -
-out =
minutes (obj)
¶Equivalent number of minutes. -
-Gets the number of fixed-length 60-second minutes that is equivalent -to this duration. -
-Returns double array the same size as obj. -
-obj =
duration.ofDays (dnums)
¶Converts a double array representing durations in whole and fractional days
-to a duration
array. This is the method that is used for implicit conversion
-of numerics in many cases.
-
Returns a duration
array of the same size as the input.
-
out =
eqn (A, B)
¶Determine element-wise equality, treating NaNs as equal -
-out = eqn (A, B) -
-eqn
is just like eq
(the function that implements the
-==
operator), except
-that it considers NaN and NaN-like values to be equal. This is the element-wise
-equivalent of isequaln
.
-
eqn
uses isnanny
to test for NaN and NaN-like values,
-which means that NaNs and NaTs are considered to be NaN-like, and
-string arrays’ “missing” and categorical objects’ “undefined” values
-are considered equal, because they are NaN-ish.
-
Developer’s note: the name “eqn
” is a little unfortunate,
-because “eqn” could also be an abbreviation for “equation”. But this
-name follows the isequaln
pattern of appending an “n” to the
-corresponding non-NaN-equivocating function.
-
See also: eq
, isequaln
, isnanny
-
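For plain doubles, the NaN-tolerant comparison described above can be approximated with core Octave alone. This is a sketch under that assumption; it does not call eqn or isnanny and does not cover NaT, missing strings, or undefined categoricals.

```octave
% Sketch only: element-wise equality that treats paired NaNs as equal.
A = [1 NaN 3];
B = [1 NaN 4];
plain_eq = (A == B);                          % NaN == NaN is false here
nan_eq = plain_eq | (isnan (A) & isnan (B));  % count paired NaNs as equal
```

Here plain_eq is false in the second position, while nan_eq is true there.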
out =
head (A)
¶out =
head (A, k)
¶Get first K rows of an array. -
-Returns the array A, subsetted to its first k rows. This means
-subsetting it to the first (min (k, size (A, 1)))
elements along
-dimension 1, and leaving all other dimensions unrestricted.
-
A is the array to subset. -
-k is the number of rows to get. k defaults to 8 if it is omitted -or empty. -
-If there are fewer than k rows in A, returns all rows. -
-Returns an array of the same type as A, unless ()-indexing A -produces an array of a different type, in which case it returns that type. -
-See also: tail -
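The row subsetting described above can be sketched with core Octave indexing. This is an illustration of the stated semantics, not the Tablicious source; the variable names are assumptions.

```octave
% Sketch only: take the first k rows, leaving all other dimensions unrestricted.
A = magic (4);
k = 2;
n = min (k, size (A, 1));            % never ask for more rows than exist
ix = repmat ({':'}, 1, ndims (A));   % one ':' index per dimension
ix{1} = 1:n;                         % restrict dimension 1 only
out = A(ix{:});
```

Building the index as a cell array keeps the trailing dimensions of N-D inputs intact, which a literal `A(1:n, :)` would flatten.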
-out =
hours (x)
¶Create a duration
x hours long, or get the hours in a duration
-x.
-
If input is numeric, returns a duration
array that is that many hours in
-time.
-
If input is a duration
, converts the duration
to a number of hours.
-
Returns an array the same size as x. -
out =
iscalendarduration (x)
¶True if input is a calendarDuration
array, false otherwise.
-
Respects iscalendarduration
override methods on user-defined classes, even if
-they do not inherit from calendarDuration
or were not known to Tablicious at
-authoring time.
-
Returns a scalar logical. -
-out =
iscategorical (x)
¶True if input is a categorical
array, false otherwise.
-
Respects iscategorical
override methods on user-defined classes, even if
-they do not inherit from categorical
or were not known to Tablicious at
-authoring time.
-
Returns a scalar logical. -
-out =
isdatetime (x)
¶True if input is a datetime
array, false otherwise.
-
Respects isdatetime
override methods on user-defined classes, even if
-they do not inherit from datetime
or were not known to Tablicious at
-authoring time.
-
Returns a scalar logical. -
-out =
isduration (x)
¶True if input is a duration
array, false otherwise.
-
Respects isduration
override methods on user-defined classes, even if
-they do not inherit from duration
or were not known to Tablicious at
-authoring time.
-
Returns a scalar logical. -
-out =
isnanny (X)
¶Test if elements are NaN or NaN-like -
-Tests if input elements are NaN, NaT, or otherwise NaN-like. This is true
-if isnan()
or isnat()
returns true, and is false for types that do not support
-isnan()
or isnat()
.
-
This function only exists because: -
-isnanny()
smooths over those differences so you can call it polymorphically on
-any input type. Hopefully.
-
Under normal operation, isnanny()
should not throw an error for any type or
-value of input.
-
See also: ismissing, isnan
, isnat
, eqn, isequaln
-
out =
istable (x)
¶True if input is a table
array or other table-like type, false
-otherwise.
-
Respects istable
override methods on user-defined classes, even if
-they do not inherit from table
or were not known to Tablicious at
-authoring time.
-
User-defined classes should only override istable
to return true if
-they conform to the table
public interface. That interface is not
-well-defined or documented yet, so maybe you don’t want to do that yet.
-
Returns a scalar logical. -
-out =
istabular (x)
¶True if input is either a table
or timetable
array, or an object
-like them.
-
Respects istable
and istimetable
override methods on user-defined
-classes, even if they do not inherit from table
or were not known to Tablicious
-at authoring time.
-
Returns a scalar logical. -
-out =
istimetable (x)
¶True if input is a timetable
array or other timetable-like type, false
-otherwise.
-
Respects istimetable
override methods on user-defined classes, even if
-they do not inherit from table
or were not known to Tablicious at
-authoring time.
-
User-defined classes should only override istimetable
to return true if
-they conform to the table
public interface. That interface is not
-well-defined or documented yet, so maybe you don’t want to do that yet.
-
Returns a scalar logical. -
-Represents a complete day using the Gregorian calendar. -
-This class is useful for indexing daily-granularity data or representing -time periods that cover an entire day in local time somewhere. The major -purpose of this class is "type safety", to prevent time-of-day values -from sneaking in to data sets that should be daily only. As a secondary -benefit, this uses less memory than datetimes. -
-localdate
: double
dnums ¶The underlying datenum values that represent the days. The datenums are at -the midnight that is at the start of the day it represents. -
-These are doubles, but -they are restricted to be integer-valued, so they represent complete days, with -no time-of-day component. -
-localdate
: char
Format ¶The format to display this localdate
in. Currently unsupported.
-
out =
datenum (obj)
¶Convert this to datenums that represent midnight on obj’s days. -
-Returns double array of same size as this. -
-out =
datestr (obj)
¶out =
datestr (obj, …)
¶Format obj as date strings. Supports all arguments that core Octave’s
-datestr
does.
-
Returns date strings as a 2-D char array. -
-out =
datestrs (obj)
¶out =
datestrs (obj, …)
¶Format obj as date strings, returning cellstr.
-Supports all arguments that core Octave’s datestr
does.
-
Returns a cellstr array the same size as obj. -
-out =
datestruct (obj)
¶Converts this to a “datestruct” broken-down time structure. -
-A “datestruct” is a format of struct that Tablicious came up with. It is a scalar
-struct with fields Year, Month, and Day, each containing
-a double array the same size as the date array it represents. This format
-differs from the “datestruct” used by datetime
in that it lacks
-Hour, Minute, and Second components. This is done for efficiency.
-
The values in the returned broken-down time are those of the local time -in obj’s defined time zone, if it has one. -
-Returns a struct with fields Year, Month, and Day. -Each field contains a double array of the same size as this. -
-out =
dispstrs (obj)
¶Get display strings for each element of obj. -
-Returns a cellstr the same size as obj. -
-out =
isnan (obj)
¶True if input elements are NaT. This is an alias for isnat
-to support type compatibility and polymorphic programming.
-
Returns logical array the same size as obj. -
-out =
isnat (obj)
¶True if input elements are NaT. -
-Returns logical array the same size as obj. -
-obj =
localdate ()
¶Constructs a new scalar localdate
containing the current local date.
-
obj =
localdate (datenums)
¶obj =
localdate (datestrs)
¶obj =
localdate (Y, M, D)
¶obj =
localdate (…, 'Format'
, Format)
¶Constructs a new localdate
array based on input values.
-
out =
localdate.NaT ()
¶out =
localdate.NaT (sz)
¶“Not-a-Time”: Creates NaT-valued arrays. -
-Constructs a new localdate
array of all NaT
values of
-the given size. If no input sz is given, the result is a scalar NaT
.
-
NaT
is the datetime
equivalent of NaN
. It represents a missing
-or invalid value. NaT
values never compare equal to, greater than, or less
-than any value, including other NaT
s. Doing arithmetic with a NaT
and
-any other value results in a NaT
.
-
This static method is provided because the global NaT
function creates
-datetime
s, not localdate
s
-
out =
posixtime (obj)
¶Converts this to POSIX time values for midnight of obj’s days. -
-Converts this to POSIX time values that represent the same date. The -returned values will be doubles that will not include fractional second values. -The times returned are those of midnight UTC on obj’s days. -
-Returns double array of same size as this. -
-out =
milliseconds (x)
¶Create a duration
x milliseconds long, or get the milliseconds in a duration
-x.
-
If input is numeric, returns a duration
array that is that many milliseconds in
-time.
-
If input is a duration
, converts the duration
to a number of milliseconds.
-
Returns an array the same size as x. -
out =
hours (x)
¶Create a duration
x hours long, or get the hours in a duration
-x.
-
Generic auto-converting missing value. -
-missing
is a generic missing value that auto-converts to other
-types.
-
A missing
array indicates a missing value, of no particular type. It auto-
-converts to other types when it is combined with them via concatenation or
-other array combination operations.
-
This class is currently EXPERIMENTAL. Use at your own risk. -
-Note: This class does not actually work for assignment. If you do this: -
-x = 1:5 - x(3) = missing -
It’s supposed to work, but I can’t figure out how to do this in a normal -classdef object, because there doesn’t seem to be any function that’s implicitly -called for type conversion in that assignment. Darn it. -
-out =
dispstrs (obj)
¶Display strings. -
-Gets display strings for each element in obj. -
-For missing
, the display strings are always '<missing>'
.
-
Returns a cellstr the same size as obj. -
-out =
ismissing (obj)
¶Test whether elements are missing values. -
-ismissing
is always true for missing
arrays.
-
Returns a logical array the same size as obj. -
-out =
isnan (obj)
¶Test whether elements are NaN. -
-isnan
is always true for missing
arrays.
-
Returns a logical array the same size as obj. -
-out =
NaC ()
¶out =
NaC (sz)
¶“Not-a-Categorical”. Creates missing-valued categorical arrays. -
-Returns a new categorical
array of all missing values of
-the given size. If no input sz is given, the result is a scalar missing
-categorical.
-
NaC
is the categorical
equivalent of NaN
or NaT
. It
-represents a missing, invalid, or null value. NaC
values never compare
-equal to any value, including other NaC
s.
-
NaC
is a convenience function which is strictly a wrapper around
-categorical.undefined
and returns the same results, but may be more convenient
-to type and/or more readable, especially in array expressions with several values.
-
See also: categorical.undefined -
-out =
NaS ()
¶out =
NaS (sz)
¶“Not-a-String”. Creates missing-valued string arrays. -
-Returns a new string
array of all missing values of
-the given size. If no input sz is given, the result is a scalar missing
-string.
-
NaS
is the string
equivalent of NaN
or NaT
. It
-represents a missing, invalid, or null value. NaS
values never compare
-equal to any value, including other NaS
s.
-
NaS
is a convenience function which is strictly a wrapper around
-string.missing
and returns the same results, but may be more convenient
-to type and/or more readable, especially in array expressions with several values.
-
See also: string.missing -
-out =
NaT ()
¶out =
NaT (sz)
¶“Not-a-Time”. Creates missing-valued datetime arrays. -
-Constructs a new datetime
array of all NaT
values of
-the given size. If no input sz is given, the result is a scalar NaT
.
-
NaT
is the datetime
equivalent of NaN
. It represents a missing
-or invalid value. NaT
values never compare equal to, greater than, or less
-than any value, including other NaT
s. Doing arithmetic with a NaT
and
-any other value results in a NaT
.
-
NaT
currently cannot create NaT arrays of type localdate
. To do that,
-use localdate.NaT instead.
-
(X)
¶(A, B, C, …)
¶('A'
, 'B'
, 'C'
, …)
¶A
B
C
…
¶Alias for prettyprint, for interactive use. -
-This is an alias for prettyprint(), with additional name-conversion magic. -
-If you pass in a char, instead of pretty-printing that directly, it will -grab and pretty-print the variable of that name from the caller’s workspace. -This is so you can conveniently run it from the command line. -
-[out1, out2, …, outN] =
scalarexpand (x1, x2, …, xN)
¶Expand scalar inputs to match size of non-scalar inputs. -
-Expands each scalar input argument to match the size of the non-scalar
-input arguments, and returns the expanded values in the corresponding
-output arguments. repmat
is used to do the expansion.
-
Works on any input types that support size
, isscalar
, and
-repmat
.
-
It is an error if any of the non-scalar inputs are not the same size as -all of the other non-scalar inputs. -
-Returns as many output arguments as there were input arguments. -
-Examples: -
-x1 = rand(3); -x2 = 42; -x3 = magic(3); -[x1, x2, x3] = scalarexpand (x1, x2, x3) -
out =
seconds (x)
¶Create a duration
x seconds long, or get the seconds in a duration
-x.
-
If input is numeric, returns a duration
array that is that many seconds in
-time.
-
If input is a duration
, converts the duration
to a number of seconds.
-
Returns an array the same size as x. -
out =
size2str (sz)
¶Format an array size for display. -
-Formats the given array size sz as a string for human-readable -display. It will be in the format “d1-by-d2-...-by-dN”, for the N -dimensions represented by sz. -
-sz is an array of dimension sizes, in the format returned by
-the size
function.
-
Returns a charvec. -
-Examples: -
str = size2str (size (magic (4))) - ⇒ str = 4-by-4 -
out =
splitapply (func, X, G)
¶out =
splitapply (func, X1, …, XN, G)
¶[Y1, …, YM] =
splitapply (…)
¶Split data into groups and apply function. -
-func is a function handle to call on each group of inputs in turn. -
-X, X1, …, XN are the input variables that are split into
-groups for the function calls. If X is a table
, then its contained
-variables are “popped out” and considered to be the X1 … XN
-input variables.
-
G is the grouping variable vector. It contains a list of integers that -identify which group each element of the X input variables belongs to. -NaNs in G mean that element is ignored. -
-Vertically concatenates the function outputs for each of the groups and returns them in -as many variables as you capture. -
-Returns the concatenated outputs of applying func to each group. -
-See also: table.groupby, table.splitapply -
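The grouping behavior described above (split by G, apply func per group, vertically concatenate, ignore NaN groups) can be sketched with a plain loop in core Octave. This is an illustration only, with made-up data; it does not call splitapply itself.

```octave
% Sketch only: split X by group ids in G, apply a function, stack the results.
X = [10; 20; 30; 40; 50];
G = [1; 2; 1; 2; NaN];               % NaN means "ignore this element"
func = @sum;
groups = unique (G(! isnan (G)));    % distinct group ids, sorted
out = [];
for i = 1:numel (groups)
  out = [out; func(X(G == groups(i)))];  % one output row per group
end
```

The last element of X is dropped because its group id is NaN.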
-A string array of Unicode strings. -
-A string array is an array of strings, where each array element is a single -string. -
-The string class represents strings, where: -
This should correspond pretty well to what people think of as strings, and -is pretty compatible with people’s typical notion of strings in Octave. -
-String arrays also have a special “missing” value, that is like the string -equivalent of NaN for doubles or “undefined” for categoricals, or SQL NULL. -
-This is a slightly higher-level and more strongly-typed way of representing -strings than cellstrs are. (A cellstr array is of type cell, not a text- -specific type, and allows assignment of non-string data into it.) -
-Be aware that while string arrays interconvert with Octave chars and cellstrs, -Octave char elements represent 8-bit UTF-8 code units, not Unicode code points. -
-This class really serves three roles: -
-It is not clear whether it’s a good fit to have the Unicode support wrapped -up in this. Maybe it should just be a simple object -wrapper, and defer Unicode semantics to when core Octave adopts them for -char and cellstr. On the other hand, because Octave chars are UTF-8, not UCS-2, -some methods like strlength() and reverse() are just going to be wrong if -they delegate straight to chars. -
-“Missing” string values work like NaNs. They are never considered equal to, -less than, or greater than any other string, including other missing strings. -This applies to set membership and other equivalence tests. -
-TODO: Need to decide how far to go with Unicode semantics, and how much to -just make this an object wrapper over cellstr and defer to Octave’s existing -char/string-handling functions. -
-TODO: demote_strings should probably be static or global, so that other -functions can use it to hack themselves into being string-aware. -
-out =
cell (obj)
¶Convert to cell array. -
-Converts this to a cell, which will be a cellstr. Missing values are
-converted to ''
.
-
This method returns the same values as cellstr(obj)
; it is just provided
-for interface compatibility purposes.
-
Returns a cell array of the same size as obj. -
-out =
cellstr (obj)
¶Convert to cellstr. -
-Converts obj to a cellstr. Missing values are converted to ''
.
-
Returns a cellstr array of the same size as obj. -
-out =
char (obj)
¶Convert to char array. -
-Converts obj to a 2-D char array. It will have as many rows -as obj has elements. -
-It is an error to convert missing-valued string
arrays to
-char. (NOTE: This may change in the future; it may be more appropriate
-to convert them to space-padded empty strings.)
-
Returns 2-D char array. -
-[out, outA, outB] =
cmp (A, B)
¶Value ordering comparison, returning -1/0/+1. -
-Compares each element of A and B, returning for
-each element i
whether A(i)
was less than (-1),
-equal to (0), or greater than (1) the corresponding B(i)
.
-
TODO: What to do about missing values? Should missings sort to the end -(preserving total ordering over the full domain), or should their comparisons -result in a fourth "null"/"undef" return value, probably represented by NaN? -FIXME: The current implementation does not handle missings. -
-Returns a numeric array out of the same size as the scalar expansion -of A and B. Each value in it will be -1, 0, or 1. -
-Also returns scalar-expanded copies of A and B as outA and -outB, as a programming convenience. -
-out =
string.decode (bytes, charsetName)
¶Decode encoded text from bytes. -
-Decodes the given encoded text in bytes according to the specified -encoding, given by charsetName. -
-Returns a scalar string. -
-See also: string.encode -
-out =
dispstrs (obj)
¶Display strings for array elements. -
-Gets display strings for all the elements in obj. These display strings
-will either be the string contents of the element, enclosed in "..."
,
-and with CR/LF characters replaced with '\r'
and '\n'
escape sequences,
-or "<missing>"
for missing values.
-
Returns a cellstr of the same size as obj. -
-out =
empty (sz)
¶Get an empty string array of a specified size. -
-The argument sz is optional. If supplied, it is a numeric size -array whose product must be zero. If omitted, it defaults to [0 0]. -
-The size may also be supplied as multiple arguments containing -scalar numerics. -
-Returns an empty string array of the requested size. -
-out =
encode (obj, charsetName)
¶Encode string in a given character encoding. -
-obj must be scalar. -
-charsetName (charvec) is the name of a character encoding. -(TODO: Document what determines the set of valid encoding names.) -
-Returns the encoded string as a uint8
vector.
-
See also: string.decode. -
-out =
erase (obj, match)
¶Erase matching substring. -
-Erases the substrings in obj which match the match input. -
-Returns a string array of the same size as obj. -
-out =
ismissing (obj)
¶Test whether array elements are missing. -
-For string
arrays, only the special “missing” value is
-considered missing. Empty strings are not considered missing,
-the way they are with cellstrs.
-
Returns a logical array the same size as obj
.
-
out =
isnanny (obj)
¶Test whether array elements are NaN-like. -
-Missing values are considered nannish; any other string value is not. -
-Returns a logical array of the same size as obj. -
-out =
isstring (obj)
¶Test if input is a string array. -
-isstring
is always true for string
inputs.
-
Returns a scalar logical. -
-out =
lower (obj)
¶Convert to lower case. -
-Converts all the characters in all the strings in obj to lower case. -
-This currently delegates to Octave’s own lower()
function to
-do the conversion, so whatever character class handling it has, this
-has.
-
Returns a string array of the same size as obj. -
-out =
string.missing (sz)
¶Missing string value. -
-Creates a string array of all-missing values of the specified size sz. -If sz is omitted, creates a scalar missing string. -
-Returns a string array of size sz or [1 1]. -
-See also: NaS -
-out =
plus (a, b)
¶String concatenation via plus operator. -
-Concatenates the two input arrays, string-wise. Inputs that are -not string arrays are converted to string arrays. -
-The concatenation is done by calling ‘strcat’ on the inputs, and has the -same behavior. -
-Returns a string array the same size as the scalar expansion of its -inputs. -
-See also: string.strcat -
-out =
regexprep (obj, pat, repstr)
¶out =
regexprep (…, varargin)
¶Replace based on regular expression matching. -
-Replaces all the substrings matching a given regexp pattern pat with -the given replacement text repstr. -
-Returns a string array of the same size as obj. -
-out =
reverse (obj)
¶Reverse string, character-wise. -
-Reverses the characters in each string in obj. This operates on -Unicode characters (code points), not on bytes, so it is guaranteed -to produce valid UTF-8 as its output. -
-Returns a string array the same size as obj. -
-out =
reverse_bytes (obj)
¶Reverse string, byte-wise. -
-Reverses the bytes in each string in obj. This operates on bytes -(Unicode code units), not characters. -
-This may well produce invalid strings as a result, because reversing a -UTF-8 byte sequence does not necessarily produce another valid UTF-8 -byte sequence. -
-You probably do not want to use this method. You probably want to use
-string.reverse
instead.
-
Returns a string array the same size as obj. -
-See also: string.reverse -
-out =
strcat (varargin)
¶String concatenation. -
-Concatenates the corresponding elements of all the input arrays, -string-wise. Inputs that are not string arrays are converted to -string arrays. -
-The semantics of concatenating missing strings with non-missing -strings has not been determined yet. -
-Returns a string array the same size as the scalar expansion of its -inputs. -
-out =
strcmp (A, B)
¶String comparison. -
-Tests whether each element in A is exactly equal to the corresponding -element in B. Missing values are not considered equal to each other. -
-This does the same comparison as A == B
, but is not polymorphic.
-Generally, there is no reason to use strcmp
instead of ==
-or eq
on string arrays, unless you want to be compatible with
-cellstr inputs as well.
-
Returns logical array the size of the scalar expansion of A and B. -
-out =
strfind (obj, pattern)
¶out =
strfind (…, varargin)
¶Find pattern in string. -
-Finds the locations where pattern occurs in the strings of obj. -
-TODO: It’s ambiguous whether a scalar this should result in a numeric -out or a cell array out. -
-Returns either an index vector, or a cell array of index vectors. -
-obj =
string ()
¶obj =
string (in)
¶Construct a new string array. -
-The zero-argument constructor creates a new scalar string array -whose value is the empty string. -
-The other constructors construct a new string array by converting -various types of inputs. -
-out =
strlength (obj)
¶String length in characters (actually, UTF-16 code units). -
-Gets the length of each string, counted in UTF-16 code units. In most -cases, this is the same as the number of characters. The exception is for -characters outside the Unicode Basic Multilingual Plane, which are -represented with UTF-16 surrogate pairs, and thus will count as 2 characters -each. -
-The reason this method counts UTF-16 code units, instead of Unicode code -points (true characters), is for Matlab compatibility. -
-This is the string length method you probably want to use,
-not strlength_bytes
.
-
Returns a double array of the same size as obj. Returns NaNs for missing -strings. -
-See also: string.strlength_bytes -
-out =
strlength_bytes (obj)
¶String length in bytes. -
-Gets the length of each string in obj, counted in Unicode UTF-8
-code units (bytes). This is the same as numel(str)
for the corresponding
-Octave char vector for each string, but may not be what you
-actually want to use. You may want strlength
instead.
-
Returns a double array of the same size as obj. Returns NaNs for missing -strings. -
-See also: string.strlength -
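The difference between the two length methods can be sketched as follows. This is a minimal illustration, assuming the package is loaded with `pkg load tablicious`; the sample strings are made up for the example, and the non-ASCII character is built from its UTF-8 bytes so the result does not depend on how this snippet is saved.

```octave
pkg load tablicious

# ASCII-only input: each character is one UTF-16 code unit and one UTF-8 byte
s = string ({"abc", "hello"});
n_units = strlength (s);
n_bytes = strlength_bytes (s);

# A non-ASCII character (U+00E9), built from its two UTF-8 bytes 0xC3 0xA9
e_acute = string (char ([195 169]));
n_units_e = strlength (e_acute);
n_bytes_e = strlength_bytes (e_acute);
printf ("code units: %d, bytes: %d\n", n_units_e, n_bytes_e);
```

For text that stays within ASCII the two counts agree; they diverge as soon as a character needs more than one UTF-8 byte.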
-out =
strrep (obj, match, replacement)
¶out =
strrep (…, varargin)
¶Replace occurrences of pattern with other string. -
-Replaces matching substrings in obj with a given replacement string. -
-varargin is passed along to the core Octave strrep
function. This
-supports whatever options it does.
-TODO: Maybe document what those options are.
-
Returns a string array of the same size as obj. -
-out =
upper (obj)
¶Convert to upper case. -
-Converts all the characters in all the strings in obj to upper case. -
-This currently delegates to Octave’s own upper()
function to
-do the conversion, so whatever character class handling it has, this
-has.
-
Returns a string array of the same size as obj. -
-out =
struct2table (s)
¶out =
struct2table (…, 'AsArray'
, AsArray)
¶Convert struct to a table. -
-Converts the input struct s to a table
.
-
s may be a scalar struct or a nonscalar struct array. -
-The AsArray option is not implemented yet. -
-Returns a table
.
-
Tabular data array containing multiple columnar variables. -
-A table
is a tabular data structure that collects multiple parallel
-named variables.
-Each variable is treated like a column. (Possibly a multi-columned column, if
-that makes sense.)
-The types of variables may be heterogeneous.
-
A table object is like an SQL table or resultset, or a relation, or a -DataFrame in R or Pandas. -
-A table is an array in itself: its size is nrows-by-nvariables, -and you can index along the rows and variables by indexing into the table -along dimensions 1 and 2. -
-A note on accessing properties of a table
array: Because .-indexing is
-used to access the variables inside the array, it can’t also be directly used
-to access properties as well. Instead, do t.Properties.<property>
for
-a table t
. That will give you a property instead of a variable.
-(And due to this mechanism, it will cause problems if you have a table
-with a variable named Properties
. Try to avoid that.)
-
WARNING ABOUT HANDLE CLASSES IN TABLE VARIABLES -
-Using a handle class in a table variable (column) value may lead to unpredictable -and buggy behavior! A handle class array is a reference type, and it holds shared -mutable state, which may be shared with references to it in other table arrays or -outside the table array. The table class makes no guarantees about what it will -or will not do internally with arrays that are held in table variables, and any -operation on a table holding handle arrays may have unpredictable and undesirable -side effects. These side effects may change between versions of Tablicious. -
-We currently recommend that you do not use handle classes in table variables. It -may be okay to use handle classes *inside* cells or other non-handle composite types -that are used in table variables, but this hasn’t been fully thought through or -tested. -
-See also: tblish.table.grpstats, tblish.evalWithTableVars, tblish.examples.SpDb -
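As a minimal sketch of the indexing rules described above, the following builds a small table and reads a variable and a property from it. The data and the variable names `id` and `name` are hypothetical, and the names are passed explicitly through the `'VariableNames'` option rather than relying on automatic detection.

```octave
pkg load tablicious

# Hypothetical example data
id = [1; 2; 3];
name = {"alice"; "bob"; "carol"};
t = table (id, name, 'VariableNames', {'id', 'name'});

# A table is nrows-by-nvariables
sz = size (t);

# .-indexing with a variable name returns that variable's value
ids = t.id;

# Properties are reached through t.Properties, not directly
var_names = t.Properties.VariableNames;
```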
-table
: cellstr
VariableNames ¶The names of the variables in the table, as a cellstr row vector. -
-table
: cell
VariableValues ¶A cell vector containing the values for each of the variables.
-VariableValues(i)
corresponds to VariableNames(i)
.
-
table
: cellstr
RowNames ¶An optional list of row names that identify each row in the table. This -is a cellstr column vector, if present. -
-table
: cellstr
DimensionNames ¶Names for the two dimensions of the table array, as a cellstr row vector. Always
-exactly 2-long, because tables are always exactly 2-D. Defaults to
-{"Row", "Variables"}
. (I feel the singular "Row" and plural "Variables" here
-are inconsistent, but that’s what Matlab uses, so Tablicious uses it too, for
-Matlab compatibility.)
-
out =
addvars (obj, var1, …, varN)
¶out =
addvars (…, 'Before'
, Before)
¶out =
addvars (…, 'After'
, After)
¶out =
addvars (…, 'NewVariableNames'
, NewVariableNames)
¶Add variables to table. -
-Adds the specified variables to a table. -
-[outA, ixA, outB, ixB] =
antijoin (A, B)
¶Natural antijoin (AKA “semidifference”). -
-Computes the anti-join of A and B. The anti-join is defined as all the -rows from one input which do not have matching rows in the other input. -
-Returns: - outA - all the rows in A with no matching row in B - ixA - the row indexes into A which produced outA - outB - all the rows in B with no matching row in A - ixB - the row indexes into B which produced outB -
-This is a Tablicious/Octave extension, not defined in the Matlab table interface. -
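A small sketch of an antijoin, assuming two hypothetical tables that share only the key variable `k`, so that `k` is what the natural match is computed on:

```octave
pkg load tablicious

# Hypothetical inputs; 'k' is the only variable name they have in common
A = table ([1; 2; 3], [10; 20; 30], 'VariableNames', {'k', 'a'});
B = table ([2; 3; 4], [200; 300; 400], 'VariableNames', {'k', 'b'});

# Rows of A with no match in B, and rows of B with no match in A
[outA, ixA, outB, ixB] = antijoin (A, B);
unmatched_in_A = outA.k;
unmatched_in_B = outB.k;
```

Here the key values 2 and 3 appear on both sides, so only `k = 1` from A and `k = 4` from B should remain.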
-[out, ixs] =
cartesian (A, B)
¶Cartesian product of two tables. -
-Computes the Cartesian product of two tables. The Cartesian product is -each row in A combined with each row in B. -
-Due to the definition and structural constraints of table, the two inputs -must have no variable names in common. It is an error if they do. -
-The Cartesian product is seldom used in practice. If you find yourself -calling this method, you should step back and re-evaluate what you are -doing, asking yourself if that is really what you want to happen. If nothing -else, writing a function that calls cartesian() is usually much less -efficient than alternate ways of arriving at the same result. -
-This implementation does not remove duplicate values. -TODO: Determine whether this duplicate-retaining behavior is correct. -
-The ordering of the rows in the output is not specified, and may be implementation- -dependent. TODO: Determine if we can lock this behavior down to a fixed, -defined ordering, without killing performance. -
-This is a Tablicious/Octave extension, not defined in the Matlab table interface. -
-out =
convertvars (obj, vars, dataType)
¶Convert variables to specified data type. -
-Converts the variables in obj specified by vars to the specified data type. -
-vars is a cellstr or numeric vector specifying which variables to convert. -
-dataType specifies the data type to convert those variables to. It is either -a char holding the name of the data type, or a function handle which will -perform the conversion. If it is the name of the data type, there must -either be a one-arg constructor of that type which accepts the specified -variables’ current types as input, or a conversion method of that name -defined on the specified variables’ current type. -
-Returns a table with the same variable names as obj, but with converted -types. -
-[G, TID] =
findgroups (obj)
¶Find groups within a table’s row values. -
-Finds groups within a table’s row values and gets group numbers. A group -is a set of rows that have the same values in all their variable elements. -
-Returns: - G - A double column vector of group numbers created from obj. - TID - A table containing the row values corresponding to the group numbers. -
-[out, name]
= getvar (obj, varRef)
¶Get value and name for single table variable. -
-varRef is a variable reference. It may be a name or an index. It -may only specify a single table variable. -
-Returns: - out – the value of the referenced table variable - name – the name of the referenced table variable -
-[out1, …]
= getvars (obj, varRef)
¶Get values for one or more table variables. -
-varRef is a variable reference in the form of variable names or -indexes. -
-Returns as many outputs as varRef referenced variables. Each output -contains the contents of the corresponding table variable. -
-[out] =
groupby (obj, groupvars, aggcalcs)
¶Find groups in table data and apply functions to variables within groups. -
-This works like an SQL "SELECT ... GROUP BY ..."
statement.
-
groupvars (cellstr, numeric) is a list of the grouping variables, -identified by name or index. -
-aggcalcs is a specification of the aggregate calculations to perform
-on them, in the form {
out_var,
fcn,
in_vars; ...}
, where:
- out_var (char) is the name of the output variable
- fcn (function handle) is the function to apply to produce it
- in_vars (cellstr) is a list of the input variables to pass to fcn
-
Returns a table. -
-This is a Tablicious/Octave extension, not defined in the Matlab table interface. -
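The aggcalcs layout can be sketched like this. The table, the grouping variable `store`, and the output names `total_sales` and `n_rows` are all hypothetical; each row of the cell array is one `{out_var, fcn, in_vars}` triple.

```octave
pkg load tablicious

# Hypothetical sales data with a numeric grouping variable
t = table ([1; 2; 1; 2], [10; 20; 30; 40], ...
           'VariableNames', {'store', 'sales'});

# One row per aggregate: output name, function handle, input variables
aggcalcs = {
  'total_sales', @sum,   {'sales'}
  'n_rows',      @numel, {'sales'}
};

out = groupby (t, {'store'}, aggcalcs);
```

This is roughly the counterpart of `SELECT store, SUM(sales), COUNT(sales) ... GROUP BY store` in SQL.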
-out =
height (obj)
¶Number of rows in table. -
-For a zero-variable table, this currently always returns 0. This is a bug, -and will change in the future. It should be possible for zero-variable table -arrays to have any number of rows. -
-out =
horzcat (varargin)
¶Horizontal concatenation. -
-Combines tables by horizontally concatenating them. -Inputs that are not tables are automatically converted to tables by calling -table() on them. Inputs must have all distinct variable names. -
-Output has the same RowNames as varargin{1}
. The variable names and values
-are the result of the concatenation of the variable names and values lists
-from the inputs.
-
[out, ixa, ixb] =
innerjoin (A, B)
¶[…] =
innerjoin (A, B, …)
¶Combine two tables by rows using key variables. -
-Computes the relational inner join between two tables. “Inner” means that -only rows which had matching rows in the other input are kept in the -output. -
-TODO: Document options. -
-Returns: - out - A table that is the result of joining A and B - ixa - Indexes into A for each row in out - ixb - Indexes into B for each row in out -
-[C, ia, ib] =
intersect (A, B)
¶Set intersection. -
-Computes the intersection of two tables. The intersection is defined to be the unique -row values which are present in both of the two input tables. -
-Returns: - C - A table containing all the unique row values present in both A and B. - ia - Row indexes into A of the rows from A included in C. - ib - Row indexes into B of the rows from B included in C. -
-out =
isempty (obj)
¶Test whether array is empty. -
-For tables, isempty
is true if the number of rows is 0 or the number
-of variables is 0.
-
[tf, loc] =
ismember (A, B)
¶Set membership. -
-Finds rows in A that are members of B. -
-Returns: - tf - A logical vector indicating whether each A(i,:) was present in B. - loc - Indexes into B of rows that were found. -
-out =
ismissing (obj)
¶out =
ismissing (obj, indicator)
¶Find missing values. -
-Finds missing values in obj’s variables. -
-If indicator is not supplied, uses the standard missing values for each -variable’s data type. If indicator is supplied, the same indicator list is -applied across all variables. -
-All variables in this must be vectors. (This is due to the requirement
-that size(out) == size(obj)
.)
-
Returns a logical array the same size as obj. -
-[C, ib] =
join (A, B)
¶[C, ib] =
join (A, B, …)
¶Combine two tables by rows using key variables, in a restricted form. -
-This is not a "real" relational join operation. It has the restrictions -that: - 1) The key values in B must be unique. - 2) Every key value in A must map to a key value in B. -These are restrictions inherited from the Matlab definition of table.join. -
-You probably don’t want to use this method. You probably want to use -innerjoin or outerjoin instead. -
-See also: table.innerjoin, table.outerjoin -
-out =
mergevars (obj, vars)
¶out =
mergevars (…, 'NewVariableName'
, NewVariableName)
¶out =
mergevars (…, 'MergeAsTable'
, MergeAsTable)
¶Merge table variables into a single variable. -
-out =
movevars (obj, vars, relLocation, location)
¶Move around variables in a table. -
-vars is a list of variables to move, specified by name or index. -
-relLocation is 'Before'
or 'After'
.
-
location indicates a single variable to use as the target location, -specified by name or index. If it is specified by index, it is the index -into the list of *unmoved* variables from obj, not the original full -list of variables in obj. -
-Returns a table with the same variables as obj, but in a different order. -
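A short sketch of reordering by name, using a hypothetical three-variable table. Referring to the target location by name sidesteps the question of which list an index would refer to.

```octave
pkg load tablicious

# Hypothetical table with variables a, b, c
t = table ([1; 2], [3; 4], [5; 6], 'VariableNames', {'a', 'b', 'c'});

# Move 'c' so that it sits immediately before 'a'
out = movevars (t, 'c', 'Before', 'a');
new_order = out.Properties.VariableNames;
```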
-out =
ndims (obj)
¶Number of dimensions -
-For tables, ndims(obj)
is always 2, because table arrays are always
-2-D (rows-by-columns).
-
out =
numel (obj)
¶Total number of elements in table (actually 1). -
-For compatibility reasons with Octave’s OOP interface and subsasgn behavior, -table’s numel is defined to always return 1. It is not useful for client -code to query a table’s size using numel. This is an incompatibility with -Matlab. -
-out =
outerfillvals (obj)
¶Get fill values for outer join. -
-Returns a table with the same variables as this, but containing only -a single row whose variable values are the values to use as fill values -when doing an outer join. -
-[out, ixa, ixb] =
outerjoin (A, B)
¶[…] =
outerjoin (A, B, …)
¶Combine two tables by rows using key variables, retaining unmatched rows. -
-Computes the relational outer join of tables A and B. This is like a -regular join, but also includes rows in each input which did not have -matching rows in the other input; the columns from the missing side are -filled in with placeholder values. -
-TODO: Document options. -
-Returns: - out - A table that is the result of the outer join of A and B - ixa - indexes into A for each row in out - ixb - indexes into B for each row in out -
-(obj)
¶Display table’s values in tabular format. This prints the contents -of the table in human-readable, tabular form. -
-Variables which contain objects are displayed using the strings
-returned by their dispstrs
method, if they define one.
-
[out, ixs] =
realjoin (A, B)
¶[…] =
realjoin (A, B, …)
¶"Real" relational inner join, without key restrictions -
-Performs a "real" relational natural inner join between two tables, -without the key restrictions that JOIN imposes. -
-Currently does not support tables which have RowNames. This may be -added in the future. -
-This is a Tablicious/Octave extension, not defined in the Matlab table interface. -
-Name/value option arguments are: Keys, LeftKeys, RightKeys, -LeftVariables, RightVariables. -
-FIXME: Document those options. -
-Returns: - out - A table that is the result of joining A and B - ixs - Indexes into A for each row in out -
-out =
removevars (obj, vars)
¶Remove variables from table. -
-Deletes the variables specified by vars from obj. -
-vars may be a char, cellstr, numeric index vector, or logical -index vector. -
-out =
renamevars (obj, renameMap)
¶Rename variables in a table. -
-Renames selected variables in the table obj based on the mapping -provided in renameMap. -
-renameMap is an n-by-2 cellstr array, with the old variable names -in the first column, and the corresponding new variable names in the -second column. -
-Variables which are not included in renameMap are not modified. -
-It is an error if any variables named in the first column of renameMap -are not present in obj. -
-Returns a table with the same variables as obj, but with the selected variables renamed. -
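As an illustration of the renameMap layout, with hypothetical variable names, each row of the cellstr pairs an old name with its new name:

```octave
pkg load tablicious

# Hypothetical table with variables a and b
t = table ([1; 2], [3; 4], 'VariableNames', {'a', 'b'});

# Each row is {old_name, new_name}; variables not listed keep their names
rename_map = {'a', 'x'};
out = renamevars (t, rename_map);
new_names = out.Properties.VariableNames;
```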
out =
repelem (obj, R)
¶out =
repelem (obj, R_1, R_2)
¶Replicate elements of matrix. -
-Replicates elements of this table matrix by applying repelem to each of -its variables. -
-Only two dimensions are supported for repelem
on tables.
-
out =
repmat (obj, sz)
¶Replicate matrix. -
-Repmats a table by repmatting each of its variables vertically. -
-For tables, repmatting is only supported along dimension 1. That is, the -values of sz(2:end) must all be exactly 1. This behavior may change in the -future to support repmatting horizontally, with the added variable names being -automatically changed to maintain uniqueness of variable names within the -resulting table. -
-Returns a new table with the same variable names and types as obj, but -with a possibly different row count. -
-out =
restrict (obj, expr)
¶out =
restrict (obj, ix)
¶Subset rows using variable expression or index. -
-Subsets a table row-wise, using either an index vector or an expression -involving obj’s variables. -
-If the argument is a numeric or logical vector, it is interpreted as an -index into the rows of this. (Just as with ‘subsetrows (this, index)‘.) -
-If the argument is a char, then it is evaluated as an M-code expression,
-with all of this’ variables available as workspace variables, as with
-tblish.evalWithTableVars
. The output of expr must be a numeric or logical index
-vector. (This form is a shorthand for
-out = subsetrows (this, tblish.evalWithTableVars (this, expr))
.)
-
TODO: Decide whether to name this to "where" to be more like SQL instead -of relational algebra. -
-Examples: -
[s,p,sp] = tblish.examples.SpDb; -prettyprint (restrict (p, 'Weight >= 14 & strcmp(Color, "Red")')) -
This is a Tablicious/Octave extension, not defined in the Matlab table interface. -
-See also: tblish.evalWithTableVars -
-out =
rowfun (func, obj)
¶out =
rowfun (…, 'OptionName'
, OptionValue, …)
¶Apply function to rows in table and collect outputs. -
-This applies the function func to the elements of each row of -obj’s variables, and collects the concatenated output(s) into the -variable(s) of a new table. -
-func is a function handle. It should take as many inputs as there
-are variables in obj. Or, it can take a single input, and you must
-specify 'SeparateInputs', false
to have the input variables
-concatenated before being passed to func. It may return multiple
-argouts, but to capture those past the first one, you must explicitly
-specify the 'NumOutputs'
or 'OutputVariableNames'
options.
-
Supported name/value options: -
'OutputVariableNames'
Names of table variables to store combined function output arguments in. -
'NumOutputs'
Number of output arguments to call function with. If omitted, defaults to -number of items in OutputVariableNames if it is supplied, otherwise -defaults to 1. -
'SeparateInputs'
If true, input variables are passed as separate input arguments to func. -If false, they are concatenated together into a row vector and passed as -a single argument. Defaults to true. -
'ErrorHandler'
A function to call as a fallback when calling func results in an error. -It is passed the caught exception, along with the original inputs passed -to func, and it has a “second chance” to compute replacement values -for that row. This is useful for converting raised errors to missing-value -fill values, or logging warnings. -
'ExtractCellContents'
Whether to “pop out” the contents of the elements of cell variables in -obj, or to leave them as cells. True/false; default is false. If -you specify this option, then obj may not have any multi-column -cell-valued variables. -
'InputVariables'
If specified, only these variables from obj are used as the function -inputs, instead of using all variables. -
'GroupingVariables'
Not yet implemented. -
'OutputFormat'
The format of the output. May be 'table'
(the default),
-'uniform'
, or 'cell'
. If it is 'uniform'
or 'cell'
,
-the output variables are returned in multiple output arguments from
-'rowfun'
.
-
Returns a table
whose variables are the collected output arguments
-of func if OutputFormat is 'table'
. Otherwise, returns
-multiple output arguments of whatever type func returned (if
-OutputFormat is 'uniform'
) or cells (if OutputFormat
-is 'cell'
).
-
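A minimal sketch of a row-wise call, assuming the method is invoked as `rowfun` (the name the OutputFormat entry above uses) and using a hypothetical two-variable table. The function handle takes one input per table variable, and `'OutputVariableNames'` names the single collected output.

```octave
pkg load tablicious

# Hypothetical table with two numeric variables
t = table ([1; 2; 3], [10; 20; 30], 'VariableNames', {'a', 'b'});

# func receives one argument per variable, evaluated once per row
out = rowfun (@(a, b) a + b, t, 'OutputVariableNames', {'total'});
totals = out.total;
```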
out =
rows2vars (obj)
¶out =
rows2vars (obj, 'VariableNamesSource'
, VariableNamesSource)
¶out =
rows2vars (…, 'DataVariables'
, DataVariables)
¶Reorient table, swapping rows and variables dimensions. -
-This flips the dimensions of the given table obj, swapping the -orientation of the contained data, and swapping the row names/labels -and variable names. -
-The variable names become a new variable named “OriginalVariableNames”. -
-The new variable names are drawn from the column VariableNamesSource if it -is specified. Otherwise, if obj has row names, they are used. -Otherwise, new variable names in the form “VarN” are generated. -
-If all the variables in obj are of the same type, they are concatenated -and then sliced to create the new variable values. Otherwise, they are -converted to cells, and the new table has cell variable values. -
-[outA, ixA, outB, ixB] =
semijoin (A, B)
¶Natural semijoin. -
-Computes the natural semijoin of tables A and B. The semi-join of tables -A and B is the set of all rows in A which have matching rows in B, based -on comparing the values of variables with the same names. -
-This method also computes the semijoin of B and A, for convenience. -
-Returns: - outA - all the rows in A with matching row(s) in B - ixA - the row indexes into A which produced outA - outB - all the rows in B with matching row(s) in A - ixB - the row indexes into B which produced outB -
-This is a Tablicious/Octave extension, not defined in the Matlab table interface. -
-[C, ia] =
setdiff (A, B)
¶Set difference. -
-Computes the set difference of two tables. The set difference is defined to be -the unique row values which are present in table A that are not in table B. -
-Returns: - C - A table containing the unique row values in A that were not in B. - ia - Row indexes into A of the rows from A included in C. -
-out =
setDimensionNames (obj, names)
¶out =
setDimensionNames (obj, ix, names)
¶Set dimension names. -
-Sets the DimensionNames
for this table to a new list of names.
-
names is a char or cellstr vector. It must have the same number of elements -as the number of dimension names being assigned. -
-ix is an index vector indicating which dimension names to set. If -omitted, it sets both of them. Since there are always two dimensions, -the indexes in ix may never be higher than 2. -
-This method exists because the obj.Properties.DimensionNames = …
-assignment form did not originally work, possibly due to an Octave bug, or more
-likely due to a bug in Tablicious prior to the early 0.4.x versions. That was
-fixed around 0.4.4. This method may be deprecated and removed at some point, since
-it is not part of the standard Matlab table interface, and is now redundant with
-the obj.Properties.DimensionNames = …
assignment form.
-
out =
setRowNames (obj, names)
¶Set row names. -
-Sets the row names on obj to names. -
-names is a cellstr column vector, with the same number of rows as -obj has. -
-out =
setvar (obj, varRef, value)
¶Set value for a variable in table. -
-This sets (adds or replaces) the value for a variable in obj. It -may be used to change the value of an existing variable, or add a new -variable. -
-This method exists primarily because I cannot get obj.foo = value
to work,
-apparently due to an issue with Octave’s subsasgn support.
-
varRef is a variable reference, either the index or name of a variable. -If you are adding a new variable, it must be a name, and not an index. -
-value is the value to set the variable to. If it is scalar or -a single string as charvec, it is scalar-expanded to match the number -of rows in obj. -
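Both uses of setvar, adding and replacing, can be sketched on a hypothetical one-variable table; the variable names `x` and `flag` are made up for the example.

```octave
pkg load tablicious

# Hypothetical table with a single variable
t = table ([1; 2; 3], 'VariableNames', {'x'});

# Add a new variable by name; the scalar is expanded to one value per row
t = setvar (t, 'flag', true);

# Replace the value of an existing variable
t = setvar (t, 'x', [10; 20; 30]);
```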
-out =
setVariableNames (obj, names)
¶out =
setVariableNames (obj, ix, names)
¶Set variable names. -
-Sets the VariableNames
for this table to a new list of names.
-
names is a char or cellstr vector. It must have the same number of elements -as the number of variable names being assigned. -
-ix is an index vector indicating which variable names to set. If -omitted, it sets all of them present in obj. -
-This method exists because the obj.Properties.VariableNames = …
-assignment form does not work, possibly due to an Octave bug.
-
[C, ia, ib] =
setxor (A, B)
¶Set exclusive OR. -
-Computes the setwise exclusive OR of two tables. The set XOR is defined to be -the unique row values which are present in one or the other of the two input -tables, but not in both. -
-Returns: - C - A table containing all the unique row values in the set XOR of A and B. - ia - Row indexes into A of the rows from A included in C. - ib - Row indexes into B of the rows from B included in C. -
-sz =
size (obj)
¶[nr, nv] =
size (obj)
¶[nr, nv, …] =
size (obj)
¶Gets the size of a table. -
-For tables, the size is [number-of-rows x number-of-variables].
-This is the same as [height(obj), width(obj)]
.
-
out =
splitapply (func, obj, G)
¶[Y1, …, YM] =
splitapply (func, obj, G)
¶Split table data into groups and apply function. -
-Performs a splitapply, using the variables in obj as the input X variables
-to the splitapply
function call.
-
See also: splitapply, table.groupby, tblish.table.grpstats -
-out =
splitvars (obj)
¶out =
splitvars (obj, vars)
¶out =
splitvars (…, 'NewVariableNames'
, NewVariableNames)
¶Split multicolumn table variables. -
-Splits multicolumn table variables into new single-column variables. -If vars is supplied, splits only those variables. If vars -is not supplied, splits all multicolumn variables. -
-obj =
squeeze (obj)
¶Remove singleton dimensions. -
-For tables, this is always a no-op that returns the input unmodified, -because tables always have exactly 2 dimensions, and 2-D arrays are unaffected -by squeeze. -
-out =
stack (obj, vars)
¶out =
stack (…, 'NewDataVariableName'
, NewDataVariableName)
¶out =
stack (…, 'IndexVariableName'
, IndexVariableName)
¶Stack multiple table variables into a single variable. -
-summary
(obj) ¶Display a summary of a table’s data. -
-Displays a summary of data in the input table. This will contain some -statistical information on each of its variables. The output is printed -to the Octave console (command window, stdout, or the like in your current -session), in a format suited for human consumption. The output format is -not fixed or formally defined, and may change over time. It is only -suitable for human display, and not for parsing or programmatic use. -
-This method supports, to some degree, extension by other packages. If your -Octave session has loaded other packages which supply extension implementations -of ‘summary‘, Tablicious will use those in preference to its own internal -implementation, and you will get different, and hopefully better, output. -
-obj =
table ()
¶Constructs a new empty (0 rows by 0 variables) table. -
-obj =
table (var1, var2, …, varN)
¶Constructs a new table from the given variables. The variables passed as -inputs to this constructor become the variables of the table. Their names -are automatically detected from the input variable names that you used. -
-Note: If you call the constructor with exactly three arguments, and the first -argument is exactly the value ’__tblish_backdoor__’, that will trigger a special internal-use -backdoor calling form, and you will get incorrect results. This is a bug in -Tablicious. -
-obj =
table ('Size'
, sz, 'VariableTypes'
, varTypes)
¶Constructs a new table of the given size, and with the given variable types. -The variables will contain the default value for elements of that type. -
-obj =
table (…, 'VariableNames'
, varNames)
¶obj =
table (…, 'RowNames'
, rowNames)
¶Specifies the variable names or row names to use in the constructed table. -Overrides the implicit names garnered from the input variable names. -
-s =
table2array (obj)
¶Converts obj to a homogeneous array. -
-c =
table2cell (obj)
¶Converts table to a cell array. Each variable in obj becomes -one or more columns in the output, depending on how many columns -that variable has. -
-Returns a cell array with the same number of rows as obj, and -with as many or more columns as obj has variables. -
-s =
table2struct (obj)
¶s =
table2struct (…, 'ToScalar'
, trueOrFalse)
¶Converts obj to a scalar structure or structure array. -
-Row names are not included in the output struct. To include them, you -must add them manually: - s = table2struct (tbl, 'ToScalar', true); - s.RowNames = tbl.Properties.RowNames; -
-Returns a scalar struct or struct array, depending on the value of the
-ToScalar
option.
-
[C, ia, ib] =
union (A, B)
¶Set union. -
-Computes the union of two tables. The union is defined to be the unique -row values which are present in either of the two input tables. -
-Returns: - C - A table containing all the unique row values present in A or B. - ia - Row indexes into A of the rows from A included in C. - ib - Row indexes into B of the rows from B included in C. -
-out =
varfun (fcn, obj)
¶out =
varfun (…, 'OutputFormat'
, outputFormat)
¶out =
varfun (…, 'InputVariables'
, vars)
¶out =
varfun (…, 'ErrorHandler'
, errorFcn)
¶Apply function to table variables. -
-Applies the given function fcn to each variable in obj, -collecting the output in a table, cell array, or array of another type. -
-out =
varnames (obj)
¶out =
varnames (obj, varNames)
¶Get or set variable names for a table. -
-Returns a cellstr in the getter form. Returns an updated table in the -setter form. -
-out =
vertcat (varargin)
¶Vertical concatenation. -
-Combines tables by vertically concatenating them. -
-Inputs that are not tables are automatically converted to tables by calling -table() on them. -
-The inputs must have the same number and names of variables, and their -variable value types and sizes must be cat-compatible. The types of the resulting -variables are the types that result from doing a ‘vertcat()‘ on the variables -from the corresponding input tables, in the order they were input in. -
-out =
tail (A)
¶out =
tail (A, k)
¶Get last K rows of an array. -
-Returns the array A, subsetted to its last k rows. This means
-subsetting it to the last (min (k, size (A, 1)))
elements along
-dimension 1, and leaving all other dimensions unrestricted.
-
A is the array to subset. -
-k is the number of rows to get. k defaults to 8 if it is omitted -or empty. -
-If there are fewer than k rows in A, returns all rows. -
-Returns an array of the same type as A, unless ()-indexing A -produces an array of a different type, in which case it returns that type. -
-See also: head -
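The row-subsetting rule can be sketched on a plain numeric column vector. The input here is arbitrary example data.

```octave
pkg load tablicious

A = (1:10)';

# Explicit k: the last 3 rows
last3 = tail (A, 3);

# k omitted: defaults to the last 8 rows
last_default = tail (A);

# k larger than the row count: all rows are returned
everything = tail (A, 50);
```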
-The tblish.dataset
class provides convenient access to the various
-datasets included with Tablicious.
-
This class just contains a bunch of static methods, each of which loads -the dataset of that name. It is provided as a convenience so you can use tab -completion or other run-time introspection on the dataset list. -
-out =
airmiles ()
¶Passenger Miles on Commercial US Airlines, 1937-1960 -
-The revenue passenger miles flown by commercial airlines in the -United States for each year from 1937 to 1960. -
-F.A.A. Statistical Handbook of Aviation. -
-t = tblish.dataset.airmiles;
-plot (t.year, t.miles);
-title ("airmiles data");
-xlabel ("Passenger-miles flown by U.S. commercial airlines")
-ylabel ("airmiles");
-
-
out =
AirPassengers ()
¶Monthly Airline Passenger Numbers 1949-1960 -
-The classic Box & Jenkins airline data. Monthly totals of international -airline passengers, 1949 to 1960. -
-Box, G. E. P., Jenkins, G. M. and Reinsel, G. C. (1976). Time Series -Analysis, Forecasting and Control. Third Edition. San Francisco: Holden-Day. -Series G. -
-## TODO: This example needs to be ported from R. - -
out =
airquality ()
¶New York Air Quality Measurements from 1973 -
-Daily air quality measurements in New York, May to September 1973. -
-Ozone
Ozone concentration (ppb) -
SolarR
Solar R (lang) -
Wind
Wind (mph) -
Temp
Temperature (degrees F) -
Month
Month (1-12) -
Day
Day of month (1-31) -
New York State Department of Conservation (ozone data) and the National -Weather Service (meteorological data). -
-Chambers, J. M., Cleveland, W. S., Kleiner, B. and Tukey, P. A. (1983). -Graphical Methods for Data Analysis. Belmont, CA: Wadsworth. -
-t = tblish.dataset.airquality
-# Plot a scatter-plot plus a fitted line, for each combination of measurements
-vars = {"Ozone", "SolarR", "Wind", "Temp", "Month", "Day"};
-n_vars = numel (vars);
-figure;
-for i = 1:n_vars
-  for j = 1:n_vars
-    if (i == j)
-      continue
-    endif
-    ix_subplot = (n_vars * (j - 1) + i);
-    hax = subplot (n_vars, n_vars, ix_subplot);
-    var_x = vars{i};
-    var_y = vars{j};
-    x = t.(var_x);
-    y = t.(var_y);
-    scatter (hax, x, y, 10);
-    # Fit a cubic line to these points
-    # TODO: Find out exactly what kind of fitted line R's example is using, and
-    # port that.
-    hold on
-    p = polyfit (x, y, 3);
-    x_hat = unique(x);
-    p_y = polyval (p, x_hat);
-    plot (hax, x_hat, p_y, "r");
-  endfor
-endfor
-
-
out =
anscombe ()
¶Anscombe’s Quartet of “Identical” Simple Linear Regressions -
-Four sets of x/y pairs that share nearly identical summary statistics, but -look very different when plotted. -
-The data comes in an array of 4 structs, each with fields as follows: -
-x
The X values for this pair. -
y
The Y values for this pair. -
Tufte, Edward R. (1989). The Visual Display of Quantitative Information. -13–14. Cheshire, CT: Graphics Press. -
-Anscombe, Francis J. (1973). Graphs in statistical analysis. The -American Statistician, 27, 17–21. -
-data = tblish.dataset.anscombe - -# Pick good limits for the plots -all_x = [data.x]; -all_y = [data.y]; -x_limits = [min(0, min(all_x)) max(all_x)*1.2]; -y_limits = [min(0, min(all_y)) max(all_y)*1.2]; - -# Do regression on each pair and plot the input and results -figure; -haxs = NaN (1, 4); -for i_pair = 1:4 - x = data(i_pair).x; - y = data(i_pair).y; - # TODO: Port the anova and other characterizations from the R code - # TODO: Do a linear regression and plot its line - hax = subplot (2, 2, i_pair); - haxs(i_pair) = hax; - # Plot first, then label: a fresh plot call may reset the axes labels - scatter (x, y, "r"); - xlabel (sprintf ("x%d", i_pair)); - ylabel (sprintf ("y%d", i_pair)); -endfor - -# Fiddle with the plot axes parameters -linkaxes (haxs); -xlim (haxs(1), x_limits); -ylim (haxs(1), y_limits); - -
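The "linear regression" TODO in the example above could be filled in along the following lines. This is only a sketch: it types in the first Anscombe x/y pair directly so that it runs without Tablicious, and it uses core Octave's polyfit and polyval rather than whatever the R example uses.

```octave
# Anscombe's first x/y pair, typed in here so the sketch is self-contained
x = [10 8 13 9 11 14 6 4 12 7 5];
y = [8.04 6.95 7.58 8.81 8.33 9.96 7.24 4.26 10.84 4.82 5.68];

# A degree-1 polyfit is an ordinary least-squares line:
# p(1) is the slope and p(2) is the intercept
p = polyfit (x, y, 1);

# Evaluate the fitted line across the observed x range
x_hat = linspace (min (x), max (x), 50);
y_hat = polyval (p, x_hat);

printf ("slope = %.3f, intercept = %.3f\n", p(1), p(2));
```

Inside the plotting loop, the fitted line could then be drawn with hold on followed by plot (x_hat, y_hat) after the scatter call.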
out =
attenu ()
¶Joyner-Boore Earthquake Attenuation Data -
-Event data for 23 earthquakes in California, showing peak accelerations. -
-event
Event number -
mag
Moment magnitude -
station
Station identifier -
dist
Station-hypocenter distance (km) -
accel
Peak acceleration (g) -
Joyner, W.B., D.M. Boore and R.D. Porcella (1981). Peak horizontal acceleration -and velocity from strong-motion records including records from the 1979 -Imperial Valley, California earthquake. USGS Open File report 81-365. Menlo -Park, CA. -
-Boore, D. M. and Joyner, W. B. (1982). The empirical prediction of ground -motion. Bulletin of the Seismological Society of America, 72, S269–S268. -
-# TODO: Port the example code from R -# It does coplot() and pairs(), which are higher-level plotting tools -# than core Octave provides. This could turn into a long example if we -# just use base Octave here. -
out =
attitude ()
¶The Chatterjee-Price Attitude Data -
-Aggregated data from a survey of clerical employees at a large financial -organization. -
-rating
Overall rating. -
complaints
Handling of employee complaints. -
privileges
Does not allow special privileges. -
learning
Opportunity to learn. -
raises
Raises based on performance. -
critical
Too critical. -
advance
Advancement. -
Chatterjee, S. and Price, B. (1977). Regression Analysis by Example. New York: -Wiley. (Section 3.7, p.68ff of 2nd ed.(1991).) -
-t = tblish.dataset.attitude - -tblish.examples.plot_pairs (t); - -# TODO: Display table summary - -# TODO: Whatever those statistical linear-model plots are that R is doing - - -
out =
austres ()
¶Australian Population -
-Numbers of Australian residents measured quarterly from March 1971 to March 1994. -
-date
The month of the observation. -
residents
The number of residents. -
Brockwell, P. J. and Davis, R. A. (1996). Introduction to Time Series and -Forecasting. New York: Springer-Verlag. -
-t = tblish.dataset.austres - -plot (datenum (t.date), t.residents); -datetick x -xlabel ("Month"); ylabel ("Residents"); title ("Australian Residents"); - -
out =
beavers ()
¶Body Temperature Series of Two Beavers -
-Body temperature readings for two beavers. -
-day
Day of observation (in days since the beginning of 1990), December 12–13 (beaver1) -and November 3–4 (beaver2). -
time
Time of observation, in the form 0330 for 3:30am -
temp
Measured body temperature in degrees Celsius. -
activ
Indicator of activity outside the retreat. -
P. S. Reynolds (1994) Time-series analyses of beaver body temperatures. -Chapter 11 of Lange, N., Ryan, L., Billard, L., Brillinger, D., Conquest, -L. and Greenhouse, J. (Eds.) (1994) Case Studies in Biometry. New York: John Wiley -and Sons. -
-# TODO: This example needs to be ported from R. -
out =
BJsales ()
¶Sales Data with Leading Indicator -
-Sales data together with a leading indicator, as given in Box & Jenkins (1976). -
-record
Index of the record. -
lead
Leading indicator. -
sales
Sales volume. -
The data are given in Box & Jenkins (1976). Obtained from the Time Series Data -Library at http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/. -
-Box, G. E. P. and Jenkins, G. M. (1976). Time Series Analysis, Forecasting and -Control. San Francisco: Holden-Day. p. 537. -
-Brockwell, P. J. and Davis, R. A. (1991). Time Series: Theory and Methods, -Second edition. New York: Springer-Verlag. p. 414. -
-# TODO: Come up with example code here - -
out =
BOD ()
¶Biochemical Oxygen Demand -
-Contains biochemical oxygen demand versus time in an evaluation of water quality. -
-Time
Time of the measurement (in days). -
demand
Biochemical oxygen demand (mg/l). -
Bates, D.M. and Watts, D.G. (1988). Nonlinear Regression Analysis and Its -Applications. New York: John Wiley & Sons. Appendix A1.4. -
-Originally from: Marske (1967). Biochemical Oxygen Demand Data -Interpretation Using Sum of Squares Surface, M.Sc. Thesis, University of -Wisconsin – Madison. -
-# TODO: Port this example from R - -
out =
cars ()
¶Speed and Stopping Distances of Cars -
-Speed of cars and distances taken to stop. Note that the data were recorded in the 1920s. -
-speed
Speed (mph). -
dist
Stopping distance (ft). -
Ezekiel, M. (1930). Methods of Correlation Analysis. New York: Wiley. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-- -t = tblish.dataset.cars; - - -# TODO: Add Lowess smoothed lines to the plots - -figure; -plot (t.speed, t.dist, "o"); -xlabel ("Speed (mph)"); ylabel ("Stopping distance (ft)"); -title ("cars data"); - -figure; -loglog (t.speed, t.dist, "o"); -xlabel ("Speed (mph)"); ylabel ("Stopping distance (ft)"); -title ("cars data (logarithmic scales)"); - -# TODO: Do the linear model plot - -# Polynomial regression -figure; -plot (t.speed, t.dist, "o"); -xlabel ("Speed (mph)"); ylabel ("Stopping distance (ft)"); -title ("cars polynomial regressions"); -hold on -xlim ([0 25]); -x2 = linspace (0, 25, 200); -for degree = 1:4 - [P, S, mu] = polyfit (t.speed, t.dist, degree); - y2 = polyval(P, x2, [], mu); - plot (x2, y2); -endfor - - -
out =
ChickWeight ()
¶Weight versus age of chicks on different diets -
-weight
a numeric vector giving the body weight of the chick (gm). -
Time
a numeric vector giving the number of days since birth when the -measurement was made. -
Chick
an ordered factor with levels 18 < ... < 48 giving a unique -identifier for the chick. The ordering of the levels groups chicks on the same -diet together and orders them according to their final weight (lightest to -heaviest) within diet. -
Diet
a factor with levels 1, ..., 4 indicating which experimental diet -the chick received. -
Crowder, M. and Hand, D. (1990). Analysis of Repeated Measures. London: Chapman and -Hall. (example 5.3) -
-Hand, D. and Crowder, M. (1996), Practical Longitudinal Data Analysis. London: Chapman -and Hall. (table A.2) -
-Pinheiro, J. C. and Bates, D. M. (2000) Mixed-effects Models in S and S-PLUS. -New York: Springer. -
-t = tblish.dataset.ChickWeight - -tblish.examples.coplot (t, "Time", "weight", "Chick"); - -
out =
chickwts ()
¶Chicken Weights by Feed Type -
-An experiment was conducted to measure and compare the effectiveness of various -feed supplements on the growth rate of chickens. -
-Newly hatched chicks were randomly allocated into six groups, and each group -was given a different feed supplement. Their weights in grams after six weeks -are given along with feed types. -
-weight
Chick weight at six weeks (gm). -
feed
Feed type. -
Anonymous (1948) Biometrika, 35, 214. -
-McNeil, D. R. (1977). Interactive Data Analysis
. New York: Wiley.
-
# This example requires the statistics package from Octave Forge - -t = tblish.dataset.chickwts - -# Boxplot by group -figure -g = groupby (t, "feed", { - "weight", @(x) {x}, "weight" -}); -boxplot (g.weight, 1); -xlabel ("feed"); ylabel ("Weight at six weeks (gm)"); -xticklabels ([{""} cellstr(g.feed')]); - -# Linear model -# TODO: This linear model thing and anova - -
out =
co2 ()
¶Mauna Loa Atmospheric CO2 Concentration -
-Atmospheric concentrations of CO2 are expressed in parts per million (ppm) and -reported in the preliminary 1997 SIO manometric mole fraction scale. Contains -monthly observations from 1959 to 1997. -
-date
Date of the month of the observation, as datetime. -
co2
CO2 concentration (ppm). -
The values for February, March and April of 1964 were missing and have -been obtained by interpolating linearly between the values for January -and May of 1964. -
-Keeling, C. D. and Whorf, T. P., Scripps Institution of Oceanography -(SIO), University of California, La Jolla, California USA 92093-0220. -
-ftp://cdiac.esd.ornl.gov/pub/maunaloa-co2/maunaloa.co2. -
-Cleveland, W. S. (1993). Visualizing Data
. New Jersey: Summit Press.
-
t = tblish.dataset.co2; - -plot (datenum (t.date), t.co2); -datetick ("x"); -xlabel ("Time"); ylabel ("Atmospheric concentration of CO2"); -title ("co2 data set"); - -
out =
crimtab ()
¶Student’s 3000 Criminals Data -
-Data of 3000 male criminals over 20 years old undergoing their sentences in the -chief prisons of England and Wales. -
-This dataset contains three separate variables. The finger_length
and
-body_height
variables correspond to the rows and columns of the
-count
matrix.
-
finger_length
Midpoints of intervals of finger lengths (cm). -
body_height
Body heights (cm). -
count
Number of prisoners in this bin. -
Student is the pseudonym of William Sealy Gosset. In his 1908 paper he wrote -(on page 13) at the beginning of section VI entitled Practical Test of the -foregoing Equations: -
-“Before I had succeeded in solving my problem analytically, I had endeavoured -to do so empirically. The material used was a correlation table containing -the height and left middle finger measurements of 3000 criminals, from a -paper by W. R. MacDonell (Biometrika, Vol. I., p. 219). The measurements -were written out on 3000 pieces of cardboard, which were then very thoroughly -shuffled and drawn at random. As each card was drawn its numbers were written -down in a book, which thus contains the measurements of 3000 criminals in a -random order. Finally, each consecutive set of 4 was taken as a sample—750 -in all—and the mean, standard deviation, and correlation of each sample -determined. The difference between the mean of each sample and the mean of -the population was then divided by the standard deviation of the sample, giving -us the z of Section III.” -
-The table is in fact page 216 and not page 219 in MacDonell(1902). In the -MacDonell table, the middle finger lengths were given in mm and the heights -in feet/inches intervals, they are both converted into cm here. The midpoints -of intervals were used, e.g., where MacDonell has “4’ 7"9/16 – 8"9/16”, we -have 142.24 which is 2.54*56 = 2.54*(4’ 8"). -
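The midpoint conversion quoted above can be checked with a couple of lines of Octave. This is a minimal sketch of the arithmetic only; the variable names are made up for illustration and are not part of the dataset.

```octave
# Convert a height given in feet and inches to centimetres (1 in = 2.54 cm)
feet = 4;
inches = 8;
height_cm = 2.54 * (12 * feet + inches);   # 4' 8" is 56 inches

printf ("%d' %d\" = %.2f cm\n", feet, inches, height_cm);
```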
-MacDonell credited the source of data (page 178) as follows: “The data on which -the memoir is based were obtained, through the kindness of Dr Garson, from the -Central Metric Office, New Scotland Yard...” He pointed out on page 179 that: -“The forms were drawn at random from the mass on the office shelves; we are -therefore dealing with a random sampling.” -
-http://pbil.univ-lyon1.fr/R/donnees/criminals1902.txt thanks to Jean R. -Lobry and Anne-Béatrice Dufour. -
-Garson, J.G. (1900). The metric system of identification of criminals, as used -in Great Britain and Ireland. The Journal of the Anthropological -Institute of Great Britain and Ireland, 30, 161–198. -
-MacDonell, W.R. (1902). On criminal anthropometry and the identification of -criminals. Biometrika, 1(2), 177–227. -
-Student (1908). The probable error of a mean. Biometrika
, 6, 1–25.
-
# TODO: Port this from R - -
out =
cupcake ()
¶Google Search popularity for "cupcake", 2004-2019 -
-Monthly popularity of worldwide Google search results for "cupcake", 2004-2019. -
-Month
Month when searches took place -
Cupcake
An indicator of search volume, in unknown units -
Google Trends, https://trends.google.com/trends/explore?q=%2Fm%2F03p1r4&date=all, -retrieved 2019-05-04 by Andrew Janke. -
-t = tblish.dataset.cupcake -plot (datenum (t.Month), t.Cupcake) -title ('“Cupcake” Google Searches'); xlabel ("Year"); ylabel ("Unknown popularity metric"); - -
out =
discoveries ()
¶Yearly Numbers of Important Discoveries -
-The numbers of “great” inventions and scientific discoveries in each year from 1860 to 1959. -
-year
Year. -
discoveries
Number of “great” discoveries that year. -
The World Almanac and Book of Facts, 1975 Edition, pages 315–318. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-t = tblish.dataset.discoveries; - -plot (t.year, t.discoveries); -xlabel ("Time"); ylabel ("Number of important discoveries"); -title ("discoveries data set"); - -
out =
DNase ()
¶ELISA assay of DNase -
-Data obtained during development of an ELISA assay for the recombinant protein DNase in rat serum. -
-Run
Ordered categorical
indicating the assay run.
-
conc
Known concentration of the protein (ng/ml). -
density
Measured optical density in the assay (dimensionless). -
Davidian, M. and Giltinan, D. M. (1995). Nonlinear Models for Repeated -Measurement Data. London: Chapman & Hall. (section 5.2.4, p. 134) -
-Pinheiro, J. C. and Bates, D. M. (2000). Mixed-effects Models in S and -S-PLUS. New York: Springer. -
-t = tblish.dataset.DNase; - -# TODO: Port this from R - -tblish.examples.coplot (t, "conc", "density", "Run", "PlotFcn", @scatter); -tblish.examples.coplot (t, "conc", "density", "Run", "PlotFcn", @loglog, ... - "PlotArgs", {"o"}); - -
out =
esoph ()
¶Smoking, Alcohol and Esophageal Cancer -
-Data from a case-control study of (o)esophageal cancer in Ille-et-Vilaine, France. -
-item
Age group (years). -
alcgp
Alcohol consumption (gm/day). -
tobgp
Tobacco consumption (gm/day). -
ncases
Number of cases. -
ncontrols
Number of controls -
Breslow, N. E. and Day, N. E. (1980) Statistical Methods in Cancer Research. -Volume 1: The Analysis of Case-Control Studies. Oxford: IARC Lyon / Oxford University Press. -
-# TODO: Port this from R - -# TODO: Port the anova output - -# TODO: Port the fancy plot -# This involves a "mosaic plot", which is not supported by Octave, so this will -# take some work. - -
out =
euro ()
¶Conversion Rates of Euro Currencies -
-Conversion rates between the various Euro currencies. -
-This data comes in two separate variables. -
-euro
An 11-long vector of the value of 1 Euro in all participating currencies. -
euro_cross
An 11-by-11 matrix of conversion rates between various Euro currencies. -
euro_date
The date upon which these Euro conversion rates were fixed. -
The data set euro contains the value of 1 Euro in all currencies participating -in the European monetary union (Austrian Schilling ATS, Belgian Franc BEF, -German Mark DEM, Spanish Peseta ESP, Finnish Markka FIM, French Franc FRF, -Irish Punt IEP, Italian Lira ITL, Luxembourg Franc LUF, Dutch Guilder NLG and -Portuguese Escudo PTE). These conversion rates were fixed by the European -Union on December 31, 1998. To convert old prices to Euro prices, divide by the -respective rate and round to 2 digits. -
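The conversion rule described above (divide by the rate, then round to 2 digits) can be sketched as follows. The German Mark rate is typed in directly as an assumption so the snippet runs on its own; with the dataset loaded, the rate would come from the euro vector instead, whose exact layout is not shown here.

```octave
# Assumed fixed rate: 1 Euro = 1.95583 German Marks (DEM)
rate_dem = 1.95583;

# An old price in German Marks
price_dem = 100;

# Divide by the rate, then round to 2 decimal places
price_eur = round (price_dem / rate_dem * 100) / 100;

printf ("%.2f DEM = %.2f EUR\n", price_dem, price_eur);
```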
-Unknown. -
-This example data set was derived from the R 3.6.0 example datasets, and they -do not specify a source. -
-# TODO: Port this from R - -# TODO: Example conversion - -# TODO: "dot chart" showing euro-to-whatever conversion rates and vice versa - -
out =
eurodist ()
¶Distances Between European Cities and Between US Cities -
-eurodist
gives road distances (in km) between 21 cities in Europe. The
-data are taken from a table in The Cambridge Encyclopaedia.
-
UScitiesD
gives “straight line” distances between 10 cities in the US.
-
eurodist
????? -
TODO: Finish this. -
-Crystal, D. Ed. (1990). The Cambridge Encyclopaedia. Cambridge: -Cambridge University Press. -
-The US cities distances were provided by Pierre Legendre. -
-out =
EuStockMarkets ()
¶Daily Closing Prices of Major European Stock Indices -
-Contains the daily closing prices of major European stock indices: Germany DAX -(Ibis), Switzerland SMI, France CAC, and UK FTSE. The data are sampled in -business time, i.e., weekends and holidays are omitted. -
-A multivariate time series with 1860 observations on 4 variables. -
-The starting date is the 130th day of 1991, with a frequency of 260 observations -per year. -
-The data were kindly provided by Erste Bank AG, Vienna, Austria. -
-- -t = tblish.dataset.EuStockMarkets; - -# The fact that we're doing this munging means that table might have -# been the wrong structure for this data in the first place - -t2 = removevars (t, "day"); -index_names = t2.Properties.VariableNames; -day = 1:height (t2); -price = table2array (t2); - -price0 = price(1,:); - -rel_price = price ./ repmat (price0, [size(price, 1) 1]); - -figure; -plot (day, rel_price); -legend (index_names); -xlabel ("Business day"); -ylabel ("Relative price"); - - - -
out =
faithful ()
¶Old Faithful Geyser Data -
-Waiting time between eruptions and the duration of the eruption for the Old -Faithful geyser in Yellowstone National Park, Wyoming, USA. -
-eruptions
Eruption time (mins). -
waiting
Waiting time to next eruption (mins). -
W. Härdle. -
-Härdle, W. (1991). Smoothing Techniques with Implementation in S. New York: -Springer. -
-Azzalini, A. and Bowman, A. W. (1990). A look at some data on the Old -Faithful geyser. Applied Statistics, 39, 357–365. -
-t = tblish.dataset.faithful; - -# Munge the data, rounding eruption time to the second -e60 = 60 * t.eruptions; -ne60 = round (e60); -# TODO: Port zapsmall to Octave -eruptions = ne60 / 60; -# TODO: Display mean relative difference and bins summary - -# Histogram of rounded eruption times -figure -hist (ne60, max (ne60)) -xlabel ("Eruption time (sec)") -ylabel ("n") -title ("faithful data: Eruptions of Old Faithful") - -# Scatter plot of eruption time vs waiting time -figure -scatter (t.eruptions, t.waiting) -xlabel ("Eruption time (min)") -ylabel ("Waiting time to next eruption (min)") -title ("faithful data: Eruptions of Old Faithful") -# TODO: Port Lowess smoothing to Octave - -
out =
Formaldehyde ()
¶Determination of Formaldehyde -
-These data are from a chemical experiment to prepare a standard curve for the -determination of formaldehyde by the addition of chromotropic acid and -concentrated sulphuric acid and the reading of the resulting purple color on -a spectrophotometer. -
-record
Observation record number. -
carb
Carbohydrate (ml). -
optden
Optical Density -
Bennett, N. A. and N. L. Franklin (1954). Statistical Analysis in -Chemistry and the Chemical Industry. New York: Wiley. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-t = tblish.dataset.Formaldehyde; - -figure -scatter (t.carb, t.optden) -# TODO: Add a linear model line -xlabel ("Carbohydrate (ml)") -ylabel ("Optical Density") -title ("Formaldehyde data") - -# TODO: Add linear model summary output -# TODO: Add linear model summary plot - -
out =
freeny ()
¶Freeny’s Revenue Data -
-Freeny’s data on quarterly revenue and explanatory variables. -
-Freeny’s dataset consists of one observed dependent variable -(revenue) and four explanatory variables (lagged quarterly -revenue, price index, income level, and market potential). -
-date
Start date of the quarter for the observation. -
y
Observed quarterly revenue. -TODO: Determine units (probably millions of USD?) -
lag_quarterly_revenue
Quarterly revenue (y
), lagged 1 quarter.
-
price_index
A price index -
income_level
??? TODO: Fill this in -
market_potential
??? TODO: Fill this in -
Freeny, A. E. (1977). A Portable Linear Regression Package with Test -Programs. Bell Laboratories memorandum. -
-Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S -Language. Monterey: Wadsworth & Brooks/Cole. -
-t = tblish.dataset.freeny; - -summary (t) - -tblish.examples.plot_pairs (removevars (t, "date")) - -# TODO: Create linear model and print summary - -# TODO: Linear model plot - -
out =
HairEyeColor ()
¶Hair and Eye Color of Statistics Students -
-Distribution of hair and eye color and sex in 592 statistics students. -
-This data set comes in multiple variables -
-n
A 3-dimensional array containing the counts of students in each bucket. It -is arranged as hair-by-eye-by-sex. -
hair
Hair colors for the indexes along dimension 1. -
eye
Eye colors for the indexes along dimension 2. -
sex
Sexes for the indexes along dimension 3. -
The Hair x Eye table comes from a survey of students at the University of -Delaware reported by Snee (1974). The split by Sex was added by Friendly -(1992a) for didactic purposes. -
-This data set is useful for illustrating various techniques for the analysis -of contingency tables, such as the standard chi-squared test or, more -generally, log-linear modelling, and graphical methods such as mosaic plots, -sieve diagrams or association plots. -
-http://euclid.psych.yorku.ca/ftp/sas/vcd/catdata/haireye.sas -
-Snee (1974) gives the two-way table aggregated over Sex. The Sex split of -the ‘Brown hair, Brown eye’ cell was changed to agree with that used by -Friendly (2000). -
-Snee, R. D. (1974). Graphical display of two-way contingency tables. -The American Statistician, 28, 9–12. -
-Friendly, M. (1992a). Graphical methods for categorical data. SAS User -Group International Conference Proceedings, 17, 190–200. -http://www.math.yorku.ca/SCS/sugi/sugi17-paper.html -
-Friendly, M. (1992b). Mosaic displays for loglinear models. Proceedings -of the Statistical Graphics Section, American Statistical Association, pp. -61–68. http://www.math.yorku.ca/SCS/Papers/asa92.html -
-Friendly, M. (2000). Visualizing Categorical Data. SAS Institute, -ISBN 1-58025-660-0. -
-tblish.dataset.HairEyeColor - -# TODO: Aggregate over sex and display a table of counts - -# TODO: Port mosaic plot to Octave - -
out =
Harman23cor ()
¶Harman Example 2.3 -
-A correlation matrix of eight physical measurements on 305 girls between -ages seven and seventeen. -
-cov
An 8-by-8 correlation matrix. -
names
Names of the variables corresponding to the indexes of the correlation matrix’s -dimensions. -
Harman, H. H. (1976). Modern Factor Analysis, Third Edition Revised. -Chicago: University of Chicago Press. Table 2.3. -
-tblish.dataset.Harman23cor; - -# TODO: Port factanal to Octave - -
out =
Harman74cor ()
¶Harman Example 7.4 -
-A correlation matrix of 24 psychological tests given to 145 seventh and -eighth-grade children in a Chicago suburb by Holzinger and Swineford. -
-cov
A 2-dimensional correlation matrix. -
vars
Names of the variables corresponding to the indexes along the dimensions of
-cov
.
-
Harman, H. H. (1976). Modern Factor Analysis, Third Edition -Revised. Chicago: University of Chicago Press. Table 7.4. -
-tblish.dataset.Harman74cor; - -# TODO: Port factanal to Octave - -
out =
Indometh ()
¶Pharmacokinetics of Indomethacin -
-Data on the pharmacokinetics of indometacin (or, older spelling, -‘indomethacin’). -
-Subject
Subject identifier. -
time
Time since drug administration at which samples were drawn (hours). -
conc
Plasma concentration of indomethacin (mcg/ml). -
Each of the six subjects was given an intravenous injection of indometacin. -
-Kwan, Breault, Umbenhauer, McMahon and Duggan (1976). Kinetics of -Indomethacin absorption, elimination, and enterohepatic circulation in man. -Journal of Pharmacokinetics and Biopharmaceutics 4, 255–280. -
-Davidian, M. and Giltinan, D. M. (1995). Nonlinear Models for Repeated -Measurement Data. London: Chapman & Hall. (section 5.2.4, p. 129) -
-Pinheiro, J. C. and Bates, D. M. (2000). Mixed-effects Models in S and -S-PLUS. New York: Springer. -
- -out =
infert ()
¶Infertility after Spontaneous and Induced Abortion -
-This is a matched case-control study dating from before the availability of -conditional logistic regression. -
-education
Index of the record. -
age
Age in years of case. -
parity
Count. -
induced
Number of prior induced abortions, grouped into “0”, “1”, or “2 or more”. -
case_status
0 = control, 1 = case. -
spontaneous
Number of prior spontaneous abortions, grouped into “0”, “1”, or “2 or more”. -
stratum
Matched set number. -
pooled_stratum
Stratum number. -
One case with two prior spontaneous abortions and two prior induced abortions is omitted. -
-Trichopoulos et al (1976). Br. J. of Obst. and Gynaec. 83, 645–650. -
-t = tblish.dataset.infert; - -# TODO: Port glm() (generalized linear model) stuff to Octave - -
out =
InsectSprays ()
¶Effectiveness of Insect Sprays -
-The counts of insects in agricultural experimental units treated with different -insecticides. -
-spray
The type of spray. -
count
Insect count. -
Beall, G., (1942). The Transformation of data from entomological field -experiments. Biometrika, 29, 243–262. -
-McNeil, D. (1977). Interactive Data Analysis. New York: Wiley. -
-t = tblish.dataset.InsectSprays; - -# TODO: boxplot - -# TODO: AOV plots - -
out =
iris ()
¶The Fisher Iris dataset: measurements of various flowers -
-This is the classic Fisher Iris dataset. -
-Species
The species of flower being measured. -
SepalLength
Length of sepals, in centimeters. -
SepalWidth
Width of sepals, in centimeters. -
PetalLength
Length of petals, in centimeters. -
PetalWidth
Width of petals, in centimeters. -
http://archive.ics.uci.edu/ml/datasets/Iris -
-https://en.wikipedia.org/wiki/Iris_flower_data_set -
-Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. -Annals of Eugenics, 7, Part II, 179-188. Also in Contributions -to Mathematical Statistics (John Wiley, NY, 1950). -
-Duda, R.O., & Hart, P.E. (1973). Pattern Classification and Scene Analysis. -(Q327.D83) New York: John Wiley & Sons. ISBN 0-471-22361-1. See page 218. -
-The data were collected by Anderson, Edgar (1935). The irises of the Gaspe -Peninsula. Bulletin of the American Iris Society, 59, 2–5. -
-# TODO: Port this example from R - -
out =
islands ()
¶Areas of the World’s Major Landmasses -
-The areas in thousands of square miles of the landmasses which exceed 10,000 -square miles. -
-name
The name of the island. -
area
The area, in thousands of square miles. -
The World Almanac and Book of Facts, 1975, page 406. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-t = tblish.dataset.islands; - -# TODO: Port dot chart to Octave - -
out =
JohnsonJohnson ()
¶Quarterly Earnings per Johnson & Johnson Share -
-Quarterly earnings (dollars) per Johnson & Johnson share 1960–80. -
-date
Start date of the quarter. -
earnings
Earnings per share (USD). -
Shumway, R. H. and Stoffer, D. S. (2000). Time Series Analysis and its -Applications. Second Edition. New York: Springer. Example 1.1. -
-t = tblish.dataset.JohnsonJohnson - -# TODO: Yikes, look at all those plots. Port them to Octave. - -
out =
LakeHuron ()
¶Level of Lake Huron 1875-1972 -
-Annual measurements of the level, in feet, of Lake Huron 1875–1972. -
-year
Year of the measurement -
level
Lake level (ft). -
Brockwell, P. J. and Davis, R. A. (1991). Time Series and Forecasting -Methods. Second edition. New York: Springer. Series A, page 555. -
-Brockwell, P. J. and Davis, R. A. (1996). Introduction to Time Series -and Forecasting. New York: Springer. Sections 5.1 and 7.6. -
-t = tblish.dataset.LakeHuron; - -plot (t.year, t.level) -xlabel ("Year") -ylabel ("Lake level (ft)") -title ("Level of Lake Huron") - -
out =
lh ()
¶Luteinizing Hormone in Blood Samples -
-A regular time series giving the luteinizing hormone in blood samples at 10 -minute intervals from a human female, 48 samples. -
-sample
The number of the observation. -
lh
Level of luteinizing hormone. -
P.J. Diggle (1990). Time Series: A Biostatistical Introduction. Oxford. -Table A.1, series 3. -
-t = tblish.dataset.lh; - -plot (t.sample, t.lh); -xlabel ("Sample Number"); -ylabel ("lh level"); - -
out =
LifeCycleSavings ()
¶Intercountry Life-Cycle Savings Data -
-Data on the savings ratio 1960–1970. -
-country
Name of the country. -
sr
Aggregate personal savings. -
pop15
Percentage of population under 15. -
pop75
Percentage of population over 75. -
dpi
Real per-capita disposable income. -
ddpi
Percent growth rate of dpi. -
Under the life-cycle savings hypothesis as developed by Franco Modigliani, the -savings ratio (aggregate personal saving divided by disposable income) is -explained by per-capita disposable income, the percentage rate of change in -per-capita disposable income, and two demographic variables: the percentage -of population less than 15 years old and the percentage of the population over -75 years old. The data are averaged over the decade 1960–1970 to remove the -business cycle or other short-term fluctuations. -
-The data were obtained from Belsley, Kuh and Welsch (1980). They in turn -obtained the data from Sterling (1977). -
-Sterling, Arnie (1977). Unpublished BS Thesis. Massachusetts Institute of -Technology. -
-Belsley, D. A., Kuh, E. and Welsch, R. E. (1980). Regression Diagnostics. -New York: Wiley. -
-t = tblish.dataset.LifeCycleSavings; - -# TODO: linear model - -# TODO: pairs plot with Lowess smoothed line - -
out =
Loblolly ()
¶Growth of Loblolly pine trees -
-Records of the growth of Loblolly pine trees. -
-height
Tree height (ft). -
age
Tree age (years). -
Seed
Seed source for the tree. Ordering is according to increasing maximum height. -
Kung, F. H. (1986). Fitting logistic growth curve with predetermined carrying -capacity. Proceedings of the Statistical Computing Section, American -Statistical Association, 340–343. -
-Pinheiro, J. C. and Bates, D. M. (2000). Mixed-effects Models in S and -S-PLUS. New York: Springer. -
-t = tblish.dataset.Loblolly; - -t2 = t(t.Seed == "329",:); -scatter (t2.age, t2.height) -xlabel ("Tree age (yr)"); -ylabel ("Tree height (ft)"); -title ("Loblolly data and fitted curve (Seed 329 only)") - -# TODO: Compute and plot fitted curve - -
out =
longley ()
¶Longley’s Economic Regression Data -
-A macroeconomic data set which provides a well-known example for a highly -collinear regression. -
-Year
The year. -
GNP_deflator
GNP implicit price deflator (1954=100). -
GNP
Gross National Product. -
Unemployed
Number of unemployed. -
Armed_Forces
Number of people in the armed forces. -
Population
“Noninstitutionalized” population ≥ 14 years of age. -
Employed
Number of people employed. -
J. W. Longley (1967). An appraisal of least-squares programs from the point of -view of the user. Journal of the American Statistical Association, 62, -819–841. -
-Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S -Language. Monterey: Wadsworth & Brooks/Cole. -
-t = tblish.dataset.longley; - -# TODO: Linear model -# TODO: opar plot - -
out =
lynx ()
¶Annual Canadian Lynx trappings 1821-1934 -
-Annual numbers of lynx trappings for 1821–1934 in Canada. Taken from Brockwell -& Davis (1991), this appears to be the series considered by Campbell & Walker -(1977). -
-year
Year of the record. -
lynx
Number of lynx trapped. -
Brockwell, P. J. and Davis, R. A. (1991). Time Series and Forecasting -Methods. Second edition. New York: Springer. Series G (page 557). -
-Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S -Language. Monterey: Wadsworth & Brooks/Cole. -
-Campbell, M. J. and Walker, A. M. (1977). A Survey of statistical work on -the Mackenzie River series of annual Canadian lynx trappings for the years -1821–1934 and a new analysis. Journal of the Royal Statistical Society -series A, 140, 411–431. -
-t = tblish.dataset.lynx; - -plot (t.year, t.lynx); -xlabel ("Year"); -ylabel ("Lynx Trapped"); - -
out =
morley ()
¶Michelson Speed of Light Data -
-A classic data set from Michelson (not the experiment he did with Morley) of -measurements made in 1879 of the speed of light. The data consists of five experiments, -each consisting of 20 consecutive ‘runs’. The response is the speed of -light measurement, suitably coded (km/sec, with 299000 subtracted). -
-Expt
The experiment number, from 1 to 5. -
Run
The run number within each experiment. -
Speed
Speed-of-light measurement. -
The data is here viewed as a randomized block experiment with experiment
-and run
as the factors. run
may also be considered a quantitative
-variate to account for linear (or polynomial) changes in the measurement over
-the course of a single experiment.
-
A. J. Weekes (1986). A Genstat Primer. London: Edward Arnold. -
-S. M. Stigler (1977). Do robust estimators work with real data? Annals -of Statistics 5, 1055–1098. (See Table 6.) -
-A. A. Michelson (1882). Experimental determination of the velocity of -light made at the United States Naval Academy, Annapolis. Astronomic -Papers, 1, 135–8. U.S. Nautical Almanac Office. (See Table 24.). -
-t = tblish.dataset.morley; - -# TODO: Port to Octave - -
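The coding described above (km/sec with 299000 subtracted) can be undone with a single addition. The following is a minimal sketch only: the coded values are hypothetical stand-ins, not rows taken from the data set, and in practice `t.Speed` would take their place.

```octave
# Hypothetical coded measurements standing in for t.Speed
coded = [850 740 900];

# Undo the coding: add back the 299000 km/sec that was subtracted
speed_kms = coded + 299000
```

Keeping the coded values for modelling and decoding only for display avoids carrying the large constant offset through every calculation.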
out =
mtcars ()
¶Motor Trend 1974 Car Road Tests -
-The data was extracted from the 1974 Motor Trend US magazine, and -comprises fuel consumption and 10 aspects of automobile design and -performance for 32 automobiles (1973–74 models). -
-mpg
Fuel efficiency in miles/gallon -
cyl
Number of cylinders -
disp
Displacement (cu. in.) -
hp
Gross horsepower -
drat
Rear axle ratio -
wt
Weight (1,000 lbs) -
qsec
1/4 mile time -
vs
Engine type (0 = V-shaped, 1 = straight) -
am
Transmission type (0 = automatic, 1 = manual) -
gear
Number of forward gears -
carb
Number of carburetors -
Henderson and Velleman (1981) comment in a footnote to Table 1: “Hocking -[original transcriber]’s noncrucial coding of the Mazda’s rotary engine -as a straight six-cylinder engine and the Porsche’s flat engine as a V -engine, as well as the inclusion of the diesel Mercedes 240D, have been -retained to enable direct comparisons to be made with previous analyses.” -
-Henderson and Velleman (1981). Building multiple regression models -interactively. Biometrics, 37, 391–411. -
-# TODO: Port this example from R -
out =
nhtemp ()
¶Average Yearly Temperatures in New Haven -
-The mean annual temperature in degrees Fahrenheit in New Haven, Connecticut, -from 1912 to 1971. -
-year
Year of the observation. -
temp
Mean annual temperature (degrees F). -
Vaux, J. E. and Brinker, N. B. (1972) Cycles, 1972, 117–121. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-t = tblish.dataset.nhtemp; - -plot (t.year, t.temp); -title ("nhtemp data"); -xlabel ("Year"); -ylabel ("Mean annual temperature in New Haven, CT (deg. F)"); - -
out =
Nile ()
¶Flow of the River Nile -
-Measurements of the annual flow of the river Nile at Aswan (formerly Assuan), -1871–1970, in m^3, “with apparent changepoint near 1898” -(Cobb(1978), Table 1, p.249). -
-year
Year of the record. -
flow
Annual flow (cubic meters). -
Durbin, J. and Koopman, S. J. (2001). Time Series Analysis by State -Space Methods. Oxford: Oxford University Press. http://www.ssfpack.com/DKbook.html -
-Balke, N. S. (1993). Detecting level shifts in time series. Journal of -Business and Economic Statistics, 11, 81–92. -
-Cobb, G. W. (1978). The problem of the Nile: conditional solution to a -change-point problem. Biometrika 65, 243–51. -
-t = tblish.dataset.Nile; - -figure -plot (t.year, t.flow); - -# TODO: Port the rest of the example to Octave - -
out =
nottem ()
¶Average Monthly Temperatures at Nottingham, 1920-1939 -
-A time series object containing average air temperatures at -Nottingham Castle in degrees Fahrenheit for 20 years. -
-record
Index of the record. -
lead
Leading indicator. -
sales
Sales volume. -
Anderson, O. D. (1976). Time Series Analysis and Forecasting: -The Box-Jenkins approach. London: Butterworths. Series R. -
-# TODO: Come up with example code here - -
out =
npk ()
¶Classical N, P, K Factorial Experiment -
-A classical N, P, K (nitrogen, phosphate, potassium) factorial experiment -on the growth of peas conducted on 6 blocks. Each half of a fractional -factorial design confounding the NPK interaction was used on 3 of the plots. -
-block
Which block (1 to 6). -
N
Indicator (0/1) for the application of nitrogen. -
P
Indicator (0/1) for the application of phosphate. -
K
Indicator (0/1) for the application of potassium. -
yield
Yield of peas, in pounds/plot. Plots were 1/70 acre. -
Imperial College, London, M.Sc. exercise sheet. -
-Venables, W. N. and Ripley, B. D. (2002). Modern Applied Statistics -with S. Fourth edition. New York: Springer. -
-t = tblish.dataset.npk; - -# TODO: Port aov() and LM to Octave - -
out =
occupationalStatus ()
¶Occupational Status of Fathers and their Sons -
-Cross-classification of a sample of British males according to each subject’s -occupational status and his father’s occupational status. -
-An 8-by-8 matrix of counts, with classifying factors origin
(father’s
-occupational status, levels 1:8) and destination
(son’s
-occupational status, levels 1:8).
-
Goodman, L. A. (1979). Simple Models for the Analysis of Association in -Cross-Classifications having Ordered Categories. J. Am. Stat. -Assoc., 74 (367), 537–552. -
-# TODO: Come up with example code here - -
out =
Orange ()
¶Growth of Orange Trees -
-Records of the growth of orange trees. -
-Tree
A categorical indicating on which tree the measurement is made. -Ordering is according to increasing maximum diameter. -
age
Age of the tree (days since 1968-12-31). -
circumference
Trunk circumference (mm). -This is probably “circumference at breast height”, a standard measurement in forestry. -
The data are given in Box & Jenkins (1976). Obtained from the Time Series Data -Library at http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/. -
-Draper, N. R. and Smith, H. (1998). Applied Regression Analysis (3rd ed). -New York: Wiley. (exercise 24.N). -
-Pinheiro, J. C. and Bates, D. M. (2000). Mixed-effects Models in S and -S-PLUS. New York: Springer. -
-t = tblish.dataset.Orange; - -# TODO: Port coplot to Octave - -# TODO: Linear model - -
out =
OrchardSprays ()
¶Potency of Orchard Sprays -
-An experiment was conducted to assess the potency of various constituents -of orchard sprays in repelling honeybees, using a Latin square design. -
-rowpos
Row of the design. -
colpos
Column of the design -
treatment
Treatment level. -
decrease
Response. -
Individual cells of dry comb were filled with measured amounts of lime -sulphur emulsion in sucrose solution. Seven different concentrations of lime -sulphur ranging from a concentration of 1/100 to 1/1,562,500 in successive -factors of 1/5 were used as well as a solution containing no lime sulphur. -
-The responses for the different solutions were obtained by releasing 100 -bees into the chamber for two hours, and then measuring the decrease in volume -of the solutions in the various cells. -
-An 8 x 8 Latin square design was used and the treatments were coded as follows: -
-A – highest level of lime sulphur -B – next highest level of lime sulphur -… -G – lowest level of lime sulphur -H – no lime sulphur -
-Finney, D. J. (1947). Probit Analysis. Cambridge. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-t = tblish.dataset.OrchardSprays; - -tblish.examples.plot_pairs (t); - -
out =
PlantGrowth ()
¶Results from an Experiment on Plant Growth -
-Results from an experiment to compare yields (as measured by dried weight of -plants) obtained under a control and two different treatment conditions. -
-group
Treatment condition group. -
weight
Weight of plants. -
Dobson, A. J. (1983). An Introduction to Statistical Modelling. -London: Chapman and Hall. -
-t = tblish.dataset.PlantGrowth; - -# TODO: Port anova to Octave - -
out =
precip ()
¶Annual Precipitation in US Cities -
-The average amount of precipitation (rainfall) in inches for each of 70 United -States (and Puerto Rico) cities. -
-city
City observed. -
precip
Annual precipitation (in). -
Statistical Abstracts of the United States, 1975. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-t = tblish.dataset.precip; - -# TODO: Port dot plot to Octave - -
out =
presidents ()
¶Quarterly Approval Ratings of US Presidents -
-The (approximately) quarterly approval rating for the President of the United -States from the first quarter of 1945 to the last quarter of 1974. -
-date
Approximate date of the observation. -
approval
Approval rating (%). -
The data are actually a fudged version of the approval ratings. See McNeil’s book -for details. -
-The Gallup Organisation. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-t = tblish.dataset.presidents; - -figure -plot (datenum (t.date), t.approval) -datetick ("x") -xlabel ("Date") -ylabel ("Approval rating (%)") -title ("presidents data") - -
out =
pressure ()
¶Vapor Pressure of Mercury as a Function of Temperature -
-Data on the relation between temperature in degrees Celsius and vapor pressure -of mercury in millimeters (of mercury). -
-temperature
Temperature (deg C). -
pressure
Pressure (mm Hg). -
Weast, R. C., ed. (1973). Handbook of Chemistry and Physics. Cleveland: CRC Press. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-t = tblish.dataset.pressure; - -figure -plot (t.temperature, t.pressure) -xlabel ("Temperature (deg C)") -ylabel ("Pressure (mm of Hg)") -title ("pressure data: Vapor Pressure of Mercury") - -figure -semilogy (t.temperature, t.pressure) -xlabel ("Temperature (deg C)") -ylabel ("Pressure (mm of Hg)") -title ("pressure data: Vapor Pressure of Mercury") - - -
out =
Puromycin ()
¶Reaction Velocity of an Enzymatic Reaction -
-Reaction velocity versus substrate concentration in an enzymatic reaction -involving untreated cells or cells treated with Puromycin. -
-state
Whether the cell was treated. -
conc
Substrate concentrations (ppm). -
rate
Instantaneous reaction rates (counts/min/min). -
Data on the velocity of an enzymatic reaction were obtained by Treloar -(1974). The number of counts per minute of radioactive product from the -reaction was measured as a function of substrate concentration in parts per -million (ppm) and from these counts the initial rate (or velocity) of the -reaction was calculated (counts/min/min). The experiment was conducted once -with the enzyme treated with Puromycin, and once with the enzyme untreated. -
-The data are given in Box & Jenkins (1976). Obtained from the Time Series Data -Library at http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/. -
-Bates, D.M. and Watts, D.G. (1988). Nonlinear Regression Analysis and -Its Applications. New York: Wiley. Appendix A1.3. -
-Treloar, M. A. (1974). Effects of Puromycin on Galactosyltransferase -in Golgi Membranes. M.Sc. Thesis, U. of Toronto. -
-t = tblish.dataset.Puromycin; - -# TODO: Port example to Octave - -
out =
quakes ()
¶Locations of Earthquakes off Fiji -
-The data set gives the locations of 1000 seismic events of MB > 4.0. The events -occurred in a cube near Fiji since 1964. -
-lat
Latitude of event. -
long
Longitude of event. -
depth
Depth (km). -
mag
Richter magnitude. -
stations
Number of stations reporting. -
There are two clear planes of seismic activity. One is a major plate junction; -the other is the Tonga trench off New Zealand. These data constitute a subsample -from a larger dataset containing 5000 observations. -
-This is one of the Harvard PRIM-H project data sets. They in turn obtained it -from Dr. John Woodhouse, Dept. of Geophysics, Harvard University. -
-G. E. P. Box and G. M. Jenkins (1976). Time Series Analysis, Forecasting and -Control. San Francisco: Holden-Day. p. 537. -
-P. J. Brockwell and R. A. Davis (1991). Time Series: Theory and Methods. -Second edition. New York: Springer-Verlag. p. 414. -
-# TODO: Come up with example code here - -
out =
randu ()
¶Random Numbers from Congruential Generator RANDU -
-400 triples of successive random numbers were taken from the VAX FORTRAN -function RANDU running under VMS 1.5. -
-record
Index of the record. -
x
X value of the triple. -
y
Y value of the triple. -
z
Z value of the triple. -
In three dimensional displays it is evident that the triples fall on 15 -parallel planes in 3-space. This can be shown theoretically to be true -for all triples from the RANDU generator. -
-These particular 400 triples start 5 apart in the sequence, that is they -are ((U[5i+1], U[5i+2], U[5i+3]), i= 0, ..., 399), and they are rounded -to 6 decimal places. -
-Under VMS versions 2.0 and higher, this problem has been fixed. -
-David Donoho -
-t = tblish.dataset.randu; - - -
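The plane structure described above follows from the RANDU recurrence itself. The sketch below does not use the packaged data: it regenerates a RANDU stream from the well-known definition x(k+1) = 65539 * x(k) mod 2^31 (the seed and the sample length are arbitrary choices made for illustration) and checks that 9*u(k) - 6*u(k+1) + u(k+2) is always an integer, which is why consecutive triples can fall on at most 15 parallel planes.

```octave
# Illustrative re-implementation of RANDU; seed and length are arbitrary
m = 2^31;
n = 1000;
x = zeros (n, 1);
x(1) = 1;                          # RANDU requires an odd seed
for k = 2:n
  x(k) = mod (65539 * x(k-1), m);  # exact in double precision
endfor
u = x / m;                         # uniform values in (0, 1)

# Since 65539^2 = 6*65539 - 9 (mod 2^31), we have
# x(k+2) = 6*x(k+1) - 9*x(k) (mod 2^31), so this combination is an integer
c = 9 * u(1:end-2) - 6 * u(2:end-1) + u(3:end);
max_dev = max (abs (c - round (c)))

# Each distinct integer value of c corresponds to one plane
planes = unique (round (c))'
```

Because each u lies strictly between 0 and 1, the combination lies strictly between -6 and 10, leaving the 15 integer values from -5 to 9. With the packaged data, which is rounded to 6 decimal places, the same combination of `t.x`, `t.y` and `t.z` should come out only approximately integral.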
out =
rivers ()
¶Lengths of Major North American Rivers -
-This data set gives the lengths (in miles) of 141 “major” rivers in North -America, as compiled by the US Geological Survey. -
-rivers
A vector containing 141 observations. -
World Almanac and Book of Facts, 1975, page 406. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-tblish.dataset.rivers; - -longest_river = max (rivers) -shortest_river = min (rivers) - -
out =
rock ()
¶Measurements on Petroleum Rock Samples -
-Measurements on 48 rock samples from a petroleum reservoir. -
-area
Area of pores space, in pixels out of 256 by 256. -
peri
Perimeter in pixels. -
shape
Perimeter/sqrt(area). -
perm
Permeability in milli-Darcies. -
Twelve core samples from petroleum reservoirs were sampled by 4 -cross-sections. Each core sample was measured for permeability, and each -cross-section has total area of pores, total perimeter of pores, and shape. -
-Data from BP Research, image analysis by Ronit Katz, U. Oxford. -
-t = tblish.dataset.rock; - -figure -scatter (t.area, t.perm) -xlabel ("Area of pores space (pixels out of 256x256)") -ylabel ("Permeability (milli-Darcies)") - -
out =
sleep ()
¶Student’s Sleep Data -
-Data which show the effect of two soporific drugs (increase in hours of sleep -compared to control) on 10 patients. -
-id
Patient ID. -
group
Drug given. -
extra
Increase in hours of sleep. -
The group
variable name may be misleading about the data: They
-represent measurements on 10 persons, not in groups.
-
Cushny, A. R. and Peebles, A. R. (1905). The action of optical isomers: -II hyoscines. The Journal of Physiology, 32, 501–510. -
-Student (1908). The probable error of the mean. Biometrika, 6, 20. -
-Scheffé, Henry (1959). The Analysis of Variance. New York, NY: Wiley. -
-t = tblish.dataset.sleep; - -# TODO: Port to Octave - -
out =
stackloss ()
¶Brownlee’s Stack Loss Plant Data -
-Operational data of a plant for the oxidation of ammonia to nitric acid. -
-AirFlow
Flow of cooling air. -
WaterTemp
Cooling Water Inlet temperature. -
AcidConc
Concentration of acid (per 1000, minus 500). -
StackLoss
Stack loss -
“Obtained from 21 days of operation of a plant for the oxidation of ammonia -(NH3) to nitric acid (HNO3). The nitric oxides produced are absorbed in a -countercurrent absorption tower”. (Brownlee, cited by Dodge, slightly reformatted by MM.) -
-AirFlow
represents the rate of operation of the plant. WaterTemp
is the
-temperature of cooling water circulated through coils in the absorption tower.
-AcidConc
is the concentration of the acid circulating, minus 50, times 10:
-that is, 89 corresponds to 58.9 per cent acid. StackLoss
(the dependent variable)
-is 10 times the percentage of the ingoing ammonia to the plant that escapes from
-the absorption column unabsorbed; that is, an (inverse) measure of the over-all
-efficiency of the plant.
-
Brownlee, K. A. (1960, 2nd ed. 1965). Statistical Theory and Methodology -in Science and Engineering. New York: Wiley. pp. 491–500. -
-Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S -Language. Monterey: Wadsworth & Brooks/Cole. -
-Dodge, Y. (1996). The guinea pig of multiple regression. In: Robust -Statistics, Data Analysis, and Computer Intensive Methods; In Honor of -Peter Huber’s 60th Birthday, 1996, Lecture Notes in Statistics -109, Springer-Verlag, New York. -
-t = tblish.dataset.stackloss; - -# TODO: Create linear model and print summary - -
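The coding of `AcidConc` and `StackLoss` described above can be inverted with simple arithmetic. This is a minimal sketch: the value 89 is the one used in the description, while the stack loss value is a hypothetical stand-in rather than a row from the data set.

```octave
AcidConc  = 89;   # coded acid concentration (example from the description)
StackLoss = 42;   # hypothetical coded stack loss value

# AcidConc is (percent - 50) * 10, so invert that to get percent acid
acid_pct = 50 + AcidConc / 10

# StackLoss is 10 times the percentage of ammonia escaping unabsorbed
loss_pct = StackLoss / 10
```

The same expressions apply elementwise to the table columns, for example `50 + t.AcidConc / 10`.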
out =
state ()
¶US State Facts and Figures -
-Data related to the 50 states of the United States of America. -
-abb
State abbreviation. -
name
State name. -
area
Area (sq mi). -
lat
Approximate center (latitude). -
lon
Approximate center (longitude). -
division
State division. -
region
State region. -
Population
Population estimate as of July 1, 1975. -
Income
Per capita income (1974). -
Illiteracy
Illiteracy as of 1970 (percent of population). -
LifeExp
Life expectancy in years (1969-71). -
Murder
Murder and non-negligent manslaughter rate per 100,000 population (1976). -
HSGrad
Percent high-school graduates (1970). -
Frost
Mean number of days with minimum temperature below freezing (1931-1960) -in capital or large city. -
U.S. Department of Commerce, Bureau of the Census (1977) Statistical -Abstract of the United States. -
-U.S. Department of Commerce, Bureau of the Census (1977) County -and City Data Book. -
-Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S -Language. Monterey: Wadsworth & Brooks/Cole. -
-t = tblish.dataset.state; - -
out =
sunspot_month ()
¶Monthly Sunspot Data, from 1749 to “Present” -
-Monthly numbers of sunspots, as from the World Data Center, aka SIDC. This -is the version of the data that may occasionally be updated when new counts -become available. -
-month
Month of the observation. -
sunspots
Number of sunspots. -
WDC-SILSO, Solar Influences Data Analysis Center (SIDC), Royal Observatory -of Belgium, Av. Circulaire, 3, B-1180 BRUSSELS. -Currently at http://www.sidc.be/silso/datafiles. -
-t = tblish.dataset.sunspot_month; - - -
out =
sunspot_year ()
¶Yearly Sunspot Data, 1700-1988 -
-Yearly numbers of sunspots from 1700 to 1988 (rounded to one digit). -
-year
Year of the observation. -
sunspots
Number of sunspots. -
H. Tong (1996) Non-Linear Time Series. Clarendon Press, Oxford, p. 471. -
-t = tblish.dataset.sunspot_year; - -figure -plot (t.year, t.sunspots) -xlabel ("Year") -ylabel ("Sunspots") - -
out =
sunspots ()
¶Monthly Sunspot Numbers, 1749-1983 -
-Monthly mean relative sunspot numbers from 1749 to 1983. Collected at Swiss -Federal Observatory, Zurich until 1960, then Tokyo Astronomical Observatory. -
-month
Month of the observation. -
sunspots
Number of observed sunspots. -
Andrews, D. F. and Herzberg, A. M. (1985) Data: A Collection -of Problems from Many Fields for the Student and Research Worker. -New York: Springer-Verlag. -
-t = tblish.dataset.sunspots; - -figure -plot (datenum (t.month), t.sunspots) -datetick ("x") -xlabel ("Date") -ylabel ("Monthly sunspot numbers") -title ("sunspots data") - - -
out =
swiss ()
¶Swiss Fertility and Socioeconomic Indicators (1888) Data -
-Standardized fertility measure and socio-economic indicators for each of 47 -French-speaking provinces of Switzerland at about 1888. -
-Fertility
Ig, ‘common standardized fertility measure’. -
Agriculture
% of males involved in agriculture as occupation. -
Examination
% draftees receiving highest mark on army examination. -
Education
% education beyond primary school for draftees. -
Catholic
% ‘Catholic’ (as opposed to ‘Protestant’). -
InfantMortality
Live births who live less than 1 year. -
All variables but ‘Fertility’ give proportions of the population. -
-(paraphrasing Mosteller and Tukey): -
-Switzerland, in 1888, was entering a period known as the demographic transition; -i.e., its fertility was beginning to fall from the high level typical of -underdeveloped countries. -
-The data collected are for 47 French-speaking “provinces” at about 1888. -
-Here, all variables are scaled to [0, 100], where in the original, all but
-Catholic
were scaled to [0, 1].
-
Files for all 182 districts in 1888 and other years have been available at -https://opr.princeton.edu/archive/pefp/switz.aspx. -
-They state that variables Examination
and Education
are averages
-for 1887, 1888 and 1889.
-
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S -Language. Monterey: Wadsworth & Brooks/Cole. -
-t = tblish.dataset.swiss; - -# TODO: Port linear model to Octave - -
out =
Theoph ()
¶Pharmacokinetics of Theophylline -
-An experiment on the pharmacokinetics of theophylline. -
-Subject
Categorical identifying the subject on whom the observation was made. The -ordering is by increasing maximum concentration of theophylline observed. -
Wt
Weight of the subject (kg). -
Dose
Dose of theophylline administered orally to the subject (mg/kg). -
Time
Time since drug administration when the sample was drawn (hr). -
conc
Theophylline concentration in the sample (mg/L). -
Boeckmann, Sheiner and Beal (1994) report data from a study by Dr. Robert -Upton of the kinetics of the anti-asthmatic drug theophylline. Twelve subjects -were given oral doses of theophylline then serum concentrations were measured -at 11 time points over the next 25 hours. -
-These data are analyzed in Davidian and Giltinan (1995) and Pinheiro and Bates -(2000) using a two-compartment open pharmacokinetic model, for which a -self-starting model function, SSfol, is available. -
-The data are given in Box & Jenkins (1976). Obtained from the Time Series Data -Library at http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/. -
-Boeckmann, A. J., Sheiner, L. B. and Beal, S. L. (1994). NONMEM Users -Guide: Part V. NONMEM Project Group, University of California, San Francisco. -
-Davidian, M. and Giltinan, D. M. (1995). Nonlinear Models for Repeated -Measurement Data. London: Chapman & Hall. (section 5.5, p. 145 and section 6.6, p. 176) -
-Pinheiro, J. C. and Bates, D. M. (2000). Mixed-effects Models in -S and S-PLUS. New York: Springer. (Appendix A.29) -
-t = tblish.dataset.Theoph; - -# TODO: Coplot -# TODO: Yet another linear model to port to Octave - -
out =
Titanic ()
¶Survival of passengers on the Titanic -
-This data set provides information on the fate of passengers on the fatal -maiden voyage of the ocean liner ‘Titanic’, summarized according to -economic status (class), sex, age and survival. -
-n
is a 4-dimensional array resulting from cross-tabulating 2201 observations
-on 4 variables. The dimensions of the array correspond to the following variables:
-
Class
1st, 2nd, 3rd, Crew. -
Sex
Male, Female. -
Age
Child, Adult. -
Survived
No, Yes. -
The sinking of the Titanic is a famous event, and new books are still being -published about it. Many well-known facts—from the proportions of first-class -passengers to the ‘women and children first’ policy, and the fact that that -policy was not entirely successful in saving the women and children in the -third class—are reflected in the survival rates for various classes of -passenger. -
-These data were originally collected by the British Board of Trade in their -investigation of the sinking. Note that there is not complete agreement among -primary sources as to the exact numbers on board, rescued, or lost. -
-Due in particular to the very successful film ‘Titanic’, the last years saw a -rise in public interest in the Titanic. Very detailed data about the passengers -is now available on the Internet, at sites such as Encyclopedia Titanica -(https://www.encyclopedia-titanica.org/). -
-Dawson, Robert J. MacG. (1995). The ‘Unusual Episode’ Data Revisited. -Journal of Statistics Education, 3. -
-The source provides a data set recording class, sex, age, and survival status -for each person on board of the Titanic, and is based on data originally -collected by the British Board of Trade and reprinted in: -
-British Board of Trade (1990). Report on the Loss of the ‘Titanic’ -(S.S.). British Board of Trade Inquiry Report (reprint). Gloucester, -UK: Allan Sutton Publishing. -
-tblish.dataset.Titanic; - -# TODO: Port mosaic plot to Octave - -# TODO: Check for higher survival rates in children and females - -
out =
ToothGrowth ()
¶The Effect of Vitamin C on Tooth Growth in Guinea Pigs -
-The response is the length of odontoblasts (cells responsible for tooth growth)
-in 60 guinea pigs. Each animal received one of three dose levels of vitamin C
-(0.5, 1, and 2 mg/day) by one of two delivery methods, orange juice or
-ascorbic acid (a form of vitamin C and coded as VC
).
-
supp
Supplement type. -
dose
Dose (mg/day). -
len
Tooth length. -
C. I. Bliss (1952). The Statistics of Bioassay. Academic Press. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-Crampton, E. W. (1947). The growth of the odontoblast of the incisor -teeth as a criterion of vitamin C intake of the guinea pig. The -Journal of Nutrition, 33(5), 491–504. -
-t = tblish.dataset.ToothGrowth; - -tblish.examples.coplot (t, "dose", "len", "supp"); - -# TODO: Port Lowess smoothing to Octave - -
out =
treering ()
¶Yearly Treering Data, -6000-1979 -
-Contains normalized tree-ring widths in dimensionless units. -
-A univariate time series with 7981 observations. -
-Each tree ring corresponds to one year. -
-The data were recorded by Donald A. Graybill, 1980, from Gt Basin -Bristlecone Pine 2805M, 3726-11810 in Methuselah Walk, California. -
-Time Series Data Library: http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/, -series ‘CA535.DAT’. -
-For some photos of Methuselah Walk see -https://web.archive.org/web/20110523225828/http://www.ltrr.arizona.edu/~hallman/sitephotos/meth.html. -
-t = tblish.dataset.treering; - -
out =
trees ()
¶Diameter, Height and Volume for Black Cherry Trees -
-This data set provides measurements of the diameter, height and volume of -timber in 31 felled black cherry trees. Note that the diameter (in inches) -is erroneously labelled Girth in the data. It is measured at 4 ft 6 in -above the ground. -
-Girth
Tree diameter (rather than girth, actually) in inches. -
Height
Height in ft. -
Volume
Volume of timber in cubic feet. -
Ryan, T. A., Joiner, B. L. and Ryan, B. F. (1976). The Minitab -Student Handbook. Duxbury Press. -
-Atkinson, A. C. (1985). Plots, Transformations and Regression. -Oxford: Oxford University Press. -
-t = tblish.dataset.trees; - -figure -tblish.examples.plot_pairs (t); - -figure -loglog (t.Girth, t.Volume) -xlabel ("Girth") -ylabel ("Volume") - -# TODO: Transform to log space for the coplot - -# TODO: Linear model - -
out =
UCBAdmissions ()
¶Student Admissions at UC Berkeley -
-Aggregate data on applicants to graduate school at Berkeley for the six -largest departments in 1973 classified by admission and sex. -
-A 3-dimensional array resulting from cross-tabulating 4526 observations on -3 variables. The variables and their levels are as follows: -
-Admit
Admitted, Rejected. -
Gender
Male, Female. -
Dept
A, B, C, D, E, F. -
This data set is frequently used for illustrating Simpson’s paradox; see -Bickel et al. (1975). At issue is whether the data show evidence of sex bias -in admission practices. There were 2691 male applicants, of whom 1198 (44.5%) -were admitted, compared with 1835 female applicants of whom 557 (30.4%) were -admitted. This gives a sample odds ratio of 1.83, indicating that males were -almost twice as likely to be admitted. In fact, graphical methods (as in the -example below) or log-linear modelling show that the apparent association -between admission and sex stems from differences in the tendency of males -and females to apply to the individual departments (females tended to apply -more to departments with higher rejection rates). -
-The data are given in Box & Jenkins (1976). Obtained from the Time Series Data -Library at http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/. -
-Bickel, P. J., Hammel, E. A., and O’Connell, J. W. (1975). Sex bias in -graduate admissions: Data from Berkeley. Science, 187, 398–403. -http://www.jstor.org/stable/1739581. -
-tblish.dataset.UCBAdmissions; - -# TODO: Port mosaic plot to Octave - -
out =
UKDriverDeaths ()
¶Road Casualties in Great Britain 1969-84 -
-UKDriverDeaths
is a time series giving the monthly totals of car drivers in Great Britain killed
-or seriously injured Jan 1969 to Dec 1984. Compulsory wearing of seat belts
-was introduced on 31 Jan 1983.
-
Seatbelts
is more information on the same problem.
-
UKDriverDeaths
is a table with the following variables:
-
month
Month of the observation. -
deaths
Number of deaths. -
Seatbelts
is a table with the following variables:
-
month
Month of the observation. -
DriversKilled
Car drivers killed. -
drivers
Same as UKDriverDeaths
deaths
count.
-
front
Front-seat passengers killed or seriously injured. -
rear
Rear-seat passengers killed or seriously injured. -
kms
Distance driven. -
PetrolPrice
Petrol price. -
VanKilled
Number of van (“light goods vehicle”) drivers killed. -
law
0/1: was the seatbelt law in effect that month? -
Harvey, A.C. (1989). Forecasting, Structural Time Series Models and -the Kalman Filter. Cambridge: Cambridge University Press. pp. 519–523. -
-Durbin, J. and Koopman, S. J. (2001). Time Series Analysis by State -Space Methods. Oxford: Oxford University Press. http://www.ssfpack.com/dkbook/ -
-Harvey, A. C. and Durbin, J. (1986). The effects of seat belt legislation -on British road casualties: A case study in structural time series -modelling. Journal of the Royal Statistical Society series A, 149, 187–227. -
-tblish.dataset.UKDriverDeaths; -d = UKDriverDeaths; -s = Seatbelts; - -# TODO: Port the model and plots to Octave - -
out =
UKgas ()
¶UK Quarterly Gas Consumption -
-Quarterly UK gas consumption from 1960Q1 to 1986Q4, in millions of therms. -
-date
Quarter of the observation -
gas
Gas consumption (MM therms). -
Durbin, J. and Koopman, S. J. (2001). Time Series Analysis by State -Space Methods. Oxford: Oxford University Press. http://www.ssfpack.com/dkbook/. -
-t = tblish.dataset.UKgas; - -plot (datenum (t.date), t.gas); -datetick ("x") -xlabel ("Quarter") -ylabel ("Gas consumption (MM therms)") - -
out =
UKLungDeaths ()
¶Monthly Deaths from Lung Diseases in the UK -
-Three time series giving the monthly deaths from bronchitis, emphysema and -asthma in the UK, 1974–1979. -
-date
Month of the observation. -
ldeaths
Total lung deaths. -
fdeaths
Lung deaths among females. -
mdeaths
Lung deaths among males. -
P. J. Diggle (1990). Time Series: A Biostatistical Introduction. Oxford. table A.3 -
-t = tblish.dataset.UKLungDeaths; - -figure -plot (datenum (t.date), t.ldeaths); -title ("Total UK Lung Deaths") -xlabel ("Month") -ylabel ("Deaths") - -figure -plot (datenum (t.date), [t.fdeaths t.mdeaths]); -title ("UK Lung Deaths by sex") -legend ({"Female", "Male"}) -xlabel ("Month") -ylabel ("Deaths") - -
out =
USAccDeaths ()
¶Accidental Deaths in the US 1973-1978 -
-A time series giving the monthly totals of accidental deaths in the USA. -
-month
Month of the observation. -
deaths
Accidental deaths. -
Brockwell, P. J. and Davis, R. A. (1991). Time Series: Theory and Methods. -New York: Springer. -
-t = tblish.dataset.USAccDeaths; - -
out =
USArrests ()
¶Violent Crime Rates by US State -
-This data set contains statistics, in arrests per 100,000 residents for -assault, murder, and rape in each of the 50 US states in 1973. Also given -is the percent of the population living in urban areas. -
-State
State name. -
Murder
Murder arrests (per 100,000). -
Assault
Assault arrests (per 100,000). -
UrbanPop
Percent urban population. -
Rape
Rape arrests (per 100,000). -
USArrests
contains the data as in McNeil’s monograph. For the
-UrbanPop
percentages, a review of the table (No. 21) in the
-Statistical Abstracts 1975 reveals a transcription error for Maryland
-(and that McNeil used the same “round to even” rule), as found by
-Daniel S Coven (Arizona).
-
See the example below on how to correct the error and improve accuracy -for the ‘<n>.5’ percentages. -
-World Almanac and Book of Facts 1975. (Crime rates). -
-Statistical Abstracts of the United States 1975, p.20, (Urban rates), -possibly available as https://books.google.ch/books?id=zl9qAAAAMAAJ&pg=PA20. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-t = tblish.dataset.USArrests; - -summary (t); - -tblish.examples.plot_pairs (t(:,2:end)); - -# TODO: Difference between USArrests and its correction - -# TODO: +/- 0.5 to restore the original <n>.5 percentages - -
out =
USJudgeRatings ()
¶Lawyers’ Ratings of State Judges in the US Superior Court -
-Lawyers’ ratings of state judges in the US Superior Court. -
-CONT
Number of contacts of lawyer with judge. -
INTG
Judicial integrity. -
DMNR
Demeanor. -
DILG
Diligence. -
CFMG
Case flow managing. -
DECI
Prompt decisions. -
PREP
Preparation for trial. -
FAMI
Familiarity with law. -
ORAL
Sound oral rulings. -
WRIT
Sound written rulings. -
PHYS
Physical ability. -
RTEN
Worthy of retention. -
New Haven Register, 14 January, 1977 (from John Hartigan). -
-t = tblish.dataset.USJudgeRatings; - -figure -tblish.examples.plot_pairs (t(:,2:end)); -title ("USJudgeRatings data") - -
out =
USPersonalExpenditure ()
¶Personal Expenditure Data -
-This data set consists of United States personal expenditures (in billions -of dollars) in the categories: food and tobacco, household operation, -medical and health, personal care, and private education for the years 1940, -1945, 1950, 1955 and 1960. -
-A 2-dimensional matrix x
with Category along dimension 1 and Year along dimension 2.
-
The World Almanac and Book of Facts, 1962, page 756. -
-Tukey, J. W. (1977). Exploratory Data Analysis. Reading, Mass: Addison-Wesley. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-tblish.dataset.USPersonalExpenditure; - -# TODO: Port medpolish() from R, whatever that is. - -
out =
uspop ()
¶Populations Recorded by the US Census -
-This data set gives the population of the United States -(in millions) as recorded by the decennial census for the period 1790–1970. -
-year
Year of the census. -
population
Population, in millions. -
McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-t = tblish.dataset.uspop; - -figure -semilogy (t.year, t.population) -xlabel ("Year") -ylabel ("U.S. Population (millions)") - -
out =
VADeaths ()
¶Death Rates in Virginia (1940) -
-Death rates per 1000 in Virginia in 1940. -
-A 2-dimensional matrix deaths
, with age group along dimension 1 and
-demographic group along dimension 2.
-
The death rates are measured per 1000 population per year. They are -cross-classified by age group (rows) and population group (columns). The -age groups are: 50–54, 55–59, 60–64, 65–69, 70–74 and the population groups -are Rural/Male, Rural/Female, Urban/Male and Urban/Female. -
-This provides a rather nice 3-way analysis of variance example. -
-Molyneaux, L., Gilliam, S. K., and Florant, L. C. (1947) Differences -in Virginia death rates by color, sex, age, and rural or urban -residence. American Sociological Review, 12, 525–535. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-tblish.dataset.VADeaths; - -# TODO: Port to Octave - -
out =
volcano ()
¶Topographic Information on Auckland’s Maunga Whau Volcano -
-Maunga Whau (Mt Eden) is one of about 50 volcanos in the Auckland volcanic -field. This data set gives topographic information for Maunga Whau on a -10m by 10m grid. -
-A matrix volcano
with 87 rows and 61 columns, rows corresponding
-to grid lines running east to west and columns to grid lines running south
-to north.
-
Digitized from a topographic map by Ross Ihaka. These data should not be regarded as accurate. -
-Box, G. E. P. and Jenkins, G. M. (1976). Time Series Analysis, Forecasting and -Control. San Francisco: Holden-Day. p. 537. -
-Brockwell, P. J. and Davis, R. A. (1991). Time Series: Theory and Methods. -Second edition. New York: Springer-Verlag. p. 414. -
-tblish.dataset.volcano; - -# TODO: Figure out how to do a topo map in Octave. Just a gridded color plot -# should be fine. And then maybe do a 3-d mesh plot. - -
out =
warpbreaks ()
¶The Number of Breaks in Yarn during Weaving -
-This data set gives the number of warp breaks per loom, where a loom -corresponds to a fixed length of yarn. -
-wool
Type of wool (A or B). -
tension
The level of tension (L, M, H). -
breaks
Number of breaks. -
There are measurements on 9 looms for each of the six types of warp (AL, AM, AH, BL, BM, BH). -
-Tippett, L. H. C. (1950). Technological Applications of Statistics. -New York: Wiley. Page 106. -
-Tukey, J. W. (1977). Exploratory Data Analysis. Reading, Mass: Addison-Wesley. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-t = tblish.dataset.warpbreaks; - -summary (t) - -# TODO: Port the plotting code and OPAR to Octave - -
out =
women ()
¶Average Heights and Weights for American Women -
-This data set gives the average heights and weights for American women aged 30–39. -
-height
Height (in). -
weight
Weight (lbs). -
The data set appears to have been taken from the American Society of Actuaries -Build and Blood Pressure Study for some (unknown to us) earlier year. -
-The World Almanac notes: “The figures represent weights in ordinary indoor -clothing and shoes, and heights with shoes”. -
-The World Almanac and Book of Facts, 1975. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-t = tblish.dataset.women; - -figure -scatter (t.height, t.weight) -xlabel ("Height (in)") -ylabel ("Weight (lb)") -title ("women data: American women aged 30-39") - -
out =
WorldPhones ()
¶The World’s Telephones -
-The number of telephones in various regions of the world (in thousands). -
-A matrix with 7 rows and 7 columns. The columns of the matrix give the -figures for a given region, and the rows the figures for a year. -
-The regions are: North America, Europe, Asia, South America, Oceania, -Africa, Central America. -
-The years are: 1951, 1956, 1957, 1958, 1959, 1960, 1961. -
-AT&T (1961) The World’s Telephones. -
-McNeil, D. R. (1977). Interactive Data Analysis. New York: Wiley. -
-tblish.dataset.WorldPhones; - -# TODO: Port matplot() to Octave - -
out =
WWWusage ()
¶WWWusage -
-A time series of the numbers of users connected to the Internet through -a server every minute. -
-A time series of length 100. -
-Durbin, J. and Koopman, S. J. (2001). Time Series Analysis by State -Space Methods. Oxford: Oxford University Press. http://www.ssfpack.com/dkbook/ -
-Makridakis, S., Wheelwright, S. C. and Hyndman, R. J. (1998). Forecasting: -Methods and Applications. New York: Wiley. -
-# TODO: Come up with example code here - -
out =
zCO2 ()
¶Carbon Dioxide Uptake in Grass Plants -
-The CO2
data set has 84 rows and 5 columns of data from an experiment
-on the cold tolerance of the grass species Echinochloa crus-galli.
-
The CO2 uptake of six plants from Quebec and six plants from Mississippi was -measured at several levels of ambient CO2 concentration. Half the plants of -each type were chilled overnight before the experiment was conducted. -
-Potvin, C., Lechowicz, M. J. and Tardif, S. (1990). The statistical -analysis of ecophysiological response curves obtained from experiments -involving repeated measures. Ecology, 71, 1389–1400. -
-Pinheiro, J. C. and Bates, D. M. (2000). Mixed-effects Models -in S and S-PLUS. New York: Springer. -
-t = tblish.dataset.zCO2; - -# TODO: Coplot -# TODO: Port the linear model to Octave - -
Example dataset collection. -
-tblish.datasets
is a collection of example datasets to go with the
-Tablicious package.
-
The tblish.datasets
class provides methods for listing and loading
-the example datasets.
-
description
(datasetName) ¶out =
description (datasetName)
¶Get or display the description for a dataset. -
-Gets the description for the named dataset. If the output is captured, -it is returned as a charvec containing plain text suitable for human display. -If the output is not captured, displays the description to the console. -
-()
¶out =
list ()
¶List all datasets. -
-Lists all the example datasets known to this class. If the output is -captured, returns the list as a table. If the output is not captured, -displays the list. -
-Returns a table with variables Name, Description, and possibly more. -
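A minimal sketch of how the listing and description methods described above might be used. It assumes the Tablicious package is installed and loadable with `pkg load tablicious`, and that `"women"` is a valid dataset name for `description` (an assumption based on the `tblish.dataset.women` accessor documented elsewhere in this manual).

```octave
pkg load tablicious

# Capture the dataset list instead of printing it; per the text above,
# a captured result is returned as a table.
lst = tblish.datasets.list ();

# Capture a description as plain text. The dataset name "women" is an
# assumption made for illustration.
desc = tblish.datasets.description ("women");

# Load the dataset itself via its accessor.
t = tblish.dataset.women;
```

Calling either method without capturing the output displays the result at the console instead.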
-out =
tblish.evalWithTableVars (tbl, expr)
¶Evaluate an expression against a table array’s variables. -
-Evaluates the M-code expression expr in a workspace where all of tbl’s -variables have been assigned to workspace variables. -
-expr is a charvec containing an Octave expression. -
-As an implementation detail, the workspace will also contain some variables -that are prefixed and suffixed with "__". So try to avoid those in your -table variable names. -
-Returns the result of the evaluation. -
-Examples: -
-[s,p,sp] = tblish.examples.SpDb -tmp = join (sp, p); -shipment_weight = tblish.evalWithTableVars (tmp, "Qty .* Weight") -
See also: table.restrict -
-[fig, hax] =
tblish.examples.coplot (tbl, xvar, yvar, gvar)
¶[fig, hax] =
tblish.examples.coplot (fig, tbl, xvar, yvar, gvar)
¶[fig, hax] =
tblish.examples.coplot (…, OptionName, OptionValue, …)
¶Conditioning plot. -
-tblish.examples.coplot
produces conditioning plots. This is a kind of plot that breaks up the
-data into groups based on one or two grouping variables, and plots each group of data
-in a separate subplot.
-
tbl is a table
containing the data to plot.
-
xvar is the name of the table variable within tbl to use as the X values. -May be a variable name or index. -
-yvar is the name of the table variable within tbl to use as the Y values. -May be a variable name or index. -
-gvar is the name of the table variable or variables within tbl to use as -the grouping variable(s). The grouping variables split the data into groups based on -the distinct values in those variables. gvar may specify either one or two -grouping variables (but not more). It can be provided as a charvec, cellstr, or index -array. Records with a missing value for their grouping variable(s) are ignored. -
-fig is the figure handle to plot into. If fig is not provided, a new figure -is created. -
-Name/Value options: -
-PlotFcn
The plotting function to use, supplied as a function handle. Defaults to @plot
.
-It must be a function that provides the signature fcn(hax, X, Y, …)
.
-
PlotArgs
A cell array of arguments to pass in to the plotting function, following the hax, -x, and y arguments. -
Returns: - fig – the figure handle it plotted into - hax – array of axes handles to all the axes for the subplots -
-out =
tblish.examples.plot_pairs (data)
¶out =
tblish.examples.plot_pairs (data, plot_type)
¶out =
tblish.examples.plot_pairs (fig, …)
¶Plot pairs of variables against each other. -
-data is the data holding the variables to plot. It may be either a
-table
or a struct. Each variable or field in the table
-or struct is considered to be one variable. Each must hold a vector, and
-all the vectors of all the variables must be the same size.
-
plot_type is a charvec indicating what plot type to do in each subplot.
-("scatter"
is the default.) Valid plot_type values are:
-
"scatter"
A plain scatter plot. -
"smooth"
A scatter plot + fitted line, like R’s panel.smooth
does.
-
fig is an optional figure handle to plot into. If omitted, a new -figure is created. -
-Returns the created figure, if the output is captured. -
-spdb =
tblish.examples.SpDb ()
¶[s, p, sp] =
tblish.examples.SpDb ()
¶The classic Suppliers-Parts example database. -
-Constructs the classic C. J. Date Suppliers-Parts ("SP") example database as tables. -This database is the one used as an example throughout Date’s "An Introduction to -Database Systems" textbook. -
-Returns the database as a set of three table arrays. If one argout is captured, the -tables are returned in the fields of a single struct. If multiple argouts are captured, the -tables are returned as three argouts with a single table in each, in the order (s, -p, sp). -
-out =
tblish.sizeof2 (x)
¶Approximate size of an array in bytes, with object support. -
-This is an alternative to Octave’s sizeof
function that tries to provide
-meaningful support for objects, including the classes defined in Tablicious. It is
-named "sizeof2" instead of "sizeof" to avoid a "shadowing core function" warning
-when loading Tablicious, because it seems that Octave does not consider packages
-(namespaces) when detecting shadowed functions.
-
This may be supplemented or replaced by sizeof
override methods on Tablicious’s
-classes. I’m not sure whether Octave’s sizeof
supports extension by method
-overrides, so I’m not doing that yet. If that happens, this sizeof2
function
-will stick around in a deprecated state for a while, and it will respect those override
-methods.
-
For tables, this returns the sum of sizeof
for all of its variables’
-arrays, plus the size of the VariableNames and any other metadata stored in obj.
-
This is currently broken for some types, because its implementation is in transition -from overridden methods on Tablicious’s objects to a separate function. -
-This is not supported, fully or at all, for all input types, but it has support -for the types defined in Tablicious, plus some Octave built-in types, and makes a -best effort at figuring out user-defined classdef objects. It currently does not -have extensibility support for customization by classdef classes, but that may be -added in the future, in which case its output may change significantly for classdef -objects in future releases. -
-x is an array of any type. -
-Returns a scalar numeric. Returns NaN for types that are known to not be supported, -instead of raising an error. Raises an error if it fails to determine the size of an -input of a type that it thought was supported. -
-See also: sizeof -
-[out] =
tblish.table.grpstats (tbl, groupvar)
¶[out] =
tblish.table.grpstats (…, 'DataVars'
, DataVars)
¶Statistics by group for a table array. -
-This is a table-specific implementation of grpstats
that works on table arrays.
-It is supplied as a function in the +tblish package to avoid colliding with
-the global grpstats
function supplied by the Statistics Octave Forge package.
-Depending on which version of the Statistics OF package you are using, it may or may
-not support table inputs to its grpstats
function. This function is supplied
-as an alternative you can use in an environment where table
arrays are not
-supported by the grpstats
that you have, though you need to make code changes
-and call it as tblish.table.grpstats(tbl)
instead of with a plain
-grpstats(tbl)
.
-
See also: table.groupby, table.findgroups, table.splitapply -
-out =
timezones ()
¶out =
timezones (area)
¶List all the time zones defined on this system. -
-This lists all the time zones that are defined in the IANA time zone database -used by this Octave. (On Linux and macOS, that will generally be the system -time zone database from /usr/share/zoneinfo. On Windows, it will be -the database redistributed with the Tablicious package.) -
-If the return is captured, the output is returned as a table if your Octave -has table support, or a struct if it does not. It will have fields/variables -containing column vectors: -
-Name
The IANA zone name, as cellstr. -
Area
The geographical area the zone is in, as cellstr. -
Compatibility note: Matlab also includes UTCOffset and DSTOffset fields in -the output; these are currently unimplemented. -
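A short sketch of both call forms, assuming the Tablicious package is loaded and that your Octave has table support (so the result is a table rather than a struct). The area name `"Europe"` is chosen for illustration; which zones are present depends on the time zone database on your system.

```octave
pkg load tablicious

# All zones known to the time zone database in use.
tz = timezones ();

# Only the zones in one geographical area ("Europe" is an illustrative choice).
tz_eu = timezones ("Europe");

# Name and Area are documented above as cellstr column vectors.
names = tz.Name;
areas = tz.Area;
```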
-out =
todatetime (x)
¶Convert input to a Tablicious datetime array, with convenient interface. -
-This is an alternative to the regular datetime constructor, with a signature -and conversion logic that Tablicious’s author likes better. -
-This mainly exists because datetime’s constructor signature does not accept -datenums, and instead treats one-arg numeric inputs as datevecs. (For compatibility -with Matlab’s interface.) I think that’s less convenient: datenums seem to be -more common than datevecs in M-code, and it returns an object array that’s not the -same size as the input. -
-Returns a datetime array whose size depends on the size and type of the input -array, but will generally be the same size as the array of strings or numerics -the input array "represents". -
-out =
vartype (type)
¶Filter by variable type for use in subscripting. -
-Creates an object that can be used for subscripting into the variables -dimension of a table and filtering on variable type. -
-type is the name of a type as charvec. This may be anything that
-the isa
function accepts, or 'cellstr'
to select cellstrs,
-as determined by iscellstr
.
-
Returns an object of an opaque type. Don’t worry about what type it is;
-just pass it into the second argument of a subscript into a table
-object.
-
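As an illustration of the subscripting pattern described above, here is a minimal sketch. It assumes the Tablicious package is loaded; the table contents and variable names (`id`, `label`, `score`) are made up for the example.

```octave
pkg load tablicious

# A small table with two numeric variables and one cellstr variable.
# The variable names here are hypothetical.
t = table ([1; 2; 3], {"a"; "b"; "c"}, [0.5; 1.5; 2.5], ...
  "VariableNames", {"id", "label", "score"});

# Pass the vartype object as the second (variables) subscript.
t_num = t(:, vartype ("numeric"));   # anything isa (x, "numeric") accepts
t_str = t(:, vartype ("cellstr"));   # selected via iscellstr
```

The row subscript is left as `:` so that only the variables dimension is filtered.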
out =
vecfun (fcn, x, dim)
¶Apply function to vectors in array along arbitrary dimension. -
-This function is not implemented yet. -
-Applies a given function to the vector slices of an N-dimensional array, where -those slices are along a given dimension. -
-fcn is a function handle to apply. -
-x is an array of arbitrary type which is to be sliced and passed -in to fcn. -
-dim is the dimension along which the vector slices lie. -
-Returns the collected output of the fcn calls, which will be -the same size as x, but not necessarily the same type. -
-out =
years (x)
¶Create a duration
x years long, or get the years in a duration
-x.
-
If input is numeric, returns a duration
array in units of fixed-length
-years of 365.2425 days each.
-
If input is a duration
, converts the duration
to a number of fixed-length
-years as double.
-
Note: years
creates fixed-length years, which may not be what you want.
-To create a duration of calendar years (which account for actual leap days),
-use calyears
.
-
See calyears. -
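The two directions of conversion described above can be sketched as follows. This assumes the Tablicious package is loaded, and that its `days` function returns the number of fixed-length days when given a `duration` (an assumption about a function not documented in this part of the manual).

```octave
pkg load tablicious

# Numeric input: build a duration of two fixed-length years.
d = years (2);

# Duration input: convert back to a number of fixed-length years (double).
n_years = years (d);

# Assumed helper: days (d) gives the length in fixed-length days, which
# should be 2 * 365.2425 under the definition above.
n_days = days (d);
```

For spans that should follow the calendar, including actual leap days, `calyears` is the documented alternative.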
Tablicious for GNU Octave is covered by the GNU GPLv3 and other Free and Open Source Software licenses. -
-The main code of Tablicious is licensed under the GNU GPL version 3. -
-The date/time portion of Tablicious includes some Unicode data files licensed under the Unicode License Agreement - Data Files and Software license. -
-The Tablicious test suite contains some files, specifically some table-related tests using MP-Test like t/t_01_table.m
, which are BSD 3-Clause licensed, and are adapted from MATPOWER written by Ray Zimmerman.
-
The Fisher Iris dataset is Public Domain. -
-This manual is for Tablicious, version 0.4.4-SNAPSHOT. -
-Copyright © 2019, 2023, 2024 Andrew Janke -
--- -Permission is granted to make and distribute verbatim copies of -this manual provided the copyright notice and this permission notice -are preserved on all copies. -
-Permission is granted to copy and distribute modified versions of this -manual under the conditions for verbatim copying, provided that the entire -resulting derived work is distributed under the terms of a permission -notice identical to this one. -
-Permission is granted to copy and distribute translations of this manual -into another language, under the same conditions as for modified versions. -