From 4b62efe165b780cbc9b61be2b60a02ffd934edf2 Mon Sep 17 00:00:00 2001
From: "Documenter.jl"
Date: Fri, 7 Feb 2025 02:52:41 +0000
Subject: [PATCH] build based on a0ae52f

---
 dev/.documenter-siteinfo.json               |  2 +-
 dev/concepts/index.html                     |  2 +-
 dev/examples/6e83bb1b.svg                   | 46 --------------
 dev/examples/73a8fc8d.svg                   | 48 +++++++++++++++
 dev/examples/977b08bf.svg                   | 44 -------------
 dev/examples/a4f2251e.svg                   | 44 +++++++++++++
 dev/examples/{b34a791c.svg => ac132618.svg} | 56 ++++++++---------
 dev/examples/{385e3a2d.svg => c61108b7.svg} | 68 ++++++++++-----------
 dev/examples/index.html                     |  8 +--
 dev/index.html                              |  2 +-
 dev/reference/align/index.html              |  2 +-
 dev/reference/arithmetic/index.html         |  2 +-
 dev/reference/creating_ops/index.html       |  2 +-
 dev/reference/fundamentals/index.html       |  2 +-
 dev/reference/internals/index.html          |  4 +-
 dev/reference/misc_ops/index.html           |  2 +-
 dev/reference/online_windowed/index.html    |  2 +-
 dev/reference/sources/index.html            |  2 +-
 18 files changed, 170 insertions(+), 168 deletions(-)
 delete mode 100644 dev/examples/6e83bb1b.svg
 create mode 100644 dev/examples/73a8fc8d.svg
 delete mode 100644 dev/examples/977b08bf.svg
 create mode 100644 dev/examples/a4f2251e.svg
 rename dev/examples/{b34a791c.svg => ac132618.svg} (89%)
 rename dev/examples/{385e3a2d.svg => c61108b7.svg} (90%)

diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json
index 401d087f..b2e71ed9 100644
--- a/dev/.documenter-siteinfo.json
+++ b/dev/.documenter-siteinfo.json
@@ -1 +1 @@
-{"documenter":{"julia_version":"1.11.3","generation_timestamp":"2025-02-06T02:50:57","documenter_version":"1.8.0"}}
\ No newline at end of file
+{"documenter":{"julia_version":"1.11.3","generation_timestamp":"2025-02-07T02:52:33","documenter_version":"1.8.0"}}
\ No newline at end of file
diff --git a/dev/concepts/index.html b/dev/concepts/index.html
index 17c5df79..b552160c 100644
--- a/dev/concepts/index.html
+++ b/dev/concepts/index.html
@@ -6,4 +6,4 @@ t_i &> t_{i-1}\ \forall i.
\end{aligned}\]

Here we use $\mathcal{T}$ to denote the type of time.[1] We only require that there is a total order on $\mathcal{T}$, but thinking of it as a real number is a good analogy. We also, somewhat sloppily, identify $\infty$ with $\max \mathcal{T}$.

This restriction may be relaxed in the future.

Colloquially, we will refer to a time-value pair as a knot.[2]

$\mathcal{T}_x$ is the semi-infinite interval bounded below by the time of the first knot in $x$.

We define the TimeDag.value_type of $x$ to be the set $\mathcal{X}$ above, and in practice this can be any Julia type.

TimeDag primarily represents a time-series as a TimeDag.Node. It also stores time-series data in memory in the Block type.

Here is a visualisation of a time-series $x$:

A time series

Functional interpretation

We can also consider $x$ to be a function, $x : \mathcal{T}_x \rightarrow \mathcal{X}$. This is defined as $x(t) = x_{i^*}$, where $i^* = \max\ \{i\ |\ t_i \leq t\}$ is the index of the latest knot at or before $t$.

Informally, this means that whenever we observe a value $x_i$, the 'value of' the time-series is $x_i$ until such time as we observe $x_{i+1}$.

Sometimes it is useful to define $x(t_{-}) = \oslash\ \forall\ t_{-} \in \mathcal{T} \setminus \mathcal{T}_x$. Here, $\oslash$ is a placeholder element that simply means "no value".
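The lookup described above can be sketched in plain Julia. This is an illustration only: `value_at` is a hypothetical helper (not part of TimeDag), knots are assumed to be stored as two parallel vectors with strictly increasing times, and `nothing` stands in for $\oslash$.

```julia
# Hypothetical helper, not TimeDag API: evaluate x(t) from parallel vectors
# of knot times (strictly increasing) and knot values.
function value_at(times::AbstractVector, vals::AbstractVector, t)
    # Index of the last knot with t_i <= t; 0 if t precedes the first knot.
    i = searchsortedlast(times, t)
    return i == 0 ? nothing : vals[i]
end

times = [1, 3, 6]
vals = [10.0, 20.0, 30.0]

value_at(times, vals, 0)  # nothing: before the first knot, i.e. "no value"
value_at(times, vals, 3)  # 20.0: a knot is visible at its own time
value_at(times, vals, 5)  # 20.0: the value persists until the next knot
```

Because times are strictly increasing, a binary search finds the relevant knot unambiguously, which is the reason repeated times are ruled out.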

Info

Note that time is strictly increasing, and repeated times are not permitted. This conceptual choice is necessary to consider $x$ to be a map from time to value as above. Without this restriction, there is an ambiguity whenever a time is repeated.

Functions of time-series

General case

We wish to define a general notion of a function $f : \mathcal{TS} \times \cdots \times \mathcal{TS} \rightarrow \mathcal{TS}$. Let $z = f(x, y, \ldots)$, where $x$, $y$ and $z$ are all time-series.

Firstly, we define an indicator-like function $f_t(t, \ldots) \in \{0,1\}$, which returns $1$ iff we should emit a value at time $t$:

\[\{t_i\} = \{t \in \mathcal{T}\ |\ f_t(t, \{x(t') | t' \leq t\}, \{y(t') | t' \leq t\}, \ldots) = 1\}\]

Colloquially, whenever $f_t$ returns $1$ we say that $z$ ticks, i.e. emits a knot.

Then, we require that each value $z_i$ at time $t_i$ can be written as the result of a function $f'$:

\[z_i = f'(t_i, \{x(t) | t \leq t_i\}, \{y(t) | t \leq t_i\}, \ldots).\]

Info

Let us unpack this notation a bit:

  • Knots of $z$ are only allowed to depend on non-future values of $x$ and $y$.
  • $z$ can tick whenever it likes, possibly dependent on values of $x$ and $y$.
  • The knot emitted can be a function of time.

The first of these is an important requirement, and TimeDag aims to enforce this structurally.

Parameters

In the above discussion, all arguments to $f$ are time-series. Such functions could additionally have some other non-time-series constant parameters, which we will denote $\theta\in\Theta$. Strictly mathematically, note that a "constant" can just be viewed as a time-series with a single observation at $\min \mathcal{T}$; so the above description is still fully general.

In practice (for efficient implementation) we will want function $f : \Theta \times \mathcal{TS} \times \cdots \rightarrow \mathcal{TS}$. So, $f(\theta, x, y, \ldots)$ then has some constant parameter(s) $\theta$.

We'll continue to drop the explicit $\theta$ dependence where it isn't interesting, to simplify notation.

Explicit state

It is useful to re-write the value computation by introducing the notion of a 'state' $\zeta_i$:

\[\begin{aligned} z_i, \zeta_i &= f_v(t_i, \zeta_{i-1}, x(t_i), y(t_i), \ldots)\\
-\end{aligned}\]

Each state $\zeta_{i-1}$ needs to package as much information about the history of the inputs as necessary to compute each $z_i$ (as well as the new state $\zeta_i$).

Batching

Note that, even after the re-arrangement in Explicit state, $f_t$ is still a bit awkward. One cannot directly implement it, since one would have to call $f_t$ for every $t$ in an infinite (or at least very large) set.

First, let us introduce the notion of slicing. Define an interval $\delta = [t_1,t_2) \subset \mathcal{T}$.[3] Then, the slice of $x$ over $\delta$, which we'll write as $x' = x[\delta]$, is a new time-series with support $\mathcal{T}_{x'} = \delta \cap \mathcal{T}_x$.

Let $\{\delta_i\}$ represent an ordered non-overlapping set of intervals, whose union covers all of $\mathcal{T}$. We then write, analogous to the definition of $f_v$:

\[z[\delta_i], \zeta_{\sup \delta_i} = f_b(\delta_i, \zeta_{\sup \delta_{i-1}}, x[\delta_i], y[\delta_i], \ldots).\]

This function outputs knots — time-value pairs — rather than just the values, and hence performs the roles of both $f_t$ and $f_v$ previously.

NB $\sup\delta_i$ indicates the supremum of the interval $\delta_i$, i.e. its upper bound. The state $\zeta$ is subscripted only by this upper bound (that is, by a time) because it should not be path dependent: for a given time-series operation, we should always end up with the same state at a particular time, regardless of how many batches we have used to get there.

Info

It is useful to emphasise this distinction:

Helpfully, often $f$ has simple semantics & behaviour that can be reasoned about. The implementation details can be ignored in this reasoning.

Warning

A little thought shows that $f_b$, and hence TimeDag.run_node!, can express illegal time-series operations that future-peek. Care must be taken when implementing this low-level interface!

Where possible, when custom operations are required, use the higher-level abstractions referred to below.

Classes of function

All time-series functions in TimeDag are of the form of $f$ above. Here we identify a few categories of such functions which cover many of the cases of interest.

No inputs

A function $f : \emptyset \rightarrow \mathcal{TS}$ can be considered a source. That is, it generates a time-series with no inputs.

In this case, if $z = f()$, then the implementation $f_b$ technically reduces to $z[\delta] = f_b(\delta)$. In principle no state is required, since there is no external information to remember. However, in practice retaining the state term can be useful to increase implementation efficiency.

Single input (map over values)

Consider the unary - function operating on a time-series: $z = -x$. This is a "boring" time-series operation, in that all times of $z$ are identical to those of $x$. The values are determined by $z_i = -x_i\ \forall i$.

Some unary operators from Base, like Base.:-, have methods on TimeDag.Node defined within TimeDag.

More generally, wrap and wrapb let you create a time-series function from such a unary function. See Creating operations for more details.

Single input (lag)

A lag is a slightly more complex unary function. Rather than explain it mathematically, a visualisation can help:

lag

Time is increasing to the right. Each grey arrow indicates that one value is used in computing another — in the case of lag, the value is simply used directly. Note how, for this function, we never introduce new timestamps — we simply 'lag' the previous value onto the next timestamp.

A related concept is a time-lag, where each knot would be delayed by some fixed period of time $\partial t$:

Time lag

Single input (cumulative sum)

Similarly to a simple function operation on values, a cumulative sum over time (Base.sum) ticks whenever the input ticks. However, this time each value is a function of all preceding knots:

sum

Alignment

When considering a function of two or more time-series, a useful special-case is where the output ticks at some subset of the times that all the inputs tick. We consider alignment, which is a selection process with semantics similar (but not identical) to "joins" in database terminology.

We define three ways of performing alignment. For each one we document the TimeDag constant which should be used in function calls that accept an alignment, and give a graphical interpretation. Each diagram is shown for the case of two inputs; the docstrings describe the general case with more inputs.

Functions in TimeDag that accept multiple nodes typically default to using UNION alignment.

Union

Similar to an "outer join", with the key difference that we only emit knots once all inputs have started ticking.

TimeDag.UNION (Constant)
UNION

For inputs (A, B, ...), tick whenever any input ticks so long as all inputs are active.

source

Union alignment

Intersect

Tick if and only if both inputs tick. This is identical to an "inner join".

TimeDag.INTERSECT (Constant)
INTERSECT

For inputs (A, B, ...), tick whenever all inputs tick simultaneously.

source

Intersect alignment

Left

Similar to a "left join", with the key difference that we only emit knots once all inputs have started ticking.

TimeDag.LEFT (Constant)
LEFT

For inputs (A, B, ...), tick whenever A ticks so long as all inputs are active.

source

Left alignment

Initial values

For the alignments above, it was noted that we have to wait for all inputs to start ticking before the output ticks.

It is possible to tell TimeDag that a given operation should consider its inputs to have some initial values. This behaves a little like a knot at the start of the evaluation window, but does not result in the creation of an output knot at that time. In the notation above, it is the definition of a value for $x(t_{-})$ which isn't $\oslash$.

Initial values are set separately for each input. Most functions of two or more nodes will take an initial_values keyword argument to specify these.

Some more implementation details on the lower-level functionality that controls this are provided in Alignment implementation.

Computational graph

Nodes

So far we have introduced the notion of time-series operations. By working purely with TimeDag.NodeOps, we build up an abstract representation of the computation we want to do. A TimeDag.Node contains zero or more input nodes, as well as a TimeDag.NodeOp defining how they should be combined.

Evaluation

When we wish to evaluate a node over some interval $\delta$, we first evaluate all input nodes over the same interval, recursively. Given all inputs, we can evaluate a particular node using $f_b$, as defined previously. The practicalities of this are discussed further in Advanced evaluation.

Subgraph elimination

By using an Identity map we ensure that we never create duplicate nodes. This effectively eliminates the creation of common subgraphs, which means that when performing evaluation we never repeat work.

+\end{aligned}\]

Each state $\zeta_{i-1}$ needs to package as much information about the history of the inputs as necessary to compute each $z_i$ (as well as the new state $\zeta_i$).
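As a concrete illustration of the $f_v$ form, here is a running mean written as a step function in plain Julia. The names `running_mean_step` and `run_all` are hypothetical and not TimeDag API; the point is only that the state, a `(count, total)` pair, carries all the history needed.

```julia
# Hypothetical sketch of f_v for a running mean over a single input.
# The state is (count, total); the time argument is dropped as it is unused.
function running_mean_step(state::Tuple{Int,Float64}, x::Float64)
    n, total = state
    n += 1
    total += x
    return total / n, (n, total)  # the new value and the new state
end

# Drive the step function over all input values, threading the state through.
function run_all(xs::Vector{Float64})
    state = (0, 0.0)
    out = Float64[]
    for x in xs
        z, state = running_mean_step(state, x)
        push!(out, z)
    end
    return out
end

run_all([2.0, 4.0, 9.0])  # [2.0, 3.0, 5.0]
```

The step function never sees earlier inputs directly; everything it needs from the past arrives through the state.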

Batching

Note that, even after the re-arrangement in Explicit state, $f_t$ is still a bit awkward. One cannot directly implement it, since one would have to call $f_t$ for every $t$ in an infinite (or at least very large) set.

First, let us introduce the notion of slicing. Define an interval $\delta = [t_1,t_2) \subset \mathcal{T}$.[3] Then, the slice of $x$ over $\delta$, which we'll write as $x' = x[\delta]$, is a new time-series with support $\mathcal{T}_{x'} = \delta \cap \mathcal{T}_x$.

Let $\{\delta_i\}$ represent an ordered non-overlapping set of intervals, whose union covers all of $\mathcal{T}$. We then write, analogous to the definition of $f_v$:

\[z[\delta_i], \zeta_{\sup \delta_i} = f_b(\delta_i, \zeta_{\sup \delta_{i-1}}, x[\delta_i], y[\delta_i], \ldots).\]

This function outputs knots — time-value pairs — rather than just the values, and hence performs the roles of both $f_t$ and $f_v$ previously.

NB $\sup\delta_i$ indicates the supremum of the interval $\delta_i$, i.e. its upper bound. The state $\zeta$ is subscripted only by this upper bound (that is, by a time) because it should not be path dependent: for a given time-series operation, we should always end up with the same state at a particular time, regardless of how many batches we have used to get there.
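The path-independence requirement can be checked on a toy case. The sketch below is plain Julia, with hypothetical names (`cumsum_batch`, `slice_knots`) that are not TimeDag API. It implements an $f_b$ for a cumulative sum and confirms that one large batch and two smaller batches give the same knots and the same final state.

```julia
# Hypothetical sketch of f_b for a cumulative sum. A batch is a pair of
# parallel vectors (times, values), all lying within one interval [t1, t2).
function cumsum_batch(state::Float64, times::Vector{Int}, vals::Vector{Float64})
    out = similar(vals)
    for (i, v) in enumerate(vals)
        state += v
        out[i] = state
    end
    return times, out, state  # output knots, plus the state at the batch end
end

# Slice knots to the half-open interval [t1, t2).
function slice_knots(times, vals, t1, t2)
    keep = findall(t -> t1 <= t < t2, times)
    return times[keep], vals[keep]
end

times = [1, 2, 5, 7]
vals = [1.0, 2.0, 3.0, 4.0]

# One batch covering [0, 10).
_, z_one, s_one = cumsum_batch(0.0, slice_knots(times, vals, 0, 10)...)

# Two batches, [0, 4) then [4, 10), threading the state through.
_, z_a, s_a = cumsum_batch(0.0, slice_knots(times, vals, 0, 4)...)
_, z_b, s_b = cumsum_batch(s_a, slice_knots(times, vals, 4, 10)...)

@assert vcat(z_a, z_b) == z_one  # same knots
@assert s_b == s_one             # same state at the same time
```

Note that the half-open intervals matter: a knot at exactly $t = 4$ would belong to the second batch only, so no knot is processed twice.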

Info

It is useful to emphasise this distinction:

Helpfully, often $f$ has simple semantics & behaviour that can be reasoned about. The implementation details can be ignored in this reasoning.

Warning

A little thought shows that $f_b$, and hence TimeDag.run_node!, can express illegal time-series operations that future-peek. Care must be taken when implementing this low-level interface!

Where possible, when custom operations are required, use the higher-level abstractions referred to below.

Classes of function

All time-series functions in TimeDag are of the form of $f$ above. Here we identify a few categories of such functions which cover many of the cases of interest.

No inputs

A function $f : \emptyset \rightarrow \mathcal{TS}$ can be considered a source. That is, it generates a time-series with no inputs.

In this case, if $z = f()$, then the implementation $f_b$ technically reduces to $z[\delta] = f_b(\delta)$. In principle no state is required, since there is no external information to remember. However, in practice retaining the state term can be useful to increase implementation efficiency.

Single input (map over values)

Consider the unary - function operating on a time-series: $z = -x$. This is a "boring" time-series operation, in that all times of $z$ are identical to those of $x$. The values are determined by $z_i = -x_i\ \forall i$.

Some unary operators from Base, like Base.:-, have methods on TimeDag.Node defined within TimeDag.

More generally, wrap and wrapb let you create a time-series function from such a unary function. See Creating operations for more details.

Single input (lag)

A lag is a slightly more complex unary function. Rather than explain it mathematically, a visualisation can help:

lag

Time is increasing to the right. Each grey arrow indicates that one value is used in computing another — in the case of lag, the value is simply used directly. Note how, for this function, we never introduce new timestamps — we simply 'lag' the previous value onto the next timestamp.
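For readers who prefer code to pictures, a knot-based lag can be sketched on plain vectors. This is an assumption-laden illustration rather than TimeDag's implementation: `lag_knots` is a hypothetical name, and the sketch assumes that no output is produced until `n` earlier knots exist.

```julia
# Hypothetical sketch: lag a series by `n` knots. No new timestamps appear;
# each output knot reuses an existing time with the value from n knots earlier.
function lag_knots(times::Vector, vals::Vector, n::Integer)
    n >= 0 || throw(ArgumentError("n must be non-negative"))
    k = min(n, length(times))  # lagging by more knots than exist yields nothing
    return times[(k + 1):end], vals[1:(end - k)]
end

lag_knots([1, 2, 4, 7], [10, 20, 30, 40], 1)  # ([2, 4, 7], [10, 20, 30])
```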

A related concept is a time-lag, where each knot would be delayed by some fixed period of time $\partial t$:

Time lag

Single input (cumulative sum)

Similarly to a simple function operation on values, a cumulative sum over time (Base.sum) ticks whenever the input ticks. However, this time each value is a function of all preceding knots:

sum

Alignment

When considering a function of two or more time-series, a useful special-case is where the output ticks at some subset of the times that all the inputs tick. We consider alignment, which is a selection process with semantics similar (but not identical) to "joins" in database terminology.

We define three ways of performing alignment. For each one we document the TimeDag constant which should be used in function calls that accept an alignment, and give a graphical interpretation. Each diagram is shown for the case of two inputs; the docstrings describe the general case with more inputs.

Functions in TimeDag that accept multiple nodes typically default to using UNION alignment.

Union

Similar to an "outer join", with the key difference that we only emit knots once all inputs have started ticking.

TimeDag.UNION (Constant)
UNION

For inputs (A, B, ...), tick whenever any input ticks so long as all inputs are active.

source

Union alignment

Intersect

Tick if and only if both inputs tick. This is identical to an "inner join".

TimeDag.INTERSECT (Constant)
INTERSECT

For inputs (A, B, ...), tick whenever all inputs tick simultaneously.

source

Intersect alignment

Left

Similar to a "left join", with the key difference that we only emit knots once all inputs have started ticking.

TimeDag.LEFT (Constant)
LEFT

For inputs (A, B, ...), tick whenever A ticks so long as all inputs are active.

source

Left alignment
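The three alignments can be compared side by side with a small plain-Julia sketch for two inputs. Everything here is hypothetical (`align_two` and the `:union` / `:intersect` / `:left` symbols are not TimeDag API); it only mirrors the rules stated above, including the requirement that all inputs are active before anything is emitted.

```julia
# Hypothetical sketch of the three alignments for two inputs A and B, each
# given as (times, values) with strictly increasing times. Returns a vector
# of (time, a_value, b_value) tuples.
function align_two(ta, va, tb, vb, mode::Symbol)
    out = Tuple{eltype(ta),eltype(va),eltype(vb)}[]
    for t in sort!(union(ta, tb))
        ia = searchsortedlast(ta, t)  # latest knot of A at or before t
        ib = searchsortedlast(tb, t)  # latest knot of B at or before t
        (ia == 0 || ib == 0) && continue  # wait until both inputs are active
        a_ticks = ta[ia] == t
        b_ticks = tb[ib] == t
        emit = if mode === :union
            true  # t is drawn from the union of times, so some input ticks
        elseif mode === :intersect
            a_ticks && b_ticks
        elseif mode === :left
            a_ticks
        else
            throw(ArgumentError("unknown mode $mode"))
        end
        emit && push!(out, (t, va[ia], vb[ib]))
    end
    return out
end

ta, va = [1, 3, 5], [:a1, :a3, :a5]
tb, vb = [2, 3, 6], [:b2, :b3, :b6]

align_two(ta, va, tb, vb, :union)      # ticks at 2, 3, 5, 6
align_two(ta, va, tb, vb, :intersect)  # ticks at 3 only
align_two(ta, va, tb, vb, :left)       # ticks at 3 and 5
```

In this example the knot of A at time 1 never reaches the output under any alignment, because B has not yet started ticking.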

Initial values

For the alignments above, it was noted that we have to wait for all inputs to start ticking before the output ticks.

It is possible to tell TimeDag that a given operation should consider its inputs to have some initial values. This behaves a little like a knot at the start of the evaluation window, but does not result in the creation of an output knot at that time. In the notation above, it is the definition of a value for $x(t_{-})$ which isn't $\oslash$.

Initial values are set separately for each input. Most functions of two or more nodes will take an initial_values keyword argument to specify these.

Some more implementation details on the lower-level functionality that controls this are provided in Alignment implementation.

Computational graph

Nodes

So far we have introduced the notion of time-series operations. By working purely with TimeDag.NodeOps, we build up an abstract representation of the computation we want to do. A TimeDag.Node contains zero or more input nodes, as well as a TimeDag.NodeOp defining how they should be combined.

Evaluation

When we wish to evaluate a node over some interval $\delta$, we first evaluate all input nodes over the same interval, recursively. Given all inputs, we can evaluate a particular node using $f_b$, as defined previously. The practicalities of this are discussed further in Advanced evaluation.

Subgraph elimination

By using an Identity map we ensure that we never create duplicate nodes. This effectively eliminates the creation of common subgraphs, which means that when performing evaluation we never repeat work.
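The general idea of de-duplicating nodes through a lookup table can be sketched as follows. This is not TimeDag's implementation: `SketchNode`, `NODE_CACHE` and `intern_node` are hypothetical names, and the cache key (operation name plus the identity of each parent) is an assumption chosen for brevity.

```julia
# Hypothetical sketch of node de-duplication. Requesting the same operation
# on the same parents twice returns the one existing node object.
mutable struct SketchNode
    op::Symbol
    parents::Vector{SketchNode}
end

# Cache keyed on the operation and the object identity of each parent.
const NODE_CACHE = Dict{Tuple{Symbol,Vector{UInt}},SketchNode}()

function intern_node(op::Symbol, parents::SketchNode...)
    key = (op, UInt[objectid(p) for p in parents])
    # Construct a new node only if no equivalent one has been created yet.
    return get!(() -> SketchNode(op, SketchNode[parents...]), NODE_CACHE, key)
end

price = intern_node(:source)
lag_a = intern_node(:lag, price)
lag_b = intern_node(:lag, price)

lag_a === lag_b  # true: the second request returns the existing node
```

With this in place, an expression that mentions `lag(price)` twice refers to a single node, so an evaluator that visits each node once does the lag work once.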

diff --git a/dev/examples/6e83bb1b.svg b/dev/examples/6e83bb1b.svg
deleted file mode 100644
index fd3a90cc..00000000
--- a/dev/examples/6e83bb1b.svg
+++ /dev/null
@@ -1,46 +0,0 @@
[46 removed lines of SVG markup, content not recoverable]
diff --git a/dev/examples/73a8fc8d.svg b/dev/examples/73a8fc8d.svg
new file mode 100644
index 00000000..ccc93b12
--- /dev/null
+++ b/dev/examples/73a8fc8d.svg
@@ -0,0 +1,48 @@
[48 added lines of SVG markup, content not recoverable]
diff --git a/dev/examples/977b08bf.svg b/dev/examples/977b08bf.svg
deleted file mode 100644
index 61ae88a7..00000000
--- a/dev/examples/977b08bf.svg
+++ /dev/null
@@ -1,44 +0,0 @@
[44 removed lines of SVG markup, content not recoverable]
diff --git a/dev/examples/a4f2251e.svg b/dev/examples/a4f2251e.svg
new file mode 100644
index 00000000..aecf0849
--- /dev/null
+++ b/dev/examples/a4f2251e.svg
@@ -0,0 +1,44 @@
[44 added lines of SVG markup, content not recoverable]
diff --git a/dev/examples/b34a791c.svg b/dev/examples/ac132618.svg
similarity index 89%
rename from dev/examples/b34a791c.svg
rename to dev/examples/ac132618.svg
index a0304c78..d4527a4d 100644
--- a/dev/examples/b34a791c.svg
+++ b/dev/examples/ac132618.svg
@@ -1,42 +1,42 @@
[changed lines of SVG markup, content not recoverable]
diff --git a/dev/examples/385e3a2d.svg b/dev/examples/c61108b7.svg
similarity index 90%
rename from dev/examples/385e3a2d.svg
rename to dev/examples/c61108b7.svg
index 75606dfe..d6586cb5 100644
--- a/dev/examples/385e3a2d.svg
+++ b/dev/examples/c61108b7.svg
@@ -1,48 +1,48 @@
[changed lines of SVG markup, content not recoverable]
diff --git a/dev/examples/index.html b/dev/examples/index.html
index 8fad7711..dc9894dc 100644
--- a/dev/examples/index.html
+++ b/dev/examples/index.html
@@ -25,7 +25,7 @@
 | 2000-01-25T00:00:00 | 112.25 |
 | ⋮ | ⋮ |
 484 rows omitted
-

Above it is represented in a raw table-like form.[1] We can see that this block has values of type Float64. For Blocks with numeric value types, we can use the included plot recipe to visualise them:

plot(block; label="CL price")
Example block output

Creating nodes

The core of TimeDag is a computational graph of TimeDag.Nodes. These nodes represent time-series, and how they should be computed in terms of other time-series.

We can create a node from the block of data we already have:

price = block_node(block)
BlockNode{Float64}

The node knows its value_type, which will be Float64 (since the values will just be those of the block we created earlier).

value_type(price)
Float64

Now let's perform some computation — let's estimate the 50 day rolling standard deviation of returns.

We start by computing relative returns using lag; given a price $p_t$ at time $t$, the return series is $r_t = \frac{p_t - p_{t-1}}{p_{t-1}}$. We then use Statistics.std to define an online standard deviation over the specified window.

returns = (price - lag(price, 1)) / lag(price, 1)
+

Above it is represented in a raw table-like form.[1] We can see that this block has values of type Float64. For Blocks with numeric value types, we can use the included plot recipe to visualise them:

plot(block; label="CL price")
Example block output

Creating nodes

The core of TimeDag is a computational graph of TimeDag.Nodes. These nodes represent time-series, and how they should be computed in terms of other time-series.

We can create a node from the block of data we already have:

price = block_node(block)
BlockNode{Float64}

The node knows its value_type, which will be Float64 (since the values will just be those of the block we created earlier).

value_type(price)
Float64

Now let's perform some computation — let's estimate the 50 day rolling standard deviation of returns.

We start by computing relative returns using lag; given a price $p_t$ at time $t$, the return series is $r_t = \frac{p_t - p_{t-1}}{p_{t-1}}$. We then use Statistics.std to define an online standard deviation over the specified window.

returns = (price - lag(price, 1)) / lag(price, 1)
 using Statistics
 std_50 = std(returns, 50)
TimeDag.SimpleUnary{sqrt, true, Float64}()

Whilst it isn't normally necessary to inspect the graph by hand, we can visualise it with AbstractTrees.print_tree. This is often good enough for a simple text-based representation, but be aware that we actually have a graph, not a tree. In the output, the Lag node appears twice, but it is in fact exactly the same object.

using AbstractTrees
 print_tree(std_50)
SimpleUnary{sqrt, true, Float64}()
@@ -39,8 +39,8 @@
          └─ BlockNode{Float64}

Now that we have defined our computation, we can evaluate it to form a concrete time-series. We use evaluate, and here we pass in a time range that covers all our input data.

Note that by evaluating both returns and std_50 in the same call, we do not duplicate work. (See Advanced evaluation for further discussion on this.)

returns_block, std_50_block = evaluate([returns, std_50], DateTime(2000), DateTime(2003))
 
 plot(returns_block; alpha=0.5, label="returns")
-plot!(std_50_block; label="50 day std")
Example block output

Other sources

The example so far has used a source node that simply wraps data that is already held in memory. More interesting cases are nodes that read or generate their data only when evaluated.

Here we use Base.rand to generate a stream of random numbers. It produces a value whenever its argument ticks — in this case, iterdates will tick once a day at midnight.[2]

It is good practice to consider this time to always be in UTC.

x = rand(iterdates())
+plot!(std_50_block; label="50 day std")
Example block output

Other sources

The example so far has used a source node that simply wraps data that is already held in memory. More interesting cases are nodes that read or generate their data only when evaluated.

Here we use Base.rand to generate a stream of random numbers. It produces a value whenever its argument ticks — in this case, iterdates will tick once a day at midnight.[2]

It is good practice to consider this time to always be in UTC.

x = rand(iterdates())
 plot(evaluate(x, DateTime(2001), DateTime(2003)); label="[2001, 2003)")
-plot!(evaluate(x, DateTime(2001), DateTime(2002)); label="[2001, 2002)")
Example block output

There are a couple of interesting things to note here:

  1. We can generate more data by evaluating over a longer range.
  2. So long as we start at the same time, we get exactly the same random numbers.

This second property is a general property of node evaluation — repeated evaluation should always give the same answer.

Finally, we show the correlation for two random numbers over an expanding window. As expected, it converges towards zero as more data is observed:

y = rand(iterdates())
+plot!(evaluate(x, DateTime(2001), DateTime(2002)); label="[2001, 2002)")
Example block output

There are a couple of interesting things to note here:

  1. We can generate more data by evaluating over a longer range.
  2. So long as we start at the same time, we get exactly the same random numbers.

This second property is a general property of node evaluation — repeated evaluation should always give the same answer.

Finally, we show the correlation for two random numbers over an expanding window. As expected, it converges towards zero as more data is observed:

y = rand(iterdates())
 correlation = cor(x, y)
-plot(evaluate(correlation, DateTime(2001), DateTime(2002)); label="correlation")
Example block output

Information on other source nodes included with TimeDag is available in Sources. If you wish to create your own source nodes, e.g. to read data directly from a database table, refer to Creating sources.

  • [1] A Block is compatible with Tables.jl, which means that it can be easily converted to a DataFrame or similar.
  • [2] Note that TimeDag's time axis doesn't include timezone information.
+plot(evaluate(correlation, DateTime(2001), DateTime(2002)); label="correlation")
Example block output

Information on other source nodes included with TimeDag is available in Sources. If you wish to create your own source nodes, e.g. to read data directly from a database table, refer to Creating sources.

  • [1] A Block is compatible with Tables.jl, which means that it can be easily converted to a DataFrame or similar.
  • [2] Note that TimeDag's time axis doesn't include timezone information.
diff --git a/dev/index.html b/dev/index.html
index 779fac37..df40850d 100644
--- a/dev/index.html
+++ b/dev/index.html
@@ -1,2 +1,2 @@
-Home · TimeDag.jl

TimeDag

Welcome to the documentation for TimeDag.jl!

TimeDag enables you to build and run time-series models efficiently.

You might want to use this package if some of the following apply:

  • You are processing data with a natural time ordering.
  • You need to handle data sources that update irregularly.
  • You are building online-updating statistical models.
  • Your input data is too large to fit in memory.
  • Your system has several components that share similar computation.
  • You want to create a real-time system, but also test it over a large historical dataset.

This package was built with Invenia's work in electricity grids in mind. Other domains that could be suitable include sensor, system monitoring, and financial market data.

Getting started

It might be helpful to begin with the Concepts, and then to look at the Examples. After that, the documentation under Reference->Node ops should give an idea of what functionality is available.

Roadmap

This section indicates various core functionality that is either possible, or in progress:

Basic operations

  • [x] Lagging by fixed number of knots
  • [x] Lagging by fixed time interval
  • [x] Alignment of arbitrary numbers of node arguments to a node op

Source node ops

  • [x] In-memory, from an existing Block
  • [x] From a Tea file
  • [ ] From a generic Table, with some schema constraints

Array-values

  • [ ] Nodes should be aware of the size of each value, when it is provably constant.

Statistics

  • [x] Fixed-window sum, mean, std, cov, etc.
  • [ ] Time-windowed sum, mean, standard-deviation, covariance, etc.
  • [ ] Exponentially-weighted mean, std, cov, correlation
  • [ ] Integration with OnlineStats.jl — should be easy to wrap an estimator into a node.

Evaluation & scheduling

  • [x] Single-threaded evaluation of a graph
  • [ ] Optimise value-independent ops by using alignment_base concept.
  • [ ] Graph compilation / transformations
  • [ ] Parallel evaluation of a batch within time-independent nodes
  • [ ] Parallelising scheduler
+Home · TimeDag.jl

TimeDag

Welcome to the documentation for TimeDag.jl!

TimeDag enables you to build and run time-series models efficiently.

You might want to use this package if some of the following apply:

  • You are processing data with a natural time ordering.
  • You need to handle data sources that update irregularly.
  • You are building online-updating statistical models.
  • Your input data is too large to fit in memory.
  • Your system has several components that share similar computation.
  • You want to create a real-time system, but also test it over a large historical dataset.

This package was built with Invenia's work in electricity grids in mind. Other domains that could be suitable include sensor, system monitoring, and financial market data.

Getting started

It might be helpful to begin with the Concepts, and then to look at the Examples. After that, the documentation under Reference->Node ops should give an idea of what functionality is available.

Roadmap

This section indicates various core functionality that is either possible, or in progress:

Basic operations

  • [x] Lagging by fixed number of knots
  • [x] Lagging by fixed time interval
  • [x] Alignment of arbitrary numbers of node arguments to a node op

Source node ops

  • [x] In-memory, from an existing Block
  • [x] From a Tea file
  • [ ] From a generic Table, with some schema constraints

Array-values

  • [ ] Nodes should be aware of the size of each value, when it is provably constant.

Statistics

  • [x] Fixed-window sum, mean, std, cov, etc.
  • [ ] Time-windowed sum, mean, standard-deviation, covariance, etc.
  • [ ] Exponentially-weighted mean, std, cov, correlation
  • [ ] Integration with OnlineStats.jl — should be easy to wrap an estimator into a node.

Evaluation & scheduling

  • [x] Single-threaded evaluation of a graph
  • [ ] Optimise value-independent ops by using alignment_base concept.
  • [ ] Graph compilation / transformations
  • [ ] Parallel evaluation of a batch within time-independent nodes
  • [ ] Parallelising scheduler
diff --git a/dev/reference/align/index.html b/dev/reference/align/index.html index c3eeb723..1e49e1d2 100644 --- a/dev/reference/align/index.html +++ b/dev/reference/align/index.html @@ -1,2 +1,2 @@ -Alignment ops · TimeDag.jl

Alignment ops

TimeDag.leftFunction
left(x, y[, alignment::Alignment; initial_values=nothing]) -> Node

Construct a node that ticks according to alignment with the latest value of x.

It is "left", in the sense of picking the left-hand of the two arguments x and y.

source
TimeDag.rightFunction
right(x, y[, alignment::Alignment; initial_values=nothing]) -> Node

Construct a node that ticks according to alignment with the latest value of y.

It is "right", in the sense of picking the right-hand of the two arguments x and y.

source
TimeDag.alignFunction
align(x, y) -> Node

Form a node that ticks with the values of x whenever y ticks.

source
TimeDag.align_onceFunction
align_once(x, y) -> Node

Similar to align(x, y), except knots from x will be emitted at most once.

This means that the resulting node will tick at a subset of the times that y ticks.

source
TimeDag.coalignFunction
coalign(x, [...; alignment::Alignment]) -> Node...

Given one or more nodes x, or values that are convertible to nodes, align all of them.

We guarantee that all nodes that are returned will have the same alignment. The values of each node will correspond to the values of the input nodes.

The choice of alignment is controlled by alignment, which defaults to UNION.

source
TimeDag.first_knotFunction
first_knot(x::Node{T}) -> Node{T}

Get a node which ticks with only the first knot of x, and then never ticks again.

source
TimeDag.active_countFunction
active_count(nodes...) -> Node{Int64}

Get a node that ticks with the number of the given nodes (at least one) that are active.

source
TimeDag.prependFunction
prepend(x, y) -> Node

Create a node that ticks with knots from x until y is active, and thereafter from y.

Note that the value_type of the returned node will be that of the promoted value types of x and y.

source
TimeDag.throttleFunction
throttle(x::Node, n::Integer) -> Node

Return a node that only ticks every n knots.

The first knot encountered on the input will always be emitted.

Info

The throttled node is stateful and depends on the starting point of the evaluation.

source
TimeDag.lagFunction
lag(x::Node, n::Integer)

Construct a node which takes values from x, but lags them by n knots.

This means that we do not introduce any timestamps that do not appear in x; however, we will not emit knots for the first n values that appear when evaluating x.

Note

If x is a constant node, and n > 0, lag(x, n) will be an empty_node of the same value_type as x.

Conceptually, this is consistent with the view that a constant is represented by a single knot at the start of time.

source
lag(x::Node, w::TimePeriod)

Construct a node which takes values from x, but lags them by period w.

Note

For any constant, lagging by an amount of time is a no-op. This is because the constant is represented as a single value at the start of time (which will later appear at the start of the evaluation window).

source
Base.diffFunction
diff(x::Node[, n=1])

Compute the n-knot difference of x, i.e. x - lag(x, n).

source
TimeDag.count_knotsFunction
count_knots(x) -> Node{Int64}

Return a node that ticks with the number of knots seen in x since evaluation began.

source
Base.mergeFunction
merge(x::Node...) -> Node

Given at least one node x, create a node that emits the union of knots from all x.

If one or more of the inputs contain knots at the same time, then only one will be emitted. The last input in which a knot occurs at a particular time will take precedence.

If the inputs x have different value types, then the resultant value type will be promoted as necessary to accommodate all inputs.

source
+Alignment ops · TimeDag.jl

Alignment ops

TimeDag.leftFunction
left(x, y[, alignment::Alignment; initial_values=nothing]) -> Node

Construct a node that ticks according to alignment with the latest value of x.

It is "left", in the sense of picking the left-hand of the two arguments x and y.

source
TimeDag.rightFunction
right(x, y[, alignment::Alignment; initial_values=nothing]) -> Node

Construct a node that ticks according to alignment with the latest value of y.

It is "right", in the sense of picking the right-hand of the two arguments x and y.

source
TimeDag.alignFunction
align(x, y) -> Node

Form a node that ticks with the values of x whenever y ticks.

source
TimeDag.align_onceFunction
align_once(x, y) -> Node

Similar to align(x, y), except knots from x will be emitted at most once.

This means that the resulting node will tick at a subset of the times that y ticks.

source
TimeDag.coalignFunction
coalign(x, [...; alignment::Alignment]) -> Node...

Given one or more nodes x, or values that are convertible to nodes, align all of them.

We guarantee that all nodes that are returned will have the same alignment. The values of each node will correspond to the values of the input nodes.

The choice of alignment is controlled by alignment, which defaults to UNION.

source
TimeDag.first_knotFunction
first_knot(x::Node{T}) -> Node{T}

Get a node which ticks with only the first knot of x, and then never ticks again.

source
TimeDag.active_countFunction
active_count(nodes...) -> Node{Int64}

Get a node that ticks with the number of the given nodes (at least one) that are active.

source
TimeDag.prependFunction
prepend(x, y) -> Node

Create a node that ticks with knots from x until y is active, and thereafter from y.

Note that the value_type of the returned node will be that of the promoted value types of x and y.

source
TimeDag.throttleFunction
throttle(x::Node, n::Integer) -> Node

Return a node that only ticks every n knots.

The first knot encountered on the input will always be emitted.

Info

The throttled node is stateful and depends on the starting point of the evaluation.

source
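
The dependence on the starting point can be illustrated with a standalone sketch on a plain vector. This does not use TimeDag itself; throttle_knots is a hypothetical helper, and it assumes that "every n knots" means the first knot seen and then every n-th knot after it.

```julia
# Hypothetical helper (not part of TimeDag): keep the first knot seen,
# then every n-th knot after it. Requires n >= 1.
throttle_knots(knots::AbstractVector, n::Integer) = knots[1:n:end]

# Knots are labelled 1..7 here; in TimeDag they would be (time, value) pairs.
knots = collect(1:7)

# Starting evaluation at the first knot keeps a different subset than
# starting one knot later, which is why the node is described as stateful.
from_start = throttle_knots(knots, 3)
from_second = throttle_knots(knots[2:end], 3)
```

The two results select different knots from the same underlying series, because the count restarts wherever evaluation begins.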
TimeDag.lagFunction
lag(x::Node, n::Integer)

Construct a node which takes values from x, but lags them by n knots.

This means that we do not introduce any timestamps that do not appear in x; however, we will not emit knots for the first n values that appear when evaluating x.

Note

If x is a constant node, and n > 0, lag(x, n) will be an empty_node of the same value_type as x.

Conceptually, this is consistent with the view that a constant is represented by a single knot at the start of time.

source
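
As a rough illustration of lagging by knots, the following standalone sketch works on plain vectors rather than TimeDag nodes. The helpers lag_knots and diff_knots are hypothetical names introduced here; they only model the behaviour described above and are not how the library implements it.

```julia
using Dates

# Hypothetical helper: pair each timestamp with the value seen n knots earlier.
# No new timestamps are introduced, and the first n timestamps are dropped.
function lag_knots(times::AbstractVector, values::AbstractVector, n::Integer)
    n >= 0 || throw(ArgumentError("n must be non-negative"))
    return times[(n + 1):end], values[1:(end - n)]
end

# Hypothetical helper: the n-knot difference, modelled as x - lag(x, n).
function diff_knots(times::AbstractVector, values::AbstractVector, n::Integer=1)
    lagged_times, lagged_values = lag_knots(times, values, n)
    return lagged_times, values[(n + 1):end] .- lagged_values
end

times = collect(DateTime(2020, 1, 1):Day(1):DateTime(2020, 1, 5))
values = [1.0, 2.0, 4.0, 8.0, 16.0]

lag_times, lag_values = lag_knots(times, values, 2)
diff_times, diff_values = diff_knots(times, values)
```

With five daily knots and n = 2, the lagged series has three knots, on the last three days, carrying the first three values.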
lag(x::Node, w::TimePeriod)

Construct a node which takes values from x, but lags them by period w.

Note

For any constant, lagging by an amount of time is a no-op. This is because the constant is represented as a single value at the start of time (which will later appear at the start of the evaluation window).

source
Base.diffFunction
diff(x::Node[, n=1])

Compute the n-knot difference of x, i.e. x - lag(x, n).

source
TimeDag.count_knotsFunction
count_knots(x) -> Node{Int64}

Return a node that ticks with the number of knots seen in x since evaluation began.

source
Base.mergeFunction
merge(x::Node...) -> Node

Given at least one node x, create a node that emits the union of knots from all x.

If one or more of the inputs contain knots at the same time, then only one will be emitted. The last input in which a knot occurs at a particular time will take precedence.

If the inputs x have different value types, then the resultant value type will be promoted as necessary to accommodate all inputs.

source
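
The precedence rule for merge can be sketched on plain vectors of time => value pairs. This is a standalone model, not TimeDag code: merge_knots is a hypothetical helper, and it fixes the value type to Float64 rather than modelling type promotion.

```julia
using Dates

# Hypothetical helper: union of knots from all inputs. When several inputs
# have a knot at the same time, the last such input wins.
function merge_knots(inputs::Vector{Pair{DateTime,Float64}}...)
    latest = Dict{DateTime,Float64}()
    for knots in inputs
        for (t, v) in knots
            latest[t] = v  # later inputs overwrite earlier ones
        end
    end
    return [t => latest[t] for t in sort!(collect(keys(latest)))]
end

a = [DateTime(2020, 1, 1) => 1.0, DateTime(2020, 1, 2) => 2.0]
b = [DateTime(2020, 1, 2) => 20.0, DateTime(2020, 1, 3) => 30.0]

merged = merge_knots(a, b)
```

Both inputs have a knot on 2 January, so the merged series takes that value from b, the later argument.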
diff --git a/dev/reference/arithmetic/index.html b/dev/reference/arithmetic/index.html index 90b2efb2..2b72dbb0 100644 --- a/dev/reference/arithmetic/index.html +++ b/dev/reference/arithmetic/index.html @@ -1,2 +1,2 @@ -Arithmetic · TimeDag.jl

Arithmetic

Base.absFunction
abs(x::Node)

Obtain a node with values constructed by applying abs to each input value.

source
Base.expFunction
exp(x::Node)

Obtain a node with values constructed by applying exp to each input value.

source
Base.logFunction
log(x::Node)

Obtain a node with values constructed by applying log to each input value.

source
Base.log10Function
log10(x::Node)

Obtain a node with values constructed by applying log10 to each input value.

source
Base.log2Function
log2(x::Node)

Obtain a node with values constructed by applying log2 to each input value.

source
Base.sqrtFunction
sqrt(x::Node)

Obtain a node with values constructed by applying sqrt to each input value.

source
Base.Math.cbrtFunction
cbrt(x::Node)

Obtain a node with values constructed by applying cbrt to each input value.

source
Base.signFunction
sign(x::Node)

Obtain a node with values constructed by applying sign to each input value.

source
Base.tanFunction
tan(x::Node)

Obtain a node with values constructed by applying tan to each input value.

source
Base.sinFunction
sin(x::Node)

Obtain a node with values constructed by applying sin to each input value.

source
Base.cosFunction
cos(x::Node)

Obtain a node with values constructed by applying cos to each input value.

source
Base.atanFunction
atan(x::Node)

Obtain a node with values constructed by applying atan to each input value.

source
Base.asinFunction
asin(x::Node)

Obtain a node with values constructed by applying asin to each input value.

source
Base.acosFunction
acos(x::Node)

Obtain a node with values constructed by applying acos to each input value.

source
Base.tanhFunction
tanh(x::Node)

Obtain a node with values constructed by applying tanh to each input value.

source
Base.sinhFunction
sinh(x::Node)

Obtain a node with values constructed by applying sinh to each input value.

source
Base.coshFunction
cosh(x::Node)

Obtain a node with values constructed by applying cosh to each input value.

source
Base.atanhFunction
atanh(x::Node)

Obtain a node with values constructed by applying atanh to each input value.

source
Base.asinhFunction
asinh(x::Node)

Obtain a node with values constructed by applying asinh to each input value.

source
Base.acoshFunction
acosh(x::Node)

Obtain a node with values constructed by applying acosh to each input value.

source
Base.invFunction
inv(x::Node)

Obtain a node with values constructed by applying inv to each input value.

source
Base.:+Function
+(x, y[, alignment=DEFAULT_ALIGNMENT; kwargs...])

Obtain a node with values constructed by applying + to the input values.

An alignment can optionally be specified. x and y should be nodes, or constants that can be converted to nodes.

Other keyword arguments are passed to apply.

source
Base.:-Function
-(x::Node)

Obtain a node with values constructed by applying - to each input value.

source
-(x, y[, alignment=DEFAULT_ALIGNMENT; kwargs...])

Obtain a node with values constructed by applying - to the input values.

An alignment can optionally be specified. x and y should be nodes, or constants that can be converted to nodes.

Other keyword arguments are passed to apply.

source
Base.:*Function
*(x, y[, alignment=DEFAULT_ALIGNMENT; kwargs...])

Obtain a node with values constructed by applying * to the input values.

An alignment can optionally be specified. x and y should be nodes, or constants that can be converted to nodes.

Other keyword arguments are passed to apply.

source
Base.:/Function
/(x, y[, alignment=DEFAULT_ALIGNMENT; kwargs...])

Obtain a node with values constructed by applying / to the input values.

An alignment can optionally be specified. x and y should be nodes, or constants that can be converted to nodes.

Other keyword arguments are passed to apply.

source
Base.:^Function
^(x, y[, alignment=DEFAULT_ALIGNMENT; kwargs...])

Obtain a node with values constructed by applying ^ to the input values.

An alignment can optionally be specified. x and y should be nodes, or constants that can be converted to nodes.

Other keyword arguments are passed to apply.

source
Base.minFunction
min(x, y[, alignment=DEFAULT_ALIGNMENT; kwargs...])

Obtain a node with values constructed by applying min to the input values.

An alignment can optionally be specified. x and y should be nodes, or constants that can be converted to nodes.

Other keyword arguments are passed to apply.

source
Base.maxFunction
max(x, y[, alignment=DEFAULT_ALIGNMENT; kwargs...])

Obtain a node with values constructed by applying max to the input values.

An alignment can optionally be specified. x and y should be nodes, or constants that can be converted to nodes.

Other keyword arguments are passed to apply.

source
Base.:!Function
!(x::Node)

Obtain a node with values constructed by applying ! to each input value.

source
Base.:>Function
>(x, y[, alignment=DEFAULT_ALIGNMENT; kwargs...])

Obtain a node with values constructed by applying > to the input values.

An alignment can optionally be specified. x and y should be nodes, or constants that can be converted to nodes.

Other keyword arguments are passed to apply.

source
Base.:<Function
<(x, y[, alignment=DEFAULT_ALIGNMENT; kwargs...])

Obtain a node with values constructed by applying < to the input values.

An alignment can optionally be specified. x and y should be nodes, or constants that can be converted to nodes.

Other keyword arguments are passed to apply.

source
Base.:>=Function
>=(x, y[, alignment=DEFAULT_ALIGNMENT; kwargs...])

Obtain a node with values constructed by applying >= to the input values.

An alignment can optionally be specified. x and y should be nodes, or constants that can be converted to nodes.

Other keyword arguments are passed to apply.

source
Base.:<=Function
<=(x, y[, alignment=DEFAULT_ALIGNMENT; kwargs...])

Obtain a node with values constructed by applying <= to the input values.

An alignment can optionally be specified. x and y should be nodes, or constants that can be converted to nodes.

Other keyword arguments are passed to apply.

source
+Arithmetic · TimeDag.jl

Arithmetic

Base.absFunction
abs(x::Node)

Obtain a node with values constructed by applying abs to each input value.

source
Base.expFunction
exp(x::Node)

Obtain a node with values constructed by applying exp to each input value.

source
Base.logFunction
log(x::Node)

Obtain a node with values constructed by applying log to each input value.

source
Base.log10Function
log10(x::Node)

Obtain a node with values constructed by applying log10 to each input value.

source
Base.log2Function
log2(x::Node)

Obtain a node with values constructed by applying log2 to each input value.

source
Base.sqrtFunction
sqrt(x::Node)

Obtain a node with values constructed by applying sqrt to each input value.

source
Base.Math.cbrtFunction
cbrt(x::Node)

Obtain a node with values constructed by applying cbrt to each input value.

source
Base.signFunction
sign(x::Node)

Obtain a node with values constructed by applying sign to each input value.

source
Base.tanFunction
tan(x::Node)

Obtain a node with values constructed by applying tan to each input value.

source
Base.sinFunction
sin(x::Node)

Obtain a node with values constructed by applying sin to each input value.

source
Base.cosFunction
cos(x::Node)

Obtain a node with values constructed by applying cos to each input value.

source
Base.atanFunction
atan(x::Node)

Obtain a node with values constructed by applying atan to each input value.

source
Base.asinFunction
asin(x::Node)

Obtain a node with values constructed by applying asin to each input value.

source
Base.acosFunction
acos(x::Node)

Obtain a node with values constructed by applying acos to each input value.

source
Base.tanhFunction
tanh(x::Node)

Obtain a node with values constructed by applying tanh to each input value.

source
Base.sinhFunction
sinh(x::Node)

Obtain a node with values constructed by applying sinh to each input value.

source
Base.coshFunction
cosh(x::Node)

Obtain a node with values constructed by applying cosh to each input value.

source
Base.atanhFunction
atanh(x::Node)

Obtain a node with values constructed by applying atanh to each input value.

source
Base.asinhFunction
asinh(x::Node)

Obtain a node with values constructed by applying asinh to each input value.

source
Base.acoshFunction
acosh(x::Node)

Obtain a node with values constructed by applying acosh to each input value.

source
Base.invFunction
inv(x::Node)

Obtain a node with values constructed by applying inv to each input value.

source
Base.:+Function
+(x, y[, alignment=DEFAULT_ALIGNMENT; kwargs...])

Obtain a node with values constructed by applying + to the input values.

An alignment can optionally be specified. x and y should be nodes, or constants that can be converted to nodes.

Other keyword arguments are passed to apply.

source
Base.:-Function
-(x::Node)

Obtain a node with values constructed by applying - to each input value.

source
-(x, y[, alignment=DEFAULT_ALIGNMENT; kwargs...])

Obtain a node with values constructed by applying - to the input values.

An alignment can optionally be specified. x and y should be nodes, or constants that can be converted to nodes.

Other keyword arguments are passed to apply.

source
Base.:*Function
*(x, y[, alignment=DEFAULT_ALIGNMENT; kwargs...])

Obtain a node with values constructed by applying * to the input values.

An alignment can optionally be specified. x and y should be nodes, or constants that can be converted to nodes.

Other keyword arguments are passed to apply.

source
Base.:/Function
/(x, y[, alignment=DEFAULT_ALIGNMENT; kwargs...])

Obtain a node with values constructed by applying / to the input values.

An alignment can optionally be specified. x and y should be nodes, or constants that can be converted to nodes.

Other keyword arguments are passed to apply.

source
Base.:^Function
^(x, y[, alignment=DEFAULT_ALIGNMENT; kwargs...])

Obtain a node with values constructed by applying ^ to the input values.

An alignment can optionally be specified. x and y should be nodes, or constants that can be converted to nodes.

Other keyword arguments are passed to apply.

source
Base.minFunction
min(x, y[, alignment=DEFAULT_ALIGNMENT; kwargs...])

Obtain a node with values constructed by applying min to the input values.

An alignment can optionally be specified. x and y should be nodes, or constants that can be converted to nodes.

Other keyword arguments are passed to apply.

source
Base.maxFunction
max(x, y[, alignment=DEFAULT_ALIGNMENT; kwargs...])

Obtain a node with values constructed by applying max to the input values.

An alignment can optionally be specified. x and y should be nodes, or constants that can be converted to nodes.

Other keyword arguments are passed to apply.

source
Base.:!Function
!(x::Node)

Obtain a node with values constructed by applying ! to each input value.

source
Base.:>Function
>(x, y[, alignment=DEFAULT_ALIGNMENT; kwargs...])

Obtain a node with values constructed by applying > to the input values.

An alignment can optionally be specified. x and y should be nodes, or constants that can be converted to nodes.

Other keyword arguments are passed to apply.

source
Base.:<Function
<(x, y[, alignment=DEFAULT_ALIGNMENT; kwargs...])

Obtain a node with values constructed by applying < to the input values.

An alignment can optionally be specified. x and y should be nodes, or constants that can be converted to nodes.

Other keyword arguments are passed to apply.

source
Base.:>=Function
>=(x, y[, alignment=DEFAULT_ALIGNMENT; kwargs...])

Obtain a node with values constructed by applying >= to the input values.

An alignment can optionally be specified. x and y should be nodes, or constants that can be converted to nodes.

Other keyword arguments are passed to apply.

source
Base.:<=Function
<=(x, y[, alignment=DEFAULT_ALIGNMENT; kwargs...])

Obtain a node with values constructed by applying <= to the input values.

An alignment can optionally be specified. x and y should be nodes, or constants that can be converted to nodes.

Other keyword arguments are passed to apply.

source
diff --git a/dev/reference/creating_ops/index.html b/dev/reference/creating_ops/index.html index 9244fe79..8252ca50 100644 --- a/dev/reference/creating_ops/index.html +++ b/dev/reference/creating_ops/index.html @@ -28,4 +28,4 @@ times = t1:Day(1):t2 values = ones(length(times)) return Block(times, values) -end +end diff --git a/dev/reference/fundamentals/index.html b/dev/reference/fundamentals/index.html index 859b68ab..7b7269d6 100644 --- a/dev/reference/fundamentals/index.html +++ b/dev/reference/fundamentals/index.html @@ -2,4 +2,4 @@ Fundamentals · TimeDag.jl

Fundamentals

Data

Time-series data is stored internally in a Block. More information on what we mean by a time-series is explained in Time-series.

TimeDag.BlockType
Block{T}()
 Block(times::AbstractVector{DateTime}, values::AbstractVector{T})
 Block(unchecked, times, values)

Represent some time-series data.

Conceptually this is a list of (time, value) pairs, or "knots". Times must be strictly increasing — i.e. no repeated timestamps are allowed.

The constructor Block(times, values) will verify that the input data satisfies this constraint; however, Block(unchecked, times, values) will skip the checks. This is primarily intended for internal use, where the caller assumes responsibility for the validity of times and values.

Danger

TimeDag considers instances of Block to be completely immutable. Thus, when working with functions that accept blocks (e.g. TimeDag.run_node!), you must not modify times or values members.

source
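
The strictly-increasing constraint on times can be expressed as a one-line check. The sketch below is standalone and does not reproduce the validation that Block(times, values) performs internally; strictly_increasing is a hypothetical helper.

```julia
using Dates

# Hypothetical helper: true iff each timestamp is strictly later than the
# one before it, so repeated timestamps are rejected.
strictly_increasing(times) = all(times[i] > times[i - 1] for i in 2:length(times))

good = [DateTime(2020, 1, 1), DateTime(2020, 1, 2), DateTime(2020, 1, 3)]
repeated = [DateTime(2020, 1, 1), DateTime(2020, 1, 1)]
```

Here good satisfies the constraint, while repeated does not because its two timestamps are equal. An empty vector passes trivially.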

Computational graph

The computational graph is formed of TimeDag.Node objects. A node is an abstract representation of a time-series, i.e. a sequence of (time, value) pairs. A node knows the type of its values, which can be queried with TimeDag.value_type.

Note that nodes should never be constructed directly by the user. Typically one will call a function like block_node or lag, which will construct a node.

TimeDag includes functions to construct many useful nodes, but often you will need to create a custom node. See Creating nodes for instructions on how to do this.

Info

All nodes should eventually be constructed with TimeDag.obtain_node. This uses the global Identity map to ensure that we do not duplicate nodes.

TimeDag.NodeType
Node(parents, op)

A node in the computational graph that combines zero or more parents with op to produce a timeseries.

Warning

Note that a Node is only declared mutable so that we can attach finalizers to instances. This is required for the WeakIdentityMap to work. Nodes should NEVER actually be mutated!

Due to subgraph elimination, nodes that are equivalent should always be identical objects. We therefore leave hash & == defined in terms of the objectid.

source

Every node contains parents, and a TimeDag.NodeOp.

TimeDag.NodeOpType
abstract type NodeOp{T} end

Represent a time-series operation whose output will be a time-series with value type T.

source
TimeDag.obtain_nodeFunction
obtain_node(parents::NTuple{N,Node}, op::NodeOp) -> Node

Get a node for the given op and parents. If an equivalent node already exists in the global identity map, use that one, otherwise create a new node, add to the identity map, and return it.

Constant propagation

If all parents are constant nodes, and op has a well-defined operation on constant inputs, we will immediately perform the computation and return a constant node wrapping the computed value.

source
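
The idea of reusing equivalent nodes can be sketched with an ordinary Dict. Everything named here (SketchNode, SKETCH_REGISTRY, sketch_obtain_node) is hypothetical, and the sketch keeps strong references, whereas the documentation above refers to a WeakIdentityMap. Constant propagation is not modelled.

```julia
# Hypothetical stand-in for a node. It is mutable, so the default hash and ==
# are based on object identity.
mutable struct SketchNode
    parents::Tuple
    op::Symbol
end

# Hypothetical stand-in for the global identity map.
const SKETCH_REGISTRY = Dict{Any,SketchNode}()

# Return the registered node for (parents, op) if one exists; otherwise
# create it, register it, and return it.
function sketch_obtain_node(parents::Tuple, op::Symbol)
    return get!(() -> SketchNode(parents, op), SKETCH_REGISTRY, (parents, op))
end

a = sketch_obtain_node((), :source)
b = sketch_obtain_node((), :source)
c = sketch_obtain_node((a,), :lag)
d = sketch_obtain_node((b,), :lag)
```

Because a and b are the same object, the two lag requests share a key as well, so only two nodes are ever created.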

Given a node, a rough-and-ready way to visualise the graph on the command line is with AbstractTrees.print_tree. This will not directly indicate repeated nodes, but for small graphs the output can be useful.

Evaluation

In order to get a concrete time-series (as a Block) for a node, it must be evaluated with evaluate. Evaluation additionally requires a time range, and involves pulling data corresponding to this interval through the graph of ancestors of the given node(s).

Tip

When evaluating a graph in a production system, it may be desirable to have more control over evaluation. If this sounds like you, please read the Advanced evaluation section!

TimeDag.evaluateFunction
evaluate(nodes::AbstractVector{Node}, t0, t1[; batch_interval]) -> Vector{Block}
-evaluate(node::Node, t0, t1[; batch_interval]) -> Block

Evaluate the specified node(s) over the specified time range [t0, t1), and return the corresponding Block(s).

If nodes have common dependencies, work will not be repeated when performing this evaluation.

source
+evaluate(node::Node, t0, t1[; batch_interval]) -> Block

Evaluate the specified node(s) over the specified time range [t0, t1), and return the corresponding Block(s).

If nodes have common dependencies, work will not be repeated when performing this evaluation.

source diff --git a/dev/reference/internals/index.html b/dev/reference/internals/index.html index 1a71a9bb..aa4cac39 100644 --- a/dev/reference/internals/index.html +++ b/dev/reference/internals/index.html @@ -19,11 +19,11 @@ state = TimeDag.evaluate_until!(state, DateTime(2020)) # Simulate an incremental update over a few hours. -@time state = TimeDag.evaluate_until!(state, DateTime(2020, 1, 1, 3))
  0.005225 seconds (75.24 k allocations: 2.386 MiB)

Note that this approach is unlikely to be suitable for lower latency applications (e.g. microseconds). For that case, one may benefit from a "push mode" evaluation, where new data are pushed onto the graph, and only affected nodes are re-evaluated. Such a feature isn't currently planned.

Scheduling

TimeDag currently runs all nodes in a single thread, however this is subject to change in the future.

Alignment implementation

If we want to define a new op that follows alignment semantics, it should derive from one of the following types.

TimeDag.BinaryNodeOpType
BinaryNodeOp{T,A<:Alignment} <: NodeOp{T}

An abstract type representing a node op with two parents, and using alignment A.

source

Instead of implementing TimeDag.run_node! directly, one instead implements some of the following functions. The exact alignment logic is then encapsulated, and doesn't need to be dealt with directly.

TimeDag.operator!Function
operator!(op::UnaryNodeOp{T}, (state,), (time,) x) -> T / Maybe{T}
+@time state = TimeDag.evaluate_until!(state, DateTime(2020, 1, 1, 3))
  0.004472 seconds (75.24 k allocations: 2.386 MiB)

Note that this approach is unlikely to be suitable for lower latency applications (e.g. microseconds). For that case, one may benefit from a "push mode" evaluation, where new data are pushed onto the graph, and only affected nodes are re-evaluated. Such a feature isn't currently planned.

Scheduling

TimeDag currently runs all nodes in a single thread, however this is subject to change in the future.

Alignment implementation

If we want to define a new op that follows alignment semantics, it should derive from one of the following types.

TimeDag.BinaryNodeOpType
BinaryNodeOp{T,A<:Alignment} <: NodeOp{T}

An abstract type representing a node op with two parents, and using alignment A.

source

Instead of implementing TimeDag.run_node! directly, one instead implements some of the following functions. The exact alignment logic is then encapsulated, and doesn't need to be dealt with directly.

TimeDag.operator!Function
operator!(op::UnaryNodeOp{T}, (state,), (time,) x) -> T / Maybe{T}
 operator!(op::BinaryNodeOp{T}, (state,), (time,) x, y) -> T / Maybe{T}
 operator!(op::NaryNodeOp{N,T}, (state,), (time,) x, y, z...) -> T / Maybe{T}

Perform the operation for this node.

When defining a method of this for a new op, follow these rules:

For stateful operations, this operator should mutate state as required.

The return value out should be of type T iff TimeDag.always_ticks is true, otherwise it should be of type TimeDag.Maybe{T}.

If out <: Maybe{T}, and has !valid(out), this indicates that we do not wish to emit a knot at this time, and it will be skipped. Otherwise, value(out) will be used as the output value.

source
TimeDag.always_ticksFunction
always_ticks(node) -> Bool
 always_ticks(op) -> Bool

Returns true iff the return value from operator! can be assumed to always be valid.

If true, operator!(::Node{T}, ...) should return a T. If false, operator!(::Node{T}, ...) should return a Maybe{T}.

Note that, for sensible performance characteristics, this should be knowable from typeof(op).

source
TimeDag.stateless_operatorFunction
stateless_operator(node) -> Bool
 stateless_operator(op) -> Bool

Returns true iff operator!(op, ...) would never look at or modify the evaluation state.

If this returns true, create_operator_evaluation_state will not be used.

Note that if an op has stateless(op) returning true, then it necessarily should also return true here. The default implementation is to return stateless(op), meaning that if one is creating a node that is fully stateless, one need only define stateless.

For optimal performance, this should be knowable from the type of op alone.

source
TimeDag.time_agnosticFunction
time_agnostic(node) -> Bool
 time_agnostic(op) -> Bool

Returns true iff op does not care about the time of the input knot(s).

For optimal performance, this should be knowable from the type of op alone.

source
TimeDag.value_agnosticFunction
value_agnostic(node) -> Bool
 value_agnostic(op) -> Bool

Returns true iff op does not care about the value(s) of the input knot(s).

For optimal performance, this should be knowable from the type of op alone.

source
TimeDag.initial_leftFunction
initial_left(op::BinaryNodeOp) -> L

Specify the initial value to use for the first parent of the given op.

Needs to be defined if has_initial_values returns true, and alignment is UNION. For other alignments it is not required.

source
TimeDag.initial_rightFunction
initial_right(op::BinaryNodeOp) -> R

Specify the initial value to use for the second parent of the given op.

Needs to be defined if has_initial_values(op) returns true, and alignment is UNION or LEFT. For INTERSECT alignment it is not required.

source
TimeDag.create_operator_evaluation_stateFunction
create_operator_evaluation_state(parents, op::NodeOp) -> NodeEvaluationState

Create an empty evaluation state for the given node, when starting evaluation at the specified time.

Note that this is state that will be passed to operator. The overall node may additionally wrap this state with further state, if this is necessary for e.g. alignment.

source

For simple cases, the following node ops can be useful.

Tip

Rather than using the structures below directly, you probably want to use TimeDag.apply, wrap, or wrapb.

TimeDag.SimpleUnaryType
SimpleUnary{f,TimeAgnostic,T}

Represents a stateless unary operator that will always emit a value.

The value of the TimeAgnostic type parameter is coupled to time_agnostic.

source
TimeDag.SimpleBinaryType
SimpleBinary{f,TimeAgnostic,T,A}

Represents a stateless binary operator that will always emit a value.

The value of the TimeAgnostic type parameter is coupled to time_agnostic.

source
TimeDag.SimpleNaryType
SimpleNary{f,TimeAgnostic,N,T,A}

Represents a stateless Nary operator that will always emit a value.

The value of the TimeAgnostic type parameter is coupled to time_agnostic.

source

Maybe

TimeDag.MaybeType
Maybe{T}()
-Maybe(value::T)

A structure which can hold a value of type T, or represent the absence of a value.

The API is optimised for speed over memory usage, by allowing a function that may otherwise return Union{T, Nothing} to instead always return Maybe{T}, and hence be type-stable.

source
TimeDag.valueFunction
value(x::Maybe{T}) -> T

Returns the value stored in x, or throws an ArgumentError if !valid(x).

Note that, in a tight loop, it is preferable to use a combination of calls to valid and unsafe_value, as it will generate more optimal code.

source
TimeDag.unsafe_valueFunction
unsafe_value(x::Maybe{T}) -> T

Returns the value stored in x.

It is "unsafe" when !valid(x), in that the return value of this function is undefined. If T is a reference type, calling this function will result in an UndefRefError being thrown.

source

Other

TimeDag.output_typeFunction
output_type(f, arg_types...)

Return the output type of the specified function. Tries to be fast where possible.

Warning

This uses Base.promote_op, which is noted to be fragile. The problem is that whilst one might hope that typeof(f(map(oneunit, arg_types)...)) could be used, in practice there are a lot of types which do not define oneunit.

Ultimately this reflects TimeDag's need to know the output type of a node before the concrete input values are known.

source
TimeDag.duplicateFunction
duplicate(x)

Return an object that is equal to x, but fully independent of it.

Note that for any parts of x that TimeDag considers to be immutable (e.g. Blocks), this can return the identical object.

Conceptually this is otherwise very similar to deepcopy(x).

source
+Maybe(value::T)

A structure which can hold a value of type T, or represent the absence of a value.

The API is optimised for speed over memory usage, by allowing a function that may otherwise return Union{T, Nothing} to instead always return Maybe{T}, and hence be type-stable.

source
TimeDag.valueFunction
value(x::Maybe{T}) -> T

Returns the value stored in x, or throws an ArgumentError if !valid(x).

Note that, in a tight loop, it is preferable to combine calls to valid and unsafe_value, as this generates more efficient code.

source
TimeDag.unsafe_valueFunction
unsafe_value(x::Maybe{T}) -> T

Returns the value stored in x.

It is "unsafe" when !valid(x), in that the return value of this function is undefined. If T is a reference type, calling this function will result in an UndefRefError being thrown.
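The trade-off described for Maybe can be illustrated with a small stand-alone sketch. The type SketchMaybe and the helpers is_valid, unsafe_get, get_value, first_positive and sum_first_positives are hypothetical names invented for this illustration. They mirror the documented behaviour of Maybe, valid, unsafe_value and value, but they are not TimeDag's actual definitions.

```julia
# Hypothetical stand-in for illustration only; NOT TimeDag's definition of Maybe.
struct SketchMaybe{T}
    valid::Bool
    value::T
    # Empty case: `value` is left uninitialised, trading memory for type stability.
    SketchMaybe{T}() where {T} = new{T}(false)
    SketchMaybe{T}(v) where {T} = new{T}(true, v)
end
SketchMaybe(v::T) where {T} = SketchMaybe{T}(v)

is_valid(x::SketchMaybe) = x.valid

# No validity check: for reference types an unset field throws UndefRefError.
unsafe_get(x::SketchMaybe) = x.value

# Checked access: throws ArgumentError when no value is present.
function get_value(x::SketchMaybe)
    is_valid(x) || throw(ArgumentError("SketchMaybe holds no value"))
    return unsafe_get(x)
end

# Always returns SketchMaybe{Int}, never Union{Int, Nothing}.
function first_positive(xs::Vector{Int})
    for x in xs
        x > 0 && return SketchMaybe{Int}(x)
    end
    return SketchMaybe{Int}()
end

# Tight-loop pattern: check validity once, then read without a second check.
function sum_first_positives(groups::Vector{Vector{Int}})
    total = 0
    for g in groups
        m = first_positive(g)
        if is_valid(m)
            total += unsafe_get(m)
        end
    end
    return total
end
```

In this sketch the empty case still occupies the space of a T, which is the memory cost the description refers to. In exchange, every return path of first_positive has the same concrete type.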

source

Other

TimeDag.output_typeFunction
output_type(f, arg_types...)

Return the output type of the specified function. Tries to be fast where possible.

Warning

This uses Base.promote_op, which is noted to be fragile. The problem is that whilst one might hope that typeof(f(map(oneunit, arg_types)...)) could be used, in practice there are a lot of types which do not define oneunit.

Ultimately this reflects TimeDag's need to know the output type of a node before the concrete input values are known.
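The following sketch contrasts the two approaches mentioned in the warning, using only functions from Julia's Base. The Price type is a hypothetical example invented here, and nothing below calls TimeDag itself.

```julia
# Hypothetical type used only for this illustration.
struct Price
    amount::Float64
end

# `oneunit` is not defined for arbitrary user types, so probing the function
# with a sample value built from `oneunit` fails for `Price`.
probe_fails = try
    oneunit(Price)
    false
catch err
    err isa MethodError
end

# Inference-based alternative: ask the compiler what the call would return,
# without constructing any values.
add_type = Base.promote_op(+, Int, Float64)
amount_type = Base.promote_op(p -> p.amount, Price)
```

Here add_type is Float64. Because Base.promote_op relies on type inference, its answer for less simple functions can be wider than the type actually returned at runtime, which is the fragility the warning refers to.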

source
TimeDag.duplicateFunction
duplicate(x)

Return an object that is equal to x, but fully independent of it.

Note that for any parts of x that TimeDag considers to be immutable (e.g. Blocks), this can return the identical object.

Conceptually this is otherwise very similar to deepcopy(x).

source
diff --git a/dev/reference/misc_ops/index.html b/dev/reference/misc_ops/index.html index d2fab8ac..159768cd 100644 --- a/dev/reference/misc_ops/index.html +++ b/dev/reference/misc_ops/index.html @@ -5,4 +5,4 @@ String

Note that the same thing would happen if calling convert(Any, "hello").

However, if we set upcast=true:

julia> x = convert_value(Any, constant("hello"); upcast=true);
 
 julia> value_type(x)
-Any
source +Anysource diff --git a/dev/reference/online_windowed/index.html b/dev/reference/online_windowed/index.html index 30fdd8c7..c746b432 100644 --- a/dev/reference/online_windowed/index.html +++ b/dev/reference/online_windowed/index.html @@ -10,4 +10,4 @@ ema(x::Node, w_eff::Integer) -> Node

Create a node which computes the exponential moving average of x.

The decay is specified either by α, which should satisfy 0 < α < 1, or by w_eff, which should be an integer greater than 1. If the latter is specified, then we compute α = 2 / (w_eff + 1).

For internal state $s_t$, with $s_0 = 0$, and resulting EMA series $m_t$, this has the form:

\[\begin{aligned} s_t &= (1 - \alpha) s_{t-1} + x_t \\ m_t &= \frac{\alpha s_t}{1 - (1 - \alpha)^t}. -\end{aligned}\]

For further information, see the notational conventions and discussion on Wikipedia. Note that this function implements the variant including the correction for the initial convergence problem.

source +\end{aligned}\]

For further information, see the notational conventions and discussion on Wikipedia. Note that this function implements the variant including the correction for the initial convergence problem.
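As a rough illustration of the corrected EMA, the sketch below works on a plain vector rather than a node. It assumes the decayed-sum recurrence s_t = (1 - α) s_{t-1} + x_t with s_0 = 0, under which the denominator 1 - (1 - α)^t is exactly α times the sum of the weights. The names ema_sketch and alpha_from_window are hypothetical, and this is not TimeDag's implementation.

```julia
# Plain-vector sketch of an EMA with initial-convergence correction.
# Hypothetical helper; NOT TimeDag's implementation.
function ema_sketch(xs::AbstractVector{<:Real}, α::Real)
    0 < α < 1 || throw(ArgumentError("α must satisfy 0 < α < 1"))
    out = Vector{Float64}(undef, length(xs))
    s = 0.0       # decayed running sum, s_0 = 0 (assumed recurrence)
    decay = 1.0   # tracks (1 - α)^t
    for (t, x) in enumerate(xs)
        s = (1 - α) * s + x
        decay *= (1 - α)
        out[t] = α * s / (1 - decay)   # divide by the total weight seen so far
    end
    return out
end

# Conversion from an effective window to α, as stated in the description.
alpha_from_window(w_eff::Integer) = 2 / (w_eff + 1)
```

With this correction the first output equals the first input, and a constant input series yields a constant output from the first step onwards.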

source diff --git a/dev/reference/sources/index.html b/dev/reference/sources/index.html index 0e773474..971901d3 100644 --- a/dev/reference/sources/index.html +++ b/dev/reference/sources/index.html @@ -1,3 +1,3 @@ Sources · TimeDag.jl

Sources

The following functions construct nodes with no parents.

TimeDag.block_nodeFunction
block_node(block::Block)

Construct a node whose values are read directly from the given block.

source
TimeDag.constantFunction
constant(value) -> Node
-constant(T, value) -> Node{T}

Explicitly wrap value into a TimeDag constant node, regardless of its type.

If T is provided, this allows creation of a node with a value_type that is a supertype of the type of the value — otherwise the constant node will always just use the concrete type of value.

In many cases this isn't required, since many TimeDag functions which expect nodes will automatically wrap non-node arguments into a constant node.

Warning

If value is already a node, this will wrap it up in an additional node. This is very likely not what you want to do.

source
TimeDag.empty_nodeFunction
empty_node(T)

Construct a node with value type T which, if evaluated, will never tick.

source
TimeDag.iterdatesFunction
iterdates(time_of_day::Time=Time(0), tz::TimeZone=tz"UTC", occurrence=1)

Create a node which ticks exactly once a day at time_of_day in timezone tz.

This defaults to midnight in UTC. If tz is set otherwise, then each knot will appear at time_of_day in that timezone.

Note that:

  • All knot times in TimeDag are considered to be in UTC.
  • It is possible to select a time_of_day that does not exist for every day. This will lead to an exception being raised during evaluation.

In a given knot, each value will be of type DateTime, and equal the time of the knot.

source
TimeDag.pulseFunction
pulse(delta::TimePeriod[; epoch::DateTime])

Obtain a node which ticks every delta. Each value will equal the time of the knot.

Each knot will be placed such that the difference between its time and epoch is an integer multiple of delta. By default epoch is set to the Julia DateTime epoch, which is DateTime(0, 12, 31).

source
TimeDag.tea_fileFunction
tea_file(path::AbstractString, value_field_name)

Get a node that will read data from the tea file at path.

Such a tea file must observe the following properties, which will be verified at runtime:

  • Have a primary time field which is compatible with a Julia DateTime.
  • Have exactly one column with name value_field_name.
  • Have strictly increasing times.

Upon node creation, the metadata section of the file will be parsed to infer the value type of the resulting node. However, the bulk of the data will only be read at evaluation time.

See also

source
+constant(T, value) -> Node{T}

Explicitly wrap value into a TimeDag constant node, regardless of its type.

If T is provided, this allows creation of a node with a value_type that is a supertype of the type of the value — otherwise the constant node will always just use the concrete type of value.

In many cases this isn't required, since many TimeDag functions which expect nodes will automatically wrap non-node arguments into a constant node.

Warning

If value is already a node, this will wrap it up in an additional node. This is very likely not what you want to do.

source
TimeDag.empty_nodeFunction
empty_node(T)

Construct a node with value type T which, if evaluated, will never tick.

source
TimeDag.iterdatesFunction
iterdates(time_of_day::Time=Time(0), tz::TimeZone=tz"UTC", occurrence=1)

Create a node which ticks exactly once a day at time_of_day in timezone tz.

This defaults to midnight in UTC. If tz is set otherwise, then each knot will appear at time_of_day in that timezone.

Note that:

  • All knot times in TimeDag are considered to be in UTC.
  • It is possible to select a time_of_day that does not exist for every day. This will lead to an exception being raised during evaluation.

In a given knot, each value will be of type DateTime, and equal the time of the knot.

source
TimeDag.pulseFunction
pulse(delta::TimePeriod[; epoch::DateTime])

Obtain a node which ticks every delta. Each value will equal the time of the knot.

Each knot will be placed such that the difference between its time and epoch is an integer multiple of delta. By default epoch is set to the Julia DateTime epoch, which is DateTime(0, 12, 31).
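The placement rule can be sketched with the Dates standard library alone. The helper first_knot_at_or_after is a hypothetical name invented for this illustration and is not part of TimeDag. It assumes delta can be converted to a whole number of milliseconds.

```julia
using Dates

# Hypothetical helper, not part of TimeDag: the first time at or after `t`
# whose distance from `epoch` is an integer multiple of `delta`.
function first_knot_at_or_after(
    t::DateTime, delta::TimePeriod; epoch::DateTime=DateTime(0, 12, 31)
)
    step = Dates.value(convert(Millisecond, delta))  # period length in ms
    offset = Dates.value(t - epoch)                  # ms elapsed since epoch
    # Round the offset up to the next multiple of the step.
    return epoch + Millisecond(cld(offset, step) * step)
end

# Hourly grid anchored at the default epoch: knots fall on the hour.
on_the_hour = first_knot_at_or_after(DateTime(2020, 1, 1, 0, 30), Hour(1))

# Shifting the epoch by 15 minutes shifts every knot by 15 minutes.
shifted = first_knot_at_or_after(
    DateTime(2020, 1, 1, 0, 30), Hour(1); epoch=DateTime(2020, 1, 1, 0, 15)
)
```

Under these assumptions on_the_hour is 2020-01-01T01:00:00 and shifted is 2020-01-01T01:15:00, which shows how epoch controls the phase of the grid while delta controls its spacing.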

source
TimeDag.tea_fileFunction
tea_file(path::AbstractString, value_field_name)

Get a node that will read data from the tea file at path.

Such a tea file must observe the following properties, which will be verified at runtime:

  • Have a primary time field which is compatible with a Julia DateTime.
  • Have exactly one column with name value_field_name.
  • Have strictly increasing times.

Upon node creation, the metadata section of the file will be parsed to infer the value type of the resulting node. However, the bulk of the data will only be read at evaluation time.

See also

source