From 31bc49e42bfd366abfb4b44e31c15a0eebe446ae Mon Sep 17 00:00:00 2001 From: lrjohnson0 Date: Fri, 19 Jul 2024 14:09:13 -0400 Subject: [PATCH] made a few changes to get GP content into the format that matches the rest of the content --- GP.qmd | 16 +- GP_Notes.qmd | 40 ++- GP_Practical.qmd | 51 ++-- ...ctical.qmd => VB_TimeDepData_practical.qmd | 0 docs/search.json | 280 ++++++++++++++++++ docs/site_libs/revealjs/dist/theme/quarto.css | 2 +- 6 files changed, 349 insertions(+), 40 deletions(-) rename VB_IntroTimeDepData_practical.qmd => VB_TimeDepData_practical.qmd (100%) diff --git a/GP.qmd b/GP.qmd index d1b03df..61e6d25 100644 --- a/GP.qmd +++ b/GP.qmd @@ -1,8 +1,14 @@ --- -title: "Gaussian Processes" +title: "VectorByte Methods Training" +subtitle: "Introduction to Gaussian Processes for Time Dependent Data" +editor: source +author: "The VectorByte Team (Parul Patil, Virginia Tech)" +title-slide-attributes: + data-background-image: VectorByte-logo_lg.png + data-background-size: contain + data-background-opacity: "0.2" format: revealjs -editor: visual -fontsize: 20pt +fontsize: 22pt html-math-method: katex --- @@ -120,11 +126,11 @@ ggplot() + - Recall that in Linear Regression, you have $\ \Sigma \ = \sigma^2 \mathbb{I}$ -- For a GP, the cavariance matrix ( \$ \Sigma \$ ) is defined by a kernel. +- For a GP, the covariance matrix ( $\Sigma$ ) is defined by a kernel. - Consider, -\$\Sigma\_n = \tau\^2  C_n  \$;    \$C_n = \exp \left( - \vert \vert x - x' \vert \vert\^2 \right) \$, where $x$ and $x'$ are input locations. +$\Sigma_n = \tau^2 C_n$ where $C_n = \exp \left( - \vert \vert x - x' \vert \vert^2 \right)$, and $x$ and $x'$ are input locations. . . . diff --git a/GP_Notes.qmd b/GP_Notes.qmd index 789f121..b8f393b 100644 --- a/GP_Notes.qmd +++ b/GP_Notes.qmd @@ -1,13 +1,25 @@ --- -title: "Notes" -author: "Parul Patil" -date: "2024-07-18" -output: html_document +title: "VectorByte Methods Training" +subtitle: "Introduction to Gaussian Processes for Time Dependent Data" +author: "The VectorByte Team (Parul Patil, Virginia Tech)" +title-slide-attributes: + data-background-image: VectorByte-logo_lg.png + data-background-size: contain + data-background-opacity: "0.2" +format: + html: + toc: true + toc-location: left + html-math-method: katex + css: styles.css --- -### Gaussian Processes +# Introduction to Gaussian Processes for Time Dependent Data -#### Introduction +This document introduces the conceptual background to Gaussian Process (GP) regression, along with mathematical concepts. We also demonstrate briefly fitting GPs using the `laGP` package in R. The material here is intended to give a more verbose introduction to what is covered in the [lecture](GP.qmd) in order to support a student to work through the [practical component](GP_Practical.qmd). This material has been adapted from chapter 5 of the book [Surrogates: Gaussian process modeling, design and optimization for the applied sciences](https://bobby.gramacy.com/surrogates/) by Robert Gramacy. + + +# Gaussian Processes A Gaussian Process (GP) is a non-parameteric and flexible method to model data. Here, we assume that the data follow a Multivariate Normal Distribution. It is widely used in the fields of spatial statistics commonly known as *kriging* as well as machine learning as a surrogate model due to it's ability of making fast predictions with good uncertainty quantification. @@ -17,7 +29,7 @@ For e.g. 
Suppose we wish to estimate the amount of energy released when a bomb e In the field of ecology, as we will see in this tutorial, we will use a Gaussian Process model as a Non-Parametric Regression tool, similar to Linear Regression. We will assume our data follows a GP, and use Bayesian Methods to infer the distribution of new locations given the data. The predictions obtained will have good uncertainty quantification which is ofcourse valuable in this field due to the noisy nature of our data. -##### Gaussian Process Prior +## Gaussian Process Prior In our setup, we assume that the data follows a Multivariate Normal Distribution. We can think of this as a Prior. Mathematically, we can write it as: @@ -35,7 +47,7 @@ $$\Sigma(x, x') = \exp \left( \ \vert \vert{x - x'} \vert \vert^2 \ \right)$$ This creates a positive semi-definite correlation kernel. It also uses the proximity of $x$ and $x'$ as a measure of correlation i.e. the closer two points in the input space are, the more highly their corresponding responses are correlated. We will learn the exact form of $\Sigma_n$ later in the tutorial. First, we need to learn about MVN Distribution and the Posterior Distribution given the data. -**Multivariate Normal Distribution** +## Multivariate Normal Distribution Suppose we have $X = (X_1, X_2)$ @@ -116,7 +128,7 @@ Y(x_p) \vert \ D_n \ & \sim \mathcal{N} \left(\mu(x_p) \ , \ \sigma^2(x_p) \righ \end{aligned} \end{equation} -**Visualization** +## Example: GP for Toy Example Suppose we have data from the following function, $$Y(x) = 5 \ \sin (x)$$ @@ -180,7 +192,7 @@ ggplot() + -##### Covariance Function +## Covariance Function As mentioned before, the main action is in the specification of $\Sigma_n$. Let's express $\Sigma_n = \tau^2 C_n$ where $C_n$ is the correlation function. We will be using the squared exponential distance based correlation function. The kernel can be written down mathematically as, @@ -190,7 +202,7 @@ Here, if $x$ and $x'$ are closer in distance, the responses are more highly corr We have three main hyper-parameters here: -###### $\tau^2$: Scale +### $\tau^2$: Scale This parameter dictates the amplitude of our function. A MVN Distribution with scale = 1 will usually have data between -2 to 2. As the scale increases, this range expands. Here is a plo that shows two different scales for the same function, with all other parameters fixed. @@ -224,7 +236,7 @@ matplot(X, t(Y_scaled), type = 'l', main = expression(paste(tau^2, " = 25")), ylab = "Y", xlab = "X", lwd = 2, col = "red") ``` -###### $\theta$: Length-scale +### $\theta$: Length-scale The length-scale controls the wiggliness of the function. It is also referred to as the rate of decay of correlation. The smaller it's value, the more wiggly the function gets. This is because we expect the change in directionalilty of the function to be rather quick. We will once again demonstrate how a difference in magnitude of $\theta$ affects the function, keeping everything else constant. @@ -262,7 +274,7 @@ $$C_n = \exp{ \left( - \sum_{k=1}^{m} \frac{\vert\vert x - x' \vert \vert ^2}{\ This type of modelling is also referred to a seperable GP since we can take the sum outside the exponent and it will be a product of $m$ independent dimensions. If we set all $\theta_k= \theta$, it would be an isotropic GP. -###### $g$: Nugget +### $g$: Nugget This parameter adds some noise into the function. For $g > 0$, it determines the magnitude of discontinuity as $x'$ tends to $x$. It is also called the nugget effect. 
For $g=0$, there would be no noise and the function would interpolate between the points. This effect is only added to the diagonals of the correlation matrix. We never add $g$ to the off-diagonal elements. This also allows for numeric stability. Below is a snippet of what different magnitudes of $g$ look like. @@ -306,7 +318,7 @@ For now, let's get back to GP and fitting and learn how to use it. We have alrea - $\tau^2$: An estimate is obtained as a closed form solution once we plug in g. -### Heteroskedastic Gaussian Processes +# Heteroskedastic Gaussian Processes Let $Y_N$ be the response vector of size N. Let $X_1, X_2 ... X_d$ be the input space. diff --git a/GP_Practical.qmd b/GP_Practical.qmd index cb5aca7..d7f70ba 100644 --- a/GP_Practical.qmd +++ b/GP_Practical.qmd @@ -1,16 +1,22 @@ --- -title: "Gaussian Processes: Practical" -author: "Parul Patil" -date: "2024-06-21" -output: html_document +title: "VectorByte Methods Training" +subtitle: "Practical: Introduction to Gaussian Processes for Time Dependent Data" +author: "The VectorByte Team (Parul Patil, Virginia Tech)" +format: + html: + toc: true + toc-location: left + html-math-method: katex + css: styles.css --- -## Gaussian Processes +# Objectives +This practical will lead you through fitting a few versions of GPs using two R packages: `laGP` and `hetGP`. We will begin with a toy example from the lecture and then move on to a real data example to forecast tick abundances for a NEON site. -### Basics: Fitting a GP Model +# Basics: Fitting a GP Model -Remember the function we looked at before? +Remember the function we looked at before: $$Y(x) = 5 \ \sin(x)$$ Now, let's learn how we actually use the library `laGP` to fit a GP and make predictions at new locations. Let's begin by loading some libraries @@ -117,10 +123,13 @@ ggplot() + Looks pretty cool. + +# Using GPs for data on tick abundances over time + We will try all this on a simple dataset: Tick Data from NEON Forecasting Challenge. We will first learn a little bit about this dataset, followed by setting up our predictors and using them in our model to predict tick density for the future season. We will also learn how to fit a separable GP and specify priors for our parameters. Finally, we will learn some basics about a HetGP (Heteroskedastic GP) and try and fit that model as well. -### Overview of the Data +## Overview of the Data - **Objective**: Forecast tick density for 4 weeks into the future @@ -144,7 +153,7 @@ target <- readr::read_csv("https://data.ecoforecast.org/neon4cast-targets/ticks/ head(target) ``` -#### Initial Setup +## Initial Setup - For a GP model, we assume the response ($Y$) should be normally distributed. @@ -180,7 +189,7 @@ fi <- function(y) { } ``` -#### Predictors +## Predictors - The goal is to forecast tick populations for a season so our response (Y) here, is the tick density. However, we do not have a traditional data set with an obvious input space. What is the X? @@ -278,7 +287,7 @@ $X^* = (X_1^*, X_2^* ... X_n^*)$ will be the standarized $X$'s with all $X_i^*$ We can either write a function for this, or in our case, we can just divide Iso-week by 53 since that would result effectively be the same. Our Sin Predictor already lies in the interval [0, 1]. -### Model Fitting +## Model Fitting Now, let's start with modelling. We will start with one random location out of the 9 locations. 
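As a quick illustration of that step, here is a minimal sketch of picking one of the nine sites and subsetting the data down to the columns we model with. It assumes `target` has been loaded as above (the site index is arbitrary and only for illustration); the rendered practical uses essentially the same code.

```{r}
# Pick one of the 9 NEON sites (index chosen arbitrarily for illustration)
site_number <- 6
site_names <- unique(target$site_id)

# Keep only the rows for that site, and only the datetime/observation columns
df <- subset(target, target$site_id == site_names[site_number])
df <- df[, c("datetime", "observation")]
head(df)
```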
@@ -310,7 +319,7 @@ df_train <- subset(df, df$datetime <= cutoff) df_test <- subset(df, df$datetime > cutoff) ``` -#### GP Model +## GP Model Now we will setup our X's. We already have the functions to do this and can simply pass in the datetime. We then combine $X_1$ and $X_2$ to create out input matrix $X$. Remember, everything is ordered as in our dataset. @@ -433,11 +442,10 @@ rmse ``` -Next, we can attempt a hetGP. -#### HetGP Model +## HetGP Model -We are now interested in fitting a vector of nuggets rather than a single value. +Next, we can attempt a hetGP. We are now interested in fitting a vector of nuggets rather than a single value. Let's use the same data we have to fit a hetGP. We already have our data `(X, y)` as well as our prediction set `XXt`. We use the `mleHetGP` command to fit a GP and pass in our data. The default covariance structure is the Squared Exponential structure. We use the `predict` function in base R and pass the hetGP object i.e. `het_gpi` to make predictions on our set `XXt`. @@ -552,11 +560,13 @@ plot(X, y) # Add code to fit a hetGP model and visualise it as above ``` -### Challenges +# Challenges -- Fit a GP Model for the location "SERC" i.e. `site_number = 7`. +Now it's your turn to try to fit a GP on your own. Here are some options: -- Use an environmental predictor in your model. Following is a function `fx.green` that creates the variable given the `datetime` and the `location`. +### Fit a GP Model for the location "SERC" i.e. `site_number = 7`. + +### Use an environmental predictor in your model. Following is a function `fx.green` that creates the variable given the `datetime` and the `location`. Here is a snippet of the supporting file that you will use; You can look into the data.frame and try to plot `ker` for one site at a time and see what it yields. @@ -576,10 +586,11 @@ fx.green <- function(datetime, site, site_info = df_green){ } ``` -- Fit a GP Model for all the locations (*More advanced*). +### Fit a GP Model for all the locations (*More advanced*). Hint: Write a *function* that can fit a GP and make predictions. Then, write a *for loop* where you subset the data for each location and then call the function to fit the GP. - \ No newline at end of file + + diff --git a/VB_IntroTimeDepData_practical.qmd b/VB_TimeDepData_practical.qmd similarity index 100% rename from VB_IntroTimeDepData_practical.qmd rename to VB_TimeDepData_practical.qmd diff --git a/docs/search.json b/docs/search.json index f40afc3..f2dc71e 100644 --- a/docs/search.json +++ b/docs/search.json @@ -817,5 +817,285 @@ "title": "VectorByte Methods Training", "section": "Hypothesis Testing and Confidence Intervals", "text": "Hypothesis Testing and Confidence Intervals\nSuppose we sample some data \\{Y_i, i=1,\\dots,n \\}, where Y_i \\stackrel{\\mathrm{i.i.d.}}{\\sim}N(\\mu,\\sigma^2) for i=1,\\ldots,n, and that you want to test the null hypothesis H_0: ~\\mu=12 vs. the alternative H_a: \\mu \\neq 12, at the 0.05 significance level.\n\nWhat test statistic would you use? How do you estimate \\sigma?\nWhat is the distribution for this test statistic if the null is true?\nWhat is the distribution for the test statistic if the null is true and n \\rightarrow \\infty?\nDefine the test rejection region. (I.e., for what values of the test statistic would you reject the null?)\nHow would compute the p-value associated with a particular sample?\nWhat is the 95% confidence interval for \\mu? 
How should one interpret this interval?\nIf \\bar{Y} = 11, s_y = 1, and n=9, what is the test result? What is the 95% CI for \\mu?" + }, + { + "objectID": "GP.html#gaussian-process-introduction", + "href": "GP.html#gaussian-process-introduction", + "title": "VectorByte Methods Training", + "section": "Gaussian Process: Introduction", + "text": "Gaussian Process: Introduction\n\n\nA Gaussian Process model is a non paramteric and flexible regression model\nIt started being used in the field of spatial statistics, where it is called kriging.\nIt is also widely used in the field of machine learning since it makes fast predictions and gives good uncertainty quantification commonly used as a surrogate model.\n\n\n\nSurrogate Models: Imagine a case where field experiments are infeasible and computer experiments take a long time to run, we can approximate the computer experiments using a surrogate model.\n\nThe ability to this model to provide good UQ makes it very useful in other fields such as ecology where the data is sparse and noisy and therefore good uncertainty measures are paramount." + }, + { + "objectID": "GP.html#what-is-a-gp", + "href": "GP.html#what-is-a-gp", + "title": "VectorByte Methods Training", + "section": "What is a GP?", + "text": "What is a GP?\n\n\nHere, we assume that the data come from a Multivariate Normal Distribution (MVN).\nAny normal distribution can be described by a mean vector \\mu and a covariance matrix \\Sigma.\nWe then make predictions conditional on the data.\n\n\n\n\nMathematically, we can write it as,\n\nY_{\\ n \\times 1} \\sim N \\ ( \\ \\mu(X)_{\\ n \\times 1} \\ , \\ \\Sigma(X)_{ \\ n \\times n} \\ ) Here, Y is the response of interest and n is the number of observations.\n\n\n\nOur goal is to find Y_p \\ \\vert \\ Y, X which will also be a Normal distribution.\nTo understand how it works, let’s first visualize this concept and then look into the math" + }, + { + "objectID": "GP.html#visualizing-a-gp", + "href": "GP.html#visualizing-a-gp", + "title": "VectorByte Methods Training", + "section": "Visualizing a GP", + "text": "Visualizing a GP\n\n\n\n\n\n\n\n\n\n\n\n\n\nWe are using points closer to each other to correlate the responses.\nWhile making predictions, we average over the data around the points." + }, + { + "objectID": "GP.html#how-does-a-gp-work", + "href": "GP.html#how-does-a-gp-work", + "title": "VectorByte Methods Training", + "section": "How does a GP work", + "text": "How does a GP work\n\n\nAs we saw, we are averaging the data “nearby”… How do you define that?\n\n\n\n\nThis indicates that we are using distance in some way. Where…?\n\n\n\n\nRecall that in Linear Regression, you have \\ \\Sigma \\ = \\sigma^2 \\mathbb{I}\nFor a GP, the covariance matrix ( \\Sigma ) is defined by a kernel.\nConsider,\n\n\\Sigma_n = \\tau^2 C_n where C_n = \\exp \\left( - \\vert \\vert x - x' \\vert \\vert^2 \\right), and x and x' are input locations.\n\n\n\nThe covariance structure now depends on how close together the inputs. If inputs are close in distance, then the responses are more highly correlated.\nThe covariance will decay at an exponential rate as x moves away from x'." + }, + { + "objectID": "GP.html#how-to-make-predictions", + "href": "GP.html#how-to-make-predictions", + "title": "VectorByte Methods Training", + "section": "How to make predictions", + "text": "How to make predictions\n\n\nNow we will learn how to use a GP to make predictions at new locations.\nAs we learnt, we condition on the data. 
We can think of this as the prior.\n\n\\begin{equation}\nY_n \\ \\vert X_n \\sim \\mathcal{N} \\ ( \\ 0 \\ , \\ \\tau^2 \\ C_n(X) \\ ) \\\\\n\\end{equation}\n\n\nNow, consider, (\\mathcal{X}, \\mathcal{Y}) as the predictive set.\n\nThe goal is to find the distribution of \\mathcal{Y} \\ \\vert X_n, Y_n which in this case is the posterior distribution.\nBy properties of Normal distribution, the posterior is also normally distributed.\n\n\n\nWe also need to write down the mean and variance of the posterior distribution so it’s ready for use." + }, + { + "objectID": "GP.html#how-to-make-predictions-1", + "href": "GP.html#how-to-make-predictions-1", + "title": "VectorByte Methods Training", + "section": "How to make predictions", + "text": "How to make predictions\n\n\nFirst we will “stack” the predictions and the data.\n\n\\begin{equation}\n\\begin{bmatrix}\n\\mathcal{Y} \\\\\nY_n \\\\\n\\end{bmatrix}\n\\ \\sim \\ \\mathcal{N}\n\\left(\n\\;\n\\begin{bmatrix}\n0 \\\\\n0 \\\\\n\\end{bmatrix}\\; , \\;\n\\begin{bmatrix}\n\\Sigma(\\mathcal{X}, \\mathcal{X}) & \\Sigma(\\mathcal{X}, X_n)\\\\\n\\Sigma({X_n, \\mathcal{X}}) & \\Sigma_n\\\\\n\\end{bmatrix}\n\\;\n\\right)\n\\\\[5pt]\n\\end{equation}\n\n\n\nNow, let’s denote the predictive mean with \\mu(\\mathcal{X}) and predictive variance with \\sigma^2(\\mathcal{X})\n\n\\begin{equation}\n\\mathcal{Y} \\mid Y_n, X_n \\sim N\\left(\\mu(\\mathcal{X}) \\ , \\ \\sigma^2(\\mathcal{X})\\right)\n\\end{equation}\n\n\n\nWe will apply the properties of conditional Normal distributions.\n\n\\begin{equation}\n\\begin{aligned}\n\\mu(\\mathcal{X}) & = \\Sigma(\\mathcal{X}, X_n) \\Sigma_n^{-1} Y_n \\\\\n\\sigma^2(\\mathcal{X}) & = \\Sigma(\\mathcal{X}, \\mathcal{X}) - \\Sigma(\\mathcal{X}, X_n) \\Sigma_n^{-1} \\Sigma(X_n, \\mathcal{X}) \\\\\n\\end{aligned}\n\\end{equation}" + }, + { + "objectID": "GP.html#hyper-parameters", + "href": "GP.html#hyper-parameters", + "title": "VectorByte Methods Training", + "section": "Hyper Parameters", + "text": "Hyper Parameters\n\n\nA GP is non parameteric, however, has some hyper-parameters as is the case with any Bayesian setup.\n\nOne of the most common kernels which we will focus on is the squared exponential distance kernel written as\nC_n = \\exp{ \\left( -\\frac{\\vert\\vert x - x' \\vert \\vert ^2}{\\theta} \\right ) + g \\mathbb{I_n}} \n\n\nRecall, \\Sigma_n = \\tau^2 C_n\nWe have three main parameters here:\n\n\\tau^2: Scale\n\\theta: Length-scale\ng: Nugget" + }, + { + "objectID": "GP.html#scale", + "href": "GP.html#scale", + "title": "VectorByte Methods Training", + "section": "Scale", + "text": "Scale\n\n\nThis parameter can be used to adjust the amplitude of the data.\nA random draw from a multivariate normal distribution with \\tau^2 = 1 will produce data between -2 and 2.\n\n\n\n\nNow let’s visualize what happens when we increase \\tau^2 to 25." + }, + { + "objectID": "GP.html#length-scale", + "href": "GP.html#length-scale", + "title": "VectorByte Methods Training", + "section": "Length-scale", + "text": "Length-scale\n\n\nThis parameter controls the rate of decay of correlation.\nLarger \\theta will result in wigglier functions.\n\n\n\nLet’s also visualize different values of \\theta." 
+ }, + { + "objectID": "GP.html#nugget", + "href": "GP.html#nugget", + "title": "VectorByte Methods Training", + "section": "Nugget", + "text": "Nugget\n\n\nIt is responsible for introducing noise into the covariance structure so there is some discontinuity in the data.\nWe will compare a sample from g ~ 0 (< 1e-8 for numeric stability) vs g = 0.1 to observe what actually happens.\n\n\n\n\n\n\n\n\n\n\n\n\n\nThis parameter prevents interpolation which in turn yields better UQ." + }, + { + "objectID": "GP.html#toy-example-1d-example", + "href": "GP.html#toy-example-1d-example", + "title": "VectorByte Methods Training", + "section": "Toy Example (1D Example)", + "text": "Toy Example (1D Example)\n\n\n\nX <- matrix(seq(0, 2*pi, length = 100), ncol =1)\nn <- nrow(X) \ntrue_y <- 5 * sin(X)\nobs_y <- true_y + rnorm(n, sd=1)\n\npar(mfrow = c(1, 1), mar = c(5, 5, 4, 2), cex.axis = 2, cex.lab = 2, cex.main = 3, font.lab = 2)\nplot(X, obs_y, ylim = c(-10, 10), main = \"GP fit\", xlab = \"X\", ylab = \"Y\",\n cex = 1.5, pch = 16)\nlines(X, true_y, col = 2, lwd = 3)" + }, + { + "objectID": "GP.html#toy-example-1d-example-1", + "href": "GP.html#toy-example-1d-example-1", + "title": "VectorByte Methods Training", + "section": "Toy Example (1D Example)", + "text": "Toy Example (1D Example)\n\n\neps <- sqrt(.Machine$double.eps)\ngpi <- laGP::newGP(X = X,Z = obs_y, d = 0.1, g = 0.1 * var(obs_y), dK = TRUE) \nmle <- laGP::mleGP(gpi = gpi, param = c(\"d\", \"g\"), tmin= c(eps, eps), tmax= c(10, var(obs_y))) \n\nXX <- matrix(seq(0, 2*pi, length = 1000), ncol =1)\np <- laGP::predGP(gpi = gpi, XX = XX)" + }, + { + "objectID": "GP.html#anisotropic-gaussian-processes", + "href": "GP.html#anisotropic-gaussian-processes", + "title": "VectorByte Methods Training", + "section": "Anisotropic Gaussian Processes", + "text": "Anisotropic Gaussian Processes\nSuppose our data is multi-dimensional, we can control the length-scale (\\theta) for each dimension.\nIn this situation, we can rewrite the C_n matrix as,\nC_\\theta(x , x') = \\exp{ \\left( -\\sum_{k=1}^{m} \\frac{ (x_k - x_k')^2 }{\\theta_k} \\right ) + g \\mathbb{I_n}}\nHere, \\theta = (\\theta_1, \\theta_2, …, \\theta_m) is a vector of length-scales, where m = dimension of the input space.\n\nWe can adjust the decay of correlation per dimension." + }, + { + "objectID": "GP.html#heteroskedastic-gaussian-processes", + "href": "GP.html#heteroskedastic-gaussian-processes", + "title": "VectorByte Methods Training", + "section": "Heteroskedastic Gaussian Processes", + "text": "Heteroskedastic Gaussian Processes\n\n\nHeteroskedasticity implies that the data is noisy, and thee noise is irregular.\n\n\n\n\n\n\n\n\n\n\n\n\n\nA Heteroskedastic Gaussian Process (hetGP) is used when the data is noisy.\nNoise is usually input dependent and may vary with one or more predictors" + }, + { + "objectID": "GP.html#hetgp-setup", + "href": "GP.html#hetgp-setup", + "title": "VectorByte Methods Training", + "section": "HetGP Setup", + "text": "HetGP Setup\n\nLet Y_n be the response vector of size n. Let X = (X_1, X_2 ... 
X_n) be the input space.\nThen, a regular GP is written as:\n\n\\begin{align*}\nY_N \\ & \\ \\sim GP \\left( 0 \\ , \\tau^2 C_n \\right); \\ \\text{where, }\\\\[2pt]\nC_n & \\ = \\exp{ \\left( -\\frac{\\vert\\vert x - x' \\vert \\vert ^2}{\\theta} \\right ) + g \\mathbb{I_n}}\n\\end{align*}\n\n\n\nIn case of a hetGP, we have:\n\n\\begin{aligned}\nY_n\\ & \\ \\sim GP \\left( 0 \\ , \\tau^2 C_{n, \\Lambda} \\right) \\ \\ \\text{where, }\\\\[2pt]\nC_{n, \\Lambda} & \\ = \\exp{ \\left( -\\frac{\\vert\\vert x - x' \\vert \\vert ^2}{\\theta} \\right ) + \\Lambda_n} \\ \\ \\ \\text{and, }\\ \\\\[2pt]\n\\ \\ \\Lambda_n \\ & \\ = \\ \\text{Diag}(\\lambda_1, \\lambda_2 ... , \\lambda_n) \\\\[2pt]\n\\end{aligned}\n\n\nInstead of one nugget for the GP, we have a vector of nuggets i.e. a unique nugget for each unique input.\n\n\n\n\nWe can make predictions exactly as seen in case of a GP in Equation(4) with \n\\begin{aligned}\n\\Sigma(X) \\ & \\ = \\nu \\left( \\ K_{\\theta_Y}(X) + \\Lambda_N(X) \\ \\right) \\\\[2pt]\n\\end{aligned}" + }, + { + "objectID": "GP.html#toy-example-1d-example-2", + "href": "GP.html#toy-example-1d-example-2", + "title": "VectorByte Methods Training", + "section": "Toy Example (1D Example)", + "text": "Toy Example (1D Example)" + }, + { + "objectID": "GP.html#toy-example-1d-example-3", + "href": "GP.html#toy-example-1d-example-3", + "title": "VectorByte Methods Training", + "section": "Toy Example (1D Example)", + "text": "Toy Example (1D Example)" + }, + { + "objectID": "GP.html#toy-example-1d-example-4", + "href": "GP.html#toy-example-1d-example-4", + "title": "VectorByte Methods Training", + "section": "Toy Example (1D Example)", + "text": "Toy Example (1D Example)" + }, + { + "objectID": "GP.html#intro-to-ticks-problem", + "href": "GP.html#intro-to-ticks-problem", + "title": "VectorByte Methods Training", + "section": "Intro to Ticks Problem", + "text": "Intro to Ticks Problem\n\nEFI-RCN held an ecological forecasting challenge\nWe focus on the Tick Populations theme which studies the abundance of the lone star tick (Amblyomma americanum)\n\nSome details about the challenge:\n\nObjective: Forecast tick density for 4 weeks into the future\nSites: The data is collected across 9 different sites, each plot was of size 1600m^2 using a drag cloth\nData: Sparse and irregularly spaced. We only have ~650 observations across 10 years at 9 locations" + }, + { + "objectID": "GP.html#predictors", + "href": "GP.html#predictors", + "title": "VectorByte Methods Training", + "section": "Predictors", + "text": "Predictors\n\nX_1 Iso-week: The week in which the tick density was recorded.\nX_2 Sine wave: \\left( \\text{sin} \\ ( \\frac{2 \\ \\pi \\ X_1}{106} ) \\right)^2." + }, + { + "objectID": "GP.html#practical", + "href": "GP.html#practical", + "title": "VectorByte Methods Training", + "section": "Practical", + "text": "Practical\n\nSetup these predictors\nTransform the data to normal\nFit a GP to the Data\nMake Predictions on a testing set\nCheck how predictions perform." + }, + { + "objectID": "GP_Notes.html", + "href": "GP_Notes.html", + "title": "VectorByte Methods Training", + "section": "", + "text": "This document introduces the conceptual background to Gaussian Process (GP) regression, along with mathematical concepts. We also demonstrate briefly fitting GPs using the laGP package in R. The material here is intended to give a more verbose introduction to what is covered in the lecture in order to support a student to work through the practical component. 
This material has been adapted from chapter 5 of the book Surrogates: Gaussian process modeling, design and optimization for the applied sciences by Robert Gramacy." + }, + { + "objectID": "GP_Notes.html#introduction", + "href": "GP_Notes.html#introduction", + "title": "VectorByte Methods Training", + "section": "", + "text": "A Gaussian Process (GP) is a non-parameteric and flexible method to model data. Here, we assume that the data follow a Multivariate Normal Distribution. It is widely used in the fields of spatial statistics commonly known as kriging as well as machine learning as a surrogate model due to it’s ability of making fast predictions with good uncertainty quantification.\nIn spatial statistics, we want to emphasize the relationship between the distance of different locations along with the response of interest. A GP allows us to do that as we will see in this tutorial. A different application of a GP involves it’s usage as a surrogate. A surrogate model is used to approximate a computer model and/or field experiments where running the experiments may be cost or time ineffective and/or infeasible.\nFor e.g. Suppose we wish to estimate the amount of energy released when a bomb explodes. Conducting this experiment repeatedly a large number of times to collect data seems infeasible. In such cases, we can use a surrogate model to the field data and make predictions for different input locations.\nIn the field of ecology, as we will see in this tutorial, we will use a Gaussian Process model as a Non-Parametric Regression tool, similar to Linear Regression. We will assume our data follows a GP, and use Bayesian Methods to infer the distribution of new locations given the data. The predictions obtained will have good uncertainty quantification which is ofcourse valuable in this field due to the noisy nature of our data." + }, + { + "objectID": "VB_TimeDepData_practical.html#exploring-the-data", + "href": "VB_TimeDepData_practical.html#exploring-the-data", + "title": "VectorByte Methods Training", + "section": "Exploring the Data", + "text": "Exploring the Data\nAs always, we first want to take a look at the data, to make sure we understand it, and that we don’t have missing or weird values.\n\nmozData<-read.csv(\"data/Culex_erraticus_walton_covariates_aggregated.csv\")\nsummary(mozData)\n\n Month_Yr sample_value MaxTemp Precip \n Length:36 Min. :0.00000 Min. :16.02 Min. : 0.000 \n Class :character 1st Qu.:0.04318 1st Qu.:22.99 1st Qu.: 2.162 \n Mode :character Median :0.73001 Median :26.69 Median : 4.606 \n Mean :0.80798 Mean :26.23 Mean : 5.595 \n 3rd Qu.:1.22443 3rd Qu.:30.70 3rd Qu.: 7.864 \n Max. :3.00595 Max. :33.31 Max. :18.307 \n\n\nWe can see that the minimum observed average number of mosquitoes it zero, and max is only 3 (there are likely many zeros averaged over many days in the month). There don’t appear to be any NAs in the data. 
In this case the dataset itself is small enough that we can print the whole thing to ensure it’s complete:\n\nmozData\n\n Month_Yr sample_value MaxTemp Precip\n1 2015-01 0.000000000 17.74602 3.303991888\n2 2015-02 0.018181818 17.87269 16.544265802\n3 2015-03 0.468085106 23.81767 2.405651215\n4 2015-04 1.619047619 26.03559 8.974406168\n5 2015-05 0.821428571 30.01602 0.567960943\n6 2015-06 3.005952381 31.12094 4.841342729\n7 2015-07 2.380952381 32.81130 3.849010353\n8 2015-08 1.826347305 32.56245 5.562845324\n9 2015-09 0.648809524 30.55155 10.409724627\n10 2015-10 0.988023952 27.22605 0.337750269\n11 2015-11 0.737804878 24.86768 18.306749680\n12 2015-12 0.142857143 22.46588 5.621475377\n13 2016-01 0.000000000 16.02406 3.550622029\n14 2016-02 0.020202020 19.42057 11.254680803\n15 2016-03 0.015151515 23.13610 4.785664728\n16 2016-04 0.026143791 24.98082 4.580424519\n17 2016-05 0.025252525 28.72884 0.053057634\n18 2016-06 0.833333333 30.96990 6.155417473\n19 2016-07 1.261363636 33.30509 4.496368193\n20 2016-08 1.685279188 32.09633 11.338749182\n21 2016-09 2.617142857 31.60575 2.868288451\n22 2016-10 1.212121212 29.14275 0.000000000\n23 2016-11 1.539772727 24.48482 0.005462681\n24 2016-12 0.771573604 20.46054 11.615521725\n25 2017-01 0.045454545 18.35473 0.000000000\n26 2017-02 0.036363636 23.65584 3.150710053\n27 2017-03 0.194285714 22.53573 1.430094952\n28 2017-04 0.436548223 26.15299 0.499381616\n29 2017-05 1.202020202 28.00173 6.580562663\n30 2017-06 0.834196891 29.48951 13.333939858\n31 2017-07 1.765363128 32.25135 7.493927035\n32 2017-08 0.744791667 31.86476 6.082113434\n33 2017-09 0.722222222 30.60566 4.631037395\n34 2017-10 0.142131980 27.73453 11.567112214\n35 2017-11 0.289772727 23.23140 1.195760473\n36 2017-12 0.009174312 18.93603 4.018254442" + }, + { + "objectID": "VB_TimeDepData_practical.html#plotting-the-data", + "href": "VB_TimeDepData_practical.html#plotting-the-data", + "title": "VectorByte Methods Training", + "section": "Plotting the data", + "text": "Plotting the data\nFirst we’ll examine the data itself, including the predictors:\n\nmonths<-dim(mozData)[1]\nt<-1:months ## counter for months in the data set\npar(mfrow=c(3,1))\nplot(t, mozData$sample_value, type=\"l\", lwd=2, \n main=\"Average Monthly Abundance\", \n xlab =\"Time (months)\", \n ylab = \"Average Count\")\nplot(t, mozData$MaxTemp, type=\"l\",\n col = 2, lwd=2, \n main=\"Average Maximum Temp\", \n xlab =\"Time (months)\", \n ylab = \"Temperature (C)\")\nplot(t, mozData$Precip, type=\"l\",\n col=\"dodgerblue\", lwd=2,\n main=\"Average Monthly Precip\", \n xlab =\"Time (months)\", \n ylab = \"Precipitation (in)\")\n\n\n\n\n\n\n\n\nVisually we noticed that there may be a bit of clumping in the values for abundance (this is subtle) – in particular, since we have a lot of very small/nearly zero counts, a transform, such as a square root, may spread things out for the abundances. It also looks like both the abundance and temperature data are more cyclical than the precipitation, and thus more likely to be related to each other. There’s also not visually a lot of indication of a trend, but it’s usually worthwhile to consider it anyway. Replotting the abundance data with a transformation:\n\nmonths<-dim(mozData)[1]\nt<-1:months ## counter for months in the data set\nplot(t, sqrt(mozData$sample_value), type=\"l\", lwd=2, \n main=\"Sqrt Average Monthly Abundance\", \n xlab =\"Time (months)\", \n ylab = \"Average Count\")\n\n\n\n\n\n\n\n\nThat looks a little bit better. I suggest we go with this for our response." 
+ }, + { + "objectID": "VB_TimeDepData_practical.html#building-a-data-frame", + "href": "VB_TimeDepData_practical.html#building-a-data-frame", + "title": "VectorByte Methods Training", + "section": "Building a data frame", + "text": "Building a data frame\nBefore we get into model building, we always want to build a data frame to contain all of the predictors that we want to consider, at the potential lags that we’re interested in. In the lecture we saw building the AR, sine/cosine, and trend predictors:\n\nt <- 2:months ## to make building the AR1 predictors easier\n\nmozTS <- data.frame(\n Y=sqrt(mozData$sample_value[t]), # transformed response\n Yl1=sqrt(mozData$sample_value[t-1]), # AR1 predictor\n t=t, # trend predictor\n sin12=sin(2*pi*t/12), \n cos12=cos(2*pi*t/12) # periodic predictors\n )\n\nWe will also put in the temperature and precipitation predictors. But we need to think about what might be an appropriate lag. If this were daily or weekly data, we’d probably want to have a fairly sizable lag – mosquitoes take a while to develop, so the number we see today is not likely related to the temperature today. However, since these data are agregated across a whole month, as is the temperature/precipitaion, the current month values are likely to be useful. However, it’s even possible that last month’s values may be so we’ll add those in as well:\n\nmozTS$MaxTemp<-mozData$MaxTemp[t] ## current temps\nmozTS$MaxTempl1<-mozData$MaxTemp[t-1] ## previous temps\nmozTS$Precip<-mozData$Precip[t] ## current precip\nmozTS$Precipl1<-mozData$Precip[t-1] ## previous precip\n\nThus our full dataframe:\n\nsummary(mozTS)\n\n Y Yl1 t sin12 \n Min. :0.0000 Min. :0.0000 Min. : 2.0 Min. :-1.00000 \n 1st Qu.:0.2951 1st Qu.:0.2951 1st Qu.:10.5 1st Qu.:-0.68301 \n Median :0.8590 Median :0.8590 Median :19.0 Median : 0.00000 \n Mean :0.7711 Mean :0.7684 Mean :19.0 Mean :-0.01429 \n 3rd Qu.:1.1120 3rd Qu.:1.1120 3rd Qu.:27.5 3rd Qu.: 0.68301 \n Max. :1.7338 Max. :1.7338 Max. :36.0 Max. : 1.00000 \n cos12 MaxTemp MaxTempl1 Precip \n Min. :-1.00000 Min. :16.02 Min. :16.02 Min. : 0.000 \n 1st Qu.:-0.68301 1st Qu.:23.18 1st Qu.:23.18 1st Qu.: 1.918 \n Median : 0.00000 Median :27.23 Median :27.23 Median : 4.631 \n Mean :-0.02474 Mean :26.47 Mean :26.44 Mean : 5.660 \n 3rd Qu.: 0.50000 3rd Qu.:30.79 3rd Qu.:30.79 3rd Qu.: 8.234 \n Max. : 1.00000 Max. :33.31 Max. :33.31 Max. :18.307 \n Precipl1 \n Min. : 0.000 \n 1st Qu.: 1.918 \n Median : 4.631 \n Mean : 5.640 \n 3rd Qu.: 8.234 \n Max. 
:18.307 \n\n\n\nhead(mozTS)\n\n Y Yl1 t sin12 cos12 MaxTemp MaxTempl1\n1 0.1348400 0.0000000 2 8.660254e-01 5.000000e-01 17.87269 17.74602\n2 0.6841675 0.1348400 3 1.000000e+00 6.123234e-17 23.81767 17.87269\n3 1.2724180 0.6841675 4 8.660254e-01 -5.000000e-01 26.03559 23.81767\n4 0.9063270 1.2724180 5 5.000000e-01 -8.660254e-01 30.01602 26.03559\n5 1.7337683 0.9063270 6 1.224647e-16 -1.000000e+00 31.12094 30.01602\n6 1.5430335 1.7337683 7 -5.000000e-01 -8.660254e-01 32.81130 31.12094\n Precip Precipl1\n1 16.5442658 3.3039919\n2 2.4056512 16.5442658\n3 8.9744062 2.4056512\n4 0.5679609 8.9744062\n5 4.8413427 0.5679609\n6 3.8490104 4.8413427" + }, + { + "objectID": "VB_TimeDepData_practical.html#building-a-first-model", + "href": "VB_TimeDepData_practical.html#building-a-first-model", + "title": "VectorByte Methods Training", + "section": "Building a first model", + "text": "Building a first model\nWe will first build a very simple model – just a trend – to practice building the model, checking diagnostics, and plotting predictions.\n\nmod1<-lm(Y ~ t, data=mozTS)\nsummary(mod1)\n\n\nCall:\nlm(formula = Y ~ t, data = mozTS)\n\nResiduals:\n Min 1Q Median 3Q Max \n-0.81332 -0.47902 0.03671 0.37384 0.87119 \n\nCoefficients:\n Estimate Std. Error t value Pr(>|t|) \n(Intercept) 0.904809 0.178421 5.071 1.5e-05 ***\nt -0.007038 0.008292 -0.849 0.402 \n---\nSignif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1\n\nResidual standard error: 0.4954 on 33 degrees of freedom\nMultiple R-squared: 0.02136, Adjusted R-squared: -0.008291 \nF-statistic: 0.7204 on 1 and 33 DF, p-value: 0.4021\n\n\nThe model output indicates that this model is not useful – the trend is not significant and it only explains about 2% of the variability. Let’s plot the predictions:\n\n## plot points and fitted lines\nplot(Y~t, data=mozTS, col=1, type=\"l\")\nlines(t, mod1$fitted, col=\"dodgerblue\", lwd=2)\n\n\n\n\n\n\n\n\nNot good – we’ll definitely need to try something else! Remember that since we’re using a linear model for this, that we should check our residual plots as usual, and then also plot the acf of the residuals:\n\npar(mfrow=c(1,3), mar=c(4,4,2,0.5)) \n\n## studentized residuals vs fitted\nplot(mod1$fitted, rstudent(mod1), col=1,\n xlab=\"Fitted Values\", \n ylab=\"Studentized Residuals\", \n pch=20, main=\"AR 1 only model\")\n\n## qq plot of studentized residuals\nqqnorm(rstudent(mod1), pch=20, col=1, main=\"\" )\nabline(a=0,b=1,lty=2, col=2)\n\n## histogram of studentized residuals\nhist(rstudent(mod1), col=1, \n xlab=\"Studentized Residuals\", \n main=\"\", border=8)\n\n\n\n\n\n\n\n\nThis doesn’t look really bad, although the histogram might be a bit weird. Finally the acf\n\nacf(mod1$residuals)\n\n\n\n\n\n\n\n\nThis is where we can see that we definitely aren’t able to capture the pattern. There’s substantial autocorrelation left at a 1 month lag, and around 6 months.\nFinally, for moving forward, we can extract the BIC for this model so that we can compare with other models that you’ll build next.\n\nn<-length(t)\nextractAIC(mod1, k=log(n))[2]\n\n[1] -44.11057" + }, + { + "objectID": "GP_Practical.html", + "href": "GP_Practical.html", + "title": "VectorByte Methods Training", + "section": "", + "text": "This practical will lead you through fitting a few versions of GPs using two R packages: laGP and hetGP. We will begin with a toy example from the lecture and then move on to a real data example to forecast tick abundances for a NEON site." 
+ }, + { + "objectID": "GP_Practical.html#gaussian-processes", + "href": "GP_Practical.html#gaussian-processes", + "title": "Gaussian Processes: Practical", + "section": "", + "text": "Remember the function we looked at before? \\[Y(x) = 5 \\ \\sin(x)\\] Now, let’s learn how we actually use the library laGP to fit a GP and make predictions at new locations. Let’s begin by loading some libraries\n\nlibrary(mvtnorm)\nlibrary(laGP)\nlibrary(hetGP)\nlibrary(ggplot2)\n\nNow we create the data for our example. (X, y) is our given data,\n\n# number of data points\nn <- 8 \n\n# Inputs - between 0 and 2pi\nX <- matrix(seq(0, 2*pi, length= n), ncol=1) \n\n# Creating the response using the formula\ny <- 5*sin(X)\n\nFirst we use the newGP function to fit our GP. In this function, we need to pass in our inputs, outputs and any known parameters. If we do not know parameters, we can pass in some prior information we know about the parameters (which we will learn shortly) so that calculating the MLE is easier. Here, X indicates the input which must be a matrix, Z is used for response which is a vector, d is used for \\(\\theta\\) (length-scale), a scalar, which we assume to be known and equal to 1 and g is the nugget effect which is also a scalar. We pass in a very small value for numeric stability.\n\n# Fitting GP \ngpi <- newGP(X = X, Z = y, d = 1, g = sqrt(.Machine$double.eps))\n\nNow we have fit the GP and if you print gpi it only outputs a number. This is because it assigns a number to the model as a reference but has everything stored within it. We needn’t get into the details of this in this tutorial. We can assume that our model is stored in gpi and all we see is the reference number.\nWe will use XX as our predictive set i.e. we want to make predictions at those input locations. I have created this as a vector of 100 points between -0.5 and \\(2 \\pi\\) + 0.5. We are essentially using 8 data points to make predictions at 100 points. Let’s see how we do. We make use of the predGP function.\nWe need to pass our GP object as gpi = gpi (in our case) and the predictive input locations XX = XX for this.\n\n# Predictions\n\n# Creating a prediction set: sequence from -0.5 to 2*pi + 0.5 with 100 evenly spaced points\nXX <- matrix(seq(-0.5, 2*pi + 0.5, length= 100), ncol=1)\n\n# Using the function to make predictions using our gpi object\nyy <- predGP(gpi = gpi, XX = XX)\n\n# This will tell us the results that we have stored.\nnames(yy)\n\n[1] \"mean\" \"Sigma\" \"df\" \"llik\" \n\n\nnames(yy) will tell us what is stored in yy. As we can see, we have the mean i.e. the mean predictions \\(\\mu\\) and, Sigma i.e. the covariance matrix \\(\\Sigma\\).\nNow we can draw 100 random draws using the mean and variance matrix we obtained. 
We will also obtain the confidence bounds as shown below:\n\n# Since our posterior is a distribution, we take 100 samples from this.\nYY <- rmvnorm (100, yy$mean, yy$Sigma)\n\n# We calculate the 95% bounds\nq1 <- yy$mean + qnorm(0.025, 0, sqrt(diag(yy$Sigma))) \nq2 <- yy$mean + qnorm(0.975, 0, sqrt(diag(yy$Sigma))) \n\nNow, we will plot the mean prediction and the bounds along with each of our 100 draws.\n\n# Now for the plot\ndf <- data.frame(\n XX = rep(XX, each = 100),\n YY = as.vector(YY),\n Line = factor(rep(1:100, 100))\n)\nggplot() +\n # Plotting all 100 draws from our distribution\n geom_line(aes(x = df$XX, y = df$YY, group = df$Line), color = \"darkgray\", alpha = 0.5,\n linewidth =1.5) +\n # Plotting our data points\n geom_point(aes(x = X, y = y), shape = 20, size = 10, color = \"darkblue\") +\n # Plotting the mean from our 100 draws\n geom_line(aes(x = XX, y = yy$mean), size = 1, linewidth =3) +\n # Adding the True function\n geom_line(aes(x = XX, y = 5*sin(XX)), color = \"blue\", linewidth =3, alpha = 0.8) +\n # Adding quantiles\n geom_line(aes(x = XX, y = q1), linetype = \"dashed\", color = \"red\", size = 1,\n alpha = 0.7,linewidth =2) +\n geom_line(aes(x = XX, y = q2), linetype = \"dashed\", color = \"red\", size = 1,\n alpha = 0.7,linewidth =2) +\n labs(x = \"x\", y = \"y\") +\n # Setting the themes\n theme_minimal() + \n theme(\n axis.text.x = element_text(angle = 45, hjust = 1, size = 25, face = \"bold\"),\n axis.text.y = element_text(size = 25, face = \"bold\"),\n axis.title.y = element_text(margin = margin(r = 10), size = 20, face = \"bold\"),\n axis.title.x = element_text(margin = margin(r = 10), size = 20, face = \"bold\"),\n panel.grid.major = element_blank(),\n panel.grid.minor = element_blank(),\n panel.background = element_rect(fill = \"white\"),\n strip.background = element_rect(fill = \"white\", color = \"white\"),\n strip.text = element_text(color = \"black\")) +\n guides(color = \"none\")\n\n\n\n\n\n\n\n\nLooks pretty cool.\nWe will try all this on a simple dataset: Tick Data from NEON Forecasting Challenge. We will first learn a little bit about this dataset, followed by setting up our predictors and using them in our model to predict tick density for the future season. We will also learn how to fit a separable GP and specify priors for our parameters. Finally, we will learn some basics about a HetGP (Heteroskedastic GP) and try and fit that model as well.\n\n\n\n\nObjective: Forecast tick density for 4 weeks into the future\nSites: The data is collected across 9 different sites, each plot was of size 1600\\(m^2\\) using a drag cloth\nData: Sparse and irregularly spaced. We only have ~650 observations across 10 years at 9 locations\n\nLet’s start with loading all the libraries that we will need, load our data and understand what we have.\n\nlibrary(tidyverse)\nlibrary(laGP)\nlibrary(ggplot2)\n\n# Pulling the data from the NEON data base. 
\ntarget <- readr::read_csv(\"https://data.ecoforecast.org/neon4cast-targets/ticks/ticks-targets.csv.gz\", guess_max = 1e1)\n\n# Visualizing the data\nhead(target)\n\n# A tibble: 6 × 5\n datetime site_id variable observation iso_week\n <date> <chr> <chr> <dbl> <chr> \n1 2015-04-20 BLAN amblyomma_americanum 0 2015-W17\n2 2015-05-11 BLAN amblyomma_americanum 9.82 2015-W20\n3 2015-06-01 BLAN amblyomma_americanum 10 2015-W23\n4 2015-06-08 BLAN amblyomma_americanum 19.4 2015-W24\n5 2015-06-22 BLAN amblyomma_americanum 3.14 2015-W26\n6 2015-07-13 BLAN amblyomma_americanum 3.66 2015-W29\n\n\n\n\n\nFor a GP model, we assume the response (\\(Y\\)) should be normally distributed.\nSince tick density, our response, must be greater than 0, we need to use a transform.\nThe following is the most suitable transform for our application:\n\n\n\\[\\begin{equation}\n\\begin{aligned}\nf(y) \\ & = \\text{log } \\ (y + 1) \\ \\ ; \\ \\ \\ \\\\[2pt]\n<!-- \\ & = \\sqrt{y} \\ \\ \\ \\; \\ \\ \\ otherwise -->\n\\end{aligned}\n\\end{equation}\\]\nWe pass in (\\(response\\) + 1) into this function to ensure we don’;t take a log of 0. We will adjust this in our back transform.\nLet’s write a function for this, as well as the inverse of the transform.\n\n# transforms y\nf <- function(x) {\n y <- log(x + 1)\n return(y)\n}\n\n# This function back transforms the input argument\nfi <- function(y) {\n x <- exp(y) - 1\n return(x)\n}\n\n\n\n\n\nThe goal is to forecast tick populations for a season so our response (Y) here, is the tick density. However, we do not have a traditional data set with an obvious input space. What is the X? \n\nWe made a few plots earlier to help us identify what can be useful:\n\n\\(X_1\\) Iso-week: This is the iso-week number\nLet’s convert the iso-week from our target dataset as a numeric i.e. a number. Here is a function to do the same.\n\n# This function tells us the iso-week number given the date\nfx.iso_week <- function(datetime){\n # Gives ISO-week in the format yyyy-w## and we extract the ##\n x1 <- as.numeric(stringr::str_sub(ISOweek::ISOweek(datetime), 7, 8)) # find iso week #\n return(x1)\n}\n\ntarget$week <- fx.iso_week(target$datetime)\nhead(target)\n\n# A tibble: 6 × 6\n datetime site_id variable observation iso_week week\n <date> <chr> <chr> <dbl> <chr> <dbl>\n1 2015-04-20 BLAN amblyomma_americanum 0 2015-W17 17\n2 2015-05-11 BLAN amblyomma_americanum 9.82 2015-W20 20\n3 2015-06-01 BLAN amblyomma_americanum 10 2015-W23 23\n4 2015-06-08 BLAN amblyomma_americanum 19.4 2015-W24 24\n5 2015-06-22 BLAN amblyomma_americanum 3.14 2015-W26 26\n6 2015-07-13 BLAN amblyomma_americanum 3.66 2015-W29 29\n\n\n\n\\(X_2\\) Sine wave: We use this to give our model phases. We can consider this as a proxy to some other variables such as temperature which would increase from Jan to about Jun-July and then decrease. We use the following sin wave\n\n\\(X_2 = \\left( \\text{sin} \\ \\left( \\frac{2 \\ \\pi \\ X_1}{106} \\right) \\right)^2\\) where, \\(X_1\\) is the iso-week.\nUsually, a Sin wave for a year would have the periodicity of 53 to indicate 53 weeks. Why have we chosen 106 as our period? 
And we do we square it?\nLet’s use a visual to understand that.\n\nx <- c(1:106)\nsin_53 <- sin(2*pi*x/53)\nsin_106 <- (sin(2*pi*x/106))\nsin_106_2 <- (sin(2*pi*x/106))^2\n\npar(mfrow=c(1, 3), mar = c(4, 5, 4, 1), cex.axis = 2, cex.lab = 2, cex.main = 3, font.lab = 2)\nplot(x, sin_53, col = 2, pch = 19, ylim = c(-1, 1), ylab = \"sin wave\", main = \"period = 53\")\nabline(h = 0, lwd = 2)\nplot(x, sin_106, col = 3, pch = 19, ylim = c(-1, 1), ylab = \"sin wave\", main = \"period = 106\")\nabline(h = 0, lwd = 2)\nplot(x, sin_106_2, col = 4, pch = 19, ylim = c(-1, 1), ylab = \"sin wave\", main = \"period = 106 squared\")\nabline(h = 0, lwd = 2)\n\n\n\n\n\n\n\n\nSome observations:\n\nThe sin wave (period 53) goes increases from (0, 1) and decreases all the way to -1 before coming back to 0, all within the 53 weeks in the year. But this is not what we want to achieve.\nWe want the function to increase from Jan - Jun and then start decreasing till Dec. This means, we need a regular sin-wave to span 2 years so we can see this.\nWe also want the next year to repeat the same pattern i.e. we want to restrict it to [0, 1] interval. Thus, we square the sin wave.\n\n\nfx.sin <- function(datetime, f1 = fx.iso_week){\n # identify iso week#\n x <- f1(datetime) \n # calculate sin value for that week\n x2 <- (sin(2*pi*x/106))^2 \n return(x2)\n}\n\nFor a GP, it’s also useful to ensure that all our X’s are between 0 and 1. Usually this is done by using the following method\n\\(X_i^* = \\frac{X_i - \\min(X)}{\\max(X) - \\min(X) }\\) where \\(X = (X_1, X_2 ...X_n)\\)\n\\(X^* = (X_1^*, X_2^* ... X_n^*)\\) will be the standarized \\(X\\)’s with all \\(X_i^*\\) in the interval [0, 1].\nWe can either write a function for this, or in our case, we can just divide Iso-week by 53 since that would result effectively be the same. Our Sin Predictor already lies in the interval [0, 1].\n\n\n\n\nNow, let’s start with modelling. We will start with one random location out of the 9 locations.\n\n# Choose a random site number: Anything between 1-9.\nsite_number <- 6\n\n# Obtaining site name\nsite_names <- unique(target$site_id)\n\n# Subsetting all the data at that location\ndf <- subset(target, target$site_id == site_names[site_number])\nhead(df)\n\n# A tibble: 6 × 6\n datetime site_id variable observation iso_week week\n <date> <chr> <chr> <dbl> <chr> <dbl>\n1 2014-06-09 SCBI amblyomma_americanum 75.9 2014-W24 24\n2 2014-06-30 SCBI amblyomma_americanum 28.3 2014-W27 27\n3 2014-07-21 SCBI amblyomma_americanum 0 2014-W30 30\n4 2014-07-28 SCBI amblyomma_americanum 10.1 2014-W31 31\n5 2014-08-11 SCBI amblyomma_americanum 4.94 2014-W33 33\n6 2014-10-20 SCBI amblyomma_americanum 0 2014-W43 43\n\n\nWe will also select only those columns that we are interested in i.e. datetime and obervation. We don’t need site since we are only using one of them.\n\n# extracting only the datetime and obs columns\ndf <- df[, c(\"datetime\", \"observation\")]\n\nWe will use one site at first and fit a GP and make predictions. For this we first need to divide our data into a training set and a testing set. Since we have time series, we want to divide the data sequentially, i.e. we pick a date and everything before the date is our training set and after is our testing set where we check how well our model performs. 
We choose the date 2020-12-31.\n\n# Selecting a date before which we consider everything as training data and after this is testing data.\ncutoff = as.Date('2020-12-31')\ndf_train <- subset(df, df$datetime <= cutoff)\ndf_test <- subset(df, df$datetime > cutoff)\n\n\n\nNow we will setup our X’s. We already have the functions to do this and can simply pass in the datetime. We then combine \\(X_1\\) and \\(X_2\\) to create out input matrix \\(X\\). Remember, everything is ordered as in our dataset.\n\n# Setting up iso-week and sin wave predictors by calling the functions\nX1 <- fx.iso_week(df_train$datetime) # range is 1-53\nX2 <- fx.sin(df_train$datetime) # range is 0 to 1\n\n# Centering the iso-week by diving by 53\nX1c <- X1/ 53\n\n# We combine columns centered X1 and X2, into a matrix as our input space\nX <- as.matrix(cbind.data.frame(X1c, X2))\nhead(X)\n\n X1c X2\n[1,] 0.4528302 0.9782005\n[2,] 0.5094340 0.9991219\n[3,] 0.5660377 0.9575728\n[4,] 0.5849057 0.9305218\n[5,] 0.6226415 0.8587536\n[6,] 0.8113208 0.3120862\n\n\nNext step is to tranform the response to ensure it is normal.\n\n# Extract y: observation from our training model. \ny_obs <- df_train$observation\n\n# Transform the response\ny <- f(y_obs)\n\nNow, we can use the laGP library to fit a GP. First, we specify priors using darg and garg. We will specify a minimum and maximum for our arguments. We need to pass the input space for darg and the output vector for garg. You can look into the functions using ?function in R. We set the minimum to a very small value rather than 0 to ensure numeric stability.\n\n# A very small value for stability\neps <- sqrt(.Machine$double.eps) \n \n# Priors for theta and g. \nd <- darg(list(mle=TRUE, min =eps, max=5), X)\ng <- garg(list(mle=TRUE, min = eps, max = 1), y)\n\nNow, to fit the GP, we use newGPsep. We pass the input matrix and the response vector with some values of the parameters. Then, we use the jmleGPsep function to jointly estimate \\(\\theta\\) and \\(g\\) using MLE method. dK allows the GP object to store derivative information which is needed for MLE calculations. newGPsep will fit a separable GP as opposed to newGP which would fit an isotropic GP.\n\n# Fitting a GP with our data, and some starting values for theta and g\ngpi <- newGPsep(X, y, d = 0.1, g = 1, dK = T)\n\n# Jointly infer MLE for all parameters\nmle <- jmleGPsep(gpi, drange = c(d$min, d$max), grange = c(g$min, g$max), \n dab = d$ab, gab= g$ab)\n\nNow, we will create a grid from the first week in our dataset to 1 year into the future, and predict on the entire time series. We use predGPsep to make predictions.\n\n# Create a grid from start date in our data set to one year in future (so we forecast for next season)\nstartdate <- as.Date(min(df$datetime))# identify start week\ngrid_datetime <- seq.Date(startdate, Sys.Date() + 365, by = 7) # create sequence from \n\n# Build the inpu space for the predictive space (All weeks from 04-2014 to 07-2025)\nXXt1 <- fx.iso_week(grid_datetime)\nXXt2 <- fx.sin(grid_datetime)\n\n# Standardize\nXXt1c <- XXt1/53\n\n# Store inputs as a matrix\nXXt <- as.matrix(cbind.data.frame(XXt1c, XXt2))\n\n# Make predictions using predGP with the gp object and the predictive set\nppt <- predGPsep(gpi, XXt) \n\nStoring the mean and calculating quantiles.\n\n# Now we store the mean as our predicted response i.e. 
density along with quantiles\nyyt <- ppt$mean\nq1t <- ppt$mean + qnorm(0.025,0,sqrt(diag(ppt$Sigma))) #lower bound\nq2t <- ppt$mean + qnorm(0.975,0,sqrt(diag(ppt$Sigma))) # upper bound\n\nNow we can plot our data and predictions and see how well our model performed. We need to back transform our predictions to the original scale.\n\n# Back transform our data to original\ngp_yy <- fi(yyt)\ngp_q1 <- fi(q1t)\ngp_q2 <- fi(q2t)\n\n# Plot the observed points\nplot(as.Date(df$datetime), df$observation,\n main = paste(site_names[site_number]), col = \"black\",\n xlab = \"Dates\" , ylab = \"Abundance\",\n # xlim = c(as.Date(min(df$datetime)), as.Date(cutoff)),\n ylim = c(min(df_train$observation, gp_yy, gp_q1), max(df_train$observation, gp_yy, gp_q2)* 1.05))\n\n# Plot the testing set data \npoints(as.Date(df_test$datetime), df_test$observation, col =\"black\", pch = 19)\n\n# Line to indicate seperation between train and test data\nabline(v = as.Date(cutoff), lwd = 2)\n\n# Add the predicted response and the quantiles\nlines(grid_datetime, gp_yy, col = 4, lwd = 2)\nlines(grid_datetime, gp_q1, col = 4, lwd = 1.2, lty = 2)\nlines(grid_datetime, gp_q2, col = 4, lwd = 1.2, lty =2)\n\n\n\n\n\n\n\n\nThat looks pretty good? We can also look at the RMSE to see how the model performs. It is better o do this on the transformed scale. We will use yyt for this. We need to find those predictions which correspond to the datetime in our testing dataset df_test.\n\n# Obtain true observed values for testing set\nyt_true <- f(df_test$observation)\n\n# FInd corresponding predictions from our model in the grid we predicted on\nyt_pred <- yyt[which(grid_datetime %in% df_test$datetime)]\n\n# calculate RMSE\nrmse <- sqrt(mean((yt_true - yt_pred)^2))\nrmse\n\n[1] 0.8624903\n\n\nNext, we can attempt a hetGP.\n\n\n\nWe are now interested in fitting a vector of nuggets rather than a single value.\nLet’s use the same data we have to fit a hetGP. We already have our data (X, y) as well as our prediction set XXt. We use the mleHetGP command to fit a GP and pass in our data. The default covariance structure is the Squared Exponential structure. We use the predict function in base R and pass the hetGP object i.e. het_gpi to make predictions on our set XXt.\n\n# create predictors\nX1 <- fx.iso_week(df_train$datetime)\nX2 <- fx.sin(df_train$datetime)\n\n# standardize and put into matrix\nX1c <- X1/53\nX <- as.matrix(cbind.data.frame(X1c, X2))\n\n# Build prediction grid (From 04-2014 to 07-2025)\nXXt1 <- fx.iso_week(grid_datetime)\nXXt2 <- fx.sin(grid_datetime)\n\n# standardize and put into matrix\nXXt1c <- XXt1/53\nXXt <- as.matrix(cbind.data.frame(XXt1c, XXt2))\n\n# Transform the training response\ny_obs <- df_train$observation\ny <- f(y_obs)\n\n# Fit a hetGP model. X must be s matrix and nrow(X) should be same as length(y)\nhet_gpi <- hetGP::mleHetGP(X = X, Z = y)\n\n# Predictions using the base R predict command with a hetGP object and new locationss\nhet_ppt <- predict(het_gpi, XXt)\n\nNow we obtain the mean and the confidence bounds as well as transform the data to the original scale.\n\n# Mean density for predictive locations and Confidence bounds\nhet_yyt <- het_ppt$mean\nhet_q1t <- qnorm(0.975, het_ppt$mean, sqrt(het_ppt$sd2 + het_ppt$nugs))\nhet_q2t <- qnorm(0.025, het_ppt$mean, sqrt(het_ppt$sd2 + het_ppt$nugs)) \n\n# Back transforming to original scale\nhet_yy <- fi(het_yyt)\nhet_q1 <- fi(het_q1t)\nhet_q2 <- fi(het_q2t)\n\nWe can now plot the results similar to before. 
[Uncomment the code lines to see how a GP vs a HetGP fits the data]\n\n# Plot Original data\nplot(as.Date(df$datetime), df$observation,\n main = paste(site_names[site_number]), col = \"black\",\n xlab = \"Dates\" , ylab = \"Abundance\",\n # xlim = c(as.Date(min(df$datetime)), as.Date(cutoff)),\n ylim = c(min(df_train$observation, het_yy, het_q2), max(df_train$observation, het_yy, het_q1)* 1.2))\n\n# Add testing observations\npoints(as.Date(df_test$datetime), df_test$observation, col =\"black\", pch = 19)\n\n# Line to indicate our cutoff point\nabline(v = as.Date(cutoff), lwd = 2)\n\n# HetGP Model mean predictions and bounds.\nlines(grid_datetime, het_yy, col = 2, lwd = 2)\nlines(grid_datetime, het_q1, col = 2, lwd = 1.2, lty = 2)\nlines(grid_datetime, het_q2, col = 2, lwd = 1.2, lty =2)\n\n## GP model fits for the same data\n# lines(grid_datetime, gp_yy, col = 3, lwd = 2)\n# lines(grid_datetime, gp_q1, col = 3, lwd = 1.2, lty = 2)\n# lines(grid_datetime, gp_q2, col = 3, lwd = 1.2, lty =2)\n\n# Legend colours match the lines above: GP preds are drawn in col 3, hetGP preds in col 2\nlegend(\"topleft\", legend = c(\"Train Y\",\"Test Y\", \"GP preds\", \"HetGP preds\"),\n col = c(1, 1, 3, 2), lty = c(NA, NA, 1, 1),\n pch = c(1, 19, NA, NA), cex = 0.5)\n\n\n\n\n\n\n\n\nThe mean predictions of a GP are similar to those of a hetGP, but the confidence bounds are different. A hetGP produces slightly tighter bounds.\nWe can also compare the RMSEs using the predictions of the hetGP model.\n\nyt_true <- f(df_test$observation) # Original data\nhet_yt_pred <- het_yyt[which(grid_datetime %in% df_test$datetime)] # model preds\n\n# calculate rmse for hetGP model\nrmse_het <- sqrt(mean((yt_true - het_yt_pred)^2))\nrmse_het\n\n[1] 0.8835813\n\n\nNow that we have learnt how to fit a GP and a hetGP, it’s time for a challenge.\nTry a hetGP on our sin example from before (but this time I have added noise).\n\n# Your turn\nset.seed(26)\nn <- 8 # number of points\nX <- matrix(seq(0, 2*pi, length= n), ncol=1) # build inputs \ny <- 5*sin(X) + rnorm(n, 0 , 2) # response with some noise\n\n# Predict on this set\nXX <- matrix(seq(-0.5, 2*pi + 0.5, length= 100), ncol=1)\n\n# Data visualization\nplot(X, y)\n\n\n\n\n\n\n\n# Add code to fit a hetGP model and visualise it as above\n\n\n\n\n\n\nFit a GP Model for the location “SERC” i.e. site_number = 7.\nUse an environmental predictor in your model. The following is a function fx.green that creates the variable given the datetime and the location.\n\nHere is a snippet of the supporting file that you will use; you can look into the data.frame and try to plot ker for one site at a time and see what it yields.\n\nsource('code/df_spline.R') # sources the script to make the greenness predictor\nhead(df_green) # how the dataset looks\n\n site iso ker\n1 BLAN 1 0\n2 BLAN 2 0\n3 BLAN 3 0\n4 BLAN 4 0\n5 BLAN 5 0\n6 BLAN 6 0\n\n# The function to create the environmental predictor similar to iso-week and sin wave\nfx.green <- function(datetime, site, site_info = df_green){\n ker <- NULL\n iso <- fx.iso_week(datetime) # identify iso week\n df.iso <- cbind.data.frame(datetime, iso) # combine date with iso week\n sites.ker <- site_info[site_info$site == site, 2:3] # obtain kernel for the chosen location\n df.green <- df.iso %>% left_join(sites.ker, by = 'iso') # join dataframes by iso week\n ker <- df.green$ker # return kernel\n return(ker)\n}\n\n\nFit a GP Model for all the locations (More advanced).\n\nHint: Write a function that can fit a GP and make predictions. Then, write a for loop where you subset the data for each location and then call the function to fit the GP."
+ }, + { + "objectID": "GP_Notes.html#gaussian-process-prior", + "href": "GP_Notes.html#gaussian-process-prior", + "title": "VectorByte Methods Training", + "section": "Gaussian Process Prior", + "text": "Gaussian Process Prior\nIn our setup, we assume that the data follows a Multivariate Normal Distribution. We can think of this as a Prior. Mathematically, we can write it as:\nY_n \\sim N \\ ( \\ \\mu \\ , \\ \\Sigma_n \\ )\nHere, Y and \\mu is an n \\times 1 vector and \\Sigma_n is a positive semi definite matrix. This means that,\nx^T \\Sigma_n x > 0 \\ \\text{ for all } \\ x \\neq 0.\nFor our purposes, we will consider \\mu = 0.\nIn Simple Linear Regression, we assume \\Sigma_n = \\sigma^2 \\mathbb{I}. This means that Y_1 \\ , Y_2 \\ \\ ... \\ \\ Y_n are uncorrelated with each other. However, In a GP, we assume that there is some correlation between the responses. A common covariance function is the squared exponential kernel, which invovles the Euclidean distance i.e. for two inputs x and x', the correlation is defined as,\n\\Sigma(x, x') = \\exp \\left( \\ \\vert \\vert{x - x'} \\vert \\vert^2 \\ \\right)\nThis creates a positive semi-definite correlation kernel. It also uses the proximity of x and x' as a measure of correlation i.e. the closer two points in the input space are, the more highly their corresponding responses are correlated. We will learn the exact form of \\Sigma_n later in the tutorial. First, we need to learn about MVN Distribution and the Posterior Distribution given the data." + }, + { + "objectID": "GP_Notes.html#multivariate-normal-distribution", + "href": "GP_Notes.html#multivariate-normal-distribution", + "title": "VectorByte Methods Training", + "section": "Multivariate Normal Distribution", + "text": "Multivariate Normal Distribution\nSuppose we have X = (X_1, X_2)\n\\begin{equation}\nX =\n\\begin{bmatrix}\nX_1 \\\\\nX_2 \\\\\n\\end{bmatrix} \\; , \\;\n\\mu =\n\\begin{bmatrix}\n\\mu_1\\\\\n\\mu_2\\\\\n\\end{bmatrix}\\; , \\;\n\\end{equation} where X_1 and \\mu_1 are q \\times 1 and X_2 and \\mu_2 are (N-q) \\times 1.\n\\begin{equation}\n\\Sigma =\n\\begin{bmatrix}\n\\Sigma_{11} & \\Sigma_{12}\\\\\n\\Sigma_{21} & \\Sigma_{22}\\\\\n\\end{bmatrix}\n\\; \\; \\;\n\\\\[5pt]\n\\text{with dimensions } \\; \\;\n\\begin{bmatrix}\nq \\times q & q \\times (N-q) \\\\\n(N -q) \\times q & (N-q) \\times (N-q)\\\\\n\\end{bmatrix}\n\\;\n\\end{equation}\nThen, we have\n\\begin{equation}\n\\begin{bmatrix}\nX_1 \\\\\nX_2 \\\\\n\\end{bmatrix}\n\\ \\sim \\ \\mathcal{N}\n\\left(\n\\;\n\\begin{bmatrix}\n\\mu_1 \\\\\n\\mu_2 \\\\\n\\end{bmatrix}\\; , \\;\n\\begin{bmatrix}\n\\Sigma_{11} & \\Sigma_{12}\\\\\n\\Sigma_{21} & \\Sigma_{22}\\\\\n\\end{bmatrix}\n\\;\n\\right)\n\\\\[5pt]\n\\end{equation}\nNow we can derive the conditional distribution of X_1 \\vert X_2 using properties of MVN.\nX_1 \\vert X_2 \\ \\sim \\mathcal{N} (\\mu_{X_1 \\vert X_2}, \\ \\Sigma_{X_1 \\vert X_2})\nwhere,\n\\mu_{X_1 \\vert X_2} = \\mu_1 + \\Sigma_{12}\\Sigma_{22}^{-1}(x_2 - \\mu_2)\n\\Sigma_{X_1 \\vert X_2} = \\Sigma_{11} - \\Sigma_{12}\\Sigma_{22}^{-1} \\Sigma_{21}\nNow, let’s look at this in our context.\nSuppose we have, D_n = (X_n, Y_n) where Y_n \\sim N \\ ( \\ 0 \\ , \\ \\Sigma_n \\ ). Now, for a new location x_p, we need to find the distribution ofY(x_p).\nWe want to find the distribution of Y(x_p) \\ \\vert \\ D_n. Using the information from above, we know this is normally distributed and we need to identify then mean and variance. 
Thus, we have\n\begin{equation}\n\begin{aligned}\nY(x_p) \vert \ D_n \ & \sim \mathcal{N} \left(\mu(x_p) \ , \ \sigma^2(x_p) \right) \; \; \text{where, }\\[3pt]\n\mu(x_p) \ & = \Sigma(x_p, X_n) \Sigma_n^{-1}Y_n \; \;\\[3pt]\n\sigma^2(x_p) \ & = \Sigma(x_p, x_p) - \Sigma(x_p, X_n) \Sigma_n^{-1}\Sigma(X_n, x_p) \\[3pt]\n\end{aligned}\n\end{equation}" + }, + { + "objectID": "GP_Notes.html#example-gp-for-toy-example", + "href": "GP_Notes.html#example-gp-for-toy-example", + "title": "VectorByte Methods Training", + "section": "Example: GP for Toy Example", + "text": "Example: GP for Toy Example\nSuppose we have data from the following function, Y(x) = 5 \ \sin (x)\nNow we use the above, and try to obtain Y(x_p) \vert Y. Here is a visual of what a GP prediction for this function looks like. Each gray line is a draw from our predicted normal distribution, the blue line is the truth and the black line is the mean prediction from our GP. The red lines are confidence bounds. Pretty good? Let’s learn how we do that." + }, + { + "objectID": "GP_Notes.html#covariance-function", + "href": "GP_Notes.html#covariance-function", + "title": "VectorByte Methods Training", + "section": "Covariance Function", + "text": "Covariance Function\nAs mentioned before, the main action is in the specification of \Sigma_n. Let’s express \Sigma_n = \tau^2 C_n where C_n is the correlation function. We will be using the squared exponential distance-based correlation function. The kernel can be written down mathematically as,\nC_n = \exp \left( -\frac{\vert\vert x - x' \vert \vert ^2}{\theta} \right ) + g \mathbb{I_n} \nHere, if x and x' are closer in distance, the responses are more highly correlated. Along with our input space we also notice three other parameters, which in this context are referred to as hyper-parameters as they are used for fine-tuning our predictions as opposed to directly affecting them.\nWe have three main hyper-parameters here:\n\n\tau^2: Scale\nThis parameter dictates the amplitude of our function. An MVN Distribution with scale = 1 will usually have data between -2 and 2. As the scale increases, this range expands. Here is a plot that shows two different scales for the same function, with all other parameters fixed.\n\nlibrary(mvtnorm)\nlibrary(laGP)\n\nset.seed(24)\nn <- 100\nX <- as.matrix(seq(0, 20, length.out = n))\nDx <- laGP::distance(X)\n\ng <- sqrt(.Machine$double.eps)\nCn <- (exp(-Dx) + diag(g, n))\n\nY <- rmvnorm(1, sigma = Cn)\n\nset.seed(28)\ntau2 <- 25\nY_scaled <- rmvnorm(1, sigma = tau2 * Cn)\n\npar(mfrow = c(1, 2), mar = c(5, 5, 4, 2), cex.axis = 2, cex.lab = 2, cex.main = 3, font.lab = 2)\n\n# Plot 1\nmatplot(X, t(Y), type = 'l', main = expression(paste(tau^2, \" = 1\")), \n ylab = \"Y\", xlab = \"X\", lwd = 2, col = \"blue\")\n\n# Plot 2\nmatplot(X, t(Y_scaled), type = 'l', main = expression(paste(tau^2, \" = 25\")), \n ylab = \"Y\", xlab = \"X\", lwd = 2, col = \"red\")\n\n\n\n\n\n\n\n\n\n\n\theta: Length-scale\nThe length-scale controls the wiggliness of the function. It is also referred to as the rate of decay of correlation. The smaller its value, the more wiggly the function gets. This is because we expect the change in directionality of the function to be rather quick. 
We will once again demonstrate how a difference in magnitude of \theta affects the function, keeping everything else constant.\n\nset.seed(1)\nn <- 100\nX <- as.matrix(seq(0, 10, length.out = n))\nDx <- laGP::distance(X)\n\ng <- sqrt(.Machine$double.eps)\ntheta1 <- 0.5\nCn <- (exp(-Dx/theta1) + diag(g, n))\n\nY <- rmvnorm(1, sigma = Cn)\n\ntheta2 <- 5\nCn <- (exp(-Dx/theta2) + diag(g, n))\n\nY2 <- rmvnorm(1, sigma = Cn)\n\npar(mfrow = c(1, 2), mar = c(5, 5, 4, 2), cex.axis = 2, cex.lab = 2, cex.main = 3, font.lab = 2)\nmatplot(X, t(Y), type= 'l', main = expression(paste(theta, \" = 0.5\")),\n ylab = \"Y\", ylim = c(-2.2, 2.2), lwd = 2, col = \"blue\")\nmatplot(X, t(Y2), type= 'l', main = expression(paste(theta, \" = 5\")),\n ylab = \"Y\", ylim = c(-2.2, 2.2), lwd = 2, col = \"red\")\n\n\n\n\n\n\n\n\nAn extension: Anisotropic GP\nConsider a multi-dimensional input setup, where X_{n \times m} = (\underline{X}_1, ... \underline{X}_m). Here, the input space is m-dimensional and we have n observations. We can adjust the kernel so that each dimension has its own \theta i.e. the rate of decay of correlation is different from one dimension to another. This can be done by simply modifying the correlation function, and writing it as,\nC_n = \exp \left( - \sum_{k=1}^{m} \frac{(x_k - x'_k)^2}{\theta_k} \right ) + g \mathbb{I_n} \nThis type of modelling is also referred to as a separable GP since the exponential of the sum factorizes into a product of m independent one-dimensional kernels. If we set all \theta_k = \theta, it would be an isotropic GP.\n\n\ng: Nugget\nThis parameter adds some noise into the function. For g > 0, it determines the magnitude of discontinuity as x' tends to x. It is also called the nugget effect. For g = 0, there would be no noise and the function would interpolate between the points. This effect is only added to the diagonals of the correlation matrix. We never add g to the off-diagonal elements. This also allows for numeric stability. Below is a snippet of what different magnitudes of g look like.\n\nlibrary(mvtnorm)\nlibrary(laGP)\n\nn <- 100\nX <- as.matrix(seq(0, 10, length.out = n))\nDx <- laGP::distance(X)\n\ng <- sqrt(.Machine$double.eps)\nCn <- (exp(-Dx) + diag(g, n))\nY <- rmvnorm(1, sigma = Cn)\n\nCn <- (exp(-Dx) + diag(1e-2, n)) # correlation matrix with a larger nugget (g = 0.01)\n\nL <- rmvnorm(1, sigma = diag(1e-2, n)) # independent noise with variance 0.01\nY2 <- Y + L\n\npar(mfrow = c(1, 2), mar = c(5, 5, 4, 2), cex.axis = 2, cex.lab = 2, cex.main = 3, font.lab = 2)\nplot(X, t(Y), main = expression(paste(g, \" < 1e-8\")),\n ylab = \"Y\", xlab = \"X\", pch = 19, cex = 1.5, col = 1)\nlines(X, t(Y), col = \"blue\", lwd = 3) \n\nplot(X, t(Y2), main = expression(paste(g, \" = 0.01\")),\n ylab = \"Y\", xlab = \"X\", pch = 19, cex = 1.5, col = 1)\nlines(X, t(Y), col = \"blue\", lwd = 3)\n\n\n\n\n\n\n\n\nAn extension: Heteroskedastic GP\nWe will study this in some detail later, but here instead of using one nugget g for the entire model, we use a vector of nuggets \Lambda; one unique nugget for each unique input i.e. simply put, a different value gets added to each diagonal element.\nBack to GP fitting\nFor now, let’s get back to GP fitting and learn how to use it. We have already seen a small example of the laGP package in action. However, we had not used any of the hyper-parameters in that case. We assumed we knew all the information. However, that is not always the case. 
Without getting into the nitty-gritty details, here is how we obtain our parameters when we have some real data D_n = (X_n, Y_n).\n\ng and \theta: Estimates can be obtained by maximizing the likelihood (MLE). This is done using numerical algorithms.\n\tau^2: An estimate is obtained as a closed form solution once we plug in the estimates of \theta and g." + }, + { + "objectID": "GP_Notes.html#gaussian-processes", + "href": "GP_Notes.html#gaussian-processes", + "title": "VectorByte Methods Training", + "section": "", + "text": "A Gaussian Process (GP) is a non-parametric and flexible method to model data. Here, we assume that the data follow a Multivariate Normal Distribution. It is widely used in the field of spatial statistics, where it is commonly known as kriging, as well as in machine learning as a surrogate model due to its ability to make fast predictions with good uncertainty quantification.\nIn spatial statistics, we want to model the relationship between the distance between locations and the response of interest. A GP allows us to do that as we will see in this tutorial. A different application of a GP involves its use as a surrogate. A surrogate model is used to approximate a computer model and/or field experiments where running the experiments may be too costly, too time consuming, or infeasible.\nFor example, suppose we wish to estimate the amount of energy released when a bomb explodes. Conducting this experiment repeatedly a large number of times to collect data seems infeasible. In such cases, we can fit a surrogate model to the field data and make predictions for different input locations.\nIn the field of ecology, as we will see in this tutorial, we will use a Gaussian Process model as a Non-Parametric Regression tool, similar to Linear Regression. We will assume our data follows a GP, and use Bayesian Methods to infer the distribution at new locations given the data. The predictions obtained will have good uncertainty quantification which is of course valuable in this field due to the noisy nature of our data." + }, + { + "objectID": "GP_Practical.html#overview-of-the-data", + "href": "GP_Practical.html#overview-of-the-data", + "title": "VectorByte Methods Training", + "section": "Overview of the Data", + "text": "Overview of the Data\n\nObjective: Forecast tick density for 4 weeks into the future\nSites: The data is collected across 9 different sites, each plot was of size 1600m^2, sampled using a drag cloth\nData: Sparse and irregularly spaced. We only have ~650 observations across 10 years at 9 locations\n\nLet’s start with loading all the libraries that we will need, load our data and understand what we have.\n\nlibrary(tidyverse)\nlibrary(laGP)\nlibrary(ggplot2)\n\n# Pulling the data from the NEON database. 
\ntarget <- readr::read_csv(\"https://data.ecoforecast.org/neon4cast-targets/ticks/ticks-targets.csv.gz\", guess_max = 1e1)\n\n# Visualizing the data\nhead(target)\n\n# A tibble: 6 × 5\n datetime site_id variable observation iso_week\n <date> <chr> <chr> <dbl> <chr> \n1 2015-04-20 BLAN amblyomma_americanum 0 2015-W17\n2 2015-05-11 BLAN amblyomma_americanum 9.82 2015-W20\n3 2015-06-01 BLAN amblyomma_americanum 10 2015-W23\n4 2015-06-08 BLAN amblyomma_americanum 19.4 2015-W24\n5 2015-06-22 BLAN amblyomma_americanum 3.14 2015-W26\n6 2015-07-13 BLAN amblyomma_americanum 3.66 2015-W29" + }, + { + "objectID": "GP_Practical.html#initial-setup", + "href": "GP_Practical.html#initial-setup", + "title": "VectorByte Methods Training", + "section": "Initial Setup", + "text": "Initial Setup\n\nFor a GP model, we assume the response (Y) should be normally distributed.\nSince tick density, our response, must be greater than 0, we need to use a transform.\nThe following is the most suitable transform for our application:\n\n\n\begin{equation}\nf(y) = \text{log} \ (y + 1)\n\end{equation}\nWe pass (response + 1) into this function to ensure we don’t take a log of 0. We will adjust this in our back transform.\nLet’s write a function for this, as well as the inverse of the transform.\n\n# transforms y\nf <- function(x) {\n y <- log(x + 1)\n return(y)\n}\n\n# This function back transforms the input argument\nfi <- function(y) {\n x <- exp(y) - 1\n return(x)\n}" + }, + { + "objectID": "GP_Practical.html#predictors", + "href": "GP_Practical.html#predictors", + "title": "VectorByte Methods Training", + "section": "Predictors", + "text": "Predictors\n\nThe goal is to forecast tick populations for a season, so our response (Y) here is the tick density. However, we do not have a traditional data set with an obvious input space. What is the X? \n\nWe made a few plots earlier to help us identify what can be useful:\n\nX_1 Iso-week: This is the iso-week number\nLet’s convert the iso-week from our target dataset to a numeric i.e. a number. Here is a function to do the same.\n\n# This function tells us the iso-week number given the date\nfx.iso_week <- function(datetime){\n # Gives ISO-week in the format yyyy-w## and we extract the ##\n x1 <- as.numeric(stringr::str_sub(ISOweek::ISOweek(datetime), 7, 8)) # find iso week #\n return(x1)\n}\n\ntarget$week <- fx.iso_week(target$datetime)\nhead(target)\n\n# A tibble: 6 × 6\n datetime site_id variable observation iso_week week\n <date> <chr> <chr> <dbl> <chr> <dbl>\n1 2015-04-20 BLAN amblyomma_americanum 0 2015-W17 17\n2 2015-05-11 BLAN amblyomma_americanum 9.82 2015-W20 20\n3 2015-06-01 BLAN amblyomma_americanum 10 2015-W23 23\n4 2015-06-08 BLAN amblyomma_americanum 19.4 2015-W24 24\n5 2015-06-22 BLAN amblyomma_americanum 3.14 2015-W26 26\n6 2015-07-13 BLAN amblyomma_americanum 3.66 2015-W29 29\n\n\n\nX_2 Sine wave: We use this to give our model phases. We can consider this as a proxy for other variables such as temperature, which would increase from Jan to about Jun-July and then decrease. We use the following sin wave\n\nX_2 = \left( \text{sin} \ \left( \frac{2 \ \pi \ X_1}{106} \right) \right)^2 where X_1 is the iso-week.\nUsually, a sin wave for a year would have a period of 53 to indicate 53 weeks. Why have we chosen 106 as our period? 
And why do we square it?\nLet’s use a visual to understand that.\n\nx <- c(1:106)\nsin_53 <- sin(2*pi*x/53)\nsin_106 <- (sin(2*pi*x/106))\nsin_106_2 <- (sin(2*pi*x/106))^2\n\npar(mfrow=c(1, 3), mar = c(4, 5, 4, 1), cex.axis = 2, cex.lab = 2, cex.main = 3, font.lab = 2)\nplot(x, sin_53, col = 2, pch = 19, ylim = c(-1, 1), ylab = \"sin wave\", main = \"period = 53\")\nabline(h = 0, lwd = 2)\nplot(x, sin_106, col = 3, pch = 19, ylim = c(-1, 1), ylab = \"sin wave\", main = \"period = 106\")\nabline(h = 0, lwd = 2)\nplot(x, sin_106_2, col = 4, pch = 19, ylim = c(-1, 1), ylab = \"sin wave\", main = \"period = 106 squared\")\nabline(h = 0, lwd = 2)\n\n\n\n\n\n\n\n\nSome observations:\n\nThe sin wave (period 53) increases from 0 to 1 and decreases all the way to -1 before coming back to 0, all within the 53 weeks in the year. But this is not what we want to achieve.\nWe want the function to increase from Jan - Jun and then start decreasing till Dec. This means we need a regular sin wave to span 2 years so we can see this.\nWe also want the next year to repeat the same pattern i.e. we want to restrict it to the [0, 1] interval. Thus, we square the sin wave.\n\n\nfx.sin <- function(datetime, f1 = fx.iso_week){\n # identify iso week#\n x <- f1(datetime) \n # calculate sin value for that week\n x2 <- (sin(2*pi*x/106))^2 \n return(x2)\n}\n\nFor a GP, it’s also useful to ensure that all our X’s are between 0 and 1. Usually this is done by using the following method:\nX_i^* = \frac{X_i - \min(X)}{\max(X) - \min(X) } where X = (X_1, X_2 ...X_n)\nX^* = (X_1^*, X_2^* ... X_n^*) will be the standardized X’s with all X_i^* in the interval [0, 1].\nWe can either write a function for this, or in our case, we can just divide the iso-week by 53 since the result would effectively be the same. Our sin predictor already lies in the interval [0, 1]." + }, + { + "objectID": "GP_Practical.html#model-fitting", + "href": "GP_Practical.html#model-fitting", + "title": "VectorByte Methods Training", + "section": "Model Fitting", + "text": "Model Fitting\nNow, let’s start with modelling. We will start with one random location out of the 9 locations.\n\n# Choose a random site number: Anything between 1-9.\nsite_number <- 6\n\n# Obtaining site name\nsite_names <- unique(target$site_id)\n\n# Subsetting all the data at that location\ndf <- subset(target, target$site_id == site_names[site_number])\nhead(df)\n\n# A tibble: 6 × 6\n datetime site_id variable observation iso_week week\n <date> <chr> <chr> <dbl> <chr> <dbl>\n1 2014-06-09 SCBI amblyomma_americanum 75.9 2014-W24 24\n2 2014-06-30 SCBI amblyomma_americanum 28.3 2014-W27 27\n3 2014-07-21 SCBI amblyomma_americanum 0 2014-W30 30\n4 2014-07-28 SCBI amblyomma_americanum 10.1 2014-W31 31\n5 2014-08-11 SCBI amblyomma_americanum 4.94 2014-W33 33\n6 2014-10-20 SCBI amblyomma_americanum 0 2014-W43 43\n\n\nWe will also select only those columns that we are interested in i.e. datetime and observation. We don’t need site since we are only using one of them.\n\n# extracting only the datetime and obs columns\ndf <- df[, c(\"datetime\", \"observation\")]\n\nWe will use one site at first and fit a GP and make predictions. For this we first need to divide our data into a training set and a testing set. Since we have a time series, we want to divide the data sequentially, i.e. we pick a date and everything before the date is our training set and after is our testing set where we check how well our model performs. 
We choose the date 2020-12-31.\n\n# Selecting a date before which we consider everything as training data and after this is testing data.\ncutoff = as.Date('2020-12-31')\ndf_train <- subset(df, df$datetime <= cutoff)\ndf_test <- subset(df, df$datetime > cutoff)" + }, + { + "objectID": "GP_Practical.html#gp-model", + "href": "GP_Practical.html#gp-model", + "title": "VectorByte Methods Training", + "section": "GP Model", + "text": "GP Model\nNow we will set up our X’s. We already have the functions to do this and can simply pass in the datetime. We then combine X_1 and X_2 to create our input matrix X. Remember, everything is ordered as in our dataset.\n\n# Setting up iso-week and sin wave predictors by calling the functions\nX1 <- fx.iso_week(df_train$datetime) # range is 1-53\nX2 <- fx.sin(df_train$datetime) # range is 0 to 1\n\n# Scaling the iso-week by dividing by 53\nX1c <- X1/ 53\n\n# We combine columns scaled X1 and X2, into a matrix as our input space\nX <- as.matrix(cbind.data.frame(X1c, X2))\nhead(X)\n\n X1c X2\n[1,] 0.4528302 0.9782005\n[2,] 0.5094340 0.9991219\n[3,] 0.5660377 0.9575728\n[4,] 0.5849057 0.9305218\n[5,] 0.6226415 0.8587536\n[6,] 0.8113208 0.3120862\n\n\nThe next step is to transform the response to ensure it is normal.\n\n# Extract y: observation from our training data. \ny_obs <- df_train$observation\n\n# Transform the response\ny <- f(y_obs)\n\nNow, we can use the laGP library to fit a GP. First, we specify priors using darg and garg. We will specify a minimum and maximum for our arguments. We need to pass the input space for darg and the output vector for garg. You can look into the functions using ?function in R. We set the minimum to a very small value rather than 0 to ensure numeric stability.\n\n# A very small value for stability\neps <- sqrt(.Machine$double.eps) \n \n# Priors for theta and g. \nd <- darg(list(mle=TRUE, min =eps, max=5), X)\ng <- garg(list(mle=TRUE, min = eps, max = 1), y)\n\nNow, to fit the GP, we use newGPsep. We pass the input matrix and the response vector with some starting values for the parameters. Then, we use the jmleGPsep function to jointly estimate \theta and g using the MLE method. dK allows the GP object to store derivative information which is needed for MLE calculations. newGPsep will fit a separable GP as opposed to newGP which would fit an isotropic GP.\n\n# Fitting a GP with our data, and some starting values for theta and g\ngpi <- newGPsep(X, y, d = 0.1, g = 1, dK = T)\n\n# Jointly infer MLE for all parameters\nmle <- jmleGPsep(gpi, drange = c(d$min, d$max), grange = c(g$min, g$max), \n dab = d$ab, gab= g$ab)\n\nNow, we will create a grid from the first week in our dataset to 1 year into the future, and predict on the entire time series. We use predGPsep to make predictions.\n\n# Create a grid from start date in our data set to one year in future (so we forecast for next season)\nstartdate <- as.Date(min(df$datetime)) # identify start week\ngrid_datetime <- seq.Date(startdate, Sys.Date() + 365, by = 7) # create weekly sequence from start date to one year ahead\n\n# Build the input space for the predictive set (All weeks from 04-2014 to 07-2025)\nXXt1 <- fx.iso_week(grid_datetime)\nXXt2 <- fx.sin(grid_datetime)\n\n# Standardize\nXXt1c <- XXt1/53\n\n# Store inputs as a matrix\nXXt <- as.matrix(cbind.data.frame(XXt1c, XXt2))\n\n# Make predictions using predGPsep with the gp object and the predictive set\nppt <- predGPsep(gpi, XXt) \n\nStoring the mean and calculating quantiles.\n\n# Now we store the mean as our predicted response i.e. 
density along with quantiles\nyyt <- ppt$mean\nq1t <- ppt$mean + qnorm(0.025,0,sqrt(diag(ppt$Sigma))) # lower bound\nq2t <- ppt$mean + qnorm(0.975,0,sqrt(diag(ppt$Sigma))) # upper bound\n\nNow we can plot our data and predictions and see how well our model performed. We need to back transform our predictions to the original scale.\n\n# Back transform our data to original\ngp_yy <- fi(yyt)\ngp_q1 <- fi(q1t)\ngp_q2 <- fi(q2t)\n\n# Plot the observed points\nplot(as.Date(df$datetime), df$observation,\n main = paste(site_names[site_number]), col = \"black\",\n xlab = \"Dates\" , ylab = \"Abundance\",\n # xlim = c(as.Date(min(df$datetime)), as.Date(cutoff)),\n ylim = c(min(df_train$observation, gp_yy, gp_q1), max(df_train$observation, gp_yy, gp_q2)* 1.05))\n\n# Plot the testing set data \npoints(as.Date(df_test$datetime), df_test$observation, col =\"black\", pch = 19)\n\n# Line to indicate separation between train and test data\nabline(v = as.Date(cutoff), lwd = 2)\n\n# Add the predicted response and the quantiles\nlines(grid_datetime, gp_yy, col = 4, lwd = 2)\nlines(grid_datetime, gp_q1, col = 4, lwd = 1.2, lty = 2)\nlines(grid_datetime, gp_q2, col = 4, lwd = 1.2, lty =2)\n\n\n\n\n\n\n\n\nThat looks pretty good? We can also look at the RMSE to see how the model performs. It is better to do this on the transformed scale. We will use yyt for this. We need to find those predictions which correspond to the datetime in our testing dataset df_test.\n\n# Obtain true observed values for testing set\nyt_true <- f(df_test$observation)\n\n# Find corresponding predictions from our model in the grid we predicted on\nyt_pred <- yyt[which(grid_datetime %in% df_test$datetime)]\n\n# calculate RMSE\nrmse <- sqrt(mean((yt_true - yt_pred)^2))\nrmse\n\n[1] 0.8624903" + }, + { + "objectID": "GP_Practical.html#hetgp-model", + "href": "GP_Practical.html#hetgp-model", + "title": "VectorByte Methods Training", + "section": "HetGP Model", + "text": "HetGP Model\nNext, we can attempt a hetGP. We are now interested in fitting a vector of nuggets rather than a single value.\nLet’s use the same data we have to fit a hetGP. We already have our data (X, y) as well as our prediction set XXt. We use the mleHetGP command to fit a GP and pass in our data. The default covariance structure is the Squared Exponential structure. We use the predict function in base R and pass the hetGP object i.e. het_gpi to make predictions on our set XXt.\n\n# create predictors\nX1 <- fx.iso_week(df_train$datetime)\nX2 <- fx.sin(df_train$datetime)\n\n# standardize and put into matrix\nX1c <- X1/53\nX <- as.matrix(cbind.data.frame(X1c, X2))\n\n# Build prediction grid (From 04-2014 to 07-2025)\nXXt1 <- fx.iso_week(grid_datetime)\nXXt2 <- fx.sin(grid_datetime)\n\n# standardize and put into matrix\nXXt1c <- XXt1/53\nXXt <- as.matrix(cbind.data.frame(XXt1c, XXt2))\n\n# Transform the training response\ny_obs <- df_train$observation\ny <- f(y_obs)\n\n# Fit a hetGP model. 
X must be a matrix and nrow(X) should be the same as length(y)\nhet_gpi <- hetGP::mleHetGP(X = X, Z = y)\n\n# Predictions using the base R predict command with a hetGP object and new locations\nhet_ppt <- predict(het_gpi, XXt)\n\nNow we obtain the mean and the confidence bounds as well as transform the data to the original scale.\n\n# Mean density for predictive locations and Confidence bounds\nhet_yyt <- het_ppt$mean\nhet_q1t <- qnorm(0.975, het_ppt$mean, sqrt(het_ppt$sd2 + het_ppt$nugs)) # upper bound\nhet_q2t <- qnorm(0.025, het_ppt$mean, sqrt(het_ppt$sd2 + het_ppt$nugs)) # lower bound\n\n# Back transforming to original scale\nhet_yy <- fi(het_yyt)\nhet_q1 <- fi(het_q1t)\nhet_q2 <- fi(het_q2t)\n\nWe can now plot the results similar to before. [Uncomment the code lines to see how a GP vs a HetGP fits the data]\n\n# Plot Original data\nplot(as.Date(df$datetime), df$observation,\n main = paste(site_names[site_number]), col = \"black\",\n xlab = \"Dates\" , ylab = \"Abundance\",\n # xlim = c(as.Date(min(df$datetime)), as.Date(cutoff)),\n ylim = c(min(df_train$observation, het_yy, het_q2), max(df_train$observation, het_yy, het_q1)* 1.2))\n\n# Add testing observations\npoints(as.Date(df_test$datetime), df_test$observation, col =\"black\", pch = 19)\n\n# Line to indicate our cutoff point\nabline(v = as.Date(cutoff), lwd = 2)\n\n# HetGP Model mean predictions and bounds.\nlines(grid_datetime, het_yy, col = 2, lwd = 2)\nlines(grid_datetime, het_q1, col = 2, lwd = 1.2, lty = 2)\nlines(grid_datetime, het_q2, col = 2, lwd = 1.2, lty =2)\n\n## GP model fits for the same data\n# lines(grid_datetime, gp_yy, col = 3, lwd = 2)\n# lines(grid_datetime, gp_q1, col = 3, lwd = 1.2, lty = 2)\n# lines(grid_datetime, gp_q2, col = 3, lwd = 1.2, lty =2)\n\n# Legend colours match the lines above: GP preds are drawn in col 3, hetGP preds in col 2\nlegend(\"topleft\", legend = c(\"Train Y\",\"Test Y\", \"GP preds\", \"HetGP preds\"),\n col = c(1, 1, 3, 2), lty = c(NA, NA, 1, 1),\n pch = c(1, 19, NA, NA), cex = 0.5)\n\n\n\n\n\n\n\n\nThe mean predictions of a GP are similar to those of a hetGP, but the confidence bounds are different. 
A hetGP produces sligtly tighter bounds.\nWe can also compare the RMSE’s using the predictions of the hetGP model.\n\nyt_true <- f(df_test$observation) # Original data\nhet_yt_pred <- het_yyt[which(grid_datetime %in% df_test$datetime)] # model preds\n\n# calculate rmse for hetGP model\nrmse_het <- sqrt(mean((yt_true - het_yt_pred)^2))\nrmse_het\n\n[1] 0.8835813\n\n\nNow that we have learnt how to fit a GP and a hetGP, it’s time for a challenge.\nTry a hetGP on our sin example from before (but this time I have added noise).\n\n# Your turn\nset.seed(26)\nn <- 8 # number of points\nX <- matrix(seq(0, 2*pi, length= n), ncol=1) # build inputs \ny <- 5*sin(X) + rnorm(n, 0 , 2) # response with some noise\n\n# Predict on this set\nXX <- matrix(seq(-0.5, 2*pi + 0.5, length= 100), ncol=1)\n\n# Data visualization\nplot(X, y)\n\n\n\n\n\n\n\n# Add code to fit a hetGP model and visualise it as above" } ] \ No newline at end of file diff --git a/docs/site_libs/revealjs/dist/theme/quarto.css b/docs/site_libs/revealjs/dist/theme/quarto.css index 2577a64..dc2723a 100644 --- a/docs/site_libs/revealjs/dist/theme/quarto.css +++ b/docs/site_libs/revealjs/dist/theme/quarto.css @@ -1,4 +1,4 @@ -@import"./fonts/source-sans-pro/source-sans-pro.css";:root{--r-background-color: #fff;--r-main-font: Source Sans Pro, Helvetica, sans-serif;--r-main-font-size: 40px;--r-main-color: #222;--r-block-margin: 12px;--r-heading-margin: 0 0 12px 0;--r-heading-font: Source Sans Pro, Helvetica, sans-serif;--r-heading-color: #222;--r-heading-line-height: 1.2;--r-heading-letter-spacing: normal;--r-heading-text-transform: none;--r-heading-text-shadow: none;--r-heading-font-weight: 600;--r-heading1-text-shadow: none;--r-heading1-size: 2.5em;--r-heading2-size: 1.6em;--r-heading3-size: 1.3em;--r-heading4-size: 1em;--r-code-font: SFMono-Regular, Menlo, Monaco, Consolas, Liberation Mono, Courier New, monospace;--r-link-color: #2a76dd;--r-link-color-dark: #1a53a1;--r-link-color-hover: #5692e4;--r-selection-background-color: #98bdef;--r-selection-color: #fff}.reveal-viewport{background:#fff;background-color:var(--r-background-color)}.reveal{font-family:var(--r-main-font);font-size:var(--r-main-font-size);font-weight:normal;color:var(--r-main-color)}.reveal ::selection{color:var(--r-selection-color);background:var(--r-selection-background-color);text-shadow:none}.reveal ::-moz-selection{color:var(--r-selection-color);background:var(--r-selection-background-color);text-shadow:none}.reveal .slides section,.reveal .slides section>section{line-height:1.3;font-weight:inherit}.reveal h1,.reveal h2,.reveal h3,.reveal h4,.reveal h5,.reveal h6{margin:var(--r-heading-margin);color:var(--r-heading-color);font-family:var(--r-heading-font);font-weight:var(--r-heading-font-weight);line-height:var(--r-heading-line-height);letter-spacing:var(--r-heading-letter-spacing);text-transform:var(--r-heading-text-transform);text-shadow:var(--r-heading-text-shadow);word-wrap:break-word}.reveal h1{font-size:var(--r-heading1-size)}.reveal h2{font-size:var(--r-heading2-size)}.reveal h3{font-size:var(--r-heading3-size)}.reveal h4{font-size:var(--r-heading4-size)}.reveal h1{text-shadow:var(--r-heading1-text-shadow)}.reveal p{margin:var(--r-block-margin) 0;line-height:1.3}.reveal h1:last-child,.reveal h2:last-child,.reveal h3:last-child,.reveal h4:last-child,.reveal h5:last-child,.reveal h6:last-child{margin-bottom:0}.reveal img,.reveal video,.reveal iframe{max-width:95%;max-height:95%}.reveal strong,.reveal b{font-weight:bold}.reveal em{font-style:italic}.reveal ol,.reveal 
dl,.reveal ul{display:inline-block;text-align:left;margin:0 0 0 1em}.reveal ol{list-style-type:decimal}.reveal ul{list-style-type:disc}.reveal ul ul{list-style-type:square}.reveal ul ul ul{list-style-type:circle}.reveal ul ul,.reveal ul ol,.reveal ol ol,.reveal ol ul{display:block;margin-left:40px}.reveal dt{font-weight:bold}.reveal dd{margin-left:40px}.reveal blockquote{display:block;position:relative;width:70%;margin:var(--r-block-margin) auto;padding:5px;font-style:italic;background:rgba(255,255,255,.05);box-shadow:0px 0px 2px rgba(0,0,0,.2)}.reveal blockquote p:first-child,.reveal blockquote p:last-child{display:inline-block}.reveal q{font-style:italic}.reveal pre{display:block;position:relative;width:90%;margin:var(--r-block-margin) auto;text-align:left;font-size:.55em;font-family:var(--r-code-font);line-height:1.2em;word-wrap:break-word;box-shadow:0px 5px 15px rgba(0,0,0,.15)}.reveal code{font-family:var(--r-code-font);text-transform:none;tab-size:2}.reveal pre code{display:block;padding:5px;overflow:auto;max-height:400px;word-wrap:normal}.reveal .code-wrapper{white-space:normal}.reveal .code-wrapper code{white-space:pre}.reveal table{margin:auto;border-collapse:collapse;border-spacing:0}.reveal table th{font-weight:bold}.reveal table th,.reveal table td{text-align:left;padding:.2em .5em .2em .5em;border-bottom:1px solid}.reveal table th[align=center],.reveal table td[align=center]{text-align:center}.reveal table th[align=right],.reveal table td[align=right]{text-align:right}.reveal table tbody tr:last-child th,.reveal table tbody tr:last-child td{border-bottom:none}.reveal sup{vertical-align:super;font-size:smaller}.reveal sub{vertical-align:sub;font-size:smaller}.reveal small{display:inline-block;font-size:.6em;line-height:1.2em;vertical-align:top}.reveal small *{vertical-align:top}.reveal img{margin:var(--r-block-margin) 0}.reveal a{color:var(--r-link-color);text-decoration:none;transition:color .15s ease}.reveal a:hover{color:var(--r-link-color-hover);text-shadow:none;border:none}.reveal .roll span:after{color:#fff;background:var(--r-link-color-dark)}.reveal .r-frame{border:4px solid var(--r-main-color);box-shadow:0 0 10px rgba(0,0,0,.15)}.reveal a .r-frame{transition:all .15s linear}.reveal a:hover .r-frame{border-color:var(--r-link-color);box-shadow:0 0 20px rgba(0,0,0,.55)}.reveal .controls{color:var(--r-link-color)}.reveal .progress{background:rgba(0,0,0,.2);color:var(--r-link-color)}@media print{.backgrounds{background-color:var(--r-background-color)}}.top-right{position:absolute;top:1em;right:1em}.visually-hidden{border:0;clip:rect(0 0 0 0);height:auto;margin:0;overflow:hidden;padding:0;position:absolute;width:1px;white-space:nowrap}.hidden{display:none !important}.zindex-bottom{z-index:-1 !important}figure.figure{display:block}.quarto-layout-panel{margin-bottom:1em}.quarto-layout-panel>figure{width:100%}.quarto-layout-panel>figure>figcaption,.quarto-layout-panel>.panel-caption{margin-top:10pt}.quarto-layout-panel>.table-caption{margin-top:0px}.table-caption p{margin-bottom:.5em}.quarto-layout-row{display:flex;flex-direction:row;align-items:flex-start}.quarto-layout-valign-top{align-items:flex-start}.quarto-layout-valign-bottom{align-items:flex-end}.quarto-layout-valign-center{align-items:center}.quarto-layout-cell{position:relative;margin-right:20px}.quarto-layout-cell:last-child{margin-right:0}.quarto-layout-cell figure,.quarto-layout-cell>p{margin:.2em}.quarto-layout-cell img{max-width:100%}.quarto-layout-cell .html-widget{width:100% !important}.quarto-layout-cell div 
figure p{margin:0}.quarto-layout-cell figure{display:block;margin-inline-start:0;margin-inline-end:0}.quarto-layout-cell table{display:inline-table}.quarto-layout-cell-subref figcaption,figure .quarto-layout-row figure figcaption{text-align:center;font-style:italic}.quarto-figure{position:relative;margin-bottom:1em}.quarto-figure>figure{width:100%;margin-bottom:0}.quarto-figure-left>figure>p,.quarto-figure-left>figure>div{text-align:left}.quarto-figure-center>figure>p,.quarto-figure-center>figure>div{text-align:center}.quarto-figure-right>figure>p,.quarto-figure-right>figure>div{text-align:right}.quarto-figure>figure>div.cell-annotation,.quarto-figure>figure>div code{text-align:left}figure>p:empty{display:none}figure>p:first-child{margin-top:0;margin-bottom:0}figure>figcaption.quarto-float-caption-bottom{margin-bottom:.5em}figure>figcaption.quarto-float-caption-top{margin-top:.5em}div[id^=tbl-]{position:relative}.quarto-figure>.anchorjs-link{position:absolute;top:.6em;right:.5em}div[id^=tbl-]>.anchorjs-link{position:absolute;top:.7em;right:.3em}.quarto-figure:hover>.anchorjs-link,div[id^=tbl-]:hover>.anchorjs-link,h2:hover>.anchorjs-link,h3:hover>.anchorjs-link,h4:hover>.anchorjs-link,h5:hover>.anchorjs-link,h6:hover>.anchorjs-link,.reveal-anchorjs-link>.anchorjs-link{opacity:1}#title-block-header{margin-block-end:1rem;position:relative;margin-top:-1px}#title-block-header .abstract{margin-block-start:1rem}#title-block-header .abstract .abstract-title{font-weight:600}#title-block-header a{text-decoration:none}#title-block-header .author,#title-block-header .date,#title-block-header .doi{margin-block-end:.2rem}#title-block-header .quarto-title-block>div{display:flex}#title-block-header .quarto-title-block>div>h1{flex-grow:1}#title-block-header .quarto-title-block>div>button{flex-shrink:0;height:2.25rem;margin-top:0}tr.header>th>p:last-of-type{margin-bottom:0px}table,table.table{margin-top:.5rem;margin-bottom:.5rem}caption,.table-caption{padding-top:.5rem;padding-bottom:.5rem;text-align:center}figure.quarto-float-tbl figcaption.quarto-float-caption-top{margin-top:.5rem;margin-bottom:.25rem;text-align:center}figure.quarto-float-tbl figcaption.quarto-float-caption-bottom{padding-top:.25rem;margin-bottom:.5rem;text-align:center}.utterances{max-width:none;margin-left:-8px}iframe{margin-bottom:1em}details{margin-bottom:1em}details[show]{margin-bottom:0}details>summary{color:#6f6f6f}details>summary>p:only-child{display:inline}pre.sourceCode,code.sourceCode{position:relative}p code:not(.sourceCode){white-space:pre-wrap}code{white-space:pre}@media print{code{white-space:pre-wrap}}pre>code{display:block}pre>code.sourceCode{white-space:pre}pre>code.sourceCode>span>a:first-child::before{text-decoration:none}pre.code-overflow-wrap>code.sourceCode{white-space:pre-wrap}pre.code-overflow-scroll>code.sourceCode{white-space:pre}code a:any-link{color:inherit;text-decoration:none}code a:hover{color:inherit;text-decoration:underline}ul.task-list{padding-left:1em}[data-tippy-root]{display:inline-block}.tippy-content .footnote-back{display:none}.footnote-back{margin-left:.2em}.tippy-content{overflow-x:auto}.quarto-embedded-source-code{display:none}.quarto-unresolved-ref{font-weight:600}.quarto-cover-image{max-width:35%;float:right;margin-left:30px}.cell-output-display 
.widget-subarea{margin-bottom:1em}.cell-output-display:not(.no-overflow-x),.knitsql-table:not(.no-overflow-x){overflow-x:auto}.panel-input{margin-bottom:1em}.panel-input>div,.panel-input>div>div{display:inline-block;vertical-align:top;padding-right:12px}.panel-input>p:last-child{margin-bottom:0}.layout-sidebar{margin-bottom:1em}.layout-sidebar .tab-content{border:none}.tab-content>.page-columns.active{display:grid}div.sourceCode>iframe{width:100%;height:300px;margin-bottom:-0.5em}a{text-underline-offset:3px}div.ansi-escaped-output{font-family:monospace;display:block}/*! +@import"./fonts/source-sans-pro/source-sans-pro.css";:root{--r-background-color: #fff;--r-main-font: Source Sans Pro, Helvetica, sans-serif;--r-main-font-size: 22pt;--r-main-color: #222;--r-block-margin: 12px;--r-heading-margin: 0 0 12px 0;--r-heading-font: Source Sans Pro, Helvetica, sans-serif;--r-heading-color: #222;--r-heading-line-height: 1.2;--r-heading-letter-spacing: normal;--r-heading-text-transform: none;--r-heading-text-shadow: none;--r-heading-font-weight: 600;--r-heading1-text-shadow: none;--r-heading1-size: 2.5em;--r-heading2-size: 1.6em;--r-heading3-size: 1.3em;--r-heading4-size: 1em;--r-code-font: SFMono-Regular, Menlo, Monaco, Consolas, Liberation Mono, Courier New, monospace;--r-link-color: #2a76dd;--r-link-color-dark: #1a53a1;--r-link-color-hover: #5692e4;--r-selection-background-color: #98bdef;--r-selection-color: #fff}.reveal-viewport{background:#fff;background-color:var(--r-background-color)}.reveal{font-family:var(--r-main-font);font-size:var(--r-main-font-size);font-weight:normal;color:var(--r-main-color)}.reveal ::selection{color:var(--r-selection-color);background:var(--r-selection-background-color);text-shadow:none}.reveal ::-moz-selection{color:var(--r-selection-color);background:var(--r-selection-background-color);text-shadow:none}.reveal .slides section,.reveal .slides section>section{line-height:1.3;font-weight:inherit}.reveal h1,.reveal h2,.reveal h3,.reveal h4,.reveal h5,.reveal h6{margin:var(--r-heading-margin);color:var(--r-heading-color);font-family:var(--r-heading-font);font-weight:var(--r-heading-font-weight);line-height:var(--r-heading-line-height);letter-spacing:var(--r-heading-letter-spacing);text-transform:var(--r-heading-text-transform);text-shadow:var(--r-heading-text-shadow);word-wrap:break-word}.reveal h1{font-size:var(--r-heading1-size)}.reveal h2{font-size:var(--r-heading2-size)}.reveal h3{font-size:var(--r-heading3-size)}.reveal h4{font-size:var(--r-heading4-size)}.reveal h1{text-shadow:var(--r-heading1-text-shadow)}.reveal p{margin:var(--r-block-margin) 0;line-height:1.3}.reveal h1:last-child,.reveal h2:last-child,.reveal h3:last-child,.reveal h4:last-child,.reveal h5:last-child,.reveal h6:last-child{margin-bottom:0}.reveal img,.reveal video,.reveal iframe{max-width:95%;max-height:95%}.reveal strong,.reveal b{font-weight:bold}.reveal em{font-style:italic}.reveal ol,.reveal dl,.reveal ul{display:inline-block;text-align:left;margin:0 0 0 1em}.reveal ol{list-style-type:decimal}.reveal ul{list-style-type:disc}.reveal ul ul{list-style-type:square}.reveal ul ul ul{list-style-type:circle}.reveal ul ul,.reveal ul ol,.reveal ol ol,.reveal ol ul{display:block;margin-left:40px}.reveal dt{font-weight:bold}.reveal dd{margin-left:40px}.reveal blockquote{display:block;position:relative;width:70%;margin:var(--r-block-margin) auto;padding:5px;font-style:italic;background:rgba(255,255,255,.05);box-shadow:0px 0px 2px rgba(0,0,0,.2)}.reveal blockquote p:first-child,.reveal blockquote 
p:last-child{display:inline-block}.reveal q{font-style:italic}.reveal pre{display:block;position:relative;width:90%;margin:var(--r-block-margin) auto;text-align:left;font-size:.55em;font-family:var(--r-code-font);line-height:1.2em;word-wrap:break-word;box-shadow:0px 5px 15px rgba(0,0,0,.15)}.reveal code{font-family:var(--r-code-font);text-transform:none;tab-size:2}.reveal pre code{display:block;padding:5px;overflow:auto;max-height:400px;word-wrap:normal}.reveal .code-wrapper{white-space:normal}.reveal .code-wrapper code{white-space:pre}.reveal table{margin:auto;border-collapse:collapse;border-spacing:0}.reveal table th{font-weight:bold}.reveal table th,.reveal table td{text-align:left;padding:.2em .5em .2em .5em;border-bottom:1px solid}.reveal table th[align=center],.reveal table td[align=center]{text-align:center}.reveal table th[align=right],.reveal table td[align=right]{text-align:right}.reveal table tbody tr:last-child th,.reveal table tbody tr:last-child td{border-bottom:none}.reveal sup{vertical-align:super;font-size:smaller}.reveal sub{vertical-align:sub;font-size:smaller}.reveal small{display:inline-block;font-size:.6em;line-height:1.2em;vertical-align:top}.reveal small *{vertical-align:top}.reveal img{margin:var(--r-block-margin) 0}.reveal a{color:var(--r-link-color);text-decoration:none;transition:color .15s ease}.reveal a:hover{color:var(--r-link-color-hover);text-shadow:none;border:none}.reveal .roll span:after{color:#fff;background:var(--r-link-color-dark)}.reveal .r-frame{border:4px solid var(--r-main-color);box-shadow:0 0 10px rgba(0,0,0,.15)}.reveal a .r-frame{transition:all .15s linear}.reveal a:hover .r-frame{border-color:var(--r-link-color);box-shadow:0 0 20px rgba(0,0,0,.55)}.reveal .controls{color:var(--r-link-color)}.reveal .progress{background:rgba(0,0,0,.2);color:var(--r-link-color)}@media print{.backgrounds{background-color:var(--r-background-color)}}.top-right{position:absolute;top:1em;right:1em}.visually-hidden{border:0;clip:rect(0 0 0 0);height:auto;margin:0;overflow:hidden;padding:0;position:absolute;width:1px;white-space:nowrap}.hidden{display:none !important}.zindex-bottom{z-index:-1 !important}figure.figure{display:block}.quarto-layout-panel{margin-bottom:1em}.quarto-layout-panel>figure{width:100%}.quarto-layout-panel>figure>figcaption,.quarto-layout-panel>.panel-caption{margin-top:10pt}.quarto-layout-panel>.table-caption{margin-top:0px}.table-caption p{margin-bottom:.5em}.quarto-layout-row{display:flex;flex-direction:row;align-items:flex-start}.quarto-layout-valign-top{align-items:flex-start}.quarto-layout-valign-bottom{align-items:flex-end}.quarto-layout-valign-center{align-items:center}.quarto-layout-cell{position:relative;margin-right:20px}.quarto-layout-cell:last-child{margin-right:0}.quarto-layout-cell figure,.quarto-layout-cell>p{margin:.2em}.quarto-layout-cell img{max-width:100%}.quarto-layout-cell .html-widget{width:100% !important}.quarto-layout-cell div figure p{margin:0}.quarto-layout-cell figure{display:block;margin-inline-start:0;margin-inline-end:0}.quarto-layout-cell table{display:inline-table}.quarto-layout-cell-subref figcaption,figure .quarto-layout-row figure 
figcaption{text-align:center;font-style:italic}.quarto-figure{position:relative;margin-bottom:1em}.quarto-figure>figure{width:100%;margin-bottom:0}.quarto-figure-left>figure>p,.quarto-figure-left>figure>div{text-align:left}.quarto-figure-center>figure>p,.quarto-figure-center>figure>div{text-align:center}.quarto-figure-right>figure>p,.quarto-figure-right>figure>div{text-align:right}.quarto-figure>figure>div.cell-annotation,.quarto-figure>figure>div code{text-align:left}figure>p:empty{display:none}figure>p:first-child{margin-top:0;margin-bottom:0}figure>figcaption.quarto-float-caption-bottom{margin-bottom:.5em}figure>figcaption.quarto-float-caption-top{margin-top:.5em}div[id^=tbl-]{position:relative}.quarto-figure>.anchorjs-link{position:absolute;top:.6em;right:.5em}div[id^=tbl-]>.anchorjs-link{position:absolute;top:.7em;right:.3em}.quarto-figure:hover>.anchorjs-link,div[id^=tbl-]:hover>.anchorjs-link,h2:hover>.anchorjs-link,h3:hover>.anchorjs-link,h4:hover>.anchorjs-link,h5:hover>.anchorjs-link,h6:hover>.anchorjs-link,.reveal-anchorjs-link>.anchorjs-link{opacity:1}#title-block-header{margin-block-end:1rem;position:relative;margin-top:-1px}#title-block-header .abstract{margin-block-start:1rem}#title-block-header .abstract .abstract-title{font-weight:600}#title-block-header a{text-decoration:none}#title-block-header .author,#title-block-header .date,#title-block-header .doi{margin-block-end:.2rem}#title-block-header .quarto-title-block>div{display:flex}#title-block-header .quarto-title-block>div>h1{flex-grow:1}#title-block-header .quarto-title-block>div>button{flex-shrink:0;height:2.25rem;margin-top:0}tr.header>th>p:last-of-type{margin-bottom:0px}table,table.table{margin-top:.5rem;margin-bottom:.5rem}caption,.table-caption{padding-top:.5rem;padding-bottom:.5rem;text-align:center}figure.quarto-float-tbl figcaption.quarto-float-caption-top{margin-top:.5rem;margin-bottom:.25rem;text-align:center}figure.quarto-float-tbl figcaption.quarto-float-caption-bottom{padding-top:.25rem;margin-bottom:.5rem;text-align:center}.utterances{max-width:none;margin-left:-8px}iframe{margin-bottom:1em}details{margin-bottom:1em}details[show]{margin-bottom:0}details>summary{color:#6f6f6f}details>summary>p:only-child{display:inline}pre.sourceCode,code.sourceCode{position:relative}p code:not(.sourceCode){white-space:pre-wrap}code{white-space:pre}@media print{code{white-space:pre-wrap}}pre>code{display:block}pre>code.sourceCode{white-space:pre}pre>code.sourceCode>span>a:first-child::before{text-decoration:none}pre.code-overflow-wrap>code.sourceCode{white-space:pre-wrap}pre.code-overflow-scroll>code.sourceCode{white-space:pre}code a:any-link{color:inherit;text-decoration:none}code a:hover{color:inherit;text-decoration:underline}ul.task-list{padding-left:1em}[data-tippy-root]{display:inline-block}.tippy-content .footnote-back{display:none}.footnote-back{margin-left:.2em}.tippy-content{overflow-x:auto}.quarto-embedded-source-code{display:none}.quarto-unresolved-ref{font-weight:600}.quarto-cover-image{max-width:35%;float:right;margin-left:30px}.cell-output-display .widget-subarea{margin-bottom:1em}.cell-output-display:not(.no-overflow-x),.knitsql-table:not(.no-overflow-x){overflow-x:auto}.panel-input{margin-bottom:1em}.panel-input>div,.panel-input>div>div{display:inline-block;vertical-align:top;padding-right:12px}.panel-input>p:last-child{margin-bottom:0}.layout-sidebar{margin-bottom:1em}.layout-sidebar 
.tab-content{border:none}.tab-content>.page-columns.active{display:grid}div.sourceCode>iframe{width:100%;height:300px;margin-bottom:-0.5em}a{text-underline-offset:3px}div.ansi-escaped-output{font-family:monospace;display:block}/*! * * ansi colors from IPython notebook's *