adding rerendered website
lrjohnson0 committed Jul 15, 2024
1 parent 9367f95 commit 2a6ac1b
Showing 16 changed files with 986 additions and 2 deletions.
4 changes: 2 additions & 2 deletions docs/Stats_review.html
@@ -340,7 +340,7 @@ <h1>Random Variables (RVs)</h1>
<li>discrete (numbers of items or successes)</li>
<li>continuous (heights, times, weights)</li>
</ul>
<p>We usually use capital letters – e.g.&nbsp;<span class="math inline">X</span>, <span class="math inline">Y</span>, sometimes with bold or with subscripts – to denote the RVs. In contrast we use lower case letters, e.g.&nbsp;<span class="math inline">x</span>, <span class="math inline">y</span>, <span class="math inline">k</span>, to denote the values that the RV takes. For instance, let's say that the heights of the women at Virginia Tech are the RV, <span class="math inline">X</span>, and <span class="math inline">X</span> has a normal distribution with mean 62 inches and variance 6<span class="math inline">^2</span>, i.e., <span class="math inline">X \sim \mathrm{N}(62,6^2)</span>. Say we then observe the heights of 3 individuals drawn from this distribution – we would write this as: <span class="math inline">x=(</span> 58.8, 62.1, 64.2 <span class="math inline">)</span>.</p>
<p>We usually use capital letters – e.g.&nbsp;<span class="math inline">X</span>, <span class="math inline">Y</span>, sometimes with bold or with subscripts – to denote the RVs. In contrast we use lower case letters, e.g.&nbsp;<span class="math inline">x</span>, <span class="math inline">y</span>, <span class="math inline">k</span>, to denote the values that the RV takes. For instance, let's say that the heights of the women at Virginia Tech are the RV, <span class="math inline">X</span>, and <span class="math inline">X</span> has a normal distribution with mean 62 inches and variance 6<span class="math inline">^2</span>, i.e., <span class="math inline">X \sim \mathrm{N}(62,6^2)</span>. Say we then observe the heights of 3 individuals drawn from this distribution – we would write this as: <span class="math inline">x=(</span> 56.2, 58.4, 60.8 <span class="math inline">)</span>.</p>
<p><br> <br> </p>
</section>
<section id="probability-distributions" class="level1">
@@ -564,7 +564,7 @@ <h1>Probability Distributions in <code>R</code></h1>
<div class="cell">
<div class="sourceCode cell-code" id="cb3"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="fu">rnorm</span>(<span class="dv">3</span>, <span class="at">mean=</span><span class="dv">0</span>, <span class="at">sd=</span><span class="dv">1</span>) <span class="do">## random draws</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] 0.02318727 0.28406256 0.56712882</code></pre>
<pre><code>[1] 0.5637509 -0.7347765 -0.4281301</code></pre>
</div>
</div>
<div class="cell">
Binary file modified docs/Stats_review_files/figure-html/unnamed-chunk-7-1.png
Binary file modified docs/Stats_review_files/figure-html/unnamed-chunk-8-1.png
Binary file modified docs/Stats_review_files/figure-html/unnamed-chunk-9-1.png
919 changes: 919 additions & 0 deletions docs/VB_IntroTimeDepData_practical.html

Large diffs are not rendered by default.

Binary file modified docs/VB_RegDiagTrans_files/figure-revealjs/unnamed-chunk-19-1.png
37 changes: 37 additions & 0 deletions docs/data/Culex_erraticus_walton_covariates_aggregated.csv
@@ -0,0 +1,37 @@
"Month_Yr","sample_value","MaxTemp","Precip"
"2015-01",0,17.7460156583238,3.30399188776126
"2015-02",0.0181818181818182,17.8726865768433,16.5442658017982
"2015-03",0.468085106382979,23.8176727092012,2.40565121451274
"2015-04",1.61904761904762,26.0355911254883,8.97440616825901
"2015-05",0.821428571428571,30.0160168011983,0.567960943028863
"2015-06",3.00595238095238,31.1209421157837,4.84134272920589
"2015-07",2.38095238095238,32.8113037109375,3.84901035286131
"2015-08",1.82634730538922,32.5624453493221,5.56284532361401
"2015-09",0.648809523809524,30.5515534877777,10.4097246267345
"2015-10",0.988023952095808,27.226053329285,0.337750268821231
"2015-11",0.73780487804878,24.8676840851947,18.3067496796934
"2015-12",0.142857142857143,22.4658838510513,5.62147537689834
"2016-01",0,16.0240589777629,3.55062202910301
"2016-02",0.0202020202020202,19.420570479499,11.2546808033655
"2016-03",0.0151515151515152,23.1360999213325,4.78566472774202
"2016-04",0.0261437908496732,24.9808234420477,4.58042451914619
"2016-05",0.0252525252525253,28.7288377549913,0.0530576339424258
"2016-06",0.833333333333333,30.9698980119493,6.15541747283905
"2016-07",1.26136363636364,33.3050937652588,4.49636819271836
"2016-08",1.68527918781726,32.0963299746441,11.3387491821789
"2016-09",2.61714285714286,31.605746710641,2.86828845062426
"2016-10",1.21212121212121,29.142751481798,0
"2016-11",1.53977272727273,24.4848198890686,0.00546268122905696
"2016-12",0.771573604060914,20.4605378088007,11.6155217247566
"2017-01",0.0454545454545455,18.3547306060791,0
"2017-02",0.0363636363636364,23.6558444976807,3.15071005333554
"2017-03",0.194285714285714,22.5357287979126,1.43009495224271
"2017-04",0.436548223350254,26.1529895859927,0.499381616484695
"2017-05",1.2020202020202,28.0017344156901,6.58056266321753
"2017-06",0.83419689119171,29.4895135158084,13.3339398578195
"2017-07",1.76536312849162,32.2513512552783,7.49392703533506
"2017-08",0.744791666666667,31.8647617797057,6.08211343444419
"2017-09",0.722222222222222,30.6056567298041,4.63103739480779
"2017-10",0.142131979695431,27.7345255449944,11.5671122144095
"2017-11",0.289772727272727,23.2313950061798,1.19576047259298
"2017-12",0.00917431192660551,18.9360333013972,4.01825444195248
28 changes: 28 additions & 0 deletions docs/search.json
@@ -349,6 +349,34 @@
"section": "",
"text": "Main materials\n\nPre-workshop\nInformation about pre-workshop preparation – including software installation, expectations for what you should already be familiar with, and review materials – is available in the pre-work portion of the materials page.\n \n\n\n22 July 2024\n\n\n\nTime\nActivity\nMaterials\n\n\n\n\n\nArrival\n\n\n\n\n \n\n\n23 July 2024 (08:30 - 17:00)\n\n\n\n\n\n\n\n\nTime\nActivity\nMaterials\n\n\n\n\n\nComing Soon!\n\n\n\n12:00\nLunch\n\n\n\n\n \n\n\n24 July 2024 (08:30 - 17:00)\n\n\n\n\n\n\n\n\nTime\nActivity\nMaterials\n\n\n\n\n\nComing Soon!\n\n\n\n\n\n\n\n\n12:00\nLunch\n\n\n\n\n\n\n\n\n\n\n\n25 July 2024 (08:30 - 17:00)\n\n\n\nTime\nActivity\nMaterials\n\n\n\n\n\nComing Soon!\n\n\n\n\n\n\n26 July 2024\n\n\n\nTime\nActivity\nMaterials\n\n\n\n\n\nTravel\n\n\n\n\n\n\nPost-workshop\nEnjoy using these new techniques and databases!"
},
{
"objectID": "VB_IntroTimeDepData_practical.html#exploring-the-data",
"href": "VB_IntroTimeDepData_practical.html#exploring-the-data",
"title": "VectorByte Methods Training",
"section": "Exploring the Data",
"text": "Exploring the Data\nAs always, we first want to take a look at the data, to make sure we understand it, and that we don’t have missing or weird values.\n\nmozData&lt;-read.csv(\"data/Culex_erraticus_walton_covariates_aggregated.csv\")\nsummary(mozData)\n\n Month_Yr sample_value MaxTemp Precip \n Length:36 Min. :0.00000 Min. :16.02 Min. : 0.000 \n Class :character 1st Qu.:0.04318 1st Qu.:22.99 1st Qu.: 2.162 \n Mode :character Median :0.73001 Median :26.69 Median : 4.606 \n Mean :0.80798 Mean :26.23 Mean : 5.595 \n 3rd Qu.:1.22443 3rd Qu.:30.70 3rd Qu.: 7.864 \n Max. :3.00595 Max. :33.31 Max. :18.307 \n\n\nWe can see that the minimum observed average number of mosquitoes it zero, and max is only 3 (there are likely many zeros averaged over many days in the month). There don’t appear to be any NAs in the data. In this case the dataset itself is small enough that we can print the whole thing to ensure it’s complete:\n\nmozData\n\n Month_Yr sample_value MaxTemp Precip\n1 2015-01 0.000000000 17.74602 3.303991888\n2 2015-02 0.018181818 17.87269 16.544265802\n3 2015-03 0.468085106 23.81767 2.405651215\n4 2015-04 1.619047619 26.03559 8.974406168\n5 2015-05 0.821428571 30.01602 0.567960943\n6 2015-06 3.005952381 31.12094 4.841342729\n7 2015-07 2.380952381 32.81130 3.849010353\n8 2015-08 1.826347305 32.56245 5.562845324\n9 2015-09 0.648809524 30.55155 10.409724627\n10 2015-10 0.988023952 27.22605 0.337750269\n11 2015-11 0.737804878 24.86768 18.306749680\n12 2015-12 0.142857143 22.46588 5.621475377\n13 2016-01 0.000000000 16.02406 3.550622029\n14 2016-02 0.020202020 19.42057 11.254680803\n15 2016-03 0.015151515 23.13610 4.785664728\n16 2016-04 0.026143791 24.98082 4.580424519\n17 2016-05 0.025252525 28.72884 0.053057634\n18 2016-06 0.833333333 30.96990 6.155417473\n19 2016-07 1.261363636 33.30509 4.496368193\n20 2016-08 1.685279188 32.09633 11.338749182\n21 2016-09 2.617142857 31.60575 2.868288451\n22 2016-10 1.212121212 29.14275 0.000000000\n23 2016-11 1.539772727 
24.48482 0.005462681\n24 2016-12 0.771573604 20.46054 11.615521725\n25 2017-01 0.045454545 18.35473 0.000000000\n26 2017-02 0.036363636 23.65584 3.150710053\n27 2017-03 0.194285714 22.53573 1.430094952\n28 2017-04 0.436548223 26.15299 0.499381616\n29 2017-05 1.202020202 28.00173 6.580562663\n30 2017-06 0.834196891 29.48951 13.333939858\n31 2017-07 1.765363128 32.25135 7.493927035\n32 2017-08 0.744791667 31.86476 6.082113434\n33 2017-09 0.722222222 30.60566 4.631037395\n34 2017-10 0.142131980 27.73453 11.567112214\n35 2017-11 0.289772727 23.23140 1.195760473\n36 2017-12 0.009174312 18.93603 4.018254442"
},
{
"objectID": "VB_IntroTimeDepData_practical.html#plotting-the-data",
"href": "VB_IntroTimeDepData_practical.html#plotting-the-data",
"title": "VectorByte Methods Training",
"section": "Plotting the data",
"text": "Plotting the data\nFirst we’ll examine the data itself, including the predictors:\n\nmonths&lt;-dim(mozData)[1]\nt&lt;-1:months ## counter for months in the data set\npar(mfrow=c(3,1))\nplot(t, mozData$sample_value, type=\"l\", lwd=2, \n main=\"Average Monthly Abundance\", \n xlab =\"Time (months)\", \n ylab = \"Average Count\")\nplot(t, mozData$MaxTemp, type=\"l\",\n col = 2, lwd=2, \n main=\"Average Maximum Temp\", \n xlab =\"Time (months)\", \n ylab = \"Temperature (C)\")\nplot(t, mozData$Precip, type=\"l\",\n col=\"dodgerblue\", lwd=2,\n main=\"Average Monthly Precip\", \n xlab =\"Time (months)\", \n ylab = \"Precipitation (in)\")\n\n\n\n\n\n\n\n\nVisually we noticed that there may be a bit of clumping in the values for abundance (this is subtle) – in particular, since we have a lot of very small/nearly zero counts, a transform, such as a square root, may spread things out for the abundances. It also looks like both the abundance and temperature data are more cyclical than the precipitation, and thus more likely to be related to each other. There’s also not visually a lot of indication of a trend, but it’s usually worthwhile to consider it anyway. Replotting the abundance data with a transformation:\n\nmonths&lt;-dim(mozData)[1]\nt&lt;-1:months ## counter for months in the data set\nplot(t, sqrt(mozData$sample_value), type=\"l\", lwd=2, \n main=\"Sqrt Average Monthly Abundance\", \n xlab =\"Time (months)\", \n ylab = \"Average Count\")\n\n\n\n\n\n\n\n\nThat looks a little bit better. I suggest we go with this for our response."
},
{
"objectID": "VB_IntroTimeDepData_practical.html#building-a-data-frame",
"href": "VB_IntroTimeDepData_practical.html#building-a-data-frame",
"title": "VectorByte Methods Training",
"section": "Building a data frame",
"text": "Building a data frame\nBefore we get into model building, we always want to build a data frame to contain all of the predictors that we want to consider, at the potential lags that we’re interested in. In the lecture we saw building the AR, sine/cosine, and trend predictors:\n\nt &lt;- 2:months ## to make building the AR1 predictors easier\n\nmozTS &lt;- data.frame(\n Y=sqrt(mozData$sample_value[t]), # transformed response\n Yl1=sqrt(mozData$sample_value[t-1]), # AR1 predictor\n t=t, # trend predictor\n sin12=sin(2*pi*t/12), \n cos12=cos(2*pi*t/12) # periodic predictors\n )\n\nWe will also put in the temperature and precipitation predictors. But we need to think about what might be an appropriate lag. If this were daily or weekly data, we’d probably want to have a fairly sizable lag – mosquitoes take a while to develop, so the number we see today is not likely related to the temperature today. However, since these data are agregated across a whole month, as is the temperature/precipitaion, the current month values are likely to be useful. However, it’s even possible that last month’s values may be so we’ll add those in as well:\n\nmozTS$MaxTemp&lt;-mozData$MaxTemp[t] ## current temps\nmozTS$MaxTempl1&lt;-mozData$MaxTemp[t-1] ## previous temps\nmozTS$Precip&lt;-mozData$Precip[t] ## current precip\nmozTS$Precipl1&lt;-mozData$Precip[t-1] ## previous precip\n\nThus our full dataframe:\n\nsummary(mozTS)\n\n Y Yl1 t sin12 \n Min. :0.0000 Min. :0.0000 Min. : 2.0 Min. :-1.00000 \n 1st Qu.:0.2951 1st Qu.:0.2951 1st Qu.:10.5 1st Qu.:-0.68301 \n Median :0.8590 Median :0.8590 Median :19.0 Median : 0.00000 \n Mean :0.7711 Mean :0.7684 Mean :19.0 Mean :-0.01429 \n 3rd Qu.:1.1120 3rd Qu.:1.1120 3rd Qu.:27.5 3rd Qu.: 0.68301 \n Max. :1.7338 Max. :1.7338 Max. :36.0 Max. : 1.00000 \n cos12 MaxTemp MaxTempl1 Precip \n Min. :-1.00000 Min. :16.02 Min. :16.02 Min. 
: 0.000 \n 1st Qu.:-0.68301 1st Qu.:23.18 1st Qu.:23.18 1st Qu.: 1.918 \n Median : 0.00000 Median :27.23 Median :27.23 Median : 4.631 \n Mean :-0.02474 Mean :26.47 Mean :26.44 Mean : 5.660 \n 3rd Qu.: 0.50000 3rd Qu.:30.79 3rd Qu.:30.79 3rd Qu.: 8.234 \n Max. : 1.00000 Max. :33.31 Max. :33.31 Max. :18.307 \n Precipl1 \n Min. : 0.000 \n 1st Qu.: 1.918 \n Median : 4.631 \n Mean : 5.640 \n 3rd Qu.: 8.234 \n Max. :18.307 \n\n\n\nhead(mozTS)\n\n Y Yl1 t sin12 cos12 MaxTemp MaxTempl1\n1 0.1348400 0.0000000 2 8.660254e-01 5.000000e-01 17.87269 17.74602\n2 0.6841675 0.1348400 3 1.000000e+00 6.123234e-17 23.81767 17.87269\n3 1.2724180 0.6841675 4 8.660254e-01 -5.000000e-01 26.03559 23.81767\n4 0.9063270 1.2724180 5 5.000000e-01 -8.660254e-01 30.01602 26.03559\n5 1.7337683 0.9063270 6 1.224647e-16 -1.000000e+00 31.12094 30.01602\n6 1.5430335 1.7337683 7 -5.000000e-01 -8.660254e-01 32.81130 31.12094\n Precip Precipl1\n1 16.5442658 3.3039919\n2 2.4056512 16.5442658\n3 8.9744062 2.4056512\n4 0.5679609 8.9744062\n5 4.8413427 0.5679609\n6 3.8490104 4.8413427"
},
{
"objectID": "VB_IntroTimeDepData_practical.html#building-a-first-model",
"href": "VB_IntroTimeDepData_practical.html#building-a-first-model",
"title": "VectorByte Methods Training",
"section": "Building a first model",
"text": "Building a first model\nWe will first build a very simple model – just a trend – to practice building the model, checking diagnostics, and plotting predictions.\n\nmod1&lt;-lm(Y ~ t, data=mozTS)\nsummary(mod1)\n\n\nCall:\nlm(formula = Y ~ t, data = mozTS)\n\nResiduals:\n Min 1Q Median 3Q Max \n-0.81332 -0.47902 0.03671 0.37384 0.87119 \n\nCoefficients:\n Estimate Std. Error t value Pr(&gt;|t|) \n(Intercept) 0.904809 0.178421 5.071 1.5e-05 ***\nt -0.007038 0.008292 -0.849 0.402 \n---\nSignif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1\n\nResidual standard error: 0.4954 on 33 degrees of freedom\nMultiple R-squared: 0.02136, Adjusted R-squared: -0.008291 \nF-statistic: 0.7204 on 1 and 33 DF, p-value: 0.4021\n\n\nThe model output indicates that this model is not useful – the trend is not significant and it only explains about 2% of the variability. Let’s plot the predictions:\n\n## plot points and fitted lines\nplot(Y~t, data=mozTS, col=1, type=\"l\")\nlines(t, mod1$fitted, col=\"dodgerblue\", lwd=2)\n\n\n\n\n\n\n\n\nNot good – we’ll definitely need to try something else! Remember that since we’re using a linear model for this, that we should check our residual plots as usual, and then also plot the acf of the residuals:\n\npar(mfrow=c(1,3), mar=c(4,4,2,0.5)) \n\n## studentized residuals vs fitted\nplot(mod1$fitted, rstudent(mod1), col=1,\n xlab=\"Fitted Values\", \n ylab=\"Studentized Residuals\", \n pch=20, main=\"AR 1 only model\")\n\n## qq plot of studentized residuals\nqqnorm(rstudent(mod1), pch=20, col=1, main=\"\" )\nabline(a=0,b=1,lty=2, col=2)\n\n## histogram of studentized residuals\nhist(rstudent(mod1), col=1, \n xlab=\"Studentized Residuals\", \n main=\"\", border=8)\n\n\n\n\n\n\n\n\nThis doesn’t look really bad, although the histogram might be a bit weird. Finally the acf\n\nacf(mod1$residuals)\n\n\n\n\n\n\n\n\nThis is where we can see that we definitely aren’t able to capture the pattern. 
There’s substantial autocorrelation left at a 1 month lag, and around 6 months.\nFinally, for moving forward, we can extract the BIC for this model so that we can compare with other models that you’ll build next.\n\nn&lt;-length(t)\nextractAIC(mod1, k=log(n))[2]\n\n[1] -44.11057"
},
{
"objectID": "VB_RegDiagTrans_practical_soln.html#fit-the-linear-regression-model.-plot-the-data-and-fitted-line.",
"href": "VB_RegDiagTrans_practical_soln.html#fit-the-linear-regression-model.-plot-the-data-and-fitted-line.",