forked from frec-5174/eco4cast-in-R-book
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathparticle-filter.qmd
615 lines (431 loc) · 23 KB
/
particle-filter.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
# Data assimilation: particle filter {#sec-particle-filter}
```{r}
#| echo: false
#| message: false
library(tidyverse)
set.seed(100)
```
## Review of Batch vs Sequential Methods
Before beginning the introduction of the particle filter it is important to revisit the MCMC-MH approach that we used to estimate the parameters of a Bayesian model. Here is the general algorithm again:
```
for(i in 1:num_iterations){
Choose new parameters based on previous parameters
for(t in 1:length_of_time_series){
Make predictions over full time series
}
Calculate likelihood of data given model and current parameters
Calculate probability of priors given current parameters
Accept or reject parameters
}
```
Above you see that the outermost for-loop is looping over the number of iterations (`num_iterations`). The inner loop is looping over the length of the time series (`length_of_time_series`). Therefore this approach tests each parameter value choose in an iteration (and latent state if using a state space model) over **ALL** time points in the time series. As a result, we all this a **batch** method because it considers all data as a single "batch" of data.
Strengths of the batch method are:\
- The parameters are consistent with all data - Straightforward to estimate parameters, uncertainty parameters, and latent states
Weaknesses:\
- Can be computationally slow\
- Require re-fitting model if there is even a single new data point
Alternatively, **sequential** methods only analyze data one time point as a time. Here is general code for a sequential method - notice that the for-loop order is reversed
```
Calculate prior distribution of parameters
for(t in 1:length_of_time_series){
for(i in 1:num_of_particles){
Make predictions for each particle based on previous value for particle
}
Compare particle to data (likelihood or other technique)
Weight particles based on the comparsion to the data
Adjust particles based on weights
}
```
In the sequential method we can restart at any time point, as long as we have the values for the states and parameters that are associated with each particle. When doing an iterative forecast, these values are what you would save to wait for new data to arrive.
## Introduction to Particle Filter
There are many sequential data assimilation methods that make different assumptions about data and model distributions in order to simplify the analysis. Many of the methods emerged before the our massive computational resources or were developed for **HUGE** problems like assimilating terabytes of data into a global weather model that simulates the physics of the atmosphere at 25 km vertical resolution. Examples include the Kalman Filter, Extended Kalman Filter, Ensemble Kalman Filter, and 4D-var. These methods all heavily use matrix algebra which I have not introduced in this class. Since these methods commonly assume that the data and model errors are normally distributed, the numerical version (Ensemble Kalman Filter) can run fewer particles (in this case: ensemble members) because it doesn't take as many samples to estimate the mean and variance of a distribution than it does to estimate the full distribution
Here I introduce the particle filter. The particle filter is a sequential method that will be more familiar to those that have learned the Bayesian methods that we have covered in the earlier chapters (@sec-bayes). It has the concept of a likelihood, of which you are well versed (@sec-likelihood). The particle filter does not assume a specific distribution of the data and model error so it requires more particles to estimate the full distributions. As a result, it is more appropriate for "smaller" problems like the ones we are tackling in this book
The particle filter is quite simple
1. Initialize a set of particles: Set the initial distribution of the states for each particle and, if estimating parameters, initial distribution of parameters that you want to estimate. Think of this as your initial priors.
2. Predict the next time step using your process model for each particle. Add process uncertainty to each particle (i.e., `rnorm`)
3. If there are observations at the time-step, calculate the likelihood of the data given the particle just like we calculated the likelihood in the likelihood and Bayesian exercises. For example: `LL <- dnorm(obs, mean = pred, sd_obs)`. You will typically use the uncertainty of the observations in the likelihood because you have already included the process uncertainty in #2. If observations are not available at the time step, then continue to the time step.
4. If there are observations at the time-step, resample the particles using the likelihood as the weights (don't forget to exponentiate the likelihood if you logged it in #3, the weights must be probabilities rather than log probabilities). The weighted sampling, with replacement, will randomly pick the more likely particles more often. You will keep the same number of particles but the values for each particle will change. Less likely particles will be replaced with more likely particles (though the less likelihood particles can still be selected). Be sure to resample all states together and, if also estimating parameters, the parameters as well. The key to resampling is the following:
```
## calculate likelihood (weight) for one state with an observation at that time-step
## The dimensiosn of x
wt <- dnorm(y[t], mean = x[t, ], sd = sd_data)
# Normalize the weights
wt_norm <- wt / sum(wt)
## resample ensemble members in proportion to their weight.
## Since the total number of samples = the number of particles then you will preserve the
## Same number of particles
resample_index <- sample(1:num_particles, num_particles, replace = TRUE, prob = wt_norm)
##Use the index to resample
x[t, ] <- x[t, resample_index]
```
4. Continue to next time step.
Fundamentally, the particle filter depends on two concepts that you have already been exposed to: likelihood and sampling from a set of particles (you sampled iterations from MCMC in previous exercises).
Specifically we refer to the particle filter described above as a bootstrap particle filter
## Example of Particle Filter
Here is an example of the particle filter applied to the [Google Flu data](https://en.wikipedia.org/wiki/Google_Flu_Trends), project that has been since stopped by Google due to [accuracy issues](https://www.science.org/doi/abs/10.1126/science.1248506). It is still a good dataset for illustrative purposes here. The exercises from Mike Dietze's Ecological Forecasting book also use Google Flu data.
First load in data. We are only going to focus on the first 15 weeks of the data so that you can see the particles more easily
```{r}
#| warning: false
gflu <- read_csv("data/gflu_data.txt", skip = 11, show_col_types = FALSE) %>%
select(Date, Virginia) |>
mutate(Date = as.Date(Date))
gflu <- gflu[1:15, ]
```
@fig-glu-va shows the data for Virginia
```{r}
#| fig-cap: Google flu data for Virginia
#| label: fig-glu-va
ggplot(data = gflu, aes(x = Date, y = log(Virginia))) +
geom_line() +
geom_point() +
labs(x = "Date", y = "Log(flu cases)", title = "Virginia Google Flu Trends") +
theme_bw()
```
### PF with no-observations
First we are going to run the particle filter with all data missing. This is equal to the random walk.
The `sd_add` would have been determined from a state-space Bayesian MCMC chain that used previously available data.
The `sd_obs` is the observation uncertainty.
The `sd_init` sets the initial uncertainty in the model states (you could use the distribution of the last latent state from a state-space Bayesian MCMC chain)
The key decision is the `num_particles`. More is always better but it comes at a computational and computer memory cost.
```{r}
num_particles <- 25
y <- log(gflu$Virginia)
nt <- length(y)
#This sets all the observations to NA after the first
y[2:nt] <- NA
sd_init <- 0.5
sd_add <- 0.2
sd_obs <- 0.2
x <- array(NA, dim = c(nt, num_particles))
x[1, ] <- rnorm(num_particles, mean = y[1], sd = sd_obs)
### resampling bootstrap particle filter
for(t in 2:nt){
## forward step
for(m in 1:num_particles){
x[t, m ] <- x[t - 1, m ] + rnorm(1, mean = 0, sd = sd_add)
}
## analysis step
if(!is.na(y[t])){
## calculate Likelihood (weights)
wt <- dnorm(y[t], mean = x[t, ], sd = sd_obs) ## calculate likelihood (weight)
wt_norm <- wt / sum(wt)
## resample ensemble members in proportion to their weight
resample_index <- sample(1:num_particles, num_particles, replace = TRUE, prob = wt_norm)
x[t, ] <- x[t, resample_index] ## update state
}
}
```
@fig-parts-alone particles individually
```{r}
#| warning: false
#| fig-cap: Results of the particle filter with each particle shown
#| label: fig-parts-alone
tibble(time = 1:nt,
as_tibble(x)) %>%
pivot_longer(cols = -time, names_to = "ensemble", values_to = "x") %>%
mutate(x = exp(x)) %>%
ggplot(aes(x = time, y = x, group = factor(ensemble))) +
geom_line() +
labs(x = "Date", y = "Flu cases", title = "State of Virginia") +
theme_bw()
```
Lets save the output as a different name so we can compare to a PF with observations
```{r}
x_no_obs <- x
```
### PF with observations
Now we can examine how the PF uses observations to update the model and modify the trajectory. The following is the same as above except that there are data at week 1, 5, and 10.
```{r}
num_particles <- 25
y <- log(gflu$Virginia)
nt <- length(y)
#This sets all the observations to NA after the first
y[2:nt] <- NA
y[c(5, 10)] <- log(gflu$Virginia)[c(5, 10)]
sd_init <- 0.5
sd_add <- 0.2
sd_obs <- 0.2
x <- array(NA, dim = c(nt, num_particles))
x[1, ] <- rnorm(num_particles, mean = y[1], sd = sd_init)
### resampling bootstrap particle filter
for(t in 2:nt){
## forward step
for(m in 1:num_particles){
x[t, m ] <- x[t - 1, m ] + rnorm(1, 0, sd_add)
}
## analysis step
if(!is.na(y[t])){
## calculate Likelihood (weights)
wt <- dnorm(y[t], mean = x[t, ], sd = sd_obs) ## calculate likelihood (weight)
wt_norm <- wt / sum(wt)
## resample ensemble members in proportion to their weight
resample_index <- sample(1:num_particles, num_particles, replace = TRUE, prob = wt_norm)
x[t, ] <- x[t, resample_index] ## update state
}
}
```
@fig-sim-prior-post2 shows the particles with the observations. You can see how the particles are adjusted when data are present.
```{r}
#| warning: false
#| fig-cap: Results of the particle filter with each particle shown. Observations are included
#| label: fig-sim-prior-post2
tibble(time = 1:nt,
as_tibble(x),
obs = y) %>%
pivot_longer(cols = -c("time","obs"), names_to = "ensemble", values_to = "x") %>%
mutate(x = exp(x),
obs = exp(obs)) %>%
ggplot(aes(x = time, y = x, group = factor(ensemble))) +
geom_line() +
geom_point(aes(y = obs), color = "red") +
labs(x = "Date", y = "Flu cases", title = "State of Virginia") +
theme_bw()
```
Save the output as a different object to compare to other PF simulations
```{r}
x_with_obs <- x
```
Now we can compare the influence of data assimilation on the last 5 weeks of the time-series (think of this as a 5-week forecast)
```{r}
#| warning: false
no_obs <- tibble(time = gflu$Date,
as_tibble(x_no_obs)) %>%
pivot_longer(cols = -c("time"), names_to = "ensemble", values_to = "x") %>%
mutate(type = "no obs")
with_obs <- tibble(time = gflu$Date,
as_tibble(x_with_obs)) %>%
pivot_longer(cols = -c("time"), names_to = "ensemble", values_to = "x") %>%
mutate(type = "with obs")
combined <- bind_rows(no_obs, with_obs)
gflu$obs_in_fit <- exp(y)
combined %>%
group_by(time, type) %>%
mutate(x = exp(x)) %>%
summarise(mean = mean(x),
upper = quantile(x, 0.975),
lower = quantile(x, 0.025),.groups = "drop") %>%
ggplot(aes(x = time, y = mean)) +
geom_line(aes(color = type)) +
geom_ribbon(aes(ymin = lower, ymax = upper, color = type, fill = type), alpha = 0.2) +
geom_point(data = gflu, aes(x = Date, y = Virginia), color = "red") +
geom_point(data = gflu, aes(x = Date, y = obs_in_fit), color = "black") +
labs(x = "Date", y = "Flu cases", title = "Google Flu Trends for Virginia") +
theme_bw()
```
### PF with parameter estimation
The estimate of parameters using PF is also straightforward. Here we will estimate the `b1` parameter in the Dynamic Linear Model from the state-space exercise. The DLM uses minimum temperature as a covariant
First, create the observed data that includes the minimum temperature. (same as above and using the `daymetr` package from the previous exercise)
```{r}
#| warning: false
gflu <- read_csv("data/gflu_data.txt", skip = 11, show_col_types = FALSE) %>%
select(Date, Virginia) |>
mutate(Date = as.Date(Date))
df <- daymetr::download_daymet(site = "Blacksburg",
lat = 37.22,
lon = -80.41,
start = 2003,
end = 2016,
internal = TRUE)$data
df$date <- as.Date(paste(df$year,df$yday,sep = "-"),"%Y-%j")
Tmean <- mean(df$tmin..deg.c.)
gflu$Tmin = df$tmin..deg.c.[match(gflu$Date,df$date)] - Tmean
gflu <- gflu[1:15, ]
```
Now modify the PF with the DLM model instead of the random walk. This should be familiar to you.
The values for `b0`, `b2`, and `sd_add` are the mean values from the MCMC chain in the previous assignment.
We are estimating `b1` (the sensitivity to minimum temperature). Just like we need to initialize the states at the first time step, we will initialize the distribution of the b1 at the first time step using a normal distribution with a `mean = b1_mean` and `sd = b1_sd`. This were the mean an sd of b1 from the MCMC chain in the previous assignment.
```{r}
y <- log(gflu$Virginia)
num_particles <- 25
nt <- length(y)
y[2:nt] <- NA
y[c(5, 10)] <- log(gflu$Virginia)[c(5, 10)]
sd_init <- 0.5
sd_add <- 0.130144
sd_obs <- 0.2
b0 <- 0.482462
b2 <- -0.063887
b1_mean <- -0.002479
b1_sd <- 0.01
x <- array(NA, dim = c(nt, num_particles))
x[1, ] <- rnorm(num_particles, y[1], sd = sd_init)
b1 <- array(NA, dim = c(nt, num_particles))
b1[1, ] <- rnorm(num_particles, mean = b1_mean, sd = b1_sd)
Tmin <- gflu$Tmin
```
Now run the PF with the DLM model. You need to also carry through the values for b1.
Importantly, the distribution of b1 carries through from the previous time-step. When there is an observation, b1 is resampled using the same index that the states are resampled. This ensures that the parameters match the states from the same particle.
```{r}
for(t in 2:nt){
## forward step
for(m in 1:num_particles){
pred <- x[t-1, m] + b0 + b1[t-1, m] * Tmin[t] + b2*x[t-1, m]
x[t, m ] <- pred + rnorm(1, mean = 0, sd = sd_add)
b1[t, m] <- b1[t-1, m]
}
## analysis step
if(!is.na(y[t])){
## calculate Likelihood (weights)
wt <- dnorm(y[t], mean = x[t, ], sd = sd_obs) ## calculate likelihood (weight)
wt_norm <- wt / sum(wt)
## resample ensemble members in proportion to their weight
resample_index <- sample(1:num_particles, num_particles, replace = TRUE, prob = wt_norm)
x[t, ] <- x[t, resample_index] ## update state
b1[t, ] <- b1[t, resample_index] ## Parameter update
}
}
```
@fig-pf-parameters shows the states from the PF when parameter fitting is included
```{r}
#| warning: false
#| fig-cap: Results of the particle filter when parameter fitting is included
#| label: fig-pf-parameters
tibble(time = 1:nt,
obs = y,
as_tibble(x)) %>%
pivot_longer(cols = -c(time,obs), names_to = "ensemble", values_to = "x") %>%
mutate(x = exp(x)) %>%
ggplot() +
geom_line(aes(x = time, y = x, group = factor(ensemble))) +
geom_point(aes(x = time, y = exp(obs)), color = "red") +
labs(x = "Date", y = "Flu cases", title = "State of Virginia") +
theme_bw()
```
And visualize the time-evolution of the parameter (@fig-parameter-timeseries). There are two new concepts illustrated below:
1. parameters distributions evolve through time. As a result the distribution of parameters is strongly influenced by the most recent observations. The distributions produced by a PF are not the same as the distributions produced by MCMC chain.\
2. particles can have degeneracy, whereby the values of the parameters collapse down to one or a few values. This occurs because the PF does not propose new parameter values, it only selects (through resampling) parameter values from the initial set that you started with. Over the time-step the PF weeds out bad parameters and only a few ones are left. This is a major issue with a PF. Degeneracy can also occur in the states but since we are adding process uncertainty (`sd_add`) the particles are able to separate through time. There are a couple of ways to solve degeneracy - increase the number of particles so you sample more initial parameter values or propose new parameter values as part of the PF (how do to this is beyond the scope of this lecture)
```{r}
#| echo: false
#| warning: false
#| fig-cap: Timing evolution of the parameters
#| label: fig-parameter-timeseries
tibble(time = 1:nt,
as_tibble(b1)) %>%
pivot_longer(cols = -c(time), names_to = "ensemble", values_to = "b1") %>%
ggplot() +
geom_line(aes(x = time, y = b1, group = factor(ensemble))) +
labs(x = "Date", y = "B1", title = "State of Virginia") +
theme_bw()
```
### Sensitivity to number of particles
The number of particles is a key decision when using a PF. To explore the sensitivity of the PF to the number of particles, here is a function that can be reused with different numbers of particles. It is the same as the DLM above and returns the x and b1 for the particles in a list
```{r}
bootstrap_pf <- function(num_particles, sd_add = 0.2, sd_obs = 0.2){
y <- log(gflu$Virginia)
nt <- length(y)
y[2:nt] <- NA
y[c(5, 10)] <- log(gflu$Virginia)[c(5, 10)]
sd_init <- 0.5
b0 <- 0.482462
b2 <- -0.063887
b1_mean <- -0.002479
b1_sd <- 0.01
sd_add <- 0.130144
x <- array(NA, dim = c(nt, num_particles))
x[1, ] <- rnorm(num_particles, mean = y[1], sd = sd_init)
b1 <- array(NA, dim = c(nt, num_particles))
b1[1, ] <- rnorm(num_particles, b1_mean, sd = b1_sd)
Tmin <- gflu$Tmin
### resampling bootstrap particle filter
for(t in 2:nt){
## forward step
for(m in 1:num_particles){
pred <- x[t-1, m] + b0 + b1[t-1, m] * Tmin[t] + b2*x[t-1, m]
x[t, m ] <- pred + rnorm(1, mean = 0, sd = sd_add)
b1[t, m] <- b1[t-1, m]
}
## analysis step
if(!is.na(y[t])){
## calulate Likelihood (weights)
wt <- dnorm(y[t], mean = x[t, ], sd = sd_obs) ## calculate likelihood (weight)
wt_norm <- wt / sum(wt)
## resample ensemble members in proportion to their weight
resample_index <- sample(1:num_particles, num_particles, replace = TRUE, prob = wt_norm)
x[t, ] <- x[t, resample_index] ## update state
b1[t, ] <- b1[t, resample_index] ## Parameter update
}
}
return(list(x = x, b1 = b1))
}
```
First, run the PF using 10, 100, 1000, and 10000 particles
```{r}
pf_10 <- bootstrap_pf(10)
pf_100 <- bootstrap_pf(100)
pf_1000 <- bootstrap_pf(1000)
pf_10000 <- bootstrap_pf(10000)
p10 <- tibble(time = gflu$Date,
as_tibble(pf_10$x)) %>%
pivot_longer(cols = -c("time"), names_to = "ensemble", values_to = "x") %>%
mutate(type = "pf_10")
p100 <- tibble(time = gflu$Date,
as_tibble(pf_100$x)) %>%
pivot_longer(cols = -c("time"), names_to = "ensemble", values_to = "x") %>%
mutate(type = "pf_100")
p1000 <- tibble(time = gflu$Date,
as_tibble(pf_1000$x)) %>%
pivot_longer(cols = -c("time"), names_to = "ensemble", values_to = "x") %>%
mutate(type = "pf_1000")
p10000 <- tibble(time = gflu$Date,
as_tibble(pf_10000$x)) %>%
pivot_longer(cols = -c("time"), names_to = "ensemble", values_to = "x") %>%
mutate(type = "pf_10000")
gflu$obs_in_fit <- exp(y)
```
And combine to a single plot (@fig-pf-num-sens). You see the width of the 95% confidence interval increases substantially from 10 to 1000 particles but then is similar from 1000 to 10000. This reflects what we learned from the very first exercise where you drew random samples from a distribution and found that between 1000 and 10000 random samples were required to approximate the distribution well.
```{r}
#|warning: false
#|fig-cap: Sensitivity of predictions to different number of particles
#| label: fig-pf-num-sens
bind_rows(p10, p100, p1000, p10000) %>%
group_by(time, type) %>%
mutate(x = exp(x)) %>%
summarise(mean = mean(x),
upper = quantile(x, 0.975),
lower = quantile(x, 0.025),.groups = "drop") %>%
ggplot(aes(x = time, y = mean)) +
geom_line(aes(color = type)) +
geom_ribbon(aes(ymin = lower, ymax = upper, color = type, fill = type), alpha = 0.2) +
geom_point(data = gflu, aes(x = Date, y = Virginia), color = "red") +
geom_point(data = gflu, aes(x = Date, y = obs_in_fit), color = "black") +
labs(x = "Date", y = "Flu cases", title = "Google Flu Trends for Virginia") +
theme_bw()
```
### Senstivity to observation uncertainity
We can also explore the sensitivity of the PF updating to our observation uncertainty. Intuitively, we should have stronger update (i.e., the particles are adjusted to be closer to the observation) when there is less uncertainty in the observations. The code below explores whether this intuition is correct.
```{r}
#| warning: false
pf_low_obs <- bootstrap_pf(5000, sd_obs = 0.1)
pf_high_obs <- bootstrap_pf(5000, sd_obs = 0.5)
pf_low <- tibble(time = gflu$Date,
as_tibble(pf_low_obs$x)) %>%
pivot_longer(cols = -c("time"), names_to = "ensemble", values_to = "x") %>%
mutate(type = "low obs uncertainity")
pf_high <- tibble(time = gflu$Date,
as_tibble(pf_high_obs$x)) %>%
pivot_longer(cols = -c("time"), names_to = "ensemble", values_to = "x") %>%
mutate(type = "high obs uncertainity")
gflu$obs_in_fit <- exp(y)
```
Does @fig-pf-obs-sens match your intuition?
```{r}
#|warning: false
#|fig-cap: Sensitivity of predictions to observation uncertainty
#| label: fig-pf-obs-sens
bind_rows(pf_low, pf_high) %>%
group_by(time, type) %>%
mutate(x = exp(x)) %>%
summarise(mean = mean(x),
upper = quantile(x, 0.975),
lower = quantile(x, 0.025),.groups = "drop") %>%
ggplot(aes(x = time, y = mean)) +
geom_line(aes(color = type)) +
geom_ribbon(aes(ymin = lower, ymax = upper, color = type, fill = type), alpha = 0.2) +
geom_point(data = gflu, aes(x = Date, y = Virginia), color = "red") +
geom_point(data = gflu, aes(x = Date, y = obs_in_fit), color = "black") +
labs(x = "Date", y = "Flu cases", title = "Google Flu Trends for Virginia") +
theme_bw()
```
## Reading
## Problem set
The problem set is located in `particle_filter_problem_set.qmd` in https://github.com/frec-5174/book-problem-sets. You can fork the repository or copy the [code](https://github.com/frec-5174/book-problem-sets/blob/main/particle_filter_problem_set.qmd) into your own quarto document