Skip to content

Commit

Permalink
Merge pull request #3 from StateOfTheR/01_nimble
Browse files Browse the repository at this point in the history
Nimble
  • Loading branch information
Marie Etienne authored Aug 22, 2024
2 parents 76c2022 + 0d6ab3f commit ead8f9d
Show file tree
Hide file tree
Showing 2 changed files with 126 additions and 101 deletions.
127 changes: 126 additions & 1 deletion 01_nimble.qmd
Original file line number Diff line number Diff line change
Expand Up @@ -452,7 +452,7 @@ make_MCMC_comparison_pages(res, modelName = 'code_neg_bin',dir = "/tmp/",
```

# Another example: multivariate normal with zeri inflated component
# Another example: multivariate normal with zero inflated component

We consider a multivariate Gaussian model with zero inflation, where the probability in the zero inflation can depend on the variable. The vector of observations $Y_i$ lives in $\mathbb{R}^p$, and its distribution is defined conditionnally on a (multivariate) Bernoulli $W_i \in {0,1}^p$ and a multivariate Gaussian variable $Z_i\in\mathbb{R}^p$:

Expand Down Expand Up @@ -616,3 +616,128 @@ ggs(my_MCMC) %>% filter(stringr::str_detect(Parameter, "prec")) %>%
#| label: true-precision
print(round(Omega,3))
```


# Zero inflated multivariate normal revisited

All the above code uses a workaround to avoid defining a new distribution in Nimble which is a ZI multivariate normal.

Let $Y$ be a random vector.
We denote $\mathbf{i}_*$ the set of indexes for which $Y$ is non zero, and $\mathbf{i}_0$ the set of indices for which $Y$ is 0.

For the zero inflated normal distribution, the p.d.f. is given by:
$$
p(y \vert \mu, \Sigma, \pi) = \varphi(y_{\mathbf{i}_*}\vert \mu_{\mathbf{i}_*}, \Sigma_{\mathbf{i}_*;\mathbf{i}_*})\prod_{j\in \mathbf{i_0}}\pi_j \prod_{k\in \mathbf{i}_*}(1 - \pi_k)\,,
$$
where $\varphi(y_{\mathbf{i}_*}\vert \mu_{\mathbf{i}_*}, \Sigma_{\mathbf{i}_*;\mathbf{i}_*})$ is the p.d.f. of a multivariate normal distribution restricted to the indexes where $y$ is non 0 (and to the corresponding parameters).
This distribution can be coded as a `nimbleFunction`.
It is important here that the first argument must correspond to the value at which the density is evaluated, and that the `log` argument is mandatory.

```{r}
#| label: dZInormal
dZInormal <- nimbleFunction(
run = function(x = double(1),
prob = double(1),
mu = double(1),
sigma = double(2),
log = integer(0, default = 0)){
returnType(double(0))
non_nul_indexes <- which(x!=0)
nul_indexes <- which(x == 0)
p_term <- sum(log(prob[nul_indexes])) +
sum(log(1 - prob[non_nul_indexes]))
mu_term <- 0
if(length(non_nul_indexes) > 0){
chol_mat <- chol(sigma[non_nul_indexes, non_nul_indexes])
restricted_x = x[non_nul_indexes]
restricted_mu = mu[non_nul_indexes]
mu_term <- dmnorm_chol(restricted_x,
restricted_mu,
chol_mat, prec_param = FALSE, log = TRUE)
}
log_output <- p_term + mu_term
if(log){
return(log_output)
}
else{
return(exp(log_output))
}
}
)
```

From my experience, it is better to register this distribution to avoid any confusion about its parameters, nature and dimension of output, etc...

```{r}
registerDistributions(list(
dZInormal = list(BUGSdist = "dZInormal(prob, mu, sigma)", # How to call in nimble
discrete = FALSE, # Distribution is not discrete
pqAvail = FALSE, # CDF and quantile function are not available
types = c('value = double(1)', # The random variable is a vector
'prob = double(1)', # a vector
'mu = double(1)', # vector
'sigma = double(2)')) # double(2) is a matrix
))
```

Note that it tells us that we do not provide any way of simulating from this distribution. This is not a problem to run a standard MCMC.
Moreover, providing CDF and quantile function could allow some gain in efficiency (at least, that what the help pages says).

The code for the model is then direct:

```{r}
my_code <- nimbleCode({
for(j in 1:p){
prob[j] ~ dunif(0, 1)
mu[j] ~ dnorm(0, 1)
}
sigma[1:p, 1:p] ~ dwish(Ip[1:p, 1:p], p)
for(i in 1:n){
# I put my custom distribution
Y[i, 1:p] ~ dZInormal(prob[1:p], mu[1:p], sigma[1:p, 1:p])
}
})
```

And now, let's run it!


```{r}
#| label: fit-ZI-normal
#| cache: true
# Generating data
set.seed(123)
n_obs <- 1000
n_species <- 3
# Values
U <- mvtnorm::rmvnorm(n = n_obs,
mean = 0:(n_species - 1),
sigma = diag(1, n_species))
# Mask (matrix of zeros and ones)
Z <- rbinom(n = n_obs * n_species, size = 1, prob = .8) %>%
matrix(nrow = n_obs)
# Observations
Y <- round(U * Z, 9)
my_model <- nimbleModel(code = my_code,
constants = list(p = n_species,
n = n_obs,
Ip = diag(1, n_species)),
data = list(Y = Y),
inits = list(mu = rep(0, n_species),
prob = rep(0.5, n_species),
sigma = diag(1, n_species)))
results <- nimbleMCMC(my_model,
samplesAsCodaMCMC = TRUE,
nchains = 2, niter = 10000,
nburnin = 1000, thin = 10)
ggs(results) %>%
ggplot(aes(x = Iteration, y = value, color = factor(Chain))) +
facet_wrap(~Parameter, scales = "free") +
geom_line()
```


100 changes: 0 additions & 100 deletions script_dZInormal.R

This file was deleted.

0 comments on commit ead8f9d

Please sign in to comment.