Implementation

The ACAPT package is a draft product of an intersessional group of ACAP. The approach to the analysis of trend data facilitated by this package will be discussed in full at the 15th meeting of the ACAP Advisory Committee and associated Working Groups in May/June 2026. This approach will only be adopted by ACAP if formally agreed by the ACAP Advisory Committee.

Rationale

The ACAPT package provides facilities for estimating population trends of long-lived albatross and petrel species from survey data.

ACAPT is heavily influenced by the JARA package; users with more general needs may find JARA is better suited to their application.

Base Model

Annual estimates of total population size over the survey period are obtained by fitting a state space model to repeated surveys of one or more subpopulations. ACAPT offers two variant models that share a common core structure in which the log size of each subpopulation is modelled as a random walk.

The population size \(N_{ti}\) for subpopulation \(i\) in year \(t\) is assumed to evolve according to the multiplicative model \[ N_{(t+1)i} = \lambda_{ti} N_{ti} \] where the annual growth rates \(\lambda_{ti}\) are assumed to be log-Normally distributed \[ \lambda_{ti} \sim \operatorname{LN}(q_{t},\sigma^{2}) \] with the mean parameter \(q_{t}\) common to all subpopulations, and constant variance parameter \(\sigma^{2}\).

Log transforming, the model can be expressed in the form of a Normal random walk \[ \begin{aligned} x_{(t+1)i} & = x_{ti} + r_{ti} \\ r_{ti} & \sim \operatorname{N}(q_{t},\sigma_{r}^{2}) \end{aligned} \] where \(x_{ti} = \log N_{ti}\), \(r_{ti} = \log(\lambda_{ti})\) and \(\sigma_{r}=\sigma\).

It is assumed the subpopulations are intermittently surveyed, and the survey estimate \(y_{ti}\) for subpopulation \(i\) in year \(t\) is assumed to have a log-Normal distribution \[ y_{ti} \sim \operatorname{LN}(x_{ti}, \sigma_{y}^{2}+\varsigma_{ti}^{2}) \] with mean parameter \(x_{ti}\) and total variance parameter \(\sigma_{y}^{2}+\varsigma_{ti}^{2}\), consisting of a component of variance \(\varsigma_{ti}^{2}\) associated with the individual survey and an additional component of variance \(\sigma_{y}^{2}\) that is common to all surveys.

In practice, the precision of the survey is specified as the coefficient of variation \(\mathrm{cv}_{ti}\) of the survey estimate \(y_{ti}\), and \(\varsigma_{ti}^{2}\) is derived through the relation \[ \varsigma_{ti}^{2} = \log \left ( \mathrm{cv}_{ti}^{2}+1 \right ). \]
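For example, the log-scale survey variance implied by a hypothetical cv of 0.15 can be computed directly:

```r
## Hypothetical survey cv of 15%
cv <- 0.15
varsigma2 <- log(cv^2 + 1)
## For small cv, varsigma2 is approximately cv^2 (here about 0.0223)
```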

A Normal prior is assumed for the initial population size \(x_{t_{0}i}\) \[ x_{t_{0}i} \sim \operatorname{N}(\mu_{x_{t_{0}i}}, \tau_{x_{t_{0}i}}^{-1}) \] and Gamma priors are assumed for the precisions \(\tau_{y}=\sigma_{y}^{-2}\) and \(\tau_{r}=\sigma_{r}^{-2}\) \[ \begin{aligned} \tau_{y} & \sim \operatorname{Gamma}(a_{y}, b_{y}) \\ \tau_{r} & \sim \operatorname{Gamma}(a_{r}, b_{r}). \end{aligned} \] By default, the parameters \(a_{y}\), \(b_{y}\), \(a_{r}\) and \(b_{r}\) are chosen so that the priors on the precisions \(\tau_{y}\) and \(\tau_{r}\) are relatively diffuse. When relatively few surveys are available these precisions will be poorly determined by the data, and it may be necessary to impose more restrictive priors on these precisions.
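As a rough guide when choosing these parameters, the mean of a Gamma(a, b) prior is \(a/b\); under the default Gamma(1, 0.01) prior on \(\tau_{r}\), for example, the precision is centred near 100, so \(\sigma_{r} \approx 0.1\) a priori (a sketch of the arithmetic, not package output):

```r
## Default Gamma(1, 0.01) prior on the process precision tau_r
r.a <- 1; r.b <- 0.01
tau.r.mean <- r.a/r.b                  # prior mean precision of 100
sigma.r.implied <- 1/sqrt(tau.r.mean)  # implied process sd near 0.1 a priori
```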

Bayesian estimates of the model parameters are obtained by calling JAGS to sample from the posterior distribution of the model parameters.

ACAPT provides two variant models. The base model assumes the mean log growth rate \(q_{t}\) is constant over time. In the extended model the mean log growth rate \(q_{t}\) is a linear function of model covariates.

Survey Data

The survey data is provided as two dataframes d.est and d.cv. The d.est dataframe should contain the survey estimates of the subpopulation sizes (\(y_{ti}\)) and the d.cv dataframe should contain the corresponding coefficients of variation of the subpopulation size estimates. If a subpopulation is not surveyed in a given year, the corresponding entry in d.est should be NA and the corresponding entry in d.cv should be 0.

Format

The d.est and d.cv dataframes should each have a first column Year giving the year of the survey, followed by one column for each surveyed subpopulation. The column names of the two dataframes must be identical, there must be one row for each year to be modelled, the Year columns of the two dataframes must be identical, and the set of modelled years must be contiguous.
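As an illustration, a hypothetical pair of dataframes for two subpopulations (A and B) surveyed over three years might look like:

```r
## Hypothetical surveys of two subpopulations (A and B) over three years;
## unsurveyed years are NA in d.est and 0 in d.cv
d.est <- data.frame(Year=2000:2002, A=c(100, NA, 120), B=c(50, 55, NA))
d.cv  <- data.frame(Year=2000:2002, A=c(0.10, 0, 0.10), B=c(0.20, 0.20, 0))
```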

The checkData1 function checks that the Year column is present and appropriately formatted, and optionally checks whether the dataframe contains NA values. If these conditions are not met, an error is raised.

checkData1 <- function(d,check.na=TRUE) {

  ## First column must be Year
  if(colnames(d)[1]!="Year")
    stop("The survey dataframes must have Year as first column")

  ## Year column must not contain NA values
  if(any(is.na(d$Year)))
    stop("The Year column must not contain NA values")

  ## Check years are contiguous
  if(!all(d$Year==seq.int(min(d$Year,na.rm=TRUE),max(d$Year,na.rm=TRUE))))
    stop("Years in the survey dataframes must be contiguous")

  ## Check for NA values
  if(check.na && any(is.na(d)))
    stop("The dataframe must not contain NA values")
}

The checkData2 function checks that each input dataframe is appropriately formatted, and that the column names and year ranges match. If these conditions are not met, an error is raised.

checkData2 <- function(d.est,d.cv) {

  ## Check the individual dataframes
  checkData1(d.est,check.na=FALSE)
  checkData1(d.cv,check.na=TRUE)

  ## Check column names match
  if(!identical(colnames(d.est),colnames(d.cv)))
    stop("The d.est and d.cv dataframes must have identical column names")

  ## Check year ranges match
  if(nrow(d.est)!=nrow(d.cv) || !all(d.est$Year==d.cv$Year))
    stop("Year range of the d.est and d.cv dataframes must match")
}

The mergeSurveyDF function merges the two dataframes of survey data into a single dataframe in “long” format, with four columns: Year, Sub, Est, and CV.

mergeSurveyDF <- function(d.est,d.cv,drop.na=TRUE) {

  checkData2(d.est,d.cv)
  ## Expand to long format
  d <- data.frame(expand.grid(Year=d.est$Year,Sub=colnames(d.est)[-1]),
                  Est=as.vector(as.matrix(d.est[,-1])),
                  CV=as.vector(as.matrix(d.cv[,-1])))
  if(drop.na) d <- d[!is.na(d$Est),]
  d
}
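The effect of mergeSurveyDF on a small hypothetical dataset can be sketched standalone (without the validation step):

```r
d.est <- data.frame(Year=2000:2002, A=c(100, NA, 120), B=c(50, 55, NA))
d.cv  <- data.frame(Year=2000:2002, A=c(0.10, 0, 0.10), B=c(0.20, 0.20, 0))
## Expand to long format and drop the unsurveyed years
d <- data.frame(expand.grid(Year=d.est$Year, Sub=colnames(d.est)[-1]),
                Est=as.vector(as.matrix(d.est[,-1])),
                CV=as.vector(as.matrix(d.cv[,-1])))
d <- d[!is.na(d$Est),]
## Four surveys remain: A in 2000 and 2002, B in 2000 and 2001
```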

Simulation

It is straightforward to simulate directly from the core model given the mean log growth rate \(q_{t}\), the model variances, and the initial values. The ACAPTsimulate0 function simulates survey data given a dataframe d.cv of coefficients of variation of the surveys, a vector of initial log subpopulation sizes x1, a vector of log growth rates q, and the two model standard deviations y.sigma (\(\sigma_{y}\)) and r.sigma (\(\sigma_{r}\)). Zero entries in d.cv indicate that the subpopulation was not surveyed in that year.

ACAPTsimulate0 <- function(d.cv,x1,q,y.sigma,r.sigma) {

  ## Extract the survey precision data
  se2 <- log(as.matrix(d.cv[,-1])^2+1)
  n.yr <- nrow(se2)
  n.sub <- ncol(se2)
  ## Simulate the individual subpopulation log growth rates
  r <- matrix(rnorm(n.sub*(n.yr-1),rep(q,length.out=n.yr-1),r.sigma),
              nrow=n.yr-1,ncol=n.sub)
  ## Random walk for log subpopulation size
  x <- matrix(0,nrow=n.yr,ncol=n.sub)
  for(k in 1:n.sub) x[,k] <- cumsum(c(x1[k],r[,k]))
  ## Simulate survey counts based on se2
  sigma <- sqrt(se2+y.sigma^2)
  y <- matrix(rlnorm(n.yr*n.sub,x,sigma),nrow=n.yr,ncol=n.sub)
  y[se2==0] <- NA
  ## Label by subpopulation
  colnames(r) <- colnames(x) <- colnames(y) <- colnames(se2)
  ## Total population size and growth rate
  N <- rowSums(exp(x))
  lambda <- N[-1]/N[-length(N)]
  ## Collate results
  list(d.est=cbind(Year=d.cv$Year,as.data.frame(y)),
       d.cv=d.cv,
       q=q,r=r,x=x,y=y,N=N,lambda=lambda,
       y.sigma=y.sigma,r.sigma=r.sigma)
}
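The random walk at the core of this simulation can be sketched for a single subpopulation (a minimal standalone example with hypothetical parameter values):

```r
set.seed(1)
n.yr <- 10
q <- 0.02        # hypothetical mean log growth rate
r.sigma <- 0.05  # hypothetical process standard deviation
## Annual log growth rates and the resulting log population random walk
r <- rnorm(n.yr - 1, q, r.sigma)
x <- cumsum(c(log(500), r))  # starting population of 500
N <- exp(x)
```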

Models

ACAPT offers two variants of the core model which differ only in the representation of the mean log growth rate \(q_{t}\). The JAGS software package is used to derive Bayesian estimates of the model parameters by Markov Chain Monte Carlo. The functions ACAPTmodel and ACAPTmodel1 create JAGS scripts and data that define the base and extended models respectively. The objects returned by these functions are then used in conjunction with JAGSsample to sample from the posterior for the model parameters.

Base Model

The base model assumes the mean log growth rate \(q_{t}\) is constant over time \[ q_{t} = q. \]

In addition to the Normal prior for \(x_{t_{0}i}\) and the Gamma priors for the precisions \(\tau_{y}=\sigma_{y}^{-2}\) and \(\tau_{r}=\sigma_{r}^{-2}\), a Normal prior is assumed for \(q\) \[ q \sim \operatorname{N}(\mu_{q}, \tau_{q}^{-1}). \]

The ACAPTmodel function prepares the JAGS code and data to sample from the model, given the dataframes d.est and d.cv of survey data, and the parameters of the priors

ACAPTmodel <- function(d.est,d.cv,
                        x1.mu,x1.tau,q.mu=0,q.tau=0.01,
                        y.a=0.001,y.b=0.001,r.a=1,r.b=0.01) {

  checkData2(d.est,d.cv)

  ## JAGS model code
  jags.model <- "
model {

  for(i in 1:N.subs){

    ## Observation model
    for(k in 1:N.yrs) {
      sigma2[k,i] <- obs.sigma2[k,i]+1/y.tau
      tau[k,i] <- 1/sigma2[k,i]
      logy[k,i] ~ dnorm(x[k,i],tau[k,i])
    }

    ## Process model
    for(k in 1:(N.yrs-1)) {
      x[k+1,i] <- x[k,i]+r[k,i]
      r0[k,i] ~ dnorm(0,r.tau)
      r[k,i] <- r0[k,i]+q
    }

    ## Prior for initial x
    x[1,i] ~ dnorm(x1.mu[i],x1.tau[i])
  }

  ## Priors
  q ~ dnorm(q.mu,q.tau)
  y.tau ~ dgamma(y.a,y.b)
  r.tau ~ dgamma(r.a,r.b)
}"

  ## Parameters to be sampled
  jags.par <- c("x","r","q","y.tau","r.tau")

  ## Model data
  est <- as.matrix(d.est[,-1])
  se2 <- log(as.matrix(d.cv[,-1])^2+1)
  jags.data <- list(N.subs=ncol(est),
                    N.yrs=as.integer(nrow(est)),
                    logy=log(est),
                    obs.sigma2=se2,
                    x1.mu=x1.mu,x1.tau=x1.tau,
                    q.mu=q.mu,q.tau=q.tau,
                    y.a=y.a,y.b=y.b,
                    r.a=r.a,r.b=r.b)
  jags.init <- list(y.tau=100,r.tau=100)

  ## Collate model components
  obj <- list(call=match.call(),
              jags.model=jags.model,jags.par=jags.par,
              jags.data=jags.data,jags.init=jags.init,
              years=d.est$Year,subs=colnames(d.est)[-1])
  class(obj) <- "ACAPT"
  obj
}
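The initial-state prior parameters x1.mu and x1.tau have no defaults; one plausible choice (an assumption here, not a package recommendation) is to centre the prior on the log of the first survey of each subpopulation with a generous spread:

```r
## Hypothetical first survey estimates for two subpopulations
first.est <- c(A=100, B=50)
x1.mu <- log(first.est)
x1.tau <- rep(1, length(x1.mu))  # prior sd of 1 on the log scale
## m <- ACAPTmodel(d.est, d.cv, x1.mu=x1.mu, x1.tau=x1.tau)
```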

The ACAPTsimulate function calls ACAPTsimulate0 to simulate from the base model.

ACAPTsimulate <- function(d.cv,x1,q,y.sigma,r.sigma) {

  checkData1(d.cv,check.na=TRUE)
  ## Simulate survey data given q (constant)
  ACAPTsimulate0(d.cv,x1,q,y.sigma,r.sigma)
}

Extended Model

The extended model assumes the mean log growth rate \(q_{t}\) has a regression structure \[ q = X \beta \] where \(X\) is an arbitrary design matrix provided by the user, and \(\beta\) is a vector of regression coefficients. The extended model reduces to the base model if the design matrix corresponds to just an intercept.
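As an illustration, a design matrix encoding an intercept and a centred linear trend in year might be built as follows (the covariate is hypothetical; any design matrix with one row per modelled growth rate can be supplied):

```r
years <- 2000:2019
## One row per growth rate, i.e. one fewer row than the modelled years
X <- cbind(1, scale(head(years, -1)))
colnames(X) <- c("Intercept", "Trend")
```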

In addition to the Normal prior for \(x_{t_{0}i}\) and the Gamma priors for the precisions \(\tau_{y}=\sigma_{y}^{-2}\) and \(\tau_{r}=\sigma_{r}^{-2}\), independent Normal priors are imposed on the regression coefficients \[ \beta_{k} \sim \operatorname{N}(\mu_{k}, \tau_{k}^{-1}). \]

The ACAPTmodel1 function prepares the JAGS code and data to sample from the model, given the dataframes d.est and d.cv of survey data, a design matrix \(X\), and the parameters of the priors. As \(q\) is not required for the final year of data, \(X\) should have one less row than the survey data, but for consistency \(X\) is permitted to have the same number of rows as the survey data, in which case the final row is discarded.

ACAPTmodel1 <- function(d.est,d.cv,X,
                        beta.mu=0,beta.tau=0.01,x1.mu,x1.tau,
                        y.a=0.001,y.b=0.001,r.a=1,r.b=0.01) {

  checkData2(d.est,d.cv)
  ## Check X and beta are consistent with data
  if(nrow(X)==nrow(d.est)) X <- X[-nrow(X),,drop=FALSE]
  if(nrow(X)!=nrow(d.est)-1) stop("X must have one less row than d.est")
  beta.mu <- rep(beta.mu,length.out=ncol(X))
  beta.tau <- rep(beta.tau,length.out=ncol(X))

  ## JAGS model code
  jags.model <- "
model {

  for(i in 1:N.subs){

    ## Observation model
    for(k in 1:N.yrs) {
      sigma2[k,i] <- obs.sigma2[k,i]+1/y.tau
      tau[k,i] <- 1/sigma2[k,i]
      logy[k,i] ~ dnorm(x[k,i],tau[k,i])
    }

    ## Process model
    for(k in 1:(N.yrs-1)) {
      x[k+1,i] <- x[k,i]+r[k,i]
      r0[k,i] ~ dnorm(0,r.tau)
      r[k,i] <- r0[k,i]+q[k]
    }

    ## Prior for initial x
    x[1,i] ~ dnorm(x1.mu[i],x1.tau[i])
  }

  ## Linear predictor
  q <- X %*% beta

  ## Priors
  y.tau ~ dgamma(y.a,y.b)
  r.tau ~ dgamma(r.a,r.b)
  for(k in 1:length(beta.mu)) {
    beta[k] ~ dnorm(beta.mu[k],beta.tau[k])
  }
}"

  ## Parameters to be sampled
  jags.par <- c("x","r","q","beta","y.tau","r.tau")

  ## Model data
  est <- as.matrix(d.est[,-1])
  se2 <- log(as.matrix(d.cv[,-1])^2+1)
  jags.data <- list(N.subs=ncol(est),
                    N.yrs=as.integer(nrow(est)),
                    logy=log(est),
                    obs.sigma2=se2,
                    X = as.matrix(X),
                    beta.mu=beta.mu,beta.tau=beta.tau,
                    x1.mu=x1.mu,x1.tau=x1.tau,
                    y.a=y.a,y.b=y.b,
                    r.a=r.a,r.b=r.b)
  jags.init <- list(y.tau=100,r.tau=100)

  ## Collate model components
  obj <- list(call=match.call(),
              jags.model=jags.model,jags.par=jags.par,
              jags.data=jags.data,jags.init=jags.init,
              years=d.est$Year,subs=colnames(d.est)[-1])
  class(obj) <- "ACAPT"
  obj
}

The ACAPTsimulate1 function simulates from the extended model given the model parameters. It generates the mean log growth rate \(q_{t}\) from the regression model, which is then passed to ACAPTsimulate0 to generate the corresponding survey data. Again, for consistency, \(X\) is permitted to have the same number of rows as the survey data, in which case the final row is discarded.

ACAPTsimulate1 <- function(d.cv,x1,X,beta,y.sigma,r.sigma) {

  checkData1(d.cv,check.na=TRUE)
  ## Check X and beta are consistent with data
  if(nrow(X)==nrow(d.cv)) X <- X[-nrow(X),,drop=FALSE]
  if(nrow(X)!=nrow(d.cv)-1) stop("X must have one less row than d.cv")
  if(length(beta)!=ncol(X)) stop("beta must be of length ncol(X)")

  ## Linear predictor for mean log growth rate
  q <- X %*% beta
  ## Simulate survey data given q
  sim <- ACAPTsimulate0(d.cv,x1,q,y.sigma,r.sigma)
  sim$X <- X
  sim$beta <- beta
  sim
}

Sampling

The JAGSsample function uses the facilities of the rjags package to call JAGS to sample from the posterior distribution of the model parameters, given the JAGS code and data generated by one of the model functions. The samples generated by JAGS are annotated and combined to form samples of the total population size and growth rate, and are returned as a list of mcarray objects, one for each model parameter.

JAGSsample <- function(model,n.chains=4,n.adapt=1000,n.update=1000,n.iter=5000,n.thin=1,
                       quiet=!interactive()) {
  ## Ensure the glm module is loaded
  load.module("glm",quiet=quiet)
  pb <- if(quiet) "none" else getOption("jags.pb","text")
  ## Create the JAGS model
  jags <- jags.model(file=textConnection(model$jags.model),
                     data=model$jags.data,
                     inits=model$jags.init,
                     n.chains=n.chains,
                     n.adapt=n.adapt,
                     quiet=quiet)
  ## Burnin
  update(jags,n.iter=n.update,progress.bar=pb)
  ## Sample
  s <- jags.samples(jags,model$jags.par,n.iter=n.iter,thin=n.thin,progress.bar=pb)
  ## Combine subpopulation samples
  s <- augmentSamples(s,model$years,model$subs)
  s
}

The augmentSamples function generates samples of the annual total population size \(N_{t}\) and population growth rate \(\lambda_{t}\) from the samples returned by JAGS, and annotates the samples with year and subpopulation labels.

augmentSamples <- function(s,years,subs) {

  ## Calculate the total population from the log subpopulation estimates
  s$N <- apply(exp(s$x),c(1,3,4),sum)
  class(s$N) <- "mcarray"
  attr(s$N,"varname") <- "N"
  attr(s$N,"type") <- attr(s$x,"type")
  attr(s$N,"iterations") <- attr(s$x,"iterations")

  ## Calculate the annual population growth rate
  s$lambda <- s$N[-1,,]/s$N[-nrow(s$N),,]
  class(s$lambda) <- "mcarray"
  attr(s$lambda,"varname") <- "lambda"
  attr(s$lambda,"type") <- attr(s$x,"type")
  attr(s$lambda,"iterations") <- attr(s$x,"iterations")

  ## Label samples by year and subpopulation
  n <- length(years)
  dimnames(s$x) <- list(Year=years,Sub=subs)
  dimnames(s$r) <- list(Year=years[-n],Sub=subs)
  if(dim(s$q)[1]>1) dimnames(s$q) <- list(Year=years[-n])
  dimnames(s$N) <- list(Year=years)
  dimnames(s$lambda) <- list(Year=years[-n])
  s
}

The ppSamples function generates samples from the posterior predictive distribution for the survey estimates from the samples returned by JAGS. The samples are returned as an mcarray object yp with the same dimensions as the samples of the log subpopulation sizes \(x_{ti}\).

ppSamples <- function(model,s,all=FALSE) {

  ## Create the total variance
  sigma <- sqrt(outer(model$jags.data$obs.sigma2,1/s$y.tau[1,,],FUN="+"))
  ## Draw samples from the posterior predictive distribution
  if(all) {
    yp <- array(rlnorm(length(s$x),s$x,sigma),dim(s$x))
  } else {
    yp <- array(NA,dim(s$x))
    keep <- as.vector(model$jags.data$obs.sigma2!=0)
    mn <- s$x[keep]
    yp[keep] <- rlnorm(length(mn),mn,sigma[keep])
  }
  ## Add annotations
  dimnames(yp) <- dimnames(s$x)
  class(yp) <- "mcarray"
  attr(yp,"varname") <- "yp"
  attr(yp,"type") <- attr(s$x,"type")
  attr(yp,"iterations") <- attr(s$x,"iterations")
  yp
}

Population change

The current population trend is estimated by extrapolating the population change observed in recent history to determine the population change over a target projection interval.

Given samples of the annual population sizes \(N_{t}\), samples of the (fractional) population change over a reference time interval \(T\) are calculated by first computing the population change \(N_{t}/N_{t-\delta}\) over the baseline time interval \(\delta\) for each year in the time window \(W\), and then extrapolating this ratio to the reference time interval \(T\), so that \[ R_{t} = \left ( \frac{N_{t}}{N_{t-\delta}} \right )^{\frac{T}{\delta}} \qquad t \in W. \]
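As a worked example of the extrapolation, suppose a hypothetical population fell from 1000 to 900 over a baseline interval of \(\delta = 10\) years; over a reference interval of \(T = 30\) years this corresponds to

```r
ratio <- 900/1000   # change over the baseline interval (hypothetical)
R <- ratio^(30/10)  # 0.729, i.e. a 27.1% decline over the reference interval
```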

The populationChange function generates samples of the population change over a reference time interval reference by extrapolating the population change over interval for each year in window.

populationChange <- function(N,window,interval,reference=interval) {

  ## Indices of the reference window
  k <- match(window,as.numeric(dimnames(N)[[1]]))
  k <- k[k-interval>0]
  ## Calculate the population change over the specified interval
  R <- N[k,,]/N[k-interval,,]
  ## Extrapolate to the projection interval
  if(reference!=interval) R <- R^(reference/interval)
  attr(R,"varname") <- "R"
  R
}

The changeCategorySummary function summarizes the samples produced by populationChange as a table of the proportions of samples falling in each change category. The categories are defined by a vector of thresholds giving the boundaries between categories, and a vector of category labels; there must be one more label than there are thresholds.

changeCategorySummary <- function(r,thresholds,labels) {

  ## Collapse array
  r <- as.vector(r)

  ## Check labels
  if(length(labels)!=length(thresholds)+1)
    stop("There should be one more label than thresholds")

  ## Classify into categories and tabulate
  cls <- cut(r,breaks=c(0,thresholds,max(r)),labels=labels)
  table(cls)/length(r)
}
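The classification step can be illustrated with hypothetical thresholds and labels (these are illustrative only, not the IUCN criteria):

```r
r <- c(0.4, 0.6, 0.8, 1.1)   # hypothetical samples of population change
thresholds <- c(0.5, 0.7)    # hypothetical category boundaries
labels <- c("Severe", "Moderate", "Stable")
cls <- cut(r, breaks=c(0, thresholds, max(r)), labels=labels)
prop <- table(cls)/length(r) # Severe 0.25, Moderate 0.25, Stable 0.5
```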

The changeHistogram function summarizes the samples produced by populationChange to a format suitable for constructing a histogram of the posterior distribution of population changes annotated with change level categories. This function creates change level category labels annotated with the percentage of samples that fall into each category. It then bins and tabulates the samples to produce the binned densities needed to construct a histogram, and allocates each bin to a change level category. The thresholds should align with the histogram bins, and a warning is issued if that is not the case.

changeHistogram <- function(r,max,bin,thresholds,labels) {

  ## Calculate the percentage of samples in each category
  pct <- round(100*changeCategorySummary(r,thresholds,labels),1)
  pct[length(pct)] <- round(100-sum(pct[-length(pct)]),1)
  ## Create annotated labels
  ann <- paste0(labels," (",pct,"%)")
  ann <- ordered(ann,levels=ann)

  ## Histogram bins and midpoints
  brks <- seq(0,max,bin)
  midpts <- brks[-1]-bin/2
  if(!all(sapply(thresholds,function(x) min(abs(x-brks)) < 1.0E-8)))
    warning("Thresholds should align with histogram bins")

  ## Bin samples and tabulate
  z <- cut(r[r<max],breaks=brks)
  tbl <- as.data.frame(table(z),stringsAsFactors=FALSE)
  ## Classify midpoints into categories
  cls <- cut(midpts,breaks=c(0,thresholds,max),labels=labels)
  data.frame(Change=midpts,
             Density=tbl$Freq/(bin*length(r)),
             Class=cls,
             Label=ann[cls])
}

The changeDensity function is similar to changeHistogram, but summarizes the samples to a format suitable for constructing a density plot instead of a histogram. The density of the samples is estimated with logKDE::logdensity_fft, and the resulting curve is segmented into polygons that describe the density of samples in each change level category.

changeDensity <- function(r,max,thresholds,labels,n=512,adjust=2) {

  ## Calculate the percentage of samples in each category
  pct <- round(100*changeCategorySummary(r,thresholds,labels),1)
  pct[length(pct)] <- round(100-sum(pct[-length(pct)]),1)
  ## Create annotated labels
  ann <- paste0(labels," (",pct,"%)")
  ann <- ordered(ann,levels=ann)

  ## Density estimates
  d <- logdensity_fft(r,from=min(r),to=max,n=n,adjust=adjust)
  x <- d$x[is.finite(d$y)]
  y <- d$y[is.finite(d$y)]
  brks <- c(min(x),thresholds,max)
  do.call(rbind,lapply(seq_along(labels),function(k) {
    xs <- c(brks[k],x[x>brks[k] & x<brks[k+1]],brks[k+1])
    ys <- approx(x,y,xout=xs)$y
    data.frame(Change=c(brks[k],xs,brks[k+1]),
               Density=c(0,ys,0),
               Class=labels[k],
               Label=ann[k])
  }))
}

The functions iucnHistogram, iucnDensity, priorityHistogram, and priorityDensity are wrappers for changeHistogram and changeDensity that generate histograms and density plots of the posterior distribution of population changes annotated with IUCN threat levels and ACAP species priority.

Utilities

Annual Summaries

The annualSummary function creates annual summaries of model samples. Given an mcarray of samples of an annual quantity x this function creates a dataframe of annual posterior means and quantiles of x. The dataframe is in “long” format, with one column for the posterior mean and each of the requested quantiles, a column for year, and where appropriate a column for subpopulation. By default the median and 0.025 and 0.975 quantiles are calculated.

annualSummary <- function(x,q=c(0.5,0.025,0.975)) {

  smry <- function(x) c(Mean=mean(x,na.rm=TRUE),Q=quantile(x,prob=q,na.rm=TRUE))

  dm <- dim(x)
  nms <- dimnames(x)
  yr <- as.numeric(nms$Year)
  if(length(dim(x))==4) {
    ## x varies by year and subpopulation
    dim(x) <- c(prod(dm[1:2]),dm[3],dm[4])
    r <- cbind(expand.grid(Year=yr,Sub=nms$Sub),as.data.frame(t(apply(x,1,smry))))
  } else {
    ## x varies by year only
    r <- cbind(Year=yr,as.data.frame(t(apply(x,1,smry))))
  }
  ## Remove missing values
  r <- r[is.finite(r$Mean),]
  r
}
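The per-year computation performed by annualSummary can be sketched on a toy matrix of posterior draws (hypothetical values; the function itself operates on annotated mcarray objects):

```r
set.seed(42)
## Hypothetical posterior draws: 3 years by 1000 samples
draws <- matrix(rlnorm(3000, meanlog=log(c(100, 110, 120)), sdlog=0.1), nrow=3)
smry <- t(apply(draws, 1, function(z)
  c(Mean=mean(z), quantile(z, prob=c(0.5, 0.025, 0.975)))))
## One row per year: posterior mean, median, and 95% interval
```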

Coda Conversion

The as.coda function converts a list of mcarray objects like that returned by JAGSsample to a single mcmc.list object suitable for use with the coda or shinyStan packages. Individual mcarray objects may be converted to the coda format with as.mcmc.list, but this function converts and merges several mcarray objects.

The function converts the list of mcarray objects to a list of mcmc.list objects, which are then merged with mergeMCMCLists.

as.coda <- function(s) {
  ## Convert each mcarray to an mcmc.list and merge them
  do.call(mergeMCMCLists,lapply(s,as.mcmc.list))
}

The mergeMCMCLists function merges several mcmc.list objects into one mcmc.list object. If the start, end and thin parameters of the arguments are not identical, a warning is issued.

mergeMCMCLists <- function(...) {
  dots <- list(...)
  ## Extract sampling parameters
  start <- unique(sapply(dots,start))
  end <- unique(sapply(dots,end))
  thin <- unique(sapply(dots,thin))
  if(length(start)!=1L || length(end)!=1L || length(thin)!=1L)
    warning("All mcmc.list objects must have the same start, end and thin")
  ## Combine mcmc.list objects
  do.call(mcmc.list,
          lapply(
            do.call(mapply,c(cbind,dots,SIMPLIFY=FALSE)),
            mcmc,start=start,end=end,thin=thin))
}