Skip to contents

This function implements an R0 estimation due to White and Pagano (Statistics in Medicine, 2008). The method is based on maximum likelihood estimation in a Poisson transmission model. See details for important implementation notes.

Usage

wp(
  cases,
  mu = NA,
  serial = FALSE,
  grid_length = 100,
  max_shape = 10,
  max_scale = 10
)

Arguments

cases

Vector of case counts. The vector must be of length at least two and only contain positive integers.

mu

Mean of the serial distribution (defaults to NA). This must be a positive number or NA. If a number is specified, the value should match the case counts in time units. For example, if case counts are weekly and the serial distribution has a mean of seven days, then mu should be set to 1. If case counts are daily and the serial distribution has a mean of seven days, then mu should be set to 7.

serial

Whether to return the estimated serial distribution in addition to the estimate of R0 (defaults to FALSE). This must be a value identical to TRUE or FALSE.

grid_length

The length of the grid in the grid search (defaults to 100). This must be a positive integer. It will only be used if mu is set to NA. The grid search will go through all combinations of the shape and scale parameters for the gamma distribution, which are grid_length evenly spaced values from 0 (exclusive) to max_shape and max_scale (inclusive), respectively. Note that larger values will result in a longer search time.

max_shape

The largest possible value of the shape parameter in the grid search (defaults to 10). This must be a positive number. It will only be used if mu is set to NA. Note that larger values will result in a longer search time, and may cause numerical instabilities.

max_scale

The largest possible value of the scale parameter in the grid search (defaults to 10). This must be a positive number. It will only be used if mu is set to NA. Note that larger values will result in a longer search time, and may cause numerical instabilities.

Value

If serial is identical to TRUE, a list containing the following components is returned:

  • r0 - the estimate of R0

  • supp - the support of the estimated serial distribution

  • pmf - the probability mass function of the estimated serial distribution

Otherwise, if serial is identical to FALSE, only the estimate of R0 is returned.

Details

This method is based on a Poisson transmission model, and hence may be most most valid at the beginning of an epidemic. In their model, the serial distribution is assumed to be discrete with a finite number of possible values. In this implementation, if mu is not NA, the serial distribution is taken to be a discretized version of a gamma distribution with shape parameter 1 and scale parameter mu (and hence mean mu). When mu is NA, the function implements a grid search algorithm to find the maximum likelihood estimator over all possible gamma distributions with unknown shape and scale, restricting these to a prespecified grid (see the parameters grid_length, max_shape and max_scale). In both cases, the largest value of the support is chosen such that the cumulative distribution function of the original (pre-discretized) gamma distribution has cumulative probability of no more than 0.999 at this value.

When the serial distribution is known (i.e., mu is not NA), sensitivity testing of mu is strongly recommended. If the serial distribution is unknown (i.e., mu is NA), the likelihood function can be flat near the maximum, resulting in numerical instability of the optimizer. When mu is NA, the implementation takes considerably longer to run. Users should be careful about units of time (e.g., are counts observed daily or weekly?) when implementing.

The model developed in White and Pagano (2008) is discrete, and hence the serial distribution is finite discrete. In our implementation, the input value mu is that of a continuous distribution. The algorithm discretizes this input, and so the mean of the estimated serial distribution returned (when serial is set to TRUE) will differ from mu somewhat. That is to say, if the user notices that the input mu and the mean of the estimated serial distribution are different, this is to be expected, and is caused by the discretization.

See also

vignette("wp_serial", package="Rnaught") for examples of using the serial distribution.

Examples

# Weekly data.
cases <- c(1, 4, 10, 5, 3, 4, 19, 3, 3, 14, 4)

# Obtain R0 when the serial distribution has a mean of five days.
wp(cases, mu = 5 / 7)
#> [1] 1.107862

# Obtain R0 when the serial distribution has a mean of three days.
wp(cases, mu = 3 / 7)
#> [1] 1.067642

# Obtain R0 when the serial distribution is unknown.
# Note that this will take longer to run than when `mu` is known.
wp(cases)
#> [1] 1.495574

# Same as above, but specify custom grid search parameters. The larger any of
# the parameters, the longer the search will take, but with potentially more
# accurate estimates.
wp(cases, grid_length = 40, max_shape = 4, max_scale = 4)
#> [1] 1.495574

# Return the estimated serial distribution in addition to the estimate of R0.
estimate <- wp(cases, serial = TRUE)

# Display the estimate of R0, as well as the support and probability mass
# function of the estimated serial distribution returned by the grid search.
estimate$r0
#> [1] 1.495574
estimate$supp
#>  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
estimate$pmf
#>  [1] 0.3295449612 0.1855210503 0.1282030815 0.0920057871 0.0672100630
#>  [6] 0.0496097863 0.0368701329 0.0275354532 0.0206388543 0.0155131855
#> [11] 0.0116867019 0.0088202295 0.0066670102 0.0050459517 0.0038232730
#> [16] 0.0028996361 0.0022009767 0.0016718921 0.0012708261 0.0009665381
#> [21] 0.0007354976 0.0005599523 0.0004264906 0.0003249676 0.0002477014