Title: | Bayesian Modeling of Spatio-Temporal Data with R |
---|---|
Description: | Fits, validates and compares a number of Bayesian models for spatial and space time point referenced and areal unit data. Model fitting is done using several packages: 'rstan', 'INLA', 'spBayes', 'spTimer', 'spTDyn', 'CARBayes' and 'CARBayesST'. Model comparison is performed using the DIC and WAIC, and K-fold cross-validation where the user is free to select their own subset of data rows for validation. Sahu (2022) <doi:10.1201/9780429318443> describes the methods in detail. |
Authors: | Sujit K. Sahu [aut, cre] |
Maintainer: | Sujit K. Sahu <[email protected]> |
License: | GPL-2 |
Version: | 0.7.9 |
Built: | 2025-02-23 05:38:54 UTC |
Source: | https://github.com/sujit-sahu/bmstdr |
Temperature and salinity data from Argo floats in the North Atlantic Ocean at three layers of depth: surface (less than 50 meters), mid-layer (475 to 525 meters) and deep (975 to 1025 meters) during 2003.
argo_floats_atlantic_2003
An object of class data.frame with 6978 rows and 11 columns.
Source: Sahu and Challenor (2008). The 11 columns are:
Longitude of the argo float
Latitude of the argo float
Cumulative day of the year in 2003
Day within each month in 2003
Month in 2003
Temperature recorded by the Argo float in degree Celsius
Salinity in practical salinity units
Centered and scaled values of longitude at each depth
Centered and scaled values of latitude at each depth
Centered and scaled values of longitude times latitude at each depth
Depth layer indicator: 1 = surface, 2 = mid-layer, 3 = deep (see the examples below)
Sahu SK, Challenor P (2008). “A space-time model for joint modeling of ocean temperature and salinity levels as measured by Argo floats.” Environmetrics, 19, 509-528.
head(argo_floats_atlantic_2003)
# Data for the surface layer
surface <- argo_floats_atlantic_2003[argo_floats_atlantic_2003$depth==1, ]
# Data for the mid-layer
midlayer <- argo_floats_atlantic_2003[argo_floats_atlantic_2003$depth==2, ]
# Data at the deep ocean
deep <- argo_floats_atlantic_2003[argo_floats_atlantic_2003$depth==3, ]
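A quick way to see the spatial coverage of the floats is to plot the locations by depth layer. The sketch below uses ggplot2; the depth column is confirmed by the example above, but the longitude and latitude column names lon and lat are assumptions and may differ in the actual data frame.
# Illustrative sketch only: the column names lon and lat are assumed.
library(ggplot2)
ggplot(argo_floats_atlantic_2003, aes(x = lon, y = lat)) +
  geom_point(size = 0.3) +
  facet_wrap(~ depth, labeller = label_both) +
  labs(x = "Longitude", y = "Latitude",
       title = "Argo float locations in the North Atlantic during 2003, by depth layer")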
Bayesian regression model fitting for areal and areal spatio-temporal data. Calculates parameter estimates, validation statistics, and estimated values of several Bayesian model choice criteria.
Bcartime(formula, data, family, link = NULL, trials = NULL, offsetcol = NULL,
  formula.omega = NULL, scol = NULL, tcol = NULL, package = "CARBayes",
  model = "glm", AR = 1, W = NULL, adj.graph = NULL, residtype = "response",
  interaction = TRUE, Z = NULL, W.binary = NULL, changepoint = NULL, knots = NULL,
  validrows = NULL, prior.mean.delta = NULL, prior.mean.beta = NULL,
  prior.var.beta = NULL, prior.mean.gamma = NULL, prior.var.gamma = NULL,
  prior.sigma2 = NULL, prior.tau2 = c(2, 1), prior.delta = NULL,
  prior.var.delta = NULL, prior.lambda = NULL, prior.nu2 = c(2, 1), epsilon = 0,
  G = NULL, ind.area = NULL, trends = NULL, rho.T = NULL, rho.S = NULL,
  rho = NULL, rho.slo = NULL, rho.int = NULL, MALA = FALSE, N = 2000,
  burn.in = 1000, thin = 10, rseed = 44, Nchains = 4, verbose = TRUE,
  plotit = TRUE)
formula |
An object of class "formula" (or one that can be coerced to that class): a symbolic description of the regression model to be fitted. |
data |
The data frame for which the model formula is to be fitted. The data frame should be in long format having one row for each location and time combination. The data frame must be ordered by time within each site, and should optionally have a column, named s.index, providing the site indices. Thus the data, with n sites and T times within each site, should be organized in the order: (s1, t1), (s1, t2), ... (s1, T), ... (sn, t1), ... (sn, T). The data frame should also contain two columns giving the coordinates of the locations for spatio temporal model fitting. |
family |
One of either "gaussian", "binomial","poisson" or "zip", which respectively specify a Gaussian, binomial likelihood model with the logistic link function, a Poisson likelihood model with a log link function, or a zero-inflated Poisson model with a log link function. |
link |
The link function to use for INLA based model fitting. This is ignored for the CARBayes and CARBayesST models. |
trials |
Only used if family="binomial". Either the name (character) or number of the column in the supplied data frame containing the total number of trials The program will try to access data[, trials] to get the binomial denominators. |
offsetcol |
Only used in INLA based modeling. The column name or number in the data frame that should be used as the offset. |
formula.omega |
A one-sided formula object with no response variable (left side of the "~") needed, specifying the covariates in the logistic regression model for modelling the probability of an observation being a structural zero. Each covariate (or an offset) needs to be a vector of length K*1. Only required for zero-inflated Poisson models. |
scol |
Either the name (character) or number of the column in the supplied data frame identifying the spatial units. The program will try to access data[, scol] to identify the spatial units. If this is omitted, no spatial modeling will be performed. |
tcol |
Like the scol argument: either the name (character) or number of the column in the supplied data frame identifying the time points. If this is omitted, no temporal modeling will be performed. |
package |
Which package is to be used in model fitting? Currently available packages are: "CARBayes" (the default, for areal data), "CARBayesST" (for areal spatio-temporal data) and "inla" (for both types); see the examples below. |
model |
The specific spatio-temporal model to be fitted. If the package is "INLA" then the model argument should be a vector with two elements, giving the spatial model as the first component. The alternatives for the spatial model are: "bym", "bym2", "besag", "besag2", "besagproper", "besagproper2", "iid" and "none". The temporal model, given as the second component, can be one of "iid", "rw1", "rw2", "ar1" or "none". If a model component is "none" then no spatial/temporal random effects will be fitted. No temporal random effects will be fitted if the second component is omitted, i.e. if model is supplied as a single spatial model only. |
AR |
The order of the autoregressive time series process that must be either 1 or 2. |
W |
A non-negative K by K neighborhood matrix (where K is the number of spatial units). Typically a binary specification is used, where the jkth element equals one if areas (j, k) are spatially close (e.g. share a common border) and is zero otherwise. The matrix can be non-binary, but each row must contain at least one non-zero entry. This argument need not be specified if adj.graph is supplied instead; see the sketch following this list of arguments for one way to construct such a matrix. |
adj.graph |
Adjacency graph which may be specified instead of the adjacency matrix. This argument is used only if W is not supplied. |
residtype |
Residual type, either "response" or "pearson", in GLM fitting with the packages CARBayes and CARBayesST. Default is "response", i.e. observed minus fitted. The other option, "pearson", gives Pearson residuals for the GLM. For INLA based model fitting only the default response residuals are calculated. |
interaction |
TRUE or FALSE indicating whether the spatio-temporal interaction random effects should be included. Defaults to TRUE unless family="gaussian" in which case interactions are not allowed. |
Z |
A list, where each element is a K by K matrix of non-negative dissimilarity metrics. |
W.binary |
Logical, should the estimated neighbourhood matrix have only binary (0,1) values. |
changepoint |
A scalar indicating the position of the change point should one of the change point trend functions be included in the trends vector, i.e. if "CP" or "CT" is specified. |
knots |
A scalar indicating the number of knots to use should one of the monotonic cubic splines trend functions be included in the trends vector, i.e. if "MD" or "MI" is specified. |
validrows |
A vector providing the rows of the data frame which should be used for validation. The default NULL value instructs that validation will not be performed. |
prior.mean.delta |
A vector of prior means for the regression parameters delta (Gaussian priors are assumed) for the zero probability logistic regression component of the model. Defaults to a vector of zeros. |
prior.mean.beta |
A vector of prior means for the regression parameters beta (Gaussian priors are assumed). Defaults to a vector of zeros. |
prior.var.beta |
A vector of prior variances for the regression parameters beta (Gaussian priors are assumed). Defaults to a vector with values 100,000. |
prior.mean.gamma |
A vector of prior means for the temporal trend parameters (Gaussian priors are assumed). Defaults to a vector of zeros. |
prior.var.gamma |
A vector of prior variances for the temporal trend parameters (Gaussian priors are assumed). Defaults to a vector with values 100,000. |
prior.sigma2 |
The prior shape and scale in the form of c(shape, scale) for an Inverse-Gamma(shape, scale) prior for sigma2. Defaults to c(1, 0.01). |
prior.tau2 |
The prior shape and scale in the form of c(shape, scale) for an Inverse-Gamma(shape, scale) prior for tau2. Defaults to c(1, 0.01). |
prior.delta |
The prior maximum for the cluster smoothing parameter delta. Defaults to 10. |
prior.var.delta |
A vector of prior variances for the regression parameters delta (Gaussian priors are assumed) for the zero probability logistic regression component of the model. Defaults to a vector with values 100000. |
prior.lambda |
A vector of prior samples sizes for the Dirichlet prior controlling the probabilities that each trend function is chosen. The vector should be the same length as the trends vector and defaults to a vector of ones. |
prior.nu2 |
The prior shape and scale in the form of c(shape, scale) for an Inverse-Gamma(shape, scale) prior for nu2. Defaults to c(1, 0.01) and is only used if family="gaussian". |
epsilon |
Diagonal ridge parameter to add to the random effects prior precision matrix, only required when rho = 1, and the prior precision is improper. Defaults to 0. Only used for adaptive model fitting in CARBayesST. |
G |
The maximum number of distinct intercept terms (groups) to allow in the localised model. |
ind.area |
A vector of integers the same length as the number of data points (individuals) giving which spatial unit (numbered from 1 to K to align with the rows of the W matrix) each individual belongs to. |
trends |
A vector containing the temporal trend functions to include in the model, which include: constant ("Constant"); linear decreasing ("LD"); linear increasing ("LI"); known change point, where the trend can increase towards the change point before subsequently decreasing ("CP") or decrease towards the change point before subsequently increasing ("CT"); and monotonic cubic splines which are decreasing ("MD") or increasing ("MI"). At least two trends have to be selected, with the constant trend always included. To avoid identifiability problems only one of "LI" or "MI" can be included at a given time (similarly for "LD" and "MD"). |
rho.T |
The value in the interval [0, 1] that the temporal dependence parameter rho.T is fixed at if it should not be estimated. If this argument is NULL then rho.T is estimated in the model. |
rho.S |
The value in the interval [0, 1] that the spatial dependence parameter rho.S is fixed at if it should not be estimated. If this argument is NULL then rho.S is estimated in the model. |
rho |
The value in the interval [0, 1] that the spatial dependence parameter rho is fixed at if it should not be estimated. If this argument is NULL then rho is estimated in the model. Setting rho=1, reduces the random effects prior to the intrinsic CAR model but does require epsilon>0. |
rho.slo |
The value in the interval [0, 1] that the spatial dependence parameter for the slope of the linear time trend, rho.slo, is fixed at if it should not be estimated. If this argument is NULL then rho.slo is estimated in the model. |
rho.int |
The value in the interval [0, 1] that the spatial dependence parameter for the intercept of the linear time trend, rho.int, is fixed at if it should not be estimated. If this argument is NULL then rho.int is estimated in the model. |
MALA |
Logical, should the function use Metropolis adjusted Langevin algorithm (MALA) updates (TRUE) or simple random walk (FALSE, default) updates for the regression parameters and random effects. |
N |
MCMC sample size. |
burn.in |
How many initial iterations to discard. Only relevant for MCMC based model fitting, i.e., when package is spBayes or Stan. |
thin |
The level of thinning to apply to the MCMC samples to reduce their temporal autocorrelation. The default value in the usage above is 10; thin = 1 means no thinning. |
rseed |
Random number seed that controls the starting point for the random number stream. A set value is required to help reproduce the results. |
Nchains |
The number of parallel Markov chains to be used in the Metropolis coupled Markov chain Monte Carlo (MCMCMC) simulations. Defaults to 4. |
verbose |
Logical scalar value: whether to print various estimates and statistics. |
plotit |
Logical scalar value: whether to plot the predictions against the observed values. |
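Where a suitable neighborhood matrix W (such as Weng used in the examples below) is not already available, one can usually construct a binary matrix from a set of areal polygons. The sketch below uses the spdep package; the sf object eng_polygons is hypothetical and is not supplied by bmstdr.
# Hypothetical sketch: build a binary K by K neighborhood matrix from
# polygons held in an sf object called eng_polygons (not part of bmstdr).
library(spdep)
nb <- poly2nb(eng_polygons)                       # neighbors share a common border
W <- nb2mat(nb, style = "B", zero.policy = TRUE)  # binary 0/1 matrix
dim(W)                                            # K by K, K = number of areal units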
A list containing:
params - A table of parameter estimates
fit - The fitted model object.
fitteds - A vector of fitted values.
mchoice - Calculated model choice statistics, if those have been requested by the input argument mchoice=TRUE. Not all model fits will contain all the model choice statistics.
residuals - A vector of residual values.
stats - The four validation statistics: rmse, mae, crps and coverage. This is present only if model validation has been performed.
yobs_preds - A data frame containing the validation rows of the model fitting data frame. The last five columns of this data frame contain the validation prediction summaries: mean, sd, median, and the 95% prediction interval. This is present only if model validation has been performed.
valpreds - A matrix containing the MCMC samples of the validation predictions. The dimension of this matrix is the number of validations times the number of retained MCMC samples. This is present only if model validation has been performed.
validationplots - Present only if validation has been performed. Contains three validation plots: with and without line segments, and an ordinary plot. See obs_v_pred_plot for more.
sn - The number of areal units used in fitting.
tn - The number of time points used in fitting.
formula - The input formula for the regression part of the model.
scale.transform - Included for compatibility with Bsptime output.
package - The name of the package used for model fitting.
model - The name of the fitted model.
call - The command used to call the model fitting function.
computation.time - Computation time required to run the model fitting.
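As an illustration, the components listed above can be inspected directly from the returned list; the sketch below assumes the objects M1.leroux and M1.leroux.v created in the examples further down.
# Assumes M1.leroux and M1.leroux.v have been fitted as in the examples below.
names(M1.leroux)            # all returned components
M1.leroux$params            # table of parameter estimates
head(M1.leroux$fitteds)     # fitted values
head(M1.leroux$residuals)   # residuals
M1.leroux$mchoice           # model choice statistics, when available
M1.leroux.v$stats           # rmse, mae, crps and coverage from the validation run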
Gómez-Rubio V (2020).
Bayesian inference with INLA.
Boca Raton: Chapman and Hall/CRC.
Lee D (2021).
“CARBayes version 5.2.3: An R Package for Spatial Areal Unit Modelling with Conditional Autoregressive Priors.”
University of Glasgow.
https://cran.r-project.org/package=CARBayes.
Lee D, Rushworth A, Napier G (2018).
“Spatio-Temporal Areal Unit Modeling in R with Conditional Autoregressive Priors Using the CARBayesST Package.”
Journal of Statistical Software, 84(9). doi:10.18637/jss.v084.i09.
Sahu SK (2022).
Bayesian Modeling of Spatio Temporal Data with R, 1st edition.
Chapman and Hall, Boca Raton.
https://doi.org/10.1201/9780429318443.
# Set the validation row numbers
vs <- sample(nrow(engtotals), 5)
# Total number of iterations
N <- 60
# Number of burn in iterations
burn.in <- 10
# Number of thinning iterations
thin <- 1
# Set the model formula for binomial distribution based modeling
f1 <- noofhighweeks ~ jsa + log10(houseprice) + log(popdensity) + sqrt(no2)
## Independent error logistic regression
M1 <- Bcartime(formula = f1, data = engtotals, family = "binomial", trials = "nweek", N = N, burn.in = burn.in, thin = thin, verbose = TRUE)
summary(M1)
# Leroux model
M1.leroux <- Bcartime(formula = f1, data = engtotals, scol = "spaceid", model = "leroux", W = Weng, family = "binomial", trials = "nweek", N = N, burn.in = burn.in, thin = thin)
summary(M1.leroux)
# BYM model
M1.bym <- Bcartime(formula = f1, data = engtotals, scol = "spaceid", model = "bym", W = Weng, family = "binomial", trials = "nweek", N = N, burn.in = burn.in, thin = thin, verbose = FALSE)
summary(M1.bym)
# Validation for the Leroux model
M1.leroux.v <- Bcartime(formula = f1, data = engtotals, scol = "spaceid", model = "leroux", W = Weng, family = "binomial", trials = "nweek", validrows = vs, N = N, burn.in = burn.in, thin = thin, verbose = FALSE)
summary(M1.leroux.v)
## Poisson distribution based models ####################################
# Model formula
f2 <- covid ~ offset(logEdeaths) + jsa + log10(houseprice) + log(popdensity) + sqrt(no2)
# Independent error Poisson regression
M2 <- Bcartime(formula = f2, data = engtotals, family = "poisson", N = N, burn.in = burn.in, thin = thin, verbose = FALSE)
summary(M2)
## Poisson regression with Leroux model
M2.leroux <- Bcartime(formula = f2, data = engtotals, scol = "spaceid", model = "leroux", family = "poisson", W = Weng, N = N, burn.in = burn.in, thin = thin, verbose = FALSE)
summary(M2.leroux)
# Poisson regression with BYM model
M2.bym <- Bcartime(formula = f2, data = engtotals, scol = "spaceid", model = "bym", family = "poisson", W = Weng, N = N, burn.in = burn.in, thin = thin)
summary(M2.bym)
## Gaussian distribution based models ###############
f3 <- sqrt(no2) ~ jsa + log10(houseprice) + log(popdensity)
# Independent error model
M3 <- Bcartime(formula = f3, data = engtotals, family = "gaussian", N = N, burn.in = burn.in, thin = thin, verbose = FALSE)
summary(M3)
# Leroux model
M3.leroux <- Bcartime(formula = f3, data = engtotals, scol = "spaceid", model = "leroux", family = "gaussian", W = Weng, N = N, burn.in = burn.in, thin = thin, verbose = FALSE)
summary(M3.leroux)
## Validation
M3.leroux.v <- Bcartime(formula = f3, data = engtotals, scol = "spaceid", model = "leroux", family = "gaussian", W = Weng, N = N, burn.in = burn.in, thin = thin, validrows = vs, verbose = FALSE)
summary(M3.leroux.v)
## Spatio-temporal modeling ##################################################
head(engdeaths)
dim(engdeaths)
colnames(engdeaths)
vs <- sample(nrow(engdeaths), 5)
## Binomial distribution
engdeaths$nweek <- rep(1, nrow(engdeaths))
f1 <- highdeathsmr ~ jsa + log10(houseprice) + log(popdensity)
M1st_linear <- Bcartime(formula = f1, data = engdeaths, scol = "spaceid", tcol = "Weeknumber", trials = "nweek", W = Weng, model = "linear", family = "binomial", package = "CARBayesST", N = N, burn.in = burn.in, thin = thin, verbose = TRUE)
summary(M1st_linear)
M1st_sepspat <- Bcartime(formula = f1, data = engdeaths, scol = "spaceid", tcol = "Weeknumber", trials = "nweek", W = Weng, model = "sepspatial", family = "binomial", package = "CARBayesST", N = N, burn.in = burn.in, thin = thin, verbose = FALSE)
summary(M1st_sepspat)
M1st_ar <- Bcartime(formula = f1, data = engdeaths, scol = "spaceid", tcol = "Weeknumber", trials = "nweek", W = Weng, model = "ar", AR = 1, family = "binomial", package = "CARBayesST", N = N, burn.in = burn.in, thin = thin, verbose = FALSE)
summary(M1st_ar)
# Model validation
M1st_ar.v <- Bcartime(formula = f1, data = engdeaths, scol = "spaceid", tcol = "Weeknumber", trials = "nweek", W = Weng, model = "ar", AR = 1, family = "binomial", package = "CARBayesST", N = N, burn.in = burn.in, thin = thin, validrows = vs, verbose = FALSE)
summary(M1st_ar.v)
## Spatio-temporal Poisson models ###################################
colnames(engdeaths)
f2 <- covid ~ offset(logEdeaths) + jsa + log10(houseprice) + log(popdensity) + n0
M2st_linear <- Bcartime(formula = f2, data = engdeaths, scol = "spaceid", tcol = "Weeknumber", W = Weng, model = "linear", family = "poisson", package = "CARBayesST", N = N, burn.in = burn.in, thin = thin, verbose = FALSE)
summary(M2st_linear)
M2st_anova <- Bcartime(formula = f2, data = engdeaths, scol = "spaceid", tcol = "Weeknumber", W = Weng, model = "anova", family = "poisson", package = "CARBayesST", N = N, burn.in = burn.in, thin = thin, verbose = FALSE)
summary(M2st_anova)
M2st_anova_nointer <- Bcartime(formula = f2, data = engdeaths, scol = "spaceid", tcol = "Weeknumber", W = Weng, model = "anova", interaction = FALSE, family = "poisson", package = "CARBayesST", N = N, burn.in = burn.in, thin = thin, verbose = FALSE)
summary(M2st_anova_nointer)
M2st_sepspat <- Bcartime(formula = f2, data = engdeaths, scol = "spaceid", tcol = "Weeknumber", W = Weng, model = "sepspatial", family = "poisson", package = "CARBayesST", N = N, burn.in = burn.in, thin = thin, verbose = FALSE)
summary(M2st_sepspat)
M2st_ar <- Bcartime(formula = f2, data = engdeaths, scol = "spaceid", tcol = "Weeknumber", W = Weng, model = "ar", AR = 1, family = "poisson", package = "CARBayesST", N = N, burn.in = burn.in, thin = thin, verbose = FALSE)
summary(M2st_ar)
M2st_ar.v <- Bcartime(formula = f2, data = engdeaths, scol = "spaceid", tcol = "Weeknumber", W = Weng, model = "ar", family = "poisson", package = "CARBayesST", N = N, burn.in = burn.in, thin = thin, validrows = vs, verbose = FALSE)
M2st_anova.v <- Bcartime(formula = f2, data = engdeaths, scol = "spaceid", tcol = "Weeknumber", W = Weng, model = "anova", family = "poisson", package = "CARBayesST", N = N, burn.in = burn.in, thin = thin, validrows = vs, verbose = FALSE)
summary(M2st_ar.v)
summary(M2st_anova.v)
## Spatio-temporal Normal models ###############################
colnames(engdeaths)
f3 <- sqrt(no2) ~ jsa + log10(houseprice) + log(popdensity)
M3st_linear <- Bcartime(formula = f3, data = engdeaths, scol = "spaceid", tcol = "Weeknumber", W = Weng, model = "linear", family = "gaussian", package = "CARBayesST", N = N, burn.in = burn.in, thin = thin, verbose = FALSE)
summary(M3st_linear)
M3st_anova <- Bcartime(formula = f3, data = engdeaths, scol = "spaceid", tcol = "Weeknumber", W = Weng, model = "anova", family = "gaussian", package = "CARBayesST", N = N, burn.in = burn.in, thin = thin, verbose = FALSE)
summary(M3st_anova)
M3st_anova_nointer <- Bcartime(formula = f3, data = engdeaths, scol = "spaceid", tcol = "Weeknumber", W = Weng, model = "anova", interaction = FALSE, family = "gaussian", package = "CARBayesST", N = N, burn.in = burn.in, thin = thin, verbose = FALSE)
summary(M3st_anova_nointer)
M3st_ar <- Bcartime(formula = f3, data = engdeaths, scol = "spaceid", tcol = "Weeknumber", W = Weng, model = "ar", AR = 2, family = "gaussian", package = "CARBayesST", N = N, burn.in = burn.in, thin = thin, verbose = FALSE)
summary(M3st_ar)
# Execute the following examples if INLA is available
if (require(INLA)) {
  N <- 55
  burn.in <- 5
  thin <- 1
  vs <- sample(nrow(engtotals), 5)
  # Spatial binomial GLM
  f1 <- noofhighweeks ~ jsa + log10(houseprice) + log(popdensity) + sqrt(no2)
  M1.inla.bym <- Bcartime(data = engtotals, formula = f1, W = Weng, scol = "spaceid", model = c("bym"), package = "inla", family = "binomial", link = "logit", trials = "nweek", N = N, burn.in = burn.in, thin = thin)
  summary(M1.inla.bym)
  ## Spatial only Poisson
  f2inla <- covid ~ jsa + log10(houseprice) + log(popdensity) + sqrt(no2)
  M2.inla.bym <- Bcartime(data = engtotals, formula = f2inla, W = Weng, scol = "spaceid", offsetcol = "logEdeaths", model = c("bym"), package = "inla", link = "log", family = "poisson", N = N, burn.in = burn.in, thin = thin)
  summary(M2.inla.bym)
  ## Normal models
  f3 <- sqrt(no2) ~ jsa + log10(houseprice) + log(popdensity)
  ## Fit BYM with iid random effect
  M3.inla.bym <- Bcartime(data = engtotals, formula = f3, W = Weng, scol = "spaceid", model = c("bym"), package = "inla", family = "gaussian", N = N, burn.in = burn.in, thin = thin)
  summary(M3.inla.bym)
  # Validation
  M3.inla.bym.v <- Bcartime(data = engtotals, formula = f3, W = Weng, scol = "spaceid", model = c("bym"), package = "inla", family = "gaussian", validrows = vs, N = N, burn.in = burn.in, thin = thin)
  summary(M3.inla.bym.v)
  ###### Spatio-temporal INLA models
  f1 <- highdeathsmr ~ jsa + log10(houseprice) + log(popdensity)
  nweek <- rep(1, nrow(engdeaths))
  engdeaths$nweek <- rep(1, nrow(engdeaths))
  ## INLA binomial
  model <- c("bym", "ar1")
  M1st_inla.bym <- Bcartime(data = engdeaths, formula = f1, W = Weng, scol = "spaceid", tcol = "Weeknumber", model = model, trials = "nweek", family = "binomial", link = "logit", package = "inla", N = N, burn.in = burn.in, thin = thin)
  summary(M1st_inla.bym)
  M1st_inla_v <- Bcartime(data = engdeaths, formula = f1, W = Weng, scol = "spaceid", tcol = "Weeknumber", offsetcol = "logEdeaths", model = model, trials = "nweek", family = "binomial", link = "logit", package = "inla", validrows = vs, N = N, burn.in = burn.in, thin = thin)
  summary(M1st_inla_v)
  model <- c("bym", "none")
  M1st_inla.bym.none <- Bcartime(data = engdeaths, formula = f1, W = Weng, scol = "spaceid", tcol = "Weeknumber", model = model, trials = "nweek", family = "binomial", link = "logit", package = "inla", N = N, burn.in = burn.in, thin = thin)
  summary(M1st_inla.bym.none)
  model <- c("bym")
  M1st_inla.bym.none <- Bcartime(data = engdeaths, formula = f1, W = Weng, scol = "spaceid", tcol = "Weeknumber", model = model, trials = "nweek", family = "binomial", link = "logit", package = "inla", N = N, burn.in = burn.in, thin = thin)
  summary(M1st_inla.bym.none)
  ## Poisson models
  f2inla <- covid ~ jsa + log10(houseprice) + log(popdensity) + n0 + n1 + n2 + n3
  model <- c("bym", "ar1")
  M2stinla <- Bcartime(data = engdeaths, formula = f2inla, W = Weng, scol = "spaceid", tcol = "Weeknumber", offsetcol = "logEdeaths", model = model, link = "log", family = "poisson", package = "inla", N = N, burn.in = burn.in, thin = thin)
  summary(M2stinla)
  M2stinla.v <- Bcartime(data = engdeaths, formula = f2inla, W = Weng, scol = "spaceid", tcol = "Weeknumber", offsetcol = "logEdeaths", model = model, link = "log", family = "poisson", package = "inla", validrows = vs, N = N, burn.in = burn.in, thin = thin)
  summary(M2stinla.v)
  ## Normal models
  f3 <- sqrt(no2) ~ jsa + log10(houseprice) + log(popdensity)
  model <- c("bym", "iid")
  M3inla.bym.iid <- Bcartime(data = engdeaths, formula = f3, W = Weng, scol = "spaceid", tcol = "Weeknumber", model = model, family = "gaussian", package = "inla", validrows = vs, N = N, burn.in = burn.in, thin = thin)
  summary(M3inla.bym.iid)
  model <- c("bym", "ar1")
  M3inla.bym.ar1 <- Bcartime(data = engdeaths, formula = f3, W = Weng, scol = "spaceid", tcol = "Weeknumber", model = model, family = "gaussian", package = "inla", validrows = vs, N = N, burn.in = burn.in, thin = thin)
  summary(M3inla.bym.ar1)
}
Cauchy prior simulation example.
BCauchy( method = "exact", true.theta = 1, n = 25, N = 10000, rseed = 44, tuning.sd = 1 )
method |
Which method or package to use. Possibilities are: "exact" (the default, numerical integration), "importance" (importance sampling), "rejection" (rejection sampling), "independence" (the independence sampler) and "randomwalk" (the random walk Metropolis sampler). |
true.theta |
True value of theta with a default value of 1. |
n |
Data sample size; defaults to 25. |
N |
The number of Monte Carlo samples. |
rseed |
The random number seed for drawing data samples. |
tuning.sd |
The standard deviation of the proposal increment distribution for the random walk sampler. |
A list containing the estimated posterior mean, ybar (the data mean) and the values of the numerator and the denominator integrals. The routine simulates n observations from N(theta, 1); the mean of the simulated data values is returned as ybar.
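As an illustration of the ratio of integrals behind the "exact" method, the posterior mean under a Cauchy prior and an N(theta, 1) likelihood can be approximated directly with R's integrate function. The sketch below assumes a standard Cauchy(0, 1) prior and is not the package's internal code.
# Illustrative sketch, assuming a Cauchy(0, 1) prior:
# E(theta | y) = integral(theta * prior * likelihood) / integral(prior * likelihood)
set.seed(44)
n <- 25
y <- rnorm(n, mean = 1, sd = 1)            # data simulated with true theta = 1
ybar <- mean(y)
lik <- function(theta) exp(-0.5 * n * (theta - ybar)^2)  # likelihood kernel in theta
num <- integrate(function(theta) theta * dcauchy(theta) * lik(theta), -Inf, Inf)$value
den <- integrate(function(theta) dcauchy(theta) * lik(theta), -Inf, Inf)$value
num / den                                   # approximate posterior mean of theta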
BCauchy(true.theta = 1, n = 25)
BCauchy(true.theta = 5, n = 100)
BCauchy(method = "importance", true.theta = 1, n = 25)
BCauchy(method = "importance", true.theta = 1, n = 25, N = 20000)
BCauchy(method = "rejection", true.theta = 1, n = 25)
BCauchy(method = "independence", true.theta = 1, n = 25)
BCauchy(method = "randomwalk", true.theta = 1, n = 25, tuning.sd = 1)
Model choice criteria calculation for the univariate normal model, for both known and unknown sigma^2.
Bmchoice( case = "Exact.sigma2.known", y = ydata, mu0 = mean(y), sigma2 = 22, kprior = 1, prior.M = 1, prior.sigma2 = c(2, 1), N = 10000, rseed = 44 )
case |
One of three cases: "Exact.sigma2.known" (the default), "MC.sigma2.known" or "MC.sigma2.unknown"; see the examples below. |
y |
A vector of data values. Default is 28 ydata values from the package bmstdr |
mu0 |
The value of the prior mean if kprior=0. Default is the data mean. |
sigma2 |
Value of the known data variance; defaults to sample variance of the data. This is ignored in the third case when sigma2 is assumed to be unknown. |
kprior |
A scalar providing how many data standard deviations the prior mean is from the data mean. The default value in the usage above is 1. |
prior.M |
Prior sample size; the default value in the usage above is 1. |
prior.sigma2 |
Shape and scale parameter value for the gamma prior on 1/sigma^2, the precision. |
N |
The number of samples to generate. |
rseed |
The random number seed. Defaults to 44 to reproduce the results in the book Sahu (2022). |
A list containing the exact values of pdic, dic, pdicalt, dicalt, pwaic1, waic1, pwaic2, waic2, gof, penalty and pmcc. Also prints out the posterior mean and variance.
Sahu SK (2022). Bayesian Modeling of Spatio Temporal Data with R, 1st edition. Chapman and Hall, Boca Raton. https://doi.org/10.1201/9780429318443.
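For reference, the two WAIC versions listed above follow the standard definitions and can be reproduced from a matrix of pointwise log-likelihood values. The sketch below is generic, is not the package's internal code, and assumes a matrix loglik with MCMC samples in rows and observations in columns.
# Generic sketch: both WAIC penalty versions from an assumed matrix loglik of
# pointwise log-likelihood values (samples in rows, observations in columns).
waic_from_loglik <- function(loglik) {
  lppd <- sum(log(colMeans(exp(loglik))))       # log pointwise predictive density
  pwaic1 <- 2 * sum(log(colMeans(exp(loglik))) - colMeans(loglik))
  pwaic2 <- sum(apply(loglik, 2, var))
  c(pwaic1 = pwaic1, waic1 = -2 * (lppd - pwaic1),
    pwaic2 = pwaic2, waic2 = -2 * (lppd - pwaic2))
}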
Bmchoice()
b1 <- Bmchoice(case = "Exact.sigma2.known")
b2 <- Bmchoice(case = "MC.sigma2.known")
d1 <- Bmchoice(case = "MC.sigma2.unknown")
d2 <- Bmchoice(y = rt(100, df = 8), kprior = 1, prior.M = 1)
Model fitting and validation for spatio-temporal data from moving sensors in time.
Bmoving_sptime(formula, data, coordtype, coords, prior.sigma2 = c(2, 1),
  prior.tau2 = c(2, 1), prior.phi = NULL, prior.phi.param = NULL,
  scale.transform = "NONE", ad.delta = 0.8, t.depth = 12, s.size = 0.01,
  N = 2500, burn.in = 1000, no.chains = 1, validrows = 10, predspace = FALSE,
  newdata = NULL, mchoice = TRUE, plotit = FALSE, rseed = 44, verbose = TRUE,
  knots.coords = NULL, g_size = 5)
formula |
An object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. |
data |
The data frame for which the model formula is to be fitted. The data frame should be in long format having one row for each location and time combination. The data frame must be ordered by time within each site, and should optionally have a column, named s.index, providing the site indices. Thus the data, with n sites and T times within each site, should be organized in the order: (s1, t1), (s1, t2), ... (s1, T), ... (sn, t1), ... (sn, T). The data frame should also contain two columns giving the coordinates of the locations for spatio temporal model fitting. |
coordtype |
Type of coordinates: utm, lonlat or plain with utm (supplied in meters) as the default. Distance will be calculated in units of kilometer if this argument is either utm or lonlat. Euclidean distance will be calculated if this is given as the third type plain. If distance in meter is to be calculated then coordtype should be passed on as plain although the coords are supplied in UTM. |
coords |
A vector of size 2 giving the column numbers of the data frame which contain the coordinates of the data locations. Here the supplied data frame must contain a column named 'time' which should indicate the time index of the data row. The values in the column 'time' should be positive integers starting from 1. |
prior.sigma2 |
Shape and scale parameter value for the gamma prior on 1/sigma^2, the precision. |
prior.tau2 |
Shape and scale parameter value for the gamma prior on tau^2, the nugget effect. |
prior.phi |
Specifies the prior distribution for phi, the spatial decay parameter: a uniform (the default), a Gamma or a half-Cauchy distribution; see prior.phi.param below. |
prior.phi.param |
Hyper-parameter values of the prior distribution for phi, the spatial decay parameter. For the default uniform distribution these are the lower and upper limits, and the default values correspond to an effective range that is between 1% and 100% of the maximum distance between the data locations (see the sketch following this list of arguments). For the Gamma distribution the default values are 2 and 1, and for the Cauchy distribution the default values are 0 and 1, which specify a half-Cauchy distribution on the positive half-line. |
scale.transform |
Transformation of the response variable. It can take three values: SQRT, LOG or NONE. Default value is "NONE". |
ad.delta |
Adaptive delta controlling the behavior of Stan during fitting. |
t.depth |
Maximum allowed tree depth in the fitting process of Stan. |
s.size |
Step size in the fitting process of Stan. |
N |
MCMC sample size. |
burn.in |
How many initial iterations to discard. Only relevant for MCMC based model fitting, i.e., when package is spBayes or Stan. |
no.chains |
Number of parallel chains to run in Stan. |
validrows |
Either a number of randomly selected data rows to validate or a vector giving the row numbers of the data set for validation. |
predspace |
A 0-1 flag indicating whether spatial predictions are to be made. |
newdata |
A new data frame with the same column structure as the model fitting data set. |
mchoice |
Logical scalar value: whether model choice statistics should be calculated. |
plotit |
Logical scalar value: whether to plot the predictions against the observed values. |
rseed |
Random number seed that controls the starting point for the random number stream. A set value is required to help reproduce the results. |
verbose |
Logical scalar value: whether to print various estimates and statistics. |
knots.coords |
Only relevant for GPP models fitted by either spTimer or spTDyn. Optional two column matrix of UTM-X and UTM-Y coordinates of the knots in kilometers. It is preferable to specify the g_size parameter instead. |
g_size |
Only relevant for GPP models fitted by either spTimer or spTDyn. The grid size c(m, n) for the knots for the GPP model. A square grid is assumed if this is passed on as a scalar. This does not need to be given if knots.coords is given instead. |
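To see how default limits of this kind can arise, the sketch below computes candidate limits for phi from the maximum inter-point distance, assuming the common convention that the effective range of the exponential covariance is 3/phi. The package's own defaults may be computed differently, and the column names lon and lat are assumptions.
# Hedged sketch: uniform prior limits for phi so that the effective range
# (taken as 3/phi for the exponential covariance) lies between 1% and 100%
# of the maximum distance between the data locations.
library(fields)
deep <- argo_floats_atlantic_2003[argo_floats_atlantic_2003$depth == 3, ]
dmax <- max(rdist.earth(as.matrix(deep[, c("lon", "lat")]), miles = FALSE))  # km
c(lower = 3 / dmax, upper = 3 / (0.01 * dmax))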
A list containing:
params - A table of parameter estimates
fit - The fitted model object.
datatostan - A list containing all the information sent to the rstan package.
prior.phi.param - The values of the hyperparameters of the prior distribution for the spatial decay parameter phi.
prior.phi - The name of the prior distribution for the spatial decay parameter phi.
validationplots - Present only if validation has been performed. Contains three validation plots: with and without line segments, and an ordinary plot. See obs_v_pred_plot for more.
fitteds - A vector of fitted values.
residuals - A vector of residual values.
package - The name of the package used for model fitting. This is always stan for this function.
model - The name of the fitted model.
call - The command used to call the model fitting function.
formula - The input formula for the regression part of the model.
scale.transform - The transformation adopted by the input argument with the same name.
sn - The number of data locations used in fitting.
tn - The number of time points used in fitting for each location.
computation.time - Computation time required to run the model fitting.
Bsptime
for spatio-temporal model fitting.
deep <- argo_floats_atlantic_2003[argo_floats_atlantic_2003$depth==3, ]
deep$x2inter <- deep$xinter*deep$xinter
deep$month <- factor(deep$month)
deep$lat2 <- (deep$lat)^2
deep$sin1 <- round(sin(deep$time*2*pi/365), 4)
deep$cos1 <- round(cos(deep$time*2*pi/365), 4)
deep$sin2 <- round(sin(deep$time*4*pi/365), 4)
deep$cos2 <- round(cos(deep$time*4*pi/365), 4)
deep[, c("xlat2", "xsin1", "xcos1", "xsin2", "xcos2")] <- scale(deep[, c("lat2", "sin1", "cos1", "sin2", "cos2")])
f2 <- temp ~ xlon + xlat + xlat2 + xinter + x2inter
M2 <- Bmoving_sptime(formula = f2, data = deep, coordtype = "lonlat", coords = 1:2, N = 11, burn.in = 6, validrows = NULL, mchoice = FALSE)
summary(M2)
plot(M2)
names(M2)
# Testing for smaller data sets with different data patterns
d2 <- deep[1:25, ]
d2$time <- 1:25
# Now there are no missing times
M1 <- Bmoving_sptime(formula = f2, data = d2, coordtype = "lonlat", coords = 1:2, N = 11, burn.in = 6, mchoice = FALSE)
summary(M1)
d2[26, ] <- d2[25, ]
# With multiple observations at the same location and time
M1 <- Bmoving_sptime(formula = f2, data = d2, coordtype = "lonlat", coords = 1:2, N = 11, burn.in = 6, mchoice = FALSE)
summary(M1)
d2[27, ] <- d2[24, ]
d2[27, 3] <- 25
# With a previous location re-sampled
M1 <- Bmoving_sptime(formula = f2, data = d2, coordtype = "lonlat", coords = 1:2, N = 11, burn.in = 6, mchoice = FALSE)
summary(M1)
Calculates and plots the variogram cloud and an estimated variogram.
bmstdr_variogram( formula = yo3 ~ utmx + utmy, coordtype = "utm", data = nyspatial, nbins = 30 )
formula |
A formula argument specifying the response and the two coordinate columns, e.g. yo3 ~ utmx + utmy. |
coordtype |
Type of coordinates: utm, lonlat or plain with utm (supplied in meters) as the default. Distance will be calculated in units of kilometer if this argument is either utm or lonlat. Euclidean distance will be calculated if this is given as the third type plain. If distance in meter is to be calculated then coordtype should be passed on as plain although the coords are supplied in UTM. |
data |
A data frame containing the response and the coordinates. |
nbins |
Number of bins for the variogram. Default is 30. |
A list containing:
cloud - A data frame containing the variogram cloud. This contains pairs of all the data locations, the distance between the locations and the variogram value for the pair.
variogram - A data frame containing the variogram values in each bin.
cloudplot - A ggplot2 object of the plot of the variogram cloud.
variogramplot - A ggplot2 object of the plot of the binned variogram values.
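The variogram cloud values above are simply half the squared differences of the response for every pair of locations, plotted against the distance between the pair. The short illustration below uses the nyspatial columns named in the default formula; it is not the package's internal code and does not reproduce its binning.
# Illustration: variogram cloud values 0.5 * (y_i - y_j)^2 against distance.
y <- nyspatial$yo3
d <- as.matrix(dist(nyspatial[, c("utmx", "utmy")])) / 1000   # UTM metres to km
ij <- which(upper.tri(d), arr.ind = TRUE)                     # all pairs of locations
cloud <- data.frame(distance = d[ij],
                    variogram = 0.5 * (y[ij[, 1]] - y[ij[, 2]])^2)
head(cloud[order(cloud$distance), ])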
a <- bmstdr_variogram(data = nyspatial, formula = yo3 ~ utmx + utmy, coordtype = "utm", nbins = 50)
names(a)
if (require(ggpubr)) ggarrange(a$cloudplot, a$variogramplot, nrow = 1, ncol = 2)
Bayesian analysis of the normal model N(theta, sigma2) using different methods and packages.
Bnormal( package = "exact", y = ydata, mu0 = mean(y), kprior = 0, prior.M = 1e-04, prior.sigma2 = c(0, 0), N = 2000, burn.in = 1000, rseed = 44 )
package |
Which package (or method) to use. Possibilities are: "exact" (the default, exact posterior calculations), "RGibbs" (Gibbs sampling coded in R), "stan" and "inla"; see the examples below. |
y |
A vector of data values. Default is 28 ydata values from the package bmstdr |
mu0 |
The value of the prior mean if kprior=0. Default is the data mean. |
kprior |
A scalar providing how many data standard deviations the prior mean is from the data mean. Default value is 0. |
prior.M |
Prior sample size, defaults to 10^(-4). |
prior.sigma2 |
Shape and scale parameter value for the gamma prior on 1/sigma^2, the precision. |
N |
The number of Gibbs sampling iterations. |
burn.in |
The number of initial iterations to discard before making inference. |
rseed |
The random number seed; defaults to 44. |
A list containing the exact posterior means and variances of theta and sigma2
# Find the posterior mean and variance using `exact' methods - no sampling
# or approximation
Bnormal(kprior = 1, prior.M = 1, prior.sigma2 = c(2, 1))
# Use default non-informative prior
Bnormal(mu0 = 0)
# Start creating table
y <- ydata
mu0 <- mean(y)
kprior <- 1
prior.M <- 1
prior.sigma2 <- c(2, 1)
N <- 10000
eresults <- Bnormal(package = "exact", y = y, mu0 = mu0, kprior = kprior, prior.M = prior.M, prior.sigma2 = prior.sigma2)
eresults
# Run Gibbs sampling
samps <- Bnormal(package = "RGibbs", y = y, mu0 = mu0, kprior = kprior, prior.M = prior.M, prior.sigma2 = prior.sigma2, N = N)
gres <- list(mean_theta = mean(samps[, 1]), var_theta = var(samps[, 1]), mean_sigma2 = mean(samps[, 2]), var_sigma2 = var(samps[, 2]))
glow <- list(theta_low = quantile(samps[, 1], probs = 0.025), var_theta = NA, sigma2_low = quantile(samps[, 2], probs = 0.025), var_sigma2 = NA)
gupp <- list(theta_low = quantile(samps[, 1], probs = 0.975), var_theta = NA, sigma2_low = quantile(samps[, 2], probs = 0.975), var_sigma2 = NA)
a <- rbind(unlist(eresults), unlist(gres), unlist(glow), unlist(gupp))
cvsamp <- sqrt(samps[, 2])/samps[, 1]
cv <- c(NA, mean(cvsamp), quantile(cvsamp, probs = c(0.025, 0.975)))
u <- data.frame(a, cv)
rownames(u) <- c("Exact", "Estimate", "2.5%", "97.5%")
print(u)
# End create table
## Compute using the model fitted by Stan
u <- Bnormal(package = "stan", kprior = 1, prior.M = 1, prior.sigma = c(2, 1), N = 2000, burn.in = 1000)
print(u)
### Compute using INLA
if (require(INLA)) {
  u <- Bnormal(package = "inla", kprior = 1, prior.M = 1, prior.sigma = c(2, 1), N = 1000)
  print(u)
}
Bayesian regression model fitting for point referenced spatial data. Calculates parameter estimates, validation statistics, and estimated values of several Bayesian model choice criteria.
Bspatial(formula, data, package = "none", model = "lm", coordtype = NULL,
  coords = NULL, validrows = NULL, scale.transform = "NONE", prior.beta0 = 0,
  prior.M = 1e-04, prior.sigma2 = c(2, 1), prior.tau2 = c(2, 0.1), phi = NULL,
  prior.phi.param = NULL, prior.range = c(1, 0.5), prior.sigma = c(1, 0.005),
  offset = c(10, 140), max.edge = c(50, 1000), cov.model = "exponential",
  N = 5000, burn.in = 1000, rseed = 44, n.report = 500, no.chains = 1,
  ad.delta = 0.99, s.size = 0.01, t.depth = 15, verbose = TRUE, plotit = TRUE,
  mchoice = FALSE, ...)
formula |
An object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. |
data |
The data frame for which the model formula is to be fitted. If a spatial model is to be fitted then the data frame should contain two columns containing the locations of the coordinates. See the coords argument below. |
package |
Which package is to be used in model fitting? Currently available packages are "spBayes", "stan", "inla" and the default "none", as illustrated in the examples below.
Further details and more examples are provided in Chapter 6 of the book Sahu (2022). |
model |
Only used when the package has been chosen to be "none". It can take one of two values: either "lm" or "spat". The "lm" option is for an independent error regression model while the "spat" option fits a spatial model without any nugget effect. |
coordtype |
Type of coordinates: utm, lonlat or plain with utm (supplied in meters) as the default. Distance will be calculated in units of kilometer if this argument is either utm or lonlat. Euclidean distance will be calculated if this is given as the third type plain. If distance in meter is to be calculated then coordtype should be passed on as plain although the coords are supplied in UTM. |
coords |
A vector of size two identifying the two column numbers of the data frame to take as coordinates. Or this can be given as a matrix of number of sites by 2 providing the coordinates of all the data locations. |
validrows |
A vector of site indices which should be used for validation. This function does not allow some sites to be used for both fitting and validation. The remaining observations will be used for model fitting. The default NULL value instructs that validation will not be performed. |
scale.transform |
Transformation of the response variable. It can take three values: SQRT, LOG or NONE. |
prior.beta0 |
A scalar value or a vector providing the prior mean for beta parameters. |
prior.M |
Prior precision value (or matrix) for beta. Defaults to a diagonal matrix with diagonal values 10^(-4). |
prior.sigma2 |
Shape and scale parameter value for the gamma prior on 1/sigma^2, the precision. |
prior.tau2 |
Shape and scale parameter value for the gamma prior on tau^2, the nugget effect. |
phi |
The spatial decay parameter for the exponential covariance function. Only used if the package is Stan, or if the model is the spatial model "spat" without nugget effect when the package is "none". |
prior.phi.param |
Lower and upper limits of the uniform prior distribution for phi, the spatial decay parameter. |
prior.range |
A length 2 vector, with (range0, Prange), specifying that P(range < range0) = Prange. |
prior.sigma |
A length 2 vector, with (sigma0, Psigma), specifying that P(sigma > sigma0) = Psigma. |
offset |
Only used in INLA based modeling. Offset parameter. See documentation for |
max.edge |
Only used in INLA based modeling. See documentation for |
cov.model |
Only relevant for the spBayes package. Default is the exponential model.
See the documentation for |
N |
MCMC sample size. Default value 5000. |
burn.in |
How many initial iterations to discard. Default value 1000. Only relevant for MCMC based model fitting, i.e., when package is spBayes or Stan. |
rseed |
Random number seed that controls the starting point for the random number stream. A set value is required to help reproduce the results. |
n.report |
How many times to report in MCMC progress. This is used only when the package is spBayes. |
no.chains |
Number of parallel chains to run in Stan. |
ad.delta |
Adaptive delta controlling the behavior of Stan during fitting. |
s.size |
step size in the fitting process of Stan. |
t.depth |
Maximum allowed tree depth in the fitting process of Stan. |
verbose |
Logical scalar value: whether to print various estimates and statistics. |
plotit |
Logical scalar value: whether to plot the predictions against the observed values. |
mchoice |
Logical scalar value: whether model choice statistics should be calculated. |
... |
Any additional arguments that may be passed on to the fitting package. |
A list containing:
params - A table of parameter estimates
fit - The fitted model object. This is present only if a named package, e.g. spBayes, has been used.
max.d - Maximum distance between data locations. This is in units of kilometers unless the coordtype argument is set as plain.
fitteds - A vector of fitted values.
mchoice - Calculated model choice statistics if those have been requested by the input argument mchoice=TRUE. Not all model fits will contain all the model choice statistics.
stats - The four validation statistics: rmse, mae, crps and coverage. This is present only if model validation has been performed.
yobs_preds - A data frame containing the validation rows of the model fitting data frame. The last five columns of this data frame contains the validation prediction summaries: mean, sd, median, and 95% prediction interval. This is present only if model validation has been performed.
valpreds - A matrix containing the MCMC samples of the validation predictions. The dimension of this matrix is the number of validations times the number of retained MCMC samples. This is present only if model validation has been performed.
validationplots - Present only if validation has been performed. Contains three validation plots: with and without segments, and an ordinary plot. See obs_v_pred_plot for more.
residuals - A vector of residual values.
sn - The number of data locations used in fitting.
tn - Defaults to 1. Used for plotting purposes.
phi - If present, this contains the fixed value of the spatial decay parameter used to fit the model.
prior.phi.param - If present, this contains the values of the hyperparameters of the prior distribution for the spatial decay parameter.
prior.range - Present only if the INLA package has been used in model fitting. This contains the values of the hyperparameters of the prior distribution for the range.
logliks - A list containing the log-likelihood values used in calculation of the model choice statistics if those have been requested in the first place.
formula - The input formula for the regression part of the model.
scale.transform - The transformation adopted by the input argument with the same name.
package - The name of the package used for model fitting.
model - The name of the fitted model.
call - The command used to call the model fitting function.
computation.time - Computation time required to run the model fitting.
See also Bsptime for Bayesian spatio-temporal model fitting.
N <- 10 burn.in <- 5 n.report <- 2 a <- Bspatial(formula = mpg ~ wt, data = mtcars, package = "none", model = "lm", N = N) summary(a) plot(a) print(a) b <- Bspatial(formula = mpg ~ disp + wt + qsec + drat, data = mtcars, validrows = c(8, 11, 12, 14, 18, 21, 24, 28), N = N) #' print(b) summary(b) ## Illustration with the nyspatial data set head(nyspatial) ## Linear regression model fitting M1 <- Bspatial(formula = yo3 ~ xmaxtemp + xwdsp + xrh, data = nyspatial, mchoice = TRUE, N = N) print(M1) plot(M1) a <- residuals(M1) summary(M1) ## Spatial model fitting M2 <- Bspatial(model = "spat", formula = yo3 ~ xmaxtemp + xwdsp + xrh, data = nyspatial, coordtype = "utm", coords = 4:5, phi = 0.4, mchoice = TRUE, N = N) names(M2) print(M2) plot(M2) a <- residuals(M2) summary(M2) ## Fit model 2 on the square root scale M2root <- Bspatial(model = "spat", formula = yo3 ~ xmaxtemp + xwdsp + xrh, data = nyspatial, coordtype = "utm", coords = 4:5, scale.transform = "SQRT") summary(M2root) ## Spatial model fitting using spBayes M3 <- Bspatial(package = "spBayes", formula = yo3 ~ xmaxtemp + xwdsp + xrh, data = nyspatial, coordtype = "utm", coords = 4:5, prior.phi = c(0.005, 2), mchoice = TRUE, N = N, burn.in = burn.in, n.report = n.report) summary(M3) # Spatial model fitting using stan (with a small number of iterations) M4 <- Bspatial(package = "stan", formula = yo3 ~ xmaxtemp + xwdsp + xrh, data = nyspatial, coordtype = "utm", coords = 4:5, phi = 0.4, N = N, burn.in = burn.in, mchoice = TRUE) summary(M4) ## K fold cross-validation for M2 only set.seed(44) x <- runif(n = 28) u <- order(x) # Here are the four folds s1 <- u[1:7] s2 <- u[8:14] s3 <- u[15:21] s4 <- u[22:28] summary((1:28) - sort(c(s1, s2, s3, s4))) ## check v1 <- Bspatial(model = "spat", formula = yo3 ~ xmaxtemp + xwdsp + xrh, data = nyspatial, coordtype = "utm", coords = 4:5, validrows = s1, phi = 0.4, N = N) v2 <- Bspatial(model = "spat", formula = yo3 ~ xmaxtemp + xwdsp + xrh, data = nyspatial, coordtype = "utm", coords = 4:5, validrows = s2, phi = 0.4, N = N) v3 <- Bspatial(model = "spat", formula = yo3 ~ xmaxtemp + xwdsp + xrh, data = nyspatial, coordtype = "utm", coords = 4:5, validrows = s3, phi = 0.4, N = N) v4 <- Bspatial(model = "spat", formula = yo3 ~ xmaxtemp + xwdsp + xrh, data = nyspatial, coordtype = "utm", coords = 4:5, validrows = s4, phi = 0.4, N = N) M2.val.table <- cbind(unlist(v1$stats), unlist(v2$stats), unlist(v3$stats), unlist(v4$stats)) dimnames(M2.val.table)[[2]] <- paste("Fold", 1:4, sep = "") round(M2.val.table, 3) ## Model validation s <- c(1, 5, 10) M1.v <- Bspatial(model = "lm", formula = yo3 ~ xmaxtemp + xwdsp + xrh, data = nyspatial, coordtype = "utm", coords = 4:5, validrows = s, N = N, burn.in = burn.in) M2.v <- Bspatial(model = "spat", formula = yo3 ~ xmaxtemp + xwdsp + xrh, data = nyspatial, coordtype = "utm", coords = 4:5, validrows = s, phi = 0.4, N = N, burn.in = burn.in) M3.v <- Bspatial(package = "spBayes", formula = yo3 ~ xmaxtemp + xwdsp + xrh, data = nyspatial, coordtype = "utm", coords = 4:5, validrows = s, prior.phi = c(0.005, 2), n.report = 2, N = N, burn.in = burn.in) # Collect all the results Mall.table <- cbind(unlist(M1.v$stats), unlist(M2.v$stats), unlist(M3.v$stats)) colnames(Mall.table) <- paste("M", c(1:3), sep = "") round(Mall.table, 3) if (require(INLA) & require(inlabru)) { N <- 10 burn.in <- 5 # Spatial model fitting using INLA M5 <- Bspatial(package = "inla", formula = yo3 ~ xmaxtemp + xwdsp + xrh, data = nyspatial, coordtype = "utm", coords = 4:5, mchoice = TRUE, N = N, burn.in = 
burn.in) summary(M5) }
Bayesian regression model fitting for point referenced spatio-temporal data. Calculates parameter estimates, validation statistics, and estimated values of several Bayesian model choice criteria.
Bsptime( formula, data, package = "none", model = "GP", coordtype = NULL, coords = NULL, validrows = NULL, scale.transform = "NONE", prior.beta0 = 0, prior.M = 1e-04, prior.sigma2 = c(2, 1), prior.tau2 = c(2, 0.1), prior.sigma.eta = c(2, 0.001), phi.s = NULL, phi.t = NULL, prior.phi = "Gamm", prior.phi.param = NULL, phi.tuning = NULL, phi.npoints = NULL, prior.range = c(1, 0.5), prior.sigma = c(1, 0.005), offset = c(10, 140), max.edge = c(50, 1000), rhotp = 0, time.data = NULL, truncation.para = list(at = 0, lambda = 2), newcoords = NULL, newdata = NULL, annual.aggrn = "NONE", cov.model = "exponential", g_size = NULL, knots.coords = NULL, tol.dist = 0.005, N = 2000, burn.in = 1000, rseed = 44, n.report = 2, no.chains = 1, ad.delta = 0.8, t.depth = 15, s.size = 0.01, verbose = FALSE, plotit = TRUE, mchoice = FALSE, ... )
formula |
An object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. |
data |
The data frame for which the model formula is to be fitted. The data frame should be in long format having one row for each location and time combination. The data frame must be ordered by time within each site, and should optionally have a column, named s.index, providing the site indices. Thus the data, with n sites and T times within each site, should be organized in the order: (s1, t1), (s1, t2), ... (s1, T), ... (sn, t1), ... (sn, T). The data frame should also contain two columns giving the coordinates of the locations for spatio temporal model fitting. |
package |
Which package is to be used in model fitting? Currently available packages are "spTimer", "spBayes", "stan", "inla", "sptDyn" and the default "none", as illustrated in the examples below.
Further details and more examples are provided in Chapters 7-9 of the book Sahu (2022). |
model |
The model to be fitted. This argument is passed to the fitting package. In case the package is none, then it can be either "lm" or "separable". The "lm" option is for an independent error regression model while the other option fits a separable model without any nugget effect. The separable model fitting method cannot handle missing data. All missing data points in the response variable will be replaced by the grand mean of the available observations. |
coordtype |
Type of coordinates: utm, lonlat or plain with utm (supplied in meters) as the default. Distance will be calculated in units of kilometer if this argument is either utm or lonlat. Euclidean distance will be calculated if this is given as the third type plain. If distance in meter is to be calculated then coordtype should be passed on as plain although the coords are supplied in UTM. |
coords |
A vector of size two identifying the two column numbers of the data frame to take as coordinates. Or this can be given as a matrix of number of sites by 2 providing the coordinates of all the data locations. |
validrows |
A vector of row numbers of the supplied data frame which should be used for validation. When the model is "separable" this argument must include all the time points for the sites to be validated. Otherwise, the user is allowed to select the row numbers of the data frame validation as they wish. The default NULL value instructs that validation will not be performed.
|
scale.transform |
Transformation of the response variable. It can take three values: SQRT, LOG or NONE. Default value is "NONE". |
prior.beta0 |
A scalar value or a vector providing the prior mean for beta parameters. |
prior.M |
Prior precision value (or matrix) for beta. Defaults to a diagonal matrix with diagonal values 10^(-4). |
prior.sigma2 |
Shape and scale parameter value for the gamma prior on 1/sigma^2, the precision. |
prior.tau2 |
Shape and scale parameter value for the gamma prior on tau^2, the nugget effect. |
prior.sigma.eta |
Shape and scale parameter value for the inverse gamma prior distribution for sigma^2 eta; only used in the spBayes package. |
phi.s |
Only used if the model is "separable". The value of the fixed spatial decay parameter for the exponential covariance function. If this is not provided then a value is chosen which corresponds to an effective range which is the maximum distance between the data locations. |
phi.t |
Only used if the model is "separable". The fixed decay parameter for the exponential covariance function in the temporal domain. If this is not provided then a value is chosen which corresponds to an effective temporal range which is the maximum time of the data set. |
prior.phi |
Specifies the prior distribution for phi, the spatial decay parameter. |
prior.phi.param |
Lower and upper limits of the uniform prior distribution for phi, the spatial decay parameter. For the default uniform distribution the values correspond to an effective range that is between 1% and 100% of the maximum distance between the data locations. For the Gamma distribution the default values are 2 and 1, and for the Cauchy distribution the default values are 0 and 1, which specifies a half-Cauchy distribution on the positive half-line. |
phi.tuning |
Only relevant for spTimer and spTDyn models. Tuning parameter for sampling phi. See the help file for spT.Gibbs. |
phi.npoints |
Only relevant for spTimer and spTDyn models. Number of points for the discrete uniform prior distribution on phi. See the help file for spT.Gibbs |
prior.range |
A length 2 vector, with (range0, Prange), specifying that P(range < range0) = Prange. |
prior.sigma |
A length 2 vector, with (sigma0, Psigma), specifying that P(sigma > sigma0) = Psigma. |
offset |
Only used in INLA based modeling. Offset parameter. See documentation for |
max.edge |
Only used in INLA based modeling. See documentation for |
rhotp |
Only relevant for models fitted by spTDyn. Initial value for the rho parameters in the temporal dynamic model. The default is rhotp=0, for which the parameters are sampled from the full conditional distribution via MCMC with initial value 0. If rhotp=1, the parameters are not sampled and are fixed at value 1. |
time.data |
Defining the segments of the time-series set up using the function |
truncation.para |
Provides truncation parameter lambda and truncation point "at" using list. Only used with the spTimer package for a truncated model. |
newcoords |
The locations of the prediction sites in similar format to the |
newdata |
The covariate values at the prediction sites specified by |
annual.aggrn |
This provides the options for calculating annual summary statistics by aggregating different time segments (e.g., annual mean). Currently implemented options are: "NONE", "ave" and "an4th", where "ave" = annual average, "an4th"= annual 4th highest. Only applicable if spT.time inputs more than one segment and when fit and predict are done simultaneously. Only used in the spTimer package. |
cov.model |
Model for the covariance function. Only relevant for the spBayes, spTimer and the spTDyn packages. Default is the exponential model.
See the documentation for |
g_size |
Only relevant for GPP models fitted by either spTimer or spTDyn. The grid size c(m, n) for the knots for the GPP model. A square grid is assumed if this is passed on as a scalar. This does not need to be given if knots.coords is given instead. |
knots.coords |
Only relevant for GPP models fitted by either spTimer or spTDyn. Optional two column matrix of UTM-X and UTM-Y coordinates of the knots in kilometers. It is preferable to specify the g_size parameter instead. |
tol.dist |
Minimum separation distance between any two locations out of those specified by coords, knots.coords and pred.coords. The default is 0.005. The program will exit if the minimum distance is less than the non-zero specified value. This will ensure non-singularity of the covariance matrices. |
N |
MCMC sample size. |
burn.in |
How many initial iterations to discard. Only relevant for MCMC based model fitting, i.e., when package is spBayes or Stan. |
rseed |
Random number seed that controls the starting point for the random number stream. A set value is required to help reproduce the results. |
n.report |
How many times to report in MCMC progress. This is only used when the package is spBayes or spTimer. |
no.chains |
Number of parallel chains to run in Stan. |
ad.delta |
Adaptive delta controlling the behavior of Stan during fitting. |
t.depth |
Maximum allowed tree depth in the fitting process of Stan. |
s.size |
step size in the fitting process of Stan. |
verbose |
Logical scalar value: whether to print various estimates and statistics. |
plotit |
Logical scalar value: whether to plot the predictions against the observed values. |
mchoice |
Logical scalar value: whether model choice statistics should be calculated. |
... |
Any additional arguments that may be passed on to the fitting package. |
A list containing:
params - A table of parameter estimates
fit - The fitted model object. This is present only if a named package, e.g. spTimer, has been used.
max.d - Maximum distance between data locations. This is in units of kilometers unless the coordtype argument is set as plain.
fitteds - A vector of fitted values.
mchoice - Calculated model choice statistics if those have been requested by the input argument mchoice=TRUE. Not all model fits will contain all the model choice statistics.
stats - The four validation statistics: rmse, mae, crps and coverage. This is present only if model validation has been performed.
yobs_preds - A data frame containing the validation rows of the model fitting data frame. The last five columns of this data frame contains the validation prediction summaries: mean, sd, median, and 95% prediction interval. This is present only if model validation has been performed.
valpreds - A matrix containing the MCMC samples of the validation predictions. The dimension of this matrix is the number of validations times the number of retained MCMC samples. This is present only if model validation has been performed.
validationplots - Present only if validation has been performed. Contains three validation plots: with and without segments, and an ordinary plot. See obs_v_pred_plot for more.
residuals - A vector of residual values.
sn - The number of data locations used in fitting.
tn - The number of time points used in fitting.
phi.s, phi.t - Adopted values of the spatial and temporal decay parameters if those were fixed during model fitting.
prior.phi - If present, this contains the name of the prior distribution for the spatial decay parameter used to fit the model.
prior.phi.param - If present, this contains the values of the hyperparameters of the prior distribution for the spatial decay parameter.
prior.range - Present only if the INLA package has been used in model fitting. This contains the values of the hyperparameters of the prior distribution for the range.
logliks - A list containing the log-likelihood values used in calculation of the model choice statistics if those have been requested in the first place.
knots.coords - The locations of the knots if the model has been fitted using the GPP method.
formula - The input formula for the regression part of the model.
scale.transform - The transformation adopted by the input argument with the same name.
package - The name of the package used for model fitting.
model - The name of the fitted model.
call - The command used to call the model fitting function.
computation.time - Computation time required to run the model fitting.
Sahu SK (2022). Bayesian Modeling of Spatio Temporal Data with R, 1st edition. Chapman and Hall, Boca Raton. https://doi.org/10.1201/9780429318443.
# Set the total number of iterations N <- 45 # Set the total number of burn-in iterations burn.in <- 5 # How many times to report progress n.report <- 2 # Model formula used in most model fitting f2 <- y8hrmax ~ xmaxtemp + xwdsp + xrh # Check out the data set head(nysptime) ## Fit linear regression model M1 <- Bsptime(model = "lm", data = nysptime, formula = f2, scale.transform = "SQRT", N = N, burn.in = burn.in, mchoice = TRUE) names(M1) plot(M1) print(M1) summary(M1) a <- residuals(M1, numbers = list(sn = 28, tn = 62)) M2 <- Bsptime(model = "separable", data = nysptime, formula = f2, coordtype = "utm", coords = 4:5, mchoice = TRUE, scale.transform = "SQRT", N = N, burn.in = burn.in) names(M2) plot(M2) print(M2) summary(M2) b <- residuals(M2) # Spatio-temporal model fitting and validation valids <- c(8, 11) vrows <- which(nysptime$s.index %in% valids) ## Fit separable spatio-temporal model M2.1 <- Bsptime(model = "separable", formula = f2, data = nysptime, validrows = vrows, coordtype = "utm", coords = 4:5, phi.s = 0.005, phi.t = 0.05, scale.transform = "SQRT", N = N) summary(M2.1) plot(M2.1) # Use spTimer to fit independent GP model M3 <- Bsptime(package = "spTimer", formula = f2, data = nysptime, coordtype = "utm", coords = 4:5, scale.transform = "SQRT", mchoice = TRUE, N = N, burn.in = burn.in, n.report = 2) summary(M3) valids <- c(1, 5, 10) validt <- sort(sample(1:62, size = 31)) vrows <- getvalidrows(sn = 28, tn = 62, valids = valids, validt = validt) ymat <- matrix(nysptime$y8hrmax, byrow = TRUE, ncol = 62) yholdout <- ymat[valids, validt] # Perform validation M31 <- Bsptime(package = "spTimer", formula = f2, data = nysptime, coordtype = "utm", coords = 4:5, validrows = vrows, model = "GP", scale.transform = "NONE", N = N, burn.in = burn.in, n.report = 2) summary(M31) modfit <- M31$fit ## Extract the fits for the validation sites fitall <- data.frame(modfit$fitted) head(fitall) tn <- 62 fitall$s.index <- rep(1:28, each = tn) library(spTimer) vdat <- spT.subset(data = nysptime, var.name = c("s.index"), s = valids) fitvalid <- spT.subset(data = fitall, var.name = c("s.index"), s = valids) head(fitvalid) fitvalid$low <- fitvalid$Mean - 1.96 * fitvalid$SD fitvalid$up <- fitvalid$Mean + 1.96 * fitvalid$SD fitvalid$yobs <- sqrt(vdat$y8hrmax) fitvalid$yobs <- vdat$y8hrmax yobs <- matrix(fitvalid$yobs, byrow = TRUE, ncol = tn) y.valids.low <- matrix(fitvalid$low, byrow = TRUE, ncol = tn) y.valids.med <- matrix(fitvalid$Mean, byrow = TRUE, ncol = tn) y.valids.up <- matrix(fitvalid$up, byrow = TRUE, ncol = tn) library(ggplot2) p1 <- fig11.13.plot(yobs[1, ], y.valids.low[1, ], y.valids.med[1, ], y.valids.up[1, ], misst = validt) p1 <- p1 + ggtitle("Validation for Site 1") p1 p2 <- fig11.13.plot(yobs[2, ], y.valids.low[2, ], y.valids.med[2, ], y.valids.up[2, ], misst = validt) p2 <- p2 + ggtitle("Validation for Site 5") p2 p3 <- fig11.13.plot(yobs[3, ], y.valids.low[3, ], y.valids.med[3, ], y.valids.up[3, ], misst = validt) p3 <- p3 + ggtitle("Validation for Site 10") p3 ## Independent marginal GP model fitting using rstan M4 <- Bsptime(package = "stan", formula = f2, data = nysptime, coordtype = "utm", coords = 4:5, N = N, burn.in = burn.in, verbose = FALSE) summary(M4) # Spatio-temporal hierarchical auto-regressive modeling useing spTimer M5 <- Bsptime(package = "spTimer", model = "AR", formula = f2, data = nysptime, coordtype = "utm", coords = 4:5, scale.transform = "SQRT", mchoice = TRUE, n.report = n.report, N = N, burn.in = burn.in) summary(M5) a <- residuals(M5) ## Spatio-temporal 
dynamic model fitting using spTDyn library(spTDyn) f3 <- y8hrmax ~ xmaxtemp + sp(xmaxtemp) + tp(xwdsp) + xrh M7 <- Bsptime(package = "sptDyn", model = "GP", formula = f3, data = nysptime, coordtype = "utm", coords = 4:5, scale.transform = "SQRT", mchoice = TRUE, N = N, burn.in = burn.in, n.report = n.report) summary(M7) # Dynamic Model fitting using spBayes M8 <- Bsptime(package = "spBayes", formula = f2, data = nysptime, prior.sigma2 = c(2, 25), prior.tau2 = c(2, 25), prior.sigma.eta = c(2, 0.001), coordtype = "utm", coords = 4:5, scale.transform = "SQRT", N = N, burn.in = burn.in, n.report = n.report) summary(M8) ## Gaussian Predictive Process based model fitting using spTimer M9 <- Bsptime(package = "spTimer", model = "GPP", g_size = 5, formula = f2, data = nysptime, coordtype = "utm", coords = 4:5, scale.transform = "SQRT", N = N, burn.in = burn.in, n.report = n.report) summary(M9) # This INLA run may take a long time if (require(INLA) & require(inlabru)) { f2 <- y8hrmax ~ xmaxtemp + xwdsp + xrh M6 <- Bsptime(package = "inla", model = "AR", formula = f2, data = nysptime, coordtype = "utm", coords = 4:5, scale.transform = "SQRT", offset = c(100, 200), max.edge = c(500, 10000), mchoice = TRUE, plotit=TRUE) # Takes a minute summary(M6) }
Calculates the DIC criterion values. It has two arguments: (1) the log of the full likelihood evaluated at thetahat, the Bayes estimate of the parameters, and (2) a vector of log-likelihood values evaluated at the theta samples.
calculate_dic(loglikatthetahat, logliks)
loglikatthetahat |
Log of the likelihood function at theta hat (Bayes). It is a scalar value. |
logliks |
A vector of log likelihood values at the theta samples |
a list containing four values pdic, pdicalt, dic and dicalt
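An illustrative sketch (not part of the package's own examples) of calling calculate_dic directly, using a simple normal mean model with known variance; the simulated data y, the posterior samples theta and the helper function loglik are assumptions made purely for illustration.

# Simulated data and posterior samples for a normal mean model with variance 1
set.seed(1)
y <- rnorm(50, mean = 2)
theta <- rnorm(1000, mean = mean(y), sd = 1 / sqrt(length(y)))
# Full log-likelihood as a function of theta
loglik <- function(th) sum(dnorm(y, mean = th, sd = 1, log = TRUE))
logliks <- sapply(theta, loglik)
# DIC evaluated at the posterior mean of theta
calculate_dic(loglikatthetahat = loglik(mean(theta)), logliks = logliks)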
Calculates the four validation statistics: RMSE, MAE, CRPS and coverage given the observed values and MCMC iterates.
calculate_validation_statistics(yval, yits, level = 95, summarystat = "mean")
yval |
A vector containing n observed values of the response variable. |
yits |
A n by N matrix of predictive samples from the n observations contained in yval. |
level |
The nominal coverage level, defaults to 95%. |
summarystat |
Summary statistics to use to calculate the validation predictions from the samples. It should be a function like mean or median which can be calculated by R. The default is mean. |
A list giving the rmse, mae, crps and coverage.
set.seed(4) vrows <- sample(nrow(nysptime), 100) M1 <- Bsptime(model="lm", formula=y8hrmax~xmaxtemp+xwdsp+xrh, data=nysptime, validrows=vrows, scale.transform = "SQRT") valstats <- calculate_validation_statistics(M1$yobs_preds$y8hrmax, yits=t(M1$valpreds)) unlist(valstats)
Calculates the WAIC criterion values. Its sole argument is the matrix of component-wise (pointwise) log-likelihood evaluations at the theta samples; v must be a matrix.
calculate_waic(v)
v |
Must be an N (number of MCMC samples) by n (number of data points) matrix of log-likelihood evaluations. |
A list containing four values: p_waic, p_waic_alt, waic and waic_alt.
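A matching sketch for calculate_waic using the same illustrative normal model as above; the matrix v has one row per MCMC sample and one column per data point, which is the orientation described above, and the simulated quantities are assumptions for illustration only.

set.seed(1)
y <- rnorm(50, mean = 2)
theta <- rnorm(1000, mean = mean(y), sd = 1 / sqrt(length(y)))
# Pointwise log-likelihood matrix: N (samples) by n (data points)
v <- t(sapply(theta, function(th) dnorm(y, mean = th, sd = 1, log = TRUE)))
calculate_waic(v)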
Prints and returns the estimates of the coefficients
## S3 method for class 'bmstdr' coef(object, digits = 3, ...)
object |
A bmstdr model fit object. |
digits |
How many significant digits after the decimal to print, defaults to 3. |
... |
Any other additional arguments |
Coefficients are returned as a data frame preserving the names of the covariates
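A short usage sketch, assuming the nyspatial data set shipped with the package: fit a quick independent-error model with a deliberately small number of iterations and then extract the coefficient table.

M1 <- Bspatial(formula = yo3 ~ xmaxtemp + xwdsp + xrh, data = nyspatial,
               N = 50, burn.in = 10)
coef(M1)               # default: 3 digits
coef(M1, digits = 4)   # more digits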
The color palette used to draw the maps illustrating the bmstdr package; see Sahu (2022). It has the values, in order: dodgerblue4, dodgerblue2, firebrick2, firebrick4 and purple.
colpalette
An object of class character
of length 5.
Sahu SK (2022). Bayesian Modeling of Spatio Temporal Data with R, 1st edition. Chapman and Hall, Boca Raton. https://doi.org/10.1201/9780429318443.
colpalette
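A minimal sketch, assuming ggplot2 is available, showing how the five colours could be supplied to a discrete fill scale; the toy data frame d is invented for illustration only.

library(ggplot2)
d <- data.frame(grp = letters[1:5], val = 1:5)
ggplot(d, aes(x = grp, y = val, fill = grp)) +
  geom_col() +
  scale_fill_manual(values = colpalette)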
Number of weekly Covid-19 deaths and cases in the 313 Local Authority Districts, Counties and Unitary Authorities (LADCUA) in England during the 20 weeks of the first peak, from March 13 to July 31, 2020.
engdeaths
An object of class data.frame
with 6260 rows and 24 columns.
Sahu and Böhning (2021). @format A data frame with 6260 rows and 24 columns:
Areacode identifier of the 313 Local Authority Districts, Counties and Unitary Authorities (LADCUA)
A numeric column identifying the map area needed for plotting
A numeric variable taking value 1 to 313 identifying the LADCUA's
Identifies one of the 9 English regions
Population number in mid-2019
Percentage of the working age population receiving job-seekers allowance during January 2020
Median house price in March 2020
Population density in mid-2019
Estimated average value of NO2 at the centroid of the LADCUA
Number of Covid-19 deaths within 28 days of a positive test
Number of deaths
Number of cases
Log of the standardized case morbidity during the current week
Log of the standardized case morbidity during the week before
Log of the standardized case morbidity during the second week before
Log of the standardized case morbidity during the third week before
Log of the standardized case morbidity during the fourth week before
Expected number of Covid-19 deaths. See Sahu and Bohning (2021) for methodology.
Expected number of cases.
Log of the column Edeaths
Log of the column Ecases
A binary (0-1) random variable taking the value 1 if the SMR of Covid-19 death is higher than 1
Sahu SK, Böhning D (2021). “Bayesian spatio-temporal joint disease mapping of Covid-19 cases and deaths in local authorities of England.” Spatial Statistics. doi:10.1016/j.spasta.2021.100519.
colnames(engdeaths) dim(engdeaths) summary(engdeaths[, 11:24])
Total number of weekly Covid-19 deaths and cases in the 313 Local Authority Districts, Counties and Unitary Authorities (LADCUA) in England during the first peak from March 13 to July 31, 2020.
engtotals
An object of class data.frame
with 313 rows and 19 columns.
Sahu and Böhning (2021). @format A data frame with 313 rows and 19 columns:
Areacode identifier of the 313 Local Authority Districts, Counties and Unitary Authorities (LADCUA)
A numeric column identifying the map area needed for plotting
A numeric variable taking value 1 to 313 identifying the LADCUA's
Identifies one of the 9 English regions
Population number in mid-2019
Percentage of the working age population receiving job-seekers allowance during January 2020
Median house price in March 2020
Population density in mid-2019
Start date of the week
Week numbers 11 to 30
Estimated average value of NO2 at the centroid of the LADCUA in that week
Number of Covid-19 deaths within 28 days of a positive test
Total number of deaths
Total number of cases
Expected number of Covid-19 deaths. See Sahu and Bohning (2021) for methodology.
Expected number of cases.
Log of the column Edeaths
Log of the column Ecases
Standardised morbidity rate for the number of cases, noofcases/Ecases
Number of weeks during March 13 to July 31, 2020. All values are 20.
Number of weeks out of 20 when the casesmr was greater than 1
Sahu SK, Böhning D (2021). “Bayesian spatio-temporal joint disease mapping of Covid-19 cases and deaths in local authorities of England.” Spatial Statistics. doi:10.1016/j.spasta.2021.100519.
colnames(engtotals) dim(engtotals) summary(engtotals[, 5:14])
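A hedged sketch of how engtotals might be analysed with Bcartime (documented earlier in this manual). The adjacency matrix Weng for the 313 LADCUAs is not supplied in this data set description and is an assumption here, as are the column names covid, logEdeaths, jsa, houseprice and popdensity; check names(engtotals) before running.

# Hypothetical spatial CAR model for the total number of Covid-19 deaths;
# Weng is an assumed 313 x 313 binary adjacency matrix for the LADCUAs.
f1 <- covid ~ offset(logEdeaths) + jsa + log10(houseprice) + log(popdensity)
M <- Bcartime(formula = f1, data = engtotals, family = "poisson", W = Weng,
              package = "CARBayes", model = "leroux",
              N = 2000, burn.in = 1000, thin = 10)
summary(M)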
Draws a time series (ribbon) plot by combining fitted and predicted values
fig11.13.plot(yobs, ylow, ymed, yup, misst)
yobs |
A vector of the observed values |
ylow |
A vector of the lower limits of the fitted or predicted values |
ymed |
A vector of fitted or predicted values |
yup |
A vector of the upper limits of the fitted or predicted values |
misst |
An integer vector which contains the indices of the predicted values |
A ribbon plot (a ggplot2 object) showing the observed values as red open circles and the predicted values as blue filled circles.
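An illustrative sketch with simulated series, assuming ggplot2 is installed; in practice the four input vectors would come from the fitted and predicted summaries of a spatio-temporal model, as in the Bsptime examples above, and all values below are invented.

set.seed(1)
tn <- 30
yobs <- 5 + cumsum(rnorm(tn, sd = 0.5))   # observed series
ymed <- yobs + rnorm(tn, sd = 0.3)        # fitted/predicted values
ylow <- ymed - 1                          # lower limits
yup  <- ymed + 1                          # upper limits
misst <- sort(sample(1:tn, 10))           # indices treated as predicted
p <- fig11.13.plot(yobs, ylow, ymed, yup, misst = misst)
p + ggplot2::ggtitle("Illustrative ribbon plot")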
Extract fitted values from bmstdr objects.
## S3 method for class 'bmstdr' fitted(object, ...)
object |
A bmstdr model fit object. |
... |
Any other additional arguments. |
Returns a vector of fitted values.
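A short usage sketch, reusing a quick Bspatial fit to the nyspatial data with a small number of iterations chosen only to keep the run fast.

M1 <- Bspatial(formula = yo3 ~ xmaxtemp + xwdsp + xrh, data = nyspatial,
               N = 50, burn.in = 10)
head(fitted(M1))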
This function is used to delete values outside the state of New York
fnc.delete.map.XYZ(xyz)
xyz |
A list containing the x values, y values and interpolated z values at each x and y pair. |
Returns the input but with NA placed in z values corresponding to the locations whose x-y coordinates are outside the land boundary of the USA.
Obtains parameter estimates from MCMC samples
get_parameter_estimates(samps, level = 95)
samps |
A matrix of N by p samples for the p parameters |
level |
Desired confidence level - defaults to 95%. |
A data frame containing four columns: mean, sd, low (er limit), and up (per limit) for the p parameters.
samps <- matrix(rnorm(10000), ncol= 10 ) dim(samps) a <- get_parameter_estimates(samps) a b <- get_parameter_estimates(samps, level=98) b
Obtains suitable validation summary statistics from MCMC samples obtained for validation.
get_validation_summaries(samps, level = 95)
samps |
A matrix of N by p samples for the p parameters |
level |
Desired confidence level - defaults to 95%. |
A data frame containing five columns: meanpred, medianpred, sdpred, low (er limit), and up (per limit) for the p parameters.
set.seed(4) vrows <- sample(nrow(nysptime), 100) M1 <- Bsptime(model="lm", formula=y8hrmax~xmaxtemp+xwdsp+xrh, data=nysptime, validrows=vrows, scale.transform = "SQRT") samps<- M1$valpreds valsums <- get_validation_summaries(samps) head(valsums)
Returns a vector of row numbers for validation.
getvalidrows(sn, tn, valids, validt = NULL, allt = FALSE)
sn |
The total number of spatial locations. |
tn |
The total number of time points in each location. |
valids |
A vector of site numbers in (1:sn) to be used for validation. |
validt |
A vector of time points in (1:tn) to be used for validation. |
allt |
Whether all the time points should be used for validation. |
Integer vector providing the row numbers of the data frame for validation.
Output of this function is suitable as the argument validrows for the bmstdr model fitting functions Bsptime and Bcartime.
{ # To validate at site numbers 1, 5, and 10 at 31 randomly selected # time points for the nysptime data set we issue the following commands set.seed(44) vt <- sample(62, 31) vrows <- getvalidrows(sn=28, tn=62, valids=c(1, 5, 10), validt=vt) # To validate at sites 1 and 2 at all time points vrows <- getvalidrows(sn=28, tn=62, valids=c(1, 2), allt=TRUE) }
Values of three covariates for 100 grid locations in New York averaged over the 62 days during the months of July and August, 2006.
gridnyspatial
An object of class data.frame
with 100 rows and 8 columns.
See the NYgrid data set in spTimer package, Bakar and Sahu (2015). Each data row is the mean of the available covariate values at the 100 grid locations during the months of July and August in 2006.
@format A data frame with 100 rows and 8 columns:
site index (1 to 100)
Longitude of the site
Latitude of the site
UTM X-coordinate of the site
UTM Y-coordinate of the site
Average maximum temperature (degree Celsius) at the site over 62 days in July and August, 2006
Average wind speed (nautical mile per hour) over 62 days in July and August, 2006
Average relative humidity over 62 days in July and August, 2006
Bakar KS, Sahu SK (2015). “spTimer: Spatio-Temporal Bayesian Modeling Using R.” Journal of Statistical Software, Articles, 63(15), 1–32. ISSN 1548-7660, doi:10.18637/jss.v063.i15, https://www.jstatsoft.org/v063/i15.
summary(gridnyspatial)
Values of three covariates for 100 grid locations in New York for the 62 days during the months of July and August, 2006.
gridnysptime
An object of class data.frame
with 6200 rows and 11 columns.
It is the same data set as the NYgrid data set in the spTimer package, Bakar and Sahu (2015), with two added columns providing the UTM X- and Y- coordinates. Each data row is for a particular grid site and day as detailed below.
@format A data frame with 6200 rows and 11 columns:
site index (1 to 100)
Longitude of the site
Latitude of the site
UTM X-coordinate of the site
UTM Y-coordinate of the site
This is 2006 for all the rows
Month taking values 7 for July and 8 for August
Day taking values 1 to 31
Maximum temperature (degree Celsius)
wind speed (nautical mile per hour)
Relative humidity
Bakar KS, Sahu SK (2015). “spTimer: Spatio-Temporal Bayesian Modeling Using R.” Journal of Statistical Software, Articles, 63(15), 1–32. ISSN 1548-7660, doi:10.18637/jss.v063.i15, https://www.jstatsoft.org/v063/i15.
head(gridnysptime) summary(gridnysptime[, 9:11])
Calculate the hit and false alarm rates
hitandfalsealarm(thresh, yobs, ypred)
thresh |
Threshold value |
yobs |
A vector of observations, may include missing values |
ypred |
Predictions |
A list containing the calculated hit and false alarm rates
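A minimal sketch with simulated observations and predictions; the threshold of 3 and the injected missing values are arbitrary choices for illustration.

set.seed(2)
yobs  <- rnorm(100, mean = 3)
ypred <- yobs + rnorm(100, sd = 0.5)
yobs[c(5, 17)] <- NA   # missing observations are allowed
hitandfalsealarm(thresh = 3, yobs = yobs, ypred = ypred)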
Is it a bmstdr model fitted object?
is.bmstdr(x)
x |
Any R object. |
A TRUE/FALSE logical output.
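A trivial usage sketch contrasting a quick bmstdr fit with an ordinary lm fit; the small N is only to keep the run short.

M1 <- Bspatial(formula = yo3 ~ xmaxtemp + xwdsp + xrh, data = nyspatial,
               N = 50, burn.in = 10)
is.bmstdr(M1)                                    # TRUE
is.bmstdr(lm(yo3 ~ xmaxtemp, data = nyspatial))  # FALSE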
The Matérn covariance function as given in Banerjee, Carlin and Gelfand (2015).
materncov(d, phi, nu)
d |
is the distance |
phi |
is the rate of decay |
nu |
is the smoothness parameter |
Returns the Matérn covariance for the distance object d
The Matérn covariance function as given in Banerjee et al. (2015).
maternfun(d, phi, nu)
d |
is the distance |
phi |
is the rate of decay |
nu |
is the smoothness parameter |
Returns the Matérn correlation function for the distance object d
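A sketch evaluating the Matérn correlation over a grid of positive distances for two smoothness values; the decay value 0.5 and the smoothness values are arbitrary, and the distances start just above zero on the assumption that the function is evaluated at d > 0.

d <- seq(0.01, 10, length.out = 200)
plot(d, maternfun(d, phi = 0.5, nu = 0.5), type = "l",
     xlab = "distance", ylab = "Matern correlation")
lines(d, maternfun(d, phi = 0.5, nu = 1.5), lty = 2)
legend("topright", legend = c("nu = 0.5", "nu = 1.5"), lty = 1:2)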
Average ozone concentration values and three covariates from 28 sites in New York.
nyspatial
An object of class data.frame
with 28 rows and 9 columns.
See the NYdata set in spTimer package, Bakar and Sahu (2015). Each data row is the mean of the available daily 8-hour maximum average ozone concentrations in parts per billion (ppb) at each of the 28 sites. The daily values are for the months of July and August in 2006. @format A data frame with 28 rows and 9 columns:
site index (1 to 28)
Longitude of the site
Latitude of the site
UTM X-coordinate of the site
UTM Y-coordinate of the site
Average ozone concentration value (ppb) at the site over 62 days in July and August, 2006
Average maximum temperature (degree Celsius) at the site over 62 days in July and August, 2006
Average wind speed (nautical mile per hour) over 62 days in July and August, 2006
Average relative humidity over 62 days in July and August, 2006
Bakar KS, Sahu SK (2015). “spTimer: Spatio-Temporal Bayesian Modeling Using R.” Journal of Statistical Software, Articles, 63(15), 1–32. ISSN 1548-7660, doi:10.18637/jss.v063.i15, https://www.jstatsoft.org/v063/i15.
head(nyspatial) summary(nyspatial) pairs(nyspatial[, 6:9])
Daily 8-hour maximum ozone concentration values and three covariates from 28 sites in New York for the 62 days during the months of July and August, 2006.
nysptime
An object of class data.frame
with 1736 rows and 12 columns.
It is the same as the NYdata set in the spTimer package, Bakar and Sahu (2015), with two added columns providing the UTM X- and Y- coordinates. Each data row is for a particular site and a day as detailed below.
@format A data frame with 1736 rows and 12 columns:
site index (1 to 28)
Longitude of the site
Latitude of the site
UTM X-coordinate of the site
UTM Y-coordinate of the site
This is 2006 for all the rows
Month taking values 7 for July and 8 for August
Day taking values 1 to 31
Daily 8-hour maximum ozone concentration value
Maximum temperature (degree Celsius)
wind speed (nautical mile per hour)
Relative humidity
Bakar KS, Sahu SK (2015). “spTimer: Spatio-Temporal Bayesian Modeling Using R.” Journal of Statistical Software, Articles, 63(15), 1–32. ISSN 1548-7660, doi:10.18637/jss.v063.i15, https://www.jstatsoft.org/v063/i15.
head(nysptime) summary(nysptime[, 9:12])
Observed against predicted plot
obs_v_pred_plot( yobs, predsums, segments = TRUE, summarystat = "median", plotit = TRUE )
yobs |
A vector containing the actual observations |
predsums |
A data frame containing predictive summary statistics with the same number of rows as the length of the vector yobs. The data frame must have columns named as meanpred, medianpred, sd, low and up. Ideally this argument should be the output of the command get_validation_summaries. |
segments |
Logical: whether to draw line segments for the prediction intervals. |
summarystat |
Can take one of two values "median" (default) or "mean" indicating which one to use for the plot. |
plotit |
Logical scalar value: whether to plot the predictions against the observed values. |
Draws a plot only after removing the missing observations. It also returns a list of three ggplot2 objects: (i) a plot with the prediction intervals drawn as segments (pwithseg), (ii) a plot without the segments (pwithoutseg), and (iii) a simple plot not showing the range of the prediction intervals.
set.seed(4) vrows <- sample(nrow(nysptime), 100) M1 <- Bsptime(model="lm", formula=y8hrmax~xmaxtemp+xwdsp+xrh, data=nysptime, validrows=vrows, scale.transform = "SQRT") psums <- get_validation_summaries(M1$valpreds) oplots <- obs_v_pred_plot(yobs=M1$yobs_preds$y8hrmax, predsum=psums) names(oplots) plot(oplots$pwithoutseg) plot(oplots$pwithseg)
Grid search method for choosing phi. Calculates the validation statistics using the spatial model for a given range of values of the decay parameter phi.
phichoice_sp( formula, data, coordtype, coords, phis, scale.transform, s, N, burn.in, verbose = TRUE, ... )
formula |
An object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. |
data |
The data frame for which the model formula is to be fitted. If a spatial model is to be fitted then the data frame should contain two columns containing the locations of the coordinates. See the coords argument below. |
coordtype |
Type of coordinates: utm, lonlat or plain with utm (supplied in meters) as the default. Distance will be calculated in units of kilometer if this argument is either utm or lonlat. Euclidean distance will be calculated if this is given as the third type plain. If distance in meter is to be calculated then coordtype should be passed on as plain although the coords are supplied in UTM. |
coords |
A vector of size two identifying the two column numbers of the data frame to take as coordinates. Or this can be given as a matrix of number of sites by 2 providing the coordinates of all the data locations. |
phis |
A vector of values of phi. |
scale.transform |
Transformation of the response variable. It can take three values: SQRT, LOG or NONE. |
s |
A vector giving the validation sites |
N |
MCMC sample size. Default value 5000. |
burn.in |
How many initial iterations to discard. Default value 1000. Only relevant for MCMC based model fitting, i.e., when package is spBayes or Stan. |
verbose |
Logical. Should it print progress? |
... |
Any additional parameters that may be passed to the model fitting function Bspatial. |
A data frame giving the phi values and the corresponding validation statistics
a <- phichoice_sp(formula=yo3~xmaxtemp+xwdsp+xrh, data=nyspatial,
   coordtype="utm", coords=4:5, phis=seq(from=0.1, to=1, by=0.4),
   scale.transform="NONE", s=c(8,11,12,14,18,21,24,28),
   N=20, burn.in=10, verbose=TRUE)
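Once the grid search has run, the best value of phi can be read off the returned data frame of validation statistics. A minimal sketch follows; the column names phi and rmse are assumptions about the output of phichoice_sp and should be checked against names(a).

# Inspect the grid of validation statistics and pick the phi with smallest RMSE
# (column names 'phi' and 'rmse' are assumed; verify with names(a))
names(a)
a[which.min(a$rmse), ]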
Calculates the validation statistics using the spatio-temporal model with a given range of values of the spatial and temporal decay parameters phis and phit.
phichoicep( formula, data, coordtype, coords, scale.transform, phis, phit, valids, N, burn.in, verbose = TRUE )
formula |
An object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. |
data |
The data frame for which the model formula is to be fitted. If a spatial model is to be fitted then the data frame should contain two columns containing the locations of the coordinates. See the coords argument below. |
coordtype |
Type of coordinates: utm, lonlat or plain with utm (supplied in meters) as the default. Distance will be calculated in units of kilometers if this argument is either utm or lonlat. Euclidean distance will be calculated if this is given as the third type plain. If distance in meters is to be calculated then coordtype should be passed on as plain although the coords are supplied in UTM. |
coords |
A vector of size two identifying the two column numbers of the data frame to take as coordinates. Or this can be given as a matrix of number of sites by 2 providing the coordinates of all the data locations. |
scale.transform |
Transformation of the response variable. It can take three values: SQRT, LOG or NONE. |
phis |
A vector of values of the spatial decay parameter phi. |
phit |
A vector of values of the temporal decay parameter phi. |
valids |
A vector giving the validation sites |
N |
MCMC sample size. |
burn.in |
How many initial iterations to discard. Only relevant for MCMC based model fitting, i.e., when package is spBayes or Stan. |
verbose |
Logical. Should progress be printed? |
A data frame giving the phi values and the corresponding validation statistics
a <- phichoicep(formula=y8hrmax ~ xmaxtemp+xwdsp+xrh, data=nysptime,
   coordtype="utm", coords=4:5, scale.transform = "SQRT",
   phis=c(0.001, 0.005, 0.025, 0.125, 0.625), phit=c(0.05, 0.25, 1.25, 6.25),
   valids=c(8,11,12,14,18,21,24,28), N=20, burn.in=10, verbose=TRUE)
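As with phichoice_sp, the returned data frame can be scanned for the best pair of decay parameters. The column names phis, phit and rmse below are assumptions about the output and should be verified against names(a).

# Pick the (phis, phit) combination with the smallest validation RMSE
# (column names are assumed; verify with names(a))
names(a)
a[which.min(a$rmse), ]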
Plot method for bmstdr objects.
## S3 method for class 'bmstdr' plot(x, segments = TRUE, ...)
x |
A bmstdr model fit object. |
segments |
Logical: whether to draw the prediction intervals as line segments. |
... |
Any other additional arguments. |
If validation has been performed, it plots the observed values on the original scale against the predictions together with the 95% prediction intervals. It then plots the residuals against the fitted values and, finally, applies the plotting method to the fitted model object returned by the chosen package. There is no return value.
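A minimal usage sketch, reusing the validated M1 model fitted in the obs_v_pred_plot example earlier in this manual:

# Observed vs predicted values with 95% prediction intervals,
# followed by residuals against fitted values
plot(M1)
# The same plots without the prediction interval segments
plot(M1, segments = FALSE)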
Provides basic information regarding the fitted model.
## S3 method for class 'bmstdr' print(x, digits = 3, ...)
x |
A bmstdr model fit object. |
digits |
How many significant digits after the decimal to print, defaults to 3. |
... |
Any other additional arguments |
No return value, called for side effects.
Extract residuals from a bmstdr fitted object.
## S3 method for class 'bmstdr' residuals(object, numbers = NULL, ...)
object |
A bmstdr model fit object. |
numbers |
A list with two components: sn, the number of spatial locations, and tn, the number of time points. Residuals are assumed to follow the arrangement of the data frame: sorted by space and then by time within space. |
... |
Any other additional arguments. |
Returns a vector of residuals. If appropriate, it draws a time series plot of residuals. Otherwise, it draws a plot of residuals against observation numbers.
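A minimal usage sketch with the M1 fit from the earlier example; sn = 28 and tn = 62 correspond to the 28 sites and 62 days (July-August 2006) of the nysptime data and should be adjusted for any other data set.

# Residuals from the earlier M1 fit; nysptime has 28 sites observed on 62 days
r <- residuals(M1, numbers = list(sn = 28, tn = 62))
summary(r)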
Provides basic summaries of model fitting.
## S3 method for class 'bmstdr' summary(object, digits = 3, ...)
object |
A bmstdr model fit object. |
digits |
How many significant digits after the decimal to print, defaults to 3. |
... |
Any other additional arguments |
No return value, called for side effects.
Prints the terms of the fitted model formula.
## S3 method for class 'bmstdr' terms(x, ...)
x |
A bmstdr model fit object. |
... |
Any other additional arguments |
Terms in the model formula
A 313 by 313 proximity matrix for the 313 LADCUAs in England. Each entry is either 0 or 1, and is 1 if the corresponding row and column LADCUAs share a common boundary.
Weng
An object of class matrix (inherits from array) with 313 rows and 313 columns.
Sahu and Böhning (2021).
Sahu SK, Böhning D (2021). “Bayesian spatio-temporal joint disease mapping of Covid-19 cases and deaths in local authorities of England.” Spatial Statistics. doi:10.1016/j.spasta.2021.100519.
dim(Weng)
summary(apply(Weng, 1, sum))
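The proximity matrix is typically supplied as the W argument of Bcartime. The sketch below fits a CAR model to simulated counts purely for illustration; the engsim data frame and its columns are made up here and are not shipped with the package, and the model = "leroux" option is an assumption about the CARBayes interface of Bcartime.

# Simulated areal counts, one row per LADCUA in the same order as the rows of Weng
set.seed(1)
n <- nrow(Weng)
engsim <- data.frame(cases = rpois(n, lambda = 50), x1 = rnorm(n))
# Hypothetical Poisson CAR model using Weng as the proximity matrix W
M <- Bcartime(formula = cases ~ x1, data = engsim, family = "poisson",
   W = Weng, package = "CARBayes", model = "leroux",
   N = 2000, burn.in = 1000, thin = 10)
summary(M)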
Average air pollution values from 28 sites in New York.
ydata
A vector with 28 real values.
An object of class numeric of length 28.
This is obtained by calculating site-wise averages of the NYdata set in the 'spTimer' package (Bakar and Sahu, 2015). Each data point is the mean of the available daily 8-hour maximum average ozone concentrations in parts per billion (ppb) at each of the 28 sites. The daily values are for the months of July and August 2006.
Bakar KS, Sahu SK (2015). “spTimer: Spatio-Temporal Bayesian Modeling Using R.” Journal of Statistical Software, Articles, 63(15), 1–32. ISSN 1548-7660, doi:10.18637/jss.v063.i15, https://www.jstatsoft.org/v063/i15.
summary(ydata)
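A sketch of how this vector could be reproduced from the NYdata set in the 'spTimer' package; the column names s.index and o8hrmax are assumptions about NYdata and should be checked with names(NYdata).

# Reproduce the site-wise averages from spTimer's NYdata
# (column names s.index and o8hrmax are assumed; verify with names(NYdata))
library(spTimer)
data(NYdata)
sitemeans <- tapply(NYdata$o8hrmax, NYdata$s.index, mean, na.rm = TRUE)
length(sitemeans)        # 28 sites
cbind(ydata, sitemeans)  # compare with the shipped ydata vector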