Title: | Probabilistic Population Projection |
---|---|
Description: | Generating population projections for all countries of the world using several probabilistic components, such as total fertility rate and life expectancy (Raftery et al., 2012 <doi:10.1073/pnas.1211452109>). |
Authors: | Hana Sevcikova [aut, cre], Adrian Raftery [aut], Thomas Buettner [aut] |
Maintainer: | Hana Sevcikova <[email protected]> |
License: | GPL-3 | file LICENSE |
Version: | 11.0-2 |
Built: | 2025-02-22 03:04:12 UTC |
Source: | https://github.com/cran/bayesPop |
The package generates population projections for all countries of the world using several probabilistic components, such as total fertility rate (TFR) and life expectancy. Subnational projections are also supported.
The main function is pop.predict. It takes trajectories of TFR from the bayesTFR package and of life expectancy from the bayesLife package, and for each trajectory it computes a population projection using the cohort component method. The result is a set of probabilistic age- and sex-specific projections. Various plotting functions are available for visualizing the results (pop.trajectories.plot, pop.pyramid, pop.trajectories.pyramid, pop.map), as well as a summary function. Aggregations can be derived using pop.aggregate. An expression language is available to obtain the distribution of various population quantities.
Subnational projections can be generated using pop.predict.subnat. Function pop.aggregate.subnat aggregates such projections.
Hana Sevcikova, Adrian Raftery, Thomas Buettner
Maintainer: Hana Sevcikova <[email protected]>
H. Sevcikova, A. E. Raftery (2016). bayesPop: Probabilistic Population Projections. Journal of Statistical Software, 75(5), 1-29. doi:10.18637/jss.v075.i05
A. E. Raftery, N. Li, H. Sevcikova, P. Gerland, G. K. Heilig (2012). Bayesian probabilistic population projections for all countries. Proceedings of the National Academy of Sciences 109:13915-13921. doi:10.1073/pnas.1211452109
P. Gerland, A. E. Raftery, H. Sevcikova, N. Li, D. Gu, T. Spoorenberg, L. Alkema, B. K. Fosdick, J. L. Chunn, N. Lalic, G. Bay, T. Buettner, G. K. Heilig, J. Wilmoth (2014). World Population Stabilization Unlikely This Century. Science 346:234-237.
H. Sevcikova, N. Li, V. Kantorova, P. Gerland and A. E. Raftery (2016). Age-Specific Mortality and Fertility Rates for Probabilistic Population Projections. In: Dynamic Demographic Analysis, ed. Schoen R. (Springer), pp. 285-310. Earlier version in arXiv:1503.05215.
H. Sevcikova, J. Raymer, A. E. Raftery (2024). Forecasting Net Migration By Age: The Flow-Difference Approach. arXiv:2411.09878.
    ## Not run: 
    sim.dir <- tempfile()
    # Generates population projection for one country
    country <- "Netherlands"
    pred <- pop.predict(countries=country, output.dir=sim.dir)
    summary(pred, country)
    pop.trajectories.plot(pred, country)
    dev.off()
    pop.trajectories.plot(pred, country, sum.over.ages=TRUE)
    pop.pyramid(pred, country)
    pop.pyramid(pred, country, year=2100, age=1:26)
    unlink(sim.dir, recursive=TRUE)
    ## End(Not run)

    # Here are commands needed to run probabilistic projections
    # from scratch, i.e. including TFR and life expectancy.
    # Note that running the first four commands
    # (i.e. predicting TFR and life expectancy) can take
    # LONG time (up to several days; see below for possible speed-up).
    # For a toy simulation, set the number of iterations (iter)
    # to a small number.
    ## Not run: 
    sim.dir.tfr <- "directory/for/TFR"
    sim.dir.e0 <- "directory/for/e0"
    sim.dir.pop <- "directory/for/pop"

    # Estimate TFR parameters (speed-up by including parallel=TRUE)
    run.tfr.mcmc(iter="auto", output.dir=sim.dir.tfr, seed=1)

    # Predict TFR (if iter above < 4000, reduce burnin and nr.traj accordingly)
    tfr.predict(sim.dir=sim.dir.tfr, nr.traj=2000, burnin=2000)

    # Estimate e0 parameters (females) (speed-up by including parallel=TRUE)
    # Can be run independently of the two commands above
    run.e0.mcmc(sex="F", iter="auto", output.dir=sim.dir.e0, seed=1)

    # Predict female and male e0
    # (if iter above < 22000, reduce burnin and nr.traj accordingly)
    e0.predict(sim.dir=sim.dir.e0, nr.traj=2000, burnin=20000)

    # Population prediction
    pred <- pop.predict(output.dir=sim.dir.pop, verbose=TRUE,
        inputs = list(tfr.sim.dir=sim.dir.tfr,
                      e0F.sim.dir=sim.dir.e0, e0M.sim.dir="joint_"))
    pop.trajectories.plot(pred, "Madagascar", nr.traj=50, sum.over.ages=TRUE)
    pop.trajectories.table(pred, "Madagascar")
    ## End(Not run)
Creates sex- and age-specific net migration datasets out of the total net migration using different methods. Function age.specific.migration is a legacy function that distributes UN 5-year totals into ages using a residual method. Function migration.totals2age distributes given totals into ages using either a Rogers-Castro schedule or the Flow Difference Method (FDM).
    age.specific.migration(wpp.year = 2019, years = seq(1955, 2100, by = 5),
        countries = NULL, smooth = TRUE, rescale = TRUE, ages.to.zero = 18:21,
        write.to.disk = FALSE, directory = getwd(), file.prefix = "migration",
        depratio = wpp.year == 2015, verbose = TRUE)

    migration.totals2age(df, ages = NULL, annual = FALSE, time.periods = NULL,
        scale = 1, method = "rc", sex = "M", id.col = "country_code",
        mig.is.rate = FALSE, rc.data = NULL, pop = NULL, pop.glob = NULL, ...)

    rcastro.schedule(annual = FALSE)
wpp.year |
Integer determining which wpp package should be used to get the necessary data from. That package is required to have a dataset on total net migration (called |
years |
Array of years that the reconstruction should be made for. This should be a subset of years for which the total net migration is available. |
countries |
Numerical country codes to do the reconstruction for. By default it is performed on all countries included in the |
smooth |
Logical controlling if smoothing of the reconstructed curves is required. Due to rounding issues the residual method often yields unrealistic zig-zags on migration curves by age. Smoothing usually improves their look. |
rescale |
Logical controlling if the resulting migration should be rescaled to match the total migration. |
ages.to.zero |
Indices of age groups where migration should be set to zero. Default is 85 and older. |
write.to.disk |
If |
directory |
Directory where to write the results if |
file.prefix |
If |
depratio |
If it is |
verbose |
Logical controlling the amount of output messages. |
df |
data.frame, matrix or data.table containing total migration counts or rates. Columns correspond to time, rows correspond to locations. Column “country_code” (or column identified by |
ages |
Labels of age groups into which the total migration is to be disaggregated. If it is missing, default age groups are determined depending on the argument |
annual |
Logical determining if the age groups are 5-year age groups ( |
time.periods |
Character vector determining which columns should be considered in the |
scale |
The migration schedule is multiplied by this number. It can be used for example, if total migration needs to be distributed between sexes. |
method |
Method to use for the distribution of totals into age groups. The “rc” method uses either a basic Rogers-Castro disaggregation via the function |
sex |
“M” or “F” determining the sex of this schedule. It only impacts the FDM methods. |
id.col |
Name of the unique identifier of the locations. |
mig.is.rate |
Logical indicating if the data in |
rc.data |
data.table containing either a family of Rogers-Castro proportions if For the “rc” method, mandatory columns are “age” and “prop”. Optionally, it can have a column “mig_sign” with values “Inmigration” and “Emigration” (distinguishing schedules to be applied for positive and negative migration, respectively) and a column “sex” with values “Female” and “Male”. The format corresponds to the dataset For the FDM methods, it has columns contained in the |
pop |
data.table with population counts needed for the FDM methods. It should have a location identifier column of the same name as |
pop.glob |
data.table with global population needed for the weighted FDM method (“fdmp”). It should have columns “year”, “age”, and “pop”. |
... |
Further arguments passed to the underlying functions. |
age.specific.migration
Unlike in wpp2012, for the four releases of the WPP between 2015 and 2022 (wpp2015, wpp2017, wpp2019 and wpp2022), the UN Population Division did not publish the sex- and age-specific net migration counts, only the totals. However, since the sex and age schedules are needed for population projections, the age.specific.migration function attempts to reconstruct those missing datasets. It uses the published population projections by age and sex, and the fertility and mortality projections from the wpp package. It computes the population projection without migration and takes the residual to the published population projection as the net migration. By default such numbers are then scaled so that the sum over sexes and ages corresponds to the total migration count.
If smooth is TRUE, a smoothing procedure is performed over ages where necessary. Also, for simplicity, migration at old ages is set to zero (default is 85+). Both steps are done before the scaling. To obtain raw residuals without any additional processing, set smooth=FALSE, rescale=FALSE, ages.to.zero=c().
This function works only for 5-year data.
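The residual idea can be illustrated with a minimal base-R sketch. All numbers below are made up, smoothing is omitted, and a simple proportional rescaling is assumed; the package's actual procedure may differ in its details.

```r
# Toy inputs for one sex and three age groups (illustrative numbers only).
pop.published <- c(105, 98, 51)    # published projection at the end of the period
pop.no.mig    <- c(100, 101, 50)   # projection of the same cohorts without migration
total.mig     <- 4                 # published total net migration to match

# Step 1: the residual between the two projections is the raw net migration.
mig.raw <- pop.published - pop.no.mig

# Step 2: set selected age groups to zero (here the last one, given by index).
ages.to.zero <- 3
mig.raw[ages.to.zero] <- 0

# Step 3 (assumed proportional rescaling): make the age-specific values
# sum to the published total.
mig.scaled <- mig.raw * total.mig / sum(mig.raw)

print(mig.raw)
print(mig.scaled)
```

A proportional rescaling like this becomes unstable when the raw residuals sum to a value close to zero, which is one reason to treat the sketch as an illustration only.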
migration.totals2age
This function should be used when working with annual data or data from wpp2022 and wpp2024. It allows users to disaggregate total migration counts or rates (for multiple time periods and multiple locations) into age-specific ones using a schedule similar to the one used by the UN in WPP2024 (method = "fdmnop"), a Rogers-Castro schedule (method = "rc"), or FDM weighted by population (method = "fdmp"), as described in Sevcikova et al. (2024). The FDM methods need additional information passed via the arguments rc.data, pop and pop.glob. The default Rogers-Castro schedule can be accessed via the function rcastro.schedule, where the annual argument specifies whether it is for 1-year or 5-year age groups. Alternatively, an external schedule can be given via the rc.data argument, where one can distinguish between schedules for each sex, as well as for positive and negative net migration. It has the same structure as the dataset DemoTools::mig_un_families, but it should be a subset for a single family and converted to data.table.
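Conceptually, a schedule-based disaggregation multiplies each total by a vector of age proportions. The base-R sketch below shows that idea only; the schedule is hypothetical (it is not the package's default Rogers-Castro schedule), and the variable scale merely mimics the role of the scale argument.

```r
# Hypothetical schedule of proportions over four age groups (sums to 1).
schedule <- c("0-14" = 0.2, "15-29" = 0.5, "30-44" = 0.2, "45+" = 0.1)

# Total net migration for three time periods (toy totals).
totmig <- c("2018" = 30, "2019" = -50, "2020" = -100)

# Share of the total assigned to one sex, analogous to the scale argument.
scale <- 0.5

# Rows are age groups, columns are time periods.
asmig <- outer(schedule * scale, totmig)
print(asmig)

# Each column sums to the scaled total of its period.
print(colSums(asmig))
```

With an equal split between sexes, running the same computation once per sex with scale = 0.5 would recover the original totals when the two results are added.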
Function age.specific.migration returns a list of two data frames (male and female), each having the same structure as migrationM.
Function migration.totals2age returns a data.table with the disaggregated counts.
Function rcastro.schedule returns a vector of proportions for each age group.
Due to rounding issues and slight differences in the methodology, the functions do not reproduce the unpublished UN datasets exactly; the result is only an approximation. In particular, the first age groups might deviate more than other ages.
These functions are called automatically from pop.predict if needed, depending on the inputs. Thus, only users who need sex- and age-specific migration for other purposes, or who want to modify the defaults, need to call these functions explicitly.
Further note that the wpp2024 package does contain the age-specific net migration for projected years (datasets migprojAge1dt, migprojAge5dt). Thus, if pop.predict is run with wpp.year = 2024 and the default migration totals, no disaggregation is necessary for the projected time periods. The disaggregation is only triggered for the past time periods, or when user-specified net migration totals are used.
Hana Sevcikova
H. Sevcikova, J. Raymer, A. E. Raftery (2024). Forecasting Net Migration By Age: The Flow-Difference Approach. arXiv:2411.09878.
pop.predict, migration, migrationM, rcFDM, vwBaseYear
    ## Not run: 
    asmig <- age.specific.migration()
    head(asmig$male)
    head(asmig$female)
    ## End(Not run)

    # simple disaggregation for one location
    totmig <- c(30, -50, -100)
    names(totmig) <- 2018:2020
    asmig.simple <- migration.totals2age(totmig, annual = TRUE, method = "rc")
    head(asmig.simple)

    ## Not run: 
    # disaggregate WPP 2019 migration for all countries, one sex
    data(migration, package = "wpp2019")
    # assuming equal sex migration ratio
    asmig.all <- migration.totals2age(migration, scale = 0.5, method = "rc")
    # plot result for the US in 2095-2100
    mig1sex.us <- subset(asmig.all, country_code == 840)[["2095-2100"]]
    plot(ts(mig1sex.us))
    # check that the sum is half of the original total
    sum(mig1sex.us) == subset(migration, country_code == 840)[["2095-2100"]]/2
    ## End(Not run)
The function returns a data frame containing codes and names of all countries used in the prediction.
    ## S3 method for class 'bayesPop.prediction'
    get.countries.table(object, ...)
object |
Object of class |
... |
Not used. |
Data frame with columns code and name.
Hana Sevcikova
Function get.pop.prediction retrieves results of a prediction from disk and creates an object of class bayesPop.prediction. Function has.pop.prediction checks whether such results exist.
    get.pop.prediction(sim.dir, aggregation = NULL, write.to.cache = TRUE)

    has.pop.prediction(sim.dir)

    pop.cleanup.cache(pop.pred)
sim.dir |
Directory where the prediction is stored. It should correspond to the value of the |
aggregation |
If given, the prediction object is considered to be an aggregation and both arguments are passed to |
write.to.cache |
Logical controlling if other functions are allowed to write the cache of this prediction object (see Details). |
pop.pred |
Object of class |
The pop.predict function stores the resulting trajectories in a directory called output.dir/prediction. Here the argument sim.dir should correspond to output.dir (i.e. without the “prediction” part).
In addition to retrieving prediction results, the get.pop.prediction function also looks for a file called ‘cache.rda’ and loads it into an environment called cache. If the file does not exist, it creates an empty cache environment. See pop.map, Section Performance and Caching. The environment can be cleaned up using the pop.cleanup.cache function, which also deletes the ‘cache.rda’ file on disk. If write.to.cache is FALSE, other functions are not allowed to manipulate the ‘cache.rda’ file.
Function has.pop.prediction returns a logical indicating if a prediction exists.
Function get.pop.prediction returns an object of class bayesPop.prediction.
Hana Sevcikova
bayesPop.prediction, get.pop.aggregation
    sim.dir <- file.path(find.package("bayesPop"), "ex-data", "Pop")
    pred <- get.pop.prediction(sim.dir)
    summary(pred)
Functions for obtaining life table quantities.
    LifeTableMx(mx, sex = c("Male", "Female", "Total"), include01 = TRUE,
        abridged = TRUE, radix = 1, open.age = 130)

    LifeTableMxCol(mx, colname = c("Lx", "lx", "qx", "mx", "dx", "Tx", "sx", "ex", "ax"),
        ...)
mx |
Vector of age-specific mortality rates nmx. If |
sex |
For which sex is the life table. |
include01 |
Logical. If it is |
abridged |
Logical. If |
radix |
Base of the life table. |
open.age |
Open age group. If smaller than the last age group of |
colname |
Name of the column of the life table that should be returned. |
... |
Arguments passed to underlying functions, e.g. |
Function LifeTableMx returns a life table for one set of mortality rates. Function LifeTableMxCol returns one column of the life table for (possibly) multiple sets of mortality rates. The underlying workhorse here is the life.table function from the MortCast package. These functions only collapse the first age groups if needed for an abridged life table (LifeTableMx) and/or combine results for multiple time periods into one object (LifeTableMxCol).
Function LifeTableMx returns a data frame with the following elements:
age |
Age groups |
mx |
mx, the input vector of mortality rates. |
qx |
nqx, probability of dying between ages x and x+n. |
lx |
lx, number left alive at age x. |
dx |
ndx, cohort deaths between ages x and x+n. |
Lx |
nLx, person-years lived between ages x and x+n. |
sx |
sx, survival rate at age x. |
Tx |
Tx, person-years lived above age x. |
ex |
e0x, expectation of life at age x. |
ax |
nax, average person-years lived in the interval by those dying in the interval. |
Function LifeTableMxCol returns one given column of the life table, possibly as a matrix (if mx is a matrix).
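To show how the life table columns listed above relate to each other, here is a simplified base-R sketch for single-year ages. It is not the algorithm used by MortCast's life.table: the mortality rates are made up, deaths are assumed to occur at mid-interval (ax = 0.5), and the sx column is left out.

```r
# Hypothetical mortality rates for single-year ages 0..4 (last group is open).
mx <- c(0.020, 0.002, 0.001, 0.001, 0.250)
n.ages <- length(mx)
radix <- 1

# Assumption: deaths occur on average at mid-interval (ax = 0.5).
ax <- rep(0.5, n.ages)
qx <- mx / (1 + (1 - ax) * mx)   # probability of dying between x and x+1
qx[n.ages] <- 1                  # everyone dies in the open age group

lx <- radix * cumprod(c(1, 1 - qx[-n.ages]))  # survivors at exact age x
dx <- lx * qx                                 # deaths between x and x+1
Lx <- lx - (1 - ax) * dx                      # person-years lived in the interval
Lx[n.ages] <- lx[n.ages] / mx[n.ages]         # open interval: lx / mx
Tx <- rev(cumsum(rev(Lx)))                    # person-years lived above age x
ex <- Tx / lx                                 # expectation of life at age x

lt <- data.frame(age = 0:4, mx, qx, lx, dx, Lx, Tx, ex)
print(round(lt, 4))
```

Two quick sanity checks on such a table: the deaths dx sum to the radix, and life expectancy in the open age group equals the reciprocal of its mortality rate.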
Hana Sevcikova, Thomas Buettner, Nan Li, Patrick Gerland
Preston, S. H., Heuveline, P., Guillot, M. (2001): Demography. Blackwell Publishing Ltd.
life.table; see pop.expressions for examples of retrieving some life table quantities.
    ## Not run: 
    sim.dir <- tempfile()
    pred <- pop.predict(countries="Ecuador", output.dir=sim.dir, wpp.year=2015,
        present.year=2015, keep.vital.events=TRUE, fixed.mx=TRUE, fixed.pasfr=TRUE)
    # get male mortality rates from 2020 for age groups 0-1, 1-4, 5-9, ...
    mxm <- pop.byage.table(pred, expression="MEC_M{age.index01(27)}", year=2020)[,1]
    print(LifeTableMx(mxm), digits=3)
    # female LT with first two age categories collapsed
    mxf <- pop.byage.table(pred, expression="MEC_F{age.index01(27)}", year=2020)[,1]
    print(LifeTableMx(mxf, sex="Female", include01=FALSE), digits=3)
    unlink(sim.dir, recursive=TRUE)
    ## End(Not run)
Help functions to easily generate commonly used expressions.
    mac.expression(country)
    mac.expression1(country)
    mac.expression5(country)
country |
Country code as defined for |
mac.expression and mac.expression1 generate expressions for the mean age of childbearing of the given country, for 5-year age groups and 1-year age groups, respectively. mac.expression5 is a synonym for mac.expression.
Note that pop.predict has to be run with keep.vital.events=TRUE for this to work.
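For orientation, the mean age of childbearing is commonly computed as the fertility-weighted average of the age-group midpoints. The base-R sketch below uses that standard definition with made-up percent age-specific fertility values; it does not reproduce the character expression that the functions generate.

```r
# Hypothetical percent age-specific fertility for the seven 5-year age groups
# 15-19, 20-24, ..., 45-49 (the values sum to 100).
pasfr <- c(8, 25, 30, 22, 11, 3, 1)
age.mid <- seq(17.5, 47.5, by = 5)   # midpoints of the age groups

# Mean age of childbearing: fertility-weighted average of the age midpoints.
mac <- sum(age.mid * pasfr) / 100
print(mac)
```

For 1-year age groups the same computation would apply with single-year midpoints and a correspondingly longer vector of percentages.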
mac.expression
returns a character string corresponding to the formula
where
denotes the country-specific percent age-specific fertility for the age group
.
mac.expression1
returns a character string corresponding to the formula
    ## Not run: 
    sim.dir <- tempfile()
    # Run pop.predict with storing vital events
    pred <- pop.predict(countries=c("Germany", "France"), nr.traj=3,
        keep.vital.events=TRUE, output.dir=sim.dir)
    # plot the mean age of childbearing
    pop.trajectories.plot(pred, expression=mac.expression("FR"), cex.main = 0.7)
    unlink(sim.dir, recursive=TRUE)
    ## End(Not run)
Dataset with values of the Lee-Carter bx parameter for countries where mortality was obtained using model life tables.
    data(MLTbx)
A data frame with nine rows and 28 columns. Each row corresponds to one mortality age pattern as defined in the vwBaseYear dataset. Each column corresponds to an age group, starting with 0-1, 1-4, 5-9, 10-14, ... up to 125-129, 130+.
These values are used for countries for which the column AgeMortalityType in vwBaseYear is equal to “Model life tables”. In such a case, the row is selected that matches the value of the column AgeMortalityPattern (also in vwBaseYear). These values are then used instead of estimating the Lee-Carter bx from the country's historical data.
Data provided by the United Nations Population Division.
    data(MLTbx)
    str(MLTbx)
For a given indicator and a country, the function computes the probability of a peak happening before a given year, as well as a range of years between which a peak happens with given probability.
    peak.probability(pop.pred, country = NULL, expression = NULL, year = NULL,
        pi = 95, verbose = TRUE, ...)
pop.pred |
Object of class |
country |
Name or numerical code or ISO-2 or ISO-3 character code of a country. If given, population is used as an indicator and the |
expression |
Expression defining an indicator. For syntax see |
year |
Used for computing the probability of a peak happening before |
pi |
Probability between 0 and 100. Used for selecting a range of years between which a peak happens with probability given by this argument. |
verbose |
Logical. If |
... |
Additional arguments passed to the underlying functions. If |
Given an indicator, the function computes two quantities:
probability that the indicator reaches its peak before the given year;
range of years between which a peak happens with the given probability pi.
The indicator can be either population (if country is given), or it can be any expression defined as a function of time (see pop.expressions).
List with elements:
prob.peak.less.given.year |
Probability that the indicator reaches its peak before |
given.year |
The value of |
peak.quantiles |
The lower bound, the upper bound and the median of years defining a time interval in which a peak happens with the given probability pi. |
all.prob.peak.by.time |
Data frame containing the probability of a peak happening in each projected year, as well as the corresponding cumulative probability. Years in which no peak is projected are not included. |
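The quantities returned can be understood from a small base-R sketch that works on a matrix of toy trajectories. This is only an illustration of the idea, not the package's implementation; in particular, treating "before" as strictly earlier than the given year and using equal-tailed quantiles for the range are assumptions.

```r
# Toy trajectories: rows are projected years, columns are trajectories.
years <- seq(2025, 2050, by = 5)
traj <- cbind(
  c(10, 11, 12, 11, 10, 9),   # peaks in 2035
  c(10, 12, 11, 10, 9, 8),    # peaks in 2030
  c(10, 11, 12, 13, 12, 11),  # peaks in 2040
  c(10, 11, 12, 13, 14, 13)   # peaks in 2045
)

# Year of the maximum in each trajectory.
peak.year <- years[apply(traj, 2, which.max)]

# Probability that the peak happens before a given year (strictly earlier).
given.year <- 2040
prob.before <- mean(peak.year < given.year)

# Probability of a peak in each year and its cumulative version.
prob.by.time <- as.data.frame(table(peak.year) / ncol(traj))
names(prob.by.time) <- c("year", "prob")
prob.by.time$cum.prob <- cumsum(prob.by.time$prob)

# Years bounding the central 80% of the peak-year distribution, plus the median.
peak.range <- quantile(peak.year, probs = c(0.1, 0.5, 0.9))

print(prob.before)
print(prob.by.time)
print(peak.range)
```

Note that in this sketch a trajectory that is still rising at the end of the projection would be counted as peaking in the last projected year.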
Hana Sevcikova
    sim.dir <- file.path(find.package("bayesPop"), "ex-data", "Pop")
    pred <- get.pop.prediction(sim.dir, write.to.cache=FALSE)

    # probability that population of Netherlands peaks before 2040
    # and between which years it will peak with probability 80%
    peak.probability(pred, "NL", year = 2040, pi = 80)
    # check visually with
    # pop.trajectories.plot(pred, "NL")

    # the same for female of age 45-49
    peak.probability(pred, "NL", year = 2040, pi = 80, sex = "female", age = 10)

    # probability of a peak for the potential support ratio in Ecuador
    peak.probability(pred, expression = "PEC[5:13]/PEC[14:27]")
    # check visually that it already peaked
    # pop.trajectories.plot(pred, expression = "PEC[5:13]/PEC[14:27]")
Aggregation of existing countries' population projections into projections of given regions, and accessing such aggregations.
    pop.aggregate(pop.pred, regions,
        input.type = c("country", "region"), name = input.type,
        inputs = list(e0F.sim.dir = NULL, e0M.sim.dir = "joint_", tfr.sim.dir = NULL),
        my.location.file = NULL, verbose = FALSE, ...)

    get.pop.aggregation(sim.dir = NULL, pop.pred = NULL, name = NULL,
        write.to.cache = TRUE)

    pop.aggregate.subnat(pop.pred, regions, locations, ..., verbose = FALSE)
pop.pred |
Object of class |
regions |
Vector of numerical codes of regions. It should correspond to values in the column “country_code” in the |
input.type |
There are two methods for aggregating projections depending on the type of inputs, “country”- and “region”-based, see Details. |
name |
Name of the aggregation. It becomes a part of a directory name where aggregation results are stored. |
inputs |
This argument is only used when the “region”-based method is selected. It is a list of inputs of probabilistic components of the projection:
|
my.location.file |
User-defined location file that can contain other aggregation groups than the default UN location file. It should have the same structure as the |
verbose |
Logical switching log messages on and off. |
sim.dir |
Simulation directory where aggregation is stored. It is the same directory used for creating the |
write.to.cache |
Logical controlling if functions operating on this object are allowed to write into its cache (see Details of |
locations |
Name of a tab-delimited file that contains definitions of the sub-regions. It should be the same file as used for the |
... |
Additional arguments. For a country-type aggregation, it can be logical |
Function pop.aggregate triggers an aggregation over countries, while function pop.aggregate.subnat is used for aggregation over sub-regions of a country. The following details refer to the use of pop.aggregate. For sub-national aggregation see the Example in pop.predict.subnat.
The dataset UNlocations or my.location.file is used to determine the countries to be aggregated, in particular the field “location_type” of the entries with “country_code” given in the regions argument. One can aggregate over the following location types: type 0 means aggregating all countries of the world (or in the file), type 2 is aggregating over continents, type 3 is aggregating over regions within continents, and any other integer (except 4) corresponds to user-defined aggregations. Note that type 4 is reserved as the location type of countries, and thus all aggregations are performed over entries of this type. For type 2, countries are matched using the “area_code” column; for type 3 the matching is done using the “reg_code” column of the UNlocations dataset. E.g., if regions=908 (Europe), which has location type 2 in the default UNlocations dataset, all countries are aggregated for which the value 908 is found in the “area_code” column. If the location type is other than 0, 2, 3 and 4, there must be a column in the file called “agcode_x”, with x being the location type. This column is then used to match the countries to be aggregated.
Consider the following example. Say we want to pair four countries (Germany [DE], France [FR], Netherlands [NL], Italy [IT]) in two different ways, so we have two overlapping groupings, each of which has two groups (A,B):
group A = (DE, FR), group B = (NL, IT)
group A = (DE, NL), group B = (FR, IT)
Then, my.location.file should have the following entries:
country_code | name | location_type | agcode_98 | agcode_99 |
---|---|---|---|---|
1001 | grouping1_groupA | 98 | -1 | -1 |
1002 | grouping1_groupB | 98 | -1 | -1 |
1003 | grouping2_groupA | 99 | -1 | -1 |
1004 | grouping2_groupB | 99 | -1 | -1 |
276 | Germany | 4 | 1001 | 1003 |
250 | France | 4 | 1001 | 1004 |
528 | Netherlands | 4 | 1002 | 1003 |
380 | Italy | 4 | 1002 | 1004 |
1005 | all | 0 | -1 | -1 |
The “country_code” of the groups is user-specific, but it must be unique within the file. Values of “country_code” for countries must match those in the prediction object. To run the aggregation for the four groups above, we set regions=1001:1004. Since “location_type” is 98 and 99, the file is expected to have columns “agcode_98” and “agcode_99” containing the assignments to each of the two groupings. Values in these columns for rows corresponding to groups are not used and thus can have any value. For aggregating over all four countries, set regions=1005, which has “location_type” equal to 0, and thus the aggregation is performed over all entries with “location_type” equal to 4.
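The location file from this example could be created in base R roughly as follows. The sketch assumes that a tab-delimited text file is expected and uses the ISO numeric code 528 for the Netherlands. The helper members.of is hypothetical and only mimics the matching rule described above for location type 0 and user-defined types; it is not part of the package.

```r
# The example location file, built as a data frame.
loc <- data.frame(
  country_code  = c(1001, 1002, 1003, 1004, 276, 250, 528, 380, 1005),
  name          = c("grouping1_groupA", "grouping1_groupB",
                    "grouping2_groupA", "grouping2_groupB",
                    "Germany", "France", "Netherlands", "Italy", "all"),
  location_type = c(98, 98, 99, 99, 4, 4, 4, 4, 0),
  agcode_98     = c(-1, -1, -1, -1, 1001, 1001, 1002, 1002, -1),
  agcode_99     = c(-1, -1, -1, -1, 1003, 1004, 1003, 1004, -1),
  stringsAsFactors = FALSE
)

# Write it as a tab-delimited file (assumed format for my.location.file).
loc.file <- tempfile(fileext = ".txt")
write.table(loc, file = loc.file, sep = "\t", row.names = FALSE, quote = FALSE)

# Hypothetical helper: look up the location type of the region, then select
# countries (type 4) via the column agcode_<type>; type 0 selects all countries.
members.of <- function(loc, region) {
  type <- loc$location_type[loc$country_code == region]
  countries <- loc[loc$location_type == 4, ]
  if (type == 0) return(countries$name)
  countries$name[countries[[paste0("agcode_", type)]] == region]
}

print(members.of(loc, 1001))
print(members.of(loc, 1003))
print(members.of(loc, 1005))

unlink(loc.file)
```

Checking group membership this way before running an aggregation can catch typos in the “agcode_” columns early.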
There are two methods available for generating aggregations of population projections:
Country-based method (input.type = "country"): Aggregations are created by summing trajectories over countries of the given region.
The aggregation is generated using the same algorithm as population projections for single countries (function pop.predict
), but it operates on aggregated input components. These are created as follows. The definitions refer to the countries over which a region is aggregated, to sex, age category and time, and to the present year of the prediction. They use the historical population count and the Bayesian predictive median of population of a given sex, in a given age category, at a given time, for a given country (refer to the links in parentheses for description of the data):
Aggregated migration code is the code of maximum counts over aggregated countries weighted by . Migration start year is the maximum of start years over aggregated countries.
We assume an aggregation of life expectancy for the given regions was generated prior to this call, using the run.e0.mcmc.extra
and e0.predict.extra
functions of the bayesLife package.
We assume an aggregation of total fertility for the given regions was generated prior to this call, using the run.tfr.mcmc.extra
and tfr.predict.extra
functions of the bayesTFR package.
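The first of the two methods above, summing trajectories over the countries of a region, can be illustrated with simulated data. This is only a sketch in base R: the trajectory matrices are random numbers, the country codes are used as labels only, and it is assumed that trajectory j of one country is paired with trajectory j of the others.

```r
set.seed(1)
n.periods <- 4   # number of projection periods
n.traj <- 5      # number of trajectories per country

# Simulated total-population trajectories (periods x trajectories).
traj <- lapply(c("528" = 17000, "218" = 17500, "450" = 27000), function(base)
  matrix(base + rnorm(n.periods * n.traj, sd = 100), nrow = n.periods))

# Regional trajectories: element-wise sum over the countries.
region.traj <- Reduce(`+`, traj)

# Quantiles of the region are taken from the summed trajectories,
# not obtained by adding up country quantiles.
region.quant <- apply(region.traj, 1, quantile, probs = c(0.1, 0.5, 0.9))
dim(region.traj)   # periods x trajectories
dim(region.quant)  # quantiles x periods
```

Summing first and computing quantiles afterwards matters, because quantiles of a sum generally differ from the sum of quantiles.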
Results of the aggregations are stored in the same top directory as the pop.pred
object, in a subdirectory called ‘aggregations_
name’. They can be accessed using the function get.pop.aggregation
. Note that multiple runs of this function with the same name will overwrite previous aggregation results of the same name.
Object of class bayesPop.prediction
containing the aggregated results. In addition it contains elements aggregation.method
giving the input.type
used, and aggregated.countries
which is a list of countries aggregated for each region.
Hana Sevcikova, Adrian Raftery
H. Sevcikova, A. E. Raftery (2016). bayesPop: Probabilistic Population Projections. Journal of Statistical Software, 75(5), 1-29. doi:10.18637/jss.v075.i05
pop.predict
, tfr.predict.extra
, e0.predict.extra
## Not run: 
sim.dir <- tempfile()
pred <- pop.predict(countries=c(528,218,450), output.dir=sim.dir)
aggr <- pop.aggregate(pred, 900) # aggregating World (i.e. all countries available in pred)
pop.trajectories.plot(aggr, 900, sum.over.ages=TRUE)
# countries over which we aggregated:
subset(UNlocations, country_code %in% aggr$aggregated.countries[["900"]])
unlink(sim.dir, recursive=TRUE)
## End(Not run)
Extracts and plots population counts or results of expressions by cohorts.
cohorts(pop.pred, country = NULL, expression = NULL, pi = c(80, 95))

pop.cohorts.plot(pop.pred, country = NULL, expression = NULL, cohorts = NULL,
    cohort.data = NULL, pi = c(80, 95), dev.ncol = 5, show.legend = TRUE,
    legend.pos = "bottomleft", ann = par("ann"), add = FALSE, xlab = "", ylab = "",
    main = NULL, xlim = NULL, ylim = NULL, col = "red", ...)
pop.pred |
Object of class |
country |
Name or numerical code of a country. If it is not given, |
expression |
Expression defining the population measure to be plotted. For syntax see |
pi |
Probability interval. It can be a single number or an array. |
cohorts |
Years of the cohorts to be plotted. By default, 10 future cohorts (starting from the last observed one) are used. It can be a single number or an array. |
cohort.data |
List with the cohort data obtained via the |
dev.ncol |
Number of columns for the graphics device. |
show.legend |
Logical controlling whether the legend should be drawn. |
legend.pos |
Position of the legend passed to the |
ann , xlab , ylab , main , xlim , ylim , col , ...
|
Graphical parameters passed to the |
add |
Logical specifying if the plot should be added to an existing plot. |
pop.cohorts.plot
plots all cohorts passed in the cohorts
argument on the same scale of the y-axis.
Function cohorts
returns a list where each element corresponds to one cohort. Each cohort element is a matrix with columns corresponding to years and rows corresponding to the median (first row) and quantiles of the given probability intervals.
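To make the shape of such a cohort element concrete, the sketch below follows one cohort along the diagonal of a simulated age x time x trajectory array and summarises it by the median and the bounds of one probability interval. The function `cohort.summary` and the row labels are hypothetical and only illustrate the idea for 5x5 data, where a cohort moves up one age group per time period; the sketch does not reproduce the package code.

```r
set.seed(42)
n.age <- 4; n.time <- 6; n.traj <- 50
# Simulated population counts: age group x time period x trajectory.
pop <- array(runif(n.age * n.time * n.traj, 90, 110),
             dim = c(n.age, n.time, n.traj))

# Hypothetical helper: summary of the cohort that is in the youngest
# age group at time index 'start'; 'pi' is one probability interval in percent.
cohort.summary <- function(pop, start, pi = 80) {
  n.steps <- min(dim(pop)[1], dim(pop)[2] - start + 1)
  alpha <- (1 - pi / 100) / 2
  res <- sapply(seq_len(n.steps), function(k)
    quantile(pop[k, start + k - 1, ],
             probs = c(0.5, alpha, 1 - alpha), names = FALSE))
  rownames(res) <- c("median", "lower", "upper")
  colnames(res) <- paste0("t", start + seq_len(n.steps) - 1)
  res
}

coh1 <- cohort.summary(pop, start = 1)  # followed through all four age groups
coh5 <- cohort.summary(pop, start = 5)  # only two periods remain
dim(coh1)
dim(coh5)
```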
Hana Sevcikova
pop.trajectories.plot
, pop.byage.plot
, pop.expressions
sim.dir <- file.path(find.package("bayesPop"), "ex-data", "Pop")
pred <- get.pop.prediction(sim.dir)
# Population cohorts
pop.cohorts.plot(pred, "Netherlands")
# plot specific cohorts using expression (must contain {})
pop.cohorts.plot(pred, expression="P528{}", cohorts=c(1960, 1980, 2000, 2020))
# the same as
cohort.data <- cohorts(pred, expression="P528{}")
pop.cohorts.plot(pred, cohort.data=cohort.data, cohorts=c(1960, 1980, 2000, 2020))
Documentation of expressions supported by functions pop.trajectories.plot
, pop.trajectories.plotAll
, pop.trajectories.table
, pop.byage.plot
, pop.byage.table
, cohorts
, pop.cohorts.plot
, pop.map
, pop.map.gvis
, write.pop.projection.summary
, get.pop.ex
, get.pop.exba
.
The functions above accept an argument expression
which should define a population measure, i.e. a quantity that can be computed from population projections, observed population data or vital events. Such an expression is a collection of basic components connected via usual arithmetic operators, such as +
, -
, *
, /
, ^
, %%
, %/%
, and combined using parentheses. In addition, standard R functions or predefined functions (see below) can be used within expressions.
A basic component is a character string constituted of four parts, two of which are optional. They must be in the following order:
Measure identification. One of the following upper-case characters:
‘P’ - population,
‘D’ - deaths,
‘B’ - births,
‘S’ - survival ratio,
‘F’ - fertility rate,
‘R’ - percent age-specific fertility,
‘M’ - mortality rate,
‘Q’ - probability of dying,
‘E’ - life expectancy,
‘G’ - net migration,
‘A’ - a_x column of the life table.
All but the ‘P’ and ‘G’ indicators are available only if the pop.predict
function was run with keep.vital.events=TRUE
.
Country part. One of the following:
Numerical country code (as used in UNlocations
, see https://en.wikipedia.org/wiki/ISO_3166-1_numeric),
two- or three-character ISO 3166 code, see https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2, https://en.wikipedia.org/wiki/ISO_3166-1_alpha-3,
characters “XXX” which serves as a wildcard for a country code.
Sex part (optional): The country part can be followed by either “_F” (for female) or “_M” (for male).
Age part (optional): If used, the basic component is concluded by an age index given as an array. Such an array is enclosed in either brackets (“[” and “]”) or curly braces (“{” and “}”). The former invokes a summation of counts over the given ages; the latter is used when no summation is desired. Note that if this part is missing, counts are automatically summed over all ages. To use all ages without summing, empty curly braces can be used.
For 5x5 predictions, the age index 1 corresponds to age 0-4, index 2 corresponds to age 5-9 etc. Indicators ‘S’, ‘M’, ‘Q’ and ‘E’ allow an index -1 which corresponds to age 0-1 and an index 0 which corresponds to age 1-4. Use the pre-defined functions age.index01(...)
and age.index05(...)
(see below) to define the right indices.
For 1x1 predictions, the age index starts with 0 for all indicators and matches exactly the age. I.e., indices 0,1,2,... correspond to ages 0,1,2,....
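The mapping between 5x5 age indices and age ranges described above can be written down in a few lines. The helper `age.label.5x5` is hypothetical (not a bayesPop function) and ignores that the last age group is open-ended.

```r
# Hypothetical helper: age range label for a 5x5 age index.
age.label.5x5 <- function(idx) {
  vapply(idx, function(i) {
    if (i == -1) return("0-1")  # only valid for indicators S, M, Q and E
    if (i == 0) return("1-4")   # only valid for indicators S, M, Q and E
    paste0(5 * (i - 1), "-", 5 * i - 1)
  }, character(1))
}

age.label.5x5(c(1, 2))      # first two regular age groups
age.label.5x5(4:10)         # child-bearing ages
age.label.5x5(c(-1, 0, 2))  # infant, child and the next regular group
```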
Not all combinations of the four parts above make sense. For example, ‘F’ and ‘R’ can only be combined with female sex, and ‘B’, ‘F’ and ‘R’ can only be combined with a subset of the age groups, namely child-bearing ages (indices 4 to 10 in 5x5, or 11 to 55 in 1x1). Likewise, there is no point in summing the life table based indicators (M, Q, E, S, A) over multiple age groups, i.e. using brackets, or over sexes. Thus, if the sex part is omitted for the life table indicators, the life table is correctly aggregated over sexes, instead of a simple summation.
Examples of basic components are “P276”, “D50_F[4:10]”, “PXXX{14:27}”, “SCZE_M{}”, “QIE_M[-1]”.
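The four parts of a basic component can be pulled apart with a regular expression. The following is a rough sketch, not the parser used by the package; the function `parse.component` and the names of its result are made up for illustration.

```r
# Hypothetical helper: split a basic component into its four parts.
parse.component <- function(x) {
  pattern <- "^([PDBSFRMQEGA])([0-9]+|[A-Z]{2,3})(_[FM])?(\\[.*\\]|\\{.*\\})?$"
  m <- regmatches(x, regexec(pattern, x))[[1]]
  if (length(m) == 0) stop("not a basic component: ", x)
  list(measure  = m[2],
       country  = m[3],
       sex      = sub("^_", "", m[4]),      # "" if the sex part is missing
       age      = m[5],                     # "" if the age part is missing
       sum.ages = !startsWith(m[5], "{"))   # braces mean no summation
}

str(parse.component("D50_F[4:10]"))
str(parse.component("SCZE_M{}"))
str(parse.component("P276"))
```

The sketch only checks the form of a component. It does not check whether the combination makes sense, e.g. whether ‘F’ is used with female sex only.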
When the expression is evaluated on a prediction object, each basic component is substituted by an array of four dimensions (using the get.pop
function):
Country dimension: Equals one if a specific country code is given, or it equals the number of countries in the prediction object if a wildcard is used.
Age dimension: Equals one if the age part above is missing or the age is defined within square brackets. If the age is defined within curly braces, this dimension corresponds to the length of the age array.
Time dimension: Depending on the time context of the expression, this dimension corresponds to either the number of projection periods or the number of observation periods.
Trajectory dimension: Corresponds to the number of trajectories in the prediction object, or one if the component is evaluated on observed data.
Depending on the context from which the expression is called, the trajectory dimension of the result of the expression can be reduced by computing given quantiles, and if only one country is evaluated, the first dimension is removed. In addition, with the exception of functions pop.byage.plot
, pop.byage.table
, cohorts
, and pop.cohorts.plot
, the expression should be constructed in a way that the age dimension is eliminated. This can be done for example by using brackets to define age, by using the apply
function or one of the pre-defined functions described below. When used within pop.byage.plot
, pop.byage.table
, cohorts
, or pop.cohorts.plot
, the expression MUST include curly braces.
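The effect of brackets versus curly braces on the four dimensions can be mimicked with a plain array. The sketch below uses made-up numbers and base R only; it does not call get.pop.

```r
n.age <- 3; n.time <- 2; n.traj <- 4

# Stand-in for a component with curly braces and a single country:
# country x age x time x trajectory.
comp <- array(1:(n.age * n.time * n.traj), dim = c(1, n.age, n.time, n.traj))
dim(comp)

# Stand-in for the same component with brackets: counts are summed
# over ages and the age dimension has length one.
summed <- array(apply(comp, c(1, 3, 4), sum), dim = c(1, 1, n.time, n.traj))
dim(summed)

# Reducing the trajectory dimension by computing a quantile (here the median).
med <- apply(summed, c(1, 2, 3), median)
dim(med)
```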
While get.pop
can be used to obtain results of a basic component, functions get.pop.ex
and get.pop.exba
evaluate whole expressions.
The following functions can be used within an expression:
gmedian(f, cats)
It gives a median for grouped data with frequencies f
and categories cats
. This function is to be used in combination with apply
or pop.apply
(see below) along the age dimension. For example,
“apply(P380{}, c(1,3,4), gmedian, cats=seq(0, by=5, length=28))”
is an expression for median age in Italy. (See pop.apply
below for a simplified version.)
gmean(f, cats)
Works like gmedian
but gives the grouped mean.
age.func(data, fun="*")
This function applies fun
to data
and the corresponding age (the middle point of each age category). The default case would multiply data by the corresponding age. Like gmedian
, it is to be used in combination with apply
or pop.apply
.
drop.age(data)
Drops the age dimension of the data. For example, if two basic components are combined where one is used within the apply
function, the other will need to change its dimension in order to have conformable arrays. For example,
“apply(age.func(P752{}), c(1,3,4), sum) / drop.age(P752)”
is an expression for the average age in Sweden. (See pop.apply
below for a simplified version.)
pop.apply(data, fun, ..., split.along=c("None", "age", "traj", "country"))
By default applies function fun
to the age dimension of data
and converts the result into the same format as returned by a basic component. This allows combining the apply
function with other basic components without having to modify their dimensions. For example,
“pop.apply(age.func(P752{}), fun=sum) / P752” gives the average age in Sweden, or
“pop.apply(P380{}, gmedian, cats=seq(0, by=5, length=28))” gives the median age of Italy.
If split.along
is not ‘None’, it can be used as an apply
function where the data is sliced along one axis.
pop.combine(data1, data2, fun, ..., split.along=c("age", "traj", "country"))
Can be used if two basic components should be combined that result in different shapes. It tries to put data into the right format and calls pop.apply
. For example,
“pop.combine(PIND{}, PIND, '/')” gives population by age per total population in India, or
“pop.combine(BFR - DFR, GFR, '+', split.along='traj')” gives births minus deaths plus net migration in France. Here, pop.combine
is necessary, because ‘GFR’ is a deterministic component and thus, has only one trajectory, whereas births and deaths are probabilistic.
age.index01(end)
Can be used with indicators ‘S’, ‘M’, ‘Q’ and ‘E’ only. It returns an array of age group indices that include ages 0-1 and 1-4 and exclude 0-4. The last age index is end
.
age.index05(end)
Returns an array of age group indices starting with group 0-4, 5-9 until the age group corresponding to index end
.
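What a grouped median and a grouped mean compute can be shown with the textbook formulas. The functions below are an independent sketch, not the package implementation of gmedian and gmean. They assume, as the examples above suggest, that the categories are given as group boundaries, i.e. one more value than there are frequencies.

```r
# Sketch of a median for grouped data (linear interpolation within the
# group that contains the median).
grouped.median <- function(f, cats) {
  lower <- cats[-length(cats)]   # lower boundary of each group
  width <- diff(cats)            # width of each group
  cum <- cumsum(f)
  half <- sum(f) / 2
  i <- which(cum >= half)[1]     # group containing the median
  below <- if (i > 1) cum[i - 1] else 0
  lower[i] + (half - below) / f[i] * width[i]
}

# Sketch of a mean for grouped data (frequencies weighted by group mid-points).
grouped.mean <- function(f, cats) {
  mid <- cats[-length(cats)] + diff(cats) / 2
  sum(f * mid) / sum(f)
}

f <- c(10, 20, 30, 40)                  # frequencies of four 5-year groups
cats <- seq(0, by = 5, length.out = 5)  # boundaries 0, 5, 10, 15, 20
grouped.median(f, cats)  # 10 + (50 - 30) / 30 * 5
grouped.mean(f, cats)    # (10*2.5 + 20*7.5 + 30*12.5 + 40*17.5) / 100
```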
There is also a helper function available that generates an expression for the mean age of childbearing, see mac.expression
.
The expression parser is simple and far from perfect. We recommend leaving spaces around the basic components.
Hana Sevcikova, Adrian Raftery
H. Sevcikova, A. E. Raftery (2016). bayesPop: Probabilistic Population Projections. Journal of Statistical Software, 75(5), 1-29. doi:10.18637/jss.v075.i05
mac.expression
, get.pop
, pop.trajectories.plot
, pop.map
, write.pop.projection.summary
.
sim.dir <- file.path(find.package("bayesPop"), "ex-data", "Pop")
pred <- get.pop.prediction(sim.dir, write.to.cache=FALSE)

# median age of women in child-bearing ages in Netherlands and all countries - trajectories
pop.trajectories.plot(pred, nr.traj=0,
    expression="pop.apply(P528_F{4:10}, gmedian, cats= seq(15, by=5, length=8))")
## Not run: 
pop.trajectories.plotAll(pred, nr.traj=0,
    expression="pop.apply(PXXX_F{4:10}, gmedian, cats= seq(15, by=5, length=8))")
## End(Not run)

# mean age of women in child-bearing ages in Netherlands - table
pop.trajectories.table(pred,
    expression="pop.apply(age.func(P528_F{4:10}), fun=sum) / P528_F[4:10]")
# - gives the same results as with "pop.apply(P528_F{4:10}, gmean, cats=seq(15, by=5, length=8))"
# - for the mean age of childbearing, see ?mac.expression

# migration per capita by age
pop.byage.plot(pred, expression="GNL{} / PNL{}", year=2000)

## Not run: 
# potential support ratio - map (with the two countries
# contained in pred object)
pop.map(pred, expression="PXXX[5:13] / PXXX[14:27]")
## End(Not run)

# proportion of 0-4 years old to whole population - export to an ASCII file
dir <- tempfile()
write.pop.projection.summary(pred, expression="PXXX[1] / PXXX", output.dir=dir)
unlink(dir)

## Not run: 
# These are vital events only available if keep.vital.events=TRUE in pop.predict, e.g.
# sim.dir.tmp <- tempfile()
# pred <- pop.predict(countries="Netherlands", nr.traj=3,
#           keep.vital.events=TRUE, output.dir=sim.dir.tmp)
# log female mortality rate by age for Netherlands in 2050, including 0-1 and 1-4 age groups
pop.byage.plot(pred, expression="log(MNL_F{age.index01(27)})", year=2050)

# trajectories of male 1q0 and table of 5q0 for Netherlands
pop.trajectories.plot(pred, expression="QNLD_M[-1]")
pop.trajectories.table(pred, expression="QNLD_M[1]")
# unlink(sim.dir.tmp)
## End(Not run)
Generates a world map of various population measures for a given quantile and a projection or observed period, using different techniques: pop.map
uses rworldmap, pop.ggmap
uses ggplot2, and pop.map.gvis
creates an interactive map via GoogleVis.
pop.map(pred, sex = c("both", "male", "female"), age = "all",
    expression = NULL, ...)

pop.ggmap(pred, sex=c('both', 'male', 'female'), age='all',
    expression=NULL, ...)

get.pop.map.parameters(pred, expression = NULL, sex = c("both", "male", "female"),
    age = "all", range = NULL, nr.cats = 50, same.scale = TRUE, quantile = 0.5, ...)

pop.map.gvis(pred, ...)
pred |
Object of class |
sex |
One of “both” (default), “male” or “female”. By default the male and female counts are summed up. This argument is only used if |
age |
Either a character string “all” (default) or an integer vector of age indices. Value 1 corresponds to age 0-4, value 2 corresponds to age 5-9 etc. Last age group |
expression |
Expression defining the population measure to be plotted. For syntax see |
range |
Range of the population measure to be displayed. It is of the form |
nr.cats |
Number of color categories. |
same.scale |
Logical controlling if maps for all years of this prediction object should be on the same color scale. |
quantile |
Quantile for which the map should be generated. It must be equal to one of the values in |
... |
Additional arguments passed to the underlying functions. In |
pop.map
creates a single map for the given time period and quantile. If the package fields is installed, a color bar legend at the bottom of the map is created.
Function get.pop.map.parameters
can be used in combination with pop.map
. It sets breakpoints for the color scheme.
Function pop.ggmap
is similar to pop.map
, but uses the ggplot2 package in combination with the geom_sf
function.
Function pop.map.gvis
creates an interactive map using the googleVis package and opens it in an internet browser. It also generates a table of the mapped values that can be sorted by columns interactively in the browser.
get.pop.map.parameters
returns a list with elements:
pred |
The object of class |
quantile |
Value of the argument |
catMethod |
If the argument |
numCats |
Number of categories. |
coulourPalette |
Subset of the rainbow palette, starting from dark blue and ending at red. |
... |
Additional arguments passed to the function. |
If the expression
argument or a non-standard combination of sex and age is used, quantiles are computed on the fly. In such a case, trajectory files for all countries have to be loaded from disk, which can be time-consuming. Therefore a simple caching mechanism was added to the prediction object which allows re-using data from previously used expressions. The prediction object points to an environment called cache
which is a collection of data arrays that are results of evaluating expressions. The space-trimmed expressions are the names of the cache
entries. Every time a map function is called, it is checked if the corresponding expression is contained in the cache
. If it is not the case, the quantiles are computed on the fly, otherwise the existing values are taken.
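The caching idea described above can be sketched with a plain R environment whose entry names are the space-trimmed expressions. All names below (`cache`, `stats`, `cached.eval`, `slow.compute`) are made up for illustration; the sketch does not show how the prediction object stores its cache internally.

```r
cache <- new.env()   # stand-in for the cache environment
stats <- new.env()   # counts how often the expensive step runs
stats$n.evaluations <- 0

# Hypothetical helper: evaluate 'compute' only if the space-trimmed
# expression is not yet in the cache.
cached.eval <- function(expression, compute) {
  key <- gsub("[[:space:]]", "", expression)
  if (!exists(key, envir = cache, inherits = FALSE)) {
    stats$n.evaluations <- stats$n.evaluations + 1
    assign(key, compute(), envir = cache)
  }
  get(key, envir = cache, inherits = FALSE)
}

slow.compute <- function() c(0.4, 0.5)  # stands for computing quantiles on the fly

a <- cached.eval("PXXX[1] / PXXX", slow.compute)
b <- cached.eval("PXXX[1]/PXXX", slow.compute)  # same key, served from the cache
ls(cache)
stats$n.evaluations
```

Because the key ignores white space, the two calls above differ only in spacing and share one cache entry.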
When computing on the fly, the function tries to process it in parallel if possible, using the package parallel. In such a case, the computation is split across a number of nodes, which
is either the number of cores detected automatically (default), or the value of
getOption("cl.cores")
. Use options(cl.cores=n)
to modify the default. If sequential processing is desired, set cl.cores
to 1.
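A short sketch of how the number of nodes could be resolved from the option and how to switch to sequential processing. The fallback to `detectCores()` is an assumption that mirrors the default described above; it is not taken from the package code.

```r
library(parallel)

old <- options(cl.cores = 2)   # request two nodes
n.nodes <- getOption("cl.cores", detectCores())

options(cl.cores = 1)          # request sequential processing
n.sequential <- getOption("cl.cores", detectCores())

options(old)                   # restore the previous setting
c(n.nodes, n.sequential)
```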
The cache data are also stored on disk, namely in the simulation directory of the prediction object. By default, every update of the cache in memory is also written to disk. Thus, expression results can be re-used in multiple R sessions. Function pop.cleanup.cache
deletes the content of the cache. Writing the cache to disk can be turned off by setting the argument write.to.cache=FALSE
in the get.pop.prediction
function. We use this setting in the examples throughout this manual whenever the example data from the installation directory is used, in order to prevent writing into the installation directory.
Hana Sevcikova
## Not run: 
##########################
# This example only makes sense if there is a simulation
# for all countries. Below, only two countries are included,
# so the map is useless.
##########################
sim.dir <- file.path(find.package("bayesPop"), "ex-data", "Pop")
pred <- get.pop.prediction(sim.dir=sim.dir, write.to.cache=FALSE)

# Using ggplot2
pop.ggmap(pred)
pop.ggmap(pred, year = 2100)

# Using rworldmap
# Uses heat colors with seven categories by default
pop.map(pred, sex="female", age=4:10)
# Female population in child-bearing age as a proportion of totals
pop.map(pred, expression="PXXX_F[4:10] / PXXX")
# The same with more colors
params <- get.pop.map.parameters(pred, expression="PXXX_F[4:10] / PXXX")
do.call("pop.map", params)
# Another projection year on the same color scale
do.call("pop.map", c(list(year=2043), params))

# Interactive map of potential support ratio (requires Flash)
pop.map.gvis(pred, expression="PXXX[5:13] / PXXX[14:27]")
## End(Not run)
The function generates trajectories of probabilistic population projection for all countries for which input data is available, or any subset of them.
pop.predict(end.year = 2100, start.year = 1950, present.year = 2020, wpp.year = 2019,
    countries = NULL, output.dir = file.path(getwd(), "bayesPop.output"),
    annual = FALSE,
    inputs = list(popM=NULL, popF=NULL, mxM=NULL, mxF=NULL, srb=NULL,
        pasfr=NULL, patterns=NULL, migM=NULL, migF=NULL,
        migMt=NULL, migFt=NULL, mig=NULL, mig.fdm = NULL,
        e0F.file=NULL, e0M.file=NULL, tfr.file=NULL,
        e0F.sim.dir=NULL, e0M.sim.dir=NULL, tfr.sim.dir=NULL,
        migMtraj = NULL, migFtraj = NULL, migtraj = NULL, migFDMtraj = NULL,
        GQpopM = NULL, GQpopF = NULL, average.annual = NULL),
    nr.traj = 1000, keep.vital.events = FALSE,
    fixed.mx = FALSE, fixed.pasfr = FALSE, lc.for.hiv = TRUE, lc.for.all = TRUE,
    mig.is.rate = FALSE, mig.age.method = c("auto", "fdmp", "fdmnop", "rc"),
    mig.rc.fam = NULL, my.locations.file = NULL,
    replace.output = FALSE, verbose = TRUE, ...)
end.year |
End year of the projection. |
start.year |
First year of the historical data. |
present.year |
Year for which initial population data is to be used. |
wpp.year |
Year for which WPP data is used. The function loads a package called wpp |
countries |
Array of country codes or country names for which a projection is generated. If it is |
output.dir |
Output directory of the projection. If there is an existing projection in |
annual |
Logical. If |
inputs |
A list of file names where input data is stored. It contains the following elements (Unless otherwise noted, these are tab delimited ASCII files; Names of default datasets from the corresponding wpp package which are used if the corresponding element is
|
nr.traj |
Number of trajectories to be generated. If this number is smaller than the number of available trajectories of the probabilistic components (TFR, life expectancy and migration), the trajectories are equidistantly thinned.
If all of those components contain fewer trajectories than |
keep.vital.events |
Logical. If |
fixed.mx |
Logical. If |
fixed.pasfr |
Logical. If |
lc.for.hiv |
Logical controlling if the modified Lee-Carter method should be used
for projection of mortality rates for countries with HIV epidemics. If |
lc.for.all |
Logical controlling if the modified Lee-Carter method should be used
for projection of mortality rates for all countries. If |
mig.is.rate |
Logical determining if migration data are to be interpreted as net migration rates ( |
mig.age.method |
If migration is given as totals, this argument determines a method to disaggregate into age-specific migration. The “rc” method uses a simple Rogers-Castro disaggregation, via the function Values “fdmp” and “fdmnop” trigger the Flow Difference Method (Sevcikova et al, 2024), where “fdmp” weights the flows by population, while “fdmnop” is an unweighted version. They both split the total net migration into total in- and out-migration and then disaggregate these flows separately. These two FDM methods use additional inputs in the The “auto” method (default) uses “rc” if sex-specific migration totals are given, i.e. in |
mig.rc.fam |
Data frame providing a single family of Rogers-Castro parameters to be used if |
my.locations.file |
Name of a tab-delimited ascii file with a set of all locations for which a projection is generated. Use this argument if you are projecting for a country/region that is not included in the standard |
replace.output |
Logical. If |
verbose |
Logical controlling the amount of output messages. |
... |
Additional arguments passed to the underlying function. These can be |
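The equidistant thinning mentioned for the nr.traj argument above can be pictured as picking evenly spaced trajectory indices. The rule below is an assumption made for illustration (the exact index rule used by the package is not stated here), and `thin.index` is a hypothetical helper.

```r
# Hypothetical helper: evenly spaced indices of trajectories to keep.
thin.index <- function(n.available, nr.traj) {
  if (nr.traj >= n.available) return(seq_len(n.available))
  unique(round(seq(1, n.available, length.out = nr.traj)))
}

idx <- thin.index(n.available = 1000, nr.traj = 5)
idx
diff(idx)  # spacing between selected trajectories is (nearly) constant

thin.index(n.available = 3, nr.traj = 10)  # nothing to thin
```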
The population projection is computed using the cohort component method and is based on an algorithm used by the United Nations Population Division (see also Sevcikova et al (2016b) in the References below). For each country, one projection is calculated for each trajectory of male and female life expectancy, TFR and possibly migration. This results in a set of trajectories of population projection which forms its posterior distribution. The trajectories of life expectancy and TFR can be given either in their binary form generated by the packages bayesLife and bayesTFR, respectively (as directories e0M.sim.dir
, e0F.sim.dir
, tfr.sim.dir
of the inputs
argument), or they can be given as ASCII tables in csv format, see above. The number of trajectories for male and female life expectancy must match, and so must the number of trajectories for male and female migration.
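For readers unfamiliar with the cohort component method, the sketch below shows one heavily simplified projection step for a female-only population in 5-year age groups. It is a textbook illustration under stated assumptions, not the algorithm used by the package or by the United Nations: the function `ccm.step`, its arguments and the treatment of births and migration are all made up for this example.

```r
# One 5-year step of a simplified, female-only cohort component projection.
#   pop        population by age group (last group is open-ended)
#   surv       survival ratio of each age group into the next one
#              (for the last group: survival within the open-ended group)
#   asfr       annual age-specific fertility rates per woman
#   surv.birth survival ratio from birth into the first age group
#   srb        sex ratio at birth (male births per female birth)
#   mig        net migration by age group, added at the end of the step
ccm.step <- function(pop, surv, asfr, surv.birth, srb = 1.05, mig = 0) {
  n <- length(pop)
  aged <- numeric(n)
  aged[2:n] <- pop[1:(n - 1)] * surv[1:(n - 1)]  # survive and age by one group
  aged[n] <- aged[n] + pop[n] * surv[n]          # survivors of the open group
  # births over 5 years from the average exposed population of women
  births <- 5 * sum(asfr * (pop + aged) / 2)
  aged[1] <- births / (1 + srb) * surv.birth     # surviving female births
  aged + mig
}

new.pop <- ccm.step(pop = c(100, 100, 100), surv = c(1, 1, 0.5),
                    asfr = c(0, 0.1, 0), surv.birth = 1, srb = 1)
new.pop
```

In a probabilistic projection, a step like this would be repeated for every trajectory of the fertility, mortality and migration inputs, which is what produces a set of population trajectories.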
The projection is generated sequentially location by location. Results are stored in a sub-directory of output.dir
called ‘prediction’. There is one binary file per location, called ‘totpop_country.rda’, where
is the country code. It contains six objects:
totp
, totpf
, totpm
(trajectories of total population, age-specific female and age-specific male, respectively), totp.hch
, totpf.hch
, totpm.hch
(the UN half-child variant for total population, age-specific female and age-specific male, respectively). Optionally, if keep.vital.events
is TRUE
, there is an additional file per country, called ‘vital_events_country.rda’, containing the following objects:
btm
, btf
(trajectories for births by age of mothers for male and female child, respectively), deathsm
, deathsf
(trajectories for age-specific male and female deaths, respectively), asfert
(trajectories of age-specific fertility), mxm
, mxf
(trajectories of male and female age-specific mortality rates), migm
, migf
(if used, these are trajectories of male and female age-specific migration), btm.hch
, btf.hch
, deathsm.hch
, deathsf.hch
, asfert.hch
, mxm.hch
, mxf.hch
(the UN half-child variant for age- and sex-specific births, deaths, fertility rates and mortality rates). An object of class bayesPop.prediction
is stored in the same directory in a file ‘prediction.rda’. It is updated every time a country projection is finished.
See pop.trajectories
for extracting trajectories.
To access a previously stored prediction object, use get.pop.prediction
.
Object of class bayesPop.prediction
with the following elements:
base.directory |
Full path to the base directory |
output.directory |
Sub-directory relative to |
nr.traj |
The actual number of trajectories of the projections. |
quantiles |
Three-dimensional array of projection quantiles (countries x number of quantiles x projection periods). The second dimension corresponds to the following quantiles: |
traj.mean.sd |
Three-dimensional array of projection mean and standard deviation (countries x 2 x projection periods). The first and second matrix of the second dimension are the mean and standard deviation, respectively. |
quantilesM , quantilesF
|
Quantiles of male and female projection, respectively. Same structure as |
traj.mean.sdM , traj.mean.sdF
|
Same as |
quantilesMage , quantilesFage
|
Four-dimensional array of age-specific quantiles of male and female projection, respectively (countries x age groups x number of quantiles x projection periods). The same quantiles are used as in |
quantilesPropMage , quantilesPropFage
|
Array of age-specific quantiles of male and female projection, respectively, divided by the total population. The dimensions are the same as in |
estim.years |
Vector of time for which historical data was used in the projections. |
proj.years |
Vector of projection time periods starting with the present period. |
wpp.year |
The wpp year used. |
inputs |
List of input data used for the projection. |
function.inputs |
Content of the |
countries |
Matrix of countries for which projection exists. It contains two columns: |
ages |
Vector of age groups. |
annual |
If |
cache |
This component is added by |
write.to.cache |
Logical determining if |
is.aggregation |
Logical determining if this object is a result of |
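The layout of the summary arrays listed above can be pictured with a small base R sketch. It uses simulated numbers rather than bayesPop output, and the country labels, periods and the three quantiles are illustrative assumptions; it is not the package's internal code.

```r
# Simulated stand-in for population trajectories (NOT bayesPop output):
# 2 countries x 3 projection periods x 50 trajectories.
set.seed(1)
countries <- c("A", "B")               # hypothetical country labels
periods <- c("2025", "2030", "2035")   # hypothetical projection periods
traj <- array(rnorm(2 * 3 * 50, mean = 1000, sd = 50),
              dim = c(2, 3, 50),
              dimnames = list(countries, periods, NULL))

# Mean and standard deviation: countries x 2 x projection periods
traj.mean.sd <- array(NA_real_, dim = c(2, 2, 3),
                      dimnames = list(countries, c("mean", "sd"), periods))
traj.mean.sd[, 1, ] <- apply(traj, c(1, 2), mean)
traj.mean.sd[, 2, ] <- apply(traj, c(1, 2), sd)

# Quantiles: countries x number of quantiles x projection periods
probs <- c(0.1, 0.5, 0.9)              # illustrative choice of quantiles
q <- apply(traj, c(1, 2), quantile, probs = probs)  # quantiles x countries x periods
quantiles <- aperm(q, c(2, 1, 3))      # reorder to countries x quantiles x periods
dimnames(quantiles)[[2]] <- as.character(probs)

# Median projection of country "A" for all periods
quantiles["A", "0.5", ]
```

With a real prediction object, the same indexing pattern would apply to the quantiles and traj.mean.sd elements, with the quantile labels taken from the object itself.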
Hana Sevcikova, Thomas Buettner, based on code of Nan Li and helpful comments from Patrick Gerland
H. Sevcikova, A. E. Raftery (2016a). bayesPop: Probabilistic Population Projections. Journal of Statistical Software, 75(5), 1-29. doi:10.18637/jss.v075.i05
A. E. Raftery, N. Li, H. Sevcikova, P. Gerland, G. K. Heilig (2012). Bayesian probabilistic population projections for all countries. Proceedings of the National Academy of Sciences 109:13915-13921. doi:10.1073/pnas.1211452109
P. Gerland, A. E. Raftery, H. Sevcikova, N. Li, D. Gu, T. Spoorenberg, L. Alkema, B. K. Fosdick, J. L. Chunn, N. Lalic, G. Bay, T. Buettner, G. K. Heilig, J. Wilmoth (2014). World Population Stabilization Unlikely This Century. Science 346:234-237.
H. Sevcikova, N. Li, V. Kantorova, P. Gerland and A. E. Raftery (2016b). Age-Specific Mortality and Fertility Rates for Probabilistic Population Projections. In: Dynamic Demographic Analysis, ed. Schoen R. (Springer), pp. 285-310. Earlier version in arXiv:1503.05215.
H. Sevcikova, J. Raymer, A. E. Raftery (2024). Forecasting Net Migration By Age: The Flow-Difference Approach. arXiv:2411.09878.
pop.trajectories.plot
, pop.pyramid
, pop.trajectories
, get.pop.prediction
, age.specific.migration
## Not run: 
sim.dir <- tempfile()
# Countries can be given as a combination of numerical codes and names
# (218 is the numerical code of Ecuador)
pred <- pop.predict(countries=c("Netherlands", 218, "Madagascar"),
            nr.traj=3, output.dir=sim.dir)
pop.trajectories.plot(pred, "Ecuador", sum.over.ages=TRUE)
unlink(sim.dir, recursive=TRUE)
## End(Not run)
Generates trajectories of probabilistic population projection for subregions of a given country.
pop.predict.subnat(end.year = 2060, start.year = 1950, present.year = 2020,
    wpp.year = 2019, output.dir = file.path(getwd(), "bayesPop.output"),
    locations = NULL, default.country = NULL, annual = FALSE,
    inputs = list(
        popM = NULL, popF = NULL,
        mxM = NULL, mxF = NULL, srb = NULL,
        pasfr = NULL, patterns = NULL,
        migM = NULL, migF = NULL,
        migMt = NULL, migFt = NULL, mig = NULL, mig.fdm = NULL,
        e0F.file = NULL, e0M.file = NULL, tfr.file = NULL,
        e0F.sim.dir = NULL, e0M.sim.dir = NULL, tfr.sim.dir = NULL,
        migMtraj = NULL, migFtraj = NULL, migtraj = NULL, migFDMtraj = NULL,
        GQpopM = NULL, GQpopF = NULL, average.annual = NULL
    ),
    nr.traj = 1000, keep.vital.events = FALSE,
    fixed.mx = FALSE, fixed.pasfr = FALSE, lc.for.all = TRUE,
    mig.is.rate = FALSE, mig.age.method = c("rc", "fdmp", "fdmnop"),
    mig.rc.fam = NULL, pasfr.ignore.phase2 = FALSE,
    replace.output = FALSE, verbose = TRUE)
end.year |
End year of the projection. |
start.year |
First year of the historical data on mortality rates. It determines the length of the historical time series used in the Lee-Carter estimation. |
present.year |
Year for which initial population data is to be used. |
wpp.year |
Year for which WPP data is used. The function loads a package called wpp |
output.dir |
Output directory of the projection. |
locations |
Name of a tab-delimited file that contains definitions of the subregions. It has a similar structure as |
default.country |
Numerical code of the country to which the subregions belong. It is used for extracting default datasets from the wpp package if some region-specific input datasets are missing. Alternatively, it can also be included in the |
annual |
Logical. If |
inputs |
A list of file names where input data is stored. Unless otherwise noted, these are tab delimited ASCII files with a mandatory column
|
nr.traj , keep.vital.events , fixed.mx , fixed.pasfr , lc.for.all , mig.is.rate , mig.age.method , mig.rc.fam , replace.output , verbose
|
These arguments have the same meaning as in |
pasfr.ignore.phase2 |
Logical. If |
Population projection for subnational units (regions) is performed by applying the cohort component method to subnational datasets on projected fertility (TFR), mortality and net migration, starting from given sex- and age-specific population counts. The only required inputs are the initial sex- and age-specific population counts in each region (popM
and popF
elements of the inputs
argument) and a file with a set of locations (argument locations
). If no other input datasets are given, those datasets are replaced by the corresponding "national" values, taken from the corresponding wpp package. The argument default.country
determines the country for those default "national" values. The default country can also be included in the locations
file as a record with location.type
being set to 0.
The TFR component can be given as a set of trajectories generated using the tfr.predict.subnat
function of the bayesTFR package (tfr.sim.dir
element). Alternatively, trajectories can be given in an ASCII file (tfr.file
).
Similarly, the $e_0$ component can be given as a set of trajectories generated using the e0.predict.subnat
function of the bayesLife package (e0F.sim.dir
element). If male projections are generated jointly (i.e. predict.jmale = TRUE
), set e0M.sim.dir = "joint_"
. Alternatively, trajectories can be given in ASCII files (e0F.file
, e0M.file
).
Having a set of subnational TFR and $e_0$ trajectories, the cohort component method is applied to each of them to yield a distribution of future subnational population.
Projection of net migration can either be given as disaggregated sex- and age-specific datasets (migM
and migF
), or as sex totals (migMt
and migFt
), or as totals (mig
), or as sex- and age-specific trajectories (migMtraj
and migFtraj
), or as total trajectories (migtraj
). Alternatively, it can be given as shares between regions as columns in the patterns
dataset. These are: inmigrationM_share
, inmigrationF_share
, outmigrationM_share
, outmigrationF_share
. The sex specification and/or direction specification (in/out) can be omitted, e.g. it can be simply migration_share
. The function extracts the values of the net migration projection on the national level and distributes them to regions according to the given shares. For positive (national) values, it uses the in-migration shares; for negative values, it uses the out-migration shares. If the in/out prefix is omitted in the column names, the given migration shares are used for both positive and negative net migration projections. By default, if neither migration datasets nor region-specific shares are given, the distribution between regions is proportional to population size. The age-specific schedules follow the Rogers-Castro age schedules by default. Note that handling migration using shares as described here only affects the distribution of international migration into regions. It does not take into account between-region migration.
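The share-based distribution of national net migration can be illustrated with a minimal base R sketch. The numbers, region names and the helper distribute.migration are hypothetical and only mirror the rule described above (in-migration shares for positive national values, out-migration shares for negative ones); the package's own implementation may differ in detail.

```r
# Hypothetical national net migration (thousands) for three projection periods
nat.mig <- c("2025" = 120, "2030" = -40, "2035" = 0)

# Hypothetical regional shares; each column sums to 1 over the regions
shares <- data.frame(
  inmigration_share  = c(0.5, 0.3, 0.2),
  outmigration_share = c(0.2, 0.2, 0.6),
  row.names = c("Region1", "Region2", "Region3")
)

# Illustrative helper (not part of bayesPop)
distribute.migration <- function(nat.mig, shares) {
  sapply(nat.mig, function(m) {
    # in-migration shares for positive national values, out-migration otherwise;
    # a national value of zero yields zero for every region under either choice
    s <- if (m > 0) shares$inmigration_share else shares$outmigration_share
    setNames(m * s, rownames(shares))
  })
}

reg.mig <- distribute.migration(nat.mig, shares)
reg.mig           # regions x periods
colSums(reg.mig)  # reproduces the national values
```

Because the shares sum to one, the regional values always add up to the national net migration, which is why this mechanism only redistributes international migration and says nothing about flows between regions.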
The package contains example datasets for Canada. Use these as templates for your own data. See Example below.
Object of class bayesPop.prediction
containing the subnational projections. Note that this object can be used in the various bayesPop functions exactly the same way as an object with national projections. However, the meaning of the argument country
in many of these functions (e.g. in pop.trajectories.plot
) changes to an identification of the region (either as a numerical code or name as defined in the locations
file).
We are grateful to Patrice Dion from Statistics Canada for providing us with example data. Note that the example datasets included in the package are not official STATCAN data; they only serve the purpose of illustration and templates. Data for the time period 2015-2020 has been imputed by the author.
Hana Sevcikova
pop.predict
, tfr.predict.subnat
, pop.aggregate.subnat
## Not run: 
# Subnational projections for Canada
#########
data.dir <- file.path(find.package("bayesPop"), "extdata")

# Use national data for tfr and e0
###
sim.dir <- tempfile()
pred <- pop.predict.subnat(output.dir = sim.dir,
            locations = file.path(data.dir, "CANlocations.txt"),
            inputs = list(popM = file.path(data.dir, "CANpopM.txt"),
                          popF = file.path(data.dir, "CANpopF.txt"),
                          tfr.file = "median_"
                          ),
            verbose = TRUE)
pop.trajectories.plot(pred, "Alberta", sum.over.ages = TRUE)
unlink(sim.dir, recursive=TRUE)

# Use subnational probabilistic TFR simulation
###
# Subnational TFR projections for Canada (from ?tfr.predict.subnat)
my.subtfr.file <- file.path(find.package("bayesTFR"), 'extdata',
                            'subnational_tfr_template.txt')
tfr.nat.dir <- file.path(find.package("bayesTFR"), "ex-data", "bayesTFR.output")
tfr.reg.dir <- tempfile()
tfr.preds <- tfr.predict.subnat(124, my.tfr.file = my.subtfr.file,
                    sim.dir = tfr.nat.dir, output.dir = tfr.reg.dir,
                    start.year = 2013)

# Use subnational probabilistic e0
###
# Subnational e0 projections for Canada (from ?e0.predict.subnat)
# (here using the same female and male data, just for illustration)
my.sube0.file <- file.path(find.package("bayesLife"), 'extdata',
                           'subnational_e0_template.txt')
e0.nat.dir <- file.path(find.package("bayesLife"), "ex-data", "bayesLife.output")
e0.reg.dir <- tempfile()
e0.preds <- e0.predict.subnat(124, my.e0.file = my.sube0.file,
                    sim.dir = e0.nat.dir, output.dir = e0.reg.dir,
                    start.year = 2018, predict.jmale = TRUE,
                    my.e0M.file = my.sube0.file)

# Population projections
sim.dir <- tempfile()
pred <- pop.predict.subnat(output.dir = sim.dir,
            locations = file.path(data.dir, "CANlocations.txt"),
            inputs = list(popM = file.path(data.dir, "CANpopM.txt"),
                          popF = file.path(data.dir, "CANpopF.txt"),
                          patterns = file.path(data.dir, "CANpatterns.txt"),
                          tfr.sim.dir = file.path(tfr.reg.dir, "subnat", "c124"),
                          e0F.sim.dir = file.path(e0.reg.dir, "subnat_ar1", "c124"),
                          e0M.sim.dir = "joint_"
                          ),
            verbose = TRUE)
pop.trajectories.plot(pred, "Alberta", sum.over.ages = TRUE)
pop.pyramid(pred, "Manitoba", year = 2050)
get.countries.table(pred)

# Aggregate to country level
aggr <- pop.aggregate.subnat(pred, regions = 124,
            locations = file.path(data.dir, "CANlocations.txt"))
pop.trajectories.plot(aggr, "Canada", sum.over.ages = TRUE)

unlink(sim.dir, recursive = TRUE)
unlink(tfr.reg.dir, recursive = TRUE)
unlink(e0.reg.dir, recursive = TRUE)
## End(Not run)
Functions for plotting probabilistic population pyramid. pop.pyramid
creates a classic pyramid using rectangles; pop.trajectories.pyramid
creates one or more pyramids using vertical lines (possibly derived from population trajectories). They can be used to view a prediction object created with this package, or any user-defined sex- and age-specific dataset. For the latter, function get.bPop.pyramid
should be used to translate user-defined data into a bayesPop.pyramid
object.
## S3 method for class 'bayesPop.prediction'
pop.pyramid(pop.object, country, year = NULL,
    indicator = c("P", "B", "D"), pi = c(80, 95),
    proportion = FALSE, age = NULL, plot = TRUE, pop.max = NULL, ...)

## S3 method for class 'bayesPop.pyramid'
pop.pyramid(pop.object, main = NULL, show.legend = TRUE,
    pyr1.par = list(border="black", col=NA, density=NULL, height=0.9),
    pyr2.par = list(density = -1, height = 0.3),
    show.birth.year = FALSE,
    col.pi = NULL, ann = par("ann"), axes = TRUE, grid = TRUE,
    cex.main = 0.9, cex.sub = 0.9, cex = 0.8, cex.axis = 0.8, ...)

pop.pyramidAll(pop.pred, year = NULL,
    output.dir = file.path(getwd(), "pop.pyramid"),
    output.type = "png", one.file = FALSE, verbose = FALSE, ...)

## S3 method for class 'bayesPop.prediction'
pop.trajectories.pyramid(pop.object, country, year = NULL,
    indicator = c("P", "B", "D"), pi = c(80, 95), nr.traj = NULL,
    proportion = FALSE, age = NULL, plot = TRUE, pop.max = NULL, ...)

## S3 method for class 'bayesPop.pyramid'
pop.trajectories.pyramid(pop.object, main = NULL, show.legend = TRUE,
    show.birth.year = FALSE, col = rainbow, col.traj = "#00000020",
    omit.page.pars = FALSE, lwd = 2, ann = par("ann"), axes = TRUE, grid = TRUE,
    cex.main = 0.9, cex.sub = 0.9, cex = 0.8, cex.axis = 0.8, ...)

pop.trajectories.pyramidAll(pop.pred, year = NULL,
    output.dir = file.path(getwd(), "pop.traj.pyramid"),
    output.type = "png", one.file = FALSE, verbose = FALSE, ...)

## S3 method for class 'bayesPop.pyramid'
plot(x, ...)

## S3 method for class 'bayesPop.prediction'
get.bPop.pyramid(data, country, year = NULL,
    indicator = c("P", "B", "D"), pi = c(80, 95),
    proportion = FALSE, age = NULL, nr.traj = 0, sort.pi=TRUE,
    pop.max = NULL, ...)

## S3 method for class 'data.frame'
get.bPop.pyramid(data, main.label = NULL, legend = "observed",
    is.proportion = FALSE, ages = NULL, pop.max = NULL,
    LRmain = c("Male", "Female"), LRcolnames = c("male", "female"), CI = NULL, ...)

## S3 method for class 'matrix'
get.bPop.pyramid(data, ...)

## S3 method for class 'list'
get.bPop.pyramid(data, main.label = NULL, legend = NULL, CI = NULL, ...)
pop.object |
Object of class |
pop.pred |
Object of class |
x |
Object of class |
data |
Data frame, matrix, list or object of class |
country |
Name or numerical code of a country. It can also be given as ISO-2 or ISO-3 characters. |
year |
Year within the projection or estimation period to be plotted. Default is the start year of the prediction. It can also be a vector of years. |
indicator |
One of the characters “P” (population), “B” (births), “D” (deaths) determining the pyramid indicator. |
pi |
Probability interval. It can be a single number or an array. |
proportion |
Logical. If |
age |
Integer vector of age indices. In a 5-year simulation, value 1 corresponds to age 0-4, value 2 corresponds to age 5-9 etc. In a 1x1 simulation, values 1, 2, 3 correspond to ages 0, 1, 2. The last available age group is 130+, which corresponds to index 27 in a 5-year simulation and index 131 in an annual simulation. The purpose of this argument here is mainly to control the height of the pyramid. |
plot |
If |
main |
Title of the plot. By default it is the country name and projection year if known. |
show.legend |
Logical controlling if the plot legend is drawn. |
pyr1.par , pyr2.par
|
List of graphical parameters (color, border, density and height) for drawing the pyramid rectangles, for the first and second pyramid, respectively (see Details). The |
show.birth.year |
Logical. If |
col.pi |
Vector of colors for drawing the probability boxes. If it is given, it must be of the same length as |
ann |
Logical controlling if any annotation (main and legend) is plotted. |
axes |
Logical controlling if axes are plotted. |
grid |
Logical controlling if grid lines are plotted. |
cex.main , cex.sub , cex , cex.axis
|
Magnification to be used for the title, secondary titles on the right and left panels, legend and axes, respectively. |
output.dir |
Directory into which resulting graphs are stored. |
output.type |
Type of the resulting files. It can be “png”, “pdf”, “jpeg”, “bmp”, “tiff”, or “postscript”. |
one.file |
Logical. If |
verbose |
Logical switching log messages on and off. |
nr.traj |
Number of trajectories to be plotted. If |
col |
Colors generating function. It is called with an argument giving the number of pyramids to be plotted. Each color is then used for one pyramid, including its confidence intervals. |
col.traj |
Color used for trajectories. If more than one pyramid is drawn with its trajectories, this can be a vector of the size of number of pyramids. |
omit.page.pars |
Logical. If |
lwd |
Line width for the pyramids. |
sort.pi |
Logical controlling if the probability intervals are sorted in decreasing order. This has an effect on the order in which they are plotted and thus on overlapping of pyramid boxes. By default the largest intervals are plotted first. |
main.label |
Optional argument for the main title. |
legend |
Legend to be used. In case of multiple pyramids, this can be a vector for each of them. If not given and |
is.proportion |
Either logical, indicating if the values in |
ages |
Vector of age labels. It must be of the same length as the number of rows of |
pop.max |
Maximum value to be drawn in the pyramid. If it is not given, |
LRmain |
Vector of character strings giving the secondary titles for the left and right panel, respectively. |
LRcolnames |
Vector of character strings giving the column names of data to be used for the left and right panel of the pyramid, respectively. |
CI |
Confidence intervals. It should be of the same format as the |
... |
Arguments passed to the underlying functions. For |
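The mapping between the age indices accepted by the age argument and the corresponding age groups can be sketched in base R as follows. The helper age.index.to.label is hypothetical and not part of bayesPop; it only encodes the rule stated in the description of age (index 27 or 131 denotes the open-ended group 130+).

```r
# Illustrative helper (not part of bayesPop): translate age indices to labels
age.index.to.label <- function(index, annual = FALSE) {
  width <- if (annual) 1 else 5
  last <- if (annual) 131 else 27   # index of the open-ended age group 130+
  stopifnot(all(index >= 1), all(index <= last))
  lower <- (index - 1) * width      # lower bound of each age group
  label <- if (annual) as.character(lower) else paste(lower, lower + 4, sep = "-")
  label[index == last] <- "130+"
  label
}

age.index.to.label(c(1, 2, 27))                     # "0-4" "5-9" "130+"
age.index.to.label(c(1, 2, 3, 131), annual = TRUE)  # "0" "1" "2" "130+"
```

For instance, age = 1:21 in a 5-year simulation would cut a pyramid off after the age group 100-104.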
The pop.pyramid
function generates one or two population pyramids in one plot. The first (main) one is usually the median of a future year prediction, but it can also be the current year or any population estimates. The second one serves the purpose of comparing two pyramids with one another and is drawn on top of the main pyramid. For example, one can use it to compare a future prediction with the present, or two different time points in the past, or two different geographies. The main pyramid can have confidence intervals associated with it, which are also plotted. If pop.pyramid
is called on a bayesPop.prediction
object, the main and secondary pyramids are generated from data of the time periods given by the first and second elements, respectively, of the year
argument. In such a case, confidence intervals are shown only for the first year. Thus, it makes sense to set the first year to be a prediction year and the second year to an observed time period. If pop.pyramid
is called on a bayesPop.pyramid
object, data in the first and second element, respectively, of the bayesPop.pyramid$pyramid
list are used, and only the first element of bayesPop.pyramid$CI
is used.
Pyramids generated via the pop.trajectories.pyramid
function have a different appearance, and therefore more than two pyramids can be put into one figure. Furthermore, confidence intervals of more than one pyramid can be shown. Thus, all elements of bayesPop.pyramid$pyramid
and bayesPop.pyramid$CI
are plotted. In addition, single trajectories given in bayesPop.pyramid$trajectories
can be shown by setting the argument nr.traj
larger than 0.
Both pop.pyramid
and pop.trajectories.pyramid
(if called with a bayesPop.prediction
object) use data from one country.
Functions pop.pyramidAll
and pop.trajectories.pyramidAll
create such pyramids for all countries for which a projection is available and for all years given by the year
argument which should be a list. In this case, one pyramid figure (possibly containing multiple pyramids) is created for each country and each element of the year
list.
The core of these functions operates on a bayesPop.pyramid
object which is automatically created when called with a bayesPop.prediction
object. If used with a user-defined data set, one has to convert such data into bayesPop.pyramid
using the function get.bPop.pyramid
(see an example below). In such a case, one can simply use the plot
function which then calls pop.pyramid
.
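As a rough illustration of the user-defined input mentioned above, the following base R sketch builds a small sex- and age-specific data frame with hypothetical counts. Column names follow the default LRcolnames ("male", "female") and row names carry the age labels. The conversion to proportions divides by the total population of both sexes, which is an assumption made here for illustration and not necessarily how the package computes proportions.

```r
# Hypothetical counts by 5-year age group (not real data)
pop <- data.frame(
  male   = c(310, 325, 340, 298),
  female = c(295, 312, 333, 305),
  row.names = c("0-4", "5-9", "10-14", "15-19")
)

# Proportions relative to the total population of both sexes
# (assumed definition, for illustration only)
pop.prop <- pop / sum(pop)
round(pop.prop, 3)
```

A data frame of this shape is the kind of object that would be passed to get.bPop.pyramid before calling plot on the result.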
pop.pyramid
, pop.trajectories.pyramid
and get.bPop.pyramid
return an object of class bayesPop.pyramid
which is a list with the following components:
label |
Label used for the main title. |
pyramid |
List of pyramid data, one element per pyramid. Each component is a data frame with at least two columns, containing data for the left and right panels of the pyramid. Their names must correspond to |
CI |
List of lists of confidence intervals with one element per pyramid. The order corresponds to the order in the |
trajectories |
List of lists of trajectories with one element per pyramid. As in the case of |
is.proportion |
Logical indicating if values in the various data frames in this object are proportions or raw values. |
is.annual |
Logical indicating if the data correspond to 1-year age groups. If |
pyr.year |
Year of the main pyramid. It is used as the base year when |
pop.max |
Maximum value for the x-axis. |
LRmain |
Vector of character strings determining the titles for the left and right panels, respectively. |
LRcolnames |
Vector of character strings determining the column names in |
Hana Sevcikova, Adrian Raftery, using feedback from Sam Clark and the bayesPop group at the University of Washington.
pop.trajectories.plot
, bayesPop.prediction
, summary.bayesPop.prediction
# pyramids for bayesPop prediction objects
##########################################
sim.dir <- file.path(find.package("bayesPop"), "ex-data", "Pop")
pred <- get.pop.prediction(sim.dir)
pop.pyramid(pred, "Netherlands", c(2045, 2010))
dev.new()
pop.trajectories.pyramid(pred, "NL", c(2045, 2010, 1960), age=1:25, proportion=TRUE)

# using manual manipulation of the data: e.g. show only the prob. intervals
pred.pyr <- get.bPop.pyramid(pred, country="Ecuador", year=2090, age=1:27)
pred.pyr$pyramid <- NULL
plot(pred.pyr, show.birth.year = TRUE)

# pyramids for user-defined data
################################
# this example dataset contains population estimates for the Washington state and King county
# (Seattle area) in 2011
data <- read.table(file.path(find.package("bayesPop"), "ex-data", "popestimates_WAKing.txt"),
            header=TRUE, row.names=1)
# extract data for two pyramids and put it into the right format
head(data)
WA <- data[,c("WA.male", "WA.female")]; colnames(WA) <- c("male", "female")
King <- data[,c("King.male", "King.female")]; colnames(King) <- c("male", "female")

# create and plot a bayesPop.pyramid object
pyramid <- get.bPop.pyramid(list(WA, King), legend=c("Washington", "King"))
plot(pyramid, main="Population in 2011", pyr2.par=list(height=0.7, col="violet", border="violet"))

# show data as proportions and include birth year
pyramid.prop <- get.bPop.pyramid(list(WA, King), is.proportion=NA,
                    legend=c("Washington", "King"), pyr.year = 2011)
pop.pyramid(pyramid.prop, main="Population in 2011 (proportions)",
    pyr1.par=list(col="lightgreen", border="lightgreen", density=2),
    pyr2.par=list(col="darkred", border="darkred"),
    show.birth.year = TRUE)
Obtain projection trajectories of population and vital events/rates. get.pop
gives access to trajectories through a basic component of an expression. get.pop.ex
and get.pop.exba
return results of an expression defined “by time” and “by age”, respectively. get.trajectory.indices
creates a link to the probabilistic components of the projection by providing indices to the trajectories of TFR, e0 and migration. extract.trajectories.eq
returns trajectories (of population or expression) and their indices that are closest to given values or a quantile. Similarly, functions extract.trajectories.ge
and extract.trajectories.le
return trajectories and their indices that are greater than or equal to and less than or equal to, respectively, the given values or a quantile.
pop.trajectories(pop.pred, country, sex = c("both", "male", "female"),
    age = "all", ...)

get.pop(object, pop.pred, aggregation = NULL, observed = FALSE, ...)

get.pop.ex(expression, pop.pred, observed = FALSE, as.dt = FALSE, ...)

get.pop.exba(expression, pop.pred, observed = FALSE, as.dt = FALSE, ...)

get.trajectory.indices(pop.pred, country,
    what = c("TFR", "e0M", "e0F", "migM", "migF"))

extract.trajectories.eq(pop.pred, country = NULL, expression = NULL,
    quant = 0.5, values = NULL, nr.traj = 1, ...)

extract.trajectories.ge(pop.pred, country = NULL, expression = NULL,
    quant = 0.5, values = NULL, all = TRUE, ...)

extract.trajectories.le(pop.pred, country = NULL, expression = NULL,
    quant = 0.5, values = NULL, all = TRUE, ...)
pop.pred |
Object of class |
country |
Name or numerical code of a country. |
sex |
One of “both” (default), “male” or “female”. By default the male and female projections are summed up. |
age |
Either a character string “all” (default) or an integer vector of age indices. In a 5x5 simulation, value 1 corresponds to age 0-4, value 2 corresponds to age 5-9 etc. Last age group |
object |
Character string giving a basic component of an expression (see pop.expressions). |
aggregation |
If the basic component is to be evaluated on an aggregated prediction object, this argument gives the name of the aggregation (corresponds to argument |
observed |
Logical. Determines if the evaluation uses observed data ( |
expression |
Expression defining the trajectories measure. For syntax see |
as.dt |
Logical indicating if the result should be returned as a |
what |
Character string defining the component to which the indices should link. Allowable options are “TFR”, “e0M” (male life expectancy), “e0F” (female life expectancy), “migM” (male migration), “migF” (female migration). |
quant |
Quantile to which the selected trajectories should be closest. |
values |
Vector of values to which the selected trajectories should be closest. If it is not of length 1, it has to be of the same length as the number of projected time periods. If it is not given, |
nr.traj |
Number of trajectories to return. This argument can be passed to any of the functions that accept .... |
all |
Logical indicating if the corresponding condition should apply to all time periods of a trajectory. If it is |
... |
Additional arguments passed to the underlying functions. In case of |
Function pop.trajectories
returns an array of population trajectories for given sex and age.
Function get.pop
evaluates a basic component of an expression and results in a four-dimensional array. Internally, this function is used for evaluation after an expression is decomposed into basic components. It can be useful for example for debugging purposes, to obtain results from parts of an expression. In addition, while pop.trajectories
works only for population counts, get.pop
can be used for obtaining trajectories of vital events and rates. Note that the wildcard “XXX” in the expression cannot be used in get.pop
; use get.pop.ex
or get.pop.exba
instead.
Functions get.pop.ex
and get.pop.exba
evaluate a whole expression; the dimensions of the resulting array are collapsed depending on the specific expression. Use get.pop.ex
if the expected result of the expression does not contain the age dimension, i.e. it uses no brackets or square brackets. If it is not the case, i.e. the expression is defined using curly braces in order to include the age dimension, the get.pop.exba
function is to be used. Argument nr.traj
can be used to restrict the number of trajectories returned. Use one of these functions if results for all countries (i.e. using “XXX”) are desired.
Function get.trajectory.indices
returns an array of indices that link back to the given probabilistic component. It has the same length as the number of trajectories in the prediction object. For example, an array of c(10, 15, 20)
(for a prediction with three trajectories) obtained with what="TFR"
means that the 1st, 2nd and 3rd population trajectory, respectively, were generated with the 10th, 15th and 20th TFR trajectory, respectively. If the input TFR and e0 were generated using bayesTFR
and bayesLife
, functions get.tfr.trajectories
and get.e0.trajectories
can be used to extract the corresponding TFR and e0 trajectories.
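The index mapping can be illustrated without any package code. The following is a minimal base-R sketch: the matrix `tfr` holds made-up values standing in for a set of TFR trajectories, and `idx` repeats the c(10, 15, 20) example from the text. Neither object comes from bayesPop.

```r
# Made-up TFR trajectories (illustration only): 3 time periods x 20 trajectories
set.seed(1)
tfr <- matrix(round(runif(60, 1.5, 2.5), 2), nrow = 3)

# Indices as in the example above: population trajectories 1, 2, 3
# were generated with TFR trajectories 10, 15, 20
idx <- c(10, 15, 20)

# TFR trajectories aligned with the three population trajectories
tfr.used <- tfr[, idx, drop = FALSE]
dim(tfr.used)
```

Column k of `tfr.used` is then the TFR path behind the k-th population trajectory.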
Function extract.trajectories.eq
can be used to select a given number of trajectories of any population quantity, including vital events, that are close to either specific values or to a given quantile. For example, the default setting with quant=0.5
and nr.traj=1
returns the one trajectory that is “closest” to the median projection. As a measure of “closeness” the sum of absolute differences (across all time periods) is used.
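The closeness measure can be sketched in base R as follows. This is an illustration of the sum-of-absolute-differences idea under simple assumptions, not the package's implementation: `closest_trajectories` is a hypothetical helper, and the trajectory matrix contains made-up numbers.

```r
# Made-up trajectories (illustration only): rows are time periods, columns are trajectories
traj <- cbind(c(10, 11, 12), c(10, 13, 16), c(10, 12, 14),
              c(10, 12, 15), c(10, 14, 18))

# Hypothetical helper, not part of bayesPop
closest_trajectories <- function(traj, quant = 0.5, values = NULL, nr.traj = 1) {
  # Target path: the given values, or the per-period quantile across trajectories
  target <- if (is.null(values)) apply(traj, 1, quantile, probs = quant) else values
  # Sum of absolute differences across all time periods, one value per trajectory
  dist <- colSums(abs(traj - target))
  index <- order(dist)[seq_len(nr.traj)]
  list(trajectories = traj[, index, drop = FALSE], index = index)
}

res <- closest_trajectories(traj)
res$index                                      # 4
closest_trajectories(traj, nr.traj = 2)$index  # 4 3
```

Here the per-period median is (10, 12, 15), which coincides with the fourth trajectory, so its distance is zero.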
Similarly, function extract.trajectories.ge
(extract.trajectories.le
) selects all trajectories that are greater (less) than or equal to the specific values or a given quantile. The argument all
specifies whether the greater/less condition must hold for all time periods of the selected trajectories or for at least one time period.
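The difference between requiring the condition in all time periods and in at least one can be sketched in base R. The helper `select_ge` and its argument `all.periods` are hypothetical names chosen for this illustration (the latter mirrors the role of `all`), and the numbers are made up.

```r
# Made-up trajectories (illustration only): rows are time periods, columns are trajectories
traj <- cbind(c(9, 11, 12), c(11, 13, 16), c(10, 12, 14),
              c(10, 12, 15), c(12, 14, 18))
threshold <- c(10, 12, 15)   # one value per time period

# Hypothetical helper, not part of bayesPop
select_ge <- function(traj, threshold, all.periods = TRUE) {
  above <- traj >= threshold   # compared period by period within each trajectory
  hits <- colSums(above)       # number of periods satisfying the condition
  index <- if (all.periods) which(hits == nrow(traj)) else which(hits > 0)
  list(trajectories = traj[, index, drop = FALSE], index = index)
}

select_ge(traj, threshold)$index                       # 2 4 5
select_ge(traj, threshold, all.periods = FALSE)$index  # 2 3 4 5
```

The third trajectory meets the threshold in the first two periods only, so it is selected only when a single period suffices.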
Function pop.trajectories
returns a two-dimensional array (time x trajectory).
Function get.pop
returns an array of four dimensions (country x age x time x trajectory). See pop.expressions for more details.
Functions get.pop.ex
and get.pop.exba
return an array of trajectories. Its dimensions depend on the expression and whether it is evaluated on observed data or projections. If as.dt
is TRUE
these functions return data.table
objects in long format.
Function get.trajectory.indices
returns a 1-d array of indices. If the given component is deterministic, it returns NULL
.
Functions extract.trajectories.eq
, extract.trajectories.ge
, extract.trajectories.le
return a list with two components. trajectories
: 2-d array of trajectories; index
: indices of the selected trajectories relative to the whole set of available trajectories.
Hana Sevcikova
sim.dir <- file.path(find.package("bayesPop"), "ex-data", "Pop")
pred <- get.pop.prediction(sim.dir, write.to.cache=FALSE)

# observed female of Netherlands by age; 1x21x15x1 array
popFNL <- get.pop("PNL_F{}", pred, observed=TRUE)

# observed population for all countries in the prediction object,
# here 2 countries; 2x1x15x1 array
popAll <- get.pop("PXXX", pred, observed=TRUE)

# future migration for all countries in the prediction object,
# here 2 countries; 2x17 array
migAll <- get.pop.ex("GXXX", pred)

# projection population for Ecuador with 3 trajectories;
# 1x1x17x3 array
popEcu <- get.pop("P218", pred, observed=FALSE)
# the above is equivalent to
popEcu2 <- pop.trajectories(pred, "Ecuador")

# Expression "PNL_F{} / PNL_M{}" evaluated on projections
# is internally replaced by
FtoM <- get.pop("PNL_F{}", pred) / get.pop("PNL_M{}", pred)
# should return the same result as
FtoMa <- get.pop.exba("PNL_F{} / PNL_M{}", pred)

# the same expression by time (summed over ages)
FtoMt <- get.pop.ex("PNL_F / PNL_M", pred)

# the example simulation was generated with 3 TFR trajectories ...
get.trajectory.indices(pred, "Netherlands", what="TFR")
# ... and 1 e0 trajectory
get.trajectory.indices(pred, "Netherlands", what="e0M")

# The three trajectories of the population ratio of Ecuador to Netherlands
get.pop.ex("PEC/PNL", pred)

# Returns the trajectory closest to the upper 80% bound, including the corresponding index
extract.trajectories.eq(pred, expression="PEC/PNL", quant=0.9)

# Returns the median trajectory and the high variant, including the corresponding index
extract.trajectories.ge(pred, expression="PEC/PNL", quant=0.45)
These functions plot and tabulate the distribution of the population projection for a given country, or for all countries, including the median and given probability intervals.
pop.trajectories.plot(pop.pred, country = NULL, expression = NULL, pi = c(80, 95),
    sex = c("both", "male", "female"), age = "all", sum.over.ages = TRUE,
    half.child.variant = FALSE, nr.traj = NULL, typical.trajectory = FALSE,
    main = NULL, dev.ncol = 5, lwd = c(2, 2, 2, 2, 1),
    col = c("black", "red", "red", "blue", "#00000020"),
    show.legend = TRUE, ann = par("ann"), xshift = 0, ...)

pop.trajectories.plotAll(pop.pred,
    output.dir=file.path(getwd(), "pop.trajectories"),
    output.type="png", expression = NULL, verbose=FALSE, ...)

pop.trajectories.table(pop.pred, country = NULL, expression = NULL, pi = c(80, 95),
    sex = c("both", "male", "female"), age = "all",
    half.child.variant = FALSE, xshift = 0, ...)

pop.byage.plot(pop.pred, country = NULL, year = NULL, expression = NULL,
    pi = c(80, 95), sex = c("both", "male", "female"),
    half.child.variant = FALSE, nr.traj = NULL, typical.trajectory=FALSE,
    xlim = NULL, ylim = NULL, xlab = "", ylab = "Population projection",
    main = NULL, lwd = c(2,2,2,1), col = c("red", "red", "blue", "#00000020"),
    show.legend = TRUE, add = FALSE, ann = par("ann"), type = "l", pch = NA,
    pt.cex = 1, ...)

pop.byage.plotAll(pop.pred, output.dir=file.path(getwd(), "pop.byage"),
    output.type="png", expression = NULL, verbose=FALSE, ...)

pop.byage.table(pop.pred, country = NULL, year = NULL, expression = NULL,
    pi = c(80, 95), sex = c("both", "male", "female"),
    half.child.variant = FALSE)
pop.pred |
Object of class |
country |
Name or numerical code of a country. It can also be given as ISO-2 or ISO-3 characters. |
expression |
Expression defining the population measure to be plotted. For syntax see |
pi |
Probability interval. It can be a single number or an array. |
sex |
One of “both” (default), “male” or “female”. By default the male and female projections are summed up. |
age |
Either a character string “all” (default) or an integer vector of age indices. In a five year simulation, value 1 corresponds to age 0-4, value 2 corresponds to age 5-9 etc. Last age group |
sum.over.ages |
Logical. If |
half.child.variant |
Logical. If TRUE the United Nations “+/-0.5 child” variant computed with fertility |
nr.traj |
Number of trajectories to be plotted. If |
typical.trajectory |
Logical. If |
xlim , ylim , xlab , ylab , main , ann , pt.cex
|
Graphical parameters passed to the |
xshift |
Constant added to the x-axis (year). |
dev.ncol |
Number of columns for the graphics device if |
lwd , col
|
For the first three functions it is a vector of five elements giving the line width and color for: 1. observed data, 2. median, 3. quantiles, 4. half-child variant, 5. trajectories. For functions that show results by age it is a vector of four elements - as above without the first item (observed data). |
type , pch
|
Currently works for plotting by age only. It is a vector of four elements giving the plot type and point type for: 1. median, 2. quantiles, 3. half-child variant, 4. trajectories. The last element of the array is recycled. |
show.legend |
Logical controlling whether the legend should be drawn. |
... |
Additional graphical arguments. Functions |
output.dir |
Directory into which resulting graphs are stored. |
output.type |
Type of the resulting files. It can be “png”, “pdf”, “jpeg”, “bmp”, “tiff”, or “postscript”. |
verbose |
Logical switching log messages on and off. |
year |
Any year within the time period to be output. |
add |
Logical specifying if the plot should be added to an existing plot. |
pop.trajectories.plot
plots trajectories of population projection by time for a given country. pop.trajectories.table
gives the same output as a table. pop.trajectories.plotAll
creates a set of graphs (one per country) that are stored in output.dir
. The projections can be visualized separately for each sex and age group, or summed up over both sexes and/or given age groups. This is controlled by the arguments sex
, age
and sum.over.ages
.
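As a small aid for choosing the age argument in a five-year simulation, the mapping from age indices to age groups described above can be written down in base R. The helper `age_label` is hypothetical and covers only the closed five-year groups; the open-ended last age group is not handled.

```r
# Hypothetical helper, not part of bayesPop:
# label of the 5-year age group that corresponds to an age index
age_label <- function(index) {
  lower <- 5 * (index - 1)
  paste(lower, lower + 4, sep = "-")
}

age_label(1:2)          # "0-4" "5-9"
age_label(c(4, 10))     # "15-19" "45-49"
```

Indices 4 to 10 thus span ages 15 to 49, which is the range used in the child-bearing-ages example expression "PEC_F[4:10]".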
pop.byage.plot
and pop.byage.table
plot/tabulate the posterior distribution by age for a given country and time period. pop.byage.plotAll
creates such plots for all countries.
The median and given probability intervals are computed using all available trajectories. Thus, nr.traj
does not influence those values - it is used only to control the number of trajectories plotted.
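How probability intervals such as pi = c(80, 95) translate into quantiles can be sketched in base R. This is an illustration on made-up trajectories, assuming the usual equal-tailed intervals (so the 80% interval is bounded by the 0.1 and 0.9 quantiles); it is not the package's internal code, and all object names are chosen for this example.

```r
# Made-up trajectories (illustration only): 4 time periods x 200 trajectories
set.seed(42)
traj <- apply(matrix(rnorm(800, mean = 0.5), nrow = 4), 2, cumsum) + 100

pi.levels <- c(80, 95)
alpha <- (1 - pi.levels / 100) / 2          # tail probability on each side
probs <- sort(c(alpha, 0.5, 1 - alpha))     # 0.025, 0.1, 0.5, 0.9, 0.975

# Median and interval bounds per time period, computed over ALL trajectories
bounds <- apply(traj, 1, quantile, probs = probs)

# A subset of trajectories, as one might draw for display only;
# it plays no role in the computation of 'bounds'
shown <- traj[, seq_len(10), drop = FALSE]
```

The rows of `bounds` hold the lower 95, lower 80, median, upper 80 and upper 95 values, one column per time period.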
If plotting the results of an expression fails, debug it by obtaining the values of that expression with the functions get.pop.ex
(for pop.trajectories.plot
) and get.pop.exba
(for pop.byage.plot
).
Hana Sevcikova
bayesPop.prediction
, summary.bayesPop.prediction
, pop.pyramid
, pop.expressions
, get.pop
sim.dir <- file.path(find.package("bayesPop"), "ex-data", "Pop")
pred <- get.pop.prediction(sim.dir)
pop.trajectories.plot(pred, country="Ecuador", pi=c(80, 95))
pop.trajectories.table(pred, country="ECU", pi=c(80, 95))

# female population of Ecuador in child bearing ages (by time)
pop.trajectories.plot(pred, expression="PEC_F[4:10]")

# Population by age in Netherlands for two different years
pop.byage.plot(pred, country="Netherlands", year=2050)
pop.byage.plot(pred, expression="PNL{}", year=2000)
Projections of the percent age-specific fertility rate (PASFR) are normally computed within the pop.predict
function for each trajectory. The functions described here allow projecting PASFR outside of population projections, for the median total fertility rate (TFR) or a user-provided TFR, and exporting it.
project.pasfr(inputs = NULL, present.year = 2020, end.year = 2100,
    wpp.year = 2019, annual = FALSE,
    nr.est.points = if(annual) 15 else 3,
    digits = 2, out.file.name = "percentASFR.txt", verbose = FALSE)

project.pasfr.traj(inputs = NULL, countries = NULL, nr.traj = NULL,
    present.year = 2020, end.year = 2100, wpp.year = 2019, annual = FALSE,
    nr.est.points = if(annual) 15 else 3,
    digits = 2, out.file.name = "percentASFRtraj.txt", verbose = FALSE)
inputs |
List of input data (file names) with the same meaning as in |
present.year |
Year of the last observed data point. |
end.year |
End year of the projection. |
wpp.year |
Year for which WPP data is used if one of the |
annual |
Logical that should be |
nr.est.points |
Number of time points to be used for estimating the continuation of the observed PASFR trend. By default it is 15 years, corresponding to three time points for 5-year data. |
digits |
Number of decimal places in the results. |
out.file.name |
Name of the resulting file. If |
verbose |
Logical switching verbose messages on and off. |
countries |
Vector of numerical country codes. By default the function is applied to all countries. |
nr.traj |
Number of trajectories on which the function should be applied. By default all trajectories are taken. Otherwise they are thinned appropriately. |
If the input TFR is given as an ASCII file (in inputs$tfr.file
), it can be either a CSV (comma-separated) file in long format with columns “LocID”, “Year”, “Trajectory” and “TF”, or a tab-separated file in wide format with a column “country_code” and each year or time period as a separate column (see tfr
). In the latter case, an additional inputs
entry tfr.file.type = "w"
must be provided to indicate that the file is in wide format, which is the case when there is only one trajectory. Note that the TFR input should cover the whole projection period as well as the observed TFR, because the function assesses the start of Phase III, which could be in the past.
If observed PASFR is given (in inputs$pasfr
), it is a tab-separated file in wide format as in percentASFR
. Fertility age patterns can be controlled by country via the inputs$patterns
entry, which is a dataset in the same format and meaning as vwBaseYear
.
In addition, if the present year differs by country, the inputs
list accepts the entry last.observed
, which is a tab-separated file with columns “country_code” and “last.observed”. It can contain the year of the last observed time period for each country.
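A file of this shape can be produced with base R. The sketch below is only an illustration of the described layout: the country codes 218 (Ecuador) and 528 (Netherlands) and the years are example values, and the file is written to a temporary location.

```r
# Example values only: last observed year per country
last.obs <- data.frame(country_code = c(218, 528),
                       last.observed = c(2019, 2020))

# Tab-separated file with the two columns named in the text
last.obs.file <- tempfile(fileext = ".txt")
write.table(last.obs, file = last.obs.file, sep = "\t",
            row.names = FALSE, quote = FALSE)

# The file name would then be supplied as the last.observed entry of inputs
inputs <- list(last.observed = last.obs.file)
readLines(inputs$last.observed)
```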
In the project.pasfr
function, if the TFR input (given either as a long file or as a simulation directory) contains more than one trajectory, the median is derived over the trajectories for each time period. Then, PASFR corresponding to this median is projected using the method from Sevcikova et al. (2016).
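The median step can be illustrated on a small long-format table with the columns named above. This base-R sketch uses made-up TFR values and is not the package's internal code; it only shows what "median over the trajectories for each time period" means for such data.

```r
# Made-up long-format TFR input (illustration only):
# 2 locations x 2 years x 3 trajectories
tfr.long <- data.frame(
  LocID      = rep(c(218, 528), each = 6),
  Year       = rep(rep(c(2025, 2030), each = 3), times = 2),
  Trajectory = rep(1:3, times = 4),
  TF         = c(2.3, 2.4, 2.5, 2.1, 2.2, 2.6,
                 1.6, 1.7, 1.5, 1.6, 1.8, 1.7)
)

# Median across trajectories for each location and year
tfr.median <- aggregate(TF ~ LocID + Year, data = tfr.long, FUN = median)
tfr.median
```

The result has one row per location and year, which is the single-trajectory input that the PASFR projection would then be based on.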
For project.pasfr.traj
, the PASFR is projected for single trajectories of TFR.
Returns an invisible data frame with the projected PASFR.
Hana Sevcikova, Igor Ribeiro
H. Sevcikova, N. Li, V. Kantorova, P. Gerland and A. E. Raftery (2016). Age-Specific Mortality and Fertility Rates for Probabilistic Population Projections. In: Dynamic Demographic Analysis, ed. Schoen R. (Springer), pp. 285-310. Earlier version in arXiv:1503.05215.
# using TFR in simulation directory
inputs <- list(tfr.sim.dir=file.path(find.package("bayesTFR"), "ex-data", "bayesTFR.output"))
pasfr <- project.pasfr(inputs, out.file.name = NULL)
head(pasfr)

## Not run:
pasfr.traj <- project.pasfr.traj(inputs, out.file.name = NULL)
head(pasfr.traj)
## End(Not run)

# using TFR in wide-format file
inputs2 <- list(tfr.file = file.path(find.package("wpp2019"), "data", "tfrprojMed.txt"),
    tfr.file.type = "w")
pasfr2 <- project.pasfr(inputs2, out.file.name = NULL)
head(pasfr2)
Age-specific schedules of the inflow and outflow migration distribution used as input for the FDM method. rc1FDM
corresponds to 1-year ages, while rc5FDM
corresponds to 5-year age groups.
data(rc1FDM)
data(rc5FDM)
A data frame where countries and ages are rows. It has four columns:
country_code
Numerical Location Code (3-digit codes following ISO 3166-1 numeric standard) - see https://en.wikipedia.org/wiki/ISO_3166-1_numeric.
age
Either single ages from 0 to 100 (rc1FDM
) or 5-year age groups, such as “0-4”, “5-9”, ..., “100+” (rc5FDM
).
These datasets are used as the default datasets in pop.predict
if mig.age.method
is either “fdmp” or “fdmnop” and the inputs
item “mig.fdm” is not given. Other default parameters of the FDM method are read from the vwBaseYear
dataset.
Most of the values were provided by the United Nations Population Division.
H. Sevcikova, J. Raymer, A. E. Raftery (2024). Forecasting Net Migration By Age: The Flow-Difference Approach. arXiv:2411.09878.
data(rc1FDM)
head(rc1FDM)
Summary of an object of class bayesPop.prediction
created using the pop.predict
function. The summary contains the mean, standard deviation and several commonly used quantiles of the simulated trajectories.
## S3 method for class 'bayesPop.prediction'
summary(object, country = NULL, sex = c("both", "male", "female"),
    compact = TRUE, ...)
object |
Object of class |
country |
Country name or code. It can also be given as ISO-2 or ISO-3 characters. If it is |
sex |
One of “both” (default), “male”, or “female”. If it is not “both”, the summary is given for sex-specific trajectories. |
compact |
Logical switching between a smaller and larger number of displayed quantiles. |
... |
A list of further arguments. |
Hana Sevcikova
sim.dir <- file.path(find.package("bayesPop"), "ex-data", "Pop")
pred <- get.pop.prediction(sim.dir)
summary(pred, "Netherlands")
Datasets giving information on the base year and type of migration for each country. The 2012, 2015, 2017, 2019, 2022 and 2024 datasets also give country-specific information regarding mortality, fertility and migration age patterns.
data(vwBaseYear2024)
data(vwBaseYear2022)
data(vwBaseYear2019)
data(vwBaseYear2017)
data(vwBaseYear2015)
data(vwBaseYear2012)
data(vwBaseYear2010)
A data frame containing the following variables:
country_code
Numerical Location Code (3-digit codes following ISO 3166-1 numeric standard) - see https://en.wikipedia.org/wiki/ISO_3166-1_numeric.
country
Country name. Not used by the package.
isSmall
UN internal code. Not used by the package.
ProjFirstYear
The base year of migration.
MigCode
Type of migration. Zero means migration is evenly distributed over each time interval. Code 9 means migration is captured at the end of each interval.
WPPAIDS
Dummy indicating if the country has a generalized HIV/AIDS epidemic.
AgeMortalityType
Type of mortality age pattern. Only relevant for countries with the entry “Model life tables”. In such a case, the Lee-Carter parameter is not estimated from historical data. Instead, it is taken from the dataset
MLTbx
using a pattern given in the AgeMortalityPattern
column.
AgeMortalityPattern
If AgeMortalityType
is equal to “Model life tables”, this value determines which pattern is selected from the
MLTbx
dataset. It must correspond to one of the rownames of MLTbx
, e.g. “CD East”, “CD West”, “UN Latin American”.
AgeMortProjMethod1
Method for projecting age-specific mortality rates. It is one of “LC” (modified Lee-Carter, uses function mortcast
), “PMD” (pattern mortality decline, uses function copmd
), “modPMD” (modified pattern mortality decline, uses function copmd(... use.modpmd = TRUE)
), “MLT” (model life tables, uses function mlt
), “LogQuad” (log quadratic method, uses function logquad
), or “HIVmortmod” (HIV model life tables as implemented in the HIV.LifeTables package which can be installed from the PPgP/HIV.LifeTables GitHub repo).
AgeMortProjMethod2
If the mortality rates are to be projected via a blend of two methods (see mortcast.blend
), this column determines the second method. The options are the same as in the column AgeMortProjMethod1
.
AgeMortProjPattern
If one of the AgeMortProjMethodX
columns contains the “MLT” method, this column determines the type of the life table (see the argument type
in the mlt
function).
AgeMortProjMethodWeights
If the mortality rates are to be projected via a blend of two methods, this column determines the weights in the first and the last year of the projection, respectively. It should be given as an R vector, e.g. “c(1, 0.5)” (see the argument weights
in mortcast.blend
).
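To make the weight specification concrete, the following base-R sketch turns the string form into one weight per projection year. It rests on assumptions that are not stated in the text: that the weights between the first and the last year are interpolated linearly, and that the second method receives one minus the weight of the first. The projection years are example values.

```r
# Weight specification as it could appear in the column (example from the text)
w.spec <- "c(1, 0.5)"
w <- eval(parse(text = w.spec))   # numeric vector: first and last year weight

# Example projection years (assumption for illustration)
years <- seq(2025, 2100, by = 5)

# ASSUMPTION: linear interpolation between the first and the last weight
w1 <- seq(w[1], w[2], length.out = length(years))
# ASSUMPTION: the second method is weighted by the complement
w2 <- 1 - w1

data.frame(year = years, method1 = w1, method2 = w2)
```

With “c(1, 0.5)”, the projection would start fully on the first method and end as an equal blend of both.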
AgeMortProjAdjSR
Code determining how the “PMD” method should be adjusted if it's used. 0 means no adjustment, 1 means the argument sexratio.adjust
in copmd
is set to TRUE
, and code 3 means that the argument adjust.sr.if.needed
in copmd
is set to TRUE
.
LatestAgeMortalityPattern
, LatestAgeMortalityPattern1
Indicator for how many of the latest time periods of historical mortality rates should be averaged to compute the Lee-Carter and modPMD parameter. If the value is zero, all time periods are used. If it is one, only the latest time period is used. If it is negative, the corresponding number of latest time periods is excluded. The value can also be a vector whose first element is either negative or zero. If it is negative, the vector must have exactly two elements: the first determines how many of the latest time periods are excluded, while the second (which must be positive) determines how many of the latest remaining time periods are included. If the vector starts with a zero, the following numbers are interpreted as individual indices of time periods, counted from the latest time point. Here are a few examples, assuming the available mortality rates are on an annual scale, from 1950 to 2023:
using all years from 1950 to 2023
using 2023, 2022, 2021
using 1950 - 2020
2023 and 2022 are excluded; using 2021, 2020, 2019
invalid specification - must have two elements if it starts with a negative
interpreted as an individual index; thus, using 2021 only
interpreted as individual indices; using 2023, 2021, 2020
If the LatestAgeMortalityPattern1
column is present, it should contain values related to an annual simulation (1x1) while the LatestAgeMortalityPattern
column relates to a 5x5 simulation.
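The selection rules can be sketched as a small base-R function. This is an illustration only: `latest_periods` is a hypothetical helper, not the package's code, and the input values used below (0, 3, -3, c(-2, 3), c(-2, 3, 1), c(0, 3), c(0, 1, 3, 4)) are inferred from the outcomes listed in the examples rather than taken from the text.

```r
# Hypothetical helper, not part of bayesPop:
# which time periods a specification selects from the available ones
latest_periods <- function(spec, years) {
  n <- length(years)
  if (length(spec) == 1) {
    if (spec == 0) return(years)                     # all periods
    if (spec > 0) return(rev(years)[seq_len(spec)])  # the latest 'spec' periods
    return(years[seq_len(n + spec)])                 # drop the latest |spec| periods
  }
  if (spec[1] < 0) {
    if (length(spec) != 2)
      stop("a vector starting with a negative must have two elements")
    remaining <- rev(years)[-seq_len(-spec[1])]      # exclude the latest periods
    return(remaining[seq_len(spec[2])])              # keep the latest of the rest
  }
  if (spec[1] == 0) return(rev(years)[spec[-1]])     # indices counted from the latest
  stop("a vector must start with a negative number or a zero")
}

years <- 1950:2023
range(latest_periods(0, years))       # 1950 2023
latest_periods(3, years)              # 2023 2022 2021
range(latest_periods(-3, years))      # 1950 2020
latest_periods(c(-2, 3), years)       # 2021 2020 2019
latest_periods(c(0, 3), years)        # 2021
latest_periods(c(0, 1, 3, 4), years)  # 2023 2021 2020
```

A call such as latest_periods(c(-2, 3, 1), years) stops with an error, matching the invalid case in the list.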
SmoothLatestAgeMortalityPattern
If LatestAgeMortalityPattern
is not zero, this column indicates if the resulting pattern should be smoothed.
SmoothDFLatestAgeMortalityPattern
, SmoothDFLatestAgeMortalityPattern1
Degrees of freedom for smoothing. By default (value 0), half of the number of age groups is used. If the
SmoothDFLatestAgeMortalityPattern1
column is present, it should contain values related to a 1x1 simulation while the SmoothDFLatestAgeMortalityPattern
column relates to a 5x5 simulation.
PasfrNorm
Type of norm for computing the age-specific fertility pattern to which the country belongs. Currently only “GlobalNorm” is used.
PasfrGlobalNorm, PasfrFarEastAsianNorm, PasfrSouthAsianNorm
Dummies indicating which countries to include when computing the specific norms.
Available in the 2024 dataset. These are parameters of the Flow Difference Method to generate age-specific net migration patterns (Sevcikova et al., 2024). They correspond to the intercept, slope, minimum flow rate, female sex ratio for the in-flow and out-flow, respectively.
There is one record for each country. See Sevcikova et al. (2016) on how information from the various columns is used for projections.
Data provided by the United Nations Population Division.
H. Sevcikova, N. Li, V. Kantorova, P. Gerland and A. E. Raftery (2016). Age-Specific Mortality and Fertility Rates for Probabilistic Population Projections. In: Dynamic Demographic Analysis, ed. Schoen R. (Springer), pp. 285-310. Earlier version in arXiv:1503.05215.
H. Sevcikova, J. Raymer, A. E. Raftery (2024). Forecasting Net Migration By Age: The Flow-Difference Approach. arXiv:2411.09878.
data(vwBaseYear2019)
str(vwBaseYear2019)
Functions for creating ASCII files containing projection summaries, such as the median and the lower and upper bounds of the 80% and 95% probability intervals, as well as files containing individual trajectories.
write.pop.projection.summary(pop.pred, what = NULL, expression = NULL,
    output.dir = NULL, ...)

write.pop.trajectories(pop.pred, expression = "PXXX",
    output.file = "pop_trajectories.csv", byage = FALSE, observed = FALSE,
    wide = FALSE, digits = NULL, include.name = FALSE, sep = ",",
    na.rm = TRUE, ...)
pop.pred |
Object of class |
what |
A character vector specifying what kind of projection to write. Total population is specified by “pop”. Vital events are specified by “births”, “deaths”, “sr” (survival rate), “fertility” and “pfertility” (percent fertility). Each of these strings can (some must) have a suffix “sex” and/or “age” if sex- and/or age-specific measure is desired. For example, “popage”, “birthssexage”, “deaths”, “deathssex”, are all valid values. Note that for survival, only “srsexage” is allowed. For percent fertility, only “pfertilityage” is allowed. Suffix “sex” cannot be used in combination with “fertility”. Moreover, “fertility” (without age) corresponds to the total fertility rate. If the argument is |
expression |
Expression defining the measure to be written. If it is not |
output.dir |
Directory in which the resulting files will be stored. If |
output.file |
File name to write the trajectories into. |
byage |
Logical indicating if the expression is defined by age, i.e. if it includes curly braces ( |
observed |
Logical indicating if observed data should be written ( |
wide |
Logical indicating if the data format should be wide. By default, trajectories are written in long format. |
digits |
Number of decimal digits to which the indicator should be rounded. By default no rounding takes place. |
include.name |
Logical indicating if country names should be included in the dataset. |
sep |
The field separator string. |
na.rm |
Logical indicating if records with |
... |
For
For |
The write.pop.projection.summary
function creates one file per value of what
, or expression
, called ‘projection_summary_’suffix‘.csv’, where suffix is either what
or, if an expression is given, the value of file.suffix
. It is a comma-separated table with the following columns:
“country_name”: country name
“country_code”: country code
“variant”: name of the variant, such as “median”, “lower 80”, “upper 80”, “lower 95”, “upper 95”
period1: e.g. “2005-2010”, or “2010”: Given population measure for the first time period
period2: e.g. “2010-2015”, or “2015”: Given population measure for the second time period
... further time period columns
If expression
is given, expression.label
(by default the full expression) is written as the first line of the file starting with #. The file contains one line per country, and possibly sex and age.
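Reading such a summary file back into R can be sketched with base functions. The file below is a made-up stand-in that follows the described layout (a first line starting with # followed by a comma-separated table); the assumption that the table has a header row, the period column names and all values are illustrative and not output of the package.

```r
# Made-up stand-in for a summary file (illustration only)
f <- tempfile(fileext = ".csv")
writeLines(c(
  "# PXXX[14:27] / PXXX",
  "country_name,country_code,variant,2020,2025",
  "Ecuador,218,median,0.08,0.09",
  "Ecuador,218,lower 80,0.07,0.08",
  "Ecuador,218,upper 80,0.09,0.10"
), f)

# The expression label sits on the first line, after the leading '#'
label <- sub("^# *", "", readLines(f, n = 1))

# Skip the label line and keep period column names such as "2020" unchanged
summ <- read.csv(f, comment.char = "#", check.names = FALSE)

# Rows of one variant, e.g. the median
med <- summ[summ$variant == "median", ]
```

Setting check.names = FALSE avoids R rewriting numeric-looking column names, so the period columns can be addressed as summ[["2025"]].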
Function write.pop.trajectories
writes out all trajectories, either in long format (default) or, if wide = TRUE
, in wide format (years become columns).
If the expression
argument is used, the same applies as for pop.map
in terms of Performance and Caching.
Hana Sevcikova
pop.predict
, pop.map
, pop.expressions
outdir <- tempfile()
dir.create(outdir)
sim.dir <- file.path(find.package("bayesPop"), "ex-data", "Pop")
pred <- get.pop.prediction(sim.dir=sim.dir, write.to.cache=FALSE)

# proportion of 65+ years old to the whole population
write.pop.projection.summary(pred, expression="PXXX[14:27] / PXXX", file.suffix="age65plus",
    output.dir=outdir, include.observed=TRUE, digits=2)

# various measures
write.pop.projection.summary(pred, what=c("pop", "popsexage", "popsex"), output.dir=outdir)

unlink(outdir, recursive=TRUE)