import_survey.Rd
The survey information should be in TOML format, with fields corresponding to survey design components. For example,
strata = strata_var
clusters = cluster_var
weights = wt_var
import_survey(file, data, read_fun, ...)
make_survey(.data, spec)
file: the file containing survey information (see Details)
data: optional; if supplied, the survey object will be created with the supplied data. Can be either a data.frame-like object, or a path to a data set which will be imported using iNZightTools::smart_read.
read_fun: function required to load the data specified in file
...: additional arguments to read_fun
.data: a data.frame
spec: an inzsvyspec object
An inzsvyspec object containing the design parameters and, if data is supplied, the created survey object. The object is a list containing at least a 'spec' component, and if data is supplied then also 'data' and 'design' components.
For replicate weight designs, vectors (if necessary) are declared with square brackets, like so:
repweights = ['w01', 'w02', 'w03', 'w04', ..., 'w20']
although this would be better expressed using a regular expression,
repweights = '^w[0-2]'
which matches all variables whose names start with a w followed by a digit between 0 and 2 (inclusive).
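To see which columns such a pattern would select, the match can be previewed in base R before writing the specification. This is only a minimal sketch: the column names below are hypothetical, and it assumes the pattern is applied to the variable names in the same way grep() would apply it.

```r
# Hypothetical column names: an id, a stratum, a sampling weight,
# and 20 replicate weights named w01, ..., w20
vars <- c("id", "stype", "pw", sprintf("w%02d", 1:20))

# Preview the columns that the pattern '^w[0-2]' would select
rep_vars <- grep("^w[0-2]", vars, value = TRUE)
print(rep_vars)
```

Because the pattern is anchored with ^, pw is not selected. A column such as w2_old would be, however, so a stricter pattern like '^w[0-9]{2}$' may be safer when other variable names share the prefix.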
Additionally, the information can contain a file specification indicating the path to the data, which will be imported (if it exists in the same directory as file) using read_fun, if specified; or alternatively file can be a URL to a data file that will be downloaded and read, again using read_fun.
If the data is not specified to import_survey, then make_survey can be used to manually construct an inzsvyspec object with the design attached. This might be useful if you have multiple datasets with the same design, for example.
import_survey calls make_survey when data is provided.
import_survey(): Import survey information from a file
make_survey(): Construct a survey object from a data set and an inzsvyspec object
The survey design specification file used by 'surveyspec' should be in TOML format. This allows for a very human-readable syntax,
arg = "value"
and also supports more complex structures where needed (for example, when specifying calibration information).
For stratified and cluster samples, each argument to survey::svydesign can be given on its own line. So for a stratified sample using the apistrat data from the 'survey' package, the following specification would suffice:
strata = "stype"
weights = "pw"
fpc = "fpc"
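As a rough sketch of how such a hand-written specification might be used, the lines can be written to a file and passed to import_survey() together with the data. This assumes the functions are provided by the 'surveyspec' package and that the 'survey' package is installed; the temporary file and its .svydesign extension are illustrative choices, not requirements stated here.

```r
library(survey)      # provides the apistrat data
library(surveyspec)  # assumed to provide import_survey()

data(api)

# Write the three-line stratified specification to a temporary file
f <- tempfile(fileext = ".svydesign")
writeLines(c(
  'strata = "stype"',
  'weights = "pw"',
  'fpc = "fpc"'
), f)

# Read the specification and attach the data in one step
svy <- import_survey(f, data = apistrat)
svy
```

Supplying data directly means the returned object should already carry the 'design' component; omitting it would return only the specification, to be combined with a data set later via make_survey().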
For a cluster sample, we instead can provide either clusters or ids (the former makes it more obvious to beginners, while the latter is consistent with svydesign()).
For example, specifying the design for the apiclus2 data:
clusters = "dnum + snum"
fpc = "fpc1 + fpc2"
Alternatively, survey data may be distributed with replicate weights.
To specify this information to import_survey(), the same concept is used, but the arguments supplied should be based on those used in survey::svrepdesign().
For example (taken from ?svrepdesign):
weights = "pw"
repweights = "wt[1-9]+"
type = "JK1"
scale = "~(1-15/757)*14/15"
combined = FALSE
Note here that you can specify an expression for scale, but you need to use this syntax, "~expr", for it to be parsed correctly.
Finally, survey design calibration can be performed by including this information using TOML table syntax. For example, to calibrate the 'apistrat' data,
strata = "stype"
weights = "pw"
fpc = "fpc"
[calibrate.stype]
E = 4421
H = 755
M = 1018
[calibrate."sch.wide"]
"No" = 1072
"Yes" = 5122
Note the quotes around variable names that include a period (.), here sch.wide. Currently, only calibrating by a factor is possible.
library(survey)
#> Loading required package: grid
#> Loading required package: Matrix
#> Loading required package: survival
#>
#> Attaching package: ‘survey’
#> The following object is masked from ‘package:graphics’:
#>
#> dotchart
data(api)
dstrat <- svydesign(ids = ~1, strata = ~stype, weights = ~pw,
fpc = ~fpc, data = apistrat)
f <- tempfile(fileext = ".svydesign")
write_spec(dstrat, f)
cat(readLines(f), sep = "\n")
#> strata = "stype"
#> fpc = "fpc"
#> weights = "pw"
#> type = "survey"
(spec <- import_survey(f))
#> Survey design specification:
#> * ids: 1
#> * strata: stype
#> * fpc: fpc
#> * weights: pw
#> * type: survey
#> * survey_type: survey
#> * calfun: linear
#>
#> Design object: empty
(svy <- make_survey(apistrat, spec))
#> Survey design specification:
#> * ids: 1
#> * strata: stype
#> * fpc: fpc
#> * weights: pw
#> * type: survey
#> * survey_type: survey
#> * calfun: linear
#>
#> Design object:
#> Stratified Independent Sampling design
#> survey::svydesign(ids = ~1, strata = ~stype, fpc = ~fpc, weights = ~pw,
#> data = apistrat)
# or all in one:
(svy <- import_survey(f, data = apistrat))
#> Survey design specification:
#> * ids: 1
#> * strata: stype
#> * fpc: fpc
#> * weights: pw
#> * type: survey
#> * survey_type: survey
#> * calfun: linear
#>
#> Design object:
#> Stratified Independent Sampling design
#> survey::svydesign(ids = ~1, strata = ~stype, fpc = ~fpc, weights = ~pw,
#> data = data)