MBIE Endeavour grant
Colin Simpson, Barry Milne, Andrew Sporle
Informatics for Social Services and Wellbeing …
more later!
Honorary position here (thanks James)
lead developer since 2013/14
shifting focus as audience has evolved
See Chris Wild’s talks featuring hits like We Will Plot You
for organisations/groups with low/no money/time/skill
v4.1: surveys now handled natively
plots
summaries (tables of counts)
inference / modelling
data wrangling …
same goal: removal of barriers
Data
GUI
Explore
Export results/code
data
is from a survey?
iNZight isn’t much better … or is it?!
(Remember survey variables never have nice names)
data = "apiclus2.csv"
ids = "dnum + snum"
fpc = "fpc1 + fpc2"
Details: inzight.nz/docs/survey-specification.html
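As a rough illustration of how a specification like the one above maps onto a survey design, the sketch below parses the three `key = "value"` lines and turns the design variables into the one-sided formulas that `survey::svydesign()` expects. The helper `parse_spec()` is hypothetical and is not iNZight's own parser; the final step only runs if the survey package is installed.

```r
# Hypothetical sketch: read key = "value" pairs from a survey specification
# and build the formula arguments for survey::svydesign().
spec_lines <- c(
    'data = "apiclus2.csv"',
    'ids = "dnum + snum"',
    'fpc = "fpc1 + fpc2"'
)

parse_spec <- function(lines) {
    # split each line at the first "=", then strip whitespace and quotes
    keys <- trimws(sub("=.*$", "", lines))
    vals <- trimws(sub("^[^=]*=", "", lines))
    vals <- gsub('^"|"$', "", vals)
    as.list(setNames(vals, keys))
}

spec <- parse_spec(spec_lines)

# design variables become one-sided formulas, e.g. ~dnum + snum
ids_formula <- as.formula(paste("~", spec$ids))
fpc_formula <- as.formula(paste("~", spec$fpc))

# With the survey package available, the design could be created like this
# (using the packaged copy of apiclus2 rather than reading the CSV file).
if (requireNamespace("survey", quietly = TRUE)) {
    data(api, package = "survey")
    design <- survey::svydesign(
        ids = ids_formula,
        fpc = fpc_formula,
        data = apiclus2
    )
}
```

The point of the specification file is that the user never has to type the formulas: whoever distributes the data writes them once.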
iNZight (GUI: collects user input, displays results)
iNZightModules (UI for time series, regression, maps, …)
iNZightPlots (graphs, summaries, inference)
iNZightTools (utility functions, data wrangling)
iNZightTS (time series)
iNZightMR (multiple response)
iNZightRegression (model summaries, residual plots)
iNZightMaps (lat/lng points, fill-in-the-shapefile maps)
plus vit and some others …
wrapper functions make programming GUIs easier
packages don’t need GUI
iNZightPlots::inzplot()
simple functions aimed towards novice coders
returns the R code
GUI \(\rightarrow\) high level functions \(\rightarrow\) lower-level (e.g., ggplot)
library(iNZightTools)
iris_filtered <- filterNumeric(iris, "Sepal.Width", "<", 3.5)
head(iris_filtered)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          4.9         3.0          1.4         0.2  setosa
## 2          4.7         3.2          1.3         0.2  setosa
## 3          4.6         3.1          1.5         0.2  setosa
## 4          4.6         3.4          1.4         0.3  setosa
## 5          5.0         3.4          1.5         0.2  setosa
## 6          4.4         2.9          1.4         0.2  setosa
## iris %>% dplyr::filter(Sepal.Width < 3.5)
## # A tibble: 3 x 3
##   Species    Sepal.Length_median Sepal.Length_var
##   <fct>                    <dbl>            <dbl>
## 1 setosa                     5              0.124
## 2 versicolor                 5.9            0.266
## 3 virginica                  6.5            0.404
## iris %>%
##     dplyr::group_by(Species) %>%
##     dplyr::summarize(
##         Sepal.Length_median = median(Sepal.Length,
##             na.rm = TRUE
##         ),
##         Sepal.Length_var = var(Sepal.Length,
##             na.rm = TRUE
##         ),
##         .groups = "drop"
##     )
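The pattern in the output above, where a wrapper returns the result together with the R code that produced it, can be sketched in base R. This is only an illustration of the idea: `filter_numeric_sketch()` is a hypothetical name, it generates a `subset()` call rather than the dplyr code iNZightTools produces, and it stores the code in a plain attribute.

```r
# Hypothetical sketch of a "wrapper that returns its own code".
# It builds the call as text, evaluates it, and attaches the text to the result.
filter_numeric_sketch <- function(data, var, op, value) {
    # name of the object the user passed in, e.g. "iris"
    data_name <- deparse(substitute(data))

    # the code a novice could copy and run themselves
    code_text <- sprintf("subset(%s, %s %s %s)", data_name, var, op, value)

    # evaluate where the function was called so the data name resolves
    result <- eval(parse(text = code_text), envir = parent.frame())

    attr(result, "code") <- code_text
    result
}

iris_small <- filter_numeric_sketch(iris, "Sepal.Width", "<", 3.5)
attr(iris_small, "code")
```

Because the GUI only ever calls the wrapper, the code shown to the user is the code that was actually run.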
modified wrapper functions to handle surveys
refactored GUI to pass around a ‘data-thing’ (data or survey)
library(survey)
data(api, package = "survey")
dclus2 <- svydesign(
    id = ~dnum + snum,
    fpc = ~fpc1 + fpc2,
    data = apiclus2
)
dclus2_filtered <- filterNumeric(dclus2, "api99", ">=", 700)
pretty(code(dclus2_filtered))
## dclus2 %>%
##     srvyr::as_survey() %>%
##     srvyr::filter(api99 >= 700)
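One way to picture a function that accepts a 'data-thing' is to branch on whether the object is a survey design. The sketch below is an assumption about how such a function could look, not the iNZight implementation; `mean_of()` is a hypothetical helper, and the survey branch needs the survey package.

```r
# Hypothetical sketch: one function, two kinds of input.
# A plain data frame gets an ordinary mean; a survey design gets a
# design-based estimate from survey::svymean().
mean_of <- function(x, var) {
    if (inherits(x, "survey.design")) {
        f <- as.formula(paste("~", var))
        unname(coef(survey::svymean(f, x, na.rm = TRUE)))
    } else {
        mean(x[[var]], na.rm = TRUE)
    }
}

# plain data frame
mean_of(iris, "Sepal.Length")

# survey design (only if the survey package is installed)
if (requireNamespace("survey", quietly = TRUE)) {
    data(api, package = "survey")
    dclus2_sketch <- survey::svydesign(
        ids = ~dnum + snum,
        fpc = ~fpc1 + fpc2,
        data = apiclus2
    )
    mean_of(dclus2_sketch, "api99")
}
```

The calling code stays the same for both inputs, which is what lets the GUI pass either object around without special cases.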
Big thanks to
Nā tō rourou, nā taku rourou,
ka ora ai te iwi.
Improve data standards
Promote Māori data sovereignty
Develop systems to support access
Evaluate synthesising of datasets
Security and privacy implications
Machine learning and AI methods
database connecting data across NZ’s sectors
high security environment
but also other unnecessary barriers: coding!
high school and/or university
no coding necessary
easy to learn and relearn
iNZight in Stats NZ data lab …? Watch this space!
Start confined to (example) small data sets …
primary researcher: SQL \(\Rightarrow\) CSV
non-coding researchers: graphs, tables, …
… and build from there!
groups/organisations/communities
population summaries (tables of counts)
regression models
demographic information …?
easy to learn and relearn
repeat analyses after 6 months / 2 years
no (or low) (re)training or consultation costs
produces R code script
Some important demographic information for communities (e.g., birth or death rates) requires specialist techniques and models.
John Bryant’s R packages (dembase, demest, …) for Bayesian demography
R coding required (and data transformations, working with multi-dimensional arrays, …)
so we tested out iNZight’s new add-on system …
Both work and ‘fun’
simple web app (ReactJS)
searchable database
researchers can explore what’s available
the display in 302 was broken
rebuilt it (again) using ReactJS + d3
uses newly available real-time occupancy
tmelliott iNZightVIT terourou
@tomelliottnz @iNZightUoA @terourou
tomelliott.co.nz inzight.nz terourou.org