Improving the prediction of bus arrival using real-time network state



Tom Elliott

25 August 2020

Part I

What is the status quo, and what is wrong with it?

In Auckland …

  • real-time vehicle locations
  • arrival and departure times/delays

  • ETA = scheduled arrival + current delay
  • no use of location information, traffic, historical, …
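
A minimal sketch of the schedule-plus-delay rule above. The function name and example times are illustrative, not part of any real feed or library.

```python
from datetime import datetime, timedelta

def schedule_based_eta(scheduled_arrival: datetime, current_delay_s: float) -> datetime:
    """ETA = scheduled arrival + most recently reported delay (seconds)."""
    return scheduled_arrival + timedelta(seconds=current_delay_s)

# A bus scheduled for 08:30 that is currently 2 minutes late
eta = schedule_based_eta(datetime(2020, 8, 25, 8, 30), 120)
```

Nothing in this rule reacts to where the bus is or to traffic ahead of it: the delay is simply carried forward to every downstream stop.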

Around the world …

  • many unique real-time information systems
  • various data feeds:
    • vehicle positions, passenger counters, taxis, …
  • equally varied prediction approaches
    • Kalman filter, artificial neural network, support vector machines, …

  • GTFS: General Transit Feed Specification

Vehicle models

  • operations management
    • on-time performance, reducing bunching behaviour, …
  • little recent focus on ETAs
    • Kalman filter (e.g., Dailey et al. (2001), Cathey & Dailey (2003))
    • ANN/SVM (e.g., Yu et al. (2006), …)
  • but lots of cool models
    • particle filtering (e.g., Hans et al. (2015))

Traffic models

  • essential for reliable predictions
  • difficult to model (data availability, …)
  • location-specific examples
    • Yu et al. (2011) previous bus along same road, different route
    • Julio et al. (2016) traffic shockwaves
    • taxi data

  • vast majority of transit feeds use only GTFS

Arrival time prediction & journey planning

  • current position, travel times, dwell times
  • GTFS default: schedule + current delay
  • usually a point estimate: “ETA: 5 mins”

  • Journey planning (JP) is hard (Horn (2004), Häme & Hakula (2013))
  • Questions range from simple to complex
    • which bus to catch to arrive on time
    • which set of buses to get to destination fastest
    • minimal waiting time between legs
  • Bérczi et al. (2017): use of probabilistic arrival time information

Part II

Bus arrival prediction using real-time network state



  1. GTFS network construction
  2. Vehicle model
  3. Transit network model
  4. Arrival time prediction
  5. Journey planning

1. GTFS network construction

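library(transitr)
library(magrittr)  # provides the %>% pipe, in case transitr does not re-export it

# Download the static GTFS feed into a local SQLite database
nw <- create_gtfs("https://cdn01.at.govt.nz/data/gtfs.zip",
    db = "at_gtfs.sqlite")
# Construct the transit network from the GTFS data
nw %>% construct()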

2. Vehicle model

  • Observations \(\boldsymbol{y}_1, \boldsymbol{y}_2, \cdots, \boldsymbol{y}_{k-1}, \boldsymbol{y}_k\)
  • Underlying state \(\boldsymbol{x}_0, \boldsymbol{x}_1, \cdots, \boldsymbol{x}_{k-1}, \boldsymbol{x}_k\)

  • Recursive Bayesian estimation: Predict and Update
  • Estimates distance, speed, average speed along each road segment
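
For reference, the generic predict and update steps of recursive Bayesian estimation in the notation above. The transition density \(p(\boldsymbol{x}_k | \boldsymbol{x}_{k-1})\) and likelihood \(p(\boldsymbol{y}_k | \boldsymbol{x}_k)\) are left unspecified here, since their form depends on the vehicle model.

\[ \begin{split} \text{Predict:}\quad p(\boldsymbol{x}_k | \boldsymbol{y}_{1:k-1}) &= \int p(\boldsymbol{x}_k | \boldsymbol{x}_{k-1})\, p(\boldsymbol{x}_{k-1} | \boldsymbol{y}_{1:k-1})\, d\boldsymbol{x}_{k-1} \\ \text{Update:}\quad p(\boldsymbol{x}_k | \boldsymbol{y}_{1:k}) &\propto p(\boldsymbol{y}_k | \boldsymbol{x}_k)\, p(\boldsymbol{x}_k | \boldsymbol{y}_{1:k-1}) \end{split} \]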

3. Transit network model

  • Observations \(b_{v\ell c}\) with error \(e_{v\ell c}\)
    • average speed of vehicle \(v\), road \(\ell\), time period \((t_{c-1},t_c]\)
  • Underlying state \(\beta_{\ell c}\) (average vehicle speed, m/s)

  • Hierarchical structure \[ \begin{split} b_{v\ell c} &\sim \mathcal{N}(B_{v\ell c}, e_{v\ell c}^2) \\ B_{v\ell c} &\sim \mathcal{N}_T(\beta_{\ell c}, \phi_\ell^2) \\ \beta_{\ell c} &\sim \mathcal{N}_T(F_c(\beta_{\ell,c-1}, \Delta_c), q^2),\quad \Delta_c = t_c - t_{c-1} \end{split} \]

3. Transit network model

  • Historical data to estimate \(\phi_\ell\) and \(q\)
  • JAGS (Plummer, 2003)

  • Kalman filter: real-time network state
    • \(\hat \beta_{c|c-1} = \mathbb{E}(\beta_c | b_{0:c-1})\)
    • \(P_{c|c-1} = \mathrm{Var}(\beta_c | b_{0:c-1})\)
  • Information filter
    • Multiple vehicles per segment per time period
    • Limited to independent segments
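
A rough sketch of the filter for a single, independent segment. It assumes that \(F_c\) is the identity (a random walk on speed), that truncation can be ignored, and that marginalising over \(B_{v\ell c}\) gives each observation variance \(e_{v\ell c}^2 + \phi_\ell^2\). The function names are hypothetical and the numbers are made up.

```python
def predict(beta, P, q):
    """Random-walk prediction: mean unchanged, variance grows by q^2."""
    return beta, P + q ** 2

def update(beta_pred, P_pred, obs, phi):
    """Information-form update with any number of observations.

    obs is a list of (b, e) pairs: an observed average speed (m/s) and
    its measurement standard deviation. Each observation is treated as
    having variance e^2 + phi^2 (an assumption of this sketch).
    """
    Z = 1.0 / P_pred          # information (inverse variance)
    z = beta_pred / P_pred    # information-weighted state
    for b, e in obs:
        R = e ** 2 + phi ** 2
        Z += 1.0 / R          # each vehicle adds information
        z += b / R
    return z / Z, 1.0 / Z     # back to mean and variance

# One time step: predict, then absorb one vehicle's observation
beta, P = predict(10.0, 3.0, q=1.0)
beta, P = update(beta, P, [(12.0, 0.0)], phi=2.0)
```

Working in information form means several vehicles traversing the same segment in one period just contribute extra terms to the sums, and a period with no observations leaves the prediction unchanged.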

3. Transit network model: results

\(\hat\beta_\ell \pm 1.96\sqrt{P_\ell}\) and \(\hat\beta_\ell \pm 1.96\sqrt{P_\ell + \phi_\ell^2}\)

4. Arrival time prediction

  • ETA = travel time + dwell time
    • Shalaby & Farhan (2004), Jeong & Rilett (2005), Hans et al. (2015)
  • Travel times: sum of (distributions of) segment travel times
    • travel time = distance / speed
  • Dwell times: multimodality (dwell can be zero)
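
A simulation sketch of the decomposition above, under assumptions that are not from the slides: segment speeds are drawn from a normal distribution clamped to 0.5–30 m/s, and the dwell at each stop is zero with probability 1 − p_stop and exponential otherwise. All names and numbers are illustrative.

```python
import random

def simulate_eta(segments, stops, n=1000, seed=1):
    """Draw n arrival times (seconds from now).

    segments: list of (length_m, mean_speed, sd_speed), one per road segment.
    stops:    list of (p_stop, mean_dwell_s), one per intermediate stop.
    """
    rng = random.Random(seed)
    etas = []
    for _ in range(n):
        t = 0.0
        for length, mu, sd in segments:
            # Crude truncation: clamp the sampled speed to a plausible range
            speed = min(max(rng.gauss(mu, sd), 0.5), 30.0)
            t += length / speed              # travel time = distance / speed
        for p_stop, mean_dwell in stops:
            if rng.random() < p_stop:        # dwell is zero unless the bus stops
                t += rng.expovariate(1.0 / mean_dwell)
        etas.append(t)
    return etas

etas = simulate_eta([(500.0, 8.0, 2.0), (800.0, 12.0, 3.0)], [(0.5, 20.0)])
```

Because the bus skips the stop in roughly half of the draws, the resulting sample mixes two populations, which is the multimodality a single point estimate hides.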

4. Arrival time prediction

  • Particle filter
    • each particle completes the route, recording its arrival times
    • handles dwell time, layovers, etc.

5. Journey planning

  • Need a useful summary of \(p(\alpha_j|\boldsymbol{x}_k, \boldsymbol{\beta}_k)\)
  • ETAs expected as integer minutes
  • CDF approximation \[ \mathbb{P}(A < a) = \sum_{x=0}^{x=a-1} \mathbb{P}(A \in [x, x+1)) = \sum_{x=0}^{x=a-1} \left( \frac{1}{N^\star} \sum_{i=1}^{N^\star} I_{\lfloor \alpha^{(i)}/60 \rfloor = x} \right) \]
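
The CDF approximation can be sketched as follows, assuming the \(\alpha^{(i)}\) are arrival times in seconds for \(N^\star\) equally weighted particles. The function name and the `max_minutes` cut-off are illustrative.

```python
from math import floor

def arrival_cdf(alphas, max_minutes):
    """Return [P(A < a) for a = 0, 1, ..., max_minutes].

    alphas: particle arrival times in seconds (equally weighted).
    """
    n = len(alphas)
    counts = [0] * max_minutes
    for alpha in alphas:
        x = floor(alpha / 60)            # whole minutes until arrival
        if 0 <= x < max_minutes:
            counts[x] += 1               # indicator: floor(alpha / 60) == x
    cdf = [0.0]
    running = 0
    for c in counts:
        running += c
        cdf.append(running / n)          # P(A < a) = sum over x < a
    return cdf

# Four particles arriving in minutes 0, 1, 2 and 3
cdf = arrival_cdf([30, 90, 150, 200], max_minutes=4)
```

Particles arriving later than `max_minutes` are not counted, so the last entry can be below 1.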

5. Journey planning

  • CDF allows us to answer questions:
    • chance of catching the bus if I arrive at \(a\)
    • which bus should I catch to have at least a 90% chance of on-time arrival
    • probability of making a transfer between two services
  • Provides event probabilities (not binary ‘Yes’ or ‘No’)
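
Two of these questions, sketched under the assumption that each service's CDF is stored as a list with `cdf[a]` equal to \(\mathbb{P}(A < a)\) in whole minutes. The function names, trip labels and probabilities are made up for illustration.

```python
def prob_before(cdf, a):
    """P(A < a), treating minutes beyond the end of the list as the last entry."""
    return cdf[min(a, len(cdf) - 1)]

def p_catch(cdf, a):
    """Chance the bus has not yet arrived if I reach the stop at minute a."""
    return 1.0 - prob_before(cdf, a)

def latest_reliable_trip(trips, deadline, threshold=0.9):
    """Latest trip whose chance of arriving before the deadline meets the threshold.

    trips: list of (name, cdf_at_destination), ordered by departure time.
    Returns None if no trip is reliable enough.
    """
    best = None
    for name, cdf in trips:
        if prob_before(cdf, deadline) >= threshold:
            best = name
    return best

trips = [("A", [0.0, 0.5, 0.95, 1.0]), ("B", [0.0, 0.0, 0.6, 1.0])]
choice = latest_reliable_trip(trips, deadline=2)
```

Both answers are probabilities or thresholded probabilities, so the traveller, not the app, decides how much risk is acceptable.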

Part III

Using particle filtering to model vehicles and generate arrival time distributions

Why the particle filter?

  • Flexible with few assumptions
  • Handles complex vehicle behaviour
  • Intuitive likelihood

  • Multimodality

Why not the particle filter?

  • Computationally intensive
  • Hard to distribute the results

Particle filter in action

\[ p(\boldsymbol{x}_{k-1} | \boldsymbol{y}_{1:k-1}) \approx \sum_{i=1}^N w_{k-1}^{(i)} \delta_{\boldsymbol{x}_{k-1}^{(i)}} (\boldsymbol{x}_{k-1}) \]

\(N = 50\)
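
A toy version of one filter iteration on this weighted-particle approximation. The state (distance along the route, speed), the random-walk speed model and the Gaussian likelihood are assumptions made for illustration, not the model used in the talk.

```python
import math
import random

def particle_filter_step(particles, weights, y, dt, rng,
                         speed_sd=1.0, obs_sd=10.0):
    """One predict/update/resample step for (distance_m, speed_mps) particles."""
    # Predict: perturb each particle's speed, then move it forward
    predicted = []
    for d, s in particles:
        s = max(0.0, s + rng.gauss(0.0, speed_sd))
        predicted.append((d + s * dt, s))
    # Update: reweight by a Gaussian likelihood of the observed distance y
    new_w = [w * math.exp(-0.5 * ((y - d) / obs_sd) ** 2)
             for w, (d, _) in zip(weights, predicted)]
    total = sum(new_w)
    n = len(predicted)
    if total == 0.0:
        # Degenerate case: every particle is far from y, fall back to equal weights
        new_w = [1.0 / n] * n
    else:
        new_w = [w / total for w in new_w]
    # Resample (multinomial) so that all particles carry equal weight again
    resampled = rng.choices(predicted, weights=new_w, k=n)
    return resampled, [1.0 / n] * n

rng = random.Random(42)
N = 50
particles = [(rng.uniform(0, 100), rng.uniform(5, 15)) for _ in range(N)]
weights = [1.0 / N] * N
particles, weights = particle_filter_step(particles, weights,
                                          y=200.0, dt=10.0, rng=rng)
```
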

Particle filter in action: predict