1 Introduction

1.1 Welcome

  • About the instructor

1.2 What will we learn?

  • Introduction to statistical models
  • Unreplicable vs. replicable studies
    • The collision of designed experiments vs. observational spatio-temporal data
  • Example 1: Spatial statistics
    • Kriging
  • Example 2: Trajectories
    • Models for human movement/trajectories

1.3 Intro to statistical models

  • What is data?
    • Something in the real world that you can, in some way, observe and measure with or without error
  • What is a statistic?
    • A function of the data
  • What is a model?
    • A simplification of something real, designed to serve a purpose
  • What is a statistical model?
    • Simplification of a real data generating mechanism
    • Constructed from deterministic mathematical equations and probability density / mass functions
    • Capable of generating data
    • Generative vs. non-generative models
  • What is the purpose of a statistical model?
    • Capable of making predictions, forecasts, and hindcasts
    • Enables statistical inference about observable and unobservable quantities
    • Reliably quantify and communicate uncertainty
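
A minimal sketch of the ideas in this list, assuming a simple linear regression as the data generating mechanism. The model pairs a deterministic equation (a straight line) with a probability density (normal errors), so it can generate data. The function names and the parameter values (beta0, beta1, sigma) are hypothetical and chosen only for illustration. The least squares estimator is included as an example of a statistic, since it is a function of the data alone.

```python
import random
import statistics


def simulate_data(n, beta0=1.0, beta1=2.0, sigma=0.5, seed=42):
    """Generate data from y = beta0 + beta1 * x + e, with e ~ Normal(0, sigma^2).

    The parameter values are illustrative assumptions, not taken from any real study.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    # Deterministic part (the line) plus a random draw from the error density
    ys = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys


def least_squares(xs, ys):
    """A statistic: the least squares intercept and slope, computed from the data only."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Generate data from the model, then recover the parameters from the data
xs, ys = simulate_data(n=200)
intercept_hat, slope_hat = least_squares(xs, ys)
print(f"estimated intercept: {intercept_hat:.2f}, estimated slope: {slope_hat:.2f}")
```

Because the model is generative, the same code that simulates data can be rerun with different seeds to see how much the estimates vary from one data set to the next, which is one informal way to think about uncertainty.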

1.4 Intro to spatio-temporal data

  • What is spatial data?
  • What is a time series?
  • What is spatio-temporal data?

1.5 Half-baked opinions about designed experiments

  • Where did these half-baked opinions come from?
    • 15 years of experience in statistical consulting (i.e., watching people struggle)
    • Authoring or co-authoring ~100 publications and being the proud owner of a high rejection rate!
    • Writing a textbook (link)
    • Teaching 20+ graduate-level courses on the topic
  • Gold standard: designed experiments, replication, and randomization
    • Replication crisis
    • ASA’s statement on p-values (link)
    • Model systems vs. reality
  • My observations
    • Most of what I see or work on is almost purely observational data/studies (e.g., here) or data/studies that have some elements of a designed experiment (e.g., the ability to apply a treatment) but lack other features (e.g., the ability to replicate)
    • Experimental design, just like other frameworks, is a tool that works for some but not all studies (e.g., studies of plant vs. animal diseases)
    • Some professions seem very hesitant to use any tool other than designed experiments even when key features are missing
    • Ideas and methods from spatial statistics, time series analysis, and spatio-temporal statistics offer an alternative view
    • Example from the book Range
    • At the end of the day, it is all about trade-offs in assumptions (e.g., regression vs. ANOVA)