The Watchmaker's Guide to Population Genetics

The Workshop

  • The Watchmaker’s Philosophy
    • Why Build It Yourself?
    • The Watchmaker’s Way
    • The Gears of Understanding
    • On Mathematical Rigor
    • On Teaching Probability and Calculus
    • On Python Implementations
    • Your Journey

The Workbench (Prerequisites)

  • The Workbench (Prerequisites)
    • Likelihood-Based Probabilistic Inference
      • Why Likelihood?
      • The Likelihood Function
      • The Toolkit: Key Distributions
        • The Exponential Distribution: Coalescence Waiting Times
        • The Poisson Distribution: Mutations and the SFS
        • The Gamma Distribution: Ages and Rates
        • The Gaussian Distribution: Smoothness Priors
      • Maximum Likelihood Estimation (MLE)
        • Worked Example: Inferring Population Size from the SFS
        • Fisher Information and Confidence Intervals
      • Bayesian Inference
        • Conjugate Priors: When Bayesian Inference Has Closed-Form Solutions
      • Composite and Approximate Likelihoods
        • Worked Example: Composite Likelihood from Two Data Sources
      • The Other Paradigm: Neural Networks and Amortized Inference
        • The key idea
        • What amortized inference does well
        • What likelihood-based inference does well
        • Why this book focuses on the likelihood approach
      • Summary
    • Coalescent Theory
      • The Big Idea
      • The Wright-Fisher Model (Forward in Time)
      • Going Backwards: The Coalescent
        • The probability that two specific lineages coalesce in a given generation
        • Waiting time to coalescence
      • The Coalescent with \(n\) Samples
      • Expected Number of Lineages at Time \(t\)
      • Mutations on the Coalescent Tree
      • Summary
    • Ancestral Recombination Graphs
      • Why Trees Aren’t Enough
      • What Is Recombination?
        • What We’ve Established So Far
      • Recombination in the Coalescent
      • The Structure of an ARG: A Directed Acyclic Graph
      • Marginal Trees
      • The Tree Sequence Representation
      • Branch Lengths and the ARG
      • Why ARG Inference Is Hard
      • Summary
    • Hidden Markov Models
      • Why HMMs for ARG Inference?
      • A Warm-Up Example: Weather and Umbrellas
      • The Core Idea
      • Formal Definition
      • The Forward Algorithm
      • Scaling for Numerical Stability
      • Stochastic Traceback (Sampling)
      • The Li-Stephens Trick: Linear-Time Transitions
        • The Li-Stephens Transition Structure
        • The \(O(K)\) Forward Step
      • Summary
    • The Sequentially Markov Coalescent
      • The Problem with the Full Coalescent
      • What Does “Markov” Mean, and Why Does It Matter?
        • Intuitive Explanation
        • Formal Definition
        • Why Markov Matters for Computation
      • What Makes CwR Non-Markov?
        • The Mechanism
        • What Are Ghost Lineages? A Concrete Example
      • The SMC Approximation
        • Why Does This Restore the Markov Property?
        • How Good Is the Approximation?
      • The SMC Transition Probability
        • Deriving \(r_i\): The Recombination Probability
        • Deriving \(q_j\): The Re-joining Weights
      • PSMC: The Pairwise Case
      • The Cumulative Distribution Function
      • Why SMC Enables HMM Inference
      • Summary
    • The Diffusion Approximation
      • The Big Idea
      • From Wright-Fisher to Continuous Frequency
        • Mean and variance of \(\Delta x\)
        • The diffusion timescale
        • Code: WF trajectories converging to SDE paths
      • Stochastic Differential Equations
        • Euler-Maruyama simulation
      • From SDEs to PDEs: The Fokker-Planck Equation
        • The two terms: diffusion and advection
      • Boundary Conditions
        • Absorbing boundaries
        • Why \(x(1-x)\) vanishes at boundaries
        • The flux condition
        • Reflecting boundaries and mutation
      • Stationary Distributions
        • The neutral case
        • With mutation: the Beta distribution
        • With selection: exponential tilting
      • Numerical Solutions: Finite Differences for PDEs
        • Discretizing \(x\) on a grid
        • Finite-difference approximations
        • The method of lines
        • Crank-Nicolson time stepping
        • The curse of dimensionality
        • Code: 1D diffusion solver
      • Connection to the Site Frequency Spectrum
        • The binomial bridge
        • How dadi and moments differ
      • Summary
    • Ordinary Differential Equations
      • The Big Idea
      • What Is an ODE?
      • Euler’s Method
      • The Runge-Kutta Family
        • RK2: The Midpoint Method
        • RK4: The Classic Method
        • RK45: Adaptive Step Size (Dormand-Prince)
      • Systems of Coupled ODEs
      • Stiffness and Implicit Methods
      • The Matrix Exponential
      • Summary
    • Markov Chain Monte Carlo
      • The Big Idea: Why Sample?
      • Bayesian Inference in 60 Seconds
      • Markov Chains
        • Stationary Distribution
      • The Metropolis-Hastings Algorithm
      • Gibbs Sampling
      • Convergence Diagnostics
      • Practical Considerations
        • Proposal Tuning
        • Data-Informed Proposals
        • Parallel Tempering
        • When MCMC Is Not Enough
      • MCMC in Population Genetics: Three Applications
        • ARGweaver: Gibbs Sampling over ARGs
        • SINGER: MH with Data-Informed Proposals
        • PHLASH: Beyond MCMC
      • Summary

Timepieces

  • Timepieces
    • Verification Status
    • Timepiece I: PSMC
      • The Mechanism at a Glance
      • Why Just Two Sequences?
      • Chapters
        • Overview of PSMC
        • The Continuous-Time PSMC Model
        • Discretizing Time
        • The PSMC HMM and EM Algorithm
        • Decoding the Clock
        • Demo: Running PSMC on Simulated Data
    • Timepiece II: SMC++
      • The Mechanism at a Glance
      • Chapters
        • Overview of SMC++
        • The Distinguished Lineage
        • The ODE System
        • The Continuous HMM
        • Population Splits
        • Demo: Running SMC++ on Simulated Data
    • Timepiece III: The Li & Stephens HMM
      • The Mechanism at a Glance
      • Chapters
        • Overview of the Li & Stephens HMM
        • The Copying Model
        • Haploid LS HMM Algorithms
        • The Diploid Extension
        • Demo: Running the Li & Stephens HMM on Simulated Data
    • Timepiece IV: msprime
      • The Mechanism at a Glance
      • Chapters
        • Overview of msprime
        • The Coalescent Process
        • Segments & the Fenwick Tree
        • Hudson’s Algorithm
        • Demographics & Population
        • Mutations
        • Demo: Running msprime on Simulated Data
    • Timepiece V: ARGweaver
      • The Mechanism at a Glance
      • Chapters
        • Overview of ARGweaver
        • Time Discretization
        • Transition Probabilities
        • Emission Probabilities
        • MCMC Sampling
        • Demo: Running ARGweaver on Simulated Data
    • Timepiece VI: tsinfer
      • The Mechanism at a Glance
      • Chapters
        • Overview of tsinfer
        • Gear 1: Ancestor Generation
        • Gear 2: The Copying Model
        • Gear 3: Ancestor Matching
        • Gear 4: Sample Matching & Post-Processing
        • Demo: Running tsinfer on Simulated Data
    • Timepiece VII: SINGER
      • The Mechanism at a Glance
      • Chapters
        • Overview of SINGER
        • Branch Sampling
        • Time Sampling
        • ARG Rescaling
        • Sub-Graph Pruning and Re-grafting (SGPR)
        • Demo: Running SINGER on Simulated Data
    • Timepiece VIII: Threads
      • The Mechanism at a Glance
      • Chapters
        • Overview of Threads
        • Haplotype Matching with the PBWT
        • Memory-Efficient Viterbi Inference
        • Dating Path Segments
        • Demo: Running Threads on Simulated Data
    • Timepiece IX: tsdate
      • The Mechanism at a Glance
      • Where tsinfer Ends and tsdate Begins
      • Chapters
        • Overview of tsdate
        • The Coalescent Prior
        • The Mutation Likelihood
        • Inside-Outside Belief Propagation
        • Variational Gamma (Expectation Propagation)
        • Rescaling
        • Demo: Running tsdate on Simulated Data
    • Timepiece X: moments
      • The Mechanism at a Glance
      • Chapters
        • Overview of moments
        • The Site Frequency Spectrum
        • The Moment Equations
        • Demographic Inference
        • Linkage Disequilibrium
        • Demo: Running moments on Simulated Data
    • Timepiece XI: dadi
      • The Mechanism at a Glance
      • dadi vs. moments
      • Chapters
        • Overview of dadi
        • The Diffusion Equation
        • Numerical Integration
        • Demographic Inference
        • Demo: Running dadi on Simulated Data
    • Timepiece XII: momi2
      • The Mechanism at a Glance
      • Chapters
        • Overview of momi2
        • The Coalescent SFS
        • The Moran Model
        • Tensor Machinery
        • Automatic Differentiation & Inference
        • Demo: Running momi2 on Simulated Data
    • Timepiece XIII: Gamma-SMC
      • The Mechanism at a Glance
      • PSMC vs. Gamma-SMC
      • Chapters
        • Overview of Gamma-SMC
        • The Gamma Approximation
        • The Flow Field
        • The Forward-Backward CS-HMM
        • Segmentation and Caching
        • Demo: Running Gamma-SMC on Simulated Data
    • Timepiece XIV: PHLASH
      • The Mechanism at a Glance
      • Chapters
        • Overview of PHLASH
        • The Composite Likelihood
        • Random Time Discretization
        • The Score Function Algorithm
        • Stein Variational Gradient Descent (SVGD)
        • Demo: Running PHLASH on Simulated Data
    • Timepiece XV: CLUES
      • The Mechanism at a Glance
      • Why Detect Selection?
      • Chapters
        • Overview: Detecting Selection
        • The Wright-Fisher HMM
        • Emission Probabilities
        • Inference: From Gene Trees to Selection
        • Demo: Running CLUES on Simulated Data
    • Timepiece XVI: SLiM
      • The Mechanism at a Glance
      • Chapters
        • Overview of SLiM
        • The Wright-Fisher Generation Cycle
        • Recipes
        • Demo: Running SLiM on Simulated Data
    • Timepiece XVII: Relate
      • The Mechanism at a Glance
      • Where tsinfer and SINGER End and Relate Begins
      • Chapters
        • Overview of Relate
        • Gear 1: Asymmetric Painting
        • Gear 2: Tree Building
        • Gear 3: Branch Length Estimation (MCMC)
        • Gear 4: Population Size Estimation
        • Demo: Running Relate on Simulated Data
    • Timepiece XVIII: discoal
      • The Mechanism at a Glance
      • Chapters
        • Overview of discoal
        • The Allele Frequency Trajectory
        • The Structured Coalescent Under Selection
        • Hard, Soft, and Partial Sweeps
        • discoal and msprime: Two Takes on Sweeps
        • Demo: Running discoal on Simulated Data

© Copyright 2026, Kevin Korfmann.