Speakers

Bryon Aragam, University of Chicago, Booth School of Business

Title: Bridging causality and deep learning with causal generative models

Abstract: Generative models for vision and language have shown remarkable capacities to emulate creative processes but still lack fundamental skills that have long been recognized as essential for genuinely autonomous intelligence. Difficulties with causal reasoning and concept abstraction highlight critical gaps in current models, despite their nascent capacities for reasoning and planning. Bridging this gap requires a synthesis of deep learning’s expressiveness with the powerful framework of statistical causality.

We will discuss our recent efforts towards building generative models that extract causal knowledge from data while retaining the flexibility and expressivity of deep learning. Unlike traditional causal methods that rely on predefined causal structures, we tackle the more complex problem of learning causal structure directly from data—even when the causal variables themselves are not explicitly observed. This introduces significant challenges, including ill-posedness, nonconvexity, and the exponential complexity of combinatorial search. We will outline statistical aspects of these problems and present progress towards resolving these challenges with differentiable approaches to causal discovery and representation learning.
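
The differentiable formulation mentioned above replaces the combinatorial acyclicity constraint with a smooth penalty that gradient-based optimizers can handle. Below is a minimal sketch of one such penalty, the matrix-exponential trace used in NOTEARS-style score-based methods; it is an illustrative fragment only, and in practice the penalty is combined with a fit term (e.g., least squares) inside an augmented Lagrangian.

```python
import numpy as np
from scipy.linalg import expm

def acyclicity_penalty(W):
    """Smooth acyclicity measure h(W) = tr(exp(W*W)) - d, where W*W is the
    elementwise (Hadamard) square. h(W) = 0 exactly when the weighted
    adjacency matrix W encodes a directed acyclic graph."""
    d = W.shape[0]
    return np.trace(expm(W * W)) - d

# Toy check: a 2-cycle is penalized, a DAG is not.
W_dag = np.array([[0.0, 1.5], [0.0, 0.0]])    # X1 -> X2 only
W_cyc = np.array([[0.0, 1.5], [0.8, 0.0]])    # X1 <-> X2
print(acyclicity_penalty(W_dag), acyclicity_penalty(W_cyc))   # approx. 0.0, > 0
```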

Trent D Buskirk, Old Dominion University, School of Data Science and Department of Epi, Bio and Environmental Health

Title: The Chatbot Has Entered the Survey: Rethinking Survey Science in the Age of AI

Abstract: As generative AI redefines how we create and consume information, a provocative question emerges: what happens when the survey interviewer is a chatbot—or the respondent is, too?

Survey research may be one of the most vulnerable—and most adaptable—fields facing this technological shift. A recent article by Eloundou et al. (2023) identified survey research as particularly susceptible to disruption by Large Language Models (LLMs). While the field is no stranger to innovation, today’s AI tools bring new kinds of challenges and opportunities that extend well beyond automation.

Survey research has long been a human-powered endeavor, yet one that has embraced innovation—from punch cards to online panels. In this keynote, we explore the evolving role of Large Language Models across the entire survey research pipeline—from design and sampling to data collection and estimation. Drawing on real-world examples, emerging literature, and a few provocative questions of our own, we’ll examine both the opportunities and the pitfalls of integrating LLMs into a field long driven by human insight.

Serina Chang, University of California Berkeley, EECS and Computational Precision Health

Title: LLMs as Synthetic Survey Respondents: Accuracy, Diversity, and Integration

Abstract: Surveys are an invaluable tool for understanding population opinions and behaviors, but they demand significant time and resources. Large language models (LLMs) present new opportunities to simulate responses to survey questions, given their natural language understanding, generation capabilities, and social knowledge acquired through pretraining on vast amounts of Internet data. Integrating LLMs as synthetic respondents into survey pipelines could enhance efficiency across various survey stages—from pilot testing to pre-survey sampling design to post-survey data imputation and analysis—not as a replacement for human participants but as a way to augment existing efforts. However, these opportunities bring new challenges, such as how to improve and evaluate LLMs’ accuracy in predicting survey responses, understand how and where they store social knowledge in their billions of parameters, and condition them to reflect unique individuals or subpopulations. In this talk, I’ll describe how we’ve addressed these challenges in our work, including fine-tuning LLMs on high-quality survey data and testing their generalization, probing their internal representations, and conditioning them on rich data (such as news articles and life narratives) to faithfully represent diverse subpopulations and individuals.
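
As a concrete illustration of the conditioning step described above, the sketch below builds a persona-conditioned prompt for a synthetic respondent. The profile fields and the helper function are hypothetical; the resulting prompt would be sent to whichever chat-completion model is under evaluation, and the fine-tuning and probing steps mentioned in the abstract are not shown.

```python
def build_synthetic_respondent_prompt(profile: dict, question: str, options: list) -> str:
    """Compose a persona-conditioned survey prompt (hypothetical helper).
    `profile` holds background information (demographics, a short life
    narrative, recent news exposure) used to condition the model."""
    persona = "\n".join(f"- {key}: {value}" for key, value in profile.items())
    choices = "\n".join(f"({chr(65 + i)}) {opt}" for i, opt in enumerate(options))
    return (
        "You are answering a survey as the person described below.\n"
        f"Background:\n{persona}\n\n"
        f"Question: {question}\n{choices}\n"
        "Answer with a single option letter."
    )

prompt = build_synthetic_respondent_prompt(
    profile={"age": 46, "region": "Midwest", "narrative": "works in manufacturing; follows local news"},
    question="How concerned are you about inflation over the next year?",
    options=["Very concerned", "Somewhat concerned", "Not concerned"],
)
print(prompt)  # send to the chat model of your choice; repeat over sampled profiles
```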

Yang Chen, University of Michigan, Department of Statistics

Title: Spatial-temporal tensor completion methods with applications to space weather

Abstract: Spatial-temporal tensor data with complex missingness patterns are prevalent in space weather. Traditional low-rank tensor models, including CP, Tucker, and Tensor-Train, exploit low-dimensional structures to recover missing entries. However, these methods often treat all tensor modes symmetrically, failing to capture the unique spatiotemporal patterns inherent in scientific data, where the temporal component exhibits low-frequency stability and high-frequency variations. Furthermore, these approaches do not naturally provide entry-wise uncertainty quantification, which is crucial for satellite imaging data. In this talk, I will discuss several methods we developed for spatial-temporal tensor completion and uncertainty quantification in space weather, including the most recent Fourier Low-rank and Sparse Tensor (FLoST) model.
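
For context, the sketch below shows the kind of mode-symmetric baseline the abstract contrasts against: plain CP completion of a 3-way (space x space x time) tensor, fit by gradient descent on the observed entries only. FLoST's Fourier-domain low-rank-plus-sparse temporal structure and its uncertainty quantification are not represented here.

```python
import numpy as np

def cp_complete(X, mask, rank=3, n_iter=2000, lr=0.01, seed=0):
    """Rank-`rank` CP completion of a 3-way tensor X, fitting only the
    observed entries indicated by the boolean/0-1 array `mask`."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.normal(scale=0.1, size=(I, rank))
    B = rng.normal(scale=0.1, size=(J, rank))
    C = rng.normal(scale=0.1, size=(K, rank))
    for _ in range(n_iter):
        X_hat = np.einsum("ir,jr,kr->ijk", A, B, C)
        R = mask * (X_hat - X)                      # residual on observed entries only
        A -= lr * np.einsum("ijk,jr,kr->ir", R, B, C)
        B -= lr * np.einsum("ijk,ir,kr->jr", R, A, C)
        C -= lr * np.einsum("ijk,ir,jr->kr", R, A, B)
    return np.einsum("ir,jr,kr->ijk", A, B, C)      # completed tensor
```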

Moo Chung, University of Wisconsin-Madison, Department of Biostatistics & Medical Informatics

Title: Causality via the Minimum Energy Principle

Abstract: Conventional causal models—such as Granger causality and structural equation modeling—are primarily designed for acyclic relationships and often lack the capacity to capture higher-order or cyclic interactions that emerge in dynamic brain networks. To address this challenge, we introduce a model-free causal framework, which we call the Topological Causal Model (TCM). It is grounded in the minimum energy principle and inspired by the physical intuition that systems tend toward configurations of minimal energy. In this formulation, causality arises naturally as the directional flow of energy from high to low potential. Using Hodge theory—a mathematical framework from differential geometry—we decompose edge flows into three orthogonal components: gradient, curl, and harmonic. Each component reflects a distinct topological structure and admits independent energy minimization, enabling scalable and interpretable inference even in the presence of feedback cycles and complex interactions. We apply this framework to resting-state functional MRI time series from the human brain and demonstrate how it reveals cyclic causal patterns that are inaccessible to traditional causal models. These findings provide a new perspective on brain network dynamics and underscore the advantages of topological approaches over conventional statistical methods. This talk is based on two recent developments in Hodge-theoretic network analysis (arXiv:2110.14599, arXiv:2211.10542).
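
The sketch below gives a minimal version of the Hodge decomposition the abstract relies on, assuming unweighted combinatorial incidence matrices on a graph whose triangles are filled in; the referenced papers (arXiv:2110.14599, arXiv:2211.10542) develop the weighted construction and the energy-minimization interpretation used for causal inference.

```python
import numpy as np

def hodge_decompose(y, d0, d1):
    """Hodge decomposition of an edge flow y into gradient, curl, and harmonic parts.
    d0: (n_edges x n_nodes) matrix with (d0 @ phi)[e] = phi[head(e)] - phi[tail(e)].
    d1: (n_triangles x n_edges) curl operator summing edge flows around triangles.
    Requires d1 @ d0 == 0, which makes the three components mutually orthogonal."""
    phi, *_ = np.linalg.lstsq(d0, y, rcond=None)    # node potentials
    grad = d0 @ phi                                 # acyclic (gradient) flow
    psi, *_ = np.linalg.lstsq(d1.T, y, rcond=None)  # triangle (curl) potentials
    curl = d1.T @ psi                               # locally cyclic flow
    harmonic = y - grad - curl                      # global cycles
    return grad, curl, harmonic

# Toy example: one triangle on nodes {0,1,2} with edges (0,1), (1,2), (0,2).
d0 = np.array([[-1.0, 1.0, 0.0],    # edge (0,1)
               [ 0.0, -1.0, 1.0],   # edge (1,2)
               [-1.0, 0.0, 1.0]])   # edge (0,2)
d1 = np.array([[1.0, 1.0, -1.0]])   # curl around the single triangle
grad, curl, harmonic = hodge_decompose(np.array([2.0, 1.0, 1.0]), d0, d1)
```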

Alex Hagen, Pacific Northwest National Lab

Title: The confluence of representation learning, distribution distances, and calibrated inference: Training and using neural networks in flexible material analysis workflows for nuclear forensics research

Abstract: A succession of publications has shown the utility of morphological analysis for material characterization, and more recent results illustrate the utility of quantitative (as compared to qualitative) morphology. While techniques utilizing strongly supervised neural networks have achieved extremely high performance on specific and highly constrained tasks, un- or weakly supervised techniques have been shown to achieve performance competitive with their supervised counterparts, and the embeddings obtained from an unsupervised model, once trained, can be used for a diverse set of downstream tasks. We find this flexibility especially useful for agile, short-timeline analyses of materials and for “open” problems where relevant materials may not have been characterized a priori. We present several methods for first calibrating these models so that they can provide accurate probabilities of the veracity of their predictions, and then several methods of combining these probabilities. Prior work demonstrated success using non-parametric approaches, such as classifier two-sample tests (C2STs) and permutation-based high-dimensional distance tests, to detect effects such as strike order, oxalic feed type, and calcination temperature on encodings of synthesized Pu(III) oxalates. We also present a hybrid approach leveraging both the sensitivity of C2STs and the precision of permutation distance tests. Overall, combining these steps into a pipeline allows for full-image and even full-sample (where a sample has many images) predictions with associated confidence values. Results and examples for predicting and mapping uranium ore concentrates’ synthetic process from imagery will be presented.
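
As one concrete ingredient of the pipeline described above, the sketch below implements a basic classifier two-sample test on embedding vectors, assuming a logistic-regression classifier and a binomial test of held-out accuracy against chance; the permutation-based distance tests and calibration steps are separate components not shown here.

```python
import numpy as np
from scipy.stats import binomtest
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def classifier_two_sample_test(emb_a, emb_b, seed=0):
    """Classifier two-sample test (C2ST) on two sets of embeddings.
    Train a classifier to tell the samples apart; if held-out accuracy is
    significantly above chance, the two embedding distributions differ."""
    X = np.vstack([emb_a, emb_b])
    y = np.r_[np.zeros(len(emb_a)), np.ones(len(emb_b))]
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5,
                                              stratify=y, random_state=seed)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    n_correct = int((clf.predict(X_te) == y_te).sum())
    result = binomtest(n_correct, n=len(y_te), p=0.5, alternative="greater")
    return n_correct / len(y_te), result.pvalue   # held-out accuracy, p-value
```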

Heike Hofmann, CSAFE/University of Nebraska-Lincoln, Statistics

Eric Laber, Duke University, Biostatistics and Bioinformatics

Title: Reinforcement Learning for Respondent-Driven Sampling

Abstract: Respondent-driven sampling (RDS) is widely used to study hidden or hard-to-reach populations by incentivizing study participants to recruit their social connections. The success and efficiency of RDS can depend critically on the nature of the incentives, including their number, value, call to action, etc. Standard RDS uses an incentive structure that is set a priori and held fixed throughout the study. Thus, it does not make use of accumulating information on which incentives are effective and for whom.

We propose a reinforcement learning (RL) based adaptive RDS study design in which the incentives are tailored over time to maximize cumulative utility during the study. 

We show that these designs are more efficient, cost-effective, and can generate new insights into the social structure of hidden populations. 

In addition, we develop methods for valid post-study inference, which are non-trivial due to the adaptive sampling induced by RL and the complex dependencies among subjects arising from latent (unobserved) social network structure. We provide asymptotic regret bounds and illustrate the finite-sample behavior of the proposed designs through a suite of simulation experiments.
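
As a simplified illustration of tailoring incentives as information accrues, the sketch below runs Beta-Bernoulli Thompson sampling over a small set of incentive structures, treating each participant's recruitment outcome as a binary reward. The design described above optimizes a richer cumulative utility and must contend with the latent network structure, so this is a caricature of the adaptive idea rather than the proposed method.

```python
import numpy as np

def thompson_sample_incentive(successes, failures, rng):
    """Beta-Bernoulli Thompson sampling over a small set of incentive structures.
    successes/failures count recruited vs. non-recruited referrals per incentive."""
    draws = rng.beta(successes + 1, failures + 1)   # sample from each arm's posterior
    return int(np.argmax(draws))                    # incentive for the next participant

rng = np.random.default_rng(0)
true_recruit_prob = np.array([0.15, 0.30, 0.25])    # unknown in practice
succ, fail = np.zeros(3), np.zeros(3)
for _ in range(500):                                 # participants enrolled over the study
    arm = thompson_sample_incentive(succ, fail, rng)
    recruited = rng.random() < true_recruit_prob[arm]
    succ[arm] += recruited
    fail[arm] += 1 - recruited
print(succ + fail)   # allocation concentrates on the most effective incentive
```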

Lan Luo, Rutgers, The State University of New Jersey, Biostatistics and Epidemiology

Title: Online statistical inference with streaming data: renewability, dependence, and dynamics

Abstract: New data collection and storage technologies have given rise to a new field of streaming data analytics, including real-time statistical methodology for online data analyses. Streaming data refers to high-throughput recordings with large volumes of observations gathered sequentially and perpetually over time. Such a data collection scheme is pervasive not only in biomedical sciences, such as mobile health, but also in other fields such as IT, finance, services, and operations. Despite a large body of work in the field of online learning, most existing methods are established under strong assumptions of independent and identically distributed data, and very few target statistical inference. This talk will center around three key components in streaming data analyses: (i) renewable updating, (ii) cross-batch dependency, and (iii) time-varying effects. I will first introduce how to conduct a renewable updating procedure, in the case of independent data batches, with the particular aim of achieving statistical properties similar to those of offline oracle methods while enjoying great computational efficiency. Then I will discuss how we handle the dependency structure that spans a sequence of data batches to maintain statistical efficiency in the process of renewable updating. Lastly, a dynamic weighting scheme will be integrated into the online inference framework to account for time-varying effects. I will provide both conceptual understanding and theoretical guarantees of the proposed methods, and illustrate their performance via numerical examples.
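
To make the renewable-updating idea concrete, the sketch below maintains only cumulative summary statistics for a linear model and refreshes the estimate as each batch arrives; for linear regression the renewed estimate coincides exactly with the offline fit on all data seen so far. The methods in the talk extend this to generalized linear models, cross-batch dependence, and time-varying effects, which this fragment does not attempt.

```python
import numpy as np

class RenewableOLS:
    """Renewable updating for linear regression over streaming data batches.
    Only low-dimensional summaries (X'X, X'y) are stored, yet the estimate after
    each batch equals the offline least-squares fit on all data observed so far."""
    def __init__(self, p):
        self.xtx = np.zeros((p, p))
        self.xty = np.zeros(p)
        self.n = 0

    def update(self, X_b, y_b):
        """Absorb a new batch and return the renewed coefficient estimate."""
        self.xtx += X_b.T @ X_b
        self.xty += X_b.T @ y_b
        self.n += len(y_b)
        return np.linalg.solve(self.xtx, self.xty)

    def covariance(self, sigma2):
        """Estimated covariance of the coefficients, for Wald-type inference."""
        return sigma2 * np.linalg.inv(self.xtx)
```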

Yang Ni, Texas A&M University, Statistics

Title: Consistent DAG selection for Bayesian Causal Discovery under general error distributions

Abstract: We consider the problem of learning the underlying causal structure among a set of variables, which are assumed to follow a Bayesian network or, more specifically, a linear recursive structural equation model (SEM) with the associated errors being independent and allowed to be non-Gaussian. A Bayesian hierarchical model is proposed to identify the true data-generating directed acyclic graph (DAG) structure, where the nodes and edges represent the variables and the direct causal effects, respectively. Moreover, by incorporating the information in the non-Gaussian errors, we characterize the distribution equivalence class of the true DAG, which specifies the best possible extent to which the DAG can be identified from purely observational data. Furthermore, assuming that the errors are distributed as a scale mixture of Gaussians with an unspecified mixing distribution, and under mild distributional assumptions, we establish that, by employing a non-standard DAG prior, the posterior probability of the distribution equivalence class of the true DAG converges to one as the sample size grows. This shows that the proposed method achieves posterior DAG selection consistency, which is further illustrated with examples and simulation studies.
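
For readers wanting a concrete instance of the model class, the sketch below simulates data from a linear recursive SEM with independent Laplace errors, a simple example of a scale mixture of Gaussians; the Bayesian hierarchical prior and the posterior-consistency analysis are, of course, not part of this fragment.

```python
import numpy as np

def simulate_linear_sem(B, n, rng):
    """Draw n observations from a linear recursive SEM X = X B + e, where
    B[k, j] is the direct effect of X_k on X_j and the errors e are independent
    Laplace variables (a scale mixture of Gaussians)."""
    p = B.shape[0]
    errors = rng.laplace(scale=1.0, size=(n, p))
    return errors @ np.linalg.inv(np.eye(p) - B)   # solve X (I - B) = e row-wise

rng = np.random.default_rng(1)
B = np.array([[0.0, 0.8, 0.0],    # X1 -> X2
              [0.0, 0.0, 0.5],    # X2 -> X3
              [0.0, 0.0, 0.0]])
X = simulate_linear_sem(B, n=1000, rng=rng)
```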

Chris Saunders, South Dakota State University, Statistics

Shubhanshu Shekhar, University of Michigan, EECS

Title: On the near-optimality of betting confidence sets for bounded means

Abstract: Constructing nonasymptotic confidence intervals (CIs) for the mean of a univariate distribution from independent and identically distributed (i.i.d.) observations is a fundamental task in statistics. For bounded observations, a classical nonparametric approach proceeds by inverting standard concentration bounds, such as Hoeffding’s or Bernstein’s inequalities. Recently, an alternative approach for defining CIs and their time-uniform variants called confidence sequences (CSs), based on the principle of testing-by-betting, has been shown to be empirically superior to the classical methods. In this talk, I will present some results that explain this improved empirical performance of betting-based CIs and CSs.
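
A minimal sketch of the betting construction for bounded means follows, using a fixed, clipped bet fraction and a two-sided (hedged) capital process; by Ville's inequality, a candidate mean is excluded once its capital exceeds 1/alpha. Adaptive betting strategies, which the near-optimality results concern, yield tighter sets than this simple variant.

```python
import numpy as np

def betting_ci(x, alpha=0.05, grid_size=1000, bet=0.5):
    """Betting confidence set for the mean of i.i.d. observations in [0, 1].
    For each candidate mean m, grow a nonnegative capital process by betting on
    (X_t - m); Ville's inequality lets us exclude m once the hedged capital
    exceeds 1/alpha, and intersecting over time yields a confidence sequence."""
    grid = np.linspace(1e-3, 1 - 1e-3, grid_size)   # candidate means
    keep = np.ones_like(grid, dtype=bool)
    cap_up = np.ones_like(grid)      # wealth from betting that the mean exceeds m
    cap_dn = np.ones_like(grid)      # wealth from betting that the mean is below m
    lam_up = np.minimum(bet, 0.5 / grid)            # predictable bets in [0, 1/m)
    lam_dn = np.minimum(bet, 0.5 / (1 - grid))      # predictable bets in [0, 1/(1-m))
    for xt in x:
        cap_up *= 1 + lam_up * (xt - grid)
        cap_dn *= 1 - lam_dn * (xt - grid)
        keep &= 0.5 * (cap_up + cap_dn) < 1 / alpha   # hedged capital vs. threshold
    kept = grid[keep]
    return kept.min(), kept.max()

rng = np.random.default_rng(0)
print(betting_ci(rng.beta(2, 5, size=500)))   # should cover the true mean 2/7
```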

Lan Wang, University of Miami, Management Science

Title: Distributional Off-Policy Evaluation in Reinforcement Learning

Abstract: In the existing literature on reinforcement learning (RL), off-policy evaluation has mainly focused on estimating a value (e.g., an expected discounted cumulative reward) of a target policy given pre-collected data generated by some behavior policy. Motivated by the recent success of distributional RL in many practical applications, we study the distributional off-policy evaluation problem in the batch setting when the reward is multivariate. We propose an offline Wasserstein-based approach to simultaneously estimate the joint distribution of the multivariate discounted cumulative reward given any initial state-action pair in the setting of an infinite-horizon Markov decision process. A finite-sample error bound for the proposed estimator, with respect to a modified Wasserstein metric, is established in terms of both the number of trajectories and the number of decision points on each trajectory in the batch data. Extensive numerical studies are conducted to demonstrate the superior performance of our proposed method. (Joint work with Zhengling Qi, Chenjia Bai, and Zhaoran Wang.)
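
As a simple point of reference (not the proposed estimator), the sketch below forms an importance-sampling estimate of a scalar discounted-return distribution from batch trajectories; two such estimated distributions can then be compared with scipy.stats.wasserstein_distance. The method in the talk instead handles multivariate rewards and a modified Wasserstein metric with finite-sample guarantees.

```python
import numpy as np

def is_weighted_return_distribution(trajectories, target_pi, behavior_pi, gamma=0.95):
    """Importance-sampling estimate of the discounted-return distribution of a
    target policy from batch data collected under a behavior policy.
    Each trajectory is a list of (state, action, reward) tuples; target_pi and
    behavior_pi map (state, action) to action probabilities. Returns the
    per-trajectory discounted returns and self-normalized weights."""
    returns, weights = [], []
    for traj in trajectories:
        g, w, discount = 0.0, 1.0, 1.0
        for s, a, r in traj:
            w *= target_pi(s, a) / behavior_pi(s, a)   # cumulative likelihood ratio
            g += discount * r
            discount *= gamma
        returns.append(g)
        weights.append(w)
    weights = np.asarray(weights)
    return np.asarray(returns), weights / weights.sum()

# The resulting weighted samples can be compared across policies via, e.g.,
# scipy.stats.wasserstein_distance(ret_a, ret_b, u_weights=w_a, v_weights=w_b).
```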

Christopher K. Wikle, Curators’ Distinguished Professor and Chair, Department of Statistics, University of Missouri

Title: “Physics” Informed Neural Models for Spatio-Temporal Data: Something Old, Something New, Something Borrowed…

Abstract: In the last few years, process (physics)-informed neural models (PINNs) for spatio-temporal data have become ubiquitous across many areas of science due to the benefits of adding mechanistic-based information (e.g., fluid dynamics partial differential equations, PDEs) inside the neural network black box.  In many scientific applications where there is substantial a priori process knowledge, incorporating this information can improve model efficiency and constrain the solution to realistic state-spaces.  The notion of including process knowledge in data-driven models is not new (e.g., meteorologists were including PDE constraints in data assimilation optimizations as early as the late 1950s) and statisticians have developed several useful paradigms that can account for mechanistic information and quantify uncertainty (e.g., physical-statistical models). This talk presents an overview of some general approaches that accommodate process knowledge and data-driven parameter/process estimation, with a focus on PINNs and physical statistical models. Then, it is shown that a flexible Bayesian hierarchical modeling framework can accommodate PINN process models while also accounting for stochastic process discrepancies, observation uncertainty, and parameter uncertainty.  The talk concludes with an example and discussion of challenges.

This is joint work with Josh North (Lawrence Berkeley National Lab), Giri Gopalan (Los Alamos National Lab), and Myungsoo Yoo (U. Texas-Austin).
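
A framework-free sketch of the composite objective that process-informed neural models minimize is given below, using the 1-D heat equation u_t = kappa * u_xx as a stand-in PDE and finite differences in place of automatic differentiation; the Bayesian hierarchical treatment of discrepancy, observation error, and parameter uncertainty discussed in the talk sits on top of, and goes beyond, this kind of loss.

```python
import numpy as np

def pinn_style_loss(u, x_obs, t_obs, y_obs, x_col, t_col,
                    diffusivity=0.1, weight_pde=1.0, eps=1e-4):
    """Composite loss in the spirit of physics-informed neural models for the
    1-D heat equation u_t = kappa * u_xx: a data-misfit term on observations plus
    a PDE-residual term at collocation points. `u(x, t)` is any smooth surrogate
    (e.g., a neural network); derivatives are approximated with central finite
    differences to keep the sketch framework-free."""
    data_loss = np.mean((u(x_obs, t_obs) - y_obs) ** 2)
    u_t = (u(x_col, t_col + eps) - u(x_col, t_col - eps)) / (2 * eps)
    u_xx = (u(x_col + eps, t_col) - 2 * u(x_col, t_col) + u(x_col - eps, t_col)) / eps**2
    pde_residual = u_t - diffusivity * u_xx
    return data_loss + weight_pde * np.mean(pde_residual ** 2)
```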

Lei Zou, Texas A&M University, Geography

Title: Responsible GeoAI for Disaster Resilience

Abstract: Advances in geospatial artificial intelligence (GeoAI), combined with the increasing volume and diversity of geospatial data, have enabled researchers and practitioners to monitor, model, and manage the complex dynamics of human systems, urban developments, and environmental changes at unprecedented spatial and temporal scales. These capabilities are particularly valuable for disaster resilience, which relies on timely, accurate, and location-specific information to support effective decision-making. This talk explores the opportunities, successes, and challenges of applying GeoAI to enhance disaster resilience. It begins with a framework that traces the evolution of GeoAI, from early applications of machine learning on geospatial data to the emergence of autonomous, multimodal geospatial reasoning systems. A series of case studies is introduced to illustrate how AI and data science advances (e.g., large language models, pre-trained vision models, digital twins) have been used to analyze multi-sourced geospatial big data (e.g., social media, street-view imagery, satellite observations, and human mobility data) to support critical aspects of disaster resilience, including monitoring real-time impacts, modeling human-disaster interactions, and informing mitigation strategies and policy interventions. To address the ethical and technical challenges posed by these rapidly evolving technologies, the talk further proposes the STEP paradigm (Security, Trustworthiness, Equity, and Philanthropy) as a framework for responsible GeoAI. This paradigm offers practical guidance to ensure that GeoAI systems are not only powerful and scalable but also transparent, fair, and protective of human values. These theoretical, methodological, and applied insights chart a pathway toward responsible GeoAI that can advance disaster resilience and long-term sustainable development.