
Statistics and Data Science Seminar Series

Hosted by the Department of Mathematics and Statistics

For the Statistics and Data Science Seminar Series, please attend using the link attached to the specific seminar.

 

September 4th, 2024

A latent trajectory analysis for multivariate mixed outcomes: a study on the effect of bariatric surgery via electronic health records.

Dr. JungWun Lee from the Department of Biostatistics at Boston University School of Public Health

Abstract:

Trajectory analysis can be a statistical solution for explaining heterogeneities by partitioning patients into less heterogeneous subgroups based on similarities in outcome variables. This work proposes a novel trajectory analysis for electronic health records, a longitudinal data set containing multiple biomarkers, demographic factors of patients, and many missing values. The proposed model discovers subgroups of patients so that patients with the same trajectory group memberships are similar in their observed outcomes, while patients with different trajectories are heterogeneous. The proposed model can accommodate multivariate mixed outcomes consisting of categorical and continuous variables simultaneously. We suggest an estimation strategy using the expectation-maximization algorithm, which provides the maximum-likelihood estimates and remains stable when many values are missing. We also present an application of our methodology to the DURABLE data set, an NIH-funded study examining long-term outcomes of patients who underwent bariatric surgery between 2007 and 2011.

 

September 11th, 2024

Innovation Diffusion Models: Theory and Practice

Dr. Mariangela Guidolin from the Department of Statistical Sciences at the University of Padova

Abstract:

The seminar is a general overview of a class of innovation diffusion models that can be used to describe and forecast the evolution over time of sales of new products or technologies. Starting from the basic Bass model (BM), the seminar will present some of its generalizations, which account for the presence of exogenous shocks affecting the timing of the diffusion process, and for the presence of a dynamic market potential, modeled as a function of a communication process that develops over time. Moreover, some generalizations of the univariate BM are proposed to account for the presence of competition. The statistical techniques involved in model estimation combine time-series analysis with nonlinear regression techniques. The key objectives of the seminar are: to describe the main mathematical features of the models, discussing the meaning of the parameters from the economic point of view with real-data applications; to present and discuss the statistical aspects involved in model estimation and selection; and to show and discuss the predictive and explanatory ability of the proposed models, highlighting the properties and limitations of each of the models described.
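For readers unfamiliar with the basic Bass model mentioned in the abstract, the sketch below evaluates its standard closed-form cumulative adoption curve, F(t) = (1 - e^{-(p+q)t}) / (1 + (q/p) e^{-(p+q)t}), scaled by a fixed market potential m. The function names and the parameter values for m, p, and q are illustrative assumptions, not taken from the seminar, and none of the generalizations discussed in the talk are covered.

```python
import math


def bass_cumulative_fraction(t, p, q):
    """Fraction of the eventual market that has adopted by time t.

    p is the coefficient of innovation, q the coefficient of imitation.
    """
    decay = math.exp(-(p + q) * t)
    return (1.0 - decay) / (1.0 + (q / p) * decay)


def bass_cumulative_sales(t, m, p, q):
    """Cumulative sales by time t for a fixed market potential m."""
    return m * bass_cumulative_fraction(t, p, q)


def bass_peak_time(p, q):
    """Time at which the adoption rate peaks (meaningful only when q > p)."""
    return math.log(q / p) / (p + q)


# Hypothetical parameter values, chosen only for illustration.
m, p, q = 1000.0, 0.03, 0.38
sales_by_period = [bass_cumulative_sales(t, m, p, q) for t in range(0, 21)]
peak = bass_peak_time(p, q)
```

In practice p, q, and m would be estimated from observed sales, for example by nonlinear regression as the abstract notes, rather than fixed in advance.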

 

September 18th, 2024

Generative AI agents for science and medicine

Dr. James Zou, an Associate Professor of Biomedical Data Science, Computer Science, and Electrical Engineering at Stanford University

Abstract:

This talk will explore how we can develop and use generative AI to help researchers. I will first discuss how generative AI can act as research co-advisors. We will then discuss how genAI can expand researchers' creativity by designing and experimentally validating new drugs. Finally, I will present how visual-language AI helps clinicians aggregate and interpret noisy data. I will conclude by sharing some thoughts on the future of AI agents for science. 

Bio:

James Zou is an associate professor of Biomedical Data Science, CS and EE at Stanford University. He is also the faculty director of Stanford AI4Health. He works on advancing the foundations of ML and in-depth scientific and clinical applications. Many of his innovations are widely used in the tech and biotech industries. He has received a Sloan Fellowship, an NSF CAREER Award, two Chan-Zuckerberg Investigator Awards, a Top Ten Clinical Achievement Award, several best paper awards, and faculty awards from Google, Amazon, and Adobe. His research has also been profiled in the popular press, including the NY Times, WSJ, and WIRED.

 

September 25th, 2024

Online statistical inference with streaming data: renewability, dependence, and dynamics

Dr. Lan Luo, an Assistant Professor in the Department of Biostatistics and Epidemiology at Rutgers University

Abstract:

New data collection and storage technologies have given rise to a new field of streaming data analytics, including real-time statistical methodology for online data analyses. Streaming data refers to high-throughput recordings with large volumes of observations gathered sequentially and perpetually over time. Such data collection schemes are pervasive not only in biomedical sciences such as mobile health, but also in other fields such as IT, finance, services, and operations. Despite a large amount of work in the field of online learning, most of it relies on the strong assumption of independent and identically distributed data, and very little targets statistical inference. This talk will center around three key components in streaming data analyses: (i) renewable updating, (ii) cross-batch dependency, and (iii) time-varying effects. I will first introduce how to conduct a renewable updating procedure, in the case of independent data batches, with the particular aim of achieving statistical properties similar to those of the offline oracle methods while enjoying great computational efficiency. Then I will discuss how we handle the dependency structure that spans a sequence of data batches to maintain statistical efficiency in the process of renewable updating. Lastly, a dynamic weighting scheme will be integrated into the online inference framework to account for time-varying effects. I will provide both conceptual understanding and theoretical guarantees of the proposed method and illustrate its performance via numerical examples.
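As a rough illustration of the renewable updating idea for independent batches, the sketch below fits a simple linear regression from running summary statistics only, so each batch can be discarded once it has been processed. In this simplest least-squares case the streaming estimate coincides with the offline fit on the pooled data. The class name RenewableLinearFit and the toy batches are hypothetical, and this is not the speaker's method, which addresses more general models, cross-batch dependence, and time-varying effects.

```python
from dataclasses import dataclass


@dataclass
class RenewableLinearFit:
    """Simple linear regression y = a + b*x from running sums only."""

    n: int = 0
    sum_x: float = 0.0
    sum_y: float = 0.0
    sum_xx: float = 0.0
    sum_xy: float = 0.0

    def update(self, xs, ys):
        """Fold one batch into the summaries; the raw batch is not stored."""
        for x, y in zip(xs, ys):
            self.n += 1
            self.sum_x += x
            self.sum_y += y
            self.sum_xx += x * x
            self.sum_xy += x * y

    def coefficients(self):
        """Return (intercept, slope) of the least-squares line so far."""
        denom = self.n * self.sum_xx - self.sum_x ** 2
        if self.n < 2 or denom == 0:
            raise ValueError("need at least two distinct x values")
        slope = (self.n * self.sum_xy - self.sum_x * self.sum_y) / denom
        intercept = (self.sum_y - slope * self.sum_x) / self.n
        return intercept, slope


# Toy batches (hypothetical data lying exactly on y = 1 + 2x).
fit = RenewableLinearFit()
fit.update([0.0, 1.0], [1.0, 3.0])   # first batch arrives
fit.update([2.0, 3.0], [5.0, 7.0])   # second batch arrives later
intercept, slope = fit.coefficients()
```

Only five numbers are kept between batches, which is what makes this kind of update cheap in both memory and computation.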

 

October 2nd, 2024

Towards faster non-asymptotic convergence for diffusion-based generative models

Yuting Wei, Assistant Professor, University of Pennsylvania

Keywords: diffusion models, generative models, training-free samplers, and probability flow ODE
