Image-Based Scientific Machine Learning for Theories of Biological Dynamics Across Scales
Overview
Modern imaging now allows us to observe living systems across molecular, cellular, and tissue scales with unprecedented precision, yet our ability to extract mechanistic understanding, to “learn the rules of life,” from these data remains limited. Unlike molecular “omics” data, imaging captures continuous spatial and temporal information—shapes, motions, forces, concentrations, and chemical potentials—that is inherently complex to represent, analyze, and interpret. The challenge lies in transforming this deluge of high-dimensional, dynamic spatiotemporal data into quantitative models, and in linking these to other multi-modal data (e.g., proteomics, genomics, metabolomics), to reveal the governing principles of biological organization and dynamics across scales.
The goal of the program is to accelerate the development of image-based scientific machine learning for biological dynamics. Such approaches will enable “learning the laws” of biology directly from experimental images and movies, yielding new insight into how complex forms and behaviors emerge and evolve, from sub-cellular structures to whole organs. Building shared benchmarks, interpretable methodologies, open tools, and cross-disciplinary collaborations will accelerate discovery and lay the foundation for a new era of AI-enabled scientific discovery for theory in biology across scales. We aim to rapidly bring together physicists, computer scientists, mathematicians, and biologists to build on the current momentum in this evolving field.
Associated themes for the program are:
Force Inference and Learning Equations of Motion - How can we use time-lapse data capturing shape changes and movements to learn the forces, fields, and equations of motion underlying complex dynamics (e.g., force inference, information bottleneck)?
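One widely used route from trajectories to equations of motion is sparse regression over a library of candidate terms (the SINDy approach of Brunton et al.). The sketch below is illustrative only: it simulates a damped oscillator as a stand-in for tracked time-lapse data, then recovers the governing equation with sequentially thresholded least squares. All constants and the term library are assumptions chosen for the demonstration, not part of any specific program method.

```python
import numpy as np

# Illustrative "ground truth": a damped oscillator x'' = -k*x - c*x'.
k, c, dt = 2.0, 0.3, 0.01
t = np.arange(0, 20, dt)

# Integrate with semi-implicit Euler to mimic a measured trajectory.
x, v = np.empty_like(t), np.empty_like(t)
x[0], v[0] = 1.0, 0.0
for i in range(len(t) - 1):
    a = -k * x[i] - c * v[i]
    v[i + 1] = v[i] + dt * a
    x[i + 1] = x[i] + dt * v[i + 1]

# Numerical acceleration: the "measurement" the model must explain.
acc = np.gradient(v, dt)

# Candidate term library: [x, v, x^2, x*v, v^2].
theta = np.column_stack([x, v, x**2, x * v, v**2])

# Sequentially thresholded least squares:
# fit, zero out small coefficients, refit on the survivors.
coef = np.linalg.lstsq(theta, acc, rcond=None)[0]
for _ in range(10):
    small = np.abs(coef) < 0.1
    coef[small] = 0.0
    big = ~small
    if big.any():
        coef[big] = np.linalg.lstsq(theta[:, big], acc, rcond=None)[0]

print(coef)  # expect approximately [-2.0, -0.3, 0, 0, 0]
```

The recovered coefficients name the active terms directly, which is what makes this family of methods interpretable: the output is an equation, not a black-box predictor.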
Spatial Representation of Biological Images - Scientific images are information-rich, capturing spatial distributions of molecules and cells in N dimensions (many channels). How can we represent this spatial information and efficiently learn the patterns within it toward interpretable understanding (e.g., graph neural networks, VAEs, CNNs)?
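One common spatial representation treats segmented cells as nodes in a graph. The minimal sketch below, with synthetic centroids and features standing in for real segmentation output, builds a k-nearest-neighbor graph and performs one neighborhood-aggregation step: the basic operation underlying graph neural networks. The weight matrix is random and untrained; in practice it would be learned.

```python
import numpy as np

rng = np.random.default_rng(0)
pos = rng.uniform(0, 100, size=(50, 2))   # synthetic cell centroids (microns)
feat = rng.normal(size=(50, 3))           # e.g. 3 fluorescence channels per cell

# k-nearest-neighbor adjacency (k=5): one way to encode tissue topology.
k = 5
d = np.linalg.norm(pos[:, None] - pos[None, :], axis=-1)
np.fill_diagonal(d, np.inf)               # a cell is not its own neighbor
nbrs = np.argsort(d, axis=1)[:, :k]       # indices of the 5 nearest cells

# One message-passing step: each cell's new embedding is the mean of its
# neighbors' features, mixed through a (random, untrained) weight matrix.
W = rng.normal(size=(3, 3)) / np.sqrt(3)
agg = feat[nbrs].mean(axis=1)             # (50, 3) neighborhood means
h = np.tanh(agg @ W)                      # updated node embeddings

print(h.shape)  # (50, 3)
```

Stacking several such steps, with learned weights, lets each cell's embedding reflect progressively larger spatial neighborhoods, which is why graph formulations suit multicellular patterns better than pixel grids alone.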
Virtual/Digital Staining - What are the possibilities (and constraints) of predicting an image of field B from images of field A (e.g., phase contrast to fluorescence or segmentation, protein concentration to forces, low to high resolution)? What metrics and biological measurements should be used to assess success? How does the success (or failure) of virtual stains inform our understanding of the biological system and shape future experimental design?
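On the metrics question, one simple and widely reported score is the pixel-wise Pearson correlation between the predicted and measured stain. The sketch below uses synthetic arrays as stand-ins for real image pairs; the "virtual stain" is just the ground truth plus noise, an assumption made to illustrate the metric, not a staining model.

```python
import numpy as np

rng = np.random.default_rng(1)
truth = rng.random((64, 64))                      # measured fluorescence image
pred = 0.8 * truth + 0.2 * rng.random((64, 64))   # imperfect virtual stain

def pearson(a, b):
    """Pixel-wise Pearson correlation of two same-shape images."""
    a, b = a.ravel() - a.mean(), b.ravel() - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

r = pearson(pred, truth)
print(round(r, 3))  # close to 1 for a faithful prediction
```

A high correlation alone does not settle the biological question, though: a stain can correlate well globally while missing the rare structures an experiment cares about, which is why biologically meaningful measurements (counts, localizations, morphologies) belong alongside pixel metrics.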
Multi-modal Integration - How do we bring in data from different modalities (e.g., scRNA-seq, metabolomics, signaling) and integrate them with imaging data (e.g., trajectory embedding, optimal transport, anchoring datasets)?
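Of the techniques named above, optimal transport has a particularly compact core. The sketch below runs entropic optimal transport (Sinkhorn iterations) on synthetic data: one set of "image-derived" embeddings and a noisy, reversed copy standing in for a second modality mapped into the same feature space. The shared feature space, the cost, and the regularization strength are all illustrative assumptions; real multi-modal anchoring requires learning that shared space first.

```python
import numpy as np

rng = np.random.default_rng(2)
a_feats = rng.normal(size=(8, 4))                          # image embeddings
b_feats = a_feats[::-1] + 0.05 * rng.normal(size=(8, 4))   # reversed "omics"

# Squared-Euclidean cost and uniform marginals over both sets.
C = ((a_feats[:, None] - b_feats[None, :]) ** 2).sum(-1)
mu = np.full(8, 1 / 8)
K = np.exp(-C / 0.1)            # Gibbs kernel, regularization eps = 0.1

u = np.ones(8)
for _ in range(200):            # Sinkhorn fixed-point iterations
    v = mu / (K.T @ u)
    u = mu / (K @ v)
P = u[:, None] * K * v[None, :] # transport plan; rows sum to 1/8

# The plan should recover the reversal used to construct b_feats.
print(P.argmax(axis=1))
```

The soft coupling P, rather than a hard assignment, is the useful object here: it carries matching uncertainty, which matters when modalities measure genuinely different aspects of the same cells.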
Multi-scale (Time and Space) Integration - How do we link representations across disparate scales of space and time?
Generative AI/Foundation Models of Biological Dynamics - What is needed to make robust simulators of biological dynamics across scales (e.g. virtual cells, tissues and organs)?
Seeing the Unseen - Can these approaches extend predictions to measurements beyond what is currently experimentally accessible (e.g., movies with atomic resolution, or knowledge of signaling state)?
Beyond Prediction: Learning the Theories - Bringing all information together to learn (interpretable) models of biological dynamics and physiology across scales. Such models should be predictive, robust, generative, and interpretable, and should guide how we engineer and manipulate physiology from the scale of cells to that of organs.
AI for the Lab and the Classroom - AI is changing how we do science, from the classroom to managing lab knowledge, literature research, and research itself. We should be discussing best (and rapidly evolving) practices in the field.