Minhuan Li

I am a Flatiron Research Fellow at the Center for Computational Mathematics, Flatiron Institute. I received my PhD in Applied Physics from Harvard University in January 2025, advised by Doeke Hekstra. I also hold an S.M. in Computational Science & Engineering from Harvard, and an undergraduate degree in Physics from Fudan University.

During my PhD, I was a Fellow at the Eric and Wendy Schmidt Center at the Broad Institute of MIT and Harvard, and interned at D.E. Shaw Research, where I worked on jointly fitting a family of QM/ML force field models to quantum mechanical and experimental data. I am also a core member of Reciprocal Space Station, an open-source consortium for structural biology software. I care deeply about contributing to the scientific community through both computational tools and shared infrastructure.

My research builds scalable, mathematically principled methods for biomolecular dynamics. I approach this by designing an interacting ecosystem of Data, Models, and Neural Priors, drawing heavily on the biophysical principles underlying experiments such as X-ray crystallography and cryo-EM, generative modeling, and statistical-physics-inspired sampling.

In this framework, Data represents the raw experimental observables (e.g., cryo-EM micrographs, X-ray scattering patterns); Models are the unified parameterizations of the molecule (e.g., atomic coordinates); and Neural Priors are the foundation models (e.g., AlphaFold) that capture the statistical rules of biology.

Data → Model. The core inverse problem, and where I spend most of my time. I develop differentiable forward models that faithfully map molecular structures to experimental observables, robust objective functions that remain well-behaved under noise and sparsity, and efficient samplers for high-dimensional, multimodal conformational landscapes.

Prior → Model. Steering pretrained foundation models with experimental likelihoods at inference time — without retraining — so they serve as principled priors for the inverse problem above.

Data/Model → Prior. Training biological foundation models on heterogeneous, context-rich data (integrating sequence, structure, surface chemistry, evolutionary signals, and experimental readouts) so the model learns to condition on the full biological context rather than on any single modality alone. In addition, in areas where a physics-based prior is missing or less informative, a data-driven prior can fill the gap.

Prior → Data. Closing the loop: using the neural prior’s uncertainty to guide experimental design and enable efficient, AI-centric data collection. This is the frontier I aim to develop alongside experimental collaborators.

news

Apr 28, 2026	Preprint out: MIMIC — a large collaboration with Polymathic AI. MIMIC is a multimodal biomolecular foundation model trained on diverse aligned data spanning sequences, structures, and exciting modalities like RNA transcriptome-wide chemical probing. Check out the blog post for an overview.
Apr 27, 2026	Released the codebase for embedopt, including a tutorial for running it on real cryo-EM map data.
Apr 01, 2026	Our paper on ROCKET — using AlphaFold as a prior for experimental structure determination — is published in Nature Methods.
Feb 10, 2026	Preprint out: embedopt, a new framework for robust inference-time steering of protein diffusion models via embedding optimization.
Jan 01, 2026	Preprint out: GOTO, a robust differentiable sliced Wasserstein loss for image-based inverse problems, with applications to cryo-EM.

selected publications

arXiv

MIMIC: A Generative Multimodal Foundation Model for Biomolecules

Siavash Golkar, Jake Kovalic^*, Irina Espejo Morales^*, Samuel Sledzieski^*, Minhuan Li^*, and 26 more authors

arXiv preprint arXiv:2604.24506, 2026

@article{golkar2026mimic,
  title = {MIMIC: A Generative Multimodal Foundation Model for Biomolecules},
  author = {Golkar, Siavash and Kovalic, Jake and Morales, Irina Espejo and Sledzieski, Samuel and Li, Minhuan and Sokolova, Ksenia and Krawezik, Geraud and Bietti, Alberto and Gibbs, Claudia Skok and Klypa, Roman and Xiong, Shengwei and Lanusse, Francois and Parker, Liam and Cho, Kyunghyun and Cranmer, Miles and Hehir, Tom and McCabe, Michael and Meyer, Lucas and Morel, Rudy and Mukhopadhyay, Payel and Pettee, Mariel and Qu, Helen and Shen, Jeff and Fouhey, David and Sotoudeh, Hadi and Mulligan, Vikram and Cossio, Pilar and Hanson, Sonya M. and Jones, Alisha N. and Troyanskaya, Olga G. and Ho, Shirley},
  journal = {arXiv preprint arXiv:2604.24506},
  year = {2026},
  show_authors = {5},
}

Nat. Methods

AlphaFold as a prior: experimental structure determination conditioned on a pretrained neural network

Alisia Fadini^*, Minhuan Li^*, and 11 more authors

Nature Methods, 2026

@article{fadini2026alphafold,
  title = {AlphaFold as a prior: experimental structure determination conditioned on a pretrained neural network},
  author = {Fadini, Alisia and Li, Minhuan and McCoy, Airlie J and Banjara, Suresh and Okumura, Hiroki and Napier, Eve and Fontana, Pietro and Khan, Amir R and Jovine, Luca and Terwilliger, Thomas C and Read, Randy J and Hekstra, Doeke R and AlQuraishi, Mohammed},
  journal = {Nature Methods},
  pages = {1--11},
  year = {2026},
  publisher = {Nature Publishing Group US},
  address = {New York},
  show_authors = {2},
}

arXiv

Robust Inference-Time Steering of Protein Diffusion Models via Embedding Optimization

Minhuan Li^*†, Jiequn Han, Pilar Cossio, and Luhuan Wu^*†

arXiv preprint arXiv:2602.05285, 2026

@article{li2026robust,
  title = {Robust Inference-Time Steering of Protein Diffusion Models via Embedding Optimization},
  author = {Li, Minhuan and Han, Jiequn and Cossio, Pilar and Wu, Luhuan},
  journal = {arXiv preprint arXiv:2602.05285},
  year = {2026},
  show_authors = {4},
}

bioRxiv

Improving Cryo-EM Optimization Robustness with an Optimal Transport Loss Function for Noisy Images

Geoffrey Woollard, David Herreros, Minhuan Li^†, Pilar Cossio^†, and 1 more author

bioRxiv, 2025

@article{woollard2025improving,
  title = {Improving Cryo-EM Optimization Robustness with an Optimal Transport Loss Function for Noisy Images},
  author = {Woollard, Geoffrey and Herreros, David and Li, Minhuan and Cossio, Pilar and Duc, Khanh Dao},
  journal = {bioRxiv},
  year = {2025},
  publisher = {Cold Spring Harbor Laboratory},
  show_authors = {4},
}

bioRxiv

SFCalculator: connecting deep generative models and crystallography

Minhuan Li, Kevin Dalton, and Doeke Hekstra^†

bioRxiv, 2025

@article{li2025sfcalculator,
  title = {SFCalculator: connecting deep generative models and crystallography},
  author = {Li, Minhuan and Dalton, Kevin and Hekstra, Doeke},
  journal = {bioRxiv},
  year = {2025},
  show_authors = {3},
}

Nat. Commun.

Revealing thermally-activated nucleation pathways of diffusionless solid-to-solid transition

Minhuan Li, Zhengyuan Yue, Yanshuang Chen, and 3 more authors

Nature Communications, 2021

@article{li2021revealing,
  title = {Revealing thermally-activated nucleation pathways of diffusionless solid-to-solid transition},
  author = {Li, Minhuan and Yue, Zhengyuan and Chen, Yanshuang and Tong, Hua and Tanaka, Hajime and Tan, Peng},
  journal = {Nature Communications},
  volume = {12},
  number = {1},
  pages = {4042},
  year = {2021},
  publisher = {Nature Publishing Group UK},
  address = {London},
}

Sci. Adv.

Revealing roles of competing local structural orderings in crystallization of polymorphic systems

Minhuan Li, Yanshuang Chen, Hajime Tanaka^†, and 1 more author

Science Advances, 2020

@article{li2020revealing,
  title = {Revealing roles of competing local structural orderings in crystallization of polymorphic systems},
  author = {Li, Minhuan and Chen, Yanshuang and Tanaka, Hajime and Tan, Peng},
  journal = {Science Advances},
  year = {2020},
  publisher = {American Association for the Advancement of Science},
}

latest posts

May 27, 2024	Flow-Matching Objectives
May 13, 2024	Training Neural ODE with three different loss types
Sep 12, 2023	Implicit Reparameterization Gradients