Alex Lu

Senior Researcher

Microsoft Research

Hello!

How do we use AI to make biological discoveries? While AI is an increasingly powerful tool for scientific discovery, most of its successes are in fields where ample prior knowledge exists to pose well-defined prediction problems. However, many biological domains are inherently exploratory and characterized by incomplete knowledge. My research focuses on how we can still use AI to empower biological discovery in these settings.

Specifically, I focus on several core themes:

Foundation models and self-supervised representation learning: These methods can learn from unlabeled data at scale, potentially surfacing insights in yet-to-be-characterized data without supervision. I design biologically motivated, principles-first methods.
Generalization and robustness: For foundation models to be useful in discovery, they must learn biological signal. I design evaluations that probe what models learn and what they scale on, and methods that ensure models learn biology rather than irreproducible technical variation.
Aligning with downstream use-cases: Applying models for discovery in less-characterized disciplines means we need to interact with them differently. This includes technical methods like zero-shot evaluation, interpretability, and generative models, as well as developing an understanding of overlooked applications.

Featured Publications

Systemic Biases in Sign Language AI Research: A Deaf-Led Call to Reevaluate Research Agendas

A critical meta-analysis of recent AI papers on sign languages reveals systematic biases.

Aashaka Desai, Maartje de Meulder, Julie Hochgesang, Annemarie Kocab, Alex X. Lu

Feature reuse and scaling: Understanding transfer learning with protein language models

A systematic benchmark of protein language models reveals that they do not scale for any tasks except structure.

Francesca Zhoufan-Li, Ava P. Amini, Yisong Yue, Kevin Yang, Alex X. Lu

Assessing the limits of zero-shot foundation models in single-cell biology

Proposed single-cell foundation models fail to outperform basic baselines in the zero-shot setting.

Kasia Z Kedzierska, Lorin Crawford, Ava P Amini, Alex X Lu

Protein generation with evolutionary diffusion: sequence is all you need

Discrete diffusion models on protein sequences can generate novel proteins.

Sarah Alamdari, Nitya Thakkar, Rianne van den Berg, Neil Tenenholtz, Bob Strome, Alan Moses, Alex X. Lu, Nicolo Fusi, Ava P. Amini, Kevin Yang

Discovering molecular features of intrinsically disordered regions by using evolution for contrastive learning

Self-supervised learning exploiting principles of comparative genomics can help us understand the intrinsically disordered dark …

Alex X Lu, Amy X Lu, Iva Pritisanac, Taraneh Zarin, Julie D Forman-Kay, Alan M Moses

Recent Publications

ASL STEM Wiki: Dataset and Benchmark for Interpreting STEM Articles

Kayo Yin, Chinmay Singh, Fyodor Minakov, Vanessa Milan, Hal Daume III, Cyril Zhang, Alex X Lu, Danielle Bragg

Convolutions are competitive with transformers for protein sequence pretraining

Kevin Yang, Nicolo Fusi, Alex X Lu
