Publications

A functional map of the human intrinsically disordered proteome

A systematic analysis of intrinsically disordered regions in human proteins.

Phonological Perception of Sign Language Models

Sign language recognition models do not learn phonology.

MorphoHELM: A Comprehensive Benchmark for Evaluating Representations for Microscopy-Based Morphology Assays

A comprehensive benchmark for morphological profiling.

Vermeer: Autoregressive generative modeling of microscopy predicts protein localization

Autoregressive image generation synthesizes microscopy images of proteins in cells.

Underrepresentation of children in public medical imaging datasets

Children are underrepresented in public medical imaging datasets.

scGeneScope: A Treatment-Matched Single Cell Imaging and Transcriptomics Dataset and Benchmark for Treatment Response Modeling

A matched dataset of cell painting and single cell transcriptomics.

Enhancing AI and Dynamical Subseasonal Forecasts with Probabilistic Bias Correction

Bias correction improves probabilistic subseasonal weather forecasting.

Evolutionary conditioning enables guided generation of functionally diverse enhancers

Conditioning generative models of enhancers on evolution enables sequence design without prior knowledge of function.

Multimodal Alignment Improves Generalizability of Genomic Biomarker Prediction in Computational Pathology

Aligning pathology images with proteins and language improves generalization of biomarker prediction.

Investigating Dictionary Expansion for Video-based Sign Language Dictionaries

Expanding sign language dictionaries without extensive ML model retraining.

Exploring Reduced Feature Sets for American Sign Language Dictionaries

Strategies for looking up signs in dictionaries.

Exploring Collaboration to Center the Deaf Community in Sign Language AI

Exploring how ML and sign language experts can collaborate.

Representation Learning Methods for Single-Cell Microscopy are Confounded by Background Cells

Deep learning-based feature extractors for single-cell microscopy images are confounded by background cells.

Zero-shot evaluation reveals limitations of single-cell foundation models

Proposed single-cell foundation models fail to outperform basic baselines in zero-shot evaluation.

Integrating chemical structures as treatments improves representations of microscopy images for morphological profiling

Integrating microscopy images and chemical compounds improves image representations.

ASL STEM Wiki: Dataset and Benchmark for Interpreting STEM Articles

A corpus of 254 STEM Wikipedia articles interpreted into over 300 hours of American Sign Language.

Systemic Biases in Sign Language AI Research: A Deaf-Led Call to Reevaluate Research Agendas

A critical meta-analysis of recent AI papers on sign languages reveals systematic biases.

Convolutions are competitive with transformers for protein sequence pretraining

Convolutional models are competitive with transformers for protein sequences.

Protein structure generation via folding diffusion

A generative model of protein structure using diffusion models, inspired by how proteins fold.

Feature reuse and scaling: Understanding transfer learning with protein language models

A systematic benchmark of protein language models reveals that they do not scale for any tasks except structure.

Domain adaptation using optimal transport for invariant learning using histopathology datasets

Optimal transport can correct for batch effects in digital pathology.

Protein generation with evolutionary diffusion: sequence is all you need

Discrete diffusion models on protein sequences can generate novel proteins.

ASL citizen: a community-sourced dataset for advancing isolated sign language recognition

A large-scale crowdsourced dataset of isolated signs in ASL advances sign language recognition.

Incorporating knowledge of plates in batch normalization improves generalization of deep learning for microscopy images

Simply rearranging data leads to robust deep learning for microscopy.

Discovering molecular features of intrinsically disordered regions by using evolution for contrastive learning

Self-supervised learning exploiting principles of comparative genomics can help us understand the intrinsically disordered dark proteome.

CytoImageNet: A large-scale pretraining dataset for bioimage transfer learning

A large-scale collection of publicly curated microscopy images.

Improved Conditional Flow Models for Molecule to Image Synthesis

Flow-based generative models can synthesize microscopy images from molecules.

Evolution Is All You Need: Phylogenetic Augmentation for Contrastive Learning

We argue that evolution effectively acts as phylogenetic augmentation for contrastive learning.

YeastSpotter: accurate and parameter-free web segmentation for microscopy images of yeast cells

We introduce YeastSpotter, a web application for the segmentation of yeast microscopy images into single cells.

The Cells Out of Sample (COOS) dataset and benchmarks for measuring out-of-sample generalization of image classifiers

We created a public dataset of 132,209 images of mouse cells, COOS-7, to test how robust classifiers are to covariate shifts.

Learning unsupervised feature representations for single cell microscopy images with paired cell inpainting

By training models with a self-supervised learning task, we learn highly effective representations of protein biology with no labels.

Integrating images from multiple microscopy screens reveals diverse patterns of change in the subcellular localization of proteins

Integrating 400,000 images from 24 experiments conducted on yeast reveals novel aspects of cell biology.

Influence of repetitive mechanical loading on MMP2 activity in tendon fibroblasts

Matrix metalloproteinase2 has been implicated in tendon pathology caused by repetitive movements. However, its activity in the early stages of the tendon’s response to overuse, and its presence in the circulation as a possible indicator of tendon degradation, remain unknown. Human tendon cells were repetitively stretched for 5 days, and the rabbit Achilles tendon complex underwent repetitive motion 3× per week for 2 weeks. Quantitative polymerase chain reaction analysis was performed to detect matrix metalloproteinase2/14 and tissue inhibitor of matrix metalloproteinase2 messenger ribonucleic acid of cells and rabbit tissue, and matrix metalloproteinase2 protein levels were determined with an enzyme-linked immunoassay. Matrix metalloproteinase2 activity was examined using zymography of the conditioned media, tendon and serum. Immunohistochemistry was used to localize matrix metalloproteinase2 in tendon tissue, and the density of fibrillar collagen in tendons was examined using second harmonic generation microscopy. Tendon cells stretched with high strain or high frequency demonstrated increased matrix metalloproteinase2 messenger ribonucleic acid and protein levels. Matrix metalloproteinase2 activity was increased in the rabbit Achilles tendon tissue at weeks 1 and 2; however, serum activity was only increased at week 1. After 2 weeks of exercise, the collagen density was lower in specific regions of the exercised rabbit Achilles tendon complex. Matrix metalloproteinase2 expression in exercised rabbit Achilles tendons was detected surrounding tendon fibroblasts. Repetitive mechanical stimulation of tendon cells results in a small increase in matrix metalloproteinase2 levels, but it appears unlikely that serum matrix metalloproteinase2 will be a useful indicator of tendon overuse injury.

An Unsupervised kNN Method to Systematically Detect Changes in Protein Localization in High-Throughput Microscopy Images

A simple k-nearest neighbor algorithm can locally correct for covariate shifts when comparing image screens.

Angiopoietin‐like 4 promotes angiogenesis in the tendon and is increased in cyclically loaded tendon fibroblasts

The mechanisms that regulate angiogenic activity in injured or mechanically loaded tendons are poorly understood. The present study examined the potential role of angiopoietin-like 4 (ANGPTL4) in the angiogenic response of tendons subjected to repetitive mechanical loading or injury. Cyclic stretching of human tendon fibroblasts stimulated the expression and release of ANGPTL4 protein via transforming growth factor-β (TGF-β) and hypoxia-inducible factor 1α (HIF-1α) signalling, and the released ANGPTL4 was pro-angiogenic. Angiogenic activity was increased following ANGPTL4 injection into mouse patellar tendons, whereas the patellar tendons of ANGPTL4 knockout mice displayed reduced angiogenesis following injury. In human rotator cuff tendons, the expression of ANGPTL4 was correlated with the density of tendon endothelial cells. To our knowledge, this is the first study characterizing a role of ANGPTL4 in the tendon. ANGPTL4 may assist in the regulation of vascularity in the injured or mechanically loaded tendon. TGF-β and HIF-1α comprise two signalling pathways that modulate the expression of ANGPTL4 by mechanically stimulated tendon fibroblasts and, in the future, these could be manipulated to influence tendon healing or adaptation.

Accumulation of oxidized LDL in the tendon tissues of C57BL/6 or apolipoprotein E knock-out mice that consume a high fat diet: potential impact on tendon health

Clinical studies have suggested an association between dyslipidemia and tendon injuries or chronic tendon pain; the mechanisms underlying this association are not yet known. The objectives of this study were (1) to evaluate the impact of a high fat diet on the function of load-bearing tendons and on the distribution in tendons of oxidized low density lipoprotein (oxLDL), and (2) to examine the effect of oxLDL on tendon fibroblast proliferation and gene expression.

Enhanced collagen type I synthesis by human tenocytes subjected to periodic in vitro mechanical stimulation

Mechanical stimulation (e.g. slow heavy loading) has proven beneficial in the rehabilitation of chronic tendinopathy; however, the optimal parameters of stimulation have not been experimentally determined. In this study of mechanically stimulated human tenocytes, the influence of rest insertion and cycle number on (1) the protein and mRNA levels of type I and III collagen; (2) the mRNA levels of transforming growth factor beta (TGFB1) and scleraxis (SCXA); and (3) tenocyte morphology was assessed.

Podocalyxin Regulates Murine Lung Vascular Permeability by Altering Endothelial Cell Adhesion

Despite the widespread use of CD34-family sialomucins (CD34, podocalyxin and endoglycan) as vascular endothelial cell markers, there is remarkably little known of their vascular function. Podocalyxin (gene name Podxl), in particular, has been difficult to study in adult vasculature as germ-line deletion of podocalyxin in mice leads to kidney malformations and perinatal death. We generated mice that conditionally delete podocalyxin in vascular endothelial cells (PodxlΔEC mice) to study the homeostatic role of podocalyxin in adult mouse vessels. Although PodxlΔEC adult mice are viable, their lungs display increased lung volume and changes to the matrix composition. Intriguingly, this was associated with increased basal and inflammation-induced pulmonary vascular permeability. To further investigate the etiology of these defects, we isolated mouse pulmonary endothelial cells. PodxlΔEC endothelial cells display mildly enhanced static adhesion to fibronectin but spread normally when plated on fibronectin-coated transwells. In contrast, PodxlΔEC endothelial cells exhibit a severely impaired ability to spread on laminin and, to a lesser extent, collagen I coated transwells. The data suggest that, in endothelial cells, podocalyxin plays a previously unrecognized role in maintaining vascular integrity, likely through orchestrating interactions with extracellular matrix components and basement membranes, and that this influences downstream epithelial architecture.

Mast cells exert pro-inflammatory effects of relevance to the pathophysiology of tendinopathy

We have previously found an increased mast cell density in tendon biopsies from patients with patellar tendinopathy compared to controls. This study examined the influence of mast cells on basic tenocyte functions, including production of the inflammatory mediator prostaglandin E2 (PGE2), extracellular matrix remodeling and matrix metalloproteinase (MMP) gene transcription, and collagen synthesis.