I’m a 4th-year Molecular Biophysics PhD student at UCSD, advised by Wei Wang. I received my Bachelor’s in Chemistry from Pomona College. I’m interested in using deep learning to help us better understand and engineer proteins. More specifically:
I document what I’m working on and learning about in this notebook.
You can reach me at y4ko [at] ucsd [dot] edu!
Protein sequence -> structure -> function. Because of its direct link to function, structure has been the natural way for us to study proteins. Humanity has learned an incredible amount from structural biology, and the PDB attests to this feat. But in the era of AI, is structure really the future?
Our current collection of structures represents a small, biased subset of the protein universe. Adding to it remains challenging (X-ray crystallography is no joke). Sequence data, on the other hand, is abundant, diverse, and, importantly, supported by high-throughput technology that lets us collect more at scale. While deep learning on structures has been successful, I see sequence as the AI-native modality for proteins.
My research so far has used sequence to study one of the most important aspects of protein function: interactions. In my first project, TUnA, I explored how we can predict protein-protein interactions from sequence. Most recently, I’ve been developing forge, a latent flow-matching model for designing protein binders in sequence space. Looking ahead, I’m excited about new self-supervised learning frameworks that leverage large-scale sequence interaction data to extract deeper, more functional signals from sequence itself.
If structure was our tool for understanding proteins, sequence may be AI’s: revealing more than what we could ever see on our own.
For a complete list of works, please see Google Scholar.
forge: sequence-based binder design with latent flow matching. Young Su Ko* and Wei Wang, Machine Learning in Structural Biology (MLSB), 2025.
Miniaturizing, Modifying, and Magnifying Nature's Proteins with Raygun. Kapil Devkota, Daichi Shonai, Joey Mao, Young Su Ko*, Wei Wang, Scott Soderling, Rohit Singh, bioRxiv, 2025.
TUnA: an uncertainty-aware transformer model for sequence-based protein–protein interaction prediction. Young Su Ko*, Jonathan Parkinson, Cong Liu, Wei Wang, Briefings in Bioinformatics, 2024.