Cambridge Healthtech Institute's 2nd Annual

Data Strategies and the Future of AI Models

Empowering Machine Learning with Smarter Data and Next-Generation Tools

January 20 - 22, 2026 ALL TIMES PST

For 2026, the three-day Data Strategies and the Future of AI Models track will explore the essential relationship between wet lab experiments and data science, and then look forward to new models and capabilities that will extend the value of AI and ML in biopharmaceutical R&D. Presenters will offer best practices for developing and acquiring training data to get the best possible results from models, and will consider the range of responses to situations where available training data is insufficient to power experiments for challenging projects. For models, speakers will offer use cases for near-term innovations and improvements projected to have an impact in the coming year, then look at longer-term goals for models that will extend the use of ML and AI to a wider range of modalities and patient impacts.

Tuesday, January 20

7:30 am Registration and Morning Coffee

8:30 am Organizer's Welcome Remarks

8:35 am

Chairperson’s Remarks

Adrian Lange, PhD, Director, Machine Learning Research, A-Alpha Bio

8:40 am

Benchmarking Design Tools and Data Strategies for Active Learning and Multi-Objective Optimization of Antibody and Non-Antibody Biologics

Jung-Eun (June) Shin, PhD, Senior Machine Learning Scientist, Seismic Therapeutic

We develop and apply machine learning models to simultaneously optimize multiple drug-like properties of biologics, including antibodies and enzymes. We present generative models that harness both the design of functional proteins and the prediction of drug-like properties to engineer therapeutically developable proteins. We produce and experimentally characterize these designs for fitness, function, and developability, exploring the synergy of these methods in a generalized multi-objective optimization pipeline for biologics.

9:10 am

Machine Learning Models for Nanobody Developability Trained on a Custom Multi-Readout Dataset

Roberto Spreafico, PhD, Senior Director, Biologics AI Innovation, AstraZeneca

Biophysical characterization of biologics is resource-intensive and requires extensive wet-lab experimentation. To scale such efforts to millions of candidate molecules, protein language models offer a promising approach for predicting experimental outcomes computationally. However, the performance of current ML models is constrained by the limited availability of large, high-quality training datasets. Here, we introduce a purpose-designed, information-rich dataset, tailored to train ML models for predicting nanobody developability with improved accuracy.

9:40 am

FEATURED PRESENTATION: Active Learning for Improving Out-of-Distribution Lab-in-the-Loop Experimental Design

Victor Greiff, PhD, Associate Professor, University of Oslo; Director, Computational Immunology, IMPRINT

We present advances in understanding and improving biological sequence-based machine learning. First, we introduce an attribution method for generative models trained on positive-only data, enabling interpretability without requiring negatives. Second, we show that training data composition critically impacts generalization and rule learning across distributions. Together, these works underscore the importance of biologically grounded interpretability and deliberate dataset design in unlocking robust and explainable models for AI-powered antibody design.

10:10 am

Structure-Based Calculations for Predicting Properties and Profiling Antibody Therapeutics

Alain Ajamian, Director of Business Development, Chemical Computing Group

Predicting potential liabilities, aggregation, viscosity, and related properties is important in antibody development. Computational property prediction methods are routinely used in the selection and optimization of candidate antibodies. High-quality property prediction involves predicting ensembles of 3D structures at a specified pH to reduce sensitivity to single conformational states. We present 3dpredict/Ab, which calculates ensemble-based predictions of antibody developability descriptors and putative liabilities. 3dpredict/Ab allows for out-of-the-box SaaS automation and integration of such complex simulations for hundreds or thousands of sequences.

10:25 am

Applying in silico Tools for Protein Design: A Practical Review

Deniz Kavi, CEO & Co-Founder, Tamarind Bio

This talk will present benchmarks, empirical results, and best practices for applying leading molecular design tools from the literature to protein engineering. We will evaluate state-of-the-art computational tools for de novo design, optimization, and scoring of biologics, along with processes for building pipelines that can be applied to discovery problems at scale. We will also discuss the shortcomings and ongoing challenges of applying AI and physics-based tooling to practical discovery problems.

10:40 am Grand Opening Coffee Break in the Exhibit Hall with Poster Viewing

11:20 am

High-Throughput Data Generation and Active Learning for Developing Multispecific Antibodies

Winston Haynes, PhD, Vice President, Computational Sciences and Engineering, LabGenius Therapeutics

Due to their highly engineered formats and complex effector mechanisms, multispecific antibodies (including T-cell engagers [TCEs]) require context-specific training datasets to power ML models. We provide insights into our highly integrated and automated experimental and computational infrastructure that enables our cycle-based, multi-objective optimisation of multispecifics. We highlight our success deploying this infrastructure to develop a pipeline of highly potent and selective TCEs.

11:50 am

Accelerating Biologic Design with in silico Active Learning for Multi-Objective Optimization

Jiangyan Feng, PhD, Senior Advisor, Biotechnology Discovery Research, Eli Lilly and Company

Dr. Jiangyan Feng earned her Ph.D. in Computational Biology from the University of Illinois at Urbana-Champaign, where she specialized in molecular dynamics simulations to investigate protein conformational dynamics and applied bioinformatics for machine learning model development. She currently works at Eli Lilly, where her research focuses on antibody engineering and the application of in silico methods to accelerate biologics discovery and optimization.

12:20 pm Transition to Lunch

12:30 pm Luncheon Presentation (Sponsorship Opportunity Available) or Enjoy Lunch on Your Own

1:00 pm Refreshment Break in the Exhibit Hall with Poster Viewing

1:29 pm

Chairperson's Remarks

Winston Haynes, PhD, Vice President, Computational Sciences and Engineering, LabGenius Therapeutics

1:30 pm

Designing the Next Generation of Biologics with Enhanced Functionality Using Machine Learning and a Rapid Iteration Wet Lab

Peyton Greenside, PhD, Co-Founder & CSO, BigHat Biosciences

BigHat Biosciences is transforming antibody discovery by combining machine learning and synthetic biology in rapid design-build-test cycles that generate thousands of candidates each week. Our platform goes beyond improving biophysics to engineer antibodies with enhanced functionality, such as conditional binding and logic-based control (OR, AND, NOT), for greater safety and efficacy. In this keynote, we will share case studies showing how these innovations overcome the limitations of standard formats and deliver novel therapies ready for patients.

1:59 pm

Building Multi-Scale and Multi-Modal Models

PANEL MODERATOR:

Winston Haynes, PhD, Vice President, Computational Sciences and Engineering, LabGenius Therapeutics

As biologics R&D embraces AI and machine learning, researchers are leveraging models that integrate multiple data modalities—sequence, structure, function, literature, and omics—while also operating across biological scales, from residue-level interactions to systemic function. This panel will explore the design, training, and application of such models in therapeutic antibody and protein engineering. Industry and academic experts will discuss both technical challenges and practical use cases, offering insight into how multi-modal and multi-scale approaches are shaping the future of biologic drug discovery.

Integrating sequence, structure, and assay data: What makes a model truly multimodal?
Designing models to capture residue-level precision and domain-level context
Strategies for aligning embeddings across scales and modalities (e.g., cross-modal attention, hierarchical models)
Applying multi-scale, multimodal models to functional clonotyping and epitope prediction
Balancing model complexity with interpretability and regulatory relevance in drug development

PANELISTS:

Qing Chai, PhD, AVP, Computational Science, Biotechnology Discovery Research, Eli Lilly and Company

Peyton Greenside, PhD, Co-Founder & CSO, BigHat Biosciences

Jeremy Wohlwend, PhD, CTO, Boltz

2:55 pm Session Break

3:05 pm

PAIA's High-Throughput Developability Assay Platform: A Versatile and Robust Technology for the Generation of High-Quality Training Data for Different Antibody Formats

Sebastian Giehring, PAIA Biotech GmbH

In this talk, we present our assay technology, which can characterize hundreds to thousands of antibodies and proteins for different biophysical parameters, such as hydrophobicity and non-specific binding. The assay technology is microplate-based and needs only minute amounts of protein, making it an ideal tool for fast and efficient screening in large discovery campaigns. We will show data for different antibody formats and building blocks for bispecific and multispecific antibodies, illustrating the versatility of the approach.

3:35 pm Refreshment Break in the Exhibit Hall with Poster Viewing

4:30 pm

Welcome Remarks

Mimi Langley, Executive Director, Life Sciences, Cambridge Healthtech Institute

4:35 pm

Chairperson's Remarks

Deborah Moore-Lai, PhD, Vice President, Protein Sciences, ProFound Therapeutics

4:40 pm

From Targets to Biologics: AI Powering the Next Leap in Discovery at Takeda

Yves Fomekong Nanfack, PhD, Head of AI/ML Research, Takeda

Takeda’s AI/ML strategy is redefining the path from targets to biologics, using advanced models to identify and validate novel targets, decode complex biology, and design the next generation of high-quality therapeutic molecules. By integrating agentic, generative, and large language model–driven approaches, AI is powering the next leap in discovery at Takeda.

4:50 pm

Agentic AI for Biologics: Scalable Infrastructure for GxP-Compliant, Insight-Driven Testing

Lieza M. Danan, PhD, Co-Founder & CEO, LiVeritas Biosciences

As biotherapeutics become more complex, automation of traditional testing labs falls short of delivering the insights needed for regulatory success. This talk introduces a GxP-native, full-stack AI platform designed to orchestrate and optimize mass spectrometry-based testing workflows across CMC, bioanalysis, and regulatory reporting. Dr. Lieza Danan shares how LiVeritas applies agentic AI to automate data interpretation, reduce error-prone manual steps, and generate submission-ready outputs—already proven in over 10 IND/BLA filings. Rooted in regenerative system design, this infrastructure enables scalable, adaptive, and compliant operations, empowering biopharma teams to accelerate product development with confidence, clarity, and scientific precision.

5:00 pm

Technological Trends Shaping the Landscape of Biopharmaceuticals

Aline de Almeida Oliveira, PhD, Competitive Intelligence Office (AICOM), Bio-Manguinhos/Fiocruz, Brazil

The biopharmaceutical industry is undergoing rapid technological advances that are revolutionizing the development and production of biopharmaceuticals. Consequently, new therapeutic categories are gaining prominence, such as antibody-drug conjugates, bispecific antibodies, and advanced therapies. This rapid evolution requires constant vigilance to identify breakthroughs and guide strategic decision-making in this dynamic field. The aim of this strategic foresight analysis is to discuss technological trends for the future of biopharmaceuticals.

5:10 pm

PLENARY FIRESIDE CHAT

PANEL MODERATOR:

Deborah Moore-Lai, PhD, Vice President, Protein Sciences, ProFound Therapeutics

Kicking off with three focused 10-minute presentations, the Fireside Chat transitions into an engaging 30-minute fireside discussion. Panelists will delve into cutting-edge topics, including the role of AI/ML in biologics discovery, advancements in next-generation analytics and tools, entrepreneurial trends and investment landscapes, and emerging therapeutic modalities. In tribute to Dr. King’s legacy, this session will also highlight the importance of fostering diversity, equity, and inclusion within the biotech innovation ecosystem.

PANELISTS:

Lieza M. Danan, PhD, Co-Founder & CEO, LiVeritas Biosciences

Aline de Almeida Oliveira, PhD, Competitive Intelligence Office (AICOM), Bio-Manguinhos/Fiocruz, Brazil

Yves Fomekong Nanfack, PhD, Head of AI/ML Research, Takeda

5:40 pm Networking Reception in the Exhibit Hall with Poster Viewing

6:00 pm

Meet the Moderator at the Plaza in the Exhibit Hall

Maria Calderon Vaca, PhD Student, Chemical Environmental & Materials Engineering, University of Miami

This young scientist meet-up is an opportunity to get to know and network with members of the BioLogic Summit community. This session aims to inspire the next generation of young scientists with discussion on career preparation, work-life balance, and mentorship.

6:40 pm Close of Day

Wednesday, January 21

7:15 am Registration Open

7:30 am Interactive Breakout Discussions with Continental Breakfast

Engage in in-depth discussions with industry experts and your peers about the progress, trends, and challenges you face in implementing ML/AI in your work! Interactive discussion groups play an integral role in networking with potential collaborators, provide an opportunity to share examples from your work, and allow you to be part of a group problem-solving endeavor. Please visit the Interactive Breakouts page on the conference website for a complete listing of topics and descriptions.

TABLE 11: Comparing and Contrasting Machine Learning-Based Design of Antibody and Non-Antibody Biologics

Jung-Eun (June) Shin, PhD, Senior Machine Learning Scientist, Seismic Therapeutic

Public data availability and ease of screening designs and generating experimental data
Deimmunization and humanization methods for antibodies vs. non-antibody biologics
Computational and deep learning methods for functional engineering and de novo design
Methods for developability predictions and multi-objective optimization

TABLE 12: Overcoming the Data Bottleneck in AI-Driven Antibody Engineering

Roberto Spreafico, PhD, Senior Director, Biologics AI Innovation, AstraZeneca

Data scarcity as the key constraint to progress in antibody model development
High-throughput assays: Current limitations and future potential
Cross-disciplinary insights from medicinal chemistry and related domains

8:15 am

Chairperson's Remarks

Jung-Eun (June) Shin, PhD, Senior Machine Learning Scientist, Seismic Therapeutic

8:20 am

In vitro-Validated Synthetic Structures for Structural Foundation Models and Applications in de novo Design

Adrian Lange, PhD, Director, Machine Learning Research, A-Alpha Bio

There is a severe lack of protein-protein interaction (PPI) structural data, especially for antibody-antigen systems. We present an approach to create thousands of putative PPI structures, wherein we computationally design many de novo PPI structures and subsequently validate them in vitro with AlphaSeq. We show that the validated putative structures form a dataset on which we can train downstream machine learning models to yield improved model performance.

8:50 am

AI Structural Biology Network: Improving Protein Co-Folding Predictions by Leveraging Data from Multiple Pharma Companies

Robin Roehm, PhD, CEO & Co-Founder, Apheris

The AI Structural Biology Network brings together pharmaceutical companies and cutting-edge federated learning technology to leverage the collective data of multiple parties to improve AI-driven drug discovery. In this session, we will show results on how we improved OpenFold3—an open-source reproduction of AlphaFold3—with the proprietary structural data of the participating pharma companies.

9:20 am

OpenBind: Unlocking AI-Driven Drug Discovery with the World’s Largest Protein-Drug Interaction Dataset

Warren Thompson, PhD, Senior Computational Scientist, Diamond Light Source

The OpenBind consortium represents a transformative opportunity to address the critical data bottleneck for evolving co-folding models. This talk will highlight the progress made toward routine collection of structural-affinity pairs at a scale never attempted in the public domain. This challenge requires coordinating diverse scientific work packages: from AI-assisted compound design and automated synthesis, to high-throughput structural and affinity measurements, and finally to data dissemination and blind challenges.

9:50 am

Unlocking Novel Therapeutic Space: ALiCE HTPE as the Cell-Free Data Engine for AI-Guided Design of Next-Gen Formats

Jonathan Fauerbach, Head of R&D, LenioBio GmbH

10:20 am Coffee Break in the Exhibit Hall with Poster Viewing

11:00 am

Chairperson's Remarks

Hunter Elliott, PhD, Senior Director, Machine Learning, BigHat Biosciences

11:05 am

Incorporating in silico Tools into Antibody Discovery: Challenges and Opportunities

Andrew Nixon, PhD, Senior Vice President, Global Head Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceuticals Inc.

Antibody discovery is being transformed by the integration of in silico tools, from machine learning models to structure-based design. This presentation will explore how computational methods are being incorporated into discovery pipelines at scale, highlighting key opportunities for accelerating candidate selection and improving developability. It will also address ongoing challenges—including data quality, model interpretability, and cross-disciplinary integration—that must be overcome to realize the full potential of AI-driven antibody discovery.

11:35 am

AI for Antibody Design - Going Beyond the Static

Charlotte M. Deane, PhD, Professor, Structural Bioinformatics, Statistics, University of Oxford; Executive Chair, Engineering and Physical Sciences Research Council (EPSRC)

We can now computationally predict a single, static protein structure with high accuracy. However, we are not yet able to reliably predict structural flexibility. The ability of proteins to adapt their shape can be fundamental to their functional properties. A major factor limiting such predictions is the scarcity of suitable training data. I will show novel tools and databases that help to overcome this.

12:05 pm

Redesigning Antibody CDRs to Improve Developability Properties Using Machine Learning

Peter M. Tessier, PhD, Albert M. Mattocks Professor, Pharmaceutical Sciences & Chemical Engineering, University of Michigan

Antibody complementarity-determining regions (CDRs) form complex 3D surfaces that mediate high-affinity interactions with their target antigens. Some of the same sites in CDRs that mediate specific antibody binding also mediate undesirable developability properties. Here, we report methods for redesigning antibody CDRs—including those at sites in or near the paratope—to improve developability while maintaining high affinity and specificity.

12:35 pm Transition to Lunch

12:40 pm

LUNCHEON PRESENTATION: Ginkgo Datapoints Antibody Developability Competition Outcomes: Limited Model Performance and a Call for Data Standardization

Josh Moller, Senior Biological Engineer, AI, Ginkgo Datapoints

Antibody clinical viability depends critically on developability attributes, yet predictive model development is hampered by limited, heterogeneous data and poor generalization. To address this gap, we established the 2025 Ginkgo Datapoints Developability Competition, creating a new, blinded benchmark for developability modeling. We will share key observations of the competition, including model overfitting and limited out-of-distribution generalization. Future advances in modeling require larger, standardized datasets and more rigorous evaluation frameworks to translate predictive models into reliable design tools.

1:10 pm Session Break

1:45 pm Refreshment Break in the Exhibit Hall with Poster Viewing

2:15 pm

Chairperson’s Remarks

Elahe Vedadi, PhD, Research Scientist, Google/DeepMind

2:20 pm

NextGenPLM: A Novel Structure-Infused Foundational Protein Language Model for Antibody/NANOBODY VHH Discovery and Optimization

Abhinav Gupta, PhD, Principal Machine Learning Scientist, AI Innovation, Large Molecule Research, Sanofi

Sequence-only PLMs lack spatial context and miss critical folding, interface, and environment-dependent cues, while structure-prediction and docking methods are too slow and underperform on antibody and VHH complexes. NextGenPLM bridges this gap with a modular, scalable design that fuses pretrained PLMs with multimodal inputs—from raw sequences and functional assays to high-resolution structures—via spectral contact-map embeddings. Using results from internal campaigns, we will demonstrate its potential for rapid, data-driven biologics discovery.

2:50 pm

AI Co-Scientist Use Cases in Discovery, Engineering, and Development of Therapeutic Proteins

Elahe Vedadi, PhD, Research Scientist, Google/DeepMind

Accelerating scientific discovery requires novel computational approaches. Our AI co-scientist, a multi-agent system, addresses this by augmenting the research process. It uses a unique "generate, debate, and evolve" methodology, powered by scaled test-time compute, to help scientists formulate novel hypotheses. In therapeutic protein development, this approach has already yielded promising, experimentally validated results, identifying new epigenetic targets for liver fibrosis and drug repurposing candidates for AML. This demonstrates the co-scientist's potential to significantly advance complex research by empowering scientists with a powerful collaborative tool.

3:20 pm

PANEL DISCUSSION:

An Honest Conversation about What It Takes to Make ML Work in Biotherapeutics

PANEL MODERATOR:

Nicola Bonzanni, CEO, ENPICOM

We'll explore what it really takes to make machine learning useful in biologics discovery and engineering. From bridging lab and data science workflows, to dealing with scattered data and real-world model limitations, we’ll talk about what works, what doesn’t, and why. Expect a grounded look at the everyday decisions behind successful ML implementation: practical insights on preparing data, aligning teams, and deploying models where they matter most—in scientists’ hands.

• Why structured, high-throughput data and robust pipelines matter at least as much as good models
• What it takes to move from scattered analyses to automated, end-to-end workflows that support ML adoption
• Why most AI initiatives stall and what it really takes to operationalize models and shift team culture
• How to ensure lab scientists can actually use ML-model outputs in their day-to-day work
• AI adoption blockers: siloed teams, missing expertise, infrastructure readiness

PANELISTS:

Abhinav Gupta, PhD, Principal Machine Learning Scientist, AI Innovation, Large Molecule Research, Sanofi

Melody Shahsavarian, PhD, Senior Director, Data Strategy & Digital Transformation, Biotherapeutics Discovery Research, Eli Lilly & Company

Roberto Spreafico, PhD, Senior Director, Biologics AI Innovation, AstraZeneca

Michail Vlysidis, PhD, Principal Engineer, AbbVie

Daniel Yoo, Scientific Associate Director, Large Molecule Discovery, Amgen, Inc.

4:20 pm Refreshment Break in the Exhibit Hall with Poster Viewing

4:50 pm Interactive Breakout Discussions

TABLE 1: Accelerating Antibody Engineering with AI-Driven Active Learning: Optimizing DMTA Cycles

Jiangyan Feng, PhD, Senior Advisor, Biotechnology Discovery Research, Eli Lilly and Company

Active learning strategies for antibody optimization
Closing the loop: Integrating experimental feedback into ML models
Multi-objective optimization in antibody DMTA
Speed vs. thoroughness: Designing efficient DMTA cycles

TABLE 2: Best Practices for Using Agentic and Multi-Agent Systems in R&D

Elahe Vedadi, PhD, Research Scientist, Google/DeepMind

Strengths and limitations of the current systems
Practical techniques to evaluate these systems
Best practices for enabling agents to effectively utilize external tools and APIs
Planning for complex problem decomposition and effective action sequencing

5:40 pm Close of Day

Thursday, January 22

8:00 am Registration Open

8:25 am

Welcome Remarks

Christina Lingham, Executive Director, Conferences and Fellow, Cambridge Healthtech Institute

8:30 am

Plenary Keynote Introduction

Andrew Nixon, PhD, Senior Vice President, Global Head Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceuticals Inc.

8:35 am

New Frontier of Biotherapeutic Discovery: Where Machine Learning Meets Molecular Design

Stephanie Truhlar, PhD, Vice President, Biotechnology Discovery Research, Eli Lilly and Company

The integration of AI into antibody discovery is transforming biotherapeutic development by accelerating timelines, improving success rates, and enabling access to challenging targets. At Lilly, we leverage a host of predictive tools to enable rapid high-quality hit selection, which is becoming our standard process to accelerate our discovery programs. Furthermore, our scientists have successfully utilized generative AI to explore previously inaccessible sequence space and engineer optimized antibodies with superior properties.

9:00 am PLENARY FIRESIDE CHAT:

End-to-End in silico-Designed Biologics

PANEL MODERATOR:

Andrew Nixon, PhD, Senior Vice President, Global Head Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceuticals Inc.

How is the path to drug development different with ML/AI?
How far off is de novo design for biologics? For antibodies?
How is ML/AI used for target selection?
How do you accelerate DMTA cycles?
Data standardization—how to incorporate historical data?
Federated learning—how do you ensure you have enough data to build a model?
Promoting change management

PANELISTS:

Charlotte M. Deane, PhD, Professor, Structural Bioinformatics, Statistics, University of Oxford; Executive Chair, Engineering and Physical Sciences Research Council (EPSRC)

Garegin Papoian, PhD, Co-Founder & CSO, DeepOrigin

Stephanie Truhlar, PhD, Vice President, Biotechnology Discovery Research, Eli Lilly and Company

9:30 am Coffee Break in the Exhibit Hall with Poster Viewing

9:45 am

Meet the Moderators at the Plaza in the Exhibit Hall

Michelle R. Gaylord, MS, Former Principal Scientist, Protein Expression & Advanced Automation, Velia Therapeutics

Deborah Moore-Lai, PhD, Vice President, Protein Sciences, ProFound Therapeutics

Join us for an inspiring Women in Science Meet-Up at this year’s BioLogic Summit—an inclusive meet-up designed to connect, uplift, and celebrate women across all stages of their scientific careers. Engage in meaningful conversations, share your journey, and gain insights from trailblazing women shaping the future of bioprocessing. Whether you're a newcomer or a seasoned professional, this is a chance to build a supportive network, foster mentorship, and discuss opportunities and challenges unique to women in the field. Our Women in Science programming invites the entire scientific community to discuss these barriers as we believe that all voices are necessary and welcome.

10:20 am

Chairperson’s Remarks

Frank Teets, PhD, Head, Computational Science, AI Proteins

10:25 am

Scalable Emulation of Protein Equilibrium Ensembles with Generative Deep Learning

Frank Noé, PhD, Partner Research Manager, Microsoft; Professor, Machine Learning for the Sciences, Free University Berlin

Despite advances in sequence and structure prediction, capturing functionally relevant protein dynamics at scale remains a major challenge. We present BioEmu, a deep learning system that emulates protein equilibrium ensembles by generating thousands of statistically independent structures per hour on a single GPU. Trained on over 200 milliseconds of molecular dynamics (MD) simulations, static structures, and experimental stability data, BioEmu captures complex motions like cryptic pocket formation and domain rearrangements. It predicts relative free energies within 1 kcal/mol accuracy and jointly models structural ensembles and thermodynamic properties, offering mechanistic insights and a scalable, cost-effective path to understanding and designing protein function.

10:55 am

Integrating Physics in Deep Learning Algorithms: A Force Field as a PyTorch Module

Joost Schymkowitz, PhD, Professor & Group Leader, Switch Lab, VIB-KU Leuven

We present a dual-framework approach for therapeutic antibody design that combines EvolveX, a structure-based CDR design pipeline using FoldX, with MadraX, a PyTorch-integrated differentiable force field. This physics-informed strategy enables data-efficient, interpretable protein engineering. EvolveX achieved >1000-fold affinity gains with structural and functional validation, while MadraX bridges deep learning and biophysics by allowing gradient flow through energy functions, enhancing design in data-scarce contexts.

11:25 am

Learning Millisecond Protein Dynamics from What Is Missing in NMR Spectra

Gina El Nesr, Graduate Researcher, Biophysics, Stanford University

The biological function of many proteins relies on micro- to millisecond dynamics, but large-scale data for studying these motions are scarce. By curating >100 NMR relaxation datasets, we noticed that an observable hides in plain sight in >10,000 proteins in the BMRB. We trained Dyna-1 on this observable and found that it also predicts µs–ms motion directly measured in NMR relaxation experiments. Dynamics linked to biological function (e.g., enzyme catalysis, ligand binding) are particularly well predicted.

11:55 am Attend Parallel Presentation or Enjoy Lunch on Your Own

1:00 pm Ice Cream & Cookie Break in the Exhibit Hall with Last Chance for Poster Viewing

1:40 pm

Chairperson’s Remarks

Michail Vlysidis, PhD, Principal Engineer, AbbVie

1:45 pm

Modeling Antibody Conformational Ensembles with Physics-Based Simulations and Deep Learning

Fabian Spoendlin, Researcher, Oxford Protein Informatics Group, University of Oxford

Antibodies exhibit structural flexibility that is often central to their function. Here, we introduce two approaches to model conformational ensembles of antibodies at scale. First, we developed a high-throughput MD workflow that reproduces ensemble metrics observed in all-atom simulations and experimental datasets. Using this pipeline, we simulated over 150,000 antibodies to investigate flexibility profiles. Second, we trained a deep learning model on MD data and demonstrated its ability to predict key conformations.

2:15 pm

Zero-Shot Antibody Design in a 24-Well Plate with Chai Discovery

Matthew McPartlon, PhD, Co-Founder, Chai Discovery

Despite advances in AI-driven protein design, fully de novo generation of antigen-binding antibodies without extensive experimental screening has remained challenging. Here, we introduce Chai-2, a multimodal generative framework that designs antibodies (VH/VL), nanobodies (VHH), and miniprotein binders zero-shot, guided solely by target structures and epitopes. Evaluated across 52 structurally novel protein targets lacking preexisting binders, Chai-2 achieves a ~16% overall binding hit rate (over 100-fold higher than prior computational approaches), identifying binders for ~50% of targets within a single experimental round when testing no more than 20 designs each. For miniproteins, Chai-2 yields a 68% success rate, routinely producing picomolar-affinity binders.

2:45 pm

Functional and Epitope Specific Monoclonal Antibody Discovery Directly from Immune Sera Using Cryo-EM

James Ferguson, Postdoctoral Associate, Integrative Structural & Computational Biology, Scripps Research Institute

Antibodies are crucial therapeutics, but traditional discovery methods are labor-intensive and limit high-throughput analysis. We developed an approach combining structural analysis and bioinformatics to infer heavy and light chain sequences from cryo-EM maps of serum-derived polyclonal antibodies bound to antigens. Using ModelAngelo for automated structure-building, we accelerated sequence determination and identified matches in B cell repertoires via ModelAngelo-derived hidden Markov models. Benchmarking against non-human primate HIV vaccine trials showed reduced analysis time from weeks to under one day with higher precision. Validation with murine influenza vaccination sera revealed multiple protective antibodies, enhancing discovery workflows for vaccine development and therapeutic applications.

3:15 pm

Protein Language Models Instead of Structure-Based Models for Biologics Predictions

Michail Vlysidis, PhD, Principal Engineer, AbbVie

Protein language models (PLMs) offer a fast and cost-effective alternative to traditional structure-based descriptors, which are computationally expensive and time-consuming. By generating encodings directly from sequence data, PLMs bypass the need for resolved protein structures. This talk explores whether PLMs can match the accuracy of traditional methods while providing a more efficient solution. We discuss comparative analyses that evaluate PLMs' precision and highlight their potential to accelerate drug discovery.

3:45 pm

Model for Predicting Protein Expression for Miniproteins

Frank Teets, PhD, Head, Computational Science, AI Proteins

We present a lightweight, fully sequence-based convolutional neural network for predicting miniprotein expression. Trained on curated in-house data labeled for production fidelity, the model operates as a binary classifier and generalizes across topologies with >90% accuracy. Now integrated into all drug discovery pipelines, it enables efficient downsampling of designed proteins, reduces synthesis waste, and exemplifies the value of automated design-build-test loops in AI-driven protein design.

4:15 pm Close of BioLogic Summit