Member of the Technical Staff, Biological Data

Output Biosciences

Location

New York, NY

Job Type

Full-time

Posted

June 11, 2026

Salary Range

$150k - $250k USD

Job Description

The Role

Output has built a biological reasoning model that understands biology at the scale and complexity at which life actually operates. Our model independently learned the principles of molecular interactions, opening up drug treatments that were previously impossible. We're already generating therapies that traditional approaches cannot reach. The hardest problems in both AI and biology are being solved here, and there is room for you to own one.

Output is currently in stealth, operated by a team of repeat founders and biotech veterans with multiple exits in AI x Bio, and backed by top-tier VCs including Y Combinator.

You will own the data that our models learn from. This role requires a deep understanding of molecular biology - what a biological data source contains, what it implies, and what is missing. The quality and coverage of training data determine what our models can learn, and the biological insight behind how that data is constructed is the difference between a model that memorizes and one that reasons.

You will construct training datasets that capture how proteins and molecules interact, drawing from diverse biological data sources and extending them with your understanding of molecular principles

You will develop methods to expand training data beyond what exists in public databases, using biological and chemical reasoning to create new training signal where current data is sparse or absent

You will design benchmarks grounded in real molecular phenomena, measuring whether our models have learned biologically meaningful capabilities rather than statistical shortcuts

You will develop data strategies in collaboration with model researchers, determining what the model should learn from, what biological signal to prioritize, and how to sequence learning across modalities

You will design approaches for integrating data across biological scales and modalities, building coherent training data from heterogeneous experimental and computational sources

You will design rigorous splitting and evaluation strategies that prevent leakage and ensure model capabilities generalize to real biological problems

You will stay current with biological data sources, experimental methods, and molecular databases, continuously identifying new sources of training signal

About You

You have a PhD in computational biology, biophysics, structural biology, chemistry, biochemistry, or a related biological field with 2+ years of post-doctoral or industry research experience, or equivalent depth through a combined biology and computational background

You have deep understanding of molecular interactions, protein structure, and biological data at the molecular level, grounded in first principles rather than surface familiarity

You have experience working with large-scale biological or molecular datasets, including sourcing, cleaning, integrating, and analyzing heterogeneous data

You have strong programming skills in Python and are comfortable building computational pipelines for data processing at scale

You understand what machine learning models require from training data: coverage, quality, balance, and evaluation rigor

You approach data construction as a research problem, not a pipeline task: you think carefully about what data means, what signal it carries, and what is absent

Bonus Points

You have experience with computational biology tools such as structure prediction, molecular docking, or virtual screening

You have experience training or evaluating machine learning models, particularly on molecular or biological data

You have publications in computational biology, bioinformatics, or molecular informatics

You have a background in cheminformatics or molecular data analysis

You have experience working with protein or molecular language models

Our Values

❤️ Heart: We foster a culture of ownership. We are assembling a team of individuals who are passionate and take pride in their contributions.

🏆 Excellence: We have an unwavering commitment to excellence and continuously challenge ourselves to reach the highest standards.

🚀 Practicality: We value practicality and results-oriented thinking. We are committed to making a tangible impact on the lives of patients and the broader community.

📣 Honesty: We place a high value on honesty and directness. We firmly believe in addressing issues as they arise, in an open and transparent manner.

🎮 Fun: We believe that life is too short to not have fun. Our goal is to create a workplace that is fun, engaging, rewarding and fulfilling.

What We Offer

We encourage new and different ideas, creativity and contrarian thinking

A healthy, feedback-focused environment to help you thrive: leadership will have high expectations, regularly share constructive feedback, support you and help you grow, and welcome feedback and ideas from you

You own your day-to-day management. What we care about is that we all hit our milestones

Competitive salary and equity in a growing, well-funded startup

Excellent medical, dental, and vision coverage

Frequently Asked Questions

Where is the job located, and is it remote/hybrid/on-site?

The job is located in New York, NY. The posting does not specify a remote, hybrid, or on-site work-mode policy.

What are the required qualifications and experience level for this role?

You need a PhD in computational biology, biophysics, structural biology, chemistry, biochemistry, or a related field with 2+ years of post-doctoral or industry research experience (or equivalent depth). You must have a deep understanding of molecular interactions, experience with large-scale biological datasets, strong Python programming skills, and an understanding of machine learning training data requirements.

What are the key responsibilities of the Member of the Technical Staff, Biological Data?

You will construct training datasets capturing protein and molecular interactions, develop methods to expand training data, design benchmarks grounded in molecular phenomena, collaborate on data strategies with model researchers, integrate data across biological scales, and design rigorous splitting and evaluation strategies to prevent leakage.

What benefits and compensation are offered for this position?

Output Biosciences offers a competitive salary, equity in a well-funded startup, and excellent medical, dental, and vision coverage.

Job Information

Remote Type: onsite

Experience: Senior

Allowed Locations: Worldwide

Skills & Tags:

biological data, foundation models, AI for biology, machine learning, data engineering, drug discovery
