Staff Data Scientist, Genomics

CZ Biohub
Location
Redwood City, CA
Job Type
Full-time
Posted
March 12, 2026
Salary Range
$214k - $295k USD

Job Description

Biohub seeks a data science leader to shape how biological data powers frontier AI models. The position combines strategic vision-setting with hands-on technical work, focusing on data representation and tokenization for genomic and multi-modal biological datasets.

Key Responsibilities

  • Design data representation and tokenization strategies for diverse biological data types
  • Establish data standards and quality metrics for cross-dataset integration
  • Combine heterogeneous data modalities into unified training frameworks
  • Assess how representation choices impact model performance and biological signal capture
  • Collaborate with ML engineers and AI researchers on dataset design
  • Lead cross-functional initiatives spanning engineering, science, and product teams
  • Identify new data acquisition and generation opportunities
  • Mentor staff and establish data science rigor standards

Required Qualifications

  • PhD in computational biology, bioinformatics, or related quantitative field
  • 8+ years with large-scale biological datasets (genomics, transcriptomics, proteomics, multi-omics)
  • Deep knowledge of measurement types: sequencing, imaging, proteomics
  • Experience designing data representations for machine learning applications
  • Strong Python and scientific computing skills
  • Expertise in ML/statistical modeling and modern architectures
  • Cross-disciplinary communication abilities

Benefits

  • 401(k) matching
  • Volunteer time off
  • Family-forming benefits
  • Relocation support

Job Information

Remote Type: hybrid
Experience: Senior
Allowed Locations: Worldwide
Skills & Tags:
genomics, Python, machine learning, AI, multi-omics, data science, transcriptomics, proteomics, bioinformatics