Data Scientist, Cancer Informatics and AI/ML, Remote, Grant Funded

Work from home Full-time role Hiring

This position will support computational oncology and cancer informatics research initiatives focused on transforming complex clinical data into structured, actionable datasets for research, quality improvement, clinical trial identification, and care delivery optimization. The role will emphasize applied machine learning, natural language processing, and large language model-driven workflows using real-world clinical data, including electronic health record data, pathology reports, radiology reports, clinical notes, genomics, treatment data, and other institutional data sources. The Data Scientist will work semi-independently in close collaboration with clinical investigators, informatics teams, biostatisticians, and other data science stakeholders to design, build, evaluate, and refine computational pipelines. The ideal candidate will have practical prior experience developing data science workflows in Python and using modern machine learning or LLM-based tools in real projects.

Job Responsibility

  • Develop, test, and maintain Python-based data pipelines for clinical research, quality improvement, and computational oncology projects.
  • Support cancer informatics projects involving natural language processing, machine learning, large language models, and structured extraction from unstructured clinical data.
  • Build workflows for processing clinical notes, pathology reports, radiology reports, treatment records, genomics reports, and other real-world healthcare data sources.
  • Implement and evaluate LLM-assisted workflows, including prompt engineering, structured output generation, model benchmarking, validation pipelines, and error analysis.
  • Assist with the development of retrieval-augmented generation workflows, vector search, embedding-based retrieval, and related approaches where appropriate.
  • Work with clinical subject matter experts to translate oncology-focused research questions into executable data science tasks.
  • Perform data cleaning, data wrangling, exploratory analysis, feature engineering, model development, and model performance evaluation.
  • Generate reproducible analyses, reports, dashboards, tables, and visualizations to communicate findings to clinical and operational stakeholders.
  • Maintain clear documentation of code, analytic decisions, model assumptions, validation methods, and project outputs.
  • Participate in model validation efforts, including comparison of computational outputs against clinician-reviewed reference standards.
  • Contribute to manuscript, abstract, grant, and presentation development through data analysis, figure generation, and methods documentation.
  • Work independently on assigned analytic tasks while communicating progress, limitations, and blockers clearly to project leadership.

Job Qualification

  • Bachelor’s Degree in Computer Science, Informatics, Statistics, Engineering, Data Science, or related field, required. Master’s Degree, preferred.
  • Minimum of two (2) years of post-graduate training or experience involving quantitative data analysis, required; experience working with clinical data, data science, and machine learning, preferred.
  • Working familiarity with basic medical and health information technology concepts, including standardized terminologies and ontologies and electronic health records, as well as Data Warehousing and Business Intelligence tools, required.
  • Expertise in working with SQL relational databases and statistical or general programming languages (e.g., Python, R), required.
  • Deep understanding of statistical and predictive modeling concepts, machine-learning approaches, clustering and classification techniques, and recommendation and optimization algorithms.

HIGHLY PREFERRED

  • Demonstrated prior experience building or implementing applied data science, machine learning, NLP, or LLM-based workflows. Completion of a short AI certificate, bootcamp, or introductory course alone is not sufficient for this role.
  • Strong practical experience with Python for data science, including pandas, NumPy, scikit-learn, Jupyter notebooks, and reproducible analytic workflows.
  • Prior experience applying machine learning, natural language processing, or large language models to real-world data problems.
  • Experience using off-the-shelf LLMs through APIs or enterprise platforms, including structured prompting, output parsing, evaluation, and workflow integration.
  • Experience with retrieval-augmented generation, vector databases, embeddings, semantic search, or document retrieval pipelines.
  • Experience working with clinical, biomedical, or electronic health record data.
  • Familiarity with oncology data, cancer registries, pathology reports, radiology reports, genomics reports, or clinical trial data.
  • Experience working in secure data environments, enterprise data warehouses, Databricks, Spark, SQL databases, or cloud-based analytic platforms.
  • Ability to write clean, maintainable, well-documented code and use version control such as Git.
  • Demonstrated ability to work semi-independently, manage multiple analytic tasks, and communicate technical concepts to non-technical clinical collaborators.
  • Prior experience contributing to academic research, abstracts, manuscripts, grant-funded projects, or healthcare quality improvement initiatives.
  • Understanding of model evaluation concepts including accuracy, precision, recall, F1 score, calibration, error analysis, and external validation.
  • Experience with prompt engineering alone is not sufficient; candidates should have substantive prior experience in data science, machine learning, computational research, or applied analytics.

Additional Salary Detail

The salary range and/or hourly rate listed is a good faith determination of potential base compensation that may be offered to a successful applicant for this position at the time of this job advertisement and may be modified in the future. When determining a team member's base salary and/or rate, several factors may be considered as applicable (e.g., location, specialty, service line, years of relevant experience, education, credentials, negotiated contracts, budget and internal equity).
