Senior Data Scientist III – AI Evaluation, Prompt Engineering

Remote, USA Full-time

Job Description:
• Evaluate and tune LLM-powered features, such as prompt optimization, retrieval-augmented generation (RAG) systems, and semantic search performance
• Design and execute experiments to measure model quality, reliability, and user impact, translating technical findings into product recommendations
• Develop and maintain data pipelines for evaluating, tracking, and improving system performance (e.g., accuracy, latency, cost, and relevance metrics)
• Analyze structured and unstructured datasets (e.g., product usage logs, document metadata, LLM outputs) to identify patterns, insights, and areas for optimization
• Collaborate with product managers to translate product goals into measurable data science questions, propose next steps, and inform roadmap priorities
• Provide technical guidance to data engineers who build and maintain analytics and model evaluation infrastructure
• Communicate results clearly (through written reports, dashboards, and presentations) to technical and non-technical stakeholders
• Stay current on emerging practices in applied NLP, LLM evaluation, and data-driven product development, and thoughtfully adapt them to our environment

Requirements:
• 3–6 years of experience in data science, applied NLP, or AI product analytics, preferably within a SaaS or research-heavy product environment
• Strong proficiency in Python and data analysis libraries such as Pandas; solid working knowledge of SQL
• Ability to design and evaluate LLM-based systems (e.g., RAG pipelines, prompt evaluations, output scoring), even if not specialized in deep learning
• Experience with data exploration, experimentation, and reporting, from defining metrics to visualizing and interpreting results
• Comfort working with document-based datasets (e.g., text corpora, metadata, embeddings) and understanding information retrieval/semantic search concepts
• Excellent written and verbal communication skills, with the ability to present complex ideas simply and persuasively across distributed teams
• Proven ability to self-direct, learn new tools and concepts quickly, and apply them pragmatically
• Strong sense of curiosity, patience, and collaboration, especially in working across different disciplines and cultures

Benefits:
• Flexible remote-first work environment, with the option to work from our New York office
• Comprehensive health coverage, including medical, dental, and vision plans
• Retirement plan with inclusive risk benefits (disability, critical illness, life cover, and funeral cover)
• Modern family benefits, including adoption, surrogacy, and parental leave
• Paid study leave and professional development support
• Well-being initiatives and opportunities for sabbaticals and personal growth
• A culture that values work/life balance, clear communication, and continuous learning

Apply to this job
