[Remote] Research Engineer – Applied Generative AI (LLMs & Multimodal Systems)
Note: This is a remote position open to candidates in the USA.

Pocket FM is where audio storytelling comes to life, powered by cutting-edge AI. As a Research Engineer on our team, you will architect and implement core AI systems that directly impact storytelling experiences for users.

Responsibilities
• Architect & Implement Fine-Tuning Pipelines: Go beyond notebooks. Design, build, and optimize robust pipelines for fine-tuning foundation models (e.g., LLaMA, Mistral, Qwen) on custom datasets. You'll own the process from data curation and training to evaluation for tasks like narrative generation and dialogue synthesis.
• Develop & Deploy Multimodal AI Systems: Engineer and productionize models that seamlessly blend modalities. Your primary focus will be on creating state-of-the-art systems for text, speech, and audio generation to power dynamic and immersive storytelling.
• Own the AI Orchestration Layer: Design and implement a scalable orchestration system (e.g., using LangGraph, Ray, or custom frameworks) to manage complex, multi-agent AI workflows. This includes planning agents, tool-using models, and evaluation layers that work in concert.
• Build Scalable MLOps Infrastructure: Bridge the gap between models and our production environment. You will integrate generative AI workflows with our cloud infrastructure, ensuring our systems for training, inference, and deployment are efficient, reliable, and scalable.
• Translate Research into Production-Ready Code: Be the critical link between theoretical research and tangible product features. You'll read the latest papers, identify promising techniques, and write the high-quality, efficient code needed to make them work at scale.

Skills
• Proficiency in Python, PyTorch, and the Hugging Face ecosystem (Transformers, Accelerate, PEFT)
• Demonstrable experience with distributed training frameworks like FSDP, DeepSpeed, or Megatron-LM
• Familiarity with AI workflow orchestrators (e.g., LangGraph, Prefect, Ray) and experience connecting models to cloud infrastructure (AWS, GCP, or Azure)
• A track record (ideally 3+ years) of building and shipping machine learning models into production environments
• Deep, hands-on experience fine-tuning large language models using techniques like LoRA, QLoRA, DPO, or RLHF
• Practical experience building models that integrate multiple modalities (e.g., text-to-speech, audio understanding, vision-language)
• Experience building or maintaining the full AI pipeline, from data ingestion to model serving and monitoring
• A bias for action and an obsession with shipping robust, efficient code
• Experience with Retrieval-Augmented Generation (RAG) systems
• Experience building AI agents
• Experience designing novel evaluation frameworks for generative models

Benefits
• Competitive compensation
• Meaningful employee stock options (ESOPs)

Company Overview
• Pocket FM creates audio series platforms for long-form audio entertainment. It was founded in 2018 and is headquartered in Bangalore, Karnataka, IND, with a workforce of 501-1000 employees.

Company H1B Sponsorship
• Pocket FM has a track record of offering H1B sponsorships, with 1 in 2025. Please note that this does not guarantee sponsorship for this specific role.