Systems Research Engineer, Machine Learning Systems

Remote, USA | Full-time | Posted 2025-03-12

About the position

As a Systems Research Engineer specializing in Machine Learning Systems, you will play a crucial role in researching and building the next-generation AI platform at Together. Working closely with the modeling, algorithm, and engineering teams, you will design large-scale distributed training systems and a low-latency, high-throughput inference engine that serves a diverse, rapidly growing user base. Your research skills will be vital for staying up to date with the latest advancements in machine learning systems, ensuring that our AI infrastructure remains at the forefront of innovation.

Responsibilities
- Optimize and fine-tune the existing training and inference platforms to achieve better performance and scalability
- Collaborate with cross-functional teams to integrate cutting-edge research ideas into existing software systems
- Develop your own ideas for optimizing the training and inference platforms and push the frontier of machine learning systems research
- Stay up to date with the latest advancements in machine learning systems techniques and apply them to the Together platform

Requirements
- Strong background in machine learning systems, such as distributed learning and efficient inference for large language models and diffusion models
- Knowledge of ML/AI applications and models, especially foundation models such as large language models and diffusion models, including how they are constructed and used
- Knowledge of system performance profiling and optimization tools for ML systems
- Excellent problem-solving and analytical skills
- Bachelor's, Master's, or Ph.D. degree in Computer Science or Electrical Engineering, or equivalent practical experience

Benefits
- Competitive compensation
- Startup equity
- Health insurance
- Flexibility in terms of remote work
