[Remote] Student Researcher [Seed Vision – Multimodal Joint Modeling] – 2026 Start (PhD)
Note: This is a remote position open to candidates in the USA.

ByteDance is a leading company in AI foundation models, focusing on advanced research and technological advancements. The Student Researcher conducts research on multimodal generative models and contributes to foundational models for visual generation.

Responsibilities
- Conduct research on joint training of vision, language, and video models under a unified architecture
- Develop scalable and efficient methods for autoregressive-style multimodal pretraining, supporting both understanding and generation
- Explore cross-modal tokenization, alignment, and shared representation strategies
- Investigate instruction tuning, captioning, and open-ended generation capabilities across modalities
- Contribute to system-level improvements in data curation, model optimization, and evaluation pipelines

Skills
- Currently pursuing a PhD in Computer Vision, Machine Learning, NLP, or a related field
- Research experience in multimodal learning, large-scale pretraining, or vision-language modeling
- Proficiency in deep learning frameworks such as PyTorch or JAX
- Demonstrated ability to conduct independent research, with publications in top-tier conferences such as CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR
- Experience with autoregressive LLM training, especially in multimodal or unified modeling settings
- Familiarity with instruction tuning, vision-language generation, or unified token space design
- Background in model scaling, efficient training, or data mixture strategies
- Ability to work closely with infrastructure teams to deploy large-scale training workflows

Benefits
- Day one access to health insurance
- Life insurance
- Wellbeing benefits
- 10 paid holidays per year
- Paid sick time (56 hours if hired in the first half of the year, 40 hours if hired in the second half)
- Housing allowance

Company Overview
ByteDance is a technology company that develops content creation platforms and services.
It was founded in 2012 and is headquartered in Beijing, China, with a workforce of more than 10,000 employees.

Company H1B Sponsorship
ByteDance has a track record of offering H1B sponsorships, with 1350 in 2025, 1123 in 2024, 775 in 2023, 487 in 2022, 417 in 2021, and 245 in 2020. Please note that this does not guarantee sponsorship for this specific role.