Machine Learning Researcher

Remote, London, United Kingdom


Description

We are seeking a Machine Learning Researcher to join our team and help advance the state of the art in human-centric generative video models. Your work will focus on improving expression control, lip synchronisation, and overall realism in models such as WAN and Hunyuan. You'll collaborate with a world-class team of researchers and engineers to build systems that can generate lifelike talking-head videos from text, audio, or motion signals, pushing the boundaries of neural rendering and avatar animation. We are hiring remotely across the EMEA region.
 
Key Responsibilities
  • Research and develop cutting-edge generative video models, with a focus on controllable facial expression, head motion, and audio-driven lip synchronisation.
  • Fine-tune and extend video diffusion models such as WAN and Hunyuan for better visual realism and audio-visual alignment.
  • Design robust training pipelines and large-scale video/audio datasets tailored for talking-head synthesis.
  • Explore techniques for controllable expression editing, multi-view consistency, and high-fidelity lip sync from speech or text prompts.
  • Work closely with product and creative teams to ensure models meet quality and production constraints.
  • Stay current with the latest research in video generation, speech-driven animation, and 3D-aware neural rendering.
 
Must Haves
  • Strong background in machine learning and deep learning, especially in generative models for video, vision, or speech.
  • Hands-on experience with video synthesis tasks such as face reenactment, lip sync, audio-to-video generation, or avatar animation.
  • Proficient in Python and PyTorch; familiar with libraries like MMPose, MediaPipe, DLIB, or image/video generation frameworks.
  • Experience training large models and working with high-resolution audio/video datasets.
  • Deep understanding of architectures such as transformers, diffusion models, and GANs, as well as motion representation techniques.
  • Proven ability to work independently and drive research from idea to implementation.
  • Strong problem-solving skills and the ability to work autonomously in a remote-first environment.
 
Nice to Have
  • PhD in Computer Vision, Machine Learning, or a related field, with publications in top-tier conferences (CVPR, ICCV, ICLR, NeurIPS, etc.).
  • Familiarity with or contributions to open-source projects in lip sync, video generation, or 3D face modelling.
  • Experience with real-time inference, model optimisation, or deployment for production applications.
  • Knowledge of adjacent areas like emotion modelling, multimodal learning, or audio-driven animation.
  • Experience working with or adapting models like WAN, Hunyuan or similar.

About BRAHMA AI:
BRAHMA AI is the next generation of enterprise media technology formed through the integration of Prime Focus Technologies and Metaphysic. By combining CLEAR®, CLEAR® AI, ATMAN, and VAANI into one ecosystem, BRAHMA AI enables enterprises to manage, create, and distribute content with intelligence, security, and efficiency.
Proven, scalable, and enterprise-tested, BRAHMA AI is helping global organizations accelerate growth, efficiency, and creative impact in the AI-powered era.