1. What is a Machine Learning Engineer at Relace?
As a Machine Learning Engineer at Relace, you are not building standard wrapper applications; you are developing the foundational models and infrastructure that power the next generation of code agents. Relace powers the fastest model on OpenRouter, at 10,000 tokens per second, and sits at the intersection of cutting-edge research and low-level systems engineering. The models you build and optimize are relied upon by fast-moving, high-scale engineering organizations such as Lovable, Figma, and Vercel.
This role is critical because optimizing small language models (SLMs) for retrieval, application, and core code generation requires getting the most out of modern hardware. Whether you focus on the systems engineering side (writing custom CUDA kernels and optimizing memory layouts) or the science side (designing training methodologies and model architectures), your work directly affects how code gets written. You will work alongside a team of mathematicians, physicists, and computer scientists who value elegant systems design and mathematical rigor.
For anyone passionate about deep performance tuning and running large-scale machine learning workloads close to the metal, this role offers a rare engineering playground. The environment is fast-paced, collaborative, and deeply technical, and it demands a first-principles approach to solving training and inference bottlenecks.