The HPC/AI (High-Performance Computing and Artificial Intelligence) team is on a mission to build the next-generation distributed AI supercomputer, enabling breakthroughs in artificial intelligence by delivering unmatched computational power, scalability, and reliability. We design and develop cutting-edge infrastructure that supports high-performance AI model training at scale, laying the foundation for innovations that redefine what AI can achieve.
We are seeking individuals with experience in network engineering to help design and develop the systems that support large-scale AI and HPC workloads. This role involves working on network infrastructure, automation workflows, observability tools, and performance optimization systems for ultra-low-latency, high-throughput environments.
As a Cloud Network Engineer, you will contribute to the development and operation of advanced networking systems that support AI model training and deployment in the cloud. You’ll work with technologies such as Ethernet, InfiniBand, and accelerated compute platforms (e.g., NVIDIA and AMD GPUs), helping ensure the reliability and performance of distributed clusters.
This opportunity involves configuring and managing network systems that prioritize speed, reliability, and availability at scale. You’ll collaborate with hardware, infrastructure, and platform teams to deliver solutions that support AI training and inference. If you have experience with high-speed networking, distributed systems, performance engineering, or network architecture, we welcome your application.
Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

