The Production Engineering and Artificial Intelligence (AI) Group, part of the Linux Systems Group within Microsoft, plays a critical role in powering Azure Cloud. This team ensures that Azure operates with the latest version of Linux software at the highest levels of quality and performance, serving as the gatekeeper for production software. The team achieves this at Azure scale through efficient automation and by leveraging artificial intelligence to reduce the human effort required for these responsibilities. This is an excellent opportunity to join the Production Engineering and AI Group and contribute to the growth of Microsoft’s Azure Cloud infrastructure.
As a Site Reliability Engineer II, you will be responsible for ensuring that software deployments follow safe rollout processes while driving operational excellence. You will leverage technical expertise, telemetry analysis, and advanced artificial intelligence to maintain reliability and performance across large-scale systems.
Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

