Software Engineer II

Microsoft • Remote • Multiple Locations • Full-time

The Azure Compute team builds a fault-tolerant, distributed system on top of commodity datacenter hardware to deliver infrastructure for hosting cloud applications in virtual machines (VMs). The team creates the experience that resources are limitless, elastic, and always available.


This role is part of the Availability Platform team within Azure Compute, which focuses on ensuring every Azure virtual machine meets a service-level agreement (SLA) of 99.99 percent or higher. Meeting this target requires innovative thinking, data-driven decisions, and intelligent automation. The team owns the services that monitor the health of millions of Azure machines, as well as the control plane services that make repair decisions. We use artificial intelligence (AI) and machine learning to build predictive failure models that proactively migrate virtual machines before failures occur, reducing customer impact and improving platform resilience.


We are also exploring generative AI to enhance diagnostics, automate root cause analysis, and accelerate incident resolution. Collaboration with data scientists and AI researchers enables us to continuously evolve the platform with smarter, self-healing capabilities. As a Software Engineer II, you will design and deliver service architecture at hyperscale, develop incrementally with high quality, and adapt quickly to customer feedback while integrating advanced AI technologies.


Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.