Principal Software Engineering Manager - Azure Storage
Microsoft
Multiple Locations, United States
Overview
As a Principal Software Engineering Manager - Azure Storage, you will lead strategic initiatives to optimize fleet health and reduce offline capacity across hyperscale environments. This role is pivotal in driving intelligent solutions that improve reliability, minimize operational overhead, and enable scalable Artificial Intelligence (AI) and Machine Learning (ML) workloads for customers like OpenAI, Temu, and others. You’ll partner across engineering, product, and industry teams to modernize data infrastructure, accelerate digital transformation, and deliver measurable impact in sustainability and efficiency.
The Principal Software Engineering Manager - Azure Storage will lead a team of developers focused on scaling and optimizing one of the world’s largest storage server fleets. Your mission is to reduce offline capacity and manual operational burden through intelligent automation and AI-driven solutions. This high-impact role offers visibility at the VP level, opportunities to shape fleet strategy, and the flexibility of working across time zones in a collaborative, remote-friendly environment.
Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
Qualifications
Required Qualifications:
- Bachelor's Degree in Computer Science or related technical discipline AND 6+ years of technical engineering experience with coding in languages including, but not limited to, C, C++, C#, or Python, OR equivalent experience.
- 2+ years of experience demonstrating a deep understanding of server hardware architecture and fleet-level hardware lifecycle management, including diagnostics, telemetry, and failure mitigation.
- 2+ years of people management experience.
- 3+ years of technical background in cloud infrastructure and storage systems, preferably within hyperscale environments.
- 3+ years of demonstrated ability to plan and execute complex projects, including setting priorities, managing timelines, and delivering results across cross-functional teams.
Other Requirements:
- Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:
- Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.
Preferred Qualifications:
- Experience with AI/ML-driven automation for anomaly detection, predictive maintenance, or system optimization.
- Strong communication and stakeholder management skills, with a proven ability to engage Vice President (VP)-level leadership and influence technical roadmaps.
- Experience driving engineering excellence, including service quality, reliability, and operational readiness.
- Demonstrated comfort working across time zones in a remote-friendly, globally distributed team environment.
Responsibilities
- Lead and manage a high-performing engineering team focused on scaling and optimizing Azure Storage’s global fleet infrastructure.
- Drive planning and execution of team deliverables, ensuring alignment with partner teams, business goals, technical strategy, and service-level objectives.
- Develop and deliver scalable features that reduce offline capacity, improve fleet reliability, and minimize manual operational overhead and risk.
- Leverage AI/ML to build intelligent automation for anomaly detection, predictive maintenance, and fleet health optimization.
- Engage with senior leadership, including VP-level stakeholders, to influence roadmap priorities and communicate impact.
- Guide the team in driving multiple groups' project plans, release plans, and work items in coordination with appropriate stakeholders (e.g., project managers).
- Guide the team and act as an expert for Designated Responsible Individual (DRI) work, overseeing other engineers across product lines and working on call to monitor systems, products, and services for degradation, downtime, or interruptions.