Director, Global Forensics Engineering
Microsoft
Microsoft Cloud Infrastructure and Operations (CO+I) is the engine that powers Microsoft's cloud services. The group is responsible for designing, building, and operating Microsoft’s global datacenters; managing the programmatic delivery of our critical infrastructure design, equipment procurement, construction delivery, infrastructure innovation, demand planning, and capacity utilization of our unified infrastructure; and carrying out all operations needed to run the physical infrastructure. We focus on smart growth with an emphasis on automation, data-driven engineering, cost-effectiveness, and environmental sustainability. We deliver the core infrastructure and foundational technologies for Microsoft's 200+ online businesses including Azure, Office 365, Bing, Xbox Live, Skype, and OneDrive. Our portfolio is built and managed by a team of subject matter experts working 24x7x365 to support services for more than 1 billion customers and 20 million businesses in over 90 countries worldwide.
CO+I’s Forensic Engineering team drives systemic risk reduction by embedding root cause analysis (RCA) lessons into operational standards across Microsoft’s global datacenter fleet. We lead investigations into high-severity incidents, establish forensic best practices, and deliver insights that safeguard reliability, availability, and resilience at scale. We are seeking a motivated and experienced Director, Global Forensics Engineering to champion these efforts and shape the future of datacenter reliability worldwide.
In alignment with our Microsoft values, we are committed to cultivating an inclusive work environment for all employees to positively impact our culture every day.
Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
Responsibilities
Post-Incident Governance: Oversee RCA reviews, validate corrective actions, and ensure compliance with global standards.
Risk Assessment & Containment: Implement temporary countermeasures and develop long-term solutions to prevent recurrence.
Define Standards: Establish forensic frameworks, evidence handling protocols, and reliability metrics.
Innovation: Advance forensic techniques and integrate telemetry and predictive analytics for proactive detection and prevention.
Leadership & Collaboration: Lead cross-functional teams, engage senior stakeholders, and manage global programs for systemic improvements.
Continuous Improvement: Drive operational excellence through data-driven insights and best practices.
Qualifications
Required Qualifications:
Doctorate Degree in Mechanical Engineering, Materials Engineering, Reliability Engineering, Electrical Engineering, or related field AND 5+ years technical engineering experience OR Master's Degree in Mechanical Engineering, Materials Engineering, Reliability Engineering, Electrical Engineering, or related field AND 7+ years technical engineering experience OR Bachelor's Degree in Mechanical Engineering, Materials Engineering, Reliability Engineering, Electrical Engineering, or related field AND 8+ years technical engineering experience.
Other Requirements:
Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:
Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
Preferred Qualifications:
Doctorate Degree in Mechanical Engineering, Materials Engineering, Reliability Engineering, Electrical Engineering, or related field AND 7+ years technical engineering experience OR Master's Degree in Mechanical Engineering, Materials Engineering, Reliability Engineering, Electrical Engineering, or related field AND 10+ years technical engineering experience OR Bachelor's Degree in Mechanical Engineering, Materials Engineering, Reliability Engineering, Electrical Engineering, or related field AND 12+ years technical engineering experience
4+ years people management experience
Experience integrating telemetry, predictive analytics, and AI-driven tools into forensic processes
Prior experience in multinational environments
Professional certifications such as Certified Reliability Engineer (CRE) or equivalent
Proven track record in Root Cause Analysis (RCA) and post-incident governance
Knowledge of risk assessment methodologies and containment strategies.
Familiarity with forensic frameworks, evidence handling, and reliability metrics.
Experience leading cross-functional teams and managing global programs
Advanced capability in failure analysis, data interpretation, and corrective action planning.
Deep understanding of industry standards for forensic engineering (e.g., ISO, IEC)
#COICareers
Reliability Engineering M5 - The typical base pay range for this role across the U.S. is USD $139,900 - $274,800 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $188,000 - $304,200 per year.
Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:
https://careers.microsoft.com/us/en/us-corporate-pay
This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.
Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.