Senior Site Reliability Engineer

Microsoft

Software Engineering
Posted on Jul 1, 2025

Hyderabad, Telangana, India

Date posted: Jul 01, 2025
Job number: 1830477
Work site: Up to 50% work from home
Travel: 0-25%
Role type: Individual Contributor
Profession: Software Engineering
Discipline: Site Reliability Engineering
Employment type: Full-Time

Overview

Are you passionate about working on cutting-edge devices? The Surface team is dedicated to building powerful devices that empower individuals and organizations. We’re working on the next generation of Surface products, and we need talented individuals like you!

We’re seeking a skilled engineer to improve how enterprise customers manage a fleet of Surface devices. Our team is responsible for creating and maintaining online portals, backend APIs, microservices, Function Apps, Web Jobs, and integrations with supply chain systems. Our solutions leverage AI and Copilots to enhance productivity, provide the best experience, and streamline operations for enterprise customers.


As a key member of the team, you will be responsible for designing and deploying reliable distributed platforms, empowering commercial customers to self-serve, manage and monitor Surface devices at scale. This is an exciting opportunity to demonstrate broad leadership and impact across Devices.


Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Qualifications

Required Qualifications:

  • Bachelor's or Master's degree in Computer Science or another engineering field
  • 8+ years of technical experience in software engineering and DevOps, including developing build and deployment pipelines, building infrastructure, and running cloud services at large scale
  • 4+ years of software development experience with C#, Web APIs, Cosmos, SQL Azure, and Microsoft Fabric
  • Excellent technical design, problem solving and debugging skills
  • Excellent leadership, communication, teamwork and collaboration skills across organizations
  • Passionate, motivated, self-driven and quick learner
  • Ability to deal with the ambiguity associated with working in a fast-paced environment
  • Systematic problem-solving approach, coupled with effective communication skills and a sense of curiosity
  • Expertise in analyzing, troubleshooting, and automating root cause analysis and mitigation of incidents impacting large-scale distributed systems

Other Requirements:


Candidates must be able to meet the Microsoft, customer, and/or government security screening requirements for this role. These requirements include, but are not limited to, the following specialized security screenings:

  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

Preferred Qualifications:

  • Excellent written and verbal communication skills.
  • Experience developing monitoring and telemetry tools, containers (Azure Kubernetes Service), and CI/CD pipelines.
  • Experience with building dashboards, code analysis, and secure practices.

Responsibilities

  • Champion and implement DevOps and Site Reliability Engineering best practices to ensure system reliability, observability, and operational excellence.
  • Own the uptime and performance of applications built on Azure Containers, APIs, and modern UI frameworks, ensuring they meet stringent SLAs and customer expectations.
  • Drive incident response, root cause analysis, and postmortem processes to continuously improve system resilience.
  • Develop and maintain automation for deployment, monitoring, alerting, and self-healing systems to reduce manual toil and improve efficiency.
  • Partner closely with software engineering and product owners to design scalable and fault-tolerant systems.
  • Monitor system performance and plan for future growth, ensuring infrastructure is right-sized and cost-effective.
  • Ensure systems are secure, compliant, and aligned with Microsoft’s security standards and policies.
  • Guide and mentor junior engineers, fostering a culture of learning, ownership, and continuous improvement.

Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.
  • Industry-leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Opportunities to network and connect

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.