Principal Software Engineer
Microsoft
Want to impact the foundation for future AI storage development in Azure, the world's computer? The Azure Managed Lustre File System (AMLFS) team leads development, deployment, and monitoring of the most popular High-Performance Computing (HPC) parallel file system in the world: Lustre, the Azure storage solution of choice for AI training and fine-tuning. The AMLFS Platform Team is responsible for end-to-end delivery of AMLFS images, cluster deployment, logs and metrics, and configuration compliance. An ideal candidate will also have opportunities to impact cluster architecture and design of Lustre in the Azure ecosystem, performance analysis and optimization of AMLFS, and customer support for the most challenging parallel filesystem bugs or performance anomalies that arise within our product.
As a Principal Software Engineer in the AMLFS Platform Team, you will lead design and development of key features, primarily working on reliable deployment of AMLFS in Azure, assessing and mitigating security risks, developing comprehensive unit and system-level tests, and diagnosing, mitigating, and fixing the most challenging deployment and upgrade customer issues. You will lead the design and development of logging, monitoring, and reporting capabilities for AMLFS and help define and measure key Service Level Indicators designed to make our product increasingly robust. This opportunity will allow you to develop expertise in distributed system and HPC/AI filesystem design, implementation, and debugging, grow proficient in navigating and managing Linux operating systems, and hone leadership qualities as you develop strong collaborative working relationships with the core storage, compute, and networking teams that form the foundation of Azure.
Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
Responsibilities
- Partners with appropriate stakeholders to determine user requirements for a set of scenarios.
- Leads identification of dependencies and the development of design documents for a product, application, service, or platform.
- Leads by example and mentors others to produce extensible and maintainable code used across products.
- Leverages subject-matter expertise of cross-product features with appropriate stakeholders (e.g., project managers) to drive multiple groups' project plans, release plans, and work items.
- Holds accountability as a Designated Responsible Individual (DRI), mentoring engineers across products/solutions, working on-call to monitor system/product/service for degradation, downtime, or interruptions.
- Proactively seeks new knowledge and adapts to new trends, technical solutions, and patterns that will improve the availability, reliability, efficiency, observability, and performance of products, while also driving consistency in monitoring and operations at scale, and shares knowledge with other engineers.
Qualifications
Required Qualifications:
- Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, or Python
- OR equivalent experience.
Other Requirements:
- Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:
- Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.
Preferred Qualifications:
- Master's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, or Python
- OR Bachelor's Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, or Python
- OR equivalent experience.
#azurecorejobs
Software Engineering IC5 - The typical base pay range for this role across the U.S. is USD $139,900 - $274,800 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $188,000 - $304,200 per year.
Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:
https://careers.microsoft.com/us/en/us-corporate-pay
This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.
Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.