Cloud Network Engineer
Microsoft
Multiple Locations, United States
Overview
The HPC/AI (High-Performance Computing and Artificial Intelligence) team is on a mission to build the next-generation distributed AI supercomputer, enabling breakthroughs in artificial intelligence by delivering unmatched computational power, scalability, and reliability. We design and develop cutting-edge infrastructure that supports high-performance AI model training at scale, laying the foundation for innovations that redefine what AI can achieve.
We are seeking individuals with experience in network engineering to help design and develop the systems that support large-scale AI and HPC workloads. This role involves working on network infrastructure, automation workflows, observability tools, and performance optimization systems that support ultra-low latency and high-throughput environments.
As a Cloud Network Engineer, you will contribute to the development and operation of advanced networking systems that support AI model training and deployment in the cloud. You’ll work with technologies such as Ethernet, InfiniBand, and accelerated compute platforms (e.g., NVIDIA and AMD GPUs), helping ensure the reliability and performance of distributed clusters.
This opportunity involves configuring and managing network systems that prioritize speed, reliability, and availability at scale. You’ll collaborate with hardware, infrastructure, and platform teams to deliver solutions that support AI training and inference. If you have experience with high-speed networking, distributed systems, performance engineering, or network architecture, we welcome your application.
Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
#EiP
Qualifications
Required/Minimum Qualifications:
- Bachelor's Degree in Electrical Engineering, Optical Engineering, Computer Science, Engineering, Information Technology, or related field OR equivalent experience.
- Experience designing, deploying, and supporting data center and backbone networks for distributed computing platforms.
Other Qualifications:
- Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:
- Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.
Additional or Preferred Qualifications:
- Master's Degree in Electrical Engineering, Optical Engineering, Computer Science, Information Technology, or related field OR Bachelor's Degree in Electrical Engineering, Optical Engineering, Computer Science, Information Technology, or related field AND 2+ years technical experience in network design, development, and automation OR equivalent experience.
- Proficient understanding of routing and forwarding technologies, including BGP and MPLS, and of tunneling techniques, including VXLAN/EVPN
- Experience with telemetry and observability tools for monitoring physical network health, link performance, and congestion at scale
- Background in building scalable, fault-tolerant physical networks for distributed computing environments (e.g., AI/ML clusters, HPC systems)
- Proficiency in Linux-based systems, including kernel-level networking, interface tuning, and low-level debugging of physical network issues
Cloud Network Engineering IC2 - The typical base pay range for this role across the U.S. is USD $84,200 - $165,200 per year. A different range applies to specific work locations within the San Francisco Bay Area and the New York City metropolitan area; the base pay range for this role in those locations is USD $109,000 - $180,400 per year.
Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay
Microsoft will accept applications for the role until November 5th, 2025.
#azurecorejobs
Responsibilities
- As a Cloud Network Engineer, you will own the deployment and operational excellence of high-performance physical networking systems that underpin large-scale AI and high-performance computing (HPC) environments.
- Network Deployment: Support the deployment of high-throughput, low-latency network topologies (e.g., Clos, fat-tree) using technologies such as InfiniBand and Ethernet.
- Operational Support: Monitor network health, respond to incidents, perform root-cause analysis, and contribute to improvements in availability and observability.
- Cross-Functional Collaboration: Work with hardware engineering, data center operations, and software-defined networking teams to integrate physical and logical network layers.
- Documentation & Standards: Maintain documentation for network designs, cabling standards, and deployment procedures. Participate in design reviews and ensure alignment with safety and compliance standards.
- Innovation & Research: Stay informed about advancements in optical networking, high-speed interconnects, and AI/HPC fabric technologies. Evaluate emerging solutions to improve scalability and performance.