About the role
The Substrate Transport team builds and operates the global messaging platform that underpins collaboration across Microsoft 365. We own the Exchange Online transport pipeline, which processes billions of messages daily, spanning traditional email, service-to-service communications, and emerging M365 experiences in Outlook, SharePoint, Teams, Copilot, and partner services.
We are looking for a Principal Software Engineer to provide technical leadership for the modernization of Substrate Transport. This role will help re-architect a legacy, business-critical messaging platform into a cloud-native, scalable, secure, and intelligent transport system, including AI integration across the software development lifecycle.
In this role, you will lead complex architecture and implementation across teams, make high-judgment technical decisions that affect service reliability and customer trust, mentor engineers, and drive measurable engineering excellence. You will operate at Principal scope by brokering hard cross-team problems, creating clarity for multi-year technical outcomes, changing the momentum of current plans toward higher customer and business impact, and creating the conditions for the team and partner teams to deliver safely at global scale.
What you will do
- Lead architecture, design, and implementation for complex Substrate Transport modernization work, including cloud-native platform capabilities and integration with Exchange Online and broader M365 service dependencies.
- Partner with product managers, technical program managers, security/privacy experts, customer escalation teams, and partner engineering teams to determine requirements, validate feasibility, and translate ambiguous customer and service needs into clear technical direction and executable milestones.
- Own and broker cross-team architecture decisions across upstream and downstream dependencies, ensuring designs meet performance, scalability, resiliency, disaster-recovery, cost, security, privacy, compliance, and accessibility expectations.
- Lead by example in producing extensible, maintainable, well-tested, secure, performant code and reviewing code and test code for diagnosability, reliability, maintainability, security risks, compliance issues, and appropriate test coverage.
- Define and use the right optics, quality metrics, telemetry, dashboards, and feedback loops to guide technical decisions, measure customer value, identify risks early, and improve service health and engineering outcomes.
- Drive safe-change practices including feature flags, flighting, experimentation, deployment automation, rollback strategies, production-like validation, and secure dependency management to minimize customer impact and accelerate recovery.
- Lead live-site engineering for a globally distributed service: act as a designated responsible individual when needed, improve troubleshooting guides, reduce recurring incidents, drive retrospectives and repair items, and strengthen monitoring and operational readiness.
- Apply AI-native development practices responsibly, including appropriate controls over AI-generated requirements, designs, code, tests, and operational assets; evaluate AI tools and practices that improve engineering productivity and quality.
- Mentor and coach engineers across the team, build shared technical judgment, create clarity and energy, model Microsoft values and One Microsoft behaviors, and foster an inclusive environment where diverse perspectives improve product outcomes.
- Stay current with industry and Microsoft engineering practices, evaluate new technologies and patterns, and introduce approaches that improve quality, engineering productivity, reliability, security, and responsiveness to changing priorities.
Who we are looking for
- Bachelor's Degree in Computer Science or related technical field AND technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR equivalent experience.
- Experience leading architecture and delivery for large-scale distributed cloud services, high-privilege operations, privacy-aware telemetry, auditability, secure coding practices, and integration with security monitoring and response processes.
- Experience applying data-driven decision making to product and platform engineering, including defining metrics, interpreting production signals, and using customer and service data to guide design and prioritization.
- Demonstrated ability to mentor senior engineers, influence technical direction without direct authority, communicate clearly with executives and partner teams, and create an inclusive culture that enables others to do their best work.