Site Reliability Engineer
Microsoft Posted on 2022-11-18
Salary | Not mentioned |
---|---|
Location | Redmond, WA 98052 (Overlake area) |
Type | Full-time |
Job Description
At Microsoft, our mission is to empower every person and every organization on the planet to achieve more. As a member of our engineering team, you will play an integral part in making that happen, navigating us into the future and impacting the lives of people all around the world.
To make this work for our customers, we need continual effort to make that delivery reliable. To drive reliability, we need you – someone who already is, or is interested in becoming, a Site Reliability Engineer (also known as SRE).
SREs are people who take engineering-based approaches to solve operations problems: we like infrastructure, we like seeing how big complicated things work, and most importantly, we gain great satisfaction from making it better. We have backgrounds in lots of things – of course, Computer Science, System Administration, Networking, Mathematics, and Engineering generally, but you can also find folks who’ve worked in Physics, Chemistry, Biology, Statistics, and even English.
SREs build, monitor, and maintain the systems and infrastructure that ensure our customers can quickly access their data and run workloads whenever and wherever they need to. We identify service problems and areas for improvement, and we follow up by fixing those problems. Our work is key to the success of many of the Microsoft services you’ll have heard of, and a number you haven’t. There are very few bits of Microsoft which aren’t touched by SREs in some way or other.
SREs come in two kinds: SRE-SWE (people with a software engineering background), and SRE-SE (people with a systems engineering background).
SREs with a software engineering background can come from industry and academia. Their distinguishing feature is their inclination to, and demonstrated competence in, working with software. An SRE-SWE could be successful in perhaps any software team, but typically has an interest in infrastructure, scale, performance, or the behavior of distributed systems generally.
Responsibilities
The scale of our operations is enormous. Microsoft’s products and services are overwhelmingly consumed online, and billions of people use them every day. We need people who enjoy analyzing complicated problems, coming up with creative solutions, and working in focused teams to build things no one has thought of before, all in the service of production reliability.
If you are excited by this type of challenge, and you love to work in groups of people who are similarly excited, come join us. We value the input of people who aren’t afraid to be learning all the time, who celebrate mistakes because they show the way forward, and who are happy to continuously improve. We strongly believe that diverse experiences and backgrounds, and an environment where everyone can feel safe to contribute their own insights in a data-driven, objective, but supportive way, are the key to making the best workplace possible, and the best workplace makes the best products and services. Not only is it the smart thing, but it’s also the right thing.
You may be a good fit if you:
- Are interested in distributed systems and working with high-scale services.
- Like to work in a fast-moving environment and aren’t afraid to change things to make them better.
- Enjoy new technological challenges and solving hard problems.
- Believe that a team working well together is truly smarter than the single smartest person on that team.
- Aspire to grow as a person, as a teammate, and as an engineer.
Qualifications
We try not to have too many formal qualifications, since mindset and demonstrated ability are more important, but previous successful candidates have often had some or all of the following:
- 1+ years of experience working with Distributed Systems/Distributed Cloud Services
- 1+ years of software development; automation-related experience is valued. Scripting languages such as Bash, Python, and PowerShell, or compiled languages such as C, C#, and Go are most relevant, but others are acceptable.
- 1+ years of experience working in an Agile (or similar) environment
Preferred Qualifications:
- Awareness of, and ability to reason about, modern software & systems architectures, including load-balancing, queueing, caching, distributed systems failure modes generally, microservices, and so on.
- Associated troubleshooting skills, including the ability to follow RPC call-chains across arbitrary network steps. Consequent understanding of monitoring in distributed systems.
- Experience with working in a team, including coordinating large projects, communicating well, and exercising initiative when presented with problems.
- Generally speaking, practical experience running large-scale online systems is always an advantage.
Company description:
Every company has a mission. What’s ours? To empower every person and every organization to achieve more. We believe technology can and should be a force for good and that meaningful innovation contributes to a brighter world in the future and today. Our culture doesn’t just encourage curiosity; it embraces it. Each day we make progress together by showing up as our authentic selves. We show up with a learn-it-all mentality. We show up cheering on others, knowing their success doesn’t diminish our own. We show up every day open to learning our own biases, changing our behavior, and inviting in differences. When we show up, we achieve more together. Microsoft operates in 190 countries and is made up of more than 220,000 passionate employees worldwide.