Lead Site Reliability Engineer

ezCater

(Boston, Massachusetts)
Full Time
Job Posting Details
About ezCater
ezCater is the #1 online (and the only nationwide) marketplace for business catering in the United States, a $21 billion market. Our 850K+ on-time ratings and reviews, our 51K+ caterers and restaurants, and our 5-star customer service make it superbly easy for business people to find and order great food for their meetings.
Summary
We’re looking for a top-notch, hands-on SRE to lead our small and talented infrastructure engineering team and help us elevate our game when it comes to designing, building, and operating high-performance, highly available systems. At ezCater every engineer is responsible for the software they build, and SREs play a critical part in providing the tools, practices, and expertise to help them succeed.

Our production systems are hosted in AWS and run a large Ruby on Rails web application and a handful of smaller services in Ruby, Node.js, and Java. We currently deploy 3-5 times a day. Our systems are stable and fire drills are rare.

Technologies we’re currently using include: Amazon Web Services (EC2, ELB, S3, RDS, ElastiCache), Ubuntu Linux, Postgres, Redis, Memcached, ElasticSearch, Chef, ServerSpec, Terraform, NewRelic, DataDog, Sumo Logic, and Test Kitchen.
Responsibilities
* Design, build, and maintain the core infrastructure for ezCater
* Actively manage the backlog for our infrastructure team and work closely with other SREs on the team to provide coaching and mentorship
* Help us increase developer productivity and get to true continuous delivery
* Develop operational and security standards and champion operational excellence and secure coding practices
* Partner closely with engineering teams to educate and consult
* Participate in solution design for new features, products, systems, and tooling
* Debug complex problems across the whole stack
* Continually monitor application/system performance and costs, generate actionable insights, and either implement or advocate for them
* Participate in on-call rotations, along with every member of the engineering team
* Ruthlessly eliminate repetitive manual tasks and recurring errors
* Ensure we are always employing best-of-breed tooling for all our infrastructure and automation needs
* Collaboratively plot a course for the maturation and growth of ezCater’s infrastructure
* Participate (and sometimes run point) in handling production incidents
* Work closely with engineering teams to conduct root cause analysis for production incidents, and evolve infrastructure and tooling
Ideal Candidate
* Thrive in a highly collaborative, no red-tape, rapid-growth environment
* Love building tooling and infrastructure to help developers be more productive
* Love eliminating repetitive manual tasks through automation
* Have a healthy appreciation of what it means to work in production
* Have solid Unix command line and systems chops
* Have experience with substantial, distributed SaaS or eCommerce systems
* Can point to a solid track record of success leading small-to-medium infrastructure teams
* Have vision and well-informed opinions about how to build infrastructure for a high-growth, technology-driven company that’s headed towards the $1B mark
