Senior Site Reliability Engineer

Groupon

(Palo Alto, California)
Full Time
Job Posting Details
About Groupon
Groupon provides a global marketplace where people can buy just about anything, anywhere, anytime. We’re enabling real-time commerce across an expanding range of categories including local businesses, travel destinations, consumer products, and live or lively events. At the same time, we are providing advertising options and tools that merchants can use to grow and manage their businesses.
Summary
The Senior Site Reliability Engineer is a master of the Groupon web site. A strong candidate should have expert knowledge of the technologies and best practices used in a high-traffic, customer-facing website. We need people who enjoy solving complex problems, who can act independently, and who can stay calm in high-stress situations while driving a path to restoration. Groupon is growing rapidly, and we need leaders who can help beat a path forward.
Responsibilities
* Part of a Global TDO team, working US business hours five days a week. No on-call!
* Prioritize the focus of the Global SOC during normal times and during crises
* During a crisis, lead the effort to triage and mitigate
* Manage real-time communications during outages with both technical and non-technical audiences
* Evangelize Best Practices to the rest of the company
* Develop policies and procedures that improve overall product stability
* Design and create tools to manage the site
* Participate in reviews of outages in order to improve overall product stability
* Build relationships with development teams and technology leaders across the company
Ideal Candidate
* 10+ years of experience in large-scale environments
* Appropriate technical background (Bachelor's degree in computer science or equivalent)
* Strong knowledge of Linux operating systems and environment
* Strong knowledge of networking, load balancers, DNS, and TCP/IP
* Experience with databases (MySQL, Postgres)
* Ultimate self-starter
* Experience in handling production outages and root cause analysis
* Strong crisis management leadership ability
* Effective communication skills, whether talking to individual contributors or to executive management
* Experience with virtualization a plus
* Skill with Ruby, Python, or Java a plus
* Experience creating tools for infrastructure (IaaS and PaaS) management and automation a plus
Compensation and Working Conditions
Benefits: Included
