Software Engineer: Reliability
Button, INC.
(New York, New York)
At Button we’re proud to be judged by the company we keep. As entrepreneurs, we’ve collectively started over a dozen companies and have experience at top tech companies like Google, Venmo, and Rocket Internet.
- Steer and mature a modern API platform.
- Work closely with other platform engineers to design a scalable, maintainable, service-oriented production architecture.
- Take control of and automate all aspects of deployment, using tools like Ansible.
- Design load balancing, failover, and alerting systems for best-in-class 24x7 high availability across all services.
- Champion best practices surrounding logging, monitoring, build and test systems.
- Shape our engineering culture by coming up with ideas, tools, and infrastructure wherever you see a problem to be solved.
- Measure and monitor systems to build company-wide awareness of production; build dashboards for important metrics.
How do we work together?
- Our services are written in Node.js, Go, and Python.
- We believe in using the right tool for the job, and we love learning.
- We use AWS, Graphite, Loggly, and a host of other SaaS tools to run our stack.
- Everything is code reviewed.
- You’ll play a key role in keeping our systems up and shaping a “reliability first” culture where all engineers participate in the rotation.
- So many war stories on building and maintaining a highly-scalable production system.
- A love of hands-on coding and expert proficiency in at least one language like Python, Ruby, or Go.
- Mastery of AWS and EC2.
- Experience designing scalable “mission critical” systems.
- You know what it takes to keep the pager quiet.
- An ability to move fast, make decisions, and take a pragmatic approach to any problem.