Senior Systems Engineer

Sauce Labs

(San Francisco, California)
Full Time
Job Posting Details
About Sauce Labs

Sauce Labs provides the world’s largest cloud-based platform for the automated testing of web and mobile applications. Its award-winning service eliminates the time and expense of maintaining an in-house testing infrastructure, freeing development teams of any size to innovate and release better software, faster.

Responsibilities
  • Write tools and scripts to provide automation and self-service solutions for ourselves and other teams
  • Design new systems to support production services
  • Install, configure and debug hardware and systems in our data center
  • Creatively solve scale challenges regarding a rapidly expanding cloud environment
  • Work with real hardware: high-density Cisco UCS B-series blades and C-series rack-mount servers, Nexus networking (10Gb+ core network), storage (NAS and SAN), Mac-in-a-datacenter, custom appliances for mobile devices, load balancers, and beyond
  • Help improve monitoring and identify key performance metrics
  • Pursue proactive R&D: discover and implement new tools, emerging technology, etc.
  • Design, implement, and maintain disaster recovery
  • Create NOC runbooks, procedures, documentation, and diagrams of the environments you manage
  • Troubleshoot and resolve server/network issues
  • Help maintain hardware in Sauce’s colocation facilities
  • Help build out new data centers around the globe
  • Participate in the 24x7 on-call rotation
Ideal Candidate
  • 3+ years of recent experience working as a Linux administrator/engineer at scale (hundreds of systems) and designing/deploying ‘highly available’ solutions
  • 2+ years of recent professional experience designing, developing, and operating Configuration Management solutions such as Chef, Puppet, Salt (preferred), or Ansible (preferred) at scale
  • Solid experience in Linux tuning, profiling, and monitoring
  • Strong skills in at least one language: Python (preferred), Ruby, Bash

Bonus points for:

  • Experience deploying/managing KVM/QEMU and LXC
  • Solid understanding of cloud, networking, and distributed computing concepts, including TCP/IP connections, firewalls, VLANs, etc.
  • Experience with and understanding of contemporary metrics, monitoring, and logging solutions, especially StatsD, Graphite, ELK, Splunk, Nagios, etc.
  • Provisioning and automation with Cisco Unified Computing System Manager
  • Highly organized, able to multi-task, and able to work individually, within a team, and across teams
  • Excellent communication skills, both verbal and written, across all user levels
  • Deployment automation in physical and virtual environments (PXE, Cobbler, MAAS (preferred))
  • Working knowledge of load balancing technologies (hard/soft)
  • Proven experience collaborating in a cross-functional team environment
  • Familiarity with software engineering practices, including n-tier architecture, configuration management, development methodologies (e.g. agile, waterfall, spiral, prototyping), etc.

Skills Desired
  • Cloud
  • Debugging
  • Design
  • Designing Systems
  • Network Operations
  • Python
  • Ruby
  • Strong Oral and Written Communication
  • Writing Shell Scripts
  • Bash
  • Disaster Recovery
  • Distributed Computing
  • Firewall
  • KVM
  • Linux System Administration
  • Software Configuration Management
  • TCP/IP
  • Cisco
  • Software Engineering
  • Chef Software
  • SaltStack
