Cloud Services Operations Engineer

Wikimedia Foundation

(San Francisco, California)
Full Time, Fully Remote
Job Posting Details
About Wikimedia Foundation
The Wikimedia Foundation, Inc. is a nonprofit charitable organization dedicated to encouraging the growth, development and distribution of free, multilingual, educational content, and to providing the full content of these wiki-based projects to the public free of charge. The Wikimedia Foundation operates some of the largest collaboratively edited reference projects in the world, including Wikipedia, a top-ten internet property.
Summary
Come work within the Technology department at the Wikimedia Foundation! We administer a public OpenStack cloud (Infrastructure as a Service) with a modern Platform as a Service (Kubernetes) running on it. We are dedicated to supporting developers within and outside of the Wikimedia Foundation. Candidates need to be comfortable sharing ideas, providing guidance, following instructions, mentoring volunteers, and communicating in public and asynchronous ways (mailing lists, forums, IRC). Our team works remotely, and so can you!
Responsibilities
* Perform day-to-day operational tasks on Wikimedia’s Cloud Services infrastructure (deployment, maintenance, configuration, troubleshooting)
* Support volunteer and staff developers using Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) products
* Implement and utilize configuration management and deployment tools (Puppet, Kubernetes)
* Assist in the architectural design of new services and in making them operate at scale
* Assist in or lead incident response, diagnosis and follow-up on system outages or alerts across our stack
Ideal Candidate
**Requirements:**
* Bachelor's degree and 5+ years related work experience; or equivalent work experience; or Master’s degree and 3+ years related work experience
* 5+ years of professional experience with infrastructure support and Linux
* Solid development history with interpreted languages and web stack technologies
* Experience managing modern distributed container cluster management systems (primarily Kubernetes but also Docker Swarm, Mesos, …)
* Minimum of 3 years of experience with Open Source configuration management and orchestration tools (Puppet, Ansible, Chef, SaltStack, …)
* Experience managing an elastic computing environment (OpenStack, CloudStack, …)
* On-call support and off-hours coverage in a 24x7 environment
* Solid understanding of networking and TCP/IP fundamentals
* Ability and ambition to support staff and volunteer developers inside and outside of the Wikimedia Foundation
* Strong verbal and written proficiency with the English language

**Pluses:**
* Experience with Golang
* Experience with advanced distributed storage and database systems (Swift, Ceph, Cassandra...)
* Low-level systems troubleshooting and debugging skills (CPU/memory profiling, C/C++ experience, in-depth Linux knowledge)
* Experience with the use, maintenance and configuration of monitoring, metrics and logging infrastructure (Icinga/Nagios, Prometheus, Grafana, Graphite, Logstash/Kibana, etc.)
Compensation and Working Conditions
Benefits included

Additional Notes on Compensation

Fully paid medical, dental and vision coverage for employees and their eligible families (yes, fully paid premiums!). The 401(k) retirement plan offers matched contributions at 4% of annual salary.
