DevOps Senior Manager

PubNub

(San Francisco, California)
Full Time
Job Posting Details
About PubNub

PubNub runs a globally distributed “Data Stream Network”, a cloud service that developers use to build and scale large real-time apps. We have thousands of customers and process billions of real-time messages each month. We power a variety of real-time applications, including multiplayer games, ride dispatch services, social events, online auctions, education apps, telecom infrastructure and more. We develop software close to the bare iron and optimize performance in microseconds.

Summary

The Engineering/DevOps team is responsible for designing, developing, operationalizing, sustaining and scaling PubNub’s Data Stream Network. This includes our secure, distributed messaging bus as well as all add-on services and data pipelines, including Storage/Playback, Presence, Access Management, Push Gateways and more. We are a strong team of Engineers and Architects who are low on drama and high on results. Our mission is to use uptime, performance and scale as tools to extend the fabric of real-time possibilities, and to stay true to the trust and confidence our customers place in us to deliver delightful user experiences. We believe that teams that balance linear progression and non-linear innovation achieve the best results. Consequently, we place the team ahead of the individual when solving problems and celebrating achievements. If you are on a journey to seek a team whose norm is to swarm and perform, we are your destination! As a DevOps Architect/Leader, you would be directly responsible for championing tools and technologies, and for adopting and adapting the frameworks and services used or needed by current and forward-looking features. In net terms, you’ll serve as a productivity multiplier for the other DevOps engineers and developers on the team.

Responsibilities
  • Reporting to the VP Engineering/Ops, you have responsibility for our DevOps/operations strategy.
  • Collaborate with engineering teams, product owners and other stakeholders to understand tooling needs for Agile development and Continuous Integration/Delivery/Deployment (CI/CD) practices.
  • Devise automation strategies for upcoming releases while continuously modernizing existing systems.
  • Create and manage local development environments for complex software systems.
  • Manage and own pre-production (testing/staging) infrastructure and cloud resources (DNS, firewalls, proxies, load balancers, databases, monitoring systems etc.).
  • Create and manage production environments for complex software systems.
  • Work closely with Architects and Engineers to define build pipelines, administer artifact repositories and automate test tooling.
  • Champion best practice methodologies for packaging and distributing web-scale applications.
  • Assist the execution of performance, stress and security testing.
  • Assess release readiness of features from an operational perspective.
  • Lead Site Reliability to ensure smooth and optimal production rollouts of provisioning, deployment, configuration, monitoring and other day-to-day operational activities.
  • Manage reporting, monitoring and alerting metrics.
  • Lead change management and incident management processes.
  • Lead tool selection for production, automation and process.
Ideal Candidate

Experience & Skills Required:

  • Minimum criteria
    • Proven skills and background in running a highly scaled real-time system handling billions of transactions a month.
    • Outstanding understanding of cloud infrastructure providers.
    • Experience with large Cassandra deployments on the order of 72 nodes or more.
    • Strong understanding of networking concepts, protocols, and security (TCP/IP, UDP, HTTP, NTP, DNS, TLS, etc.).
    • Hands-on experience with vendor-agnostic infrastructure automation and configuration management technologies such as Terraform and Ansible.
    • Experience with container technology such as Docker, Mesos, etc.
    • Past experience with web-scale operation of virtualized Linux farms and infrastructure components such as proxies, reverse proxies, in-memory caches, distributed filesystems, etc.
    • Experience implementing change management and incident management tools and processes.
    • 10+ years of experience in cloud-grade SaaS software.
  • Advantageous
    • Strong automation design skills
    • Experience with implementing and using Continuous Integration and Delivery (CI/CD) design patterns and tool chains
    • Expert level knowledge of shell scripting
    • Working knowledge of Python
    • Experience with administering and using code review tools such as Crucible/Gerrit and artifact repositories such as Artifactory
    • Attention to detail and ability to work independently on complex problems
    • BS or MS in Computer Science or related technical field.
Compensation and Working Conditions
Benefits: Benefits included
Reports to: VP Engineering/Ops

Additional Notes on Compensation

Competitive salary and pre-IPO stock options. Generous paid medical/dental/vision coverage, plus medical and dependent FSA. Catered lunch 3x a week. Fully stocked break room with unlimited drinks and snacks. Team outings and holiday party.

Skills Desired
  • Change Management
  • Cloud
  • Linux
  • Networking
  • Patterns
  • Python
  • Scaling
  • Continuous Integration
  • DevOps
  • DNS
  • SaaS
  • TCP/IP
  • Computer Science
  • HTTP
  • Docker
  • Ansible
  • Crucible
  • Terraform
  • Security Testing
  • Access Management
  • Code Review
  • In-Memory
  • Management Tools
  • Infrastructure Automation
  • Scale Applications
  • Artifactory
  • Data Stream
  • Shell
  • CI/CD