Site Reliability Engineer, Data Center

New Relic

(Portland, Oregon)

Full Time

Job Posting Details

About New Relic

New Relic is a Software Analytics company that makes sense of billions of metrics across millions of apps. We help the people who build modern software understand the stories their data is trying to tell them.

Responsibilities

* Lead the development of a set of micro REST APIs to help create agile and robust infrastructure management and reporting workflows.
* Improve and build upon our existing automation tools for systems provisioning and management.
* Independently learn new technologies and master the New Relic infrastructure so that you can provide 'full stack' diagnostics, when necessary, to help determine the root cause of internal problems.
* Communicate effectively with fellow SREs and other engineering teams, and describe problems succinctly and in enough detail that you can hand off an ongoing problem to another team or a peer for completion.
* Strategize with fellow SREs and other engineering teams on complex problems, and make decisions and recommendations about systems improvements after analyzing possible courses of action.
* Perform periodic on-call duty as part of a global team maintaining the availability and performance of the New Relic site and APIs used by third-party services, as well as the various internal services and systems that these core interfaces depend on.
* Physical requirements: Ability to lift 50 lbs repeatedly, with or without accommodation.

Ideal Candidate

* Proficiency in one of the following languages is expected: Perl, Python, Ruby, or Go. Ruby and/or Go experience is strongly preferred.
* Experience with Linux systems administration and tuning.
* Solid understanding of TCP/IP networking and switching.
* Troubleshooting skills that range from diagnosing low-level hardware and software issues to large-scale failures.
* Ability to work independently.
* Experience with monitoring, trending, and logging tools such as Nagios, Kibana, and Cacti.
* Shown success resolving multiple interrupt-driven priorities simultaneously.
* Experience with Incident management.
* Experience with load balancing, storage, and clustering technologies.
