DigitalOcean, the cloud for developers, is a dynamic, high-growth technology company that serves a passionate community of technologists around the world. We want to simplify cloud computing for every developer and are working on some of the most challenging and interesting problems in cloud computing.
The Senior Data Engineer will help expand the Data Science & Engineering side of the Data & Analytics team at a critical phase of growth for DigitalOcean’s product portfolio. Data Engineers play a foundational role in the team’s ability to assess new business opportunities, uncover new insights, and implement strategies that engage our users, optimize our operations, and grow our business.

Data Engineering owns the process of extracting, transforming, and loading large amounts of data from various sources (structured/unstructured/streaming) into a comprehensive, unified data environment. This environment is heavily relied upon by decision-makers across the business for insights and intelligence, and also serves as the backbone for numerous data science products integrated into systems company-wide. It is therefore critical that Data Engineering effectively manages the timeliness and quality of the data being piped into the environment, as well as the reliability of the environment itself. As DigitalOcean significantly expands its product portfolio and customer base, Data Engineers will continually evaluate and construct pipelines for entirely new data sources of varying complexity and scale.

Beyond technical expertise, we are looking for an individual with strong strategic acumen and significant experience working with customer-focused teams geared toward delivering business impact. You should feel at home in a fast-paced startup environment, with the ability and desire to dive independently into incomplete and imperfect datasets. You should thrive in situations where decisions need to be made as quickly and effectively as possible based on the available data, and where your code, insights, and advice are used daily to make decisions that affect over a million users.
You should be ready to create a framework or solution where none exists, and to blaze a path forward through ambiguity.
* Develop and implement scalable ETL processes for new data sources of varying levels of complexity and scale
* Contribute to the ongoing maintenance and scaling of the overarching ETL framework and data environment, including performance tuning
* Focus on the production status and data quality of the data environment and the data products being delivered to the business, and communicate effectively with the internal user base regarding production changes and issues
* Interface closely with the data infrastructure, engineering, and technical operations teams to ensure the reliability and scalability of the ETL framework and data environment
* Work closely with other members of the Data & Analytics team, including Business Intelligence and Data Science, to understand evolving ETL needs as more complex data models are introduced
* Significant experience in custom ETL design, implementation and maintenance for multiple high-growth companies, preferably those with technical products
* Scripting in Python required, experience with R/Scala/Go a plus
* Track record of developing and evolving complex data environments and intelligence platforms for business users
* Demonstrable ability to translate high-level business requirements into technical ETL and data infrastructure needs, including the underlying data models and scripts
* History of proactively identifying forward-looking data engineering strategies, utilizing cutting-edge technologies, and implementing at scale
* Hands-on experience with schema design and dimensional data modeling
* Understanding of statistical modeling, machine learning and data mining concepts
* Demonstrable critical thinking and analytical skills, including the ability and confidence to draw conclusions and make recommendations from data
* Experience interacting with key stakeholders in different fields, translating challenges and opportunities into actionable engineering strategies
* Experience with Big Data technologies such as HDFS/Hive/Spark/Mesos
* Advanced SQL (MySQL, PostgreSQL) scripting required
* Programming against APIs required
* Experience with Looker (BI Platform) a plus
* Bachelor’s degree in Computer Science, Math, Statistics, Economics, or another quantitative field, or equivalent relevant experience
The best way to apply is by creating a DreamHire profile. This ensures that your background and skills are presented accurately, and you can save your application as a draft and finish it later. Setting up your profile takes only a few minutes.