Data Engineer

iHeartMedia

(New York, New York)
Full Time
Job Posting Details
About iHeartMedia
iHeartCommunications, Inc. was founded in San Antonio, TX under the name Clear Channel Communications, Inc. with the purchase of a single radio station in 1972. After decades of growing media assets globally, the company has become one of the world’s leading media and entertainment companies, operating as iHeartMedia, Inc. iHeartMedia, Inc. consists of two main media businesses: Clear Channel Outdoor Holdings (NYSE: CCO) and the wholly owned iHeartMedia.
Summary
The Data Engineer who joins our team will shape how people listen to the radio and how the marketing and advertising industry connects with our millions of listeners, and will help drive important business decisions with a robust data platform.
Responsibilities
* Develop end-to-end ETL processes in Python to send large data sets to a Hadoop cluster and bring summarized results back into a Redshift data warehouse for downstream business analysis. Data sources can include Kafka, flat files, and REST APIs (see the sketch after this list).
* Identify performance bottlenecks in data pipelines and architect faster, more efficient solutions when necessary. This may involve reaching out to internal teams and external partners to ensure the appropriate optimization standards are being followed.
* Create new data warehouse solutions and ensure best practices are followed in schema and table design.
* When needed, perform data housekeeping, data cleansing, normalization, hashing, and implementation of required data model changes.
* Increase efficiency and automate processes by collaborating with the data platform team to update existing data infrastructure (data model, hardware, cloud services, etc.).
* Work in an Agile development methodology and own data-driven solutions end to end.
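To give a concrete flavor of the first responsibility, here is a minimal sketch of one such ETL step: draining a batch from a Kafka topic, staging it in S3, and bulk-loading it into Redshift with a COPY command. All names (topic, bucket, table, DSN, IAM role) are hypothetical placeholders, not iHeartMedia systems, and a production pipeline would add batching, error handling, and schema management.

```python
import boto3                      # AWS SDK, used here to stage a batch in S3
import psycopg2                   # Postgres driver; Redshift speaks the same wire protocol
from kafka import KafkaConsumer   # kafka-python consumer

# Hypothetical configuration -- replace with real endpoints and credentials.
TOPIC = "listener-events"
BUCKET = "example-etl-staging"
KEY = "batches/listener_events.jsonl"
REDSHIFT_DSN = "host=example.redshift.amazonaws.com port=5439 dbname=dw user=etl password=secret"

# 1. Drain a bounded batch of messages from the Kafka topic.
consumer = KafkaConsumer(
    TOPIC,
    bootstrap_servers=["localhost:9092"],
    auto_offset_reset="earliest",
    consumer_timeout_ms=5000,     # stop iterating once the topic goes idle
)
lines = [msg.value.decode("utf-8") for msg in consumer]

# 2. Stage the batch in S3 as newline-delimited JSON.
boto3.client("s3").put_object(
    Bucket=BUCKET, Key=KEY, Body="\n".join(lines).encode("utf-8")
)

# 3. Bulk-load the staged file into Redshift; COPY is far faster than row-by-row INSERTs.
with psycopg2.connect(REDSHIFT_DSN) as conn, conn.cursor() as cur:
    cur.execute(
        f"COPY listener_events FROM 's3://{BUCKET}/{KEY}' "
        "IAM_ROLE 'arn:aws:iam::123456789012:role/etl-copy-role' "
        "FORMAT AS JSON 'auto'"
    )
```

Staging through S3 rather than inserting rows directly is the usual Redshift pattern: COPY parallelizes the load across the cluster's slices.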
Ideal Candidate
* Ability to write well-abstracted, reusable code components in Python
* A self-starter who can find data anomalies and fix issues without direction
* Willing and interested to work in new areas and across multiple technologies such as Kafka, RabbitMQ, Redshift, Hadoop, Linux, etc.
* Ability to investigate data issues across a large and complex system by working alongside multiple departments and systems

**Nice-to-Haves:**

* Experience with Hadoop & Hive, as well as Python's Luigi ETL framework, is a huge plus (a minimal Luigi sketch follows this list)
* Exposure to Amazon Web Services, especially S3, EC2, Redshift and EMR
* Knowledge of development-environment and provisioning tools such as Vagrant, Chef and Docker
* Understanding of modern version control tools, such as Git, as well as Continuous Integration tools such as Travis or Jenkins
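Since Luigi is called out by name, here is a minimal, hypothetical sketch of its core pattern: each task declares its dependencies (`requires`), its artifact (`output`), and the work to produce it (`run`), and Luigi runs the resulting dependency graph. The task names, file paths, and toy data below are illustrative placeholders only.

```python
import datetime

import luigi


class ExtractEvents(luigi.Task):
    """Hypothetical extract step: writes raw listening events to a local file."""
    date = luigi.DateParameter()

    def output(self):
        return luigi.LocalTarget(f"data/raw/{self.date}.csv")

    def run(self):
        # Placeholder data; a real task would pull from Kafka, flat files, or an API.
        with self.output().open("w") as f:
            f.write("user_id,minutes_listened\n1,42\n2,17\n")


class SummarizeEvents(luigi.Task):
    """Transform step: Luigi schedules it only after ExtractEvents for the same date."""
    date = luigi.DateParameter()

    def requires(self):
        return ExtractEvents(date=self.date)

    def output(self):
        return luigi.LocalTarget(f"data/summary/{self.date}.txt")

    def run(self):
        with self.input().open() as src, self.output().open("w") as dst:
            rows = src.readlines()[1:]  # skip the CSV header
            total = sum(int(r.strip().split(",")[1]) for r in rows)
            dst.write(f"total_minutes={total}\n")


if __name__ == "__main__":
    # Run the whole dependency graph locally, without a central scheduler.
    luigi.build([SummarizeEvents(date=datetime.date.today())], local_scheduler=True)
```

Because each task's output doubles as its completion marker, re-running the script skips any step whose target already exists, which is what makes Luigi pipelines idempotent and resumable.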
