JOB TITLE: Data Engineer - Transformation Engineer
Location: New York/Remote/Remote Commutable
Full-Time/Permanent
ABOUT THE ROLE:
As a Data Engineer, you will play a key role in designing and building scalable, real-time and batch data solutions that power analytics and operational platforms across NBCUniversal Operations & Technology. This role will focus on building and maintaining API connectors, setting up event-driven data pipelines, managing data flow into an event house or lakehouse, and ensuring data quality and reliability throughout. You will collaborate with cross-functional teams to enable the efficient movement and transformation of data for enterprise use.
RESPONSIBILITIES:
- API Connector Development: Build, maintain, and document scalable API integrations with internal and external platforms to automate data ingestion and synchronization.
- Event-Driven Architecture: Design and implement data pipelines using event-driven frameworks and messaging systems (e.g., Kafka, Kinesis) to support real-time analytics and operational use cases.
- Data Lakehouse Engineering: Manage data ingestion and transformation pipelines feeding into an event house or lakehouse architecture, ensuring consistency and scalability across datasets.
- Data Transformation & Orchestration: Use ETL/ELT tools and scripting to clean, transform, and enrich data from various sources into usable formats for analysts and stakeholders.
- Data Quality & Validation: Establish automated validation rules, monitor data integrity, and proactively resolve quality issues to ensure trust in data systems.
- Collaboration & Enablement: Partner with analytics, business, and engineering teams to understand data requirements and deliver scalable solutions that meet evolving needs.
REQUIREMENTS:
- Bachelor’s degree in Computer Science, Engineering, Data Science, or a related field, or equivalent experience.
- 3-6 years of experience as a data engineer or in a similar backend data role.
- Experience building and maintaining API integrations and event-driven data pipelines.
- Proficiency in SQL and data pipeline scripting (e.g., Python, PySpark).
- Experience with cloud platforms such as AWS, Azure, or Google Cloud, and with cloud data services (e.g., S3, Redshift, Snowflake, or Databricks).
- Familiarity with event streaming platforms such as Apache Kafka, Kinesis, or similar.
- Strong attention to detail with a commitment to data accuracy and quality assurance.
- Ability to thrive in a fast-paced, collaborative environment.
PREFERRED QUALIFICATIONS:
- Experience with CI/CD pipelines and infrastructure-as-code tools (e.g., Terraform, GitHub Actions).
- Familiarity with lakehouse architecture principles and tools like Delta Lake or Apache Iceberg.
- Experience building robust logging, error handling, and monitoring for data pipelines.
- Understanding of data governance, privacy, and compliance principles.
- Strong documentation and communication skills, with the ability to explain technical solutions to non-technical stakeholders.
- Background in media, entertainment, or consumer technology environments.