Need Consultants local to Plano, TX.
Job Title: Lead Data Engineer
Location: Plano, TX (Onsite)
Duration: 12+ Months (W2 Contract)
Job Description:
We need a highly skilled Lead Data Engineer to design, build, and optimize scalable data platforms and pipelines in a cloud-native environment. The ideal candidate will have deep expertise in modern data engineering practices, distributed data processing, cloud architecture, and data governance. This role requires strong technical leadership, hands-on engineering capabilities, and the ability to mentor teams while driving best practices across enterprise data platforms.
Key Responsibilities
- Lead the design, development, and optimization of large-scale data pipelines and data platforms.
- Architect and implement high-performance batch and real-time data processing solutions using Spark and cloud-native technologies.
- Drive data engineering best practices, coding standards, performance tuning, and operational excellence.
- Design and maintain metadata-driven frameworks to improve scalability, reusability, and maintainability of data solutions.
- Handle complex Change Data Capture (CDC) implementations and troubleshoot edge cases across heterogeneous source systems.
- Resolve sparse column ingestion challenges and ensure data consistency, quality, and reliability across data platforms.
- Implement robust monitoring, observability, and failure recovery mechanisms for distributed systems.
- Collaborate with Data Architects, Data Scientists, Analysts, and Business stakeholders to deliver data-driven solutions.
- Lead cloud integration initiatives, including secure cross-account AWS connectivity and data sharing.
- Establish and enforce data governance, security, lineage, and access control policies using Unity Catalog and related governance tools.
- Mentor junior and senior engineers and provide technical leadership for project delivery.
Required Skills & Qualifications
- 12+ years of IT experience, including at least 5 years in Big Data/Data Engineering.
- Strong expertise in Scala programming and Apache Spark (RDD, DataFrames, Spark SQL).
- Hands-on experience with Databricks and the AWS ecosystem.
- Experience with Workday HCM integrations, including data extraction and transformation.
- Proficiency in SQL and data modeling concepts.
- Experience with ETL tools, especially Informatica PowerCenter, and with migration strategies.
- Strong understanding of data pipeline architecture and distributed systems.
- Proven experience in project management and team leadership.
- Excellent communication and stakeholder management skills.
The candidate should be strong in the following areas:
- Spark optimization
- Data Engineering fundamentals
- CDC edge cases
- Sparse column ingestion issues
- Metadata-driven frameworks
- Cross-account AWS integration
- Unity Catalog governance
- Distributed system failures