Job ID : 84652-1
Job Title : Data Engineer
Location : Malvern, PA
Duration : 6 months + possible extension
Rate Range : $45 - $60/hour on W2/C2C (all-inclusive)
We are seeking an experienced Tech Lead-Data Engineer with a strong background in Java, AWS, Python, PySpark, and event-driven architectures. The role involves designing and building scalable batch and streaming data pipelines, optimizing cloud data platforms, and delivering high-quality, reliable datasets that support analytics, reporting, and machine learning workloads.
Required Skills & Qualifications
- 15 years of professional experience in Data Engineering.
- Strong expertise in Python and PySpark for large-scale data processing.
- Advanced hands-on experience with AWS (S3, Glue, EMR, Lambda, Step Functions, Kinesis/MSK, DynamoDB, Athena, Redshift).
- Deep experience building event-driven and streaming data pipelines.
- Strong SQL experience for analytical and ETL workloads.
- Hands-on experience with workflow orchestration tools such as Airflow or Step Functions.
- Experience with CI/CD, Git, and Infrastructure-as-Code (Terraform or CloudFormation).
- Strong understanding of distributed systems, Spark performance tuning, data modeling, and cloud cost optimization.
- Knowledge of data security, encryption, networking, and compliance best practices in cloud environments.
- Prior work experience at the client or in the client's industry.
Applicants must be able to work directly for Artech on W2.
Preferred Skills & Qualifications
- Strong design and architectural understanding.
- Excellent communication and stakeholder interaction skills.
- Ability to work in a globally distributed team.
Responsibilities
- Architect, build, and maintain event-driven data pipelines using AWS services such as Kinesis, MSK/Kafka, Lambda, Step Functions, SQS/SNS, and Glue/EMR.
- Develop ETL/ELT workflows using Python and PySpark, ensuring performance, scalability, and cost efficiency.
- Implement and optimize Spark-based data transformations, partitioning strategies, and data processing frameworks.
- Design and manage data lake and warehouse structures using S3, Glue Catalog, Athena, and/or Redshift.
- Build streaming solutions with checkpointing, stateful transformations, idempotency, and schema evolution.
- Ensure high standards of data quality, observability, monitoring, and alerting (CloudWatch, Datadog, etc.).
- Implement data security best practices including IAM, encryption (KMS), networking, and governance.
- Create reusable frameworks, internal libraries, and CI/CD pipelines for automated deployments.
- Collaborate with data scientists, analysts, and business teams to deliver well-modeled, reliable datasets.
- Lead design reviews, mentor junior engineers, and contribute to engineering best practices.
Benefits
- Competitive salary and benefits package.
- Opportunities for professional growth and development.
- Inclusive and innovative work environment.
For immediate consideration, please click APPLY to begin the screening process with Alex.