IT Data Engineer
- Western Michigan University Homer Stryker M.D. School of Medicine
- Kalamazoo, Michigan
- Full Time
Please note: this is a salaried hybrid position that requires on-site attendance. Candidates must be Michigan residents or willing to relocate to Michigan at their own expense.
The Data Engineer is responsible for designing, building, and maintaining the organization's data
infrastructure to support a scalable, governed, and analytics-ready environment. This role focuses on the
development of robust data pipelines, integration of disparate data sources, and optimization of data storage
and processing frameworks, while contributing to the evolution of the WMed data platform toward a
data-as-a-service (DaaS) model that enables standardized, secure, and reusable access to trusted data assets
across clinical operations, medical education, research, compliance, and executive decision-making.
Working within a modern data architecture, the Data Engineer transforms fragmented, system-centric data
into structured, reliable, and accessible datasets. This role partners closely with analysts, stakeholders, and
technical teams to ensure data availability, integrity, and performance across platforms including partner
EHRs, academic systems, and enterprise applications.
BENEFITS
Wellness reimbursement.
Continuing education and tuition reimbursement.
Employer-funded retirement plan.
Two medical plan options: PPO and High Deductible Health Plan (HDHP) with employer HSA contribution.
Flexible work solutions based on position and department.
Up to four weeks of PTO accrual beginning in year one.
Paid holidays.
Paid volunteer time.
Paid preferred holiday.
DUTIES AND RESPONSIBILITIES:
Design, build, and maintain scalable data pipelines to ingest, transform, and load data from clinical, academic, and enterprise systems
Develop and manage ETL processes using tools such as Pentaho and Microsoft SSIS, ensuring reliability and performance
Design and implement data models and schemas to support downstream analytics and reporting use cases
Contribute to the development of a DaaS platform, enabling reusable, governed data products and standardized access patterns for analysts, applications, and self-service users
Optimize and maintain PostgreSQL data environments, including performance tuning and storage strategies
Implement and monitor data quality, validation, and error-handling processes
Establish and maintain data lineage, metadata, and documentation to support governance and transparency
Integrate new data sources into the data platform, including APIs, flat files, and third-party systems
Troubleshoot and resolve data issues across the full data lifecycle
Support the evolution of organizational data architecture toward a modern, scalable platform
Contribute to standards, best practices, and governance frameworks for data engineering
All other duties as assigned
To perform this job successfully, an individual must be able to perform each duty satisfactorily. The requirements listed below are representative of the knowledge, skill, and/or ability required.
EDUCATION AND/OR EXPERIENCE:
Bachelor's degree in Computer Science, Information Systems, Engineering, or related technical field (or equivalent experience).
3 to 5 years of experience in data engineering, ETL development, or data platform engineering.
Advanced proficiency in SQL (PostgreSQL or similar relational databases).
Experience designing and managing data pipelines and workflows (Pentaho, Microsoft SSIS, or similar ETL tools).
Experience working with healthcare data systems preferred.
Experience integrating academic and enterprise systems preferred.
OTHER SKILLS AND ABILITIES:
Strong understanding of data architecture, data modeling, and data warehousing concepts.
Experience building and maintaining data pipelines (ETL), orchestration, and scheduling frameworks.
Knowledge of data warehousing and/or data lake architectures.
Experience with data quality frameworks, validation, and monitoring.
Familiarity with performance tuning and query optimization.
Familiarity with modern data platform concepts including DaaS and self-service analytics enablement.
Understanding of data governance, lineage, and metadata management concepts.
Experience working with structured and semi-structured data (CSV, JSON, APIs).
Ability to troubleshoot data issues across multiple systems and layers (source to pipeline to warehouse to reporting).
Experience with BI tools (Power BI or similar) for downstream data consumption.
Ability to learn and adapt to evolving data technologies and platforms.