Senior Principal Data Architect

  • Work from Home
  • Lebanon, Tennessee
  • Full Time
This position is incentive eligible.

This is our story

Born from our Care Transformation and Innovation team, DT&I was created to expand HCA Healthcare's digital and AI strategy. We're building intelligent systems, enhancing workflows, and driving innovation across a nationwide network. If you're ready to build technology that saves lives and improves care, your future starts here.

What you will accomplish in this role

Job Summary

The Principal Data Architect - Multi-Modal Ingestion & Intelligent Lakehouse Platforms (GCP) is a senior technical leadership role responsible for defining and delivering next-generation data architecture capabilities for HCA's enterprise data ecosystem. This role focuses specifically on multi-modal ingestion frameworks and intelligent document management within a GCP-based Lakehouse architecture, optimized for cost efficiency, scalability, performance, and insight generation.

This individual will lead the design and evolution of platforms that ingest, process, store, govern, and serve structured, semi-structured, and unstructured data (documents, text, images, PDFs, audio/video metadata) using modern data engineering techniques, LLMs, and AI-assisted pipelines. The role requires deep expertise in GCP, Lakehouse patterns, and advanced data ingestion strategies, combined with the ability to translate business and clinical needs into durable, enterprise-grade architectures.

As a Principal Architect, this role sets technical direction across multiple teams, mentors senior engineers and architects, and acts as a key partner to Data Engineering, AI/ML, Security, and Governance organizations. The outcome is a secure, governed, and intelligent data platform that accelerates analytics, AI adoption, and operational insight across HCA.

What you will do:

Core Competencies

The following are highlighted entrepreneurial competencies and core expectations for the job/role:

  • Strategic thinking and architectural leadership
  • Deep technical expertise in cloud data platforms
  • Strong problem-solving and systems design skills
  • Ability to bridge business, clinical, and technical domains
  • Clear written and verbal communication

This role will focus on setting technical direction on groups of applications and similar technologies as well as taking responsibility for the implementation of technically robust solutions encompassing all business, architecture, and technology constraints.

Technical Leadership & Architecture

  • Define and own enterprise-scale data architecture patterns for multi-modal ingestion and Lakehouse platforms on Google Cloud Platform.
  • Architect and evolve document-centric and unstructured data pipelines that support ingestion, enrichment, embedding, indexing, storage, and retrieval at scale.
  • Lead the design of intelligent ingestion frameworks leveraging LLMs and advanced techniques (e.g., semantic chunking, embeddings, metadata extraction, classification, and enrichment).
  • Establish architectural standards for cost-optimized, high-throughput ingestion across batch, streaming, and event-driven workloads.
  • Drive platform designs that support search, analytics, and downstream AI/ML use cases in partnership with the AI/ML organization.

Lakehouse & Data Platform Design

  • Architect and manage enterprise Lakehouse environments using technologies such as BigQuery, Apache Iceberg, Delta Lake, and GCS.
  • Ensure strong design around schema evolution, ACID compliance, partitioning strategies, metadata management, and lifecycle policies.
  • Optimize storage and compute usage to balance performance and cost across large-scale document and data repositories.
  • Design data models and access patterns that support both analytical and AI-driven workloads.

Ingestion, Processing & AI Enablement

  • Design and oversee ETL/ELT and ingestion pipelines using Dataflow, Dataproc, Pub/Sub, Cloud Run, GKE, and related services.
  • Integrate AI/ML services into ingestion and processing pipelines for document understanding and content intelligence in partnership with that practice area.
  • Partner with AI/ML teams to enable embedding generation, vector storage, and retrieval patterns aligned with enterprise governance standards.
  • Ensure ingestion frameworks are resilient, observable, and designed for continuous evolution.

Governance, Security & Compliance

  • Define and enforce best practices for data governance, security, privacy, and compliance (HIPAA, GDPR) across structured and unstructured data.
  • Ensure architectural alignment with enterprise policies for data retention, lineage, access control, and auditability.
  • Participate in and lead architectural design reviews to ensure adherence to standards and patterns.

Collaboration & Influence

  • Collaborate with business, clinical, analytics, and engineering stakeholders to translate requirements into scalable architectural solutions.
  • Provide architectural guidance for cloud migrations and modernization initiatives involving document and data platforms.
  • Maintain a holistic view of enterprise information assets through diagrams, reference architectures, and technical roadmaps.
  • Act as a technical mentor and advisor to senior engineers and architects.
What qualifications you will need:
  • Bachelor's degree in computer science, related technical field, or equivalent experience required

  • Master's degree in computer science or related field preferred
  • 5+ years of experience as a Cloud Data or Information Architect required
  • 7+ years of experience in Healthcare preferred
  • 15+ years of experience in Information Technology required
  • Deep experience designing enterprise data architectures on Google Cloud Platform or other Cloud Service Providers.
  • Hands-on expertise with GCP (or similar cloud platform like Azure or AWS) services, including:

- BigQuery, Cloud Storage, Dataflow, Dataproc

- Pub/Sub, Cloud Run, GKE, Cloud Functions

- Bigtable, Cloud SQL, Cloud Spanner

  • Strong knowledge of Lakehouse technologies and formats: Iceberg, Delta Lake, Parquet, Avro, JSON.
  • Experience with document and unstructured data processing, including ingestion, enrichment, and indexing.
  • Practical experience integrating LLMs and AI frameworks (e.g., Vertex AI, LangChain, embeddings, RAG patterns) into data pipelines.
  • Proficiency in Python, SQL, and JVM-based languages (Java/Scala).
  • Experience with CI/CD, DevOps, and infrastructure-as-code for data platforms.
  • Strong understanding of data security, privacy, and regulatory requirements in cloud environments.
  • Ability to analyze complex problems, modernize legacy pipelines, and design scalable solutions.
  • Ability to communicate complex architectures clearly to both technical and non-technical audiences.

Certifications (a plus, but not required):

  • GCP Professional Data Engineer
  • GCP Professional Cloud Architect

PHYSICAL DEMANDS/WORKING CONDITIONS (Specific statements of physical effort required and description of work environment; e.g., prolonged sitting at CRT, required travel %)

  • Prolonged sitting or standing at computer workstation including use of mouse, keyboard, and monitor.
  • Requires ability to provide after-hours support.

At HCA Healthcare, we are committed to fostering a culture of growth that allows you to build the career of a lifetime. We encourage you to apply for our Principal Data Architect - Multi-Modal Ingestion opening today. We review all applications promptly, and qualified candidates will be contacted to continue the process. Join us!

We are an equal opportunity employer. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

Job ID: 523519052
Originally Posted on: 6/3/2026
