Python Data Engineer


Start Date: 05/26/2025
Title: Python Data Engineer
Location: Remote
Type: Temp to Perm
Description:
We are looking for a Python Data Engineer who will bring strong expertise in CMS datasets (MOR, MMR, MAO) and an understanding of healthcare regulations. The role requires proficiency with modern cloud data engineering tools, including Dataflow, BigQuery, and Airflow for orchestration, along with solid foundational knowledge of data warehousing concepts and optimization techniques for large healthcare datasets.
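As a rough sketch of the kind of Dataflow (Apache Beam) pipeline described here, the snippet below reads newline-delimited JSON from Cloud Storage, drops records that fail a simple check, and appends to BigQuery. The bucket path, table name, and validation rule are hypothetical placeholders, and runner-specific options (--runner=DataflowRunner, --project, --region) would be supplied at launch rather than hard-coded.

import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def parse_record(line: str) -> dict:
    # Placeholder: parse one newline-delimited JSON record from a CMS extract.
    return json.loads(line)


def run() -> None:
    # Runs locally with the DirectRunner; pass --runner=DataflowRunner,
    # --project, and --region on the command line to execute on Dataflow.
    options = PipelineOptions()
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadExtract" >> beam.io.ReadFromText("gs://example-bucket/mor/*.json")
            | "ParseRecords" >> beam.Map(parse_record)
            | "DropInvalid" >> beam.Filter(lambda record: record.get("member_id"))
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                "example-project:cms.mor_records",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
            )
        )


if __name__ == "__main__":
    run()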
What You Will Do:
- Design, develop, and maintain scalable ETL pipelines for CMS datasets using GCP Dataflow and Python.
- Architect and manage data warehouses using BigQuery, ensuring scalability and cost-efficiency.
- Implement Airflow DAGs to orchestrate and schedule complex data workflows (a minimal DAG sketch follows this list).
- Ensure data quality, validation, lineage, and governance aligned with CMS and HIPAA compliance standards.
- Optimize large-scale datasets through partitioning, clustering, sharding, and cost-effective query patterns in BigQuery.
- Work collaboratively in Agile teams, using Jira for project tracking and Confluence for documentation.
- Monitor and troubleshoot data pipelines, ensuring reliability and operational excellence.
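A minimal sketch of the kind of DAG this involves, assuming Airflow 2.x; the DAG id, schedule, and task bodies are hypothetical placeholders, not part of this role's actual codebase:

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_mor_files(**context):
    # Placeholder: pull the latest MOR extract from a landing bucket.
    ...


def load_to_bigquery(**context):
    # Placeholder: load validated records into a partitioned BigQuery table.
    ...


with DAG(
    dag_id="cms_mor_daily_load",            # hypothetical DAG name
    schedule_interval="0 6 * * *",          # daily at 06:00 UTC
    start_date=datetime(2025, 1, 1),
    catchup=False,
    tags=["cms", "risk-adjustment"],
) as dag:
    extract = PythonOperator(task_id="extract_mor_files", python_callable=extract_mor_files)
    load = PythonOperator(task_id="load_to_bigquery", python_callable=load_to_bigquery)

    extract >> load                         # run the load only after extraction succeeds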
You Will Be Successful If:
- You are self-motivated, proactive, and able to thrive in a fast-paced, agile startup environment with minimal supervision.
- You take strong ownership of tasks and deliverables and see them through to completion.
- You are an eager self-learner who stays current with emerging technologies and industry trends.
- You communicate clearly in writing and in person, collaborating effectively across multidisciplinary teams.
What You Will Bring:
- Bachelor's degree in Computer Science, Information Systems, or a related field.
- 7+ years of experience in cloud-based data engineering, preferably with healthcare datasets.
- Extensive experience working with risk adjustment.
- Expertise in building ETL pipelines using GCP Dataflow (Apache Beam) and Python.
- Expert-level experience with BigQuery, including schema design, optimization, and advanced SQL.
- Hands-on experience with Airflow orchestration for large-scale data workflows.
- Deep understanding of data warehouse concepts such as star schema, snowflake schema, normalization, and denormalization.
- Expertise in dataset optimization techniques: query optimization, partitioning, and clustering (a minimal sketch follows this list).
- Familiarity with Agile processes, Jira, Confluence, and cloud-native engineering best practices.
- Knowledge of CMS datasets (MOR, MMR, MAO) and healthcare data compliance (HIPAA).
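As a minimal illustration of the partitioning and clustering item above, the sketch below creates a month-partitioned, clustered table with a recent google-cloud-bigquery client; the project, dataset, table, and column names are hypothetical, not a prescribed schema.

from google.cloud import bigquery

client = bigquery.Client(project="example-healthcare-project")  # hypothetical project

schema = [
    bigquery.SchemaField("member_id", "STRING"),
    bigquery.SchemaField("plan_id", "STRING"),
    bigquery.SchemaField("payment_month", "DATE"),
    bigquery.SchemaField("risk_score", "FLOAT"),
]

table = bigquery.Table("example-healthcare-project.cms.mmr_payments", schema=schema)

# Partition by payment month and cluster on the most common filter columns so
# queries prune partitions and scan fewer bytes, keeping costs predictable.
table.time_partitioning = bigquery.TimePartitioning(
    type_=bigquery.TimePartitioningType.MONTH,
    field="payment_month",
)
table.clustering_fields = ["plan_id", "member_id"]

client.create_table(table, exists_ok=True)

Month-level partitioning with clustering on plan and member columns is one reasonable layout; the right choice depends on actual query filters and data volume.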
Location: Houston
Category: Technology
