Data Engineer

Role Overview:
The Data Engineer will be a key builder on our AI journey, responsible for designing, constructing, and maintaining the data infrastructure that supports our AI initiatives. The role focuses on building robust, scalable data pipelines that extract data from a variety of sources, integrate it with our data lake/warehouse, and prepare it both for analysis by our Data Analysts and for training custom AI models. This position is critical to our current focus on vendor-provided capabilities and, eventually, to building custom solutions.

Key Responsibilities:
Design, build, and maintain scalable and efficient ETL/ELT data pipelines to ingest data from internal and external sources (e.g., APIs from EPIC and Workday, relational databases, flat files).
Maintain the data lake and data warehouse to ensure data is clean, accessible, and ready for analysis and model training.
Collaborate with the Data Analyst and other stakeholders to understand their data requirements and provide them with clean, well-structured datasets.
Implement data governance, security, and quality controls to ensure data integrity and compliance.
Automate data ingestion, transformation, and validation processes.
Work with our broader IT team to ensure seamless integration of data infrastructure with existing systems.
Contribute to the evaluation and implementation of new data technologies and tools.

Required Skills & Qualifications:
ETL/ELT Development: Strong experience in designing and building data pipelines using ETL/ELT tools and frameworks.
SQL: Advanced proficiency in SQL for data manipulation, transformation, and optimization.
Programming: Strong programming skills in Python (or a similar language) for scripting, automation, and data processing.
Data Warehousing: Experience with data warehousing concepts and technologies.
Cloud Computing: Hands-on experience with at least one major cloud platform's data services (e.g., Azure Data Factory, Microsoft Fabric, IICS).
Version Control: Proficiency with Git for code management and collaboration.
Problem-Solving: Proven ability to troubleshoot and resolve data pipeline issues.
Data Modeling: Experience with various data modeling techniques (e.g., dimensional modeling).
Real-time Processing: Familiarity with real-time data streaming technologies (e.g., Kafka, Azure Event Hubs).
Education: Bachelor's degree in Computer Science, Engineering, or a related field.

Nice-to-Have Skills:
API Integration: Experience building data connectors and integrating with APIs from major enterprise systems (e.g., EPIC, Workday).
CI/CD: Knowledge of Continuous Integration/Continuous Deployment practices for data pipelines.
AI/ML and MLOps: A basic understanding of the machine learning lifecycle and of how to build data pipelines that support model training and deployment.
Microsoft Fabric: Direct experience with Microsoft Fabric's integrated data platform (OneLake, Data Factory, Synapse Data Engineering).

Location:
Columbus
Category:
Technology
