Principal Platform Engineer, Infrastructure & Tooling for Services, Data, and GenAI, San Francisco

Job Summary:

We’re looking for a Principal Platform Engineer, Infrastructure & Tooling for Services, Data, and GenAI to help shape the future of infrastructure at Disney’s Ad Platforms. In this high-impact role, you’ll lead innovation across the cloud platforms that support both data and software engineering teams, designing systems that are secure, scalable, and built to accelerate delivery without compromising quality.

As a senior technical expert, you’ll drive the architecture and evolution of our AWS-based infrastructure, champion cloud-native CI/CD practices, and build automation and internal tooling that empower developers to move faster, more safely, and with greater autonomy. You’ll also explore the use of GenAI to streamline workflows and eliminate repetitive, time-consuming tasks.

This is a hands-on role with strategic reach. You’ll collaborate closely with security, DevOps, SRE, and application teams to deliver platform capabilities that improve developer experience, streamline operations, and support rapid, high-quality delivery of new features and services.

If you’re passionate about building reliable platforms, enabling teams to ship with confidence, and bringing intelligent automation into everyday engineering workflows, we’d love to hear from you. Help us scale Disney’s infrastructure platform with a focus on security, performance, developer experience, and innovation.

Responsibilities:

Design and evolve resilient, secure, and scalable AWS infrastructure for software services and data pipelines, while driving platform standards and ensuring adherence across teams.

Lead adoption and enhancement of Kubernetes (EKS) with integrated Istio service mesh for traffic management and observability.

Architect secure network configurations, including VPC design, IAM, peering, and service-to-service security.

Own and improve CI/CD automation using Jenkins, Spinnaker, and GitHub Actions, ensuring scalable and governed software delivery.

Drive infrastructure-as-code best practices with Terraform, enabling reusable and composable module patterns.

Lead AIOps initiatives including anomaly detection, automated remediation, and intelligent alerting to reduce operational noise.

Identify and implement automation opportunities using Python, Go, or similar, to eliminate repetitive tasks and accelerate delivery.

Act as a technical SME during high-severity incidents and root cause investigations, collaborating with infrastructure, data, and application teams.

Establish and advocate for best practices in cloud security, compliance, cost governance, and disaster recovery.

Guide teams on managing infrastructure across environments with minimal manual effort and strong observability hooks.

Partner with engineering leads from Ads Data and Application teams to co-design infrastructure that meets shared goals.

Mentor engineers and stakeholders on DevOps, cloud architecture, and platform maturity.

Collaborate with platform engineering to evolve internal developer tooling, onboarding patterns, and golden paths.

Basic Qualifications:

Bachelor’s Degree in Computer Science, Software Engineering, or related technical discipline; or requisite work experience.

10+ years of experience across Infrastructure, DevOps, Software Engineering, or Site Reliability Engineering in large-scale cloud environments.

Deep expertise with AWS core services (EKS, IAM, VPC, S3, CloudWatch, ALB/NLB).

Strong experience with Kubernetes (especially Amazon EKS) and Istio in production environments.

Hands-on experience with Jenkins, Spinnaker, and GitHub Actions in enterprise-scale CI/CD environments.

Proficiency in one or more languages used for automation (Python, Go, Java).

Experience designing and securing infrastructure from CI/CD to runtime, including the network layer.

Ability to translate business needs into infrastructure blueprints and automation strategies.

Preferred Qualifications:

Master’s Degree in Computer Science, MBA, or a related field; and/or related certification(s).

Experience designing and supporting infrastructure for both real-time data pipelines (Spark, Flink, Kafka, Databricks) and orchestration frameworks (Airflow, MWAA), as well as traditional application workloads.

Background in media, ad tech, or regulated environments (SOX, GDPR, CCPA).

Exposure to Datadog, Prometheus, or similar observability stacks.

Familiarity with AIOps platforms or custom GenAI-based automation solutions for operational use cases.

Experience leading infrastructure modernization or platform consolidation initiatives across organizations.

The hiring range for this position in Santa Monica, CA and Glendale, CA is $184,300 to $247,100 per year, in Seattle, WA is $193,100 to $258,900 per year, and in San Francisco, CA is $201,900 to $270,700 per year. The base pay actually offered will take into account internal equity and also may vary depending on the candidate’s geographic region, job-related knowledge, skills, and experience among other factors. A bonus and/or long-term incentive units may be provided as part of the compensation package, in addition to the full range of medical, financial, and/or other benefits, dependent on the level and position offered.

Location: San Francisco