Staff Platform Engineer, Agentic AI Systems (IFS The Loops)
We’re seeking a Staff Platform Engineer to help shape the future of agentic AI systems. In this role, you will help design the backbone of our real-time, distributed systems. You’ll be at the forefront of building systems that orchestrate massive data flows, reactive services, and agentic workloads: systems that must adapt dynamically and operate reliably under heavy and unpredictable load. You’ll work with tools like Kafka, Akka, stream processing frameworks, and other core distributed technologies, and collaborate across engineering teams to deliver infrastructure that is elastic, fault-tolerant, and observable by design.
If you’re passionate about high-performance computing, resilient architecture, and enabling real-time intelligence at scale, this role is for you.
Responsibilities
Design and implement scalable, distributed platform components with technologies like Kafka, Akka (Typed), and gRPC.
Architect and optimize data pipelines capable of handling billions of messages/events per day with low latency and high reliability.
Lead efforts in agentic scaling – dynamically spawning, routing, and managing autonomous agents (services/functions) in response to workload or demand.
Build resilient systems that self-heal, auto-scale, and degrade gracefully under pressure.
Define and implement metrics, tracing, and observability for end-to-end system behavior and performance.
Collaborate closely with infrastructure, SRE, and product teams to ensure platform scalability aligns with growth and reliability goals.
Drive root-cause analysis of performance bottlenecks and propose long-term architectural improvements.
Participate in on-call rotations, architecture reviews, and deep technical design sessions.
Qualifications
5+ years of experience building distributed systems in a high-throughput production environment.
Deep expertise with Kafka (topics, partitions, consumers, tuning, schema registry, stream processing).
Strong experience with Akka or other actor-based concurrency models; familiarity with Akka Cluster, Sharding, Persistence, or Typed API.
Solid programming skills in Java.
Understanding of agentic workloads and dynamic system orchestration (e.g., microservices that represent intelligent agents).
Experience designing scalable APIs, message protocols (e.g., Protobuf, Avro), and event-driven architectures.
Familiarity with cloud-native environments (e.g., Kubernetes, service mesh, container orchestration).
Preferred Qualifications
Experience with serverless compute models or function-as-a-service scaling paradigms.
Contributions to open-source projects in the distributed systems ecosystem.
Experience with AI or ML-driven orchestration or agentic frameworks.
Familiarity with operational tooling: Prometheus, Grafana, OpenTelemetry, Kafka monitoring tools, etc.
Additional Information
What We’re Offering
Salary Range: $175,000-$200,000 total compensation
Flexible paid time off, including sick leave and holidays
Medical, dental, & vision insurance
401(k) with company contribution
Flexible spending accounts
Life insurance and disability benefits
Tuition assistance
Community involvement and volunteering events
M/F/Disabled/Vet VEVRAA Federal Contractor. We are a Drug-Free Workplace. Interested candidates should apply at: www.ifs.com/about/careers-at-ifs
All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran. VEVRAA Federal Contractor, Equal Opportunity Employer
- Location: San Francisco, CA, United States
- Category: Architecture and Engineering Occupations