Senior Site Reliability Engineer II (Kafka)
At Braze, we have found our people. We’re a genuinely approachable, exceptionally kind, and intensely passionate crew.
We seek to ignite that passion by setting high standards, championing teamwork, and creating work-life harmony as we collectively navigate rapid growth on a global scale while striving for greater equity and opportunity – inside and outside our organization.
To flourish here, you must be prepared to set a high bar for yourself and those around you. There is always a way to contribute: Acting with autonomy, having accountability and being open to new perspectives are essential to our continued success.
Our deep curiosity to learn and our eagerness to share diverse passions with others give us balance and inject a one-of-a-kind vibrancy into our culture.
If you are driven to solve exhilarating challenges and have a bias toward action in the face of change, you will be empowered to make a real impact here, with a sharp and passionate team at your back. If Braze sounds like a place where you can thrive, we can’t wait to meet you.
What You'll Do
Site Reliability Engineers (SREs) are responsible for keeping all internal-facing services and platforms running smoothly. In a nutshell, SREs ensure site uptime. SREs are a blend of sensible system administrators and software engineers who apply sound engineering principles, operational discipline, and mature automation to the environments and infrastructure services we provide. We specialize in systems, whether that means networking, the Linux kernel, or a more specific interest in scaling, algorithms, or distributed systems.
Our team improves automation and infrastructure reliability, and empowers Braze’s other engineering teams to easily leverage the infrastructure products and platforms we create. Braze operates at a massive scale with over 3.3 billion monthly active users across our customers, collecting hundreds of billions of data points each month, and sending billions of messages to end-users daily. We use a diverse technology stack rooted in Ruby on Rails, MongoDB, Redis, Kafka, Kubernetes, and more. As a Senior Site Reliability Engineer at Braze, you will collaborate with your team and consumer engineering teams to continuously improve the infrastructure, automation, and tooling that build internal products from these technologies.
Responsibilities
Partner with Braze’s engineering teams on:
Architecting products to effectively utilize infrastructure platforms in a scalable, reliable manner
Debugging reliability and scalability issues across all stack layers, including the products built using our infrastructure platforms
Make monitoring and alerting alert on symptoms rather than on outages
Ensure that Braze meets our strict enterprise-grade SLAs with customers
Develop Braze’s internal platform infrastructure:
Create Infrastructure as code using Chef, Terraform, and Kubernetes
Develop deployment pipelines for applications in multiple languages using Docker, Kubernetes, etc.
Provide centralized/common tooling, services, and automation frameworks that are critical for scaling operations, capacity management, reducing operational pain, and improving the day-to-day workflow of Braze’s engineering teams
Manage incidents:
Be on a PagerDuty rotation to respond to availability incidents and provide support for other engineers
Use your on-call shift to prevent incidents from ever happening
Retrospect everything that happens to turn lessons into system improvements/changes, automation, etc.
Who You Are
5+ years of experience as a Software, DevOps, or Site Reliability Engineer
3+ years of Data Streaming Reliability Engineering
Experience in monitoring, troubleshooting, and optimizing Kafka streaming applications, including diagnosing lag, partition imbalances, consumer group issues, and broker failures
Expertise in setting up alerting, dashboards, and runbooks for high-availability and fault-tolerant streaming pipelines
3+ years of Kafka performance tuning & automation
Strong background in scaling Kafka clusters, tuning producer/consumer configurations, and managing schema evolution.
Proficiency in infrastructure automation (Terraform, Ansible, Kubernetes) and CI/CD practices to streamline deployments and ensure resilient data streaming workflows.
You think about systems - interfaces, boundaries, edge cases, failure modes, behaviors, specific implementations
Have an urge to collaborate, document, and deliver quickly
Collaborate across global remote teams, often working asynchronously
Document everything so you don't need to learn the same thing (or plan the same work) twice
Deliver fast to delight our customers, even internal ones
Have an enthusiastic, go-for-it attitude. When you see something broken, you can't help but fix it
Have a desire to solve everyday challenges facing software engineers and automate their toil away
Have an excellent ability to manage multiple tasks and expectations at once
Know your way around Linux and the Unix shell
Have strong programming skills - Ruby and/or Go preferred
Have experience with Docker, Kubernetes, Terraform, or similar IaC technologies
Have experience with MongoDB, Redis, Kafka, Postgres, or similar data technologies
For candidates based in the United States, the pay range for this position at the start of employment is expected to be between $140,800 and $232,000/year with an expected On Target Earnings (OTE) between $156,000 and $258,000/year (including bonus or commission). Your exact offer may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. In addition to cash compensation, Braze offers full- and part-time employees a comprehensive Total Rewards package that includes equity grants of restricted stock (RSUs) so that all Braze employees own a piece of our company.
What We Offer
Braze benefits vary by location, and we encourage you to review our specific benefits offerings for each country here. More details on benefits plans will be provided if you receive an offer of employment.
Benefits
From offering comprehensive benefits to fostering hybrid ways of working, we’ve got you covered so you can prioritize work-life harmony. Braze offers benefits such as:
Competitive compensation that may include equity
Retirement and Employee Stock Purchase Plans
Flexible paid time off
Comprehensive benefit plans covering medical, dental, vision, life, and disability
Family services that include fertility benefits and equal paid parental leave
Professional development supported by formal career pathing, learning platforms, and a yearly learning stipend
A curated in-office employee experience, designed to foster community, team connections, and innovation
Opportunities to give back to your community, including an annual company-wide Volunteer Week and donation matching
Employee Resource Groups that provide supportive communities within Braze
Collaborative, transparent, and fun culture recognized as a Great Place to Work
About Braze
Braze is the leading customer engagement platform that empowers brands to Be Absolutely Engaging. Braze allows any marketer to collect and take action on any amount of data from any source, so they can creatively engage with customers in real time, across channels from one platform. From cross-channel messaging and journey orchestration to AI-powered experimentation and optimization, Braze enables companies to build and maintain absolutely engaging relationships with their customers that foster growth and loyalty.
Braze is proudly certified as a Great Place to Work in the U.S., the UK, Australia, and Singapore. In 2025, we were recognized as one of Built In’s Best Places to Work. In 2024, we were included in U.S. News & World Report’s Best Companies to Work For (Top 10%) and recognized in Great Place to Work’s Fortune Best Medium Workplaces, Fortune Best Workplaces in Technology, Fortune Best Workplaces for Parents, and Fortune Best Workplaces for Women.
Additionally, we were featured in Great Place to Work UK’s Best Workplaces, Best Workplaces in Europe, Best Workplaces for Development, Best Workplaces for Wellbeing, Best Workplaces for Women, and Best Workplaces in Technology.
You’ll find many of us at headquarters in New York City or around the world in Austin, Berlin, Bucharest, Chicago, Dubai, Jakarta, London, Paris, San Francisco, Singapore, São Paulo, Seoul, Sydney and Tokyo – not to mention our employees in nearly 50 remote locations.
BRAZE IS AN EQUAL OPPORTUNITY EMPLOYER
At Braze, we strive to create equitable growth and opportunities inside and outside the organization.
Building meaningful connections is at the heart of everything we do, and that includes our recruiting practices. We're committed to offering all candidates a fair, accessible, and inclusive experience – regardless of age, color, disability, gender identity, marital status, maternity, national origin, pregnancy, race, religion, sex, sexual orientation, or status as a protected veteran. When applying and interviewing with Braze, we want you to feel comfortable showcasing what makes you you.
We know that sometimes different circumstances can lead talented people to hesitate to apply for a role unless they meet 100% of the criteria. If this sounds familiar, we encourage you to apply, as we’d love to meet you.
Please see our Candidate Privacy Policy for more information on how Braze processes your personal information during the recruitment process and, if applicable based on your location, how you can exercise any privacy rights.
Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Engineering and Information Technology
Location: San Francisco, CA
Salary: $200
Category: Engineering