CloudOps Engineer
7 Days Old
About the Role: We're looking for a CloudOps Engineer to join our fast-growing CloudOps team focused on Developer Experience, SRE, and FinOps. In this role, you'll be responsible for the reliability, performance, and observability of CloudZero's infrastructure, empowering engineering teams to ship features that help customers understand and optimize their cloud spend.
CloudZero processes billions of events daily across AWS, Azure, and GCP. Our customers rely on real-time, accurate cost data to make business-critical decisions, and any instability in our system impacts their planning. Built entirely on a unique serverless architecture (no EC2s or containers), our platform demands infrastructure that scales gracefully, fails predictably, and recovers automatically.
The problems are interesting: handling massive data volumes efficiently, ensuring sub-second query performance across terabytes of data, and scaling systems to support customers spending millions monthly, all in a modern, event-driven environment.
You Will:
- Infrastructure as Code everything. Design and maintain Pulumi modules that provision reliable, cost-efficient cloud resources. No clicking through consoles.
- Build observability into everything. Instrument systems so that failures surface quickly and debugging happens with data, not guesswork. You'll know about problems before customers do.
- Automate the boring stuff. Deployments, scaling, backups, and limit changes: if humans are doing it repeatedly, you'll build systems to automate it instead.
- Partner with product engineering. Help teams design resilient services, review architectures for operational complexity, and build deployment pipelines that enable safe and fast shipping.
- Optimize for cost and performance. CloudZero's business is helping others optimize cloud costs. We should be exemplars of efficient cloud usage ourselves.
Requirements:
- 3-5+ years of experience building and operating distributed systems in AWS
- Strong skills in Python, Infrastructure as Code (e.g., Pulumi or Terraform), and Kubernetes
- Hands-on experience with monitoring tools such as Prometheus or DataDog
- Proven ability to debug production issues under pressure
- Values thoughtful, reliable system design over reactive "hero" efforts
- Balances automation intelligently: builds solutions to real problems, not automation for its own sake
- Able to clearly explain complex technical issues to non-technical stakeholders
- Strong documentation habits to support long-term team clarity and system stability
- Excited to take ownership of infrastructure and solve operational challenges at scale
Please note: CloudZero is unable to sponsor employment visas or provide immigration-related support now or in the future. All candidates must have current, unrestricted authorization to work in the United States permanently.
Location: Boston