Applications Engineer (GPU-Accelerated)

Location: San Francisco HQ
Employment Type: Full time
Location Type: On-site
Department: R&D

Compensation (estimated salary commensurate with experience):
- IC4: $235K • Offers Equity
- IC5: $258K • Offers Equity

Our Compensation Philosophy:
- Market-based: Our formula ensures new hires earn at or above current 75th-percentile cash compensation benchmarks.
- Ownership: Our generous equity program ensures new hires are owners, not just employees.
- Transparent: We openly discuss salary expectations to avoid surprises later in the process.
- Data-driven: We use objective data to remove bias and ensure consistency in compensation decisions.

About Alembic

Alembic is pioneering a revolution in marketing by proving the true ROI of marketing activities. The Alembic Marketing Intelligence Platform applies sophisticated algorithms and AI models to finally solve this long-standing problem. When you join the Alembic team, you'll help build the tools that provide unprecedented visibility into how marketing drives revenue, helping a growing list of Fortune 500 companies make more confident, data-driven decisions.

About the Role

We're looking for a Machine Learning Applications Engineer with GPU, Python, and C++ expertise to help productionize cutting-edge causal AI models. You'll work closely with ML scientists to turn experimental research code into optimized, scalable, and well-structured software that powers Alembic's real-time analytics and inference systems. This is a hands-on, performance-focused role where you'll operate at the intersection of applied ML, systems engineering, and high-performance computing.

Key Responsibilities
- Translate early-stage ML research and prototypes into reliable, testable, and performant software components
- Use CUDA, Triton, and Numba to optimize GPU-accelerated workloads for inference and preprocessing
- Contribute to core libraries and performance-critical routines using modern C++ in hybrid Python/C++ environments
- Develop modular, reusable infrastructure that supports deployment of ML workloads at scale
- Collaborate with data scientists and engineers to optimize data structures, memory usage, and execution paths
- Build interfaces and APIs to integrate ML components into Alembic's broader platform
- Implement logging, profiling, and observability tools to track performance and model behavior

Must-Have Qualifications
- 4–7 years of software engineering experience, including substantial time in Python and C++
- Hands-on experience with GPU programming, including CUDA, Triton, Numba, or related frameworks
- Strong familiarity with the Python data stack (Pandas, NumPy, PyArrow) and low-level performance tuning
- Experience writing high-performance, memory-efficient code in C++
- Demonstrated ability to work cross-functionally with researchers, platform engineers, and product teams
- Comfort transforming research-grade ML code into maintainable, production-grade software

Nice-to-Have
- Experience with hybrid Python/C++ or Python/CUDA extension development (e.g., Pybind11, Cython, custom ops)
- Familiarity with ML serving or inference tools (e.g., TorchServe, ONNX Runtime, Triton Inference Server)
- Exposure to structured data modeling, causal inference, or large-scale statistical computation
- Background in distributed systems or parallel processing is a plus

What You'll Get
- A pivotal role building GPU-accelerated software at the heart of a real-world AI product
- Collaboration with an elite team of ML scientists, engineers, and product leaders
- The opportunity to shape performance-critical infrastructure powering enterprise decision-making
- A culture rooted in technical rigor, curiosity, and product impact

Compensation Range: $235K – $258K
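As a flavor of the "research code to production code" work this role involves, here is a minimal, hypothetical sketch (function names are illustrative, not Alembic's code) of replacing a loop-based preprocessing step with a vectorized NumPy equivalent:

```python
import numpy as np

def normalize_loop(x):
    # Research-style prototype: per-element Python loop (slow at scale).
    out = np.empty(len(x), dtype=np.float64)
    m = x.mean()
    s = x.std()
    for i in range(len(x)):
        out[i] = (x[i] - m) / s
    return out

def normalize_vectorized(x):
    # Production-style rewrite: one fused NumPy expression, no Python loop.
    x = np.asarray(x, dtype=np.float64)
    return (x - x.mean()) / x.std()

# Both versions produce the same result on sample data.
x = np.random.default_rng(0).normal(loc=3.0, scale=2.0, size=1_000)
```

The same pattern, pushing per-element Python work down into compiled kernels, extends to the CUDA, Triton, and Numba workloads named above.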
Location:
San Francisco, CA, United States