SR Lead Software Engineer - High Performance Computing
Be an integral part of an agile team that's constantly pushing the envelope to enhance, build, and deliver top-notch technology products.
As a Senior Lead Software Engineer at JPMorgan Chase within the AI Infrastructure team, you are an integral part of an agile team that works to enhance, build, and deliver trusted market-leading technology products in a secure, stable, and scalable way. Drive significant business impact through your capabilities and contributions, and apply deep technical expertise and problem-solving methodologies to tackle a diverse array of challenges that span multiple technologies and applications. You will lead virtual and direct teams of developers, teaching them best practices in high-performance computing (HPC) that intersect with AI/ML. The role is highly collaborative: you will work closely with cross-functional teams composed of data scientists, business analysts, and other engineers. You will infuse the JPMorgan developer community with an appreciation of the impact that HPC can have by delivering software that consistently outperforms other platforms. You will deliver a variety of options to serve our various business needs, sometimes driven by low latency, other times by throughput or low power.
Job responsibilities
Regularly provides technical guidance and direction to support the business and its technical teams, contractors, and vendors
Develops secure and high-quality production code, and reviews and debugs code written by others
Drives decisions that influence the product design, application functionality, and technical operations and processes
Serves as a function-wide subject matter expert in one or more areas of focus
Actively contributes to the engineering community as an advocate of firmwide frameworks, tools, and practices of the Software Development Life Cycle
Influences peers and project decision-makers to consider the use and application of leading-edge technologies
Adds to the team culture of diversity, equity, inclusion, and respect
Builds scalable and efficient inferencing and training pipelines using HPC software techniques and patterns
Works closely with business and data science teams to develop easy-to-use systems that serve their needs
Uses telemetry to create measurable frameworks for deciding among hardware and software options
Publishes and supports reusable patterns to optimize training and inference of ML models on various architectures
Supports the developer community in applying lessons from the high-performance computing (HPC) domain
Required qualifications, capabilities, and skills
Formal training or certification on software engineering concepts and 5+ years applied experience
Hands-on practical experience delivering system design, application development, testing, and operational stability
Advanced in one or more programming language(s)
Advanced knowledge of software applications and technical processes with considerable in-depth knowledge in one or more technical disciplines (e.g., cloud, artificial intelligence, machine learning, mobile)
Ability to tackle design and functionality problems independently with little to no oversight
Practical cloud native experience
Experience in Computer Science, Computer Engineering, Mathematics, or a related technical field
Advanced understanding of High-Performance Computing system architectures and network topologies
Expertise in at least one accelerator type (e.g., GPU, FPGA) and experience mapping LLMs onto these accelerators
Proficiency in parallel programming and performance analysis of accelerator-based systems
Familiarity with HPC software (e.g., NCCL, MPI) and resource schedulers (e.g., Kubernetes, SLURM)
Preferred qualifications, capabilities, and skills
Strong programming skills in Python, scripting, C, C++ with experience in AI/ML frameworks like PyTorch and LangChain
Master’s Degree in Computer Science (required)
8+ years of experience in high-performance computing software
5+ years of experience with accelerators and deep learning, particularly large language models
Experience in large organizations and regulated industries is a plus
Excellent communication skills and the ability to work collaboratively in a dynamic team environment
Location: San Francisco