Senior Software Engineer - AI/ML for AWS Neuron Inference

AWS Neuron is the comprehensive software stack for AWS Inferentia and Trainium, Amazon's cloud-scale machine learning accelerators. We are seeking a seasoned software engineer to join our Machine Learning Inference Applications team. In this role, you will play a crucial part in developing and optimizing core components of LLM inference, including Attention, MLP, Quantization, Speculative Decoding, and Mixture of Experts. You will collaborate closely with chip architects, compiler engineers, and runtime engineers to ensure optimal performance and accuracy on Neuron devices for models such as Llama 3.3 70B, Llama 3.1 405B, DBRX, and Mixtral.

Key Responsibilities:
- Adapt cutting-edge research in LLM optimization to enhance performance on Neuron chips.
- Work collaboratively across teams to leverage both open-source and proprietary models.

About Our Team:
Our team is committed to fostering the growth of new members. With a diverse group of experienced professionals, we emphasize knowledge-sharing and mentorship. Senior team members provide one-on-one guidance and constructive code reviews. We prioritize your career advancement by assigning projects that expand your engineering expertise, empowering you to tackle more complex challenges as you progress.

BASIC QUALIFICATIONS:
- 3+ years of professional software development experience.
- 2+ years of experience in system design or architecture, including design patterns and reliability and scaling.
- Proficiency in at least one programming language.
- Understanding of machine learning models, their architectures, and optimization techniques for improving performance.

PREFERRED QUALIFICATIONS:
- 3+ years of experience covering the full software development lifecycle, including coding standards, code reviews, source control management, build processes, testing, and operations.
- Bachelor's degree in Computer Science or a related field.
- Hands-on experience with PyTorch or JAX, particularly in developing and deploying LLMs in production on GPUs, Neuron, TPUs, or similar AI acceleration hardware.

Amazon is an equal opportunity employer committed to diversity and inclusion. Our culture empowers all employees to achieve excellent results for our customers. If you require accommodations during the application or hiring process, please reach out to our recruiting partner.

Compensation:
The base pay for this position ranges from $129,300/year in our lowest geographic market to $223,600/year in our highest geographic market. Pay is based on a variety of factors, including location and individual experience. In addition to base salary, the total compensation package includes equity, sign-on bonuses, and a comprehensive range of medical, financial, and other benefits. This position will remain posted until filled. Please apply through our career site.
Location:
Seattle, WA, United States
Category:
Computer And Mathematical Occupations