AIML - Staff Software Engineer, On-Device Machine Learning
Location
Cupertino, CA | United States
Job description
At Apple, the AIML - On-Device Machine Learning group is responsible for accelerating the creation of amazing on-device ML experiences, and we are looking for a tenured software engineer to help define and implement features that accelerate and compress large state-of-the-art (SoTA) models (e.g., LLMs) in our on-device inference stack. We are a dedicated team working on groundbreaking technology in the fields of natural language processing, computer vision, and artificial intelligence. We design, develop, and optimize large-scale language, vision, and multi-modal models that power on-device inference capabilities across various Apple products and services. This is a unique opportunity to work on powerful new technologies and contribute to Apple's ecosystem, with a commitment to privacy and a user experience impacting millions of users worldwide.
Are you someone who can write high-quality, well-tested code and collaborate cross-functionally with partner HW, SW, and ML teams across the company? If so, come join us and be part of the team that is helping Machine Learning developers innovate and ship enriching experiences on Apple devices!
Key Qualifications
- 10-15+ years of proven programming experience with standard ML tools such as C/C++, Python, PyTorch, TensorFlow, and CUDA/Metal
- Solid understanding of state-of-the-art DNN optimization techniques and how they translate to hardware acceleration architectures, and a general ability to reason about system performance (compute/memory) tradeoffs
- Hands-on experience working (training, fine-tuning, optimizing, deploying) with large models (e.g. LLMs).
- Hands-on experience applying common machine learning optimization techniques, like quantization and sparsity induction, to reduce resource consumption and/or latency
- Experience building APIs and/or core components of ML frameworks
- Capacity to iterate on ideas, work with a variety of partners from all parts of the stack — from Apps to Compilation, HW Arch, and Power/Performance analysis
- Proven track record of analyzing sophisticated and ambiguous problems
- Disciplined programming abilities with a strong attention to detail
- Strong applied experience with compiler technology targeting CPUs, GPUs, and ML accelerators
- Excellent problem-solving (e.g. via building forward-looking prototype systems), critical thinking, strong communication, and collaboration skills
Description
As a member of this team, the successful candidate will:
- Build features for our on-device inference stack to support the most relevant accuracy-preserving, general-purpose techniques that empower model developers to compress and accelerate SoTA models (e.g., LLMs) in apps
- Convert models from a high-level ML framework to a target device (CPU, GPU, Neural Engine) for optimal functional accuracy and performance
- Write unit and system integration tests to ensure functional correctness and avoid performance regressions
- Diagnose performance bottlenecks and work with HW Arch teams to co-design solutions that further improve latency, power, and memory footprint of neural network workloads
- Analyze the impact of model optimizations (compression, quantization, etc.) on model quality by partnering with modeling and adaptation teams across diverse product use cases
Education & Experience
Bachelor’s, Master's, or PhD in Computer Science, Machine Learning or a related field
Additional Requirements
- Apple’s most important resource, our soul, is our people. Apple benefits help further the well-being of our employees and their families in meaningful ways. No matter where you work at Apple, you can take advantage of our health and wellness resources and time-away programs. We’re proud to provide stock grants to employees at all levels of the company, and we also give employees the option to buy Apple stock at a discount — both offer everyone at Apple the chance to share in the company’s success. You’ll discover many more benefits of working at Apple, such as programs that match your charitable contributions, reimburse you for continuing your education and give you special employee pricing on Apple products.
- Apple benefits programs vary by country and are subject to eligibility requirements.
- Apple is an equal opportunity employer that is committed to inclusion and diversity. We take affirmative action to ensure equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Apple is committed to working with and providing reasonable accommodation to applicants with physical and mental disabilities. Apple is a drug-free workplace.
Pay & Benefits
- At Apple, base pay is one part of our total compensation package and is determined within a range. This provides the opportunity to progress as you grow and develop within a role. The base pay range for this role is between $199,800.00 and $364,100.00, and your base pay will depend on your skills, qualifications, experience, and location. Apple employees also have the opportunity to become an Apple shareholder through participation in Apple’s discretionary employee stock programs. Apple employees are eligible for discretionary restricted stock unit awards, and can purchase Apple stock at a discount if voluntarily participating in Apple’s Employee Stock Purchase Plan. You’ll also receive benefits including: Comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and for formal education related to advancing your career at Apple, reimbursement for certain educational expenses — including tuition. Additionally, this role might be eligible for discretionary bonuses or commission payments as well as relocation. Learn more about Apple Benefits. Note: Apple benefit, compensation and employee stock programs are subject to eligibility requirements and other terms of the applicable plan or program.