Presentation

1178141 HPC/AI MPI Ecosystem Software Engineer
·
HPE
·
Fort Collins, CO
Description
Join the HPE AI Fabric team and be a part of the growth and evolution of Artificial Intelligence (AI), high-speed networking fabrics, and the fastest-growing and most significant technology revolution since the Internet.

Responsibilities include, but are not limited to:

• Engage and work with the Commercial HPC and AI ISV and open-source SW communities to validate, tune, and enable applications on the Slingshot Ethernet fabric.

• Enable the broad MPI ecosystem (OpenMPI, Intel MPI, Cray MPI, other distributions) by working with application and MPI vendors to target, tune, and ensure market-leading performance.

• Design, implement, and maintain system software that enables communication between GPUs, CPUs, and storage in scale-out AI and HPC systems.

• Work with all the leading architectures and vendors in the AI and Data Center markets — NVIDIA, AMD, Intel.

• Work with OEM, ODM, and VAR channel vendors on bringing Slingshot to a broader set of customers. Validate and tune the applications driving those engagements.

• Develop and own HPE product usage support, upstreaming and community engagements, and internal testing and infrastructure.

• Work with cross-disciplinary teams to understand business requirements and align software direction to meet those needs.
Requirements

• Bachelor’s or master’s degree in computer science, engineering, or a related field

• 10+ years of relevant experience with a background in networking and communications software development and/or architecture in data center, university, government lab, or AI-centric environments

• Background in MPI software development with an emphasis on HPC application development, tuning, and deployment in a scale-out compute cluster environment

• Ability to participate in and own pieces of the product release pipeline, up to and including package integration and support

• Deep understanding of networking architecture and communications, including Ethernet and InfiniBand networking technologies

• Understanding of computer architecture, and familiarity with the fundamentals of GPU architecture

• Experience with NVIDIA and AMD GPU infrastructure and software stacks

• Programming and debugging skills in C, C++, and Python

• Ability to understand how applications and industry middleware/libraries work in Slingshot-enabled systems and to identify strategies for making these applications meet customer expectations

• Experience with user-based networking and OFI libfabric software interfaces and APIs
Company Description
Hewlett Packard Enterprise is the global edge-to-cloud company advancing the way people live and work. We help companies connect, protect, analyze, and act on their data and applications wherever they live, from edge to cloud, so they can turn insights into outcomes at the speed required to thrive in today’s complex world. Our culture thrives on finding new and better ways to accelerate what’s next. We know diverse backgrounds are valued and succeed here. We have the flexibility to manage our work and personal needs. We make bold moves, together, and are a force for good. If you are looking to stretch and grow your career, our culture will embrace you. Open up opportunities with HPE.
·
·
2024-10-25
Event Type
Job Posting
Time
Tuesday, 19 November 2024
10:30am - 3pm EST
Location
Exhibit Hall A3 - Job Fair Inside
Registration Categories
TP
W
TUT
XO/EX
Countries
USA
Companies
HPE
In-Person / Remote
In-person
Part Time / Full Time
Full Time