1174287 Full-Stack Monitoring & AIOps Engineer · HPE · Chicago, IL
Session
Job Postings
Description
HPE Slingshot R&D team member to be located on site at Argonne National Lab (ANL) in the Chicago metropolitan area. This employee will report to HPE’s R&D manager as a member of the Slingshot AIOps and Monitoring R&D group. The employee will also report to an ANL supervisor and is expected to work alongside other ANL staff in support of the Aurora system. ANL will provide a standard on-site office environment and network access. At this time, this person is not expected to require a security clearance. The primary objective of this staffing arrangement is to improve the exchange of ideas, requirements, and diagnostics/monitoring tools, and to enable HPE to implement features and future diagnostics/monitoring products that better address the requirements of ANL’s Aurora supercomputer system. This employee may also help with the deployment, beta testing, and use of new HPE Fabric AIOps and Slingshot Monitoring software products.
Expected roles and responsibilities include:
• Manage and communicate expectations to both the HPE manager and ANL supervisor regarding respective responsibilities and commitments.
• Track and facilitate the resolution of the customer’s HPC interconnect issues, and interface with HPE Slingshot R&D as needed to align the proper resources to resolve advanced issues beyond traditional break/fix.
• Assist in the diagnosis of fabric-related problems, write documentation, perform RCAs, drive upgrade planning, tool development, and other related tasks.
• Develop and program integrated software algorithms to structure, analyze and leverage structured and unstructured data in monitoring and analytics system applications.
• Work with large-scale computing frameworks, data analysis systems, and modeling environments.
• Use machine learning and statistical modeling techniques to improve product/system performance, data management, quality, and accuracy.
• Formulate descriptive, diagnostic, predictive and prescriptive insights/algorithms and translate technical specifications into code.
• Document procedures for installation and maintenance, complete programming, perform testing and debugging, define and monitor performance metrics.
• Contribute to the success of HPE by translating customer requirements and industry trends into products, solutions, and systems improvement projects.
• Contributions are expected to have a measurable impact on Slingshot definition or development.
• Apply in-depth professional knowledge and innovative ideas to solve complex problems. Visible contributions improve time-to-market, achieve cost reductions, or satisfy current and future unmet customer needs.
• Serve as a recognized internal authority on a key technology area, applying innovative principles and ideas.
• Provide technical leadership for significant project/program work.
• Lead or participate in cross-functional initiatives and contribute to mentorship and knowledge sharing across the organization.
Requirements
• Bachelor's or master's degree in Computer Science, Electrical Engineering, or equivalent.
• Typically, 6-10 years’ experience.
• High Performance Computing experience is nice to have.
Company Description
HPE is the global edge-to-cloud company built to transform your business. How? By helping you connect, protect, analyze, and act on all your data and applications wherever they live, from edge to cloud, so you can turn insights into outcomes at the speed required to thrive in today’s complex world.
2024-10-17
Event Type
Job Posting
Time
Tuesday, 19 November 2024, 10:30am - 3pm EST
Location
Exhibit Hall A3 - Job Fair Inside
TP
W
TUT
XO/EX
USA
HPE
In-person
Full Time