FEDERAL HPC SYSTEMS ENGINEER WITH SECURITY CLEARANCE, Motion Recruitment Partners LLC, Arlington, VA


Motion Recruitment Partners LLC
Arlington, VA, US


Job description

Overview: We are seeking a highly skilled and experienced Federal HPC Systems Engineer to join our team. This role is critical for managing and maintaining high-performance computing (HPC) systems for a federal project, ensuring they operate at peak efficiency. The ideal candidate will possess a strong background in HPC environments, Linux administration, and network management, with a focus on providing exceptional service and support in a secure, federal setting.

Responsibilities:
Dell HPC Configuration, Management, and Maintenance: Manage the setup, configuration, and ongoing maintenance of HPC systems.
Ensure HPC systems adhere to Dell HPC best practices.
Knowledge Transfer and Documentation: Provide extended training and knowledge transfer to customer staff.
Create and maintain System Integration and Deployment (SID) documentation, including planning, cables, labeling, and switches.
Hardware and Software Validation: Validate firmware and software versions and settings.
Assist with benchmark testing using tools such as High-Performance Linpack (HPL), Alltoall, bidirectional bandwidth, and STREAM.
Storage Systems: Manage GPFS (General Parallel File System) storage.
Operating Systems: Administer and troubleshoot Ubuntu and SUSE Linux Enterprise Server (SLES) environments.
Network Management: Conduct InfiniBand (IB) and Ethernet (Enet) network testing and management.
Knowledge of Kubernetes is a plus.
Administration: Monitor and manage Dell infrastructure.
Handle user requests and review log files.
Generate regular operational reports.
Provide capacity planning and assist with disaster recovery planning and design.
Problem Management: Isolate and troubleshoot incidents.
Coordinate service incidents and open service requests.
Participate in root cause analysis reviews.
Change Management: Assist with software/firmware management and change requests.
Document policies and procedures in collaboration with compliance managers and stakeholders.
Monitor migration activities.
Continual Service Improvement: Recommend procedural changes for operational optimization.
Share best practices from other engagements and provide performance tuning recommendations.
Post-Implementation and Knowledge Sharing: Work with customer technical leadership for system status awareness and future planning.
Conduct transition planning and incremental configuration.
Perform knowledge transfer for new technology features and management activities.
Develop run books and provide product enhancement recommendations.
Implement Dell EMC System Management Tools.
Change Evaluation and Recommendations: Review IT processes and policies, including incident, capacity, performance, and change management, as well as user and backup policies.
Assist with solution documentation of policies and procedures in conjunction with compliance managers and stakeholders.
Conduct knowledge transfer to address the customer's skills and resource gaps and provide technology recommendations.

Qualifications:
U.S. citizenship is required.
Active TS/SCI clearance.
Extensive experience in HPC environments.
Proficiency in Ubuntu and SUSE Linux Enterprise Server (SLES).
Strong knowledge of InfiniBand (IB) and Ethernet (Enet) networks.
Experience with GPFS storage systems.
Familiarity with Dell HPC best practices.
Knowledge of Kubernetes is a plus.
Excellent documentation and communication skills.
Strong problem-solving and troubleshooting abilities.
Ability to work onsite in Arlington, VA.

Full-time 2024-07-21
