HPC Engineer
The Exploration Company
Turin or Bordeaux
We are looking for a talented HPC Engineer to design, build, and operate the high-performance computing environments that power our space mission simulations, data analytics, and research workloads.
Key Responsibilities
You will be responsible for architecting, deploying, and maintaining HPC clusters across public cloud environments (e.g., AWS or GCP) and potentially on-premises systems. You will also work closely with internal engineering teams to onboard users, support new application deployments, and ensure optimal cluster performance for mission-critical tasks.
- Architect, deploy, and maintain scalable HPC clusters in cloud and/or on-premises environments.
- Develop infrastructure automation for cluster provisioning, configuration management, and scalability.
- Collaborate with software, flight, and data engineering teams to deploy HPC workloads efficiently.
- Manage resource scheduling, user access, and workload optimization to ensure high throughput and cost efficiency.
- Develop and maintain monitoring, logging, and alerting for system performance and reliability.
- Implement best practices for data management, security, and compliance in multi-tenant HPC environments.
- Support users in onboarding, workflow optimization, and troubleshooting computational jobs.
- Evaluate emerging cloud/HPC technologies and recommend improvements to cluster architecture.
What we would love to see from you
As an HPC and Cloud Computing Engineer, you will ideally have the following:
- Bachelor's or master's degree in computer science, engineering, physics, or a related technical field.
- Experience with HPC systems administration or cloud-based compute infrastructure.
- Strong proficiency with public cloud platforms and their HPC services (AWS ParallelCluster, Azure Batch, or similar).
- Experience with containerization (Docker, Singularity) and with orchestration and job scheduling (Kubernetes, Slurm, or PBS).
- Advanced scripting and automation in Python, Bash, or similar languages.
- Familiarity with distributed storage systems and parallel file systems (Lustre, EFS, FSx, etc.).
- Understanding of workload management, scheduling, and performance tuning for compute-intensive environments.
- Knowledge of MPI, GPU acceleration, and numerical methods for scientific workloads.
- Excellent communication and collaboration skills; ability to support diverse engineering teams.