HPCaTS: High Performance and Privacy through Confidential Computing, Trust, and Security
A heterogeneous HPC cluster for teaching and research.
Funding: University of Basel
Duration: 2026–Present
HPCaTS (High Performance and Privacy through Confidential Computing, Trust, and Security) is the heterogeneous compute cluster of the High Performance Computing (HPC) Group and the Privacy-Enhancing Technologies (PET) Group at the Department of Mathematics and Computer Science at the University of Basel.
HPCaTS serves two purposes. As a teaching platform, it lets students learn parallel and accelerated programming, as well as confidential computing, on real hardware. As a research platform, it provides a fully controlled experimental environment for leading-edge work in high performance computing and privacy-enhancing technologies.
HPCaTS is a single cluster that brings together two CPU instruction sets (x86_64 and Arm) across processors from three vendors (AMD, Intel, and NVIDIA) and three distinct GPU architectures from two vendors (AMD CDNA 3, NVIDIA Hopper, and NVIDIA Ampere), alongside large-memory nodes and a high-bandwidth InfiniBand fabric. This makes it well suited to studying performance portability, cross-vendor GPU programming with both CUDA and ROCm/HIP, and the architectural trade-offs of modern accelerated computing, from tightly integrated APUs and superchips to traditional discrete-GPU servers.
| Node | Partition | Arch | CPU | Cores (Threads) | Memory | GPU |
| --- | --- | --- | --- | --- | --- | --- |
| login-amd | – | x86_64 | 2× AMD EPYC 9124 (16-core) | 32 | 384 GB DDR5 | None |
| login-arm | – | aarch64 | NVIDIA Grace CPU Superchip (dual Grace) | 144 | 480 GB LPDDR5X | None |
| node001 | amd | x86_64 | 4× AMD Instinct MI300A APU | 96 | 512 GB unified HBM3 | 4× MI300A (CDNA 3, integrated) |
| node002 | amd | x86_64 | 4× AMD Instinct MI300A APU | 96 | 512 GB unified HBM3 | 4× MI300A (CDNA 3, integrated) |
| node003 | arm | aarch64 | NVIDIA GH200 Grace-Hopper Superchip | 72 | 480 GB LPDDR5X + 96 GB HBM3 | 1× Hopper GPU (GH200) |
| node004 | arm | aarch64 | NVIDIA GH200 Grace-Hopper Superchip | 72 | 480 GB LPDDR5X + 96 GB HBM3 | 1× Hopper GPU (GH200) |
| node005 | highmem | x86_64 | 2× Intel Xeon Gold 6258R (28-core) | 56 (112) | ~1.5 TB DDR4 | 2× NVIDIA A100-PCIE-40GB |
| node006 | highmem | x86_64 | 2× AMD EPYC 7742 (64-core) | 128 (256) | ~1.5 TB DDR4 | None |
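
For scripting against the cluster, the table above can be mirrored as a small data structure. The following is a minimal sketch, not an official inventory: the `Node` class, the `gpu_stack` labels, and the helper functions are hypothetical names introduced here, and the values are transcribed by hand from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    partition: str  # "" for login nodes, which belong to no partition
    arch: str
    cores: int      # physical cores, as listed in the table
    gpus: int
    gpu_stack: str  # "cuda", "rocm", or "" when the node has no GPU


# Values transcribed from the table above (hypothetical representation).
NODES = [
    Node("login-amd", "", "x86_64", 32, 0, ""),
    Node("login-arm", "", "aarch64", 144, 0, ""),
    Node("node001", "amd", "x86_64", 96, 4, "rocm"),
    Node("node002", "amd", "x86_64", 96, 4, "rocm"),
    Node("node003", "arm", "aarch64", 72, 1, "cuda"),
    Node("node004", "arm", "aarch64", 72, 1, "cuda"),
    Node("node005", "highmem", "x86_64", 56, 2, "cuda"),
    Node("node006", "highmem", "x86_64", 128, 0, ""),
]


def compute_nodes():
    """Return every node that belongs to a partition (login nodes excluded)."""
    return [n for n in NODES if n.partition]


def partitions_for(gpu_stack):
    """Return the sorted partition names containing nodes with this GPU stack."""
    return sorted({n.partition for n in compute_nodes() if n.gpu_stack == gpu_stack})


def total_cores():
    """Sum the physical cores of all compute nodes."""
    return sum(n.cores for n in compute_nodes())


def total_gpus():
    """Sum the GPUs of all compute nodes."""
    return sum(n.gpus for n in compute_nodes())
```

With the values above, `total_cores()` gives 520 and `total_gpus()` gives 12, while `partitions_for("cuda")` returns `["arm", "highmem"]` and `partitions_for("rocm")` returns `["amd"]`.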
Additional Notes
- AMD MI300A (node001, node002) is an APU: the CPU cores and the CDNA 3 GPU share a single pool of HBM3 memory on the same package. There is no separate “host” and “device” memory to copy between. Each node has four MI300A devices, one per NUMA domain. These nodes are programmed with the ROCm/HIP stack.
- NVIDIA GH200 (node003, node004) is a superchip: an Arm-based Grace CPU and a Hopper GPU joined by a high-bandwidth coherent NVLink-C2C link. These nodes are aarch64 and are programmed with CUDA.
- NVIDIA A100 (node005) is a classic discrete PCIe GPU: two of them sit in a large-memory x86_64 host and are programmed with CUDA.
- A high-bandwidth InfiniBand fabric (NVIDIA/Mellanox ConnectX-6) connects all nodes and storage, with multiple rails on the AMD MI300A nodes for higher bandwidth. A separate Ethernet network handles administration and carries no compute traffic.
- ~2 PB of shared storage on a central Seagate CORVAULT system, serving home, data, and software directories to every node over NFS
- Slurm workload manager with GPU-aware scheduling
- Ubuntu 24.04 LTS, with CUDA and ROCm provided centrally
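
To illustrate how the partitions and GPU-aware scheduling might be used together, here is a minimal sketch that generates a single-node Slurm batch script. It rests on assumptions that are not confirmed by this page: that the Slurm partition names match the Partition column of the table (`amd`, `arm`, `highmem`) and that GPUs are exposed under the generic resource name `gpu`. The function name `make_batch_script` and the default job name are hypothetical.

```python
# Assumption: Slurm partition names match the Partition column of the node table.
KNOWN_PARTITIONS = {"amd", "arm", "highmem"}


def make_batch_script(partition, gpus, command, job_name="hpcats-demo"):
    """Build the text of a Slurm batch script for a single-node job."""
    if partition not in KNOWN_PARTITIONS:
        raise ValueError(f"unknown partition: {partition!r}")
    if gpus < 0:
        raise ValueError("gpus must be non-negative")

    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --partition={partition}",
        "#SBATCH --nodes=1",
    ]
    if gpus > 0:
        # Assumption: GPUs are configured under the generic GRES name "gpu".
        lines.append(f"#SBATCH --gres=gpu:{gpus}")
    lines.append("")       # blank line between the directives and the payload
    lines.append(command)  # the command to run on the allocated node
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Request all four MI300A devices of one node in the amd partition.
    print(make_batch_script("amd", 4, "rocm-smi"))
```

The generated text can be saved to a file and submitted with `sbatch`. On the CUDA nodes, a command such as `nvidia-smi` would play the role that `rocm-smi` plays here. The actual partition and GRES configuration should be checked with `sinfo` before relying on these names.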