HPCaTS: High Performance and Privacy through Confidential Computing, Trust, and Security


A heterogeneous HPC cluster for teaching and research.


Funding: University of Basel

Duration: 2026–Present

HPCaTS (High Performance and Privacy through Confidential Computing, Trust, and Security) is the heterogeneous compute cluster of the High Performance Computing (HPC) Group and the Privacy-Enhancing Technologies (PET) Group at the Department of Mathematics and Computer Science of the University of Basel.

HPCaTS serves two purposes. It is a teaching platform where students learn parallel and accelerated programming, as well as confidential computing, on real hardware. It is also a fully controlled experimental platform for leading-edge research in high performance computing and privacy-enhancing technologies.

HPCaTS is a single cluster that brings together two CPU instruction sets (x86_64 and Arm) across processors from three vendors (AMD, Intel, and NVIDIA), and three distinct GPU architectures from two vendors (AMD CDNA 3, NVIDIA Hopper, and NVIDIA Ampere), alongside large-memory nodes and a high-bandwidth InfiniBand fabric. This makes it well suited to studying performance portability, cross-vendor GPU programming with both CUDA and ROCm/HIP, and the architectural trade-offs of modern accelerated computing, from tightly integrated APUs and superchips to traditional discrete-GPU servers.

| Node      | Partition | Arch    | CPU                                     | Cores (Threads) | Memory                      | GPU                              |
|-----------|-----------|---------|-----------------------------------------|-----------------|-----------------------------|----------------------------------|
| login-amd | -         | x86_64  | 2× AMD EPYC 9124 (16-core)              | 32              | 384 GB DDR5                 | None                             |
| login-arm | -         | aarch64 | NVIDIA Grace CPU Superchip (dual Grace) | 144             | 480 GB LPDDR5X              | None                             |
| node001   | amd       | x86_64  | 4× AMD Instinct MI300A APU              | 96              | 512 GB unified HBM3         | 4× MI300A (CDNA 3, integrated)   |
| node002   | amd       | x86_64  | 4× AMD Instinct MI300A APU              | 96              | 512 GB unified HBM3         | 4× MI300A (CDNA 3, integrated)   |
| node003   | arm       | aarch64 | NVIDIA GH200 Grace-Hopper Superchip     | 72              | 480 GB LPDDR5X + 96 GB HBM3 | 1× Hopper GPU (GH200)            |
| node004   | arm       | aarch64 | NVIDIA GH200 Grace-Hopper Superchip     | 72              | 480 GB LPDDR5X + 96 GB HBM3 | 1× Hopper GPU (GH200)            |
| node005   | highmem   | x86_64  | 2× Intel Xeon Gold 6258R (28-core)      | 56 (112)        | ~1.5 TB DDR4                | 2× NVIDIA A100-PCIE-40GB         |
| node006   | highmem   | x86_64  | 2× AMD EPYC 7742 (64-core)              | 128 (256)       | ~1.5 TB DDR4                | None                             |
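
As a rough illustration of how the compute resources in the table above add up, the following sketch transcribes the six compute nodes into a small Python structure and sums physical cores and GPUs per partition. The tuple layout and the function name `totals_by_partition` are illustrative choices, not part of any HPCaTS tooling, and each MI300A APU is counted as one GPU.

```python
from collections import defaultdict

# Compute-node inventory transcribed from the hardware table (login nodes omitted).
# Fields: (node, partition, architecture, physical cores, GPU count).
# Assumption: each MI300A APU is counted as one GPU.
NODES = [
    ("node001", "amd", "x86_64", 96, 4),
    ("node002", "amd", "x86_64", 96, 4),
    ("node003", "arm", "aarch64", 72, 1),
    ("node004", "arm", "aarch64", 72, 1),
    ("node005", "highmem", "x86_64", 56, 2),
    ("node006", "highmem", "x86_64", 128, 0),
]


def totals_by_partition(nodes):
    """Sum node count, physical cores, and GPUs for each partition."""
    totals = defaultdict(lambda: {"nodes": 0, "cores": 0, "gpus": 0})
    for _name, partition, _arch, cores, gpus in nodes:
        totals[partition]["nodes"] += 1
        totals[partition]["cores"] += cores
        totals[partition]["gpus"] += gpus
    return dict(totals)


if __name__ == "__main__":
    for partition, t in sorted(totals_by_partition(NODES).items()):
        print(f"{partition}: {t['nodes']} nodes, {t['cores']} cores, {t['gpus']} GPUs")
```

With the figures from the table, this gives 192 cores and 8 GPUs in `amd`, 144 cores and 2 GPUs in `arm`, and 184 cores and 2 GPUs in `highmem`, for 520 physical cores and 12 GPUs across the compute nodes.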

Additional Notes

  • AMD MI300A (node001, node002) is an APU: the CPU cores and the CDNA 3 GPU share a single pool of HBM3 memory on the same package. There is no separate “host” and “device” memory to copy between. Each node has four MI300A devices, one per NUMA domain. These nodes are programmed with the ROCm/HIP stack.
  • NVIDIA GH200 (node003, node004) is a superchip: an Arm-based Grace CPU and a Hopper GPU joined by a high-bandwidth coherent NVLink-C2C link. These nodes are aarch64 and are programmed with CUDA.
  • NVIDIA A100 (node005) is a classic discrete PCIe GPU with 40 GB of its own memory. The node hosts two of them in a large-memory x86_64 server, and they are programmed with CUDA.
  • Interconnect: high-bandwidth InfiniBand (NVIDIA/Mellanox ConnectX-6) connects all nodes and storage, with multiple rails on the AMD MI300A nodes for higher bandwidth. A separate Ethernet network handles administration and carries no compute traffic.
  • Storage: ~2 PB of shared storage on a central Seagate CORVAULT system serves home, data, and software directories to every node over NFS.
  • Scheduling: Slurm workload manager with GPU-aware scheduling.
  • Operating system and software: Ubuntu 24.04 LTS, with CUDA and ROCm provided centrally.
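
To make the notes above concrete, here is a minimal sketch of how a GPU job for one of the partitions might be generated. It assumes that the partition names in the hardware table (`amd`, `arm`, `highmem`) are Slurm partitions and that GPUs are exposed as a generic resource named `gpu`; neither is confirmed by this page, so the real site configuration may differ. The helper `render_job` and its parameters are hypothetical. The `#SBATCH` options used are standard Slurm options, and `rocm-smi` and `nvidia-smi` are the vendors' own GPU query tools.

```python
# Programming stack per partition, following the notes above:
# MI300A nodes use ROCm/HIP; GH200 and A100 nodes use CUDA.
STACK = {"amd": "rocm", "arm": "cuda", "highmem": "cuda"}

# Vendor command-line tools that list the GPUs visible to a job.
GPU_QUERY = {"rocm": "rocm-smi", "cuda": "nvidia-smi"}


def render_job(partition, gpus=1, time_limit="00:10:00", job_name="gpu-check"):
    """Return the text of a small Slurm batch script for one node.

    Assumption: GPUs are requested through the generic resource "gpu".
    """
    if partition not in STACK:
        raise ValueError(f"unknown partition: {partition!r}")
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --partition={partition}",
        "#SBATCH --nodes=1",
        f"#SBATCH --gres=gpu:{gpus}",
        f"#SBATCH --time={time_limit}",
        "",
        "uname -m   # x86_64 or aarch64, depending on the node",
        GPU_QUERY[STACK[partition]],
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_job("amd", gpus=4))
```

The generated text could be saved to a file and submitted with `sbatch`. Keeping the partition-to-stack mapping in one place reflects the point of the notes: the same job skeleton applies everywhere, and only the partition and the vendor toolchain change.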