HPCaTS: High Performance and Privacy through Confidential Computing, Trust, and Security

A heterogeneous HPC cluster for teaching and research.

Funding: University of Basel

Duration: 2026–Present

HPCaTS (High Performance and Privacy through Confidential Computing, Trust, and Security) is the heterogeneous compute cluster of the HPC Group (High Performance Computing) and the PET Group (Privacy-Enhancing Technologies) at the Department of Mathematics and Computer Science at the University of Basel.

HPCaTS has a twofold purpose. It is a teaching platform where students learn parallel and accelerated programming, as well as confidential computing, on real hardware. It is also a fully controlled experimental platform for leading-edge research in high performance computing and privacy-enhancing technologies.

HPCaTS is a single cluster that brings together two CPU instruction sets (x86_64 and Arm) across processors from three vendors (AMD, Intel, and NVIDIA), plus three distinct GPU architectures from two vendors (AMD CDNA 3, NVIDIA Hopper, and NVIDIA Ampere), alongside large-memory nodes and a high-bandwidth InfiniBand fabric. This makes it well suited to studying performance portability, cross-vendor GPU programming with both CUDA and ROCm/HIP, and the architectural trade-offs of modern accelerated computing, from tightly integrated APUs and superchips to traditional discrete-GPU servers.

Node	Partition	Arch	CPU	Cores (Threads)	Memory	GPU
login-amd	–	x86_64	2× AMD EPYC 9124 (16-core)	32	384 GB DDR5	None
login-arm	–	aarch64	NVIDIA Grace CPU Superchip (dual Grace)	144	480 GB LPDDR5X	None
node001	amd	x86_64	4× AMD Instinct MI300A APU	96	512 GB unified HBM3	4× MI300A (CDNA 3, integrated)
node002	amd	x86_64	4× AMD Instinct MI300A APU	96	512 GB unified HBM3	4× MI300A (CDNA 3, integrated)
node003	arm	aarch64	NVIDIA GH200 Grace-Hopper Superchip	72	480 GB LPDDR5X + 96 GB HBM3	1× Hopper GPU (GH200)
node004	arm	aarch64	NVIDIA GH200 Grace-Hopper Superchip	72	480 GB LPDDR5X + 96 GB HBM3	1× Hopper GPU (GH200)
node005	highmem	x86_64	2× Intel Xeon Gold 6258R (28-core)	56 (112)	~1.5 TB DDR4	2× NVIDIA A100-PCIE-40GB
node006	highmem	x86_64	2× AMD EPYC 7742 (64-core)	128 (256)	~1.5 TB DDR4	None
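
As a rough illustration of how the table can guide partition choice, the following Python sketch encodes each node's partition, architecture, core count, and GPU programming stack, then looks up which partitions match a requirement. The `Node` class, the `partitions_for` helper, and the `"rocm"`/`"cuda"` labels are hypothetical names introduced here for illustration; the values come from the table and from the programming stacks named in this document.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    name: str
    partition: Optional[str]  # None for login nodes, which have no partition
    arch: str
    cores: int
    gpu_stack: Optional[str]  # "rocm", "cuda", or None (labels are illustrative)


# Values transcribed from the node table above.
NODES = [
    Node("login-amd", None, "x86_64", 32, None),
    Node("login-arm", None, "aarch64", 144, None),
    Node("node001", "amd", "x86_64", 96, "rocm"),
    Node("node002", "amd", "x86_64", 96, "rocm"),
    Node("node003", "arm", "aarch64", 72, "cuda"),
    Node("node004", "arm", "aarch64", 72, "cuda"),
    Node("node005", "highmem", "x86_64", 56, "cuda"),
    Node("node006", "highmem", "x86_64", 128, None),
]


def partitions_for(arch: Optional[str] = None,
                   gpu_stack: Optional[str] = None) -> List[str]:
    """Return sorted partition names with at least one matching compute node."""
    matches = set()
    for node in NODES:
        if node.partition is None:
            continue  # skip login nodes
        if arch is not None and node.arch != arch:
            continue
        if gpu_stack is not None and node.gpu_stack != gpu_stack:
            continue
        matches.add(node.partition)
    return sorted(matches)


def compute_cores() -> int:
    """Total physical cores across compute (non-login) nodes."""
    return sum(n.cores for n in NODES if n.partition is not None)


if __name__ == "__main__":
    print(partitions_for(gpu_stack="cuda"))
    print(partitions_for(arch="aarch64"))
    print(compute_cores())
```

Note that, according to the table, the highmem partition contains both a GPU node (node005) and a CPU-only node (node006), so choosing a partition alone does not determine whether a job lands on a GPU.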

Additional Notes

AMD MI300A (node001, node002) is an APU: the CPU cores and the CDNA 3 GPU share a single pool of HBM3 memory on the same package. There is no separate “host” and “device” memory to copy between. Each node has four MI300A devices, one per NUMA domain. These nodes are programmed with the ROCm/HIP stack.
NVIDIA GH200 (node003, node004) is a superchip: an Arm-based Grace CPU and a Hopper GPU joined by a high-bandwidth coherent NVLink-C2C link. These nodes are aarch64 and are programmed with CUDA.
NVIDIA A100 (node005) is a classic discrete PCIe GPU: two 40 GB cards sit in a large-memory x86_64 host, each with its own device memory separate from the host DDR4, and are programmed with CUDA.
A high-bandwidth InfiniBand fabric (NVIDIA/Mellanox ConnectX-6) connects all nodes and storage, with multiple rails on the AMD MI300A nodes for higher bandwidth. A separate Ethernet network handles administration and carries no compute traffic.
About 2 PB of shared storage on a central Seagate CORVAULT system serves home, data, and software directories to every node over NFS.
Jobs are managed by the Slurm workload manager with GPU-aware scheduling.
The cluster runs Ubuntu 24.04 LTS, with CUDA and ROCm provided centrally.
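
To make the Slurm note more concrete, here is a minimal Python sketch that generates a single-node batch script for one of the partitions listed in the table. It is an illustration under stated assumptions, not the cluster's documented submission procedure: the generic resource name `gpu`, the default time limit, the job name, and the application name `./my_hip_app` are all hypothetical, and the actual GRES configuration on HPCaTS may differ. The `sbatch` options used (`--job-name`, `--partition`, `--nodes`, `--time`, `--gres`) are standard Slurm options.

```python
# Partition names as listed in the node table.
PARTITIONS = {"amd", "arm", "highmem"}


def sbatch_script(partition: str, gpus: int, command: str,
                  time_limit: str = "00:10:00",
                  job_name: str = "hpcats-demo") -> str:
    """Build the text of a single-node Slurm batch script.

    Assumption: GPUs are exposed under the generic resource name "gpu".
    """
    if partition not in PARTITIONS:
        raise ValueError(f"unknown partition: {partition!r}")
    if gpus < 0:
        raise ValueError("gpus must be non-negative")

    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --partition={partition}",
        "#SBATCH --nodes=1",
        f"#SBATCH --time={time_limit}",
    ]
    if gpus > 0:
        # Request GPUs only when needed, so CPU-only jobs leave them free.
        lines.append(f"#SBATCH --gres=gpu:{gpus}")
    lines.append(f"srun {command}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical example: one GPU on an MI300A node.
    print(sbatch_script("amd", 1, "./my_hip_app"), end="")
```

The generated text could be saved to a file and passed to `sbatch`. Omitting the `--gres` line for CPU-only jobs is a design choice in this sketch, not a stated cluster policy.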