Open Theses and Projects
Bachelor and Master Theses and Projects

The HPC group offers several hot and interesting topics for bachelor and master theses, as well as individual projects at the bachelor and master levels in the area of parallel and distributed computing. Come and join our team!

Completed theses and student projects can be found here.

The topics listed below are only examples of what you could work on in our team.

This list is not actively maintained. Interested students should therefore contact us here for further details on an existing topic, for updates on new hot topics, or to discuss a topic of their own interest.

  1. Improving the Performance of an appMRI Hippocampus Volume Analyzer (HVA) (Master Thesis/Project) - Co-supervision with MIAC

Currently, the processing time for creating an appMRI HVA (URL) report is approximately 3 hours, excluding the human quality control. The main goal of this project is to study and understand how the algorithm (FreeSurfer 5.3 or FreeSurfer 6.0 - latest release) could be improved so that the time needed to compute the volume of the hippocampus decreases significantly. The appMRI HVA algorithm (FreeSurfer 5.3) already relies on OpenMP parallelization to speed up some operations. Ideally, we would like to identify and modify further routines that could benefit from this approach. Additionally, we would like to identify the optimal configuration of an appMRI cluster node (number of threads/cores per job, CPU, memory, etc.) to obtain the best performance from the appMRI HVA infrastructure.
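When choosing the number of threads per job, Amdahl's law gives a rough upper bound on what additional OpenMP parallelization can deliver. The sketch below is illustrative only: the parallel fraction of 0.8 is a made-up assumption, not a measured property of FreeSurfer or the appMRI HVA pipeline, and treating the 3 hours as a single-threaded baseline is an assumption as well.

```python
def amdahl_speedup(parallel_fraction: float, threads: int) -> float:
    """Upper bound on speedup when only a fraction of the runtime is parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if threads < 1:
        raise ValueError("threads must be >= 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)


def estimated_runtime_hours(baseline_hours: float,
                            parallel_fraction: float,
                            threads: int) -> float:
    """Runtime predicted by Amdahl's law for a given thread count."""
    return baseline_hours / amdahl_speedup(parallel_fraction, threads)


if __name__ == "__main__":
    BASELINE_HOURS = 3.0      # approximate report time stated in the topic text
    PARALLEL_FRACTION = 0.8   # hypothetical value, must be measured by profiling

    for threads in (1, 2, 4, 8, 16):
        hours = estimated_runtime_hours(BASELINE_HOURS, PARALLEL_FRACTION, threads)
        print(f"{threads:2d} threads: about {hours:.2f} h")
```

In practice the parallel fraction would come from profiling the real pipeline, and the measured curve will usually sit below this bound because of memory bandwidth limits and synchronization overhead.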

  1. Algorithms and Experiments for Quantum Computing (Master Thesis)

Quantum computing (QC) is radically different from the conventional computing approach. Based on quantum bits that can be zero and one at the same time, a quantum computer acts as a massively parallel device with an exponentially large number of computations taking place at the same time. This is expected to make tractable some problems that are intractable even for the most powerful classical supercomputers. While the physics behind QC was explored a hundred years ago, implementations are still at an early stage of development. Major companies and research funding agencies are nevertheless investing heavily in this direction. In this master thesis you will explore this fascinating field and get hands-on experience with QC simulators and early systems.
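To give a feeling for what a QC simulator does, here is a minimal state-vector sketch in plain Python. It is not taken from any particular simulator; the function names are hypothetical. It applies a Hadamard gate to every qubit, which puts an n-qubit register into an equal superposition of all 2^n basis states.

```python
import math


def zero_state(n_qubits: int) -> list:
    """State vector of |00...0>, stored as 2**n complex amplitudes."""
    state = [0j] * (2 ** n_qubits)
    state[0] = 1 + 0j
    return state


def apply_hadamard(state: list, target: int) -> list:
    """Apply a Hadamard gate to qubit `target` (0 = least significant bit)."""
    h = 1 / math.sqrt(2)
    stride = 1 << target
    new_state = list(state)
    for i in range(len(state)):
        if (i & stride) == 0:
            # Amplitudes of the two basis states that differ only in `target`.
            a, b = state[i], state[i | stride]
            new_state[i] = h * (a + b)
            new_state[i | stride] = h * (a - b)
    return new_state


def probabilities(state: list) -> list:
    """Measurement probability of each basis state."""
    return [abs(amplitude) ** 2 for amplitude in state]


if __name__ == "__main__":
    n = 3
    state = zero_state(n)
    for qubit in range(n):
        state = apply_hadamard(state, qubit)
    for index, p in enumerate(probabilities(state)):
        print(f"|{index:0{n}b}>: {p:.3f}")
```

The state vector doubles in size with every added qubit, which is why classical simulation of this kind only reaches a few dozen qubits.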

  1. FPGA-Based Accelerators for High-Performance Computing (Bachelor/Master Thesis)

Field-programmable devices such as Field-Programmable Gate Arrays (FPGAs) are a hybrid of hardware and software. These integrated circuits consist of thousands of basic computing blocks, which offer both hardware acceleration and application-specific programmability. FPGA devices can thus act as accelerators: compute-intensive program parts are executed on the FPGA co-processor, while run-time organization and other program parts run on a standard CPU. In this thesis you will study the potential of FPGAs in high-performance computing by comparing their performance against standard CPUs for specific applications (e.g., machine learning).

  1. A parallel debugger for MPI applications

An open-source MPI debugger is a step toward educational parallel debuggers, customized debuggers, and license-free debuggers. Serial debuggers such as gdb and lldb provide a machine interface (GDB/MI or LLDB/MI) that is used by the debuggers of many IDEs. Your task is to integrate them into a user-friendly GUI application that behaves as shown in the demo.

  1. A visualizer for job logs in batch systems for high performance computing clusters

The goal is to convert batch system logs from file format to database format, helping organizations share their data for commercial or scientific purposes with less effort and more privacy options. The work includes implementing a tool that collects batch system logs and visualizes utilization statistics, converting the logs from file to database format, and implementing a user-friendly web interface for viewing usage statistics.

  1. What is your name, benchmark scheduler?

There are numerous benchmarks and parallel workloads available in the HPC community. They are believed to employ very good schedulers. However, the documentation accompanying these workloads does not detail the scheduling techniques and algorithms they employ. In this thesis, the scheduling algorithms used in HPC workloads will be identified and the findings compared.

  1. A visualization tool for job schedulers in HPC

Build a tool that visualizes the status of the jobs and the scheduler queues on an HPC system. The information to be displayed is assembled from the output of qstat, qhost, and qquota and from the batch logs.

  1. Performance comparison of parallel programming paradigms on miniHPC

Study the performance of different parallel programming models on miniHPC and explore which model performs best. Explain the performance obtained from the benchmarks and how it relates to the architecture and software stack, optimize the compilation of the benchmarks for the miniHPC architecture, and possibly tune the benchmarks to achieve the best possible performance in every programming model.

  1. Performance engineering with stencil kernels and codes

The stencil is an important and widely used computational motif. For this reason, researchers try to optimize its computation, and several stencil compilers have been implemented. The focus of your thesis is to study two of them: PLUTO (v 0.11) and Girih. The first exploits the polyhedral model to optimize loops with affine transformations, whereas the latter is mainly used to develop and analyze the performance of Multi-core Wavefront Diamond (MWD) tiling techniques, which perform temporal blocking. Once you have implemented a test case and run an experiment, the results should be compared against the Roofline model and the ECM model in order to understand how well the approaches exploit the available hardware.
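As a rough idea of what a Roofline comparison involves, the sketch below computes the attainable performance as the minimum of peak compute and memory bandwidth times arithmetic intensity. The machine numbers are hypothetical placeholders, not miniHPC specifications, and the traffic estimate for the 5-point Jacobi stencil (4 flops and 24 bytes per update in double precision: one load, one store, one write-allocate, with perfect cache reuse of neighbors) is a common textbook assumption rather than a measurement.

```python
def attainable_gflops(peak_gflops: float,
                      bandwidth_gbytes_per_s: float,
                      intensity_flops_per_byte: float) -> float:
    """Roofline bound: min(peak compute, bandwidth * arithmetic intensity)."""
    return min(peak_gflops, bandwidth_gbytes_per_s * intensity_flops_per_byte)


def ridge_point(peak_gflops: float, bandwidth_gbytes_per_s: float) -> float:
    """Arithmetic intensity above which a kernel becomes compute bound."""
    return peak_gflops / bandwidth_gbytes_per_s


if __name__ == "__main__":
    PEAK_GFLOPS = 500.0       # hypothetical machine peak
    BANDWIDTH_GBS = 100.0     # hypothetical memory bandwidth

    FLOPS_PER_UPDATE = 4      # assumed: 3 additions + 1 multiplication
    BYTES_PER_UPDATE = 24     # assumed: load + store + write-allocate, 8 B each

    intensity = FLOPS_PER_UPDATE / BYTES_PER_UPDATE
    bound = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GBS, intensity)
    print(f"intensity   = {intensity:.3f} flop/byte")
    print(f"ridge point = {ridge_point(PEAK_GFLOPS, BANDWIDTH_GBS):.1f} flop/byte")
    print(f"bound       = {bound:.1f} GFLOP/s")
```

Under these assumptions the kernel sits far to the left of the ridge point, so it is memory bound. Temporal blocking aims to raise the effective intensity by reusing data across time steps.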

  1. Development of a stencil kernel and application benchmark

The stencil computational pattern is representative of several numerical codes, where it usually accounts for an important part of the execution time. In these codes, the stencil part ranges from 2-dimensional to 3-dimensional grids and from low order to high order, with varying arithmetic intensity. Several tools and implementations are available. The question you are going to answer is: “Given a stencil belonging to a certain category, what is the best choice for its compilation?” You will implement a test case from each category in OpenMP, PLUTO and PATUS (the latter two being stencil compilers) and benchmark the produced outputs.
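For readers new to the pattern, here is a minimal 2-dimensional, low-order example: one sweep of a 5-point Jacobi stencil in plain Python. It is only a reference sketch for illustration and correctness checks, not OpenMP code and not output of PLUTO or PATUS; the function name is hypothetical.

```python
def jacobi_step(grid: list) -> list:
    """One sweep of a 5-point Jacobi stencil; boundary values stay fixed."""
    rows, cols = len(grid), len(grid[0])
    new_grid = [row[:] for row in grid]  # copy so reads and writes do not mix
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # Each interior point becomes the average of its four neighbors.
            new_grid[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                     + grid[i][j - 1] + grid[i][j + 1])
    return new_grid


if __name__ == "__main__":
    # Boundary held at 1.0, interior starts at 0.0.
    size = 6
    grid = [[1.0 if i in (0, size - 1) or j in (0, size - 1) else 0.0
             for j in range(size)] for i in range(size)]
    for _ in range(50):
        grid = jacobi_step(grid)
    for row in grid:
        print(" ".join(f"{value:.3f}" for value in row))
```

A reference like this is mainly useful for validating the results of the optimized variants before comparing their performance.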

  1. Managing of shared experiment workspaces among different HPC systems

Conducting research experiments in Computational Science is not only a matter of writing code but also of configuring the software used for running it on complex high performance computing (HPC) systems. Manually configuring the software often leads to experiments that are not reproducible in terms of either pure execution or final results. Furthermore, a key requirement for a scientist who carries out an experiment is the ability to collaborate in a simple and effective way with other scientists. This can be more difficult when using HPC systems: an HPC system is usually a closed environment that, unless specially configured, is accessible only through a local account. Such a local account cannot be used for accessing a different system and, most likely, will not give full control over the machine (e.g., for installing new software). The HPC Group (formerly HPWC) is currently developing a framework called “PROVA!” with the aim of managing and sharing HPC experiments to further collaborative research.
The scope of the thesis is to analyze the pros and cons of different approaches to shared workspaces in order to propose a solution suitable for the HPC field and integrate it into “PROVA!”.

  1. Identification and analysis of the communication behavior of parallel applications

The execution of applications on parallel computing systems requires that application processes communicate during their execution. Understanding the communication behavior of parallel applications is important for optimizing their parallel execution. The communication patterns can be represented as process graphs (or networks) and/or task graphs. This work involves (1) the identification and classification of communication behavior types from various synthetic and real parallel applications and (2) an investigation of the similarities and differences between the process graphs and the task graphs of single parallel applications. To realize this work, synthetic communication patterns may be developed, and the communication behavior of real applications will be extracted and classified based on their execution traces.
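The idea of a process graph can be illustrated with a small sketch. It assumes a hypothetical, simplified trace in which each event is a tuple (sender rank, receiver rank, bytes); real execution traces contain far more detail. The code aggregates the events into a communication matrix, derives the edges of the process graph, and checks for one very simple pattern class, a ring.

```python
def communication_matrix(events: list, n_procs: int) -> list:
    """Total bytes sent from each rank (row) to each rank (column)."""
    matrix = [[0] * n_procs for _ in range(n_procs)]
    for src, dst, n_bytes in events:
        matrix[src][dst] += n_bytes
    return matrix


def process_graph_edges(matrix: list) -> set:
    """Directed edges (src, dst) between ranks that exchanged any data."""
    return {(src, dst)
            for src, row in enumerate(matrix)
            for dst, n_bytes in enumerate(row)
            if n_bytes > 0}


def is_ring(edges: set, n_procs: int) -> bool:
    """True if every rank sends only to its right neighbor (rank + 1) mod n."""
    expected = {(rank, (rank + 1) % n_procs) for rank in range(n_procs)}
    return edges == expected


if __name__ == "__main__":
    # Synthetic trace: 4 ranks, each sending 1 KiB to its right neighbor twice.
    events = [(rank, (rank + 1) % 4, 1024) for rank in range(4)] * 2
    matrix = communication_matrix(events, 4)
    for row in matrix:
        print(row)
    print("ring pattern:", is_ring(process_graph_edges(matrix), 4))
```

A classification as described in the topic would need many more pattern classes (e.g., all-to-all or nearest-neighbor grids) and tolerance for noise in real traces.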

  1. From OTF2 traces to the SimGrid toolkit

OTF2 refers to the Open Trace Format (version 2), a format used to store the execution traces of applications as a sequence of events. Understanding the traces helps in analyzing the behavior of the applications during execution. The goal is to develop a tool that reads OTF2 trace files as input, extracts the structure of the application and its execution times, and uses this information to build a simulator of the application based on the programming interfaces of the SimGrid simulation framework. The developed tool will be used to automatically create inputs for simulating the execution of parallel applications from their execution traces.