PhD position in Machine Learning or Applied Mathematics at Jacobs University Bremen

A PhD position in Machine Learning or Applied Mathematics is available in the group of Prof. Peter Zaspel at Jacobs University Bremen, Germany. The position focuses on the development of novel machine learning techniques in the context of a large- to extreme-scale biochemistry simulation for photosynthesis (https://doi.org/10.1016/j.cell.2019.10.021). The research will involve the further development of kernel-based machine learning models (Kernel Ridge Regression, Gaussian Process Regression, etc.) towards multi-fidelity models and large-scale computations in an interdisciplinary application.

The group of Prof. Peter Zaspel is located at Jacobs University Bremen, a private, state-accredited, English-language research university. The research group focuses on machine learning, uncertainty quantification and high performance computing in the context of applications from the natural sciences, engineering and beyond. For more details, see https://www.peter-zaspel.de/

This project is in close collaboration with the group of Prof. Ulrich Kleinekathöfer from the Physics Department at Jacobs University Bremen, whose group will contribute the biophysics simulation application (http://ukleinekat.user.jacobs-university.de/).

A successful applicant is expected to have a Master’s degree in computer science, (applied) mathematics or a similar discipline, strong analytical skills in the context of machine learning and/or (numerical) mathematics, excellent proficiency in a programming language (preferably Python or C/C++) and an interest in large-scale machine learning and computing applications. Experience in (numerical) simulation is an advantage. A good command of English is essential, both as the local working language and because of our international collaborations.

We offer a PhD position that is limited to 3 years. It is funded under the DFG grant “Excitation Energy Transfer in a Photosynthetic System with more than 100 Million Atoms”. The salary will be paid in accordance with the Collective Agreement for the Public Service of the Federation (Tarifvertrag des öffentlichen Dienstes, TVöD Bund), with salary level 13 (100%). The place of employment will be Bremen, Germany.

The position is available immediately and applications will be considered until the position is filled. Applicants should submit a CV, a brief statement of research interests, a copy of their MSc thesis, and the names of two referees by e-mail to Prof. Peter Zaspel at p.zaspel(at)jacobs-university.de

Multi-fidelity machine learning

Imagine that we are given not just training samples (i.e. inputs and outputs), but that each sample also comes with a “cost” and possibly a given “accuracy”.

This project investigates how to find optimal machine learning models that have not only minimal loss / maximal accuracy but also minimal overall cost. That is, we face a much more complex optimization task.
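To make the setting concrete, here is a minimal sketch of one possible two-level construction. It is not necessarily the approach taken in the project. It assumes a toy problem with a cheap function `f_low` and an expensive function `f_high` (both hypothetical), fits kernel ridge regression to many cheap samples, and then fits a second model to the difference between the two fidelities using only three expensive samples. All names, the Gaussian kernel and its length scale are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(a, b, length_scale=0.3):
    # Pairwise Gaussian kernel between two 1-D sample arrays.
    d = a[:, None] - b[None, :]
    return np.exp(-d**2 / (2.0 * length_scale**2))

def krr_fit(x, y, reg=1e-8):
    # Kernel ridge regression: solve (K + reg * I) alpha = y.
    K = gaussian_kernel(x, x)
    return np.linalg.solve(K + reg * np.eye(len(x)), y)

def krr_predict(x_train, alpha, x_new):
    return gaussian_kernel(x_new, x_train) @ alpha

# Hypothetical toy problem: expensive target and cheap, biased approximation.
def f_high(x):
    return np.sin(2.0 * np.pi * x) + 0.3 * x

def f_low(x):
    return np.sin(2.0 * np.pi * x)

x_low = np.linspace(0.0, 1.0, 30)    # many cheap samples
x_high = np.linspace(0.0, 1.0, 3)    # few expensive samples
x_test = np.linspace(0.0, 1.0, 200)

# Level 1: model of the cheap function.
alpha_low = krr_fit(x_low, f_low(x_low))
# Level 2: model of the difference (high - low) on the expensive samples only.
alpha_delta = krr_fit(x_high, f_high(x_high) - f_low(x_high))

# Multi-fidelity prediction: cheap model plus learned correction.
pred_mf = (krr_predict(x_low, alpha_low, x_test)
           + krr_predict(x_high, alpha_delta, x_test))
# Single-fidelity baseline: expensive samples only.
pred_sf = krr_predict(x_high, krr_fit(x_high, f_high(x_high)), x_test)

err_mf = np.max(np.abs(pred_mf - f_high(x_test)))
err_sf = np.max(np.abs(pred_sf - f_high(x_test)))
print(f"max error, multi-fidelity: {err_mf:.3f}, single-fidelity: {err_sf:.3f}")
```

In this toy setting the three expensive samples alone miss the oscillation completely, while the combined model recovers it from the cheap data, so `err_mf` comes out much smaller than `err_sf`. How to distribute a given budget over the fidelity levels is exactly the kind of optimization question raised above.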

In the project, we touch topics in:

  • various regression ML methods
  • optimization
  • approximation theory

WARNING: Again a rather mathematical, but very beautiful topic. Could go from a very applied view to a pretty theoretical one.

This is highly relevant to current research and has very important applications in machine learning for simulation and in other fields.

Some first links:

Machine Learning in Quantum Chemistry

In this research project, we are interested in the prediction of properties for molecules.

The project touches the following fields:

  • prediction of properties of molecules by quantum chemistry simulation software
  • optimal feature representation for molecules in machine learning
  • various types of machine learning techniques

This has tremendously important applications in fields such as virtual material design, drug discovery, …

The actual work can range from utilizing and comparing existing machine learning methods in that field to developing completely new approaches.
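As an illustration of what a feature representation for molecules can look like, here is a minimal sketch of the Coulomb matrix, one well-known descriptor from the literature. Whether this particular representation would be used in the project is an assumption on my side. The function names and the water-like example geometry are hypothetical and only serve as illustration.

```python
import numpy as np

def coulomb_matrix(charges, positions):
    """Coulomb matrix of one molecule.

    charges:   nuclear charges Z_i, shape (n,)
    positions: Cartesian coordinates R_i, shape (n, 3), in a consistent unit
    """
    Z = np.asarray(charges, dtype=float)
    R = np.asarray(positions, dtype=float)
    dist = np.linalg.norm(R[:, None, :] - R[None, :, :], axis=-1)
    np.fill_diagonal(dist, 1.0)          # placeholder, avoids division by zero
    C = np.outer(Z, Z) / dist            # off-diagonal: Z_i * Z_j / |R_i - R_j|
    np.fill_diagonal(C, 0.5 * Z**2.4)    # diagonal: 0.5 * Z_i^2.4
    return C

def sorted_spectrum(C):
    # Eigenvalues ordered by decreasing magnitude. This feature vector does
    # not depend on the order in which the atoms are listed.
    eig = np.linalg.eigvalsh(C)
    return eig[np.argsort(-np.abs(eig))]

# Hypothetical example: a rough water-like geometry (illustrative numbers).
charges = [8, 1, 1]
positions = [[0.0, 0.0, 0.0], [1.8, 0.0, 0.0], [-0.45, 1.75, 0.0]]

C = coulomb_matrix(charges, positions)
features = sorted_spectrum(C)
print(features)
```

The resulting feature vector could then be passed to any of the regression methods mentioned above. Comparing such representations is one possible starting point for the project.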

Here are some links:

Machine Learning in Fluid Mechanics

Computational Fluid Mechanics is a field of engineering in which computers are used to solve the mathematical equations that describe the behavior of fluids such as air or water. This solution process is typically called “simulation”. The objective of this research topic is to investigate the application of Machine Learning models in fluid simulations. That is, the typically expensive simulation process is replaced by a Machine Learning model.

This research project touches the following topics:

  • modeling of fluids by the Navier-Stokes equations
  • use of an existing Navier-Stokes fluid solver to generate training snapshots
  • further development of machine learning techniques for prediction of outcomes of fluid simulations
  • time-series prediction / quantity of interest prediction / spatial prediction

This is another hot topic, at least in the “simulation business”. Research-relevant questions are:

  • Can we find ML models that nicely predict bifurcation-like behavior?
  • Can we use ML models as sub-models (homogenization-like) in bigger models?

Here are some links:

Wavelets as Features for Time Series ML

In this project, the idea would be to become more familiar with the following concepts:

  • time series data
  • wavelet analysis to generate features
  • several types of machine learning models
    • kernel ridge regression
    • multilayer perceptron
    • radial basis function networks
    • transfer learning using some well-known image classifier

Application data can come from a very broad range of fields, from Quantum Chemistry and Finance to Health.
The main objective would be to start with a “black box” approach, i.e. using some existing implementation of a continuous wavelet filter bank, and then to develop a deeper understanding of how the choice of parameters in the wavelet filter bank influences the prediction quality.
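To give an impression of the feature-generation step, here is a minimal sketch of a continuous wavelet transform written with NumPy only. It assumes a Ricker ("Mexican hat") wavelet and uses the mean absolute coefficient per scale as features. Both are illustrative choices, and the function names are hypothetical. In the project itself one would rather start from an existing filter bank implementation.

```python
import numpy as np

def ricker(num_points, width):
    # Ricker ("Mexican hat") wavelet sampled on num_points points.
    t = np.arange(num_points) - (num_points - 1) / 2.0
    norm = 2.0 / (np.sqrt(3.0 * width) * np.pi**0.25)
    return norm * (1.0 - (t / width)**2) * np.exp(-t**2 / (2.0 * width**2))

def cwt_features(signal, widths):
    # Convolve the signal with one wavelet per scale and summarise each
    # scale by its mean absolute coefficient.
    rows = []
    for w in widths:
        n = min(10 * int(w), len(signal))   # wavelet support, capped by signal length
        rows.append(np.convolve(signal, ricker(n, w), mode="same"))
    coeffs = np.array(rows)                 # shape: (number of scales, signal length)
    return np.mean(np.abs(coeffs), axis=1)

widths = [1, 2, 4, 8, 16, 32]               # scales of the filter bank (assumed)
t = np.arange(512)
slow = np.sin(2.0 * np.pi * t / 64.0)       # hypothetical slow oscillation
fast = np.sin(2.0 * np.pi * t / 8.0)        # hypothetical fast oscillation

feat_slow = cwt_features(slow, widths)
feat_fast = cwt_features(fast, widths)
print("dominant scale, slow signal:", widths[int(np.argmax(feat_slow))])
print("dominant scale, fast signal:", widths[int(np.argmax(feat_fast))])
```

The feature vector of the fast signal peaks at a smaller scale than that of the slow signal. The choice of `widths` and of the wavelet are examples of the filter bank parameters whose influence on the prediction quality would be studied.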

A first reference:

Radial Basis Function networks

This topic combines prior knowledge on kernel ridge regression with neural networks. The following content will be considered:

  • (deep) neural networks
  • kernel ridge regression
  • radial basis function networks

The objective would be to study the relationship between the predictive power of kernel ridge regression and that of radial basis function networks, based on given data from quantum chemistry or another relevant scientific application.
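The connection between the two methods can be sketched as follows. Both predict with a weighted sum of radial basis functions. Kernel ridge regression places one basis function on every training sample, while a radial basis function network uses a smaller number of hidden units. The sketch below is a simplification: it assumes a Gaussian basis, a toy target function and fixed, equally spaced centres whose output weights are found by linear least squares. In a full radial basis function network, centres and widths would typically be trained as well. All names are hypothetical.

```python
import numpy as np

def gaussian_basis(x, centres, length_scale=0.2):
    # Entry (i, j) is the Gaussian basis function centred at centres[j],
    # evaluated at x[i].
    d = x[:, None] - centres[None, :]
    return np.exp(-d**2 / (2.0 * length_scale**2))

rng = np.random.default_rng(0)
x_train = rng.uniform(-1.0, 1.0, size=200)
y_train = np.sin(3.0 * x_train)             # hypothetical target function
x_test = np.linspace(-0.9, 0.9, 100)
y_test = np.sin(3.0 * x_test)

# Kernel ridge regression: 200 basis functions, one per training sample.
reg = 1e-6
K = gaussian_basis(x_train, x_train)
alpha = np.linalg.solve(K + reg * np.eye(len(x_train)), y_train)
pred_krr = gaussian_basis(x_test, x_train) @ alpha

# RBF network with fixed centres: 15 hidden units, linear output layer.
centres = np.linspace(-1.0, 1.0, 15)
weights, *_ = np.linalg.lstsq(gaussian_basis(x_train, centres), y_train, rcond=None)
pred_rbfn = gaussian_basis(x_test, centres) @ weights

err_krr = np.max(np.abs(pred_krr - y_test))
err_rbfn = np.max(np.abs(pred_rbfn - y_test))
print(f"max error KRR: {err_krr:.2e}, RBF network: {err_rbfn:.2e}")
```

Varying the number of centres, or training them, is one way to explore how the predictive power of the two models relates.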

A few first links:

Neural Network Compression by Low Rank Approximation

This is a very technical topic that I would be interested in exploring. It involves:

  • neural networks
  • low rank matrix approximation

Here the idea is to speed up neural network inference, and maybe even training, by replacing the weight matrices of fully connected layers with low-rank approximations.
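A minimal sketch of this idea, assuming a truncated singular value decomposition as the low-rank approximation (one of several possible choices): a weight matrix W of size m x n is replaced by two factors A (m x r) and B (r x n), so that applying the layer costs r(m + n) instead of mn multiplications. The weight matrix below is synthetic and hypothetical. Real network weights are not guaranteed to be this close to low rank.

```python
import numpy as np

def low_rank_factors(W, rank):
    """Return A (m x rank) and B (rank x n) such that A @ B is the best
    rank-`rank` approximation of W in the Frobenius norm."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]    # leading left singular vectors, scaled
    B = Vt[:rank, :]              # leading right singular vectors
    return A, B

rng = np.random.default_rng(0)
m, n, rank = 256, 512, 16

# Hypothetical layer weights: a rank-16 matrix plus a small dense perturbation.
W = rng.standard_normal((m, rank)) @ rng.standard_normal((rank, n))
W += 0.01 * rng.standard_normal((m, n))
x = rng.standard_normal(n)        # one input vector for the layer

A, B = low_rank_factors(W, rank)
y_full = W @ x                    # original layer: m * n multiplications
y_low = A @ (B @ x)               # factored layer: rank * (m + n) multiplications

rel_err = np.linalg.norm(y_full - y_low) / np.linalg.norm(y_full)
params_full = W.size
params_low = A.size + B.size
print(f"parameters: {params_full} -> {params_low}, relative error: {rel_err:.2e}")
```

How large the rank has to be for a trained network, and how the approximation error propagates through several layers, are the kind of questions the project would look at.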

WARNING: This is again a very mathematical topic.

References to be collected:

Fast Kernel Ridge Regression by matrix approximation techniques

The topic of this project is the efficient training of Kernel Ridge Regression models.

Relevant content will be:

  • Kernel Ridge Regression
  • iterative solvers for linear systems
  • matrix approximation techniques:
    • low rank approximation (SVD, ACA, …)
    • ASKIT
    • Hierarchical Matrices

Application data should be large-scale and science-related. A first starting point could be quantum chemistry data that I have access to.
The beauty of this project would be to analyze, and to develop further, how non-exact solvers for linear systems affect the prediction quality of Kernel Ridge Regression. This is highly research relevant.
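The following minimal sketch shows what a non-exact solver means here. Training Kernel Ridge Regression amounts to solving the linear system (K + λI) α = y. Instead of a direct solve, a hand-written conjugate gradient iteration is stopped at a chosen residual tolerance, and the resulting predictions are compared with those of the direct solve. The toy data, the Gaussian kernel, the value of λ (`reg`) and all function names are assumptions made for illustration. The dense product `K @ v` is the place where a matrix approximation technique would be plugged in.

```python
import numpy as np

def conjugate_gradient(matvec, b, tol, max_iter=2000):
    """Solve A x = b for a symmetric positive definite A that is only
    available through the map v -> A @ v. The iteration stops once the
    recursively updated residual norm drops below tol * ||b||."""
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rs = r @ r
    b_norm = np.linalg.norm(b)
    iterations = 0
    while np.sqrt(rs) > tol * b_norm and iterations < max_iter:
        Ap = matvec(p)
        step = rs / (p @ Ap)
        x += step * p
        r -= step * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
        iterations += 1
    return x, iterations

def gaussian_kernel(a, b, length_scale=0.2):
    d = a[:, None] - b[None, :]
    return np.exp(-d**2 / (2.0 * length_scale**2))

rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 1.0, size=300)
y_train = np.sin(2.0 * np.pi * x_train)     # hypothetical training targets
x_test = np.linspace(0.0, 1.0, 100)

reg = 1e-3                                  # ridge parameter (assumed value)
K = gaussian_kernel(x_train, x_train)
K_test = gaussian_kernel(x_test, x_train)

def matvec(v):
    # Exact product with K + reg * I; an approximate product would go here.
    return K @ v + reg * v

alpha_direct = np.linalg.solve(K + reg * np.eye(len(x_train)), y_train)
alpha_tight, it_tight = conjugate_gradient(matvec, y_train, tol=1e-10)
alpha_loose, it_loose = conjugate_gradient(matvec, y_train, tol=1e-2)

diff_tight = np.max(np.abs(K_test @ (alpha_tight - alpha_direct)))
diff_loose = np.max(np.abs(K_test @ (alpha_loose - alpha_direct)))
print(f"tight: {it_tight} iterations, prediction difference {diff_tight:.2e}")
print(f"loose: {it_loose} iterations, prediction difference {diff_loose:.2e}")
```

The looser tolerance needs fewer iterations but changes the predictions. Quantifying this trade-off, also when the matrix-vector product itself is only approximate, is the core question of the project.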

WARNING: Some flavors of this topic (e.g. hierarchical matrices) require a profound mathematical background.

Some first links: