Transfer learning from non-medical datasets

Transfer learning has recently become a popular technique for training machine learning algorithms. The goal is to transfer information from dataset A (the source) to dataset B (the target). This increases the total amount of data the classifier learns from, leading to a more robust algorithm. This is especially important for medical imaging datasets, which may contain only a few hundred images. A recent development is that it is possible to transfer information from non-medical datasets – for example, dataset A can be a collection of cat and dog pictures, and dataset B a set of chest CT scans.
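As a minimal sketch of this idea, the snippet below pretrains a small network on a (synthetic, stand-in) source dataset and reuses its hidden layer as a fixed feature extractor for a much smaller target dataset. It assumes scikit-learn and NumPy; the datasets are random placeholders, not real images.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# "Source" dataset A: plentiful labelled data (think cats vs. dogs).
Xa, ya = make_classification(n_samples=2000, n_features=50, random_state=0)
# "Target" dataset B: only a few hundred labelled images (think chest CT).
Xb, yb = make_classification(n_samples=200, n_features=50, random_state=1)

# Pretrain a small network on the source task.
source_net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                           random_state=0).fit(Xa, ya)

def hidden_features(net, X):
    """Reuse the pretrained hidden layer as a fixed feature extractor (ReLU)."""
    return np.maximum(0, X @ net.coefs_[0] + net.intercepts_[0])

Xb_tr, Xb_te, yb_tr, yb_te = train_test_split(Xb, yb, random_state=0)

# Baseline: train on the target data alone.
baseline = LogisticRegression(max_iter=1000).fit(Xb_tr, yb_tr)
# Transfer: train the same classifier on the transferred representation.
transfer = LogisticRegression(max_iter=1000).fit(
    hidden_features(source_net, Xb_tr), yb_tr)

print("baseline accuracy:", baseline.score(Xb_te, yb_te))
print("transfer accuracy:",
      transfer.score(hidden_features(source_net, Xb_te), yb_te))
```

Because the two synthetic datasets are unrelated here, transfer will not necessarily help – which is exactly the kind of question the project investigates: when does a given source representation help a given target?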

A few researchers have compared non-medical and medical datasets as the source data. Some have found non-medical data to be better, while others achieved superior results with medical data. However, these comparisons differ in the datasets, dataset sizes, network architectures and so forth that were used. More systematic comparisons are therefore needed.

In this project you will train neural networks on various non-medical and medical public datasets, and apply them to medical target data. Possible directions include varying the dataset size, the number of classes and other parameters. Another direction is combining multiple source datasets in a neural network ensemble. The goal is to provide advice on what considerations should be made when choosing a source dataset.

Some experience with machine learning is required, experience with Python is preferred. Experience with medical imaging is preferred but not required.

Supervisor: Dr. Veronika Cheplygina (v.cheplygina at


Cheplygina, V. (2018). Cats or CAT scans: transfer learning from natural or medical image source datasets? arXiv preprint arXiv:1810.05444.

Menegola, A., Fornaciali, M., Pires, R., Bittencourt, F. V., Avila, S., & Valle, E. (2017, April). Knowledge transfer for melanoma screening with deep learning. In Biomedical Imaging (ISBI 2017), 2017 IEEE 14th International Symposium on (pp. 297-300). IEEE.

Zhang, R., Zheng, Y., Mak, T. W. C., Yu, R., Wong, S. H., Lau, J. Y., & Poon, C. C. (2017). Automatic Detection and Classification of Colorectal Polyps by Transferring Low-Level CNN Features From Nonmedical Domain. IEEE J. Biomedical and Health Informatics, 21(1), 41-47.

Combining relative assessments for melanoma classification

The project addresses melanoma classification in skin lesion images. Typically, machine learning algorithms for this application learn from images which have been labeled as melanoma or not. A less explored option is to learn from relative assessments of images, for example, whether images are similar to each other or not. Such assessments can be used to learn a good representation of the images, and the representation can then be further trained with a traditional dataset. An advantage of this method is that relative assessments may be more intuitive to provide than diagnostic labels, which could allow a large number of assessments to be collected via crowdsourcing.
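One common way to learn from such relative comparisons is a triplet (hinge) loss in the style of Schultz & Joachims: an anchor image should end up closer to the image rated similar than to the one rated dissimilar. A minimal NumPy sketch, with toy feature vectors standing in for real image representations:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss: the anchor should be closer (squared Euclidean) to the
    image rated similar (positive) than to the one rated dissimilar
    (negative), by at least `margin`."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

# Toy feature vectors for three images; a crowd worker judged image 0
# to look more like image 1 than like image 2.
x0, x1, x2 = np.zeros(4), np.full(4, 0.1), np.ones(4)
print(triplet_loss(x0, x1, x2))  # assessment satisfied: zero loss
print(triplet_loss(x0, x2, x1))  # assessment violated: positive loss
```

In the project, a loss of this form would drive a deep network's representation, which is then fine-tuned with the expert-provided melanoma labels.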

In this project you will develop a deep learning algorithm which uses two types of input: melanoma labels provided by experts, and relative assessments provided by the crowd. The relative assessments were collected as part of the MelaGo project at TU Eindhoven, where participants could rate images via a (gamified) app. One of the goals is therefore also to investigate how gamification affected the quality of the assessments. Another goal is to investigate how to best combine the assessments, and whether filtering annotators by quality can improve the results.


Some experience with machine learning is required, experience with Python is preferred.

Supervisor: Dr. Veronika Cheplygina (v.cheplygina at


Khan, V. J., Meijer, P., Paludetti, M., Magyari, R., van Berkum, D., & Cheplygina, V. (2018). MelaGo: Gamifying Medical Image Annotation.

Schultz, M., & Joachims, T. (2004). Learning a distance metric from relative comparisons. In Advances in neural information processing systems (pp. 41-48).

Ørting, S. N., Cheplygina, V., Petersen, J., Thomsen, L. H., Wille, M. M., & de Bruijne, M. (2017). Crowdsourced emphysema assessment. In Intravascular Imaging and Computer Assisted Stenting, and Large-Scale Annotation of Biomedical Data and Expert Label Synthesis (pp. 126-135).

Meta-learning for medical image segmentation

Imagine you have experience with two segmentation applications (for example, tissue segmentation in brain MRI, and cell segmentation in histopathology), and you know that different (deep) learning methods work best in each application. Can you decide which method to use on a third application, for example segmentation of vessels in retinal images, without trying all the possibilities first?

In machine learning this idea of predicting which methods will perform better on a given dataset is called “meta-learning”. This can be done by characterizing each (dataset, method) pair with several “meta-features”, which describe the data (for example, the number of images) and the method (for example, how many layers a neural network has). The label of this pair is the performance of the method on the dataset. This way, a meta-classifier can learn what type of data and classifiers perform well together.
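A minimal sketch of this setup, using a synthetic meta-dataset (the meta-features, the toy labelling rule and the thresholds below are all placeholders, not measured results) and scikit-learn:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.RandomState(0)
n_pairs = 200

# Each row describes one (dataset, method) pair with four meta-features:
# [number of images, number of classes, network depth, learning rate].
meta_X = np.column_stack([
    rng.randint(100, 10000, n_pairs).astype(float),  # dataset size
    rng.randint(2, 10, n_pairs).astype(float),       # number of classes
    rng.randint(2, 50, n_pairs).astype(float),       # network depth
    10.0 ** rng.uniform(-4, -1, n_pairs),            # learning rate
])
# Toy rule standing in for measured performance: deep networks only
# "work" when there is enough data.
meta_y = (meta_X[:, 2] < 10) | (meta_X[:, 0] > 5000)

# The meta-classifier learns which data/method combinations perform well.
meta_clf = RandomForestClassifier(random_state=0).fit(meta_X, meta_y)

# Predict whether a 30-layer network is worth trying on a new 500-image,
# 3-class dataset, before running any experiment on it.
new_pair = np.array([[500.0, 3.0, 30.0, 1e-3]])
print(meta_clf.predict(new_pair))
```

In the project, the toy rule is replaced by actual performances of methods on public medical imaging datasets, and the interesting question becomes which meta-features make such predictions accurate.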

An important open question is how to choose the meta-features for this problem. In this MSc project, you will investigate how to adapt meta-learning features from the literature to medical imaging problems, and engineer specialized features that might not be applicable to other types of data. You will work on a set of publicly available medical imaging datasets, and implement your methods in the OpenML platform.

Some experience with machine learning is required, experience with Python is preferred. Experience with medical imaging is preferred but not required.

Supervisors: Dr. Veronika Cheplygina and Dr. Joaquin Vanschoren (Data Mining, Department of Computer Science)

Contact: v.cheplygina at


Cheplygina, V., Moeskops, P., Veta, M., Dashtbozorg, B., & Pluim, J. P. W. (2017). Exploring the similarity of medical imaging classification problems. In Intravascular Imaging and Computer Assisted Stenting, and Large-Scale Annotation of Biomedical Data and Expert Label Synthesis (pp. 59-66). Springer.

Vanschoren, J., Blockeel, H., Pfahringer, B., & Holmes, G. (2012). Experiment databases. Machine Learning, 87(2), 127-158.


Weakly supervised learning in medical imaging (various projects)

Data is often only weakly annotated: for example, for a medical image, we might know the patient’s overall diagnosis, but not where the abnormalities are located, because obtaining ground-truth annotations is very time-consuming. Multiple instance learning (MIL) is an extension of supervised machine learning, aimed at dealing with such weakly labeled data. For example, a classifier trained on healthy and abnormal images would be able to label both a previously unseen image AND local patches in that image.
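A minimal MIL sketch on synthetic data (assuming scikit-learn and NumPy; the "images" and "patches" are random placeholders): each bag is an image, its instances are patches, only bag labels are known, and a simple baseline propagates the bag label to every patch, trains an instance classifier, then pools patch scores with a max to label the bag.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)

def make_bag(abnormal):
    """A bag of 20 patches with 5 features; abnormal bags contain one
    abnormal patch (we know this during simulation, the learner does not)."""
    patches = rng.normal(0, 1, size=(20, 5))
    if abnormal:
        patches[rng.randint(20)] += 3.0
    return patches

bag_labels = np.array([i % 2 for i in range(100)])
bags = [make_bag(lab == 1) for lab in bag_labels]

# Naive MIL baseline: give every patch its bag's label, train an
# ordinary instance classifier on the (noisy) patch labels.
X = np.vstack(bags)
y = np.repeat(bag_labels, 20)
inst_clf = LogisticRegression(max_iter=1000).fit(X, y)

def predict_bag(bag):
    patch_scores = inst_clf.predict_proba(bag)[:, 1]  # patch-level output
    return patch_scores.max()                         # bag-level output

scores = np.array([predict_bag(b) for b in bags])
print("mean score, abnormal bags:", scores[bag_labels == 1].mean())
print("mean score, healthy bags:", scores[bag_labels == 0].mean())
```

Note that `patch_scores` gives localized, patch-level predictions even though no patch was ever labeled – evaluating such predictions without ground truth is exactly one of the open questions below.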

Figure 1: Supervised learning and multiple instance learning, shown for the task of detecting abnormalities in chest CT images. Images from Cheplygina, V., de Bruijne, M., & Pluim, J. P. W. (2018). Not-so-supervised: a survey of semi-supervised, multi-instance, and transfer learning in medical image analysis. arXiv preprint arXiv:1804.06353.

There are still a number of open research directions. For example,

  • How can we evaluate the patch-level predictions without ground-truth labels?
  • Could we improve MIL algorithms by asking experts only a few questions, where they verify the algorithm’s decisions?
  • What can we learn about MIL in medical imaging from other applications where it has been applied?

As an MSc student you would choose one or more medical imaging applications you are interested in, using an open dataset or a dataset available through collaborators, and work with us on formulating your own research question. Participating in a machine learning competition, creating open-source tools and/or writing a paper for a scientific conference are also encouraged.

Some experience with machine learning is required (for example 8DC00 if you are a TU/e student). Experience with Python is preferred but experience with another programming language and willingness to learn Python is also sufficient.

Supervisor TU/e: Dr. Veronika Cheplygina (v.cheplygina at