Transfer learning from non-medical datasets

Transfer learning has recently become a popular technique for training machine learning algorithms. The goal is to transfer information from dataset A (the source) to dataset B (the target). This increases the total amount of data the classifier learns from, which can lead to a more robust algorithm. This is particularly important for medical imaging datasets, which sometimes contain only a few hundred images. A recent development is that information can also be transferred from non-medical datasets: for example, dataset A can be a collection of cat and dog pictures, and dataset B can be a set of chest CT scans.

A few researchers have compared non-medical and medical datasets as the source data. Some have found non-medical data to be better, while others achieved superior results with medical data. However, these comparisons differ in the datasets, dataset sizes, networks and other settings used, so their results cannot be compared directly. More systematic comparisons are therefore needed.

In this project you will train neural networks on various non-medical and medical public datasets, and apply them to medical target data. Possible directions include varying the dataset size, the number of classes and other parameters. Another direction is combining multiple source datasets in a neural network ensemble. The goal is to provide advice on what to consider when choosing a source dataset.
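As a rough illustration of two of these directions, the sketch below shows one possible way to draw a class-balanced subset of a source dataset (to vary its size and number of classes) and to combine the outputs of several source-specific networks by averaging their predicted class probabilities. It is only a minimal sketch under stated assumptions: the function names subsample_source and ensemble_average are hypothetical, the numbers are made up, and probability averaging is just one of several ways to build an ensemble.

```python
import numpy as np


def subsample_source(labels, n_classes, n_per_class, seed=0):
    """Return sorted indices of a class-balanced subset of a source dataset.

    Hypothetical helper: keeps the first `n_classes` distinct labels (in
    sorted order) and draws `n_per_class` examples of each, without
    replacement.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    kept_classes = np.unique(labels)[:n_classes]
    chosen = []
    for c in kept_classes:
        idx = np.flatnonzero(labels == c)
        if len(idx) < n_per_class:
            raise ValueError(f"class {c} has only {len(idx)} examples")
        chosen.append(rng.choice(idx, size=n_per_class, replace=False))
    return np.sort(np.concatenate(chosen))


def ensemble_average(prob_list, weights=None):
    """Average class-probability arrays from several source-specific models.

    Each array has shape (n_target_samples, n_target_classes). Optional
    `weights` gives one weight per model.
    """
    probs = np.stack([np.asarray(p, dtype=float) for p in prob_list])
    return np.average(probs, axis=0, weights=weights)


# Toy source dataset: 5 classes with 20 examples each (made-up labels).
source_labels = np.repeat(np.arange(5), 20)
subset = subsample_source(source_labels, n_classes=3, n_per_class=10)

# Made-up target predictions from two networks, assumed to be pretrained on
# a natural-image source and a medical-image source respectively.
p_natural = np.array([[0.9, 0.1], [0.4, 0.6]])
p_medical = np.array([[0.7, 0.3], [0.2, 0.8]])

p_ensemble = ensemble_average([p_natural, p_medical])
predictions = p_ensemble.argmax(axis=1)
```

In an actual experiment the probability arrays would come from networks trained on each source dataset and evaluated on the same target images; repeating the subsampling with different sizes and class counts would give one data point per configuration.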

Some experience with machine learning is required; experience with Python is preferred. Experience with medical imaging is also preferred but not required.

Supervisor: Dr. Veronika Cheplygina (v.cheplygina at


Cheplygina, V. (2018). Cats or CAT scans: transfer learning from natural or medical image source datasets?. arXiv preprint arXiv:1810.05444.

Menegola, A., Fornaciali, M., Pires, R., Bittencourt, F. V., Avila, S., & Valle, E. (2017, April). Knowledge transfer for melanoma screening with deep learning. In 2017 IEEE 14th International Symposium on Biomedical Imaging (ISBI 2017) (pp. 297-300). IEEE.

Zhang, R., Zheng, Y., Mak, T. W. C., Yu, R., Wong, S. H., Lau, J. Y., & Poon, C. C. (2017). Automatic detection and classification of colorectal polyps by transferring low-level CNN features from nonmedical domain. IEEE Journal of Biomedical and Health Informatics, 21(1), 41-47.