Virtual Expert in the Electron Microscope


 

Introduction

The 21st century is one of the most productive eras in the history of drug discovery. This is largely due to the revolution in structural biology, which allows us to see a molecule's structure, explore its function, and design macromolecules to cure diseases.

Among all structural biology techniques, Single Particle Analysis (SPA) on the Transmission Electron Microscope (TEM) is without doubt the leading one. It illuminates a frozen sample with a low-dose electron beam, acquires thousands of 2-dimensional images of the sample, and deduces the 3D structure through advanced image and volume analysis.

This technique was awarded the Nobel Prize in Chemistry in 2017 for resolving high-resolution molecular structures that were previously thought impossible.


Left: Typical SPA workflow. Courtesy of Prof. Z. Zhou. Right: The 3D structure of Gamma Secretase, which helped us understand Alzheimer's disease. (A) The SPA structure. (B) The atomic structure. Courtesy of Prof. Y. Shi.

 

Problem Statement

Despite its great achievements, SPA nevertheless has its shortcomings: it requires a massive amount of expertise to ensure that, at every step of data acquisition, only the highest-quality information is selected.

A typical decision cycle for a SPA scientist:

– Is the microscope condition good enough?

– If yes, which piece of the prepared samples should I work on first?

– Once decided, which position on this piece of sample should I shine the beam on?

– Once the image is taken, where is the molecule of interest on this image? Is it good enough?

All of those questions can be answered by a well-trained expert but are very difficult to solve using classical algorithms.

A typical decision flow of SPA. From (a) to (e), the scientist selects finer and finer areas, eventually resulting in a good-quality image. Courtesy of Prof. Z. Zhou.

Therefore, to increase ease of use and enable faster drug design, our ultimate goal is to learn from those experts and eventually train a virtual expert inside the microscope using deep learning techniques.
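Each of the decision steps above could, for instance, be cast as an image classification problem. Below is a minimal PyTorch sketch for the last step (is this image good enough?); the architecture, input size, and labels are hypothetical placeholders, not part of the project specification.

```python
import torch
import torch.nn as nn

# Hypothetical binary classifier: micrograph crop -> "good" vs. "bad" quality.
class QualityNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, 2)  # logits for good / bad

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = QualityNet()
x = torch.randn(8, 1, 128, 128)                  # dummy batch of 1-channel crops
logits = model(x)
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 2, (8,)))
loss.backward()                                  # one supervised training step
```

The same pattern (a classifier trained on expert-labeled examples) would apply to the earlier decisions in the cycle, with the input image and label set adapted to each step.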

 

Student project

This project is the first step towards this goal. It includes the following activities for the student:

– Literature study of deep learning techniques and the SPA workflow. Identify potential candidate steps for training the virtual expert.

– Train a neural network for one of the above decision steps.

– Analyze its performance compared to the fully manual approach.

– Write a final report.

Student profile

– Affinity with deep learning / math / image processing / signal processing

– Able to program in one or more languages (Python / C++ / Matlab)

– Creative, enthusiastic, communicative

 

About Thermo Fisher Scientific

Thermo Fisher Scientific is the world leader in serving science, with revenues of more than $20 billion and approximately 70,000 employees globally. Our mission is to enable our customers to make the world healthier, cleaner and safer. We help our customers accelerate life sciences research, solve complex analytical challenges, improve patient diagnostics, deliver medicines to market and increase laboratory productivity. Through our premier brands – Thermo Scientific, Applied Biosystems, Invitrogen, Fisher Scientific and Unity Lab Services – we offer an unmatched combination of innovative technologies, purchasing convenience and comprehensive services.

Thermo Fisher Scientific Contact

More information on possibilities for projects can be obtained from:

Dr. Ir. Erik Franken

Email: erik.franken <at> thermofisher.com

Dr. Yuchen Deng

Email: yuchen.deng <at> thermofisher.com

Transfer learning from non-medical datasets

Transfer learning has recently become a popular technique for training machine learning algorithms. The goal is to transfer information from dataset A (the source) to dataset B (the target). This increases the total amount of data the classifier learns from, leading to a more robust algorithm. This is especially important for medical imaging datasets, which can contain only a few hundred images. A recent development is that it is possible to transfer information from non-medical datasets: for example, dataset A can be a collection of cat and dog pictures, and dataset B a set of chest CT scans.
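In practice, such a transfer is often implemented by fine-tuning: a network pretrained on the source dataset is reused as the starting point for the target task. A minimal PyTorch/torchvision sketch, assuming an ImageNet-pretrained ResNet-18 and a hypothetical two-class medical target task:

```python
import torch.nn as nn
from torchvision import models

# Source knowledge: ImageNet-pretrained weights shipped with torchvision.
model = models.resnet18(pretrained=True)

# Optionally freeze the pretrained feature extractor ...
for param in model.parameters():
    param.requires_grad = False

# ... and replace the final layer so that only it is trained on the
# (hypothetical) two-class medical target dataset.
model.fc = nn.Linear(model.fc.in_features, 2)

# From here, train model.fc (or unfreeze everything for full fine-tuning)
# with a standard cross-entropy training loop on the target images.
```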

A few researchers have compared non-medical and medical datasets as the source data. Some have found non-medical data to be better, while others achieved superior results with medical data. However, these comparisons differ in the datasets used, the dataset sizes, the network architectures, and so forth. More systematic comparisons are therefore needed.

In this project you will train neural networks on various non-medical and medical public datasets, and apply them to medical target data. Possible directions include varying the dataset size, the number of classes and other parameters. Another direction is combining multiple source datasets in a neural network ensemble (see the sketch below). The goal is to provide advice on what kind of considerations should be made when choosing a source dataset.
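The ensemble direction could, for example, average the predictions of several networks, each fine-tuned from a different source dataset. A small sketch under that assumption (the `models` list is assumed to hold already trained PyTorch networks):

```python
import torch

def ensemble_predict(models, x):
    """Average the softmax outputs of networks fine-tuned from different sources."""
    with torch.no_grad():
        probs = [torch.softmax(m(x), dim=1) for m in models]
    return torch.stack(probs).mean(dim=0)   # shape: (batch, n_classes)
```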

Some experience with machine learning is required, experience with Python is preferred. Experience with medical imaging is preferred but not required.

Supervisor: Dr. Veronika Cheplygina (v.cheplygina at tue.nl)

References:

Cheplygina, V. (2018). Cats or CAT scans: transfer learning from natural or medical image source datasets? arXiv preprint arXiv:1810.05444.

Menegola, A., Fornaciali, M., Pires, R., Bittencourt, F. V., Avila, S., & Valle, E. (2017, April). Knowledge transfer for melanoma screening with deep learning. In Biomedical Imaging (ISBI 2017), 2017 IEEE 14th International Symposium on (pp. 297-300). IEEE.

Zhang, R., Zheng, Y., Mak, T. W. C., Yu, R., Wong, S. H., Lau, J. Y., & Poon, C. C. (2017). Automatic Detection and Classification of Colorectal Polyps by Transferring Low-Level CNN Features From Nonmedical Domain. IEEE J. Biomedical and Health Informatics, 21(1), 41-47.

Combining relative assessments for melanoma classification

The project addresses melanoma classification in skin lesion images. Typically machine learning algorithms for this application would learn from images which have been labeled as melanoma or not. A less explored option is to learn from relative assessments of images, for example, whether images are similar to each other or not. Such assessments can be used to learn a good representation of the images, and the representation can then be further trained with a traditional dataset. An advantage of this method is that relative assessments may be more intuitive to provide than diagnostic labels, which could allow a large number of assessments to be collected via crowdsourcing.
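One established way to learn a representation from such relative assessments is a triplet loss, which pulls images judged similar together in feature space and pushes dissimilar ones apart (in the spirit of Schultz & Joachims, 2004). A minimal PyTorch sketch with a hypothetical embedding network and dummy data:

```python
import torch
import torch.nn as nn

# Hypothetical embedding network: image -> 32-dimensional feature vector.
embed = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
    nn.Flatten(), nn.Linear(16, 32),
)

# One relative assessment: the anchor image is more similar to `positive`
# than to `negative` (dummy tensors stand in for real lesion images).
anchor, positive, negative = (torch.randn(4, 3, 64, 64) for _ in range(3))

loss = nn.TripletMarginLoss(margin=1.0)(embed(anchor), embed(positive), embed(negative))
loss.backward()   # the learned embedding can later be fine-tuned with diagnostic labels
```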

In this project you will develop a deep learning algorithm which uses two types of input: melanoma labels provided by experts, and relative assessments provided by the crowd. The relative assessments were collected as part of the MelaGo project at TU Eindhoven, where participants could rate images via a (gamified) app. One of the goals is therefore also to investigate how gamification affected the quality of the assessments. Another goal is to investigate how to best combine the assessments, and whether filtering annotators by quality can improve the results.
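As a simple illustration of the annotator-filtering idea, one could estimate each annotator's quality by their agreement with the majority vote and discard unreliable ones. The votes, threshold, and quality measure below are all hypothetical:

```python
import numpy as np

# Hypothetical similarity votes: rows = annotators, columns = image pairs,
# entries in {0, 1} (1 = "these two images look similar").
votes = np.array([
    [1, 0, 1, 1, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 1],   # an annotator who often disagrees with the rest
])

majority = (votes.mean(axis=0) >= 0.5).astype(int)

# Estimate each annotator's quality as agreement with the majority vote ...
quality = (votes == majority).mean(axis=1)

# ... and recompute the consensus using only annotators above a threshold.
reliable = votes[quality >= 0.6]
consensus = (reliable.mean(axis=0) >= 0.5).astype(int)
```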

 

Some experience with machine learning is required, experience with Python is preferred.

Supervisor: Dr. Veronika Cheplygina (v.cheplygina at tue.nl)

References

Khan, V.-J., Meijer, P., Paludetti, M., Magyari, R., van Berkum, D., & Cheplygina, V. (2018). MelaGo: Gamifying medical image annotation.

Schultz, M., & Joachims, T. (2004). Learning a distance metric from relative comparisons. In Advances in neural information processing systems (pp. 41-48).

Ørting, S. N., Cheplygina, V., Petersen, J., Thomsen, L. H., Wille, M. M., & de Bruijne, M. (2017). Crowdsourced emphysema assessment. In Intravascular Imaging and Computer Assisted Stenting, and Large-Scale Annotation of Biomedical Data and Expert Label Synthesis (pp. 126-135).

Meta-learning for medical image segmentation

Imagine you have experience with two segmentation applications (for example, tissue segmentation in brain MRI, and cell segmentation in histopathology), and you know that different (deep) learning methods work best in each application. Can you decide which method to use on a third application, for example segmentation of vessels in retinal images, without trying all the possibilities first?

In machine learning this idea of predicting which methods will perform better on a given dataset is called “meta-learning”. This can be done by characterizing each (dataset, method) pair with several “meta-features”, which describe the data (for example, the number of images) and the method (for example, how many layers a neural network has). The label of this pair is the performance of the method on the dataset. This way, a meta-classifier can learn what type of data and classifiers perform well together.
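Concretely, the meta-dataset is just a table with one row per (dataset, method) pair, so any standard regressor can serve as the meta-learner. An illustrative scikit-learn sketch; the meta-features and performance numbers are made-up placeholders:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# One row per (dataset, method) pair. Illustrative meta-feature columns:
# [n_images, n_classes, image_size, n_layers_of_the_network]
X_meta = np.array([
    [500,  2, 128, 18],
    [500,  2, 128, 50],
    [2000, 5, 256, 18],
    [2000, 5, 256, 50],
])
y_meta = np.array([0.81, 0.78, 0.90, 0.93])  # observed performance (e.g., AUC)

meta_learner = RandomForestRegressor(n_estimators=100).fit(X_meta, y_meta)

# Predict how well a 50-layer network would do on a new, unseen dataset.
print(meta_learner.predict([[800, 3, 192, 50]]))
```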

An important open question is how to choose the meta-features for this problem. In this MSc project, you will investigate how to adapt meta-learning features from the literature to medical imaging problems, and engineer features specialized for medical imaging that might not be applicable to other types of data. You will work on a set of publicly available medical imaging datasets, and implement your methods in the OpenML platform.

Some experience with machine learning is required, experience with Python is preferred. Experience with medical imaging is preferred but not required.

Supervisors: Dr. Veronika Cheplygina and Dr. Joaquin Vanschoren (Data Mining, Department of Computer Science)

Contact: v.cheplygina at tue.nl

References

Cheplygina, V., Moeskops, P., Veta, M., Dashtbozorg, B., & Pluim, J. P. W. (2017). Exploring the similarity of medical imaging classification problems. In Intravascular Imaging and Computer Assisted Stenting, and Large-Scale Annotation of Biomedical Data and Expert Label Synthesis (pp. 59-66). Springer.

Vanschoren, J., Blockeel, H., Pfahringer, B., & Holmes, G. (2012). Experiment databases. Machine Learning, 87(2), 127-158.

 

Weakly supervised learning in medical imaging (various projects)

Data is often only weakly annotated: for example, for a medical image, we might know the patient’s overall diagnosis, but not where the abnormalities are located, because obtaining ground-truth annotations is very time-consuming. Multiple instance learning (MIL) is an extension of supervised machine learning aimed at dealing with such weakly labeled data. For example, a classifier trained on healthy and abnormal images would be able to label both a previously unseen image AND local patches in that image.
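A standard instance-level MIL baseline scores every patch and aggregates the scores with a permutation-invariant pooling such as max, so that only the weak image-level label is needed for training. A minimal PyTorch sketch with a hypothetical patch network:

```python
import torch
import torch.nn as nn

# Hypothetical patch scorer: one abnormality logit per patch.
patch_net = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
    nn.Flatten(), nn.Linear(16, 1),
)

# A "bag": all patches extracted from one image (here 20 dummy 32x32 patches).
patches = torch.randn(20, 1, 32, 32)
patch_logits = patch_net(patches).squeeze(1)   # instance-level predictions

# Max pooling: the image is as abnormal as its most suspicious patch, so the
# loss needs only the weak image-level label, yet patch_logits localize findings.
bag_logit = patch_logits.max()
bag_label = torch.tensor(1.0)                  # 1 = abnormal image
loss = nn.functional.binary_cross_entropy_with_logits(bag_logit, bag_label)
loss.backward()
```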

Figure 1: Supervised learning and multiple instance learning, shown for the task of detecting abnormalities in chest CT images. Images from Cheplygina, V., de Bruijne, M., & Pluim, J. P. W. (2018). Not-so-supervised: a survey of semi-supervised, multi-instance, and transfer learning in medical image analysis. arXiv preprint arXiv:1804.06353.

There are still a number of open research directions. For example,

  • How can we evaluate the patch-level predictions without ground-truth labels?
  • Could we improve MIL algorithms by asking experts only a few questions, where they verify the algorithm’s decisions?
  • What can we learn about MIL in medical imaging from other applications where it has been applied?

As an MSc student you would choose one or more medical imaging applications you are interested in, using an open dataset or a dataset available through collaborators, and work with us on formulating your own research question. Participating in a machine learning competition, creating open-source tools and/or writing a paper for a scientific conference are also encouraged.

Some experience with machine learning is required (for example 8DC00 if you are a TU/e student). Experience with Python is preferred but experience with another programming language and willingness to learn Python is also sufficient.

Supervisor TU/e: Dr. Veronika Cheplygina (v.cheplygina at tue.nl)

 

Liver Cancer Recurrence Prediction

The only potentially curative option for patients with colorectal liver metastases (CRLM) or hepatocellular carcinoma (HCC) is surgical resection. However, 80–85% of these patients are not eligible for liver surgery because of extensive intrahepatic metastatic lesions or the presence of extrahepatic disease. Neoadjuvant chemotherapy (NAC) is increasingly applied with the aim of downsizing tumors in patients with initially unresectable disease to attain a resectable situation.

Accurate imaging of the liver following neoadjuvant chemotherapy is crucial for optimal selection of patients eligible for surgical resection and preparation of a surgical plan. MRI is the most appropriate imaging modality for preoperative assessment of patients with CRLM or HCC.

However, NAC may impair lesion detection and lead to underestimation of lesion size. As a result, patients whose tumors were considered resectable on preoperative imaging may turn out to have unresectable tumors during surgery. Alternatively, the underestimation may result in insufficient resection, with positive margins and re-excisions.

The incidence of recurrence after liver resection is very high. In different series, between 43% and 65% of patients had recurrences within 2 years of removal of the first tumor, and up to 85% within 5 years. Without any form of treatment, most patients with recurrent cancer die within one year.

Following surgical treatment, doctors will frequently use MRI to check for residual tumors and assess the risk that the cancer will come back (recur), in order to decide whether the patient should be offered additional treatment (adjuvant therapy) or a repeat hepatectomy.

Project goal

The aim of this study is to design and develop a deep learning-based algorithm to predict five-year liver cancer recurrence from series of liver MRI exams. Patients undergo serial liver MRI exams: a pre-treatment baseline MRI, follow-up MRI exams during the course of therapy or surgery, and a final MRI after completing the therapy protocol.
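One plausible (but by no means prescribed) architecture for such serial exams is a shared CNN encoder applied to each exam, followed by a recurrent network over the time points. The sketch below uses 2D inputs and invented sizes purely for illustration:

```python
import torch
import torch.nn as nn

class SerialMRINet(nn.Module):
    """Hypothetical sketch: shared CNN encoder per exam + LSTM over the series."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(            # applied to every exam/slice
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.temporal = nn.LSTM(16, 32, batch_first=True)
        self.head = nn.Linear(32, 1)             # logit for 5-year recurrence

    def forward(self, exams):                    # exams: (batch, time, 1, H, W)
        b, t = exams.shape[:2]
        feats = self.encoder(exams.flatten(0, 1)).view(b, t, -1)
        _, (h, _) = self.temporal(feats)
        return self.head(h[-1])                  # last hidden state -> prediction

pred = SerialMRINet()(torch.randn(2, 3, 1, 64, 64))   # 2 patients, 3 exams each
```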

Prerequisites

  • Enthusiastic Master's student in electrical engineering, biomedical engineering, computer science, or a related field
  • Interest in machine learning and deep learning
  • Understanding of basic machine learning concepts and image analysis
  • Programming experience in MATLAB and Python
  • A good team player with excellent communication skills
  • A creative solution-finder

Duration: 9 months (BME or ME or MWT)

Start date: a.s.a.p.

Collaboration: Netherlands Cancer Institute (NKI)

Location: TU/e (Eindhoven) and NKI (Amsterdam)

Contact: For project details, please contact Dr. Behdad Dasht Bozorg, email: B.Dasht.Bozorg@nki.nl

Real-time Multimodal Image Registration

Multimodal imaging is increasingly being used within healthcare for diagnosis, treatment planning, treatment guidance, biopsy, surgical navigation, and monitoring of disease progression.

Multimodality imaging takes advantage of the strengths of different imaging modalities to provide a more complete picture of the anatomy under investigation. The goal of this study is to develop a method for real-time registration of MRI and ultrasound images.

MRI is used widely for both diagnostic and therapeutic planning applications because of its multi-planar imaging capability, high signal to noise ratio, and sensitivity to subtle changes in soft tissue morphology and function. Ultrasound imaging, on the other hand, has important advantages including high temporal resolution, high sensitivity to acoustic scatterers such as calcifications and gas bubbles, excellent visualization and measurement of blood flow, low cost, and portability. The strengths of these modalities are complementary, and the two are combined regularly (though separately) in clinical practice. The benefits of combining these modalities through image registration have been shown for intra-operative surgical applications and breast/prostate biopsy guidance.

Image registration is the process of transforming different modalities into the same reference frame to obtain as comprehensive a picture of the underlying structure as possible. While MRI is typically a pre-operative imaging technique, ultrasound can easily be performed live during surgery.
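Deep learning-based registration methods, for example in the style of VoxelMorph, typically let a network predict a displacement field, warp the moving image with a differentiable resampler, and minimize an image-similarity loss. A much-simplified 2D PyTorch sketch; the shapes are invented, and the MSE similarity is only a stand-in (multimodal MRI-ultrasound registration usually needs a measure such as mutual information):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical network: (fixed, moving) image pair -> dense 2D displacement field.
reg_net = nn.Sequential(
    nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 2, 3, padding=1),              # 2 channels: (dx, dy) per pixel
)

fixed = torch.randn(1, 1, 64, 64)                # e.g., an MRI slice
moving = torch.randn(1, 1, 64, 64)               # e.g., an ultrasound slice

flow = reg_net(torch.cat([fixed, moving], dim=1))

# Sampling grid = identity grid + predicted displacement (normalized coords).
ys, xs = torch.meshgrid(torch.linspace(-1, 1, 64), torch.linspace(-1, 1, 64))
identity = torch.stack([xs, ys], dim=-1).unsqueeze(0)    # (1, H, W, 2)
grid = identity + flow.permute(0, 2, 3, 1)

# Differentiable warp of the moving image, then a simple similarity loss.
warped = F.grid_sample(moving, grid, align_corners=True)
loss = F.mse_loss(warped, fixed)
loss.backward()
```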

Project goal

The aim of this study is to design and develop a deep learning-based method for the registration of multimodal images (MRI and ultrasound).

1st phase: Using the built-in multi-modality image fusion feature of ultrasound machines on phantoms

2nd phase: Error estimation in the multimodal registration application using a CNN

Prerequisites

  • Enthusiastic Master's student in electrical engineering, biomedical engineering, computer science, or a related field
  • Interest in machine learning and deep learning
  • Understanding of basic machine learning concepts, image analysis and signal processing
  • Programming experience in MATLAB and Python
  • A good team player with excellent communication skills
  • A creative solution-finder

Duration: 9 months (BME or ME or MWT)

Start date: a.s.a.p.

Collaboration: Netherlands Cancer Institute (NKI)

Location: TU/e (Eindhoven) and NKI (Amsterdam)

Contact: For project details, please contact Dr. Behdad Dasht Bozorg, email: B.Dasht.Bozorg@nki.nl

Surgical Workflow Analysis

Minimally invasive surgery, using cameras to observe the internal anatomy, is the preferred approach for many surgical procedures. Furthermore, other surgical disciplines rely on microscopic images. As a result, endoscopic and microscopic image processing, as well as surgical vision, are evolving as techniques needed to facilitate computer-assisted interventions (CAI). Algorithms that have been reported for such images include 3D surface reconstruction, salient feature motion tracking, instrument detection, and activity recognition.

Analyzing the surgical workflow is a prerequisite for many applications in computer-assisted surgery (CAS), such as context-aware visualization of navigation information, specifying the most probable tool required next by the surgeon or determining the remaining duration of surgery. Since laparoscopic surgeries are performed using an endoscopic camera, a video stream is always available during surgery, making it the obvious choice as input sensor data for workflow analysis. Furthermore, integrated operating rooms are becoming more prevalent in hospitals, making it possible to access data streams from surgical devices such as cameras, thermoflator, lights, etc. during surgeries.

This project focuses on the online workflow analysis of laparoscopic surgeries. The main goal is to segment surgeries into surgical phases based on the video.
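A common baseline for video-based phase recognition is a frame-wise CNN feature extractor followed by a temporal model that outputs one phase label per frame. A hypothetical PyTorch sketch (the number of phases is only an example; the Cholec80 benchmark, for instance, uses 7):

```python
import torch
import torch.nn as nn

class PhaseNet(nn.Module):
    """Hypothetical sketch: per-frame CNN features + GRU -> phase per frame."""
    def __init__(self, n_phases=7):
        super().__init__()
        self.frame_encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.temporal = nn.GRU(16, 32, batch_first=True)   # causal: suits online use
        self.head = nn.Linear(32, n_phases)

    def forward(self, video):                    # video: (batch, time, 3, H, W)
        b, t = video.shape[:2]
        feats = self.frame_encoder(video.flatten(0, 1)).view(b, t, -1)
        out, _ = self.temporal(feats)
        return self.head(out)                    # (batch, time, n_phases)

logits = PhaseNet()(torch.randn(1, 8, 3, 64, 64))   # 8 dummy video frames
```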

 

Project Phases

  • Designing and developing deep architectures for surgical tool detection and for segmenting colorectal surgeries into surgical phases based on the video input (public dataset)
  • Aiming to outperform the winners of the Endoscopic Vision Challenge at MICCAI 2018
  • Applying the developed technique to prostatectomy (in-house dataset)
  • Detecting deviations from normal habit patterns during surgery
  • Participating in the Endoscopic Vision Challenge at MICCAI 2019

Prerequisites

  • Enthusiastic Master's student in electrical engineering, biomedical engineering, computer science, or a related field
  • Interest in machine learning and deep learning
  • Understanding of basic machine learning concepts, image analysis and signal processing
  • Programming experience in MATLAB and Python
  • A good team player with excellent communication skills
  • A creative solution-finder

Duration: 9 months (BME or ME or MWT)

Start date: a.s.a.p.

Collaboration: Netherlands Cancer Institute (NKI)

Location: TU/e (Eindhoven) and NKI (Amsterdam)

Contact: For project details, please contact Dr. Behdad Dasht Bozorg, email: B.Dasht.Bozorg@nki.nl