World Cancer Day AMA 2022 – You ask, we answer!

Cover of the Ask Us Anything session we organized with our BraTS challenge team to answer all questions about our deep learning models

World Cancer Day took place recently, and at Graylight Imaging we would like to share how we contribute to the fight against cancer, especially with deep learning models.

Although we are Graylight Imaging today, before our rebranding we participated as Future Processing Healthcare in the 10th Brain Tumor Segmentation 2021 (BraTS) challenge, jointly organized by the Radiological Society of North America (RSNA), the American Society of Neuroradiology (ASNR), and the Medical Image Computing and Computer Assisted Interventions (MICCAI) society. Our team (Szymon Adamski, Krzysztof Kotowski, Bartosz Machura, and Jakub Nalepa) took sixth place in this prestigious challenge. The results of the competition were presented at the American Congress of Radiology 2021.

We organized an Ask Us Anything session (#ama) with our BraTS challenge team to answer your questions about the AI models for tumor segmentation they developed, the frameworks they used, the ups and downs they experienced, and anything else related to the Brain Tumor Segmentation 2021 (BraTS) challenge and our algorithm.

Below you will find the questions you sent us and the answers from our BraTS team.

Q1 

What preprocessing steps do you recommend for MRI images in the context of Deep Learning segmentation algorithms?  

We should strive to make our techniques independent of the data source (scanner) and the acquisition procedure. Data standardization (e.g., z-scoring within each MRI sequence) is always an important step to consider.
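To make the z-scoring idea concrete, here is a minimal sketch using NumPy. It is an illustration only, not our BraTS pipeline: the function name `zscore_normalize` is hypothetical, and it assumes that background voxels are exactly zero (e.g., after skull stripping), so that the statistics are computed over brain voxels only.

```python
import numpy as np


def zscore_normalize(volume, mask=None, eps=1e-8):
    """Z-score a single MRI sequence using statistics from foreground voxels only."""
    volume = volume.astype(np.float32)
    if mask is None:
        # Assumption: background voxels are exactly zero (e.g., skull-stripped data).
        mask = volume > 0
    voxels = volume[mask]
    mean, std = voxels.mean(), voxels.std()
    normalized = np.zeros_like(volume)
    # Background stays at zero; foreground gets zero mean and unit variance.
    normalized[mask] = (volume[mask] - mean) / (std + eps)
    return normalized


# Toy volume standing in for one sequence (e.g., T1) of one patient.
rng = np.random.default_rng(0)
t1 = rng.uniform(100, 1000, size=(8, 8, 8))
t1[:2] = 0  # fake background slab
t1_norm = zscore_normalize(t1)
```

Each sequence (T1, T2, FLAIR, and so on) would be normalized separately, since their intensity ranges are not comparable.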

Q2 

Could you give more details about the quick-start modules that Graylight uses and which steps they speed up? 

Over the years, we have developed a battery of techniques and tools that let us tackle medical image analysis tasks fairly quickly (they include deep learning models, pre/postprocessing routines, data handling tools and so forth). Our custom toolkit can be used not only for rapid prototyping and image processing, but also for implementing ML-powered software.

Q3 

Recommended hyperparameter search strategy?  

There are quite a number of them – nnU-Nets do a wonderful job of fine-tuning the hyperparameters, but we have also used particle swarm optimization in the past (you can have a look at our paper on that: https://dl.acm.org/doi/10.1145/3071178.3071208). Note, however, that deep learning models are heavily parameterized, hence traversing the entire search space is infeasible in practice – we build upon our previous experience in deep learning and image analysis here.
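For readers unfamiliar with particle swarm optimization, the sketch below shows a generic textbook version in NumPy. It is not the variant from the linked paper, and everything in it is an assumption made for illustration: the function names, the coefficient values, and above all `toy_validation_loss`, which stands in for the expensive step of training a model and measuring its validation loss.

```python
import numpy as np


def particle_swarm_minimize(objective, bounds, n_particles=20, n_iters=100,
                            inertia=0.7, cognitive=1.5, social=1.5, seed=0):
    """Generic PSO: each particle is one candidate hyperparameter vector."""
    rng = np.random.default_rng(seed)
    low, high = np.array(bounds, dtype=float).T
    dim = low.size
    pos = rng.uniform(low, high, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    best_pos = pos.copy()                       # best position seen by each particle
    best_val = np.array([objective(p) for p in pos])
    g = best_val.argmin()
    global_pos, global_val = best_pos[g].copy(), best_val[g]

    for _ in range(n_iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Pull each particle toward its own best and toward the swarm's best.
        vel = (inertia * vel
               + cognitive * r1 * (best_pos - pos)
               + social * r2 * (global_pos - pos))
        pos = np.clip(pos + vel, low, high)     # stay inside the search bounds
        vals = np.array([objective(p) for p in pos])
        improved = vals < best_val
        best_pos[improved] = pos[improved]
        best_val[improved] = vals[improved]
        g = best_val.argmin()
        if best_val[g] < global_val:
            global_pos, global_val = best_pos[g].copy(), best_val[g]
    return global_pos, global_val


def toy_validation_loss(params):
    """Hypothetical stand-in for 'train a model, return validation loss'."""
    log_lr, dropout = params
    return (log_lr + 3.0) ** 2 + (dropout - 0.2) ** 2


best_params, best_loss = particle_swarm_minimize(
    toy_validation_loss, bounds=[(-6.0, -1.0), (0.0, 0.5)]
)
```

In a real setting every call to the objective is a full training run, which is why the swarm size and the number of iterations have to stay small.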

Q4 

Recommended logger (WANDB, MLFlow, …)?  

We always want to maintain full reproducibility of our experimentation, and we can recommend MLFlow for that. 

Q5 

Which tools do you recommend for deep learning models deployment?  

Although it depends on the toolset you are comfortable with, we use Python with standard deep learning packages, such as PyTorch and TensorFlow. We have been using the C++ toolchain as well.

Q6 

Have you looked at using the algorithm for other lesions that can be found on MRI images in the brain?  

We have – in our recent paper, we looked at pediatric optic pathway gliomas and segmented them in a fairly similar way. You may have a look at our paper here: https://www.sciencedirect.com/science/article/pii/S0010482522000294.

Q7 

How do deep learning models deal with a small database to achieve high accuracy and generalization? How do you train an algorithm with limited data?

It was actually not the case in BraTS, as the organizers did a wonderful job of collecting a huge, heterogeneous and high-quality MRI dataset. However, if we had to tackle the issue of limited data, there are several well-established options, including data augmentation, transfer learning and pre-training, semi-supervised learning, and others. You can see our takes on that in https://www.frontiersin.org/articles/10.3389/fncom.2019.00083/full or in https://www.sciencedirect.com/science/article/pii/S0010482522000294
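As a small illustration of data augmentation for segmentation, the sketch below applies random flips and a mild intensity jitter to a single-channel 3D volume. The function name `augment` and the jitter ranges are assumptions chosen for the example, not the settings we used; whether flipping along every axis is anatomically sensible depends on the task.

```python
import numpy as np


def augment(image, mask, rng):
    """Apply the same random flips to image and mask; jitter only the image intensity."""
    for axis in range(image.ndim):
        if rng.random() < 0.5:
            # Geometric transforms must be applied to image and mask together.
            image = np.flip(image, axis=axis)
            mask = np.flip(mask, axis=axis)
    scale = rng.uniform(0.9, 1.1)    # assumed multiplicative intensity jitter
    shift = rng.uniform(-0.1, 0.1)   # assumed additive jitter (for normalized data)
    return image * scale + shift, mask.copy()


rng = np.random.default_rng(0)
image = rng.normal(size=(16, 16, 16))
mask = image > 1.0                   # toy "lesion" mask
aug_image, aug_mask = augment(image, mask, rng)
```

The key point is that the label mask follows every geometric change, while intensity changes touch the image only.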

Q8 

What was the key to getting this result?

We built upon the experience we have gathered over the years (it was already our 5th BraTS), and this time we tried to bridge the AI and clinical worlds by tightening the collaboration with our clinical partner to look at the segmentation problem from a slightly different angle. This helped us automate some post-processing steps that made the difference. Also, we have a strong academic background – that was helpful as well (we enjoy asking ourselves difficult AI questions).

Q9 

How did you use the cross-validation data set? What parameters did you use?

We followed five-fold cross-validation with stratification based on the distribution of the enhancing tumor, peritumoral edema, and necrotic core areas. 
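One possible way to build such stratified folds is sketched below in NumPy. The exact stratification rule is an assumption made for the example: each case is labeled by whether its enhancing tumor, edema, and necrotic core volumes fall above or below the cohort median (eight strata), and cases are then dealt into folds so that every stratum is spread evenly. The function name `stratified_folds` is hypothetical.

```python
import numpy as np


def stratified_folds(subregion_volumes, n_folds=5, seed=0):
    """Map an (n_cases, 3) array of subregion volumes to a fold index per case."""
    rng = np.random.default_rng(seed)
    volumes = np.asarray(subregion_volumes, dtype=float)
    # One bit per subregion: is this case above the cohort median?
    above_median = (volumes > np.median(volumes, axis=0)).astype(int)
    strata = above_median @ (2 ** np.arange(volumes.shape[1]))
    folds = np.empty(len(volumes), dtype=int)
    next_fold = 0
    for stratum in np.unique(strata):
        members = rng.permutation(np.flatnonzero(strata == stratum))
        for case in members:
            # Deal cases round-robin so each stratum is spread across the folds.
            folds[case] = next_fold % n_folds
            next_fold += 1
    return folds


# Toy cohort: 100 cases with three subregion volumes each.
rng = np.random.default_rng(1)
toy_volumes = rng.lognormal(mean=2.0, sigma=1.0, size=(100, 3))
folds = stratified_folds(toy_volumes)
fold_sizes = np.bincount(folds, minlength=5)
```

Fold `k` then serves as the validation set for the `k`-th model, with the remaining four folds used for training.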

Q10 

What are the hot topics of Deep Learning for Medical Image Analysis in 2022?  

That is a huge question indeed! Applying AI in medicine is not the future – it is happening here and now, but there are still open issues that need to be resolved. Several topics have been hot for quite a while already: proving the robustness of AI solutions for heterogeneous image data (captured by different scanners around the globe), training from distributed data (e.g., within the framework of federated learning), ensuring the reproducibility of emerging algorithms (not trivial if the training/test data cannot be easily shared), and benefiting from multi-modal data (e.g., images and clinical data). Building end-to-end pipelines is an important research area as well, e.g., coupling deep learning models for segmentation with radiomics for extracting quantifiable and mineable tissue characteristics, ultimately to improve patient care and to design personalized treatment.

Q11 

What is best for image processing – Matlab, R or Python?  

We believe there is no one “best” tool, programming language or library for image processing (although we mostly use Python). The most important thing is to design and verify your solution properly. The programming language is like a brush in the hand of a painter – you can pick the one you are most comfortable with (although there may be some additional constraints to consider, such as the maturity of the toolchain, its parallel computation capabilities, and so forth).

Q12 

Are the data suitable to answer the clinical question—that is, do they capture the relevant real-world heterogeneity (bias), and are they of sufficient detail and quality?  

How to make an algorithm unbiased?  

We strive to make our techniques independent of the data source (scanner) and the acquisition procedure. We always try to include all races, ages and genders in our datasets to reflect the diversity of the patients. Also, we always consult and validate datasets together with our clinical experts to ensure their high quality.

Q13 

What are you currently working on? Are there other algorithms in the pipeline?  

There are lots of them! Our recent work focused on segmenting optic pathway gliomas from MRI (you can see our recent paper here: https://www.sciencedirect.com/science/article/pii/S0010482522000294), but we are currently looking at different modalities, organs, and image analysis tasks too (we have been working on cardiac imaging for quite a while, and CT is currently in the pipeline as well). For more details, you can also have a look at: https://graylight-imaging.com/research-and-development/.

Q14 

What computational and software resources are required for the task?  

We exploited Python with PyTorch, and the experiments were run on a machine with an NVIDIA Tesla V100 GPU (32 GB) and 6 Intel Xeon E5-2680 (2.50 GHz) CPUs. 

Q15 

Since you have succeeded in BraTS, are you considering applying the experience to other organs?

Certainly, we are. We are striving to exploit our experience in an array of organs and modalities – we have been looking at CT, PET, PET/CT, cardiac imaging and other lesions in MRI as well. 

Q16 

From your perspective which organ or tissue is the most difficult to apply ML algorithms to?  

We could look at this from several angles – perhaps the most difficult organs are those that constantly “move” (e.g., the gastrointestinal tract or the heart) or are super “tiny” (extremely small lesions, e.g., in the lungs), but this difficulty may also come from lacking or limited ground-truth data that could be used for building and verifying machine learning algorithms for the task, or from the “difficulty” of the imaging modality (e.g., ultrasound).

Q17 

Can your algorithm be trained to analyze X-ray and PET images?

It would depend on the actual clinical problem to solve (spotting lesions, segmenting tumorous tissue, and so forth), but fundamentally yes – we could train our model using other image modalities targeting different organs. However, some steps of our current pipeline may not be directly applicable to such new modalities/organs, as they are tailored to brain tumor MRI. 

Q18 

Beyond research only: what, from your perspective, would have to happen for the BraTS algorithm to be used in clinical practice?

The BraTS algorithm would have to undergo a certification process (CE/FDA). We did this in the past for our Sens.AI software (the glioma segmentation and analysis tool), which is currently CE-marked (you may want to have a look at https://sensai.eu/en/).

Q19 

Can your algorithm be used to track tumor progression, volume, and size?

Although we have not validated it in such longitudinal studies, we are positive that it can, as it delivers reproducible segmentations, which are pivotal for objectively monitoring the disease’s progression (if the segmentation masks were not reproducible, we would not be able to track the disease objectively, e.g., the volumetric measures of the tumors would be affected, and they would not be directly “comparable” at different time points).
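To show how a segmentation mask turns into a volumetric measure, here is a minimal NumPy sketch. The function name `tumor_volume_ml`, the toy masks, and the 1 mm isotropic voxel spacing are assumptions for illustration; in practice the spacing would be read from the image header.

```python
import numpy as np


def tumor_volume_ml(mask, voxel_spacing_mm):
    """Tumor volume in milliliters from a binary mask and the voxel spacing in mm."""
    voxel_volume_mm3 = float(np.prod(voxel_spacing_mm))
    # 1 mL = 1000 mm^3
    return int(np.count_nonzero(mask)) * voxel_volume_mm3 / 1000.0


spacing = (1.0, 1.0, 1.0)  # assumed isotropic 1 mm voxels

# Toy masks for two time points of the same (hypothetical) patient.
baseline = np.zeros((10, 10, 10), dtype=bool)
baseline[2:6, 2:6, 2:6] = True    # 4 x 4 x 4 = 64 voxels
follow_up = np.zeros((10, 10, 10), dtype=bool)
follow_up[2:7, 2:7, 2:7] = True   # 5 x 5 x 5 = 125 voxels

baseline_ml = tumor_volume_ml(baseline, spacing)    # 0.064 mL
follow_up_ml = tumor_volume_ml(follow_up, spacing)  # 0.125 mL
change_percent = 100.0 * (follow_up_ml - baseline_ml) / baseline_ml
```

Such a comparison is only meaningful if the same tumor would receive nearly the same mask on repeated runs, which is why reproducible segmentation matters here.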

Q20 

Are you planning to take part in similar challenges in the future?  

Of course, we are! Such challenges are, in our opinion, the perfect way to compare our techniques with other solutions on independent datasets, which leads to a fully unbiased validation.

Q21 

What would you have improved to take a higher position in the final ranking?  

We may not reveal that in detail… 🙂 What we can say though is that bringing clinicians (and their point of view) on board was the key. 

Q22 

How does this affect cancer treatment?  

It may help clinicians with the most time-consuming tasks that could be automated, such as tedious manual segmentation of brain tumors in 3D, so that they may focus on more challenging aspects of their work. 

See the previous post by Anna Choma: Graylight Imaging Data Scientists top ranked in RSNA-ASNR-MICCAI 2021 Brain Tumor Segmentation (BraTS) Challenge