Book Review: Deep Learning for the Life Sciences

In this blog post I review a book that I recently went through: “Deep Learning for the Life Sciences: Applying Deep Learning to Genomics, Microscopy, Drug Discovery, and More”. Before getting started, I would like to emphasise that it is one of the few books that talk about the application of Deep Learning in the Life Sciences. This is despite the fact that Deep Learning has tremendous applications in this field, some of which have the potential to change the world, from diagnosing diseases directly from raw X-ray or MRI scans to suggesting new antibiotics. All of this can be made possible by the use of Machine Learning and Deep Learning in the Life Sciences.

Deep Learning for the Life Sciences: Book Cover

Summary

Chapters 1-2

The authors start the book with a brief introduction to the Life Sciences in Chapter 1, followed by an overview of the book itself in the same chapter. Chapter 2 then covers the basics of Machine Learning and Deep Learning.

Chapters 3-4

Chapter 3 introduces readers to DeepChem, a Python library built on top of TensorFlow to facilitate the use of Deep Learning in the Life Sciences. It contains many datasets, objects and functions that prove useful for this field of study. The chapter ends by creating a model that recognises handwritten digits from the MNIST dataset using DeepChem.

Performing Machine Learning on molecular data is the subject of Chapter 4. The chapter explains how molecules can be specified as text strings using SMILES (Simplified Molecular-Input Line-Entry System). It also contains an introduction to RDKit, an open-source cheminformatics package. Later, it discusses MoleculeNet, a large collection of datasets that can be used for Molecular Machine Learning. The chapter ends with SMARTS strings, which can be considered an extension of the SMILES language with the added feature that they can be used to create queries.
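To give a feel for SMILES and SMARTS, here is a minimal sketch of my own, not code from the book. It assumes RDKit is installed, and the molecules and the hydroxyl pattern are just illustrative choices.

```python
from rdkit import Chem

# Two molecules written as SMILES strings (illustrative choices, not from the book).
ethanol = Chem.MolFromSmiles("CCO")
benzene = Chem.MolFromSmiles("c1ccccc1")

# A SMARTS pattern works as a query: a carbon bonded to a hydroxyl oxygen.
hydroxyl_on_carbon = Chem.MolFromSmarts("[#6][OX2H]")


def has_pattern(mol, pattern):
    """Return True if the molecule contains the substructure described by the pattern."""
    return mol.HasSubstructMatch(pattern)


# GetNumAtoms() counts heavy atoms only, since hydrogens are implicit by default.
print(ethanol.GetNumAtoms())
print(has_pattern(ethanol, hydroxyl_on_carbon))
print(has_pattern(benzene, hydroxyl_on_carbon))
```

The SMILES string describes one specific molecule, while the SMARTS string describes a pattern that many molecules may or may not contain.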

Chapters 5-6

The next chapter, Chapter 5, talks about Biophysical Machine Learning. Its main focus is using Machine Learning to predict which molecules will interact with a given protein. The chapter starts by talking about proteins and their chemistry. This is followed by the PDBBind case study (PDB stands for Protein Data Bank), for which we then create and train a model.

Applications of Deep Learning in genomics are covered in Chapter 6. The chapter starts by talking about DNA, RNA and proteins, then moves on to TF (Transcription Factor) binding. It ends by creating a convolutional model for TF binding.
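A convolutional model cannot consume a DNA string directly, so the sequence is usually turned into numbers first. One common way to do this is one-hot encoding. The sketch below is my own illustration in plain Python. The function name, the base order and the handling of unknown bases are assumptions, and the book's own featurization may differ.

```python
BASES = "ACGT"  # column order is an assumption of this sketch


def one_hot_encode(sequence):
    """Return one row per base, with a 1.0 in the column for that base."""
    encoded = []
    for base in sequence.upper():
        # Unknown bases such as N become an all-zero row (a choice made for this sketch).
        encoded.append([1.0 if base == b else 0.0 for b in BASES])
    return encoded


for row in one_hot_encode("ACGTN"):
    print(row)
```

Each sequence becomes a length-by-4 grid, which a 1D convolution can then scan for short motifs, in much the same way an image model scans for local patterns.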

Chapter 7

Chapter 7 is Machine Learning for Microscopy. This chapter deals with understanding the biological content of a microscopic image. This includes tasks such as identifying different types of cells or counting the number of cells in an image. The chapter starts by talking about microscopy and related concepts such as the diffraction limit. Along with that, techniques such as electron microscopy and fluorescence microscopy are discussed. This is followed by training a model for cell counting (counting the number of cells in an image). A model for cell segmentation (marking where cells appear and where background appears, a form of semantic segmentation for cells) is trained as well. The cell segmentation example briefly talks about the U-Net architecture, a well-known neural network architecture for biomedical image segmentation.

Chapter 8

Deep Learning for Medicine is the topic of Chapter 8. The chapter starts by talking about computational techniques previously used in medicine, such as expert systems. Next, it discusses deep radiology, which refers to applying Convolutional Neural Networks to radiology scans such as X-rays, CT scans and MRI scans in order to diagnose diseases. The chapter ends by creating and training a model for Diabetic Retinopathy.

Chapter 9

Chapter 9 discusses Generative Models in the Life Sciences. Generative Models are quite different from the other models that the authors talk about in this book. Instead of taking an input and producing an output as a function of that input, Generative Models take random (Gaussian) noise as input and produce a new sample as output. This is important because it lets them do ‘creative’ things, such as suggesting new antibiotics for treating a particular bacterial infection. The chapter starts by talking about Variational Autoencoders and GANs (Generative Adversarial Networks). Following that, we train a Variational Autoencoder to generate new molecules. The structures of the molecules it outputs are given as SMILES strings.
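The difference between the two kinds of model can be sketched in a few lines of plain Python. This is a toy of my own, not the book's code. Both `predictive_model` and `toy_decoder` are hypothetical stand-ins with made-up arithmetic, where a real system would use trained networks.

```python
import random


def predictive_model(x):
    """Toy stand-in for an ordinary model: the output is a fixed function of the input."""
    return 2.0 * x + 1.0


def sample_latent(dim, rng):
    """Draw a latent vector of independent standard Gaussian noise."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def toy_decoder(z):
    """Hypothetical stand-in for a trained decoder that maps noise to a sample."""
    return [0.5 * value + 1.0 for value in z]


rng = random.Random(0)  # seeded so the sketch is reproducible
sample_a = toy_decoder(sample_latent(4, rng))
sample_b = toy_decoder(sample_latent(4, rng))

print(predictive_model(3.0))  # same input always gives the same output
print(sample_a)
print(sample_b)  # a fresh noise draw gives a different sample
```

In the book's molecule example, the decoder's output would be decoded into a SMILES string rather than a list of numbers.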

Chapters 10-12

Deep Learning models are often described as black-box models because they are hard to explain and interpret. Chapter 10 deals with the interpretability and explainability of these models. The next chapter, Chapter 11, is a Virtual Screening Workflow example. Chapter 12 talks about the prospects and perspectives of applications of Deep Learning in the Life Sciences.

Review

Overall, I find this book very well written. I would recommend it to everyone who wants to study the application of Deep Learning in the Life Sciences. I would also recommend it to every Deep Learning enthusiast, because it lets you apply your knowledge of Deep Learning to a new field.

Most people in the field of Deep Learning or Machine Learning are not specialists in biology, and the same applies to me. Before getting started with this book, I was worried about whether I would be able to understand its content. Looking back, there was hardly any point where it mattered. Most of the book is written at an abstract level, so a good understanding of biology is not required. At the same time, the authors have made sure to explain the basics of each concept at the start of each chapter. For these two reasons, I find that the book is understandable to Deep Learning professionals, even when they don’t have a strong foundation in biology. If, however, you feel like studying this subject further, you will need to strengthen your foundation in biology.