Deep Learning Approach for Separating Fast and Slow Components


Some Background

(A slide deck for this work can be found at https://speakerdeck.com/jchin/decomposing-dynamics-from-different-time-scale-for-time-lapse-image-sequences-with-a-deep-cnn)

I left my job as a Scientific Fellow at PacBio after a nine-year venture helping to make single-molecule sequencing useful for the scientific community (see my story about the first couple of years at PacBio there). Most of my technical and scientific work had something to do with DNA sequences. While there are some exciting deep learning approaches for solving a couple of interesting problems there, I would like to explore a bit outside the DNA sequencing space.

I joined DNAnexus a while ago. The company has established itself as a leader in cloud computing platforms for processing biological and sequencing data. I thought it would be useful to demonstrate developing deep learning models on the platform for biological data other than DNA sequences. With that goal in mind, former CSO Andrew Carroll and I decided to see what we could do with some biological-imaging-related work.

While we were looking for examples, Gene Myers (yes, the one who did the first whole human genome shotgun assembly) published a tool, CSBDeep, from his lab at the Max Planck Institute of Molecular Cell Biology and Genetics (MPI-CBG) in Dresden, Germany, for creating super-resolution images from light-sheet confocal images to study biological development processes.

Example of using CSBDeep to achieve super-resolution. Left: the original images. Right: the super-resolution images generated by CSBDeep. The image is provided by the Ying Gu Lab.

Inspired by the CSBDeep paper, Andrew reached out to a former collaborator, Ying Gu, to see if she had some interesting images we could use to demonstrate CSBDeep on our platform. It was relatively easy to reproduce the CSBDeep results and apply the tool to new images with the DNAnexus cloud computing platform. Nevertheless, I wanted to see if we could do something a bit different, something new, at least to me.

A Movie Is a Bit More Fun Than Just Static Images

It turned out that Ying Gu’s research was on tracking specific proteins involved in cellulose synthesis to solve important problems in bio-energy. The images we got were time-lapse movies tracking molecules inside cells. Initially, I thought we might be able to achieve super-resolution with deep learning using multiple frames. While we had some initial success on that problem, I was “distracted” into solving a different one.

In the image stack, the slowly changing background contributes to the non-zero auto-correlation at longer timescales.

When I was looking at the time-lapse movies, it was hard not to notice that there were some background components (e.g., microtubules, the backbone of a plant cell) and different blobs or particles that were moving at different speeds. I thought it might be possible to use deep learning (as an unsupervised learning approach) to separate the background, the slow components, and the active components.

How can we separate the moving part from the static part in such time-lapse movies?

First, it is actually not too difficult to get the background image. One can just take the average or the median of each pixel across all images in the stack. To get the foreground images, we can then subtract the average background from each image. If the background is truly static, this is the easiest thing to do. Nevertheless, such an approach assumes that there is only one interesting “foreground.” In fact, the underlying biological processes may have several components at different dynamic scales, so we might be able to use a deep learning architecture to decompose those components.
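For concreteness, here is a minimal NumPy sketch of that simple background-subtraction baseline (the function name and array layout are my own illustrative choices, not code from this work):

import numpy as np

def split_static_background(stack):
    """stack: time-lapse movie as a (T, H, W) array of grayscale frames."""
    # The per-pixel median over time is robust to transient foreground blobs.
    background = np.median(stack, axis=0)
    # Subtracting the static estimate leaves only the moving components.
    foreground = stack - background[None, :, :]
    return background, foreground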

From (T − ∆t) to T: if ∆t is longer than the timescale of the typical “faster” components, then we can catch the slow components using such an autoencoder architecture.

Using Multiple Auto-encoders to Predict the Future at Different Time Scales

What is the background in such a time-lapse movie? Among deep learning neural network architectures, an auto-encoder can learn a reduced representation in a hidden layer that is sufficient to reproduce its inputs. The loss function during training is typically the L2 difference between the outputs and the inputs. If we think of the background as the invariant part of the movie, we can hope to use such an auto-encoder to learn a reduced representation that predicts the output at a later time from an input at an earlier time. The part of the images that is invariant across time points should be learnable by an auto-encoder.
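A minimal PyTorch sketch of this idea: a small convolutional auto-encoder trained with an L2 loss to predict the frame at time t from the frame at time t − ∆t. The layer sizes and names are illustrative assumptions, not the architecture from the actual experiments.

import torch
import torch.nn as nn

class TimeShiftAE(nn.Module):
    """Auto-encoder trained to predict frame(t) from frame(t - dt); it can
    only keep what persists across the time gap."""
    def __init__(self, hidden=16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, hidden, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden * 2, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(hidden * 2, hidden, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(hidden, 1, 4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = TimeShiftAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def train_step(frame_past, frame_now):
    # L2 loss between the prediction from frame (t - dt) and frame t
    opt.zero_grad()
    loss = loss_fn(model(frame_past), frame_now)
    loss.backward()
    opt.step()
    return loss.item()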

We can think of the image at a given time as something that can be reconstructed from features or components, at different scales, from earlier times. We use an auto-encoder that predicts the image at time t from the image at time (t − ∆t). If ∆t is large, then we hope the auto-encoder learns the background part. We can learn the faster parts with a smaller ∆t, and so on. For example, we can construct the image at time t as a composition of the predictions from the images at t − 8, t − 4, t − 2, and t − 1 (frames) to catch the contributions from the different time scales.
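Here is a sketch of that composition, reusing the TimeShiftAE module from the snippet above: one auto-encoder per time offset, with the target frame reconstructed as the sum of the per-offset predictions and all models trained jointly under a single L2 loss. The names and training-loop details are again my own assumptions.

deltas = [8, 4, 2, 1]                      # frame offsets, slow to fast
models = nn.ModuleList([TimeShiftAE() for _ in deltas])
opt = torch.optim.Adam(models.parameters(), lr=1e-3)

def train_step_multi(stack, t):
    """stack: (T, 1, H, W) tensor of frames; requires t >= max(deltas)."""
    opt.zero_grad()
    target = stack[t:t + 1]
    # each auto-encoder only sees the frame that is `d` steps in the past
    parts = [m(stack[t - d:t - d + 1]) for m, d in zip(models, deltas)]
    pred = torch.stack(parts).sum(dim=0)   # compose the per-time-scale parts
    loss = nn.functional.mse_loss(pred, target)
    loss.backward()
    opt.step()
    # after training, the individual `parts` are the slow/fast components
    return loss.item()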

Along this line of thought, we tested the architecture shown below for decomposing the time-lapse movies generated by Ying Gu’s group. I think we got reasonably good results.

From left to right: (a) original, (b) slow components, (c) fast components, (d) pseudo-color composition of the slow and fast components.

Other Related Works

While I think the approach we came up with is interesting and very easy to implement in PyTorch, there is certainly previous work that solves similar problems. For example, the paper “Prediction Under Uncertainty with Error-Encoding Networks” (EEN) by Mikael Henaff, Junbo Zhao, and Yann LeCun uses a feedback mechanism that encodes prediction errors into the latent space to get better prediction results.

EEN Model Architecture

We should not be surprised that video background removal is a heavily studied topic in the field of image processing. I would like to thank Earl Hubbell from Grail for pointing me to the robust PCA approach for video background removal when I presented this work in late 2018.
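For reference, robust PCA decomposes the matrix of flattened frames into a low-rank part (the background) plus a sparse part (the moving foreground). Below is a minimal NumPy sketch of the standard principal component pursuit recipe via an inexact augmented Lagrangian loop; the parameter choices follow common defaults and are not from the talk or this work.

import numpy as np

def shrink(X, tau):
    # soft-thresholding (proximal operator of the L1 norm)
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svd_shrink(X, tau):
    # singular-value thresholding (proximal operator of the nuclear norm)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(shrink(s, tau)) @ Vt

def rpca(M, n_iter=100, tol=1e-7):
    """M: (n_pixels, n_frames) matrix whose columns are flattened frames.
    Returns L (low-rank background) and S (sparse moving foreground)."""
    lam = 1.0 / np.sqrt(max(M.shape))
    mu = M.size / (4.0 * np.abs(M).sum())
    S = np.zeros_like(M)
    Y = np.zeros_like(M)
    norm_M = np.linalg.norm(M)
    for _ in range(n_iter):
        L = svd_shrink(M - S + Y / mu, 1.0 / mu)
        S = shrink(M - L + Y / mu, lam / mu)
        residual = M - L - S
        Y += mu * residual
        if np.linalg.norm(residual) / norm_M < tol:
            break
    return L, S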

Building Deep Learning Models with DNAnexus Cloud Platform

Part of my exercise here was also to practice “eating your own dog food” as a newbie to the DNAnexus platform. Below is a screenshot of a prototype that my colleagues and I worked on toward integrating a cloud-enabled Jupyter Lab workstation into the DNAnexus platform. With such an integration, we can seamlessly combine data management, model building, and evaluation.

We learned a lot about the pros and cons of using Jupyter Lab with Docker backends on GPU instances, and we hope that what we learned can help improve the DNAnexus product soon.