Akira’s ML News #Week41, 2020

Original article was published by akira on Deep Learning on Medium


Here are some of the papers and articles I read in week 41 of 2020 (4 October~) that I found particularly interesting. I have tried to cover the most recent work as much as possible, but a paper's submission date may not fall within this week.

  1. Machine Learning Papers
  2. Technical Articles
  3. Examples of Machine Learning use cases
  4. Other topics

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — —


1. Machine Learning Papers

— —

PCA interpreted as a game, with a decentralizable algorithm

EigenGame: PCA as a Nash Equilibrium

They interpret PCA as a game in which each eigenvector is a player maximizing its own utility function, and show that the Nash equilibrium of this game is equivalent to the PCA solution. Because the resulting algorithm can be decentralized, they are able to run a large-scale analysis of neural network representations. This is an important result because an AutoEncoder is neither equivalent to recovering the principal components nor disentangled.
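
As a rough illustration of the idea, the sketch below (my own toy example, not the paper's implementation) treats each eigenvector as a player that ascends its utility: the Rayleigh reward minus penalty terms for aligning with earlier players. The result is compared against a direct eigendecomposition.

```python
import numpy as np

rng = np.random.default_rng(0)

# Symmetric matrix with a known, well-separated spectrum (assumed toy data)
w_true = np.array([3.0, 2.0, 1.0, 0.5, 0.1])
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))
M = Q @ np.diag(w_true) @ Q.T

k, lr, steps = 3, 0.1, 2000
V = rng.normal(size=(5, k))
V /= np.linalg.norm(V, axis=0)

for _ in range(steps):
    for i in range(k):
        vi = V[:, i]
        # utility gradient: Rayleigh reward minus alignment penalties
        # against earlier players (the game's sequential structure)
        penalty = sum((vi @ M @ V[:, j]) / (V[:, j] @ M @ V[:, j]) * (M @ V[:, j])
                      for j in range(i))
        grad = 2 * (M @ vi - penalty)
        grad -= (grad @ vi) * vi          # project onto the sphere's tangent space
        vi = vi + lr * grad
        V[:, i] = vi / np.linalg.norm(vi)

# each recovered direction should match an eigenvector up to sign
top = np.linalg.eigh(M)[1][:, ::-1][:, :k]
align = np.abs(np.sum(V * top, axis=0))
print(align)
```

Because each player's update only needs the other players' current vectors, the per-player loops can run on separate workers, which is what makes the algorithm decentralizable.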



Image generation via stochastic differential equations

Score-Based Generative Modeling through Stochastic Differential Equations

Unlike the usual generative models, which generate images by perturbing noise in discrete steps, this work uses stochastic differential equations to treat the noise as evolving continuously over time. On CIFAR-10 it achieves an Inception Score of 9.9 and an FID of 2.2, and it can also generate 1024×1024 images.
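
To make the idea concrete, here is a toy sketch (my own example, not the paper's model): for 1-D Gaussian "data" the score of the noised marginal is known in closed form, so we can integrate the reverse-time SDE with Euler-Maruyama steps and check that samples recover the data distribution.

```python
import numpy as np

rng = np.random.default_rng(1)
mu0, s0, beta = 2.0, 0.5, 4.0          # toy 1-D Gaussian "data" and noise rate

def score(x, t):
    # closed-form score of the perturbed marginal p_t = N(mu0, s0^2 + beta*t)
    return -(x - mu0) / (s0**2 + beta * t)

n, steps = 20000, 500
dt = 1.0 / steps

# start from the terminal marginal and integrate the reverse-time SDE back to t=0
x = rng.normal(mu0, np.sqrt(s0**2 + beta), size=n)
for k in range(steps, 0, -1):
    t = k * dt
    x = x + beta * score(x, t) * dt + np.sqrt(beta * dt) * rng.normal(size=n)

print(x.mean(), x.std())   # should be close to mu0 and s0
```

In the real method the closed-form score is replaced by a learned score network, but the sampling loop has the same shape.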

Going beyond autoregressive models with deep VAE


Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them on Images

The study produces results comparable to autoregressive models such as PixelCNN and to flow-based models by using very deep VAEs, e.g. with 78 layers. They suggest that autoregressive models have outperformed VAEs mainly because of network depth. The increased training difficulty at depth is overcome by skipping updates with very large gradients and by letting the posterior distribution start learning later. The number of latent dimensions used makes it possible to manipulate the information content of images, such as hair texture.
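
One of these stabilization tricks, skipping updates whose gradient norm is too large, can be sketched as follows (the thresholds and the plain-SGD setting are illustrative assumptions, not the paper's actual values):

```python
import numpy as np

def sgd_step_with_skip(params, grads, lr=1e-3, clip=200.0, skip=400.0):
    """Apply an SGD step, clipping moderate gradients and skipping huge ones."""
    norm = np.sqrt(sum(np.sum(g**2) for g in grads))
    if norm > skip:
        return params, False                          # skip the update entirely
    if norm > clip:
        grads = [g * (clip / norm) for g in grads]    # rescale to the clip norm
    return [p - lr * g for p, g in zip(params, grads)], True

params = [np.ones(3)]
params, applied = sgd_step_with_skip(params, [np.full(3, 1000.0)])
print(applied)   # False: the huge gradient caused the update to be skipped
```

The point is that rare, pathological gradient spikes at great depth are discarded instead of being allowed to destabilize training.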

Improved visualization by having each filter handle one category

Training Interpretable Convolutional Neural Networks by Differentiating Class-specific Filters

A study that improves interpretability by constraining CNN filters so that each filter is responsible for a single category. By multiplying the final-layer filters with a trainable matrix whose entries lie in [0,1], they ensure that each filter is used by only one category. Classification performance is not compromised, and the CAM visualizations are better.
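
A minimal sketch of the gating idea, using a hard 0/1 assignment matrix and globally pooled activations (my simplification; in the paper the matrix is trainable with values in [0,1]):

```python
import numpy as np

rng = np.random.default_rng(0)
n_filters, n_classes = 6, 3

# assignment matrix: filters 0-1 -> class 0, 2-3 -> class 1, 4-5 -> class 2
G = np.zeros((n_filters, n_classes))
G[[0, 1], 0] = G[[2, 3], 1] = G[[4, 5], 2] = 1.0

# globally pooled final-layer filter activations for a batch of 2 images
acts = rng.random((2, n_filters))

# each class's logit only sees the filters assigned to that class
logits = acts @ G
print(logits.shape)   # (2, 3)
```

Because every filter contributes to exactly one class, a CAM for a class can only light up via that class's own filters, which is what sharpens the visualization.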

Predicting Stock Prices with Deep Learning

Stock2Vec: A Hybrid Deep Learning Framework for Stock Market Prediction with Representation Learning and Temporal Convolutional Network

A study that predicts the next day's stock price. The model not only processes daily stock prices with dilated convolutions, but also uses, in parallel, Stock2Vec, a feature obtained by applying natural language processing to news articles about each stock. The learned Stock2Vec embeddings match intuition about which stocks are similar.
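
As a sketch of the dilated causal convolution used on the price series, here is a toy 1-D implementation (illustrative only; a real temporal convolutional network stacks many such layers with growing dilation):

```python
import numpy as np

def causal_dilated_conv(x, w, dilation):
    """1-D causal convolution: output at t only sees x[t - k*dilation], k >= 0."""
    pad = (len(w) - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])      # left-pad so no future leaks in
    return np.array([sum(w[k] * xp[t + pad - k * dilation] for k in range(len(w)))
                     for t in range(len(x))])

prices = np.arange(8, dtype=float)               # toy daily price series
y = causal_dilated_conv(prices, np.array([0.5, 0.5]), dilation=2)
print(y)   # each output averages today's price with the price two days back
```

Stacking layers with dilations 1, 2, 4, ... lets the receptive field cover long histories with few layers, which is why dilated convolutions suit daily price data.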

Distilling BigGAN

TinyGAN: Distilling BigGAN for Conditional Image Generation

A study that distills BigGAN. Pairs of latent-plus-class inputs and image outputs are collected beforehand and treated as a dataset, which reduces memory usage during training. A small student model is then trained with three targets: a per-pixel L1 distance, the difference between the discriminator's hidden-layer features, and the usual adversarial losses. Although performance degrades slightly, they succeed in greatly reducing the number of parameters compared to BigGAN.
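
A sketch of how the three training targets might combine into one student loss (the weights and the softplus form of the adversarial term are my illustrative assumptions, not the paper's exact formulation):

```python
import numpy as np

rng = np.random.default_rng(0)

def distill_loss(img_s, img_t, feats_s, feats_t, d_logit_s):
    """Combined student loss: pixel L1 + D-feature matching + adversarial term."""
    pix = np.mean(np.abs(img_s - img_t))                    # per-pixel L1
    feat = np.mean([np.mean(np.abs(fs - ft))                # D hidden-layer diff
                    for fs, ft in zip(feats_s, feats_t)])
    adv = np.mean(np.log1p(np.exp(-d_logit_s)))             # non-saturating GAN loss
    return pix + feat + 0.1 * adv

img_t = rng.random((8, 8, 3))                        # cached teacher (BigGAN) output
img_s = img_t + 0.01 * rng.normal(size=img_t.shape)  # student output, nearly matching
feats = [rng.random(4), rng.random(2)]
loss = distill_loss(img_s, img_t, feats, feats, d_logit_s=np.array([0.0]))
print(float(loss))
```

Caching the teacher's input-output pairs up front means the huge BigGAN generator never has to be held in memory during student training.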

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

2. Technical Articles

— — — —

Lecture materials by Prof. Yann LeCun are now available

Turing Award winner Yann LeCun's deep learning course is now available for free. You can see not only the lecture materials but also the code as Jupyter notebooks.

Why is the test score higher than the training score?

A thread discussing what can cause scores on the test data to be higher than those on the training data. Causes discussed include the train/test split, and the fact that in Keras the reported training score is an average over one epoch, so it reflects a different model state than the one used for the test evaluation.

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

3. Examples of Machine Learning use cases

— — — —

Protecting victims' privacy with deepfakes

An article about a documentary on gays and lesbians fleeing persecution in Chechnya, and how deepfake technology was used to protect the victims' privacy by synthesizing their faces. Simply blurring faces and using synthetic voices makes subjects less realistic and harder to relate to, whereas deepfakes protect the victims' privacy while maintaining realism.

Machine-learning-powered video conferencing tool

Nvidia has announced Nvidia Maxine, a machine-learning-powered video conferencing tool that makes use of GPUs and the cloud, with features such as real-time translation and transcription, and gaze correction so you appear to be looking at the camera. One of the main features seems to be data compression that uses a GAN to transmit only the necessary parts of each frame.

Machine Learning is Now Doing the Exhausting Task of Counting Craters On Mars

An article about how machine learning was used to discover craters on Mars. Automated machine learning tools have allowed the team to discover new meteorite impact sites. Finding a small crater is very difficult and can take up to 40 minutes. Leaving such tasks to machine learning lets humans focus on work that requires more thinking.

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

4. Other Topics

— — — —

Awful AI

This GitHub repository is a collection of "awful" uses of machine learning, such as promoting discrimination, public surveillance, and military use. It was created in the hope that it will serve as a platform for discussing how to combat these.

Papers with Code and arXiv collaborate

Papers with Code and arXiv have teamed up. A "Code" tab has been added at the bottom of arXiv paper pages, from which you can jump to the paper's code.

GPT-3 on Reddit

GPT-3, the giant high-performance language model developed by OpenAI, lurked on the message board Reddit for a week and interacted with humans, and no one noticed. It spread conspiracy theories and wrote things like, "The purpose of exercise is to avoid thinking about the fact that you spend your life working for money."

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

On Twitter, I post one-sentence paper commentaries.