Source: Deep Learning on Medium

# U-net engineering. TPE Hyper-Parameter Optimization. Semantic Segmentation. Building Footprint extraction.

# TLDR

U-net CNN engineering using TensorFlow's TensorBoard and TPE hyper-parameter optimization. Adding a custom layer to increase the CNN's predictive capacity. All code is available in this tiny project.

# Motivation

Developing a neural net architecture is an art; sometimes it feels like a game of Blind Man's Buff. Although there are already state-of-the-art neural net designs for a decent number of problems, each problem is unique, and no theory tells us in advance exactly what should be used to reach an optimal solution. In this paper I want to share my thoughts on, and experience with, techniques for tackling this problem in a systematic way. I use two approaches: analyzing weight distributions, and optimizing hyper-parameters with the TPE algorithm. Based on the weight analysis, I add a custom layer to increase generalizing ability where I assume it is lacking. I apply these methods to a semantic segmentation problem on satellite imagery (building footprint extraction).

# Area of Interest

As a showcase I use the SpaceNet Challenge Las Vegas data. As a baseline model I choose the U-net-like CNN proposed in this Microsoft Azure blog post. I make changes to its design and compare the validation-set loss and the predicted masks of the baseline and the improved architecture after the same number of training iterations. The improved architecture treats the number of convolutional filters in the added custom layer as a hyper-parameter, which is optimized with a Tree-structured Parzen Estimator (the hyperopt library).

# Improved U-net design

## Weights distribution analysis

Neural nets are commonly thought of as a black box. However, there is research that tries to shed some light on their internals. Some of it explores the relation between parameter distributions and extrapolation ability (for instance https://arxiv.org/pdf/1504.08291.pdf). There is an idea that a NN whose weights follow a Gaussian-like distribution might perform better. Though this statement is not strictly proven, my experience and intuition lead me to speculate that models whose weight distributions are smooth, look more or less Gaussian, and change gently from epoch to epoch tend to perform better. Let's see how this applies to the showcase problem.

The original U-net has a final conv layer with three filters. Each filter has to be able to distinguish its target class on the final feature map, and each has a capacity of only 65 parameters, which might be insufficient. Let's see how the filter weights are distributed across layers.
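The 65-parameters-per-filter figure is consistent with a 1×1 convolution over 64 input feature maps (64 weights plus one bias); the toy model below assumes that shape to make the count concrete and to show how the filter weights can be pulled out for inspection.

```python
# Sketch: count per-filter parameters of a final 1x1 conv layer and
# extract one filter's weights for histogram inspection.
# The 64-channel input is an assumption matching the 65-parameter figure.
import tensorflow as tf

inputs = tf.keras.Input(shape=(256, 256, 64))               # final feature map
outputs = tf.keras.layers.Conv2D(3, kernel_size=1)(inputs)  # one filter per class
model = tf.keras.Model(inputs, outputs)

kernel, bias = model.layers[-1].get_weights()  # kernel shape: (1, 1, 64, 3)
per_filter = kernel.shape[0] * kernel.shape[1] * kernel.shape[2] + 1
print(per_filter)  # 1 * 1 * 64 weights + 1 bias = 65

# Flatten the first filter's weights, e.g. to feed into a histogram plot.
w0 = kernel[..., 0].ravel()
print(w0.shape)  # (64,)
```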

Luckily, TensorFlow's TensorBoard lets us see what is happening inside the neural network black box while it is learning; it is like a flashlight in a dark room. TensorBoard has a number of nice features, and here we inspect layer weight histograms. Below are the weight distributions of the penultimate (conv2d_9) and last (conv2d_10) convolution layers.