[WEEK 7 — Facade Parsing using Deep Learning]

Source: Deep Learning on Medium

Theme: Segmenting an image of a facade into predefined semantic categories

Team Members: Onur Cankur, Furkan Karababa, Javid Rajabov

This is our seventh blog post about our project, and we are working hard to finish it successfully. You can see what we have done so far in this blog post.

An example of facade parsing (from the DeepFacade paper)

This week we worked hard to improve our accuracy using several approaches and techniques. Unfortunately, not all of our attempts worked, but in the end we did improve our accuracy. In addition, we created a short video about our project and published it on YouTube.

About the video

As mentioned above, we created a short video to briefly explain our project. In the video, we first give brief explanations of semantic segmentation and facade parsing, and of why they are extremely important for machine learning and computer vision. We then explain our aim and the base model we chose, Fully Convolutional Networks. We also mention which datasets we used in our project and list some of the challenges we faced. We used PowToon to create the video.

And here is the video that we made.

Video presentation of our project

Improvement Process

To improve our accuracy, we considered using the downsampling part of architectures such as U-Net, but we have not managed to do this yet. We also considered using a different loss function, the one described in the paper DeepFacade: A Deep Learning Approach to Facade Parsing by Hantang Liu, Jialiang Zhang, Jianke Zhu, and Steven C.H. Hoi. However, this loss function is hard to adapt to our model, and unfortunately we have not managed to do that yet either.

Fortunately, data augmentation did improve our accuracy. As we mentioned in our previous blog posts, we use the eTRIMS, ECP and Paris Art Deco datasets, which contain 60, 104 and 79 photos respectively. Using data augmentation techniques such as rotating and flipping, we made each dataset 4 times larger.
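To make the idea concrete, here is a minimal sketch of this kind of augmentation. It is not the exact code from our project: the function name augment_pair, the use of NumPy arrays, and the particular choice of three transforms (horizontal flip, 180-degree rotation, vertical flip) are assumptions made for illustration. The important point for segmentation is that every transform applied to a photo must also be applied to its label mask, so that each pixel keeps its correct class.

```python
import numpy as np

def augment_pair(image, mask):
    """Return four (image, mask) pairs: the original plus three transformed copies.

    image: H x W x C array, mask: H x W array of class labels.
    The specific transforms are an illustrative assumption.
    """
    return [
        (image, mask),                          # original
        (np.fliplr(image), np.fliplr(mask)),    # horizontal flip
        (np.rot90(image, 2), np.rot90(mask, 2)),  # 180-degree rotation
        (np.flipud(image), np.flipud(mask)),    # vertical flip
    ]

# Dataset sizes from the post; each photo yields four samples after augmentation.
dataset_sizes = {"eTRIMS": 60, "ECP": 104, "Paris Art Deco": 79}
augmented_sizes = {name: 4 * n for name, n in dataset_sizes.items()}
```

These transforms keep the image height and width unchanged, which avoids resizing non-square facade photos. With one original and three transformed copies per photo, the datasets grow to 240, 416 and 316 samples.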

The amount of improvement differs for each dataset. You can find the exact numbers, along with detailed explanations of every step we have taken, in our final report, which we will write this week. Getting results from three different datasets is one of the challenges, because training takes a lot of time.

Next week, we are going to try to improve our accuracy further with different approaches, and I hope we will succeed. We will also give our presentation and submit our final project.

Thank you for reading and for your time. :)