Synthetic DATASET for your real problems

Source: Deep Learning on Medium

Author: rvjenya
(In short) The first version of our architecture

Today I’m going to talk about synthetic data and how it helps us in the real world. I won’t cover the many types of synthetic data used in ML (Machine Learning); I want to focus on synthetic data for Deep Learning, for training your own neural network.

Several months ago I had to build a unique dataset for my first MVP. I ran into many problems with bad data:

  • watermarks and stamps on the images (when I grabbed them from Google Images)
  • aspect ratios that were not 1:1
  • bad viewing angles
  • poor compression
  • and one important thing: there were very few images for each class.

I needed to create more than 10k images per class, because the differences between my classes are very small.

For example: people walking, people looking at their phones, and people running. As you can imagine, with so little data a neural net can’t pick up such subtle differences, and of course our loss function won’t move toward the minimum.

I tried many approaches:

  • manually validating images
  • scripts (grabbing from Google Images)
  • a Google extension, and so on

OK. I think you can see why we couldn’t build a clean dataset for our task this way.

Goals and solutions:

  • create clean data
  • label each class automatically
  • get the correct viewpoint (camera pointed directly at the object)
  • 256×256 images (no larger)
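
The 1:1 and 256×256 goals can be sketched with a little arithmetic. This is only an illustration for scraped images of arbitrary size; the function names are hypothetical, and the 3D render itself can simply output 256×256 frames directly.

```python
TARGET_SIZE = 256  # the 256x256 limit from the goals list


def center_square_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def scale_factor(width, height, target=TARGET_SIZE):
    """Scale that maps the cropped square onto target x target pixels."""
    return target / min(width, height)


# A 640x480 photo keeps its middle 480x480 square.
print(center_square_box(640, 480))  # (80, 0, 560, 480)
```

The returned box uses the (left, top, right, bottom) order that most image libraries expect for a crop, so it can be passed to whichever one you use.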

Then an idea came to mind: what if I could use 3D data? It would solve all of these problems far better than manual grabbing and labeling.

  • I used 3ds Max
  • 5 realistic 3D people models (I bought them in an online store; of course, you can also find free ones)
  • created 5 texture variations for each model (use Adobe Photoshop or another graphics editor for this)
  • built a realistic 3D scene with global lighting
  • created an animation for each model (360° rotation in front of the camera)
  • animated the camera from horizontal up to 45 degrees
  • animated the models changing position relative to the camera center
  • and, of course, batch-rendered the whole timeline
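
To make the viewpoint sweep concrete, here is a minimal sketch that enumerates camera poses for a full 360° turn at several elevations between horizontal and 45°. It is not the 3ds Max setup itself: it moves a camera around a fixed subject, which gives the same views as rotating the model. The step counts, the radius, and the file-naming scheme are assumptions for illustration. Because each frame name carries its class, the label comes for free.

```python
import math


def camera_positions(radius, azimuth_steps, elevation_steps, max_elevation_deg=45.0):
    """Yield (azimuth_deg, elevation_deg, (x, y, z)) for a camera orbiting the origin."""
    for e in range(elevation_steps):
        # Spread elevations evenly from 0 (horizontal) up to max_elevation_deg.
        if elevation_steps > 1:
            elev = max_elevation_deg * e / (elevation_steps - 1)
        else:
            elev = 0.0
        for a in range(azimuth_steps):
            azim = 360.0 * a / azimuth_steps
            x = radius * math.cos(math.radians(elev)) * math.cos(math.radians(azim))
            y = radius * math.cos(math.radians(elev)) * math.sin(math.radians(azim))
            z = radius * math.sin(math.radians(elev))
            yield azim, elev, (x, y, z)


def frame_name(class_label, model_id, texture_id, frame_idx):
    """Hypothetical naming scheme: the class folder doubles as the label."""
    return f"{class_label}/m{model_id}_t{texture_id}_f{frame_idx:05d}.png"


poses = list(camera_positions(radius=5.0, azimuth_steps=8, elevation_steps=3))
print(len(poses))  # 24 viewpoints: 8 azimuths x 3 elevations
print(frame_name("people_with_phone", 1, 2, 7))
```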


For one variation (one clothing texture) I got a sequence of 1k images.

Total data for one class (People with the phone) = roughly 11,000 images (rendering took 1 second per 256×256 frame)

Next, I made an action script in Adobe Photoshop (flipping images, Gaussian blur, and colorize), which gave me dataset × 2.
Sounds good…? )) 👍 At that point I still didn’t know whether this method would beat the previous manual grabbing from Google Images. OK, let’s move on.
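
The doubling step can be sketched in plain Python. This is only an illustration of the idea, not the Photoshop action: it works on small grayscale images stored as lists of rows, uses a fixed 3×3 Gaussian kernel, and leaves out the colorize step. All names are hypothetical.

```python
def flip_horizontal(image):
    """Mirror a 2D image (list of rows) left to right."""
    return [row[::-1] for row in image]


# 3x3 Gaussian kernel; the weights sum to 16.
KERNEL = [[1, 2, 1],
          [2, 4, 2],
          [1, 2, 1]]


def gaussian_blur_3x3(image):
    """Blur a 2D grayscale image, clamping coordinates at the borders."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0
            for ky in (-1, 0, 1):
                for kx in (-1, 0, 1):
                    yy = min(max(y + ky, 0), h - 1)
                    xx = min(max(x + kx, 0), w - 1)
                    total += KERNEL[ky + 1][kx + 1] * image[yy][xx]
            out[y][x] = total / 16
    return out


def augment(dataset):
    """Return the originals plus one flipped-and-blurred copy of each image."""
    return dataset + [gaussian_blur_3x3(flip_horizontal(img)) for img in dataset]


tiny = [[0, 0, 255],
        [0, 0, 255]]
print(len(augment([tiny, tiny])))  # 4: every image gains one augmented copy
```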

First, I built a dataset from mixed data (I used the Nvidia DIGITS Docker container and a Titan X 12 GB for training):

  • my synthetic data
  • Google Images (cleaned and labeled, of course)
Dataset of 22k images: People with phone (from the 3ds Max render)

The classes I have:

  • Blind people walking (with canes)
  • People riding bikes
  • People running (not walking; really running fast)
  • People walking while looking at their phone (my synthetic data, 22k images)
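
With one folder per class, labels can be generated automatically. Below is a minimal sketch that builds a plain-text listing with one "path label" pair per line, the format Caffe's image tools accept. The class folder names and file names are hypothetical.

```python
# Hypothetical folder names for the four classes above; the index is the label.
CLASSES = [
    "blind_walking",
    "people_on_bike",
    "people_running",
    "people_with_phone",
]


def build_listing(files_by_class, classes=CLASSES):
    """Return 'class_folder/file label' lines, ordered by class and file name."""
    lines = []
    for label, name in enumerate(classes):
        for filename in sorted(files_by_class.get(name, [])):
            lines.append(f"{name}/{filename} {label}")
    return lines


example = {
    "people_with_phone": ["render_00002.png", "render_00001.png"],
    "people_running": ["photo_01.jpg"],
}
for line in build_listing(example):
    print(line)
```

Rendered frames and cleaned Google Images photos can share the same class folder, so the mixed dataset needs no extra labeling step.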

Next, I had to train the neural network on my mixed dataset. I used the Caffe framework and the AlexNet model. I chose 8 epochs and a base learning rate of 0.01.
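
Caffe solvers count iterations rather than epochs, so the 8 epochs have to be converted. Here is a rough sketch of that arithmetic. The batch size of 100 and the "fixed" learning-rate policy are assumptions, and 22,000 is only the size of the synthetic class, not of the whole mixed dataset.

```python
import math


def iterations_for_epochs(num_images, batch_size, epochs):
    """Number of solver iterations needed to see the dataset `epochs` times."""
    iters_per_epoch = math.ceil(num_images / batch_size)
    return iters_per_epoch * epochs


def solver_text(max_iter, base_lr=0.01):
    """Format a few standard Caffe solver fields (a partial solver only)."""
    return "\n".join([
        f"base_lr: {base_lr}",
        'lr_policy: "fixed"',  # assumption: the source does not name a policy
        f"max_iter: {max_iter}",
    ])


max_iter = iterations_for_epochs(num_images=22000, batch_size=100, epochs=8)
print(max_iter)  # 1760
print(solver_text(max_iter))
```

DIGITS does this conversion for you when you enter epochs in its interface; the sketch only shows what happens underneath.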

Later, you can save your pretrained model and resume training with more epochs.

My goal was to test a hypothesis: how well does synthetic data work in real training, and, of course, how does it relate to the grabbed photos?

After I trained my neural net and checked the classification, I got interesting results:

  • the network learns from synthetic data very quickly
  • the predictions are excellent
  • one important thing: we can train on synthetic data and apply the model in the real world.
This is a classification test on real data. Our People with phone class = the synthetic dataset
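
When checking a model like this on real photos, it helps to look at accuracy per class, so the synthetically trained class can be compared with the others. Here is a minimal sketch; the labels below are toy placeholders, not results from this experiment.

```python
from collections import defaultdict


def per_class_accuracy(true_labels, predicted_labels):
    """Return {class: fraction of that class's samples predicted correctly}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(true_labels, predicted_labels):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return {label: correct[label] / total[label] for label in total}


# Toy placeholder labels, only to show the call.
truth = ["phone", "phone", "run", "run"]
preds = ["phone", "run", "run", "run"]
print(per_class_accuracy(truth, preds))  # {'phone': 0.5, 'run': 1.0}
```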

If you have a classification task, you can try synthetic data and apply it to your physical world.

The Caffe model in our real-time object classification (deployed on Nvidia Jetson TX2 / Raspberry Pi B+ and Movidius)

I hope this article is helpful and that you will be able to use synthetic data for your own real problems.