Original article was published by Alexey Grigorev on Deep Learning on Medium
Getting a dataset with images is not easy if you want to use it for a course or a book. Yes, there are many datasets with images, but few of them are suitable for commercial or educational use.
To solve this issue, I decided to collect a dataset of clothing images. All the pictures will be shared under the CC0 license. This means that anyone can use this data for any purpose, for example:
- Creating a tutorial or a course (free or paid)
- Writing a book
- Kaggle competitions (as an external dataset)
- Training an internal model at any company
I have already collected more than 1,000 pictures, but it’s not easy to do alone. I need your help.
How can I help?
There are many ways you can help.
Spread the word about it. Share it on social media, send it to your colleagues and friends.
Upload your pictures. If you don’t want to go through your entire wardrobe and take a picture of every item, that’s okay. Even one image is helpful. Perhaps there’s a t-shirt nearby, jeans, or shoes? Take a picture and upload it using this form.
The form works on mobile too!
Upload many pictures at once. If you have more than a couple of images, using the previous form is not convenient. There are other options:
- Google Photos. The app can automatically synchronize all your images. Just move the pictures of clothes to a separate album and share the link.
- Dropbox, Google Drive, Yandex Disk, or any similar cloud storage. Upload a folder or a zip archive and share the link.
- WeTransfer.com. You can use it to upload files up to 2 GB without registering.
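If you go the zip archive route, a few lines of Python can bundle the pictures for you. This is only a minimal sketch: the function name `zip_images`, the folder and archive names, and the list of image extensions are my assumptions for illustration, not requirements of the dataset.

```python
import zipfile
from pathlib import Path

# Assumed set of image extensions; adjust it to whatever your camera produces.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def zip_images(folder, archive_path):
    """Pack every image found under `folder` into a zip archive.

    Returns the number of images added.
    """
    folder = Path(folder)
    count = 0
    # ZIP_STORED skips compression: JPEG and PNG files are already compressed.
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_STORED) as archive:
        for path in sorted(folder.rglob("*")):
            if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
                # Store paths relative to the folder, not absolute paths.
                archive.write(path, arcname=path.relative_to(folder).as_posix())
                count += 1
    return count


if __name__ == "__main__":
    # Hypothetical folder and archive names.
    added = zip_images("clothes", "clothes.zip")
    print(f"Added {added} images to clothes.zip")
```

The resulting archive can then be uploaded to any of the services above.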
Once you have a link, use another form to submit it:
I’m collecting the following categories of clothes:
- Long sleeves, sweaters, hoodies
- Jeans, pants, shorts
- Dresses, skirts
- Jackets, coats
- Clothes for kids
To take a picture, put the item on the floor or a bed:
Pictures of hanging clothes are fine, but make sure the item is visible:
The item shouldn’t be crumpled or packed:
The background should be contrasting enough to see the item:
An image should contain only one item:
And there should be no people:
If you’re not sure about something, just share it, and I’ll figure it out.
How can I know when the data is ready?
When I collect enough pictures, I’ll annotate them and upload the result to Kaggle. If you provide your email when sharing images, I’ll inform you when it happens.
I will also post in other places: