Human labelers replaced using AI

McKinsey & Company points to five bottlenecks that stand between your organization and delivering a win with AI:

  1. Data labeling
  2. Obtaining massive training data sets
  3. The explainability problem
  4. Generalizability of learning
  5. Bias in data and algorithms

We’d add one more:

  6. AI skills gap

At Ziff, we believe there should be only one bottleneck between your organization and delivering a win with AI: capturing expert understanding.

Image Example:

You have image data but you don’t know its value. Even finding out whether you have anything useful means taking on all six of the risks associated with delivering AI value. You have to collect your data and then label it with something of interest. If at the end of this process you find that your data contains nothing useful, you are back to square one.


For this example we take a 550,000-image open-source dataset of people, but it could just as easily be your own dataset of any images: people, documents, products, manufacturing processes, etc. Typically, a large dataset without labels (metadata or outcomes assigned to each image) is not useful, and labeling efforts are required.

Labeling is time-consuming, expensive, and often a non-starter if you are dealing with large amounts of sensitive data. If you are open to crowdsourcing the labeling of your data on a platform like Amazon’s Mechanical Turk, you can conceivably compress your timeline, but you should expect to pay between $25,000 and $150,000 over 1–6 months. If this is your first curation effort as an organization, it will take longer because you will likely repeat the process several times before you get it right. You will come to find that repeating yourself is a common theme in AI.
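To see how a crowdsourcing bill can reach that range, here is a rough back-of-the-envelope sketch. The per-vote prices and the number of votes per image are hypothetical values picked for illustration. They are not Mechanical Turk's actual pricing and they do not come from the figures above.

```python
def labeling_cost_usd(num_images, votes_per_image, cents_per_vote):
    """Total spend when every image is labeled `votes_per_image` times.

    Prices are kept in whole cents so the arithmetic stays exact.
    """
    total_cents = num_images * votes_per_image * cents_per_vote
    return total_cents / 100

# Hypothetical inputs, for illustration only.
NUM_IMAGES = 550_000      # size of the example dataset
VOTES_PER_IMAGE = 3       # assumed redundancy to catch careless labels

low = labeling_cost_usd(NUM_IMAGES, VOTES_PER_IMAGE, cents_per_vote=2)
high = labeling_cost_usd(NUM_IMAGES, VOTES_PER_IMAGE, cents_per_vote=9)
print(f"${low:,.0f} to ${high:,.0f}")
```

The cost scales linearly with both dataset size and redundancy, so repeating the labeling effort a second time roughly doubles the bill.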

Example Image AI Workflow

  1. Collect all images (est. time: days-weeks)
  2. Label your data with metadata (est. time: days-weeks): If you already have metadata about your images, this will save you time, but make doubly sure that it’s a label you actually care about. Tasking AI to identify hot dogs is fun, but does it provide you with new opportunities or help you lower costs?
  3. Crowdsource or “Insource” labeling (est. time: weeks-months): If you do not have metadata, or the information you want to extract from images is not already captured, then you need to task humans to label your data. Mechanical Turk workers are not experts in your domain, so expect high label error rates.
  4. Analysis (est. time: months): Turn over your image dataset to your deep-learning expert or deep-learning service and have them find an algorithm that works with your data.
  5. Deploy on-premises (est. time: months): If you are deploying on-premises, then your devops team will have to tool up on GPU computing.
  6. Cloud deploy (est. time: weeks-months): Decide on and ship a “model server” and manage GPU resources and autoscaling.
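One common way to cope with the label errors mentioned in step 3 is to collect several votes per image and send low-agreement images to an expert. The sketch below is an illustration of that idea, not part of the workflow above. The file names, labels, and the `min_agreement` threshold are all hypothetical.

```python
from collections import Counter

def aggregate_votes(votes, min_agreement=0.75):
    """Return (label, agreement, needs_review) for one image's crowd votes."""
    counts = Counter(votes)
    label, top_count = counts.most_common(1)[0]
    agreement = top_count / len(votes)
    # Anything below the threshold is flagged for a domain expert.
    return label, agreement, agreement < min_agreement

# Hypothetical crowd votes, for illustration only.
crowd_votes = {
    "img_001.jpg": ["hat", "hat", "hat", "hat"],
    "img_002.jpg": ["beard", "no_beard", "beard", "no_beard"],
    "img_003.jpg": ["hat", "hat", "no_hat", "hat"],
}

review_queue = [
    name for name, votes in crowd_votes.items() if aggregate_votes(votes)[2]
]
print(review_queue)
```

Raising `min_agreement` sends more images to the expert. Lowering it accepts more crowd labels at face value.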

Even after months of investment in time and resources, you will likely still have to address lower-than-expected quality in the data.


Using AI for preprocessing and cleaning drastically improves the quality of our faces dataset

At Ziff we found this to be a common pain point for our customers and partners, so we wanted to see if we could automate the entire process. Most organized customer data has more label errors than its owners anticipated, and for customers with unlabeled unstructured data (e.g. image, audio, video) the problem becomes a non-starter. People datasets can be especially troublesome because of the amount of preprocessing required (e.g. face detection and cropping).

At Ziff we leveraged our deep learning capabilities to automate the discovery process and to assist the person who cares the most about the problem (VP of product, executive, etc.) in managing it. If faces exist within your dataset, they are preprocessed using advanced detection networks for automated cropping.

With no human direction or coaching, our AI process organizes the entire 550k dataset in a few minutes into meaningful clusters for expert review.

Large datasets are organized and segmented using AI for expert review
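To give a feel for what organizing unlabeled data into clusters means, here is a minimal sketch using plain k-means. This is a stand-in technique chosen for illustration, not a description of Ziff's actual method. It assumes each image has already been turned into a numeric feature vector. The 2-D points below are synthetic placeholders for such vectors, and every function name is hypothetical.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def init_centroids(points, k):
    """Farthest-point initialization: deterministic, spreads centroids out."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids

def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: squared_distance(point, centroids[i]))

def kmeans(points, k, max_iterations=50):
    centroids = init_centroids(points, k)
    assignments = None
    for _ in range(max_iterations):
        # Assignment step: attach every point to its closest centroid.
        new_assignments = [nearest(p, centroids) for p in points]
        if new_assignments == assignments:
            break  # nothing moved, so the clustering has converged
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:
                centroids[i] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return assignments, centroids

# Synthetic stand-ins for image feature vectors: three tight groups of 50.
rng = random.Random(0)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [
    (rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
    for cx, cy in blob_centers
    for _ in range(50)
]

assignments, centroids = kmeans(points, k=3)
for cluster_id in range(3):
    print(cluster_id, assignments.count(cluster_id))
```

No labels are used anywhere in this sketch. The groups emerge from the vectors alone, and a human only needs to look at each cluster afterwards and decide what it represents.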

Natural clusters that are produced include things like gender (male/female), race (asian/black/white), hats, beards, and age. For this particular use case, being able to train a gender and race model on the original dataset is very useful, since external training sets often deviate from the user’s dataset.


  1. 500,000 images indexed and organized in less than an hour
  2. Natural clusters identified by AI and validated by an expert user
  3. 98.9% gender classification achieved using AI auto labels
  4. Other labels of interest included race (asian/black/white), age (>40, <40), and facial accessories (beards, hats).
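The idea behind auto labels can be sketched in a few lines: once an expert has named each cluster, every image in that cluster inherits the name, and a small hand-checked sample tells you how often the inherited label is right. This is an illustrative sketch under those assumptions, not the procedure used to obtain the figures above. The cluster IDs, label names, and spot-check values are hypothetical.

```python
def propagate_labels(cluster_ids, cluster_names):
    """Give every image the label the expert assigned to its cluster."""
    return [cluster_names[c] for c in cluster_ids]

def spot_check_accuracy(auto_labels, checked):
    """Fraction of hand-checked images whose auto label matches the expert's."""
    hits = sum(auto_labels[index] == label for index, label in checked.items())
    return hits / len(checked)

# Hypothetical data: one cluster ID per image, as produced by a clustering step.
cluster_ids = [0, 0, 1, 1, 0, 1, 1, 0]

# The expert reviews each cluster once and names it.
cluster_names = {0: "hat", 1: "no_hat"}
auto_labels = propagate_labels(cluster_ids, cluster_names)

# The expert then verifies a handful of individual images (index -> true label).
checked = {0: "hat", 2: "no_hat", 4: "no_hat", 7: "hat"}
print(spot_check_accuracy(auto_labels, checked))
```

The expert's effort grows with the number of clusters and the size of the spot check, not with the number of images, which is why this approach suits datasets in the hundreds of thousands.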

Source: Deep Learning on Medium