3 Stages of an AI Project in Plain English

Source: Deep Learning on Medium


This article walks through the technical stages of an AI project. It lays out an actionable process to set up teams for success, including who to hire and when. It assumes no prior knowledge and aims to use English that is as plain as possible.

Before you begin the first step, make sure it is worthwhile for your organization to pursue AI. Hearing that you need to use AI to stay relevant is fine as initial motivation, but it should never serve as the final justification.

Instead, consider defining the use case, stating the problem it solves, estimating the potential ROI, and evaluating access to talent and resources. Success also depends on organizational buy-in to create the right environment.

If you decide to move forward, there are 3 broad stages. The stages appear in numbered order, although they overlap in practice. You will likely even need to go back to a previously completed stage to improve it.

Let’s now imagine that we’re starting an AI project together.

1. Get High-Quality Data

We start by gathering a precious resource: data.

Getting high-quality data is by far the most important step in our entire project. And it is often the most ignored.

If our organization does not have data already, we begin by collecting some. Sources of data include the Internet, our users, and sensors. Collection often starts off as a one-time event and, over time, moves to a regular schedule.

Once we get some data, it is probably garbage. And a common saying about AI is “garbage in, garbage out.” Our results are limited by the quality of our data.

Data from the real world is messy. It is incomplete. It has mistakes in it. It is often duplicated. And we must deal with these issues before going any further. We must first clean the garbage data into high-quality data.
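To make the cleaning step concrete, here is a minimal sketch in Python. The records, the field names, and the rule that an age must fall between 0 and 120 are all made up for illustration. The sketch drops incomplete records, obvious mistakes, and duplicates, which is only one possible policy.

```python
# Hypothetical raw records; every field name and value is made up for illustration.
raw_records = [
    {"id": 1, "age": "34", "city": "Boston"},
    {"id": 2, "age": "", "city": "Denver"},       # incomplete: missing age
    {"id": 3, "age": "-5", "city": "Austin"},     # mistake: impossible age
    {"id": 1, "age": "34", "city": "Boston"},     # duplicate of the first record
    {"id": 4, "age": "51", "city": " seattle "},  # messy formatting
]


def clean(records):
    """Return only complete, plausible, de-duplicated records."""
    seen_ids = set()
    cleaned = []
    for record in records:
        # Skip incomplete records (an empty age or city).
        if not record["age"].strip() or not record["city"].strip():
            continue
        age = int(record["age"])
        # Skip obvious mistakes (the 0 to 120 range is an assumed rule).
        if age < 0 or age > 120:
            continue
        # Skip records whose id we have already kept.
        if record["id"] in seen_ids:
            continue
        seen_ids.add(record["id"])
        # Tidy the formatting of the city name.
        city = record["city"].strip().title()
        cleaned.append({"id": record["id"], "age": age, "city": city})
    return cleaned


clean_records = clean(raw_records)
print(clean_records)
```

In a real project we might fill in or correct a bad record instead of dropping it. That choice depends on the data and the use case.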

At this stage, you will need someone like a data analyst or data scientist. If you would like to collect data on a regular schedule, add a software engineer and possibly a database administrator.

2. Run Experiments

We never know what will work until it works.

We can guess that a particular type of AI will work for our data, only to watch it fail to deliver the results we want. And we can guess that a different type will not work well, only to see it perform beyond our wildest dreams.

The point is that theories and hunches only go so far. We need to get our hands dirty to see what actually works in practice. Luckily, it is easy to try out many different types, usually in a few hours to a few days.
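As a rough illustration of what an experiment can look like, the sketch below tries three very simple approaches on the same data and compares them on examples that were held back for testing. The numbers are made up, and the three approaches are toy stand-ins for real types of AI. The idea of trying several and measuring each one is the same.

```python
from statistics import mean

# Made-up (x, y) examples. We learn from `train` and check results on `test`.
train = [(1, 3.1), (2, 4.9), (3, 7.2), (4, 8.8), (5, 11.1), (6, 13.0)]
test = [(2.4, 5.8), (4.6, 10.2)]


def fit_average(data):
    """Always guess the average y seen during training."""
    average_y = mean(y for _, y in data)
    return lambda x: average_y


def fit_nearest_neighbor(data):
    """Guess the y of the training example whose x is closest."""
    return lambda x: min(data, key=lambda point: abs(point[0] - x))[1]


def fit_straight_line(data):
    """Fit a straight line through the data using least squares."""
    x_mean = mean(x for x, _ in data)
    y_mean = mean(y for _, y in data)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in data) / sum(
        (x - x_mean) ** 2 for x, _ in data
    )
    intercept = y_mean - slope * x_mean
    return lambda x: slope * x + intercept


def average_error(predict, data):
    """Average distance between the guesses and the true answers."""
    return mean(abs(predict(x) - y) for x, y in data)


candidates = {
    "average": fit_average,
    "nearest neighbor": fit_nearest_neighbor,
    "straight line": fit_straight_line,
}

# Train every candidate on the same data, then score it on the held-back data.
results = {name: average_error(fit(train), test) for name, fit in candidates.items()}
best = min(results, key=results.get)

for name, error in results.items():
    print(f"{name}: average error {error:.2f}")
print("best:", best)
```

Scoring on held-back examples matters because it shows how each approach handles data it has not seen before, which is what it will face in the real world.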

There are real limits to how well even the current best type of AI performs. Sometimes the current best is not good enough to move forward. It is better to find that out before starting the heavy lifting of the third step. Think of this stage as de-risking the investment.

Over time as researchers invent new types of AI, results naturally get better. And once we get great results, we are ready to move on to the next step.

At this stage, we will need someone like a machine learning engineer or a data scientist. Ideally, this would be the same person responsible for the first step because there is back-and-forth between the first step and second step. We likely will get better results by further cleaning our data.

3. Build for the Real World

We create value by putting what works into the real world.

So far, our experiments have been one-off. This approach works well for quick tests, but it does not work well when a task must be repeated many times. And we typically want our AI to repeat some task regularly.

The solution is to build out infrastructure that can handle big bumps in demand, unexpected errors, and malicious attacks. This process is serious work. We definitely require a team now, whereas the previous two steps could potentially be done by an individual.

The first concern should be defending against attackers. Organizations that create value from AI usually own sensitive data. One data breach can sink a smaller organization or permanently damage the trust of its customers. It is an ongoing battle: as attackers learn more ways to attack, defenders must learn more ways to defend. Hence, defenses must be updated regularly by someone like a security engineer or security analyst.

If cybersecurity forms the defenses, then pipes form the backbone of the infrastructure. They carry data from place to place like pipes carrying water. Occasionally, a pipe might start to leak or get a blockage in it. We need data engineers to make sure the data keeps flowing smoothly through the pipes.
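The pipe idea can be sketched in a few lines of Python. In this hypothetical example, each stage is a small function that passes data along to the next one, and we count how many rows went in versus how many came out to spot a leak. The temperature readings and stage names are made up for illustration.

```python
def parse_reading(rows):
    """Stage 1: turn raw text into numbers, skipping rows that cannot be read."""
    for row in rows:
        try:
            yield float(row)
        except ValueError:
            # A bad row is dropped here; this is where data can leak out.
            continue


def to_fahrenheit(readings):
    """Stage 2: convert Celsius readings to Fahrenheit."""
    for celsius in readings:
        yield celsius * 9 / 5 + 32


def run_pipeline(rows, stages):
    """Connect the stages end to end and return whatever comes out."""
    data = iter(rows)
    for stage in stages:
        data = stage(data)
    return list(data)


# Made-up raw input: two of the five rows are unusable.
raw_rows = ["20", "oops", "25", "", "30"]

delivered = run_pipeline(raw_rows, [parse_reading, to_fahrenheit])
lost = len(raw_rows) - len(delivered)

print("delivered:", delivered)
print("rows lost along the way:", lost)
```

Keeping a simple count like `lost` is one way to notice a leak early. Real pipelines usually rely on dedicated tools, but the idea of checking what goes in against what comes out carries over.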

We also need a place to store all this data flowing through the pipes. We may want to run new experiments on the old data, or we may want to regularly collect new data. Databases provide a reliable way to store data for future use. A database administrator makes sure all data is accounted for and properly stored.
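As a small illustration of storing data for future use, the sketch below uses SQLite, a lightweight database that ships with Python. The table name, column names, dates, and values are made up, and the database lives in memory only so the example cleans up after itself. A real project would likely use a database server chosen by the team.

```python
import sqlite3

# ":memory:" creates a temporary database; a file path would keep the data on disk.
connection = sqlite3.connect(":memory:")

# A hypothetical table holding one reading per collection date.
connection.execute("CREATE TABLE readings (collected_on TEXT, value REAL)")

# Store a batch of made-up readings.
new_readings = [
    ("2024-01-01", 20.5),
    ("2024-01-02", 21.0),
    ("2024-01-03", 19.8),
]
connection.executemany("INSERT INTO readings VALUES (?, ?)", new_readings)
connection.commit()

# Later, read the old data back for a new experiment.
stored = connection.execute(
    "SELECT collected_on, value FROM readings ORDER BY collected_on"
).fetchall()

# Check that every reading is accounted for.
stored_count = connection.execute("SELECT COUNT(*) FROM readings").fetchone()[0]

print(stored)
print("readings stored:", stored_count)

connection.close()
```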

Another major part is putting the AI somewhere it can be used. This may mean putting it on a cloud server where it can be reached through a web browser like Chrome. Or it may mean building it into an installed software application like Microsoft Word. It depends on the use case. Web developers are best suited for the former, and software engineers for the latter.

Finally, we need a way to see the results from the AI. Think of mediums like a website, or a dashboard, or an app. A UX designer makes sure the product delights the user, no matter the medium. Websites and dashboards can use front-end developers. Apps require mobile developers.

Keep in mind that all infrastructure requires maintenance over time. Just as a car breaks down over time without repairs, so too does AI infrastructure. Investing in maintenance upfront keeps the project running smoothly with minimal interruptions.

We may find a candidate who can wear multiple hats, at least in the beginning of building the infrastructure. For example, software engineers can often serve as database administrators. Over time as the project matures, it likely makes sense to bring in someone dedicated to maintaining the database.

Summary

First, make sure it is worthwhile for your organization to pursue AI. If you are ready to move forward, assess what stage you are currently in:

  1. Gather as much high-quality data as possible
  2. Run experiments to see what works and what does not
  3. Put what works into the real world with infrastructure

Each stage requires different expertise. Hire appropriately based on the stage your organization is currently in.

Be prepared to maintain the infrastructure like you would a cherished car. Get it checked out regularly, and invest in repairs that prevent more costly problems down the road.