AI Roadblock & Generalisation Error

Over the past ten years, deep learning — a method that uses layered machine-learning algorithms to extract structured information from massive data sets — has driven almost unthinkable progress in AI and the tech industry. It powers Google Search, the Facebook News Feed, conversational speech-to-text algorithms, and champion Go-playing systems. Outside the internet, we use deep learning to detect earthquakes, predict heart disease, and flag suspicious behavior on a camera feed, along with countless other innovations that would have been impossible otherwise.

But deep learning requires massive amounts of training data to work properly, incorporating nearly every scenario the algorithm will encounter. Systems like Google Images, for instance, are great at recognizing animals as long as they have training data to show them what each animal looks like. Experts like New York University’s Gary Marcus describe this kind of task as “interpolation”: taking a survey of all the images labeled “ocelot” and deciding whether the new picture belongs in the group.


A digital image can be described by a mathematical function.

Mathematical function of a digital image
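The original figure is not reproduced here, so the following is only a rough sketch of the idea. It assumes the common convention that a grayscale image is a function f(x, y) returning a brightness value from 0 to 255 at each pixel position. The function `f` and the image size are made up for illustration.

```python
# A grayscale image treated as a sampled function f(x, y) -> intensity (0..255).
# Both the function and the dimensions are hypothetical, chosen for illustration.

def f(x, y):
    """Toy intensity function: a horizontal ramp that gets brighter to the right."""
    return min(255, 32 * x)

WIDTH, HEIGHT = 8, 4

# Sampling f on a regular grid gives the familiar rows-and-columns pixel array.
image = [[f(x, y) for x in range(WIDTH)] for y in range(HEIGHT)]

for row in image:
    print(row)
```

Every step that follows (detection, identification, delineation) is then a computation on this grid of numbers.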

Image interpretation includes:

  • detection: searching for features such as hot spots in mechanical and electrical facilities or white spots in X-ray images. This is often the first step of image interpretation.
  • identification: recognizing a particular target. A simple example is identifying vegetation types, soil types, rock types, and water bodies. The higher the spatial and spectral resolution of an image, the more detail we can derive from it.
  • delineation: outlining the recognized target for mapping purposes. Identification and delineation together are used to map particular subjects. When the whole image is processed by these two procedures, the result is called image classification.
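To make two of the steps above concrete, here is a minimal sketch of detection and delineation on a tiny made-up thermal image. It assumes a hot spot is simply any cell above a fixed threshold and that a bounding box is an acceptable outline. Real tools are far more sophisticated, and the function names here are hypothetical.

```python
def detect_hot_spots(image, threshold):
    """Detection: return (row, col) for every cell brighter than the threshold."""
    return [
        (r, c)
        for r, row in enumerate(image)
        for c, value in enumerate(row)
        if value > threshold
    ]

def delineate(cells):
    """Delineation: bounding box (min_row, min_col, max_row, max_col) around cells."""
    if not cells:
        return None
    rows = [r for r, _ in cells]
    cols = [c for _, c in cells]
    return (min(rows), min(cols), max(rows), max(cols))

# Hypothetical temperature readings; the warm patch sits near the middle.
thermal = [
    [20, 21, 19, 20],
    [22, 90, 95, 21],
    [20, 88, 20, 19],
]

hot = detect_hot_spots(thermal, threshold=80)
box = delineate(hot)
print(hot)  # [(1, 1), (1, 2), (2, 1)]
print(box)  # (1, 1, 2, 2)
```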

Engineers can get creative in where the data comes from and how it’s structured, but the data places a hard limit on how far a given algorithm can reach. The same algorithm can’t recognize an ocelot unless it’s seen thousands of pictures of an ocelot, even if it’s seen pictures of housecats and jaguars and knows ocelots are somewhere in between. That process, called “generalization,” requires a different set of skills.

The generalization analysis functions are used either to clean up small amounts of erroneous data in the raster or to generalize the data, removing unnecessary detail for a more general analysis. The erroneous data may be unclassified data originating from a raw image, unnecessary lines or text originating from a scanned paper map, or data imported from some other raster format.

The image below is the raw satellite scene that will be classified.

Raw satellite image

In a supervised classification, training samples are identified on an image, such as the satellite image. The training samples are taken in different land uses to identify water, residential, hardwoods, conifers, and so on. From these training samples, all other cell locations in the image are allocated to one of these known land types or uses. Sometimes land use signatures (statistics derived from the training samples) are similar, making it difficult to distinguish between two classes. For example, with the existing training samples, the software may not be able to distinguish between an alder swamp and a wetland with hardwoods. This may be due to an inadequate number of training samples or the fact that certain land uses were never sampled at all. These limitations, as well as others, can lead to the misclassification of certain locations. As a result, a single cell or a small group of cells may be misclassified as something different from the sea of cells surrounding it, when in reality it belongs to the surrounding group. Another typical area of misclassification is the boundary between different land uses. Often what results is a jagged, unrealistic representation of the boundary, which can be smoothed with the generalization tools. Below is the classification of the satellite image. Notice the many small, isolated single cells and groups of cells throughout the image.

Classified satellite image
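One common way to clean up isolated cells like these is a majority filter. The sketch below is a simplified illustration, not the implementation used by any particular GIS package. It assumes an 8-cell neighbourhood and replaces a cell only when more than half of its neighbours agree on a class. The grid and the class letters (W for water, F for forest, R for residential) are invented for the example.

```python
from collections import Counter

def majority_filter(grid):
    """Return a copy of grid where each cell takes the class held by a strict
    majority of its neighbours (8-neighbourhood, clipped at the edges).
    Cells without a strict majority among their neighbours are left unchanged."""
    rows, cols = len(grid), len(grid[0])
    cleaned = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            neighbours = [
                grid[nr][nc]
                for nr in range(max(0, r - 1), min(rows, r + 2))
                for nc in range(max(0, c - 1), min(cols, c + 2))
                if (nr, nc) != (r, c)
            ]
            label, count = Counter(neighbours).most_common(1)[0]
            if count > len(neighbours) / 2:
                cleaned[r][c] = label
    return cleaned

# Hypothetical classified raster: one stray "R" cell inside a body of water.
classified = [
    ["W", "W", "W", "F", "F"],
    ["W", "R", "W", "F", "F"],
    ["W", "W", "W", "F", "F"],
    ["W", "W", "W", "F", "F"],
]

cleaned = majority_filter(classified)
for row in cleaned:
    print(" ".join(row))
```

In this example the stray residential cell is absorbed into the surrounding water, while the water and forest boundary is left where it was.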

Generalization error is the error obtained by applying a model to data it has not seen before. So, to measure generalization error, you need to hold out a subset of your data and not train your model on it. After training, you measure the model’s accuracy (or another performance measure) on the held-out subset, which the model has not seen before.
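The hold-out procedure described above can be sketched in a few lines of plain Python. Everything here is a toy under stated assumptions: the data set is synthetic, the "model" is just a threshold on a single feature, and the helper names (`train_test_split`, `fit_threshold`, `error_rate`) are made up for this example rather than taken from a library.

```python
import random

def train_test_split(data, test_fraction=0.2, seed=0):
    """Shuffle a copy of the data and hold out a fraction of it for testing."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_threshold(train):
    """Toy model: the midpoint between the mean feature value of each class."""
    positives = [x for x, y in train if y == 1]
    negatives = [x for x, y in train if y == 0]
    return (sum(positives) / len(positives) + sum(negatives) / len(negatives)) / 2

def error_rate(threshold, data):
    """Fraction of examples whose predicted label differs from the true label."""
    wrong = sum(1 for x, y in data if (1 if x > threshold else 0) != y)
    return wrong / len(data)

# Synthetic data: one feature, label is 1 for the upper half of the range.
data = [(x / 10, 1 if x >= 50 else 0) for x in range(100)]

train, test = train_test_split(data)
threshold = fit_threshold(train)            # the model never sees `test`
train_error = error_rate(threshold, train)
held_out_error = error_rate(threshold, test)  # estimate of generalization error

print(f"train error:    {train_error:.2f}")
print(f"held-out error: {held_out_error:.2f}")
```

The number to watch is the held-out error. A model can score well on its own training data and still do poorly on examples it has never seen.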

For a long time, researchers thought they could improve generalization skills with the right algorithms, but recent research has shown that conventional deep learning is even worse at generalizing than we thought. One study found that conventional deep learning systems have a hard time even generalizing across different frames of a video, labeling the same polar bear as a baboon, mongoose, or weasel depending on minor shifts in the background. With each classification based on hundreds of factors in aggregate, even small changes to pictures can completely change the system’s judgment, something other researchers have taken advantage of in adversarial data sets.

Marcus points to the chat bot craze as the most recent example of hype running up against the generalization problem. “We were promised chat bots in 2015,” he says, “but they’re not any good because it’s not just a matter of collecting data.” When you’re talking to a person online, you don’t just want them to rehash earlier conversations. You want them to respond to what you’re saying, drawing on broader conversational skills to produce a response that’s unique to you. Deep learning just couldn’t make that kind of chat bot. Once the initial hype faded, companies lost faith in their chat bot projects, and there are very few still in active development.

That leaves Tesla and other autonomy companies with a scary question: Will self-driving cars keep getting better, like image search, voice recognition, and the other AI success stories? Or will they run into the generalization problem like chat bots? Is autonomy an interpolation problem or a generalization problem? How unpredictable is driving, really?

It may be too early to know. “Driverless cars are like a scientific experiment where we don’t know the answer,” Marcus says. We’ve never been able to automate driving at this level before, so we don’t know what kind of task it is. To the extent that it’s about identifying familiar objects and following rules, existing technologies should be up to the task. But Marcus worries that driving well in accident-prone scenarios may be more complicated than the industry wants to admit. “To the extent that surprising new things happen, it’s not a good thing for deep learning.”

One study by the RAND Corporation estimated that self-driving cars would have to drive 275 million miles without a fatality to prove they were as safe as human drivers. The first death linked to Tesla’s Autopilot came roughly 130 million miles into the project, well short of the mark.
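For readers wondering where a number like 275 million comes from, here is a back-of-the-envelope sketch. It rests on assumptions that are not stated in this article: a human fatality rate of about 1.09 deaths per 100 million miles, a 95% confidence level, and fatalities modeled as rare independent events, so the chance of zero fatalities in n miles is exp(-rate × n).

```python
import math

def failure_free_miles(rate_per_mile, confidence):
    """Miles that must be driven with zero fatalities before we can say, at the
    given confidence, that the true fatality rate is no higher than rate_per_mile.
    Solves exp(-rate_per_mile * n) = 1 - confidence for n."""
    return -math.log(1 - confidence) / rate_per_mile

# Assumed benchmark, not taken from this article: 1.09 fatalities per 100 million miles.
HUMAN_FATALITY_RATE = 1.09 / 100_000_000

miles = failure_free_miles(HUMAN_FATALITY_RATE, confidence=0.95)
print(round(miles / 1e6))  # roughly 275 (million miles)
```

Under those assumptions the arithmetic lands close to the 275 million miles quoted above. Changing the confidence level or the benchmark rate moves the answer considerably.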

Drive.AI founder Andrew Ng, a former Baidu executive and one of the industry’s most prominent boosters, argues the problem is less about building a perfect driving system than training bystanders to anticipate self-driving behavior. In other words, we can make roads safe for the cars instead of the other way around. As an example of an unpredictable case, Ng was asked whether he thought modern systems could handle a pedestrian on a pogo stick, even if they had never seen one before. “I think many AV teams could handle a pogo stick user in pedestrian crosswalk,” Ng said. “Having said that, bouncing on a pogo stick in the middle of a highway would be really dangerous.”

“Rather than building AI to solve the pogo stick problem, we should partner with the government to ask people to be lawful and considerate,” he said. “Safety isn’t just about the quality of the AI technology.”

Alternatives to Deep Learning

Deep learning isn’t the only AI technique, and companies are already exploring alternatives. Though techniques are closely guarded within the industry (just look at Waymo’s recent lawsuit against Uber), many companies have shifted to rule-based AI, an older technique that lets engineers hard-code specific behaviors or logic into an otherwise self-directed system. It doesn’t have the same capacity to write its own behaviors just by studying data, which is what makes deep learning so exciting, but it would let companies avoid some of deep learning’s limitations. But with the basic tasks of perception still profoundly shaped by deep learning techniques, it’s hard to say how successfully engineers can quarantine potential errors.
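As a rough illustration of what hard-coding rules around a learned system might look like, consider the sketch below. It is not how any named company does it. The `Perception` fields, the `learned_policy` stand-in, and the specific thresholds are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Perception:
    """Hypothetical summary of what the perception system reports."""
    obstacle_distance_m: float
    speed_limit_kph: float

def learned_policy(perception):
    """Stand-in for a trained model; a real one would be learned from data."""
    return {"target_speed_kph": 60.0, "brake": False}

def apply_rules(perception, action):
    """Hand-written rules that can override whatever the learned policy proposes."""
    action = dict(action)
    # Rule 1: always brake when an obstacle is closer than 10 metres.
    if perception.obstacle_distance_m < 10.0:
        action["brake"] = True
        action["target_speed_kph"] = 0.0
    # Rule 2: never exceed the posted speed limit.
    action["target_speed_kph"] = min(action["target_speed_kph"], perception.speed_limit_kph)
    return action

near = Perception(obstacle_distance_m=5.0, speed_limit_kph=50.0)
clear = Perception(obstacle_distance_m=100.0, speed_limit_kph=50.0)

print(apply_rules(near, learned_policy(near)))    # {'target_speed_kph': 0.0, 'brake': True}
print(apply_rules(clear, learned_policy(clear)))  # {'target_speed_kph': 50.0, 'brake': False}
```

Note that the rules are only as good as the `Perception` values they receive. If a deep learning system misjudges the obstacle distance, the hard-coded rule inherits that mistake.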

Future of Autonomous Cars

The dream of a fully autonomous car may be further than we realize. There’s growing concern among AI experts that it may be years, if not decades, before self-driving systems can reliably avoid accidents. As self-trained systems grapple with the chaos of the real world, experts like NYU’s Gary Marcus are bracing for a painful recalibration in expectations, a correction sometimes called “AI winter.” That delay could have disastrous consequences for companies banking on self-driving technology, putting full autonomy out of reach for an entire generation.

Autonomous car

Will autonomous cars meet the same fate as chat bots once the initial hype fades? Time will tell. Let’s hope full autonomy does not stay out of reach for an entire generation. 🤲🙏

Source: Deep Learning on Medium