Original article was published by Sriya M on Artificial Intelligence on Medium
Future perfect autonomous cars
Technology it takes to build one
We have all watched Marvel’s Doctor Strange and its famous car crash scene. A serious lapse in Dr. Stephen Vincent Strange’s judgement, wilfully distracting himself while driving, sent him crashing down the hill. The car in the scene is a 10-cylinder, $237,250 Lamborghini Huracán LP610. Setting time travel theory aside, could Artificial Intelligence have foreseen this harrowing crash before it happened? Could it have detected the car speeding and drifting out of its lane, and applied the brakes? Or, even better, could it have let the doctor review his scans while the car took over the driving? Well, in theory, ‘Yes’. But…
We are reaching the end of 2020, and it appears that our driverless cars are receding into the future.
From da Vinci’s Self-Propelled Cart around 1500, to John McCarthy’s Dartmouth Conference in 1956, at which the term ‘Artificial Intelligence’ was first adopted, to Tesla’s first test drive of an autonomous car in 2015, we have come a long way in auto tech advancements. The last decade witnessed big advances in AI, computer vision and object recognition, and speech generation, which gave us an optimistic 2020 dream of cruising in a hands-free car.
Large auto tech players, including Waymo, Tesla, GM, Uber, Baidu, Ford, and Toyota, had spent an enormous USD 16 billion on their mobility automation race by 2019. Spending of $85 billion is estimated by 2025, increasing up to $225 billion by 2023.
Despite these extraordinary efforts and substantial expenditure, full automation in our vehicles is still a distant reality, except in special trial programs. Even Gartner, a research and advisory firm, has now placed ‘autonomous vehicles’ in the trough of disillusionment of its yearly Hype Cycle.
So how complex is it to build a fully driverless car? Let’s understand its technological facets.
Self-driving vehicles turned out to be a much more difficult engineering challenge than experts initially anticipated. They require a complex interaction of numerous digital and physical systems that can emulate human behaviours, such as reacting to weather conditions and making judgement calls on the right of way of vehicles and pedestrians.
In the explanation that follows, we have left out much of the technical complexity to give a simple account of the various elements of autonomous vehicles.
Varying degrees of automation in our vehicles are represented as Levels (L0-L5). L0 stands for no automation; L1 for Advanced Driver Assistance Systems (ADAS) that control either steering or speed to support the driver; L2 for systems that control both steering and acceleration simultaneously while the driver keeps monitoring; L3 for conditional automation, where the system can drive without the human monitoring it, provided the human is ready to take over when the system asks; L4 for high automation, where the system can fully drive itself under certain conditions; and L5 for full automation, with the same mobility as a human driver.
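To keep the levels straight, they can be written down as a small lookup. This is only a sketch: the enum member names and the helper `human_must_monitor` are our own illustrative choices, not part of any standard API.

```python
from enum import IntEnum


class AutomationLevel(IntEnum):
    """Levels L0-L5 as described above (member names are illustrative)."""
    NO_AUTOMATION = 0
    DRIVER_ASSISTANCE = 1
    PARTIAL_AUTOMATION = 2
    CONDITIONAL_AUTOMATION = 3
    HIGH_AUTOMATION = 4
    FULL_AUTOMATION = 5


def human_must_monitor(level: AutomationLevel) -> bool:
    """From L0 to L2 the human driver has to watch the road continuously."""
    return level <= AutomationLevel.PARTIAL_AUTOMATION


needs_driver_attention = human_must_monitor(AutomationLevel.PARTIAL_AUTOMATION)
```

The dividing line between L2 and L3 is the one that matters most in practice: it is where responsibility for watching the road moves from the human to the system.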
L1 and L2 automation is already available in cars in the mid-luxury segment, and sales are reportedly soaring. Toyota, Tesla, Nissan, Ford, and BMW led in the number of cars sold in 2019. L3 is mostly at the regulatory clearance stage; Mercedes and BMW are expected to launch cars with L3 autonomous capability by 2021. While companies are currently focusing on an L4 style of architecture with central control, L5 automation is in troubled waters due to various setbacks.
The success of L4 and L5 automation essentially depends on the evolution of AI and sensor technology. Mature technologies such as voice search, voice and speech recognition, motion detection, image recognition and processing, and data analysis are the building blocks that let the vehicle act independently, without human intervention. These technologies enable our cars to perceive the environment, to process inputs and decide the vehicle’s path, and to act upon those decisions all by themselves.
Gathering 360° input data is the first step. Multiple sensors are deployed to do just that
As an autonomous vehicle operates in a dynamic environment (roads, terrains), it needs to build a map of this environment and localise itself within the map. The input for this Simultaneous Localisation and Mapping (SLAM) process needs to come from sensors and from pre-existing maps created by AI systems and humans.
Both active and passive sensors, such as RADAR, LIDAR, thermal and digital cameras, GNSS, and ultrasound, are deployed for the task. Digital cameras use CCD (charge-coupled device) or CMOS (complementary metal-oxide semiconductor) image sensors, which convert the light they receive (visible to near-infrared wavelengths) into an electrical signal, while thermal cameras sense the infrared radiation emitted by warm objects. Together they are useful for detecting hot bodies, such as pedestrians or animals, for gathering visual field information, and for handling peak illumination situations such as the end of a tunnel.
Active sensors such as RADAR (Radio Detection And Ranging), LIDAR (Light Detection And Ranging), and ultrasound have a signal transmission source and rely on the principle of time-of-flight (ToF) to sense the environment. ToF measures the travel time of a signal from its source to a target and back, by waiting for the reflection of the signal to return. Most of these sensors are adversely affected by limitations such as environmental conditions like rain and dust, signal range, and interference from other signals in the field. LIDAR is the most commonly used active sensor, as it provides a 3D map of up to 250 m in range, detecting objects and their movement with fewer limitations. New LIDAR sensors can enable the vehicle to see objects 150–250 m away.
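The time-of-flight principle boils down to one line of arithmetic: the signal travels to the target and back, so the distance is half the round trip multiplied by the wave speed. A minimal sketch follows; the function name and the example echo times are our own assumptions, chosen only for illustration.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0  # LIDAR and RADAR pulses
SPEED_OF_SOUND_M_S = 343.0          # ultrasound in air at roughly 20 °C


def tof_distance_m(round_trip_s: float, wave_speed_m_s: float) -> float:
    """Distance to the target: the pulse goes out and comes back, so halve the path."""
    return wave_speed_m_s * round_trip_s / 2.0


# Assumed example echoes: 1 microsecond for LIDAR, 10 milliseconds for ultrasound
lidar_range_m = tof_distance_m(1.0e-6, SPEED_OF_LIGHT_M_S)
ultrasound_range_m = tof_distance_m(0.01, SPEED_OF_SOUND_M_S)
```

A LIDAR echo that returns after one microsecond corresponds to a target about 150 m away, which is why these sensors need very precise timing, while a 10 ms ultrasound echo corresponds to less than 2 m.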
Vehicle manufacturers use a mixture of cameras and ToF sensors, strategically located, to overcome the shortcomings of each specific technology. For example, Tesla’s Model S uses a forward-mounted radar to sense the road, 3 forward-facing cameras to identify road signs, lanes and objects, and 12 ultrasonic sensors to detect nearby obstacles around the car.
Volvo-Uber uses a top-mounted 360-degree LIDAR to detect road objects, short- and long-range optical cameras to identify road signals, and radar to sense nearby obstacles.
Waymo uses a 360-degree LIDAR to detect road objects, 9 visual cameras to track the road, and a radar for obstacle identification near the car.
Geo-mapping and sensor fusion for instilling awareness in AVs
Once the autonomous vehicle has scanned its environment, it can find its location on the road relative to other objects around it. This information is critical for lower-level path planning, to avoid any collisions with objects in the vehicle’s immediate vicinity. In addition, the vehicle needs its geographical location, expressed as a latitude and longitude, to know its local and global position on Earth and so be able to determine a drive path.
Map services such as Google Maps are widely used for navigation, but HD maps may be required to increase the spatial and contextual awareness of autonomous vehicles. Other methods are also being explored to achieve the task, such as Apple’s autonomous navigation system and Tesla’s high-precision lane line maps. Wayve uses only standard sat-nav and cameras, while MIT took a ‘map-less’ approach, using LIDAR sensors for all aspects of navigation and relying on GPS only for a rough location estimate.
Based on all the raw data captured by the vehicle’s sensors and the pre-existing maps, the automated driving system uses Simultaneous Localisation and Mapping (SLAM) algorithms to construct and update a map of its environment while keeping track of its location in it. To improve SLAM accuracy, sensor fusion comes in handy. Sensor fusion is the process of combining data from multiple sensors and databases to obtain better information than any single source provides. Once its location on its map is known, the system can start planning which path to take to get from point A to point B.
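One simple way to see why fusion helps is inverse-variance weighting, where each sensor’s reading counts in proportion to how much we trust it. This is a sketch of that one textbook technique, not of how any particular vehicle fuses its data; the function name and the radar and LIDAR figures are made up for illustration.

```python
def fuse(estimate_a: float, var_a: float, estimate_b: float, var_b: float):
    """Inverse-variance weighted fusion of two independent measurements."""
    weight_a = 1.0 / var_a
    weight_b = 1.0 / var_b
    fused = (weight_a * estimate_a + weight_b * estimate_b) / (weight_a + weight_b)
    fused_var = 1.0 / (weight_a + weight_b)  # always smaller than either input variance
    return fused, fused_var


# Hypothetical readings of the same obstacle:
# radar says 50.0 m (variance 4.0), LIDAR says 49.0 m (variance 1.0)
distance_m, variance = fuse(50.0, 4.0, 49.0, 1.0)
```

The fused distance lands closer to the more precise LIDAR reading, and the fused variance is lower than that of either sensor alone, which is the improvement sensor fusion is after.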
Processing data using Machine Learning
A complex end-to-end solution based on deep learning algorithms handles all the processing and decision making required to go from sensor data to actual motion. Every step of the process, from sensing through localisation and mapping and path planning to motion control, is handled by a single, comprehensive software element that directly maps sensor inputs to driving actions.
These systems can be created with the help of several different types of machine learning methods, such as Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), and Deep Reinforcement Learning (DRL).
CNNs are mainly used to process images and spatial information to extract features of interest and identify objects in the environment.
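The core operation of a CNN layer is sliding a small kernel over the image and summing the products. Here is a minimal pure-Python sketch of that operation; the toy image and the hand-picked edge kernel are assumptions for illustration, whereas a trained network would learn its kernel weights from data.

```python
def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation, the building block of a CNN layer."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Toy 4x4 image: dark left half (0), bright right half (1)
image = [[0, 0, 1, 1] for _ in range(4)]
# Hand-picked vertical-edge kernel (a real CNN learns these weights)
kernel = [[-1, 1],
          [-1, 1]]

feature_map = conv2d(image, kernel)
```

The resulting feature map is non-zero only in the middle column, exactly where the image goes from dark to bright. Stacking many such learned filters is how a CNN picks out lane markings, signs and other features of interest.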
RNNs are used to extract temporal information or to figure out how an object is moving in time. Such temporal information can be used by the self-driving car to correctly anticipate future actions of surrounding traffic and adjust its trajectory as needed.
DRL combines Deep Learning (DL) and Reinforcement Learning. DRL methods let software-defined ‘agents’ learn the best possible actions to achieve their goals in a virtual environment using a reward function.
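To make the idea of an agent learning from a reward function concrete, here is a toy sketch. Note that it uses plain tabular Q-learning rather than deep RL: a deep variant would replace the table with a neural network. The lane environment, the reward and all parameter values are invented for illustration.

```python
ACTIONS = (-1, 0, 1)   # steer left, hold, steer right
N_POSITIONS = 5        # hypothetical discretised lateral positions across a lane
CENTRE = 2


def step(pos, action):
    """Toy virtual environment: move sideways, reward is best at the lane centre."""
    new_pos = min(max(pos + action, 0), N_POSITIONS - 1)
    reward = -abs(new_pos - CENTRE)
    return new_pos, reward


def train(sweeps=200, alpha=0.5, gamma=0.9):
    """Tabular Q-learning, sweeping every state-action pair deterministically."""
    q = {(s, a): 0.0 for s in range(N_POSITIONS) for a in ACTIONS}
    for _ in range(sweeps):
        for s in range(N_POSITIONS):
            for a in ACTIONS:
                s2, r = step(s, a)
                best_next = max(q[(s2, b)] for b in ACTIONS)
                q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
    return q


q_table = train()
# Greedy policy: in each position, pick the action with the highest learned value
policy = {s: max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(N_POSITIONS)}
```

Without ever being told how to steer, the agent ends up with a policy that moves towards the lane centre from either side and holds position once there, purely because that is what the reward favours.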
Autonomous capabilities are achieved by training the system with colossal volumes of data
Machine learning algorithms improve when they are trained on data sets that represent realistic scenarios. Many open-source datasets, made available by researchers and companies including Aptiv, Lyft, Waymo, and Baidu, are used for semantic segmentation, which labels each pixel of an image with the class of what it represents, as well as for tasks such as street object detection, sign classification, pedestrian detection and depth prediction.
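The last step of semantic segmentation is easy to picture: given a score for every class at every pixel, each pixel takes the label of its highest-scoring class. The sketch below shows only that final step, with a made-up label set and made-up scores standing in for the output of a trained network.

```python
CLASSES = ("road", "pedestrian", "sign")  # hypothetical label set


def segment(scores):
    """Label each pixel with the class whose score is highest."""
    return [
        [CLASSES[max(range(len(CLASSES)), key=lambda k: pixel[k])] for pixel in row]
        for row in scores
    ]


# Made-up per-pixel class scores for a 2x2 image
scores = [
    [(0.9, 0.05, 0.05), (0.2, 0.7, 0.1)],
    [(0.8, 0.1, 0.1), (0.1, 0.2, 0.7)],
]
label_map = segment(scores)
```

Labelled datasets provide the ground-truth version of `label_map` for millions of images, which is what the network is trained to reproduce.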
Autonomous vehicles rely on machine learning algorithms to not only perceive their environment but also to act on that data to control the car. Path planning can be taught to a CNN through imitation learning, in which the CNN tries to imitate the behaviour of a driver from billions of hours of footage of real driving. In more advanced algorithms, DRL is used, where a reward is provided to the autonomous system for driving in an acceptable manner.
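Imitation learning can be illustrated at a very small scale. In the sketch below a one-parameter linear model stands in for the CNN, and a handful of invented demonstrations stand in for the driving footage; the fitting step is ordinary least squares. All names and numbers are assumptions for illustration.

```python
def fit_steering_gain(offsets_m, expert_steering):
    """Least-squares fit of steering = gain * lateral_offset (no intercept)."""
    numerator = sum(x * y for x, y in zip(offsets_m, expert_steering))
    denominator = sum(x * x for x in offsets_m)
    return numerator / denominator


# Made-up demonstrations: offset from lane centre (m) and the expert's steering command
offsets_m = [-1.0, -0.5, 0.0, 0.5, 1.0]
expert_steering = [0.5, 0.25, 0.0, -0.25, -0.5]

gain = fit_steering_gain(offsets_m, expert_steering)


def cloned_policy(offset_m):
    """Imitate the expert by applying the fitted gain to a new situation."""
    return gain * offset_m


steer_at_new_offset = cloned_policy(0.8)
```

The model never sees a rule such as "steer back towards the centre". It recovers that behaviour from the examples alone, and then applies it to an offset that was not in the demonstrations.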
Localised computational power is necessary
Training neural networks and running inference while the vehicle operates both require enormous computing power. Most machine learning tasks are executed on cloud-based infrastructure with large computing power and cooling. With autonomous vehicles, however, relying on the cloud alone may not be possible, as the vehicle needs to react to new data immediately. As such, part of the processing required to operate the vehicle needs to take place onboard, while model refinements can be done in the cloud.
Recent advances in machine learning are focusing on how the huge amount of data generated by the sensors onboard autonomous vehicles can be processed efficiently to reduce the computational cost, using concepts such as attention or core-sets. In addition, advances in chip manufacturing and miniaturisation are increasing the computing capacity that can be mounted on an autonomous vehicle. With advances in networking protocols, cars might be able to rely on low-latency, network-based processing of data to aid them in their autonomous operation.
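A quick back-of-the-envelope calculation shows why latency matters: it converts a delay into the distance the car covers before it can react. The latency figures below are assumptions picked for illustration, not measurements of any real system.

```python
def distance_during_latency_m(speed_kmh: float, latency_ms: float) -> float:
    """Metres the vehicle travels while it waits for a decision."""
    speed_m_s = speed_kmh / 3.6
    return speed_m_s * (latency_ms / 1000.0)


# Assumed figures: 100 ms for a cloud round trip, 10 ms for onboard inference
cloud_gap_m = distance_during_latency_m(100.0, 100.0)
onboard_gap_m = distance_during_latency_m(100.0, 10.0)
```

At 100 km/h, a 100 ms delay means the car travels nearly 3 m "blind", against under 30 cm for a 10 ms onboard decision.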
Communicating and connecting with the environment
Autonomous vehicles won’t gain widespread acceptance until the riding public feels assured of safety and security, not only for passengers but also for other vehicles and pedestrians. Hence, vehicle-to-vehicle (V2V), vehicle-to-pedestrian (V2P) and vehicle-to-infrastructure (V2I) information sharing is critical for autonomous vehicles. This communication enables better traffic management through interaction with autonomous and non-autonomous traffic, and improves pedestrian safety.
The communication systems needed can be summed up under the umbrella term Vehicle-to-Everything (V2X) communications. The technology used to achieve this is Dedicated Short-Range Communication (DSRC), Cellular V2X (C-V2X), or both. In C-V2X, communication between a vehicle and other vehicles or devices is set up directly, without network access, through an interface called PC5. This interface is useful for basic safety services such as sudden braking warnings, or for traffic data collection. C-V2X also provides another communication interface called Uu, which allows the vehicle to communicate directly with the cellular network, a feature that DSRC does not provide.
Currently, C-V2X relies on fourth-generation (LTE/4G) mobile networks. These are fast enough for gaming or streaming content but lack the speed and resilience required to sustain autonomous vehicle network operations. The arrival of 5G services can boost the accuracy and reliability of V2X communication technology. The main advantages of 5G include greater data speeds (25–50% faster than 4G LTE), lower latency (25–40% lower than 4G LTE), and the ability to serve more devices. However, the security of V2X communications and regulatory challenges remain unresolved.
Beyond the communication standard, the cloud network architecture is also a key component for autonomous vehicles. In this space, the infrastructure developed by companies such as Amazon AWS, Google Cloud and Microsoft Azure for other applications is already mature enough to handle autonomous vehicle applications.
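To give a feel for what a sudden braking warning might carry, here is a purely hypothetical sketch. The message fields, the JSON encoding and the 4 m/s² threshold are all our own assumptions; real V2X deployments use standardised message sets and encodings rather than anything shown here.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class BrakeWarning:
    """Hypothetical direct vehicle-to-vehicle safety message."""
    sender_id: str
    lat: float
    lon: float
    speed_m_s: float
    decel_m_s2: float


def encode(msg: BrakeWarning) -> bytes:
    """Serialise the message for broadcast (JSON is an illustrative choice)."""
    return json.dumps(asdict(msg)).encode("utf-8")


def decode(payload: bytes) -> BrakeWarning:
    """Rebuild the message on the receiving vehicle."""
    return BrakeWarning(**json.loads(payload.decode("utf-8")))


def is_hard_braking(msg: BrakeWarning, threshold_m_s2: float = 4.0) -> bool:
    """Assumed rule: treat deceleration at or above the threshold as an alert."""
    return msg.decel_m_s2 >= threshold_m_s2


sent = BrakeWarning("car-42", 52.52, 13.405, 22.0, 6.5)
received = decode(encode(sent))
alert = is_hard_braking(received)
```

The point of a direct link for this kind of message is that the warning reaches the car behind without a detour through the network, so the following vehicle can start braking before its own sensors see the problem.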
Power, heat, weight, and size challenges are still a concern
Autonomous vehicles also face challenges with the power consumption, thermal footprint, weight, and size of their components. The prime driver of high power consumption is the computational requirement, which involves processing more lines of code than any software platform or operating system created so far.
The thermal performance of the vehicle is also a necessary consideration, as increased processing demand and higher power throughput heat up the system. Cooling electronic components and keeping them within certain temperature ranges, regardless of the vehicle’s external conditions, is essential for the proper functioning of the system. But extra cooling systems (especially liquid-based ones), extra components and extra wiring all add weight and size to the vehicle. One way to compensate for this is by reducing the size of LIDARs and other semiconductor components.
In 2020, the state of autonomous vehicles is such that no technology is yet capable of Level 5, full automation. Level 4, or high automation, has achieved the ability to drive without human supervision and interference, albeit under strictly defined conditions. Autonomous vehicle technology is facing many unforeseen challenges for technology developers, along with scaled-back projections from automakers. To make an autonomous future a reality, significant collaboration across the auto and technology industries will be required. Additionally, policymakers need to get on board with industry players in figuring out how to push this autonomous future forward. Undoubtedly, once achieved, this technology will change the world beyond individual personal transportation, including public transportation, delivery and cargo, and speciality vehicles for farming and mining.