Original article was published by Katie Silverman on Artificial Intelligence on Medium
SPOT: The Future of Accessibility
When I was eight years old, I went into my vision checkup hoping that I would get glasses. I was in third grade, young enough that only a few kids had glasses or braces and that these were considered “cool.” It was unlikely, as my vision had always been good. In my left eye, my performance was decent — not perfect, but not in need of correction. However, when I covered my left eye, suddenly I couldn’t see… anything.
I don’t know how I hadn’t previously realized it, but through my right, the world was a soupy blur of gray with a few scattered bright colors. I said “E” based on my prior knowledge of the eye chart and was promptly issued the most ridiculous-looking glasses you’ve ever seen, with a 20/400 coke bottle lens on one side and no prescription on the other.
Being semi-cycloptic has posed its challenges, to be sure. Even with my super-powerful contact lens, I have no depth perception, making walking into objects a legitimate (though often hilarious for bystanders) risk, sports a joke, and driving a death sentence. I avoid walking down stairs whenever possible because it requires an embarrassing amount of time and caution. However, as I navigate my two-dimensional world, I cannot even imagine the difficulties faced by those with little or no vision at all.
Across the world, 285 million people are visually impaired, of whom 39 million are blind. 18% of legally blind participants in a survey conducted by UC Santa Cruz experience head-level accidents (knocking their head into an unexpected object) more than once a month, and 23% of these accidents have medical consequences, requiring treatments including stitches, staples, plastic surgery, and dental treatment for broken teeth. 10% of legally blind participants trip and fall more than once a month, with 36% of falls having medical consequences. Participants said that they had required treatments ranging from stitches to orthopedic surgery to rehabilitation.
Overall, the world is a dangerous and even scary place, especially for someone who is visually impaired. This is where the challenge lies: how can we make it as easy as possible for people with visual impairments to navigate the world around them, so that the risks are close to or even the same as if they had no impairment at all?
This is where SPOT comes in. Our goal is to use artificial intelligence to improve accessibility for people with visual impairments, and to help every individual reach their maximum potential regardless of ability. While technologies such as Envision glasses (which are built on Google Glass) are already available for preorder, we want to develop a more complete software package that will improve the experiences of visually impaired people. More traditional aids, such as guide dogs and canes, are also not infallible, and are used by only 2–8% of the blind community.
SPOT glasses offer:
- Object recognition
- Face recognition
- Fast decision making
- GPS Navigation
- Text-to-speech (including handwriting)
- Haptic notifications (vibration)
- A discreet camera
- Bone conduction
- Transition lenses to protect the user from harmful UV rays
But how does it work?
YOLO (You Only Look Once) is a real-time object detection algorithm — one of the most effective out there. Many image analysis agents use classification algorithms, which can only determine what objects are present. However, YOLO can also tell where these objects are. Classification algorithms can also detect only one object at a time, while YOLO can run predictions on every object in its field of view with just one pass through its neural network (hence “only look once”), making it extremely fast.
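To make the difference between classification and detection concrete, here is a minimal sketch of what a detector's output might look like and how overlapping guesses are usually cleaned up. The `Detection` type, the sample boxes, and the 0.5 threshold are illustrative assumptions, not SPOT's actual code. The filtering step, called non-maximum suppression, is a standard post-processing step commonly paired with YOLO-style detectors.

```python
from typing import List, NamedTuple, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


class Detection(NamedTuple):
    """Hypothetical detector output: what the object is AND where it is."""
    label: str
    confidence: float
    box: Box


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union: how much two boxes overlap (0.0 to 1.0)."""
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = overlap_w * overlap_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(dets: List[Detection], iou_threshold: float = 0.5) -> List[Detection]:
    """Keep the most confident box for each object, dropping near-duplicates."""
    kept: List[Detection] = []
    for det in sorted(dets, key=lambda d: d.confidence, reverse=True):
        is_duplicate = any(
            det.label == k.label and iou(det.box, k.box) >= iou_threshold
            for k in kept
        )
        if not is_duplicate:
            kept.append(det)
    return kept


# Made-up raw output for one frame: two overlapping guesses for the same door.
raw = [
    Detection("door", 0.90, (100, 50, 200, 300)),
    Detection("door", 0.60, (110, 60, 210, 310)),
    Detection("mug", 0.80, (300, 200, 340, 250)),
]
final = non_max_suppression(raw)
```

A classifier would return only a label such as "door" for this frame. The detector's output also carries a box for each object, which is what lets the glasses say where something is.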
YOLO allows SPOT users to get a complete idea of their surroundings very quickly, including signs, obstacles, an object they’re looking for (such as a door or their mug), or even a friend’s face. This is vital to performing certain tasks, such as crossing the street, safely — SPOT is not only useful in determining whether a light is red or green, but in locating a crosswalk, or helping the user to “look both ways.”
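As a rough illustration of how a detected object's location could become a spoken cue, the sketch below splits the camera frame into thirds and reports which third the object's center falls in. The function name, the 640-pixel frame width, and the wording of the cues are all assumptions made for this example.

```python
def describe_position(label, box, frame_width):
    """Turn a bounding box into a short spoken cue.

    box is (x_min, y_min, x_max, y_max) in pixels. The frame is divided
    into three equal vertical bands: left, ahead, right.
    """
    x_center = (box[0] + box[2]) / 2
    third = frame_width / 3
    if x_center < third:
        side = "on your left"
    elif x_center < 2 * third:
        side = "ahead"
    else:
        side = "on your right"
    return f"{label} {side}"


# Hypothetical detections from a 640-pixel-wide camera frame.
door_cue = describe_position("door", (100, 50, 200, 300), frame_width=640)
crosswalk_cue = describe_position("crosswalk", (250, 300, 400, 480), frame_width=640)
```

Here `door_cue` is "door on your left" and `crosswalk_cue` is "crosswalk ahead". A real device would likely also need distance estimates, which this sketch ignores.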
Part of what makes SPOT different from existing technologies is that it is trained through a type of machine learning called reinforcement learning. SPOT stands for Simulation-Powered Optic Technology, and as the name suggests, SPOT learns in a simulated environment, where it is rewarded for making good choices (such as correctly identifying an object or crossing the street safely). This means that in addition to recognizing its environment, it learns the best reaction to it. This is similar to how self-driving cars are being trained.
Voyage DeepDrive, a “deep reinforcement learning” algorithm for self-driving cars, uses a similar approach. With Voyage DeepDrive, the AI is trained on simulated streets and learns how to behave under certain conditions through a system of punishments and rewards, so that when it’s paired with image recognition software it already knows what to do.
SPOT relies on this synthesis of reinforcement learning and object recognition. SPOT is trained to recognize city streets for increased pedestrian safety, and can even be mapped to a specific user’s home.
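To show the reward-and-punishment idea in miniature, here is a toy sketch using tabular Q-learning, a classic reinforcement learning method. This article does not say which algorithm SPOT or Voyage DeepDrive uses, so Q-learning stands in purely for illustration. The two-state "traffic light" simulator, the reward values, and every name below are assumptions.

```python
import random

ACTIONS = ("wait", "cross")
LIGHTS = ("red", "green")


def step(light, action, rng):
    """Toy simulator. Returns (reward, next_light, done)."""
    if action == "cross":
        # Crossing ends the episode: rewarded on green, punished on red.
        return (1.0 if light == "green" else -10.0), light, True
    # Waiting costs a little time; the light then changes at random.
    return -0.1, rng.choice(LIGHTS), False


def train(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values by trial and error in the simulator."""
    rng = random.Random(seed)
    q = {light: {a: 0.0 for a in ACTIONS} for light in LIGHTS}
    for _ in range(episodes):
        light = rng.choice(LIGHTS)
        for _ in range(20):  # cap on steps per episode
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                action = max(ACTIONS, key=lambda a: q[light][a])  # exploit
            reward, next_light, done = step(light, action, rng)
            # Q-learning update: move the estimate toward the reward plus
            # the discounted value of the best action in the next state.
            target = reward if done else reward + gamma * max(q[next_light].values())
            q[light][action] += alpha * (target - q[light][action])
            if done:
                break
            light = next_light
    return q


def greedy_policy(q):
    """The best-known action for each light color."""
    return {light: max(ACTIONS, key=lambda a: q[light][a]) for light in LIGHTS}


policy = greedy_policy(train())
```

After training, the learned policy is to wait on red and cross on green. Nobody wrote that rule into the code; the agent worked it out from the rewards. A real system would face a far larger state space and would likely need a neural network in place of the lookup table.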
One simple event that would require reinforcement learning is walking up and down a staircase. Ordinarily, there might be a risk of tripping, but by coupling image-recognition software with reinforcement learning, SPOT can recognize not only how many stairs there are, but also when the best time to step is.
Optical Character Recognition (OCR)
SPOT glasses also offer Optical Character Recognition, which can interpret and read out characters and words, including handwritten ones. This feature allows users to “read,” as the SPOT glasses can perform a text-to-speech function. This feature is especially important, as fewer than 10% of legally blind individuals in the U.S. can read Braille, and there is also a lack of availability of books in Braille — not to mention writing that is necessary and imperative to read, such as street signs.
Faster Reaction Times
A well-timed AI can respond to danger faster than a human (or guide dog), but the user still needs time to react to its warning. On average, it takes humans 0.17 seconds to respond to an auditory stimulus, such as SPOT glasses saying “Stop” when there’s a car nearby. This reaction time can be cut further by features that use the user’s other senses: a vibration, for example, has a slightly shorter reaction time of 0.15 seconds, a small but crucial difference in a possibly life-threatening situation.
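To put those two reaction times in perspective, the short calculation below estimates how far an approaching car travels while the user is still reacting. The car's speed of 13.4 m/s (about 30 mph) is an assumption chosen for illustration; the reaction times are the ones quoted above.

```python
def distance_during_reaction(speed_m_per_s, reaction_time_s):
    """Distance covered at constant speed before the user begins to react."""
    return speed_m_per_s * reaction_time_s


CAR_SPEED = 13.4      # m/s, assumed (about 30 mph)
AUDIO_REACTION = 0.17   # seconds, spoken warning
HAPTIC_REACTION = 0.15  # seconds, vibration

audio_distance = distance_during_reaction(CAR_SPEED, AUDIO_REACTION)
haptic_distance = distance_during_reaction(CAR_SPEED, HAPTIC_REACTION)
distance_saved = audio_distance - haptic_distance

print(f"Audio warning:  car travels {audio_distance:.2f} m")
print(f"Haptic warning: car travels {haptic_distance:.2f} m")
print(f"Difference:     {distance_saved:.2f} m")
```

Under that assumed speed, the vibration buys the user roughly 0.27 meters, a little under a foot.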
Another important feature of SPOT glasses is that they are discreet. Current models have large, bulky cameras — you might as well just strap a GoPro to your head. SPOT glasses have better-concealed cameras so that users can feel comfortable using the product without announcing their disability to the world. This is not only a matter of aesthetics but of safety — a person who is very obviously visually impaired might be taken advantage of or stolen from, and in the United States, people with disabilities are three times as likely to be victims of serious violence.
Rather than an external speaker, SPOT glasses utilize bone conduction. Bone conduction bypasses the eardrum to send sound vibrations directly through the user’s skin and skull, making instructions sound like a voice in the user’s head. This also makes the device accessible to many hearing-impaired users: when hearing loss is caused by damage to the eardrum, it does not interfere with bone-conducted sound.
Affordability and Access
One of the biggest reasons why SPOT is necessary is that current technologies are just not affordable to those who need them the most. Envision glasses cost two thousand dollars, while the Envision app is priced at fourteen dollars per month. Other products, such as the OrCam MyEye, are not available in the United States unless you are an eligible veteran. Sight is a human right, which is why one of our most important goals is to make SPOT as accessible as possible.
Because SPOT glasses are designed to help people with visual impairments, not to be a cool gadget, they do not need features such as Bluetooth pairing, various apps, or a general operating system. While it’s possible to have more expensive models with those options, the base model includes only a frame, a small camera, a bone-conduction headset, and a small computer. This drastically reduces the cost of both making and purchasing the glasses.
For people with disabilities, the world can be difficult to navigate in ways that many of us take for granted. However, we are living in an age where technology can be used to close the accessibility gap between those living with and without disabilities. SPOT aims to make use of emerging AI technologies to benefit real people in a tangible way. Here’s a quick reminder of what makes SPOT different:
- YOLO image recognition
- Simulation-powered reinforcement learning
- Optical Character Recognition
- Faster reaction times
- A sleek, subtle design
- Bone conduction
- An affordable model that’s accessible to those who need it the most
Thank you for reading! We hope that this technology can soon be a reality. For more information, please check out our website.