Top 5 Open-Source Speech Recognition Projects and Libraries with the Most Stars on GitHub

Original article was published by MRINAL WALIA on Artificial Intelligence on Medium


1. DeepSpeech — 15,340 stars

DeepSpeech is an open-source speech-to-text engine that can run in real time. It uses a model trained with machine learning techniques based on Baidu’s Deep Speech research paper and is implemented with TensorFlow.

DeepSpeech runs in real time on devices ranging from a Raspberry Pi 4 to high-powered GPU servers, and it supports Linux, Android, Windows, and macOS.

Its API has bindings for:

  • C
  • .NET
  • Java
  • JavaScript
  • Python
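
As a rough illustration of the Python binding listed above, here is a minimal sketch of transcribing a WAV file with a pre-trained model. It assumes the `deepspeech` package from a 0.7-or-later release (where `Model` takes only a model path) and NumPy; the helper names `read_wav_pcm16` and `transcribe` and all file names are hypothetical, not part of DeepSpeech.

```python
import wave


def read_wav_pcm16(path):
    """Return (sample_rate, raw_bytes) for a mono 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected mono 16-bit PCM audio")
        return wav.getframerate(), wav.readframes(wav.getnframes())


def transcribe(model_path, wav_path, scorer_path=None):
    """Run speech-to-text on a WAV file with a pre-trained DeepSpeech model."""
    # Imported lazily so the WAV helper works without these packages installed.
    import numpy as np
    from deepspeech import Model  # assumes the 0.7+ Python API

    model = Model(model_path)
    if scorer_path is not None:
        model.enableExternalScorer(scorer_path)  # optional language-model scorer

    sample_rate, raw = read_wav_pcm16(wav_path)
    if sample_rate != model.sampleRate():
        raise ValueError(
            f"model expects {model.sampleRate()} Hz audio, got {sample_rate} Hz"
        )

    # DeepSpeech expects the samples as a 16-bit integer array.
    audio = np.frombuffer(raw, dtype=np.int16)
    return model.stt(audio)
```

With a downloaded model, a call might look like `transcribe("model.pbmm", "audio.wav", "model.scorer")`, where the three paths are placeholders for your own files.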

To start using a pre-trained model or train your own model with DeepSpeech, you can follow the links below:

GitHub

Official Documentation

Source: https://mycroft.ai/blog/deepspeech-update/?cn-reloaded=1

2. Leon — 7,100 stars

Leon is an open-source personal assistant that lives on your server and performs tasks when you ask him to. You can talk to him and he can talk back, or you can text him and he can text back. Best of all, Leon can work offline to protect your privacy.

Leon uses AI concepts and is built mainly with Node.js and Python. Supported operating systems are Linux, macOS, and Windows.

You can see what he is able to do by browsing the packages list, and read more about him at the links below:

GitHub

Official Documentation

Source: https://github.com/leon-ai/leon

3. Wav2letter — 5,400 stars

Wav2letter++ is Facebook AI Research’s end-to-end automatic speech recognition toolkit, written entirely in C++ and supporting a wide range of models and learning techniques. It is often compared to DeepSpeech because of the many similarities between the two.

Wav2letter++ also includes a very efficient modular beam-search decoder for both structured learning (CTC, ASG) and seq2seq approaches. Its GitHub repository includes recipes to reproduce several research papers, as well as pre-trained models.
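
To make the CTC idea mentioned above more concrete, here is a small Python sketch of greedy CTC decoding: pick the best label per frame, merge consecutive repeats, and drop the blank symbol. This is the simplest CTC decoding rule, not the beam-search decoder Wav2letter++ ships, and the function name, alphabet, and scores are made up for illustration.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Best label per frame, then merge repeats and drop blanks."""
    decoded = []
    previous = blank
    for scores in frame_scores:
        # Index of the highest-scoring label for this frame.
        best = max(range(len(scores)), key=scores.__getitem__)
        # Emit only when the label is not blank and differs from the previous frame.
        if best != blank and best != previous:
            decoded.append(alphabet[best])
        previous = best
    return "".join(decoded)


# Hypothetical alphabet: index 0 is the CTC blank symbol.
alphabet = ["_", "c", "a", "t"]
frames = [
    [0.1, 0.7, 0.1, 0.1],    # c
    [0.1, 0.6, 0.2, 0.1],    # c (repeat, merged)
    [0.8, 0.1, 0.05, 0.05],  # blank
    [0.1, 0.1, 0.7, 0.1],    # a
    [0.1, 0.1, 0.1, 0.7],    # t
]
print(ctc_greedy_decode(frames, alphabet))  # cat
```

A beam-search decoder keeps several candidate transcriptions per frame instead of only the best one, which lets it combine the acoustic scores with a lexicon or language model.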

To start building recipes, clone the project from:

GitHub

Source: https://www.techleer.com/articles/455-wav2letter-a-facebook-ai-researchfair-automatic-speech-recognition-toolkit/

4. Annyang — 5,890 stars

Annyang is an open-source JavaScript speech recognition library that lets users control your site with voice commands. It supports more than 75 languages, has no dependencies, and is free to use and modify.

You can easily add a GUI (graphical user interface) for users to interact with speech recognition by using Speech KITT. Speech KITT is fully customizable and comes with many different themes, plus instructions on how to create your own designs.

GitHub

Play with some live speech recognition demos

Source: https://github.com/TalAter/annyang

5. SpeechRecognition — 5,120 stars

SpeechRecognition is a free and open-source module for performing speech recognition in Python, with support for several engines and APIs in both online and offline mode.
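
As a minimal sketch of the online and offline modes described above, the snippet below transcribes an audio file with either CMU Sphinx (offline, requires the `pocketsphinx` package) or the Google Web Speech API (online). It assumes the `SpeechRecognition` package is installed; the wrapper name `transcribe_file` and its `engine` parameter are hypothetical conveniences, not part of the library.

```python
def transcribe_file(wav_path, engine="sphinx"):
    """Transcribe an audio file offline (Sphinx) or online (Google Web Speech API)."""
    if engine not in ("sphinx", "google"):
        raise ValueError(f"unsupported engine: {engine!r}")

    # Imported lazily; install with `pip install SpeechRecognition`.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)  # read the whole file into memory

    try:
        if engine == "sphinx":
            return recognizer.recognize_sphinx(audio)  # offline
        return recognizer.recognize_google(audio)  # online, needs network access
    except sr.UnknownValueError:
        return None  # the engine could not understand the audio
    except sr.RequestError as error:
        raise RuntimeError(f"recognition engine unavailable: {error}") from error
```

Calling `transcribe_file("audio.wav")` would return the recognized text, or `None` if the speech was unintelligible; `audio.wav` is a placeholder for your own recording.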

It has many usage examples:

GitHub

Official Documentation

Source: https://pypi.org/project/SpeechRecognition/