Introduction to hand detection in the browser with Handtrack.js and TensorFlow



Including Assets

Next, we need to include an audio file in index.html and add a video element so we can access the webcam. Lastly, we need to reference an index.js script file, which will contain all of our JavaScript code. After adding these elements, our index.html file should look like the code snippet below:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <title>Hand Tracking to Prevent COVID-19</title>
  </head>
  <body>
    <audio src="./warning-sound.mp3" id="audio"></audio>
    <video id="video"></video>
    <script src="src/index.js"></script>
  </body>
</html>

The next step is to add the handtrack.js library to our project. To do that, we run the following command in the project's terminal:

yarn add handtrackjs

Then, we need to open our src/index.js file and import the handtrack plugin, as shown below:

import * as handTrack from "handtrackjs";

Now, we need to initialize the handtrack library with a set of default parameters stored in a constant named modelParams. The modelParams constant holds an object with the handtrack plugin's configuration options.

By using the load method provided by the handTrack module, we load these parameters into the plugin and assign the resulting model instance to a variable called model. The implementation is provided in the code snippet below:

const modelParams = {
  flipHorizontal: true,   // flip the image, e.g. for video
  imageScaleFactor: 0.7,  // reduce input image size for gains in speed
  maxNumBoxes: 20,        // maximum number of boxes to detect
  iouThreshold: 0.5,      // IoU threshold for non-max suppression
  scoreThreshold: 0.79,   // confidence threshold for predictions
};

let model;
handTrack.load(modelParams).then((lmodel) => {
  model = lmodel;
});

The initialization of the handTrack instance is now complete. Next, we're going to detect hands on the webcam screen and fetch the prediction data from the plugin.

Fetching the Webcam Stream Data

Fetching the webcam stream data is straightforward: all we have to do is use the browser API MediaDevices.getUserMedia().

First, we need to get the video and audio element using the querySelector method, as shown in the code snippet below:

const video = document.querySelector("#video");
const audio = document.querySelector("#audio");

Then, we connect the webcam stream to the video element so that the handTrack model has a video source to work with.
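
Here's a minimal sketch of how that wiring might look. Only the video element and MediaDevices.getUserMedia() come from the steps above; the startVideo helper name is our own.

// Sketch: stream the webcam into the <video> element using getUserMedia.
// The helper name startVideo is illustrative, not part of handtrack.js.
function startVideo() {
  navigator.mediaDevices
    .getUserMedia({ video: true, audio: false })
    .then((stream) => {
      video.srcObject = stream; // feed the webcam stream to the video element
      return video.play();
    })
    .catch((err) => console.error("Could not access the webcam:", err));
}

startVideo();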

As a reminder, the process is to detect hands by passing the video element to the model's detection function, which returns the prediction data.

Next, we need to run this detection function repeatedly, roughly once per second, so that we keep getting fresh prediction data from the webcam stream while it plays.
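
A sketch of that loop might look like the following, assuming the loaded model exposes a detect method that resolves with an array of predictions, as handtrack.js documents; the runDetection helper name is our own:

// Sketch: run hand detection on the video element roughly once per second.
function runDetection() {
  if (!model) return; // the model may still be loading
  model.detect(video).then((predictions) => {
    console.log("Predictions:", predictions); // empty array when no hand is visible
  });
}

setInterval(runDetection, 1000);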

As a result, the length of the prediction data will be zero when no hand appears on the screen, and greater than zero whenever a hand does appear.

By using a simple condition based on the length of the prediction data, we can trigger the warning sound whenever a hand appears on the webcam screen, as sketched below.
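
For example, the detection callback from the earlier sketch could be extended roughly like this; treat it as a sketch rather than the article's exact implementation:

// Sketch: play the warning sound whenever at least one hand is detected.
function runDetection() {
  if (!model) return;
  model.detect(video).then((predictions) => {
    if (predictions.length > 0) {
      audio.play(); // a hand is visible on the webcam screen
    } else {
      audio.pause();
      audio.currentTime = 0; // reset the warning sound when no hand is visible
    }
  });
}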

With that, we have successfully completed our simple hand detection app.

Conclusion

In this post, we used the power of TensorFlow in the web JavaScript environment to detect hands through the webcam, and we learned how to detect hand movement with Handtrack.js. The aim of this project was to detect a hand before it touches the face, using the webcam to send visual data to the system. With Handtrack.js and TensorFlow, the system detects the hand and notifies the user. This project is just a starting point for what we can do with machine learning technologies like TensorFlow, and there are many other technologies you can use to make it better.

The full source code is available in this GitHub repo: