Real-Time Question & Answer App using Gesture Recognition and OpenCV

Sem Onyalo
3 min read · Jul 18, 2020

Introduction

I look up from my phone while riding the subway and see a beautiful woman sitting across from me. In my high school days I could never have walked up to a woman offhandedly. But I’m a grown man now. Full of confidence, bravado, and caffeine. Still, I hesitate. Not because the caffeine is wearing off, but because I can only see the left side of her face. It’s rush hour, the subway is packed, and a public rejection would not be a good start to my Monday morning. Plus, do a left eye and half a jawline indicate beauty?

Anyway, the episode got me thinking. Two more cups of coffee that morning would have had me staring down the barrel of public humiliation like a project manager asking the development team for an update right after standup. What would it have taken for the high school version of myself to approach that woman?

Project

To a shy person, public rejection feels worse than death. Nothing can replace the courage needed to initiate a conversation, but knowing that any rejection will be low-key can soften the blow and make it easier to muster that courage in the first place.

Imagine if you had an app on your phone where you could get a yes or no answer with a simple nod or shake of the head. That’s what I’ll be explaining how to build in this article (link to source code is at the bottom of this article).

Haar Cascade Classifier

The first step in detecting a head nod or shake is to detect the face. I used a Haar cascade classifier to do this. A Haar cascade classifier slides rectangular patterns (Haar-like features) over an image and, for each one, compares the summed pixel intensities of its adjacent light and dark regions. Passing these values through a cascade of stages already trained to recognize faces can detect where a face is in an image with over 95% accuracy!
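Here is a minimal sketch of what that detection step could look like in Python. It assumes the `opencv-python` package and the frontal-face cascade file that ships with it; the helper names (`load_face_cascade`, `detect_face`, `largest_box`) and the parameter values are illustrative choices of mine, not necessarily what the project's source code uses.

```python
def largest_box(boxes):
    """Return the (x, y, w, h) box with the largest area, or None if empty."""
    boxes = [tuple(int(v) for v in box) for box in boxes]
    if not boxes:
        return None
    return max(boxes, key=lambda box: box[2] * box[3])


def load_face_cascade():
    """Load the pre-trained frontal-face cascade bundled with opencv-python."""
    import cv2  # imported here so largest_box() works without OpenCV installed

    path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    return cv2.CascadeClassifier(path)


def detect_face(frame_bgr, cascade):
    """Detect faces in a BGR frame and keep only the largest one."""
    import cv2

    # Haar cascades operate on single-channel (grayscale) images.
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    # scaleFactor / minNeighbors / minSize are tunable guesses, not tuned values.
    boxes = cascade.detectMultiScale(
        gray, scaleFactor=1.1, minNeighbors=5, minSize=(60, 60)
    )
    return largest_box(boxes)
```

Keeping only the largest box is a simple way to assume the person answering is the one closest to the camera.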

Source: https://docs.opencv.org/

Lucas-Kanade Optical Flow

After detecting where the face is, we then need to figure out how the face is moving. I used a method called Lucas-Kanade optical flow to achieve this. The Lucas-Kanade method estimates how points on an object move from one frame to the next by assuming that pixels close together move in a similar way. Tracking points on the face this way allows us to determine if the face is moving along the vertical axis (i.e. a yes) or along the horizontal axis (i.e. a no).
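As a rough illustration of the tracking step, the sketch below picks corner-like points inside a face box, follows them into the next frame with OpenCV's pyramidal Lucas-Kanade implementation, and averages their movement. The function names, the `(x, y, w, h)` face box format, and all parameter values are assumptions made for this example.

```python
def pick_face_points(gray, face_box, max_points=50):
    """Pick trackable corner points inside the face box (may return None)."""
    import cv2
    import numpy as np

    x, y, w, h = face_box
    mask = np.zeros_like(gray)
    mask[y:y + h, x:x + w] = 255  # only search for corners inside the face
    return cv2.goodFeaturesToTrack(
        gray, maxCorners=max_points, qualityLevel=0.01, minDistance=5, mask=mask
    )


def track_points(prev_gray, next_gray, prev_pts):
    """Follow prev_pts from prev_gray into next_gray with Lucas-Kanade flow."""
    import cv2

    next_pts, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_gray, next_gray, prev_pts, None,
        winSize=(15, 15), maxLevel=2,
        criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 0.03),
    )
    found = status.reshape(-1) == 1  # keep only points that were tracked
    return prev_pts[found].reshape(-1, 2), next_pts[found].reshape(-1, 2)


def mean_displacement(old_pts, new_pts):
    """Average (dx, dy) movement between matching points, in pixels."""
    moves = [
        (float(nx - ox), float(ny - oy))
        for (ox, oy), (nx, ny) in zip(old_pts, new_pts)
    ]
    if not moves:
        return 0.0, 0.0
    count = len(moves)
    return sum(dx for dx, _ in moves) / count, sum(dy for _, dy in moves) / count
```

A mostly vertical average displacement suggests the head is moving up or down, and a mostly horizontal one suggests it is turning left or right.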

Question & Answer

The final step is to use our new yes/no detector in a question and answer flow. OpenCV provides both a Haar cascade classifier and a Lucas-Kanade implementation out of the box, so all we need to do is build a Python application to put everything together.
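One way to turn tracked motion into an answer is sketched below. It assumes some tracker yields one average `(dx, dy)` displacement per frame, sums the distance travelled along each axis over a short window, and answers once one axis clearly dominates. The names `classify_gesture` and `ask` and the threshold values are hypothetical and would need tuning against a real camera feed.

```python
from collections import deque


def classify_gesture(displacements, min_travel=40.0, dominance=2.0):
    """Return "yes" for a nod, "no" for a shake, or None if undecided.

    displacements is an iterable of per-frame (dx, dy) movements in pixels.
    """
    moves = list(displacements)
    # Use absolute values: a nod or shake moves back and forth, so signed
    # displacements would largely cancel out.
    travel_x = sum(abs(dx) for dx, _ in moves)
    travel_y = sum(abs(dy) for _, dy in moves)
    if travel_y >= min_travel and travel_y > dominance * travel_x:
        return "yes"  # mostly vertical movement
    if travel_x >= min_travel and travel_x > dominance * travel_y:
        return "no"   # mostly horizontal movement
    return None


def ask(question, displacement_stream, window=30):
    """Show a question and wait for a nod or shake in the movement stream."""
    print(question)
    recent = deque(maxlen=window)  # only the last `window` frames count
    for move in displacement_stream:
        recent.append(move)
        answer = classify_gesture(recent)
        if answer is not None:
            return answer
    return None  # stream ended without a clear gesture
```

For example, `ask("Is it Monday?", [(0.5, 6.0), (-0.5, -6.0)] * 5)` returns `"yes"`, because nearly all of the movement is vertical.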

Conclusion

That’s it! It’s actually relatively simple to build. To build a simple yes/no question & answer app, we used a Haar cascade classifier to detect the person’s face, Lucas-Kanade optical flow to track how the face was moving, and Python with OpenCV to turn it into an app.

OpenCV supports Android, so turning this into a mobile app should be straightforward. That said, I’m not sure I’d recommend using it to try to get dates on the subway. You might get the same reaction my PM gets when he asks me for an update four hours after stand-up :)

Thanks for reading!
