GestureFlow

hand-gesture media control · v1.2

Wave your hand.
The video obeys.
No touch. No remote.

21 hand landmarks tracked by MediaPipe, classified by simple geometric rules, and mapped to YouTube keyboard shortcuts via PyAutoGUI. Built for the times when you're cooking, working in a lab, or your hands just can't reach the keyboard.
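The rule-based classification can be sketched in a few lines. The landmark indices below follow MediaPipe's real hand model (wrist = 0, fingertips = 4/8/12/16/20, PIP joints = 6/10/14/18); the heuristic itself — "a finger is extended when its tip sits above its PIP joint" — is a common simplification and an illustration, not necessarily the project's exact rules:

```python
# MediaPipe Hands landmark indices (these are the library's real indices).
FINGER_TIPS = [8, 12, 16, 20]   # index, middle, ring, pinky tips
FINGER_PIPS = [6, 10, 14, 18]   # corresponding PIP joints
THUMB_TIP, THUMB_IP, WRIST = 4, 3, 0

def count_extended(landmarks):
    """Count extended fingers from 21 (x, y) landmark pairs.

    Assumes an upright hand in image coordinates (y grows downward):
    a finger is 'extended' when its tip is above its PIP joint. The
    thumb is compared on x-distance from the wrist instead — a common
    simplification for this kind of rule-based classifier.
    """
    count = 0
    for tip, pip in zip(FINGER_TIPS, FINGER_PIPS):
        if landmarks[tip][1] < landmarks[pip][1]:
            count += 1
    wrist_x = landmarks[WRIST][0]
    if abs(landmarks[THUMB_TIP][0] - wrist_x) > abs(landmarks[THUMB_IP][0] - wrist_x):
        count += 1
    return count

def classify(landmarks):
    """Map a finger count to a (hypothetical) gesture label."""
    n = count_extended(landmarks)
    if n == 0:
        return "fist"
    if n >= 4:                  # tolerate a sloppy thumb
        return "open_palm"
    if n == 1 and landmarks[8][1] < landmarks[6][1]:
        return "point"          # only the index finger is up
    return "unknown"
```

Rules like these are cheap (a handful of comparisons per frame), which is what keeps the classify stage in the low-millisecond range.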

60 fps processing
<50 ms end-to-end latency
21 landmarks tracked
7-gesture vocabulary
2.3% false-positive rate

01 · Gesture Vocabulary

Click any card to see the action that gesture triggers.
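In code, the vocabulary reduces to a lookup table. The gesture names below are hypothetical placeholders (the cards define the real ones), but the key codes are genuine YouTube player shortcuts, and the volume arrows assume the player has keyboard focus:

```python
# Hypothetical gesture names; key codes are real YouTube shortcuts.
GESTURE_TO_KEY = {
    "open_palm":   "k",     # play / pause
    "point_left":  "j",     # rewind 10 s
    "point_right": "l",     # fast-forward 10 s
    "fist":        "m",     # mute / unmute
    "thumbs_up":   "up",    # volume up (player must be focused)
    "thumbs_down": "down",  # volume down
    "two_fingers": "f",     # toggle fullscreen
}

def dispatch(gesture):
    """Resolve a classified gesture to its shortcut key, or None.

    In the real pipeline the resolved key would be injected with
    pyautogui.press(key); returning it keeps this sketch testable.
    """
    return GESTURE_TO_KEY.get(gesture)
```

Seven entries, matching the 7-gesture vocabulary above; unknown gestures fall through to `None` so nothing fires on noise.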

02 · Detection Pipeline

From camera frame to YouTube keystroke in under 50 milliseconds.

1. Camera · 30 fps capture
2. MediaPipe · 21 landmarks
3. Classify · rule-based
4. Debounce · 250 ms window
5. PyAutoGUI · key injection
Per-stage latency: Camera 12 ms · MediaPipe 18 ms · Classify 4 ms · Keystroke <1 ms

03 · Why This Matters

Accessibility first

Built while thinking about users with limited mobility, motor-control differences, or temporary injuries. Repetitive reaching for a keyboard compounds over time — gesture control removes that friction. Works with any camera, no extra hardware.

Privacy-respecting by design: all inference runs locally. No frames or landmarks ever leave the device.

04 · Tech Stack

Python 3.11 · OpenCV · MediaPipe Hands · PyAutoGUI · NumPy · Tkinter (settings UI)