2nd · Qualcomm

Real-Time Pose Detection

Qualcomm Megathon 2024 · Oct 2024

Real-time human pose estimation tuned to run smoothly on Snapdragon mobile hardware.

Result: 2nd
Year: 2024

Outcome

2nd place, for a real-time pose detection pipeline that balances speed, accuracy, and power consumption on-device.

The problem

Run real-time human pose estimation on a phone — fast enough to feel live, accurate enough to be useful, and efficient enough not to drain the battery. The constraint that made it interesting was the hardware target: Snapdragon mobile silicon, not a desktop GPU.

Approach

A pose pipeline built on MediaPipe, then optimized for the Qualcomm Snapdragon platform using the QIDK toolkit:

Real-time camera-feed processing with skeletal landmark detection and on-screen visualization.
Inference tuned for the NPU, trading off model size, latency, and power to hit a smooth on-device frame rate.
An emphasis on the three-way balance — speed vs. accuracy vs. power — that decides whether on-device vision is actually usable.
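
The three-way trade-off above can be sketched as a small selection problem: turn the target frame rate into a per-frame time budget, discard any model variant that misses that budget or exceeds a power cap, and keep the most accurate of what remains. This is a minimal illustration, not the pipeline used at the event. The `Candidate` type, the `pick` function, the variant names, and every number below are hypothetical placeholders, not measurements from the project.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """One hypothetical model variant profiled on the target device."""
    name: str
    latency_ms: float  # per-frame inference time
    accuracy: float    # any higher-is-better score in [0, 1]
    power_mw: float    # average power draw while running


def frame_budget_ms(target_fps: float) -> float:
    """Time available per frame at the target frame rate."""
    return 1000.0 / target_fps


def pick(candidates: Sequence[Candidate],
         target_fps: float,
         max_power_mw: float) -> Optional[Candidate]:
    """Return the most accurate candidate that fits both budgets, or None."""
    budget = frame_budget_ms(target_fps)
    feasible = [
        c for c in candidates
        if c.latency_ms <= budget and c.power_mw <= max_power_mw
    ]
    if not feasible:
        return None
    # Prefer higher accuracy; break ties with lower power draw.
    return max(feasible, key=lambda c: (c.accuracy, -c.power_mw))


# Placeholder numbers invented for illustration only.
CANDIDATES = [
    Candidate("large", latency_ms=55.0, accuracy=0.90, power_mw=2400.0),
    Candidate("medium", latency_ms=30.0, accuracy=0.87, power_mw=1500.0),
    Candidate("small", latency_ms=12.0, accuracy=0.80, power_mw=700.0),
]

if __name__ == "__main__":
    choice = pick(CANDIDATES, target_fps=30, max_power_mw=2000.0)
    print(choice.name if choice else "no candidate fits the budgets")
```

With these placeholder values, tightening either budget changes the answer: raising the target to 60 fps or capping power at 1000 mW both push the choice to the smallest variant. That is the sense in which no single axis can be tuned in isolation.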

Result

2nd place at the Qualcomm Megathon 2024. The on-device inference experience gained here paid off again the following year with SnapGen.

What I learned

The frame-rate/accuracy/power triangle is the whole game on mobile. You don't pick one; you tune all three against the target device.
Hardware-aware optimization compounds. The QIDK/NPU lessons here transferred directly to later on-device AI work.
