About

How I work.

I'm an undergraduate at IIIT Hyderabad pursuing a combined B.Tech and M.S. by Research in Computer Science and Engineering. Most of my recent work is in the Video-Language Group at CVIT: multimodal models, video understanding, and turning research ideas into reproducible experiments.

I also ship backend code in a startup context (APIs, integrations, cloud), and I use hackathons as a forcing function to go deep fast: on-device inference, geospatial ML with scarce labels, and agentic pipelines on GCP.

This site is where I publish longer case studies when I have a story worth the length; shorter notes live under "Notes."

Highlights

→ Hackathons: seven wins, including on-device AI (Qualcomm Megathon), agentic data quality (Lloyds × GCP), and flood mapping from SAR (AISE Hack).
→ Projects: Med Veda (on-device MedGemma on Android), cross-lingual QA pipeline, xv6 demand paging.
→ Certifications: OpenCV with Python; Google Cybersecurity; Google AI Essentials.

Contact

Timeline

Apr 2025 — Present

Undergraduate Researcher, Video-Language Group

CVIT, IIIT Hyderabad

Multimodal learning, video understanding, and vision-language models with Prof. Makarand Tapaswi.

2025

Backend Developer

Hustlr (startup)

API design, server-side logic, and cloud integration for an early-stage product.

Jul 2023 — May 2028

B.Tech + M.S. by Research in CSE

IIIT Hyderabad

UGEE rank 77. Coursework and research across ML, systems, and vision.

2022

Higher secondary

SAHITTII Junior College

95.4%

2020

Secondary

Viswabharati High School

99.16%

Stack

Languages

C
C++
Python
JavaScript
SQL

AI / ML & vision

PyTorch
TensorFlow
scikit-learn
OpenCV
MediaPipe

Systems & infra

Linux
POSIX
TCP/UDP
Sockets
Docker
Git
Redis

Web & data

React
Next.js
Node.js
React Native
Tailwind
PostgreSQL
Supabase
GCP
BigQuery