How I work.
I'm an undergraduate at IIIT Hyderabad pursuing a combined B.Tech and M.S. by Research in Computer Science and Engineering. Most of my recent work is in the Video-Language Group at CVIT: multimodal models, video understanding, and turning research ideas into reproducible experiments.
I also ship backend code at a startup (APIs, integrations, cloud), and I use hackathons as a forcing function to go deep fast: on-device inference, geospatial ML with very few labels, and agentic pipelines on GCP.
This site is where I publish longer case studies when I have a story worth the length; shorter notes live under "Notes."
- Hackathons: seven wins including on-device AI (Qualcomm Megathon), agentic data quality (Lloyds × GCP), and flood mapping from SAR (AISE Hack).
- Projects: Med Veda (on-device MedGemma Android), cross-lingual QA pipeline, xv6 demand paging.
- Certifications: OpenCV with Python; Google Cybersecurity; Google AI Essentials.
Timeline
Undergraduate Researcher, Video-Language Group
CVIT, IIIT Hyderabad
Multimodal learning, video understanding, and vision-language models with Prof. Makarand Tapaswi.
Backend Developer
Hustlr (startup)
API design, server-side logic, and cloud integration for an early-stage product.
B.Tech + M.S. by Research in CSE
IIIT Hyderabad
UGEE rank 77. Coursework and research across ML, systems, and vision.
Higher secondary
SAHITTII Junior College
95.4%
Secondary
Viswabharati High School
99.16%
Stack
- C
- C++
- Python
- JavaScript
- SQL
- PyTorch
- TensorFlow
- scikit-learn
- OpenCV
- MediaPipe
- Linux
- POSIX
- TCP/UDP
- Sockets
- Docker
- Git
- Redis
- React
- Next.js
- Node.js
- React Native
- Tailwind
- PostgreSQL
- Supabase
- GCP
- BigQuery