Hello there, I'm Vaibhav!
About Me
I am an MS by Research scholar at the Center for Machine Intelligence and Data Science (C-MInDS), IIT Bombay, where I work on core problems in Machine Learning, Computer Vision, and Generalizable Recognition Systems.
My research focuses on building models that adapt across visual domains, discover unseen categories, and remain robust in open-world settings. I am broadly interested in Domain Generalization (DG), Generalized Category Discovery (GCD), Representation Learning, and Fine-Grained Recognition.
Research Overview
My current research explores the fundamental question:
"How can we build vision systems that generalize to new domains and discover new categories without supervision?"
I work at the intersection of:
- Domain Generalization (DG)
- Generalized Category Discovery (GCD)
- Open-world and continual recognition
- Hyperbolic and geometry-aware representation learning
- Vision Transformers & generative models (VAE/GAN/GS)
My work has been accepted at the top-tier (A*) venues CVPR 2025 and NeurIPS 2025.
Selected Publications
When Domain Generalization meets Generalized Category Discovery
CVPR 2025 (A*)
Vaibhav Rathore, Shubhranil B., Saikat Dutta, Sarthak Mehrotra, Zsolt Kira, Biplab Banerjee
- Introduces DG2CD-Net, enabling models to adapt on the fly using episodic and synthetic domains.
- Works even when training and testing domains are drastically different.
Links:
- Paper: Paper Link
- Code: https://github.com/Shubh-Nil/D_GCD
- Project Page: https://shubh-nil.github.io/DG-GCD/
HIDISC: A Hyperbolic Framework for Domain Generalization with GCD
NeurIPS 2025 (A*)
Vaibhav Rathore, Divyam Gupta, Biplab Banerjee
- Hyperbolic geometry-based DG-GCD model outperforming episodic baseline methods.
- Eliminates synthetic domain generation while improving generalization.
Links:
- Paper: Paper Link
- Code: https://github.com/Vaibhavrathore1999/HiDISC
- Project Page: https://vaibhavrathore1999.github.io/HiDISC/
M3 Questioning: Multimodal, Multi-span Medical QA
Under Review @ ACM Health
Links:
- Paper: Draft under review
- Code: TBD
- Project Page: Coming soon
Research Internships
Sony Research India – 3D Vision & Gaussian Splatting (2025)
Built a complete automatic 3D lip-sync pipeline using Gaussian Splatting and 3D avatars.
Motilal Oswal Financial Services – 3D Avatars & OmniSync (2025)
Evaluated Latent Sync, OmniSync, and Gaussian Splatting (GS) for realistic 3D avatar lip-sync.
Clinical AI Assistance – LLMs & Medical AI (2024)
Fine-tuned LLMs via LoRA and built a multimodal diagnostic system integrating text + imaging modalities.
Reliance Industries – Predictive Analytics (2022–23)
Developed ML systems for predictive maintenance and operational optimization.
Teaching Experience
Teaching Assistant, IIT Bombay
- DS 303 · Intro to Machine Learning
- ME 781 · Statistical ML & Data Mining
- e-PGD · Python Programming
Technical Skills
Programming: Python, C++, C, Java, Bash
ML/DL: PyTorch, TensorFlow, scikit-learn, OpenCV, YOLO
Generative AI: VAE, GANs, Gaussian Splatting, Latent Sync
Tools: Hugging Face, Pandas, NumPy
Web: Streamlit
Big Data: SAP, visualization tools
Achievements
- Top 1 percentile – JEE Main 2018 (among 1.2M candidates)
- Top 1 percentile – GATE 2022 (ME & XE)
- Publications in CVPR 2025 and NeurIPS 2025
Research Goals
I aim to build reliable, domain-agnostic visual systems capable of:
- Discovering unknown categories
- Adapting with minimal supervision
- Operating robustly in dynamic, real-world environments
I am motivated by advancing the foundational ML concepts that will power the next generation of generalizable AI systems.
Contact
IIT Bombay, Mumbai
vaibhav.rathor.in@gmail.com
Google Scholar
GitHub
LinkedIn
ORCID
OpenReview
