🙋‍♂️ Hello there, I’m Vaibhav!

About Me

I am an MS by Research scholar at the Center for Machine Intelligence and Data Science (C-MInDS), IIT Bombay, where I work on core problems in Machine Learning, Computer Vision, and Generalizable Recognition Systems.

My research focuses on building models that adapt across visual domains, discover unseen categories, and remain robust in open-world settings. I am broadly interested in Domain Generalization (DG), Generalized Category Discovery (GCD), Representation Learning, and Fine-Grained Recognition.

🔬 Research Overview

My current research explores the fundamental question:

“How can we build vision systems that generalize to new domains and discover new categories without supervision?”

I work at the intersection of:

Domain Generalization (DG)
Generalized Category Discovery (GCD)
Open-world and continual recognition
Hyperbolic and geometry-aware representation learning
Vision Transformers & generative models (VAE/GAN/GS)

My work has been accepted at CVPR 2025 and NeurIPS 2025, both top-tier (A*) venues.

📝 Selected Publications

When Domain Generalization meets Generalized Category Discovery

CVPR 2025 (A*)
Vaibhav Rathore, Shubhranil B., Saikat Dutta, Sarthak Mehrotra, Zsolt Kira, Biplab Banerjee

Introduces DG2CD-Net, enabling models to adapt on the fly using episodic and synthetic domains.
Works even when training and testing domains are drastically different.

🎯 Links:

📝 Paper: Paper Link
💻 Code: https://github.com/Shubh-Nil/D_GCD
🔗 Project Page: https://shubh-nil.github.io/DG-GCD/

HIDISC: A Hyperbolic Framework for Domain Generalization with GCD

NeurIPS 2025 (A*)
Vaibhav Rathore, Divyam Gupta, Biplab Banerjee

Hyperbolic geometry–based DG-GCD model outperforming episodic baseline methods.
Eliminates synthetic domain generation while improving generalization.

🎯 Links:

📝 Paper: Paper Link
💻 Code: https://github.com/Vaibhavrathore1999/HiDISC
🔗 Project Page: https://vaibhavrathore1999.github.io/HiDISC/

M3 Questioning: Multimodal, Multi-span Medical QA

Under Review @ ACM Health

🎯 Links:

📝 Paper: Draft under review
💻 Code: TBD
🔗 Project Page: Coming soon

🧪 Research Internships

Sony Research India — 3D Vision & Gaussian Splatting (2025)

Built a complete automatic 3D lip-sync pipeline using Gaussian Splatting and 3D avatars.

Motilal Oswal Financial Services — 3D Avatars & OmniSync (2025)

Evaluated LatentSync, OmniSync, and Gaussian Splatting (GS) for realistic 3D avatar lip-sync.

Clinical AI Assistance — LLMs & Medical AI (2024)

Fine-tuned LLMs via LoRA and built a multimodal diagnostic system integrating text + imaging modalities.

Reliance Industries — Predictive Analytics (2022–23)

Developed ML systems for predictive maintenance and operational optimization.

🧑‍🏫 Teaching Experience

Teaching Assistant, IIT Bombay

DS 303 · Intro to Machine Learning
ME 781 · Statistical ML & Data Mining
e-PGD · Python Programming

🛠️ Technical Skills

Programming: Python, C++, C, Java, Bash
ML/DL: PyTorch, TensorFlow, scikit-learn, OpenCV, YOLO
Generative AI: VAEs, GANs, Gaussian Splatting, LatentSync
Tools: Hugging Face, pandas, NumPy
Web: Streamlit
Big Data: SAP, visualization tools

🥇 Achievements

Top 1 percentile — JEE Main 2018 (among 1.2M candidates)
Top 1 percentile — GATE 2022 (ME & XE)
Publications in CVPR 2025 and NeurIPS 2025

🎯 Research Goals

I aim to build reliable, domain-agnostic visual systems capable of:

Discovering unknown categories
Adapting with minimal supervision
Operating robustly in dynamic, real-world environments

I am deeply motivated by advancing foundational ML concepts that will power the next generation of generalizable AI systems.

📫 Contact

📍 IIT Bombay, Mumbai
📧 vaibhav.rathor.in@gmail.com
🔗 Google Scholar
🔗 GitHub
🔗 LinkedIn
🔗 ORCID
🔗 OpenReview

Vaibhav Rathore (वैभव राठौर)
