🙋‍♂️ Hello there, I’m Vaibhav!

About Me

I am an MS by Research scholar at the Center for Machine Intelligence and Data Science (C-MInDS), IIT Bombay, where I work on core problems in Machine Learning, Computer Vision, and Generalizable Recognition Systems.

My research focuses on building models that adapt across visual domains, discover unseen categories, and remain robust in open-world settings. I am broadly interested in Domain Generalization (DG), Generalized Category Discovery (GCD), Representation Learning, and Fine-Grained Recognition.


🔬 Research Overview

My current research explores the fundamental question:

“How can we build vision systems that generalize to new domains and discover new categories without supervision?”

I work at the intersection of:

  • Domain Generalization (DG)
  • Generalized Category Discovery (GCD)
  • Open-world and continual recognition
  • Hyperbolic and geometry-aware representation learning
  • Vision Transformers & generative models (VAE/GAN/GS)

My work has been accepted at the top-tier (A*) venues CVPR 2025 and NeurIPS 2025.


📝 Selected Publications

When Domain Generalization meets Generalized Category Discovery

CVPR 2025 (A*)
Vaibhav Rathore, Shubhranil B., Saikat Dutta, Sarthak Mehrotra, Zsolt Kira, Biplab Banerjee

  • Introduces DG2CD-Net, enabling models to adapt on the fly using episodic and synthetic domains.
  • Works even when training and testing domains are drastically different.

🎯 Links:

  • 📝 Paper: Paper Link
  • 💻 Code: https://github.com/Shubh-Nil/D_GCD
  • 🔗 Project Page: https://shubh-nil.github.io/DG-GCD/

HIDISC: A Hyperbolic Framework for Domain Generalization with GCD

NeurIPS 2025 (A*)
Vaibhav Rathore, Divyam Gupta, Biplab Banerjee

  • Hyperbolic geometry–based DG-GCD model outperforming episodic baseline methods.
  • Eliminates synthetic domain generation while improving generalization.

🎯 Links:

  • 📝 Paper: Paper Link
  • 💻 Code: https://github.com/Vaibhavrathore1999/HiDISC
  • 🔗 Project Page: https://vaibhavrathore1999.github.io/HiDISC/

M3 Questioning: Multimodal, Multi-span Medical QA

Under Review @ ACM Health

🎯 Links:

  • 📝 Paper: Draft under review
  • 💻 Code: TBD
  • 🔗 Project Page: Coming soon

🧪 Research Internships

Sony Research India - 3D Vision & Gaussian Splatting (2025)

Built a complete automatic 3D lip-sync pipeline using Gaussian Splatting and 3D avatars.

Motilal Oswal Financial Services - 3D Avatars & OmniSync (2025)

Evaluated Latent Sync, OmniSync, and Gaussian Splatting (GS) for realistic 3D avatar lip-sync.

Clinical AI Assistance - LLMs & Medical AI (2024)

Fine-tuned LLMs via LoRA and built a multimodal diagnostic system integrating text + imaging modalities.

Reliance Industries - Predictive Analytics (2022–23)

Developed ML systems for predictive maintenance and operational optimization.


🧑‍🏫 Teaching Experience

Teaching Assistant, IIT Bombay

  • DS 303 · Intro to Machine Learning
  • ME 781 · Statistical ML & Data Mining
  • e-PGD · Python Programming

🛠️ Technical Skills

Programming: Python, C++, C, Java, Bash
ML/DL: PyTorch, TensorFlow, scikit-learn, OpenCV, YOLO
Generative AI: VAE, GANs, Gaussian Splatting, Latent Sync
Tools: Hugging Face, pandas, NumPy
Web: Streamlit
Big Data: SAP, visualization tools


🥇 Achievements

  • Top 1 percentile - JEE Main 2018 (among 1.2M candidates)
  • Top 1 percentile - GATE 2022 (ME & XE)
  • Publications in CVPR 2025 and NeurIPS 2025

🎯 Research Goals

I aim to build reliable, domain-agnostic visual systems capable of:

  • Discovering unknown categories
  • Adapting with minimal supervision
  • Operating robustly in dynamic, real-world environments

I am deeply motivated by advancing foundational ML concepts that will power the next generation of generalizable AI systems.


📫 Contact

📍 IIT Bombay, Mumbai
📧 vaibhav.rathor.in@gmail.com
🔗 Google Scholar
🔗 GitHub
🔗 LinkedIn
🔗 ORCID
🔗 OpenReview