Federated Foundation Models
Training and adapting vision-language models, multimodal systems, and foundation models across decentralized data sources while preserving privacy, ownership, and deployment control.
I am a doctoral researcher at the Indian Institute of Technology Bombay, working under the supervision of Prof. Amit Sethi in the MedaL Lab. My research lies at the intersection of federated learning, trustworthy AI, healthcare, and distributed deep learning, with a focus on building models that can learn across decentralized data silos without compromising privacy, ownership, or institutional control.
My broader research direction is motivated by a simple systems-level constraint: in high-value domains such as healthcare, scientific research, and institutional AI, data often cannot be centralized. This makes AI development not only a modeling problem but also an infrastructure problem, one that requires algorithms, runtimes, and governance-aware systems that move computation closer to the data while still achieving strong generalization and scalability.
I work on federated domain generalization, long-tailed federated learning, personalized federated models, privacy-preserving representation learning, and federated adaptation of vision-language and foundation models. Alongside my doctoral research, I serve as System Administrator for the MedaL Lab, managing NVIDIA DGX systems, Linux servers, and high-performance GPU clusters that support large-scale medical AI research.
My work has been accepted at venues and workshops associated with ICML, CVPR, AAAI, NeurIPS, ICLR, IEEE BigData, and BMVC. I have also reviewed for leading AI venues including CVPR, NeurIPS, ICLR, AAAI, ICML, and AISTATS. Beyond academia, I have received competitive research and applied-science opportunities from Mayo Clinic, Rochester, MN (as a Research Fellow), Google DeepMind, Microsoft Research, Adobe Research, Samsung R&D Institute, and Sony Research.
Exploring how nations, institutions, hospitals, and research networks can collaboratively build AI systems without surrendering control over sensitive data or derivative models.
Studying the systems layer behind scalable AI: GPU clusters, distributed optimization, model-update aggregation, communication efficiency, checkpointing, and performance bottlenecks.
Designing collaborative intelligence across decentralized agents, where autonomous systems coordinate across tools, memory, data, and institutional boundaries.
Developing privacy-preserving training strategies using differential privacy, secure aggregation, controlled representation sharing, and fairness-aware optimization.
Building robust models that generalize across hospitals, scanners, populations, and domains without requiring centralized access to all training data.
My research aligns with the emerging shift from centralized AI development toward distributed, sovereign, and infrastructure-aware AI systems. As foundation models become essential infrastructure, the central question is no longer only how to build accurate models, but how to train, adapt, evaluate, and deploy them across many data owners, compute environments, and governance constraints.
I am especially interested in the intersection of:
This connects my PhD work in federated healthcare AI with broader infrastructure challenges in frontier AI, where data, compute, model ownership, and deployment control are increasingly distributed.
, Shambhavi Shanker, Amit Sethi · Accepted for presentation at the AAAI 2026 Bridge Program: AI for Medicine and Healthcare (AIMedHealth), and published in the Proceedings of Machine Learning Research (PMLR), Singapore.
, Nikita Jangid, Amit Sethi · Accepted for presentation at the AAAI 2026 Bridge Program: AI for Medicine and Healthcare (AIMedHealth), and published in the Proceedings of Machine Learning Research (PMLR), Singapore.
, Shambhavi Shanker, Amit Sethi · Oral presentation at the AAAI 2026 1st Workshop on Federated Learning for Critical Applications (FLCA @ AAAI 2026), Singapore.
, Shounak Das, Amit Sethi · Accepted for presentation at the AAAI 2026 Workshop on AI for Robust Foundation Models (AIR-FM), Singapore.
, Nikita Jangid, Amit Sethi · Accepted as a Student Abstract and Poster at AAAI 2026, Singapore.
, Suraj Prasad, Amit Sethi · Accepted as a Student Abstract and Poster at AAAI 2026, Singapore.
, Shounak Das, Nikita Jangid, Amit Sethi · Oral presentation at the ICML 2025 Workshop on Collaborative and Federated Agentic Workflows (CFAgentic @ ICML'25), Vancouver, Canada.
, Vinay Sutar, Varunav Singh, Amit Sethi · Oral presentation at the 4th Workshop on Federated Learning for Computer Vision in conjunction with CVPR'25 (FedVision-2025), Nashville, USA.
, Nikita Jangid, Amit Sethi · Accepted for presentation at the AAAI Bridge Program 2025: AI for Medicine and Healthcare (AIMedHealth), Philadelphia, USA; to appear in PMLR.
, Pankhi Kashyap, Pranav Jeevan, Amit Sethi · Oral presentation at IEEE BigData 2024, Washington, DC, USA.
, Amit Sethi · Oral presentation at the International Workshop on Federated Foundation Models (FL@FM) at NeurIPS 2024, Vancouver, Canada.
Navyansh Mahla, , Amit Sethi · Accepted at the Open Science for Foundation Models (SCI-FM) Workshop, ICLR 2025, Vienna, Austria.
Pankhi Kashyap, Pavni Tandon, , Abhishek Tiwari, Ritwik Kulkarni, Kshitij Sharad Jadhav · The 35th British Machine Vision Conference (BMVC 2024), Glasgow, UK.
Developing and unifying federated learning methodologies for healthcare, with emphasis on privacy-preserving models, domain generalization, personalization, and decentralized learning across siloed medical data. My doctoral work studies how models can learn from distributed institutions without requiring raw data centralization, connecting algorithmic advances in federated learning with broader infrastructure questions in trustworthy and sovereign AI.
Completed the thesis “Identification of COVID-19 Disease Using Deep Learning Methods,” using transfer learning with CNN architectures including ResNet, AlexNet, VGG16, Inception, and DenseNet for chest X-ray diagnostics. This work established my foundation in medical imaging, deep learning, and clinically motivated AI systems.
Built foundations in signal processing, control systems, embedded computing, and systems thinking that now support my research in distributed, trustworthy, and infrastructure-aware AI.