Sunny Gupta | Machine Learning Researcher

About

I am a doctoral researcher at the Indian Institute of Technology Bombay, working under the supervision of Prof. Amit Sethi in the MedaL Lab. My research lies at the intersection of federated learning, trustworthy AI, healthcare, and distributed deep learning, with a focus on building models that can learn across decentralized data silos without compromising privacy, ownership, or institutional control.

My broader research direction is motivated by a simple systems-level constraint: in high-value domains such as healthcare, scientific research, and institutional AI, data often cannot be centralized. This makes AI development not only a modeling problem but also an infrastructure problem, one that requires algorithms, runtimes, and governance-aware systems that move computation closer to the data while still achieving strong generalization and scalability.

I work on federated domain generalization, long-tailed federated learning, personalized federated models, privacy-preserving representation learning, and federated adaptation of vision-language and foundation models. Alongside my doctoral research, I serve as System Administrator for the MedaL Lab, managing NVIDIA DGX systems, Linux servers, and high-performance GPU clusters that support large-scale medical AI research.

My work has been accepted at venues and workshops associated with ICML, CVPR, AAAI, NeurIPS, ICLR, IEEE BigData, and BMVC. I have also reviewed for leading AI venues including CVPR, NeurIPS, ICLR, AAAI, ICML, and AISTATS. Beyond academia, I have received competitive research and applied-science opportunities from Mayo Clinic (Rochester, MN) as a Research Fellow, as well as from Google DeepMind, Microsoft Research, Adobe Research, Samsung R&D Institute, and Sony Research.

Current Focus

Federated Foundation Models

Training and adapting vision-language models, multimodal systems, and foundation models across decentralized data sources while preserving privacy, ownership, and deployment control.

Sovereign and Decentralized AI

Exploring how nations, institutions, hospitals, and research networks can collaboratively build AI systems without surrendering control over sensitive data or derivative models.

Distributed Training and AI Infrastructure

Studying the systems layer behind scalable AI: GPU clusters, distributed optimization, model-update aggregation, communication efficiency, checkpointing, and performance bottlenecks.

Multi-Agent Federated Learning

Designing collaborative intelligence across decentralized agents, where autonomous systems coordinate across tools, memory, data, and institutional boundaries.

Trustworthy Federated Systems

Developing privacy-preserving training strategies using differential privacy, secure aggregation, controlled representation sharing, and fairness-aware optimization.

Federated Domain Generalization

Building robust models that generalize across hospitals, scanners, populations, and domains without requiring centralized access to all training data.

Research Alignment

My research aligns with the emerging shift from centralized AI development toward distributed, sovereign, and infrastructure-aware AI systems. As foundation models become essential infrastructure, the central question is no longer only how to build accurate models, but how to train, adapt, evaluate, and deploy them across many data owners, compute environments, and governance constraints.

I am especially interested in the intersection of:

federated learning and distributed foundation-model training
communication-efficient model updates and aggregation
privacy-preserving adaptation of large models
sovereign and institution-controlled AI deployment
GPU/TPU-scale training infrastructure
trustworthy AI for healthcare and scientific domains
decentralized agentic systems and AI operating layers

This connects my PhD work in federated healthcare AI with broader infrastructure challenges in frontier AI, where data, compute, model ownership, and deployment control are increasingly distributed.

Publications

FedNeuro: Multi-Site fMRI Analysis Using Hypernetwork Personalized and Privacy Enhanced Federated Learning

Sunny Gupta, Shambhavi Shanker, Amit Sethi · Accepted for presentation at the AAAI 2026 Bridge Program: AI for Medicine and Healthcare (AIMedHealth), and published in the Proceedings of Machine Learning Research (PMLR), Singapore.

FedHypeVAE: Federated Learning with Hypernetwork Generated Conditional VAEs for Differentially Private Embedding Sharing

Sunny Gupta, Nikita Jangid, Amit Sethi · Accepted for presentation at the AAAI 2026 Bridge Program: AI for Medicine and Healthcare (AIMedHealth), and published in the Proceedings of Machine Learning Research (PMLR), Singapore.

FedVR: Variance Regularized Hypernetwork for Federated Domain Generalization

Sunny Gupta, Shambhavi Shanker, Amit Sethi · Oral presentation at the AAAI 2026 1st Workshop on Federated Learning for Critical Applications (FLCA @ AAAI 2026), Singapore.

BiPrompt: Bilateral Prompt Optimization for Visual and Textual Debiasing in Vision Language Models

Sunny Gupta, Shounak Das, Amit Sethi · Accepted for presentation at the AAAI 2026 Workshop on AI for Robust Foundation Models (AIR-FM), Singapore.

UniVarFL: Uniformity and Variance Regularized Federated Learning for Heterogeneous Data

Sunny Gupta, Nikita Jangid, Amit Sethi · Accepted as a Student Abstract and Poster at AAAI 2026, Singapore.

Federated Cross-Modal Style-Aware Prompt Generation

Sunny Gupta, Suraj Prasad, Amit Sethi · Accepted as a Student Abstract and Poster at AAAI 2026, Singapore.

FEDTAIL: Federated Long-Tailed Domain Generalization with Sharpness-Guided Gradient Matching

Sunny Gupta, Shounak Das, Nikita Jangid, Amit Sethi · Oral presentation at the ICML 2025 Workshop on Collaborative and Federated Agentic Workflows (CFAgentic @ ICML'25), Vancouver, Canada.

FedAlign: Federated Domain Generalization with Cross-Client Feature Alignment

Sunny Gupta, Vinay Sutar, Varunav Singh, Amit Sethi · Oral presentation at the 4th Workshop on Federated Learning for Computer Vision in conjunction with CVPR'25 (FedVision-2025), Nashville, USA.

FedStein: Enhancing Multi-Domain Federated Learning Through James-Stein Estimator

Sunny Gupta, Nikita Jangid, Amit Sethi · Accepted for presentation at the AAAI 2025 Bridge Program: AI for Medicine and Healthcare (AIMedHealth), to appear in PMLR, Philadelphia, USA.

FLeNS: Federated Learning with Enhanced Nesterov-Newton Sketch

Sunny Gupta, Pankhi Kashyap, Pranav Jeevan, Amit Sethi · Oral presentation at IEEE BigData 2024, Washington, DC, USA.

FedStein: Enhancing Multi-Domain Federated Learning Through James-Stein Estimator

Sunny Gupta, Amit Sethi · Oral presentation at the International Workshop on Federated Foundation Models (FL@FM) at NeurIPS 2024, Vancouver, Canada.

Sequential Compression Layers for Efficient Federated Learning in Foundational Models

Navyansh Mahla, Sunny Gupta, Amit Sethi · Accepted at the Open Science for Foundation Models (SCI-FM) Workshop, ICLR 2025, Vienna, Austria.

Taming the Tail: Leveraging Asymmetric Loss and Padé Approximation to Overcome Medical Image Long-Tailed Class Imbalance

Pankhi Kashyap, Pavni Tandon, Sunny Gupta, Abhishek Tiwari, Ritwik Kulkarni, Kshitij Sharad Jadhav · The 35th British Machine Vision Conference (BMVC 2024), Glasgow, UK.

Experience

Sony Research India

Apr 2026 – Jun 2026

AI/ML Research Consultant · Bengaluru, India

Driving applied AI/ML research in the User Engagement Research Technology vertical, with emphasis on scalable, production-aware model development.
Designing and optimizing ML and multimodal LLM pipelines under practical constraints such as latency, cost efficiency, throughput, and deployment reliability.
Leading systematic experimentation, benchmarking, and model evaluation to translate research insights into production-ready AI capabilities.
Providing specialized expertise in causal inference, model behavior analysis, and responsible AI to support cross-functional research, product, and engineering teams.

Samsung R&D Institute India-Bangalore

Feb 2025 – Aug 2025

PhD Intern – Agentic AI Systems, AIOS · Bengaluru, India

Contributed to the design and development of AIOS, an artificial intelligence operating system for managing multimodal LLM-based intelligent agents under constrained compute and memory environments.
Proposed architectural mechanisms for resource isolation, scheduling, memory management, and tool-execution separation through a centralized AIOS kernel.
Designed runtime support for agent orchestration, including context management, access control, storage management, and concurrent task scheduling for multi-agent environments.
Explored systems-level challenges in agentic AI, including runtime control, tool safety, resource governance, and scalable multi-agent coordination.

IIT Bombay – MedaL Lab

Jan 2023 – Present

System Administrator · Mumbai, India

Managing NVIDIA DGX systems, Linux servers, and high-performance GPU clusters that support large-scale medical AI and federated learning research.
Supporting distributed deep learning workloads, experiment scheduling, GPU resource allocation, system monitoring, storage management, and research infrastructure reliability.
Maintaining the compute backbone for privacy-preserving medical AI projects involving imaging, multimodal data, and cross-institutional research collaboration.
Bridging algorithmic research with infrastructure practice by supporting the systems needed to train, debug, and scale deep learning models.

IIM Ahmedabad

Oct 2021 – Nov 2022

Research Associate · Remote, India

Applied machine learning, data analytics, and behavioral modeling to decision intelligence problems.
Built data-driven pipelines for empirical analysis, predictive modeling, and research experimentation.
Developed foundations in applied ML research, structured experimentation, and interdisciplinary AI problem solving.

IIT Guwahati

Sep 2021 – Dec 2022

Project Research Assistant · Guwahati, India

Contributed to speech-driven medical diagnostics and multimodal AI systems.
Worked on healthcare-oriented machine learning pipelines involving signal processing, deep learning, and clinical decision support.
Built early research experience in medical AI, model evaluation, and domain-specific learning systems.

Program Committees & Peer Review

Program Committee Member

AAAI'26 · AAAI Conference on Artificial Intelligence
ICML'25 · International Conference on Machine Learning
ICLR'26 · International Conference on Learning Representations
ICML-CFAgentic'25 · Collaborative & Federated Agentic Workflows Workshop, ICML 2025

Reviewer

CVPR · NeurIPS · ICLR · AAAI · ICML · AISTATS

Education

Indian Institute of Technology Bombay

PhD in Machine Learning · Advisor: Prof. Amit Sethi · Mumbai, India

Developing and unifying federated learning methodologies for healthcare, with emphasis on privacy-preserving models, domain generalization, personalization, and decentralized learning across siloed medical data. My doctoral work studies how models can learn from distributed institutions without requiring raw data centralization, connecting algorithmic advances in federated learning with broader infrastructure questions in trustworthy and sovereign AI.

Maulana Azad National Institute of Technology (MANIT), Bhopal

M.Tech in Artificial Intelligence · Advisor: Prof. Dhirendra Pratap Singh · Bhopal, India

Completed the thesis “Identification of COVID-19 Disease Using Deep Learning Methods,” using transfer learning with CNN architectures including ResNet, AlexNet, VGG16, Inception, and DenseNet for chest X-ray diagnostics. This work established my foundation in medical imaging, deep learning, and clinically motivated AI systems.

Noida Institute of Engineering & Technology (NIET)

B.Tech in Electrical & Electronics Engineering · Greater Noida, India

Built foundations in signal processing, control systems, embedded computing, and systems thinking that now support my research in distributed, trustworthy, and infrastructure-aware AI.

Technical Skills

Machine Learning

Federated Learning · Deep Learning · Distributed Deep Learning · Computer Vision · Vision-Language Models · Foundation Model Adaptation · Domain Generalization · Long-Tailed Learning · Differential Privacy · Secure Aggregation · Representation Learning

AI Systems and Infrastructure

NVIDIA DGX · GPU Clusters · Linux Servers · CUDA · Distributed Training · High-Performance Computing · Model Benchmarking · Experiment Management · Resource Scheduling · Cluster Monitoring

Frameworks and Tools

Python · PyTorch · TensorFlow · JAX · CUDA · NumPy · SciPy · Git · Linux · LaTeX

Research and Engineering Strengths

Federated Optimization · Communication-Efficient Learning · Privacy-Preserving AI · Multimodal AI · Agentic AI Systems · Model Evaluation · Reproducible Experimentation · Research-to-Prototype Translation

“Building AI that not only performs, but also preserves privacy, sovereignty, and control.”

Writing

Federated Learning for Healthcare: The Ideas Behind My Research

From Federated Learning to Sovereign Foundation Models

The Systems Layer Behind Federated Foundation Models

A Field Guide to Machine Learning Research Interviews