PhD Thesis Defense of Mr. Siddarth Asokan
September 15 @ 9:30 AM - 11:00 AM IST
Name of the Candidate: Mr. Siddarth Asokan
Ph.D. Supervisor: Prof. Chandra Sekhar Seelamantula (EE)
External Examiner: Prof. Santanu Chaudhury (Director, IIT Jodhpur; Professor, IIT Delhi)
Title of the Thesis: On the Optimality of Generative Adversarial Networks — A Variational Perspective
Date & Time: September 15, 2023; 9.30 AM (Coffee will be served during the defense)
Venue: Multimedia Classroom (MMCR), Department of Electrical Engineering, IISc
Abstract: Generative adversarial networks (GANs) are a popular generative modeling framework, where the task is to learn the underlying distribution of data. GANs comprise a min-max game between two neural networks, the generator and the discriminator. The generator transforms noise, typically Gaussian distributed, into a desired output, typically images. The discriminator learns to distinguish between the target samples and the generator output. The objective is to learn the optimal generator, one that can generate samples that perfectly confuse the discriminator. GANs are trained to minimize either a divergence function or an integral probability metric (IPM) between the data and generator distributions. Common divergences include the Jensen-Shannon divergence in the standard GAN (SGAN), the chi-squared divergence in least-squares GAN (LSGAN) and f-divergences in f-GANs. Popular IPMs include the Wasserstein-2 metric and the Sobolev metric. The choice of the IPM results in a constraint class over which the discriminator is optimized, such as Lipschitz-1 functions in Wasserstein GAN (WGAN) or functions with bounded energy in their gradients as in the case of Sobolev GAN. While GANs excel at generating realistic images, their optimization is not well understood. This thesis focuses on understanding the optimality of GANs, viewed from the perspective of Variational Calculus. The thesis is organized into three parts.
In Part-I, we consider the functional analysis of the discriminator in various GAN formulations. In f-GANs, the functional optimization of the loss coincides with pointwise optimization as reported in the literature. We extend the analysis to novel GAN losses via a new contrastive-learning framework called Rumi-GAN, in which the target data is split into positive and negative classes. We design novel GAN losses that allow the generator to learn the positive class while the discriminator is trained on both classes. For the WGAN IPM, we propose a novel variant of the gradient-norm penalty and show, by means of Euler-Lagrange analysis, that the optimal discriminator solves the Poisson partial differential equation (PDE). We solve the PDE via Fourier-series approximations and radial basis function (RBF) expansions. We extend the approach to image generation by means of latent-space matching in Wasserstein autoencoders (WAE). We also present generalizations to higher-order gradient penalties for the LSGAN and WGAN losses, and show that the optimal discriminator can be implemented by means of a polyharmonic spline interpolator, giving rise to the name PolyGANs. PolyGANs, implemented by means of an RBF discriminator whose weights and centers are evaluated in closed form, result in superior convergence of the generator.
In Part-II, we tackle the issue of choosing the input distribution of the generator. We introduce Spider GANs, a generalization of image-to-image translation GANs, wherein providing the generator with data coming from a closely related, “friendly neighborhood” source dataset accelerates and stabilizes training, even in scenarios where there is no visual similarity between the source and target datasets. Spider GANs can be cascaded, resulting in state-of-the-art performance when trained with StyleGAN architectures on small, high-resolution datasets, in merely one-fifth of the training time. To identify “friendly neighbors” of a target dataset, we propose the “signed Inception distance” (SID), which employs the PolyGAN discriminator to quantify the proximity between datasets.
In Part-III, we extend the analysis performed in Part-I to GAN generators. In divergence-minimizing GANs, the optimal generator matches the gradient of its push-forward distribution with the gradient of the data distribution (known as the score), linking GANs to score-based Langevin diffusion. In IPM-GANs, the optimal generator performs flow-matching on the gradient field of the discriminator, which yields an equivalence between the score-matching and flow-matching frameworks. We present implementations of flow-matching GANs, and develop an active-contour-based technique to train the generator in SnakeGANs. Finally, we leverage the gradient field of the discriminator to evolve particles in a Langevin-flow setting, and show that the proposed discriminator-guided Langevin diffusion accelerates baseline score-matching diffusion without the need for noise conditioning.
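For attendees less familiar with the two families of objectives named in the abstract, their standard textbook forms can be sketched as follows. The notation is assumed here for illustration and is not taken from the thesis: p_d denotes the data distribution, p_z the noise distribution, G the generator and D the discriminator. The first line is the min-max game of the standard GAN (SGAN); the second is the WGAN objective, in which the discriminator is restricted to Lipschitz-1 functions.

```latex
% Notation (p_d, p_z, G, D) is assumed for illustration only.
\begin{align}
  % Standard GAN (SGAN): min-max game between generator and discriminator
  \min_{G} \max_{D} \;
    & \mathbb{E}_{x \sim p_d}\bigl[\log D(x)\bigr]
      + \mathbb{E}_{z \sim p_z}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr], \\
  % Wasserstein GAN (WGAN): discriminator constrained to Lipschitz-1 functions
  \min_{G} \max_{\|D\|_{L} \le 1} \;
    & \mathbb{E}_{x \sim p_d}\bigl[D(x)\bigr]
      - \mathbb{E}_{z \sim p_z}\bigl[D(G(z))\bigr].
\end{align}
```

In the first form, optimizing over D for a fixed G leads to the Jensen-Shannon divergence mentioned in the abstract (up to constants); in the second, the constraint class on D is what the choice of IPM determines.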
Biography of the Candidate: Siddarth Asokan received a Bachelor of Engineering (B.E.) degree in 2017 with a specialization in Electronics and Communication Engineering from M.S. Ramaiah Institute of Technology, Bangalore. During 2016–2017, he worked at the Robert Bosch Centre for Cyber-Physical Systems (RBCCPS) as a Project Intern on the Smart Cities Project. Subsequently, he joined RBCCPS as a direct PhD student in 2017, working under the guidance of Prof. Chandra Sekhar Seelamantula, and has since been with the Spectrum Lab, Department of Electrical Engineering. He received the Microsoft Research Fellowship in 2018, the Qualcomm Innovation Fellowship in 2019, 2021, 2022, and 2023, and the RBCCPS PhD Fellowship in 2020 and 2021. He is also a recipient of the Best Presenter Award at the AI/ML track of the IISc EECS Symposium 2023, and has been selected to present his PhD research at the Doctoral Consortium at the British Machine Vision Conference, 2023. His research interests are in signal processing, image processing and machine learning, focusing on building mathematical foundations of generative learning frameworks.
All are invited.