I am a CS PhD student at the Visual Computing Center @ KAUST, supervised by Prof. Mohamed Elhoseiny. I work on deep learning, and my research interests include generative models and hypernetworks. I am also hyped by NeRF and have a soft spot for adversarial robustness. Before that, I was a DL researcher at MIPT, doing NLP things. And before even that, I was a software engineer at Yandex.
Aligning Latent and Image Spaces to Connect the Unconnectable
We proposed the idea of positioning GAN latent codes on a coordinate plane. Each latent code, when sampled, is associated with an \((x, y)\)-position on the 2D image plane, and our generator computes each pixel's color from an interpolation of the neighboring latent codes (instead of a single global one). This allows us 1) to generate images of infinite size (by sampling infinitely many latent codes and positioning them on the grid); and 2) to connect unrelated frames into a single, arbitrarily large panorama.
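The core trick above can be sketched in a few lines. This is a minimal illustration of my own (not the paper's actual architecture or code): anchor latent codes sit at integer grid positions, and a pixel's latent is the bilinear interpolation of the four surrounding anchors, so extending the grid with more anchors extends the image arbitrarily far.

```python
import numpy as np

def pixel_latent(anchors, x, y):
    """Bilinearly interpolate a latent code for position (x, y).

    anchors: dict mapping integer (ix, iy) grid positions to latent vectors.
    """
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    fx, fy = x - x0, y - y0
    # Weighted sum of the four neighboring anchor latents.
    return ((1 - fx) * (1 - fy) * anchors[(x0, y0)]
            + fx * (1 - fy) * anchors[(x0 + 1, y0)]
            + (1 - fx) * fy * anchors[(x0, y0 + 1)]
            + fx * fy * anchors[(x0 + 1, y0 + 1)])

# A 3x3 grid of random 8-dim anchor latents; any larger grid works the same way.
rng = np.random.default_rng(0)
anchors = {(i, j): rng.standard_normal(8) for i in range(3) for j in range(3)}
z = pixel_latent(anchors, 0.5, 0.5)  # latent for a pixel between four anchors
```

A generator conditioned on `z` (plus the pixel coordinate) then produces the pixel's color; the interpolation makes neighboring regions share latent information, which is what lets unrelated frames blend into one panorama.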
Adversarial Generation of Continuous Images
We built a GAN that generates images in implicit neural representation (INR) form. An INR is a function \(F(c)\) that takes coordinates \(c = (x, y)\) as input and predicts a pixel value \(v = (r, g, b)\); our generator is thus a hypernetwork that produces the parameters of \(F(c)\). We proposed two techniques to scale such a model to real-world datasets: factorized multiplicative modulation (FMM) and multi-scale INRs. We achieved decent (for INR-based models) generative quality on LSUN Churches \(256^2\), LSUN Bedrooms \(256^2\), and FFHQ \(1024^2\), and demonstrated many interesting properties of INR-based decoders. At the end of the day, our approach turned out to be very similar to StyleGAN2 with 1×1 convolutions, coordinate embeddings, and nearest-neighbor upsampling.
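To make the INR idea concrete, here is a rough sketch (my own naming and layer sizes, not the paper's code): a flat parameter vector, as a hypernetwork would emit, is unpacked into the weights of a tiny MLP \(F(c)\) that maps a pixel coordinate to an RGB value, and an image is rendered by evaluating \(F\) at every pixel coordinate.

```python
import numpy as np

def inr_forward(params, coords, hidden=32):
    """Evaluate a tiny coordinate MLP F(c) whose weights come from `params`.

    params: flat vector (as produced by a hypernetwork); coords: (N, 2) in [0, 1].
    Returns (N, 3) RGB values in (0, 1).
    """
    # Unpack the flat vector into a 2 -> hidden -> 3 MLP.
    w1 = params[:2 * hidden].reshape(2, hidden)
    b1 = params[2 * hidden:3 * hidden]
    off = 3 * hidden
    w2 = params[off:off + hidden * 3].reshape(hidden, 3)
    b2 = params[off + hidden * 3:off + hidden * 3 + 3]
    h = np.maximum(coords @ w1 + b1, 0.0)        # ReLU hidden layer
    return 1.0 / (1.0 + np.exp(-(h @ w2 + b2)))  # sigmoid -> RGB in (0, 1)

# Render a 4x4 image from random "hypernetwork output".
rng = np.random.default_rng(0)
n_params = 2 * 32 + 32 + 32 * 3 + 3
params = 0.1 * rng.standard_normal(n_params)
xs, ys = np.meshgrid(np.linspace(0, 1, 4), np.linspace(0, 1, 4))
coords = np.stack([xs.ravel(), ys.ravel()], axis=1)
image = inr_forward(params, coords).reshape(4, 4, 3)
```

One INR stores one image, so the resolution is decoupled from the parameter count: the same `params` can be rendered at any grid density just by passing denser `coords`.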
Class Normalization for (Continual?) Generalized Zero-Shot Learning
In this paper, we dived into the normalization techniques used in zero-shot learning (ZSL). We showed how scaled cosine similarity and attribute normalization influence the signal's variance inside a model, and that deeper models call for different normalization procedures. We developed class normalization, which is similar to batch normalization but applied across the class dimension. Using class normalization, we built an MLP model that achieves state-of-the-art performance while training 50-200 times faster than the previous SotA. We also formulated a novel continual zero-shot learning problem and tested our approach in that setup.
Loss Landscape Sightseeing with Multi-Point Optimization
Beyond First Order Methods in ML workshop, NeurIPS 2019
Using mode connectivity ideas, we searched the loss landscapes of different neural networks for various visual patterns. Due to extreme overparametrization, it turned out that practically any pattern can be found inside the loss surface. This indicates that the loss landscapes of deep models are very complex and contain many irregularities.
Existing interpolation techniques (nearest neighbor, bilinear, Lanczos, Hamming, etc.) assume that the known points lie on a uniform grid, which is not always the case. Moreover, one may want to backpropagate through the point positions. In this project, I implemented a CUDA kernel for interpolating points on a non-uniform grid, based on a Gaussian mixture model.
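A NumPy reference version of the idea behind the kernel might look like this (a hypothetical sketch, not the actual CUDA code): each known point contributes to a query location with a Gaussian weight of the squared distance, so the points need not lie on a uniform grid, and the weights are smooth (hence differentiable) in the point positions.

```python
import numpy as np

def gaussian_interpolate(points, values, queries, sigma=0.5):
    """Interpolate scattered values at query locations via Gaussian weights.

    points: (N, 2) known positions (arbitrary, non-uniform).
    values: (N, C) values at those positions.
    queries: (M, 2) positions to interpolate at. Returns (M, C).
    """
    # Squared distances between every query and every known point: (M, N).
    d2 = ((queries[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * sigma ** 2))
    w = w / w.sum(axis=1, keepdims=True)  # normalized mixture weights
    return w @ values                      # convex combination of values

points = np.array([[0.0, 0.0], [1.0, 0.0], [0.3, 0.9]])  # non-uniform
values = np.array([[1.0], [2.0], [3.0]])
out = gaussian_interpolate(points, values, np.array([[0.5, 0.2]]))
```

Since everything is a convex combination with smooth weights, gradients flow both into `values` and into `points` themselves, which is exactly what the uniform-grid methods cannot give you.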
RtRs is a small ray-tracing/rasterization engine written in Rust. It works on both meshes and traditional quadrics and has some cool features, like distributed ray tracing, BVHs, and arcball rotation.
OmniPlan was used extensively at my previous job but didn't have any web interface, which annoyed everyone. So I built one using their official API.
For the past three years, I have been building a framework for running deep learning experiments in PyTorch and using it in my research projects. It is very similar to pytorch-lightning + hydra, but without proper documentation and testing ¯\_(ツ)_/¯
An ALCQ description logic reasoner based on the tableau algorithm.