Research Lead in Machine Learning and Optimization
I am an M.S. candidate at the Moscow Institute of Physics and Technology (MIPT), Faculty of Applied Mathematics and Informatics, Department of Intelligent Data Analysis.
My work focuses on optimization for machine learning, the stochastic dynamics of SGD, LLM post-training (SFT and teacher distillation), inference acceleration, minimax optimization, and applied traffic assignment.
I am a Research Lead at BRAIn Lab / MIRAI under the supervision of Alexander Beznosikov. I collaborate with Demyan Yarmoshik at LAB MMO in Alexander Gasnikov's research group, and I was recently a visiting research student at MBZUAI under Eduard Gorbunov.
I also collaborate on SGD analysis and multi-agent reinforcement learning with Andrei Leonidov's team.
Research Focus
Convergence analysis for minimax and LMO-based methods, including Frank-Wolfe variants and optimizers used in deep learning.
Experiments and theory for finite-step SGD dynamics beyond Brownian-motion approximations and standard Langevin models.
Post-training and efficiency projects on SFT, teacher distillation, pruning, early exit, multi-agent RL, and LLM training dynamics.
Selected Publications
Accepted to a NeurIPS 2025 workshop.
NeurIPS 2026 submission on stochastic dynamics of SGD.
Published in Computer Research and Modeling.
Presented at TFN-2025 and accepted to the Journal of Mathematical Sciences, Series B.
Presented at TFN-2025 and accepted to the Journal of Mathematical Sciences, Series B.
Presented at OPTIMA and accepted to CCIS.
Talks and Media
Public lecture on distillation, structured pruning, and early exit for large language models.
RIA Tomsk quoted my explanation of layer removal for making large language models smaller while keeping useful quality.
Intelligent Systems at Phystech highlighted my work on SGD dynamics and traffic-flow optimization in its yearly research review.
Presentation at the Traffic Flows on Networks conference at the Sirius Mathematics Center.
Abstract in the proceedings of the 67th MIPT conference on applied mathematics and computer science.
Talk and abstract at the 66th MIPT conference on Frank-Wolfe modifications for equilibrium transportation-flow assignment.
Projects
Research and engineering projects where I am the main author or a direct contributor.
Projects I lead or supervise with student teams; public links are shown once the repository is ready.
CV
B.S. in Applied Mathematics and Physics; currently an M.S. candidate in the Department of Intelligent Data Analysis.
Research Lead at BRAIn Lab / MIRAI under Alexander Beznosikov.
Collaboration with Demyan Yarmoshik and Alexander Gasnikov's LAB MMO.
Visiting Research Student at MBZUAI under Eduard Gorbunov.
Current work on LLM post-training and inference acceleration, including Qwen/DeepScaleR teacher-SFT, RLVR/GRPO-style math training, reward parsing, benchmark reporting, and vLLM early-exit/adaptive decoding pipelines.
Work on SGD analysis and multi-agent reinforcement learning with Andrei Leonidov's team.
Former Data Analyst Intern at Yandex.
Python, C++, SQL, PyTorch, JAX, vLLM, TRL, Hugging Face workflows, SFT, teacher distillation, RLVR/GRPO-style training, reinforcement learning, LLMs, convex and nonconvex optimization, stochastic processes, multi-GPU server workflows, Linux, Git, YQL, DataLens, and Nirvana.