Liu, XH; Du, Y; Wang, J; Yu, Y; (2025) On the Optimization Landscape of Low Rank Adaptation Methods for Large Language Models. In: 13th International Conference on Learning Representations (ICLR 2025). ICLR.
Abstract
Training Large Language Models (LLMs) poses significant memory challenges, making low-rank adaptation methods an attractive solution. Previously, Low-Rank Adaptation (LoRA) addressed this by adding a trainable low-rank matrix to the frozen pre-trained weights in each layer, reducing the number of trainable parameters and optimizer states. GaLore, which compresses the gradient matrix instead of the weight matrix, has demonstrated superior performance to LoRA, with faster convergence and reduced memory consumption. Despite their empirical success, the performance of these methods has not been fully understood or explained theoretically. In this paper, we analyze the optimization landscapes of LoRA, GaLore, and full-rank methods, revealing that GaLore benefits from fewer spurious local minima and a larger region satisfying the PL∗ condition, a variant of the Polyak-Łojasiewicz (PL) condition, leading to faster convergence. Our analysis leads to a novel method, GaRare, which further improves GaLore by using gradient random projection to reduce computational overhead. Practically, GaRare achieves strong performance in both pre-training and fine-tuning tasks, offering a more efficient approach to large-scale model adaptation. Code is available at https://github.com/liuxhym/GaRare.git.
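The abstract describes three mechanisms: LoRA trains a low-rank additive update to frozen weights, GaLore compresses the gradient matrix into a low-rank subspace, and GaRare replaces GaLore's projector with a random projection. The following is a minimal PyTorch sketch of these ideas; the names `LoRALinear`, `random_project_grad`, and `expand_update` are illustrative and not taken from the paper's released code, and GaLore's actual SVD-based projector is only gestured at in a comment.

```python
import torch

# LoRA: the pre-trained weight W is frozen; only the rank-r factors A and B
# are trained, so the effective weight is W + B @ A.
class LoRALinear(torch.nn.Module):
    def __init__(self, in_dim: int, out_dim: int, rank: int = 8):
        super().__init__()
        self.W = torch.nn.Parameter(torch.randn(out_dim, in_dim), requires_grad=False)
        self.A = torch.nn.Parameter(torch.randn(rank, in_dim) * 0.01)  # trainable
        self.B = torch.nn.Parameter(torch.zeros(out_dim, rank))        # trainable

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x @ (self.W + self.B @ self.A).T

# GaLore-style idea: compress the gradient G (not the weight) to rank r and
# keep optimizer state on the small matrix. GaLore derives the projector P
# from an SVD of G; per the abstract, GaRare instead draws P as a random
# projection, avoiding the SVD cost.
def random_project_grad(G: torch.Tensor, rank: int = 8, seed: int = 0):
    gen = torch.Generator().manual_seed(seed)
    P = torch.randn(G.shape[0], rank, generator=gen) / rank ** 0.5
    return P, P.T @ G  # (projector, compressed rank-r gradient)

def expand_update(P: torch.Tensor, R: torch.Tensor) -> torch.Tensor:
    return P @ R  # map the low-rank update back to the full weight shape
```

In an actual training loop, the optimizer's moment statistics would be maintained on the small r×n compressed gradient rather than the full m×n matrix, which is where the memory saving of gradient-compression methods comes from.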
| Type: | Proceedings paper |
|---|---|
| Title: | On the Optimization Landscape of Low Rank Adaptation Methods for Large Language Models |
| Event: | ICLR 2025 |
| Open access status: | An open access version is available from UCL Discovery |
| Publisher version: | https://openreview.net/forum?id=pxclAomHat |
| Language: | English |
| Additional information: | This version is the version of record. For information on re-use, please refer to the publisher’s terms and conditions. |
| Keywords: | large language model, LoRA, optimization |
| UCL classification: | UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science |
| URI: | https://discovery.ucl.ac.uk/id/eprint/10212513 |