Zhao, W;
Chen, Y;
Lee, R;
Qiu, X;
Gao, Y;
Fan, H;
Lane, ND;
(2024)
Breaking Physical and Linguistic Borders: Multilingual Federated Prompt Tuning for Low-Resource Languages.
In:
12th International Conference on Learning Representations: ICLR 2024.
ICLR: Vienna, Austria.
Abstract
Pre-trained large language models (LLMs) have emerged as a cornerstone in modern natural language processing, with their utility expanding to various applications and languages. However, the fine-tuning of multilingual LLMs, particularly for low-resource languages, is fraught with challenges stemming from data-sharing restrictions (the physical border) and from the inherent linguistic differences (the linguistic border). These barriers hinder users of various languages, especially those in low-resource regions, from fully benefiting from the advantages of LLMs. To address these challenges, we propose the Federated Prompt Tuning Paradigm for multilingual scenarios, which utilizes parameter-efficient fine-tuning while adhering to data sharing restrictions. We have designed a comprehensive set of experiments and analyzed them using a novel notion of language distance to underscore the strengths of this paradigm: Even under computational constraints, our method not only bolsters data efficiency but also facilitates mutual enhancements across languages, particularly benefiting low-resource ones. Compared to traditional local cross-lingual transfer tuning methods, our approach achieves 6.9% higher accuracy, reduces the training parameters by over 99%, and demonstrates better stability and generalization. Such findings underscore the potential of our approach to promote social equality and champion linguistic diversity, so that no language will be left behind. Our code is released at https://github.com/Ryan0v0/multilingual_borders.
Type: | Proceedings paper |
---|---|
Title: | Breaking Physical and Linguistic Borders: Multilingual Federated Prompt Tuning for Low-Resource Languages |
Event: | 12th International Conference on Learning Representations: ICLR 2024 |
Open access status: | An open access version is available from UCL Discovery |
Publisher version: | https://openreview.net/forum?id=NufVaypXo9 |
Language: | English |
Additional information: | This is an open access article under the CC BY-SA 4.0 license (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
Keywords: | Multilingual Federated Learning, Natural Language Processing |
UCL classification: | UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science |
URI: | https://discovery.ucl.ac.uk/id/eprint/10211437 |