arxiv:2602.04396

LoRDO: Distributed Low-Rank Optimization with Infrequent Communication

Published on Feb 4

Abstract

LoRDO is a framework that combines low-rank optimization with infrequent synchronization to improve distributed training efficiency while maintaining performance parity with traditional methods.

Distributed training of foundation models via DDP is limited by interconnect bandwidth. While infrequent communication strategies reduce synchronization frequency, they remain bottlenecked by the memory and communication requirements of optimizer states. Low-rank optimizers can alleviate these constraints; however, in the local-update regime, workers lack access to the full-batch gradients required to compute low-rank projections, which degrades performance. We propose LoRDO, a principled framework unifying low-rank optimization with infrequent synchronization. We first demonstrate that, while global projections based on pseudo-gradients are theoretically superior, they permanently restrict the optimization trajectory to a low-rank subspace. To restore subspace exploration, we introduce a full-rank quasi-hyperbolic update. LoRDO achieves near-parity with low-rank DDP in language modeling and downstream tasks at model scales of 125M to 720M, while reducing communication by approximately 10×. Finally, we show that LoRDO improves performance even more in very low-memory settings with small rank/batch size.
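
As a rough illustration of the terms used in the abstract, the sketch below builds a pseudo-gradient from workers that take several local steps without communicating, and then derives a shared rank-r projection from it. Nothing here is taken from the paper itself: the toy quadratic objective, the definition of the pseudo-gradient as the shared weights minus the averaged local weights, the SVD-based choice of projection, and all function names are assumptions made for illustration only.

```python
import numpy as np


def simulate_round(global_w, targets, local_steps=4, lr=0.1):
    """Each worker runs a few local gradient steps with no communication.

    Toy objective (an assumption): worker k minimizes 0.5 * ||W - T_k||^2,
    whose gradient is simply W - T_k.
    """
    local_ws = []
    for target in targets:
        w = global_w.copy()
        for _ in range(local_steps):
            w -= lr * (w - target)
        local_ws.append(w)
    return np.stack(local_ws)


def pseudo_gradient(global_w, local_ws):
    """Shared weights minus the average of the workers' local weights."""
    return global_w - local_ws.mean(axis=0)


def top_r_projection(matrix, rank):
    """Orthonormal basis (m x rank) of the top-`rank` left singular subspace."""
    u, _, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :rank]


rng = np.random.default_rng(0)
m, n, rank, workers = 8, 6, 2, 4

global_w = rng.standard_normal((m, n))
targets = [rng.standard_normal((m, n)) for _ in range(workers)]

local_ws = simulate_round(global_w, targets)
delta = pseudo_gradient(global_w, local_ws)

# One projection shared by all workers, computed from the pseudo-gradient.
P = top_r_projection(delta, rank)

# Low-rank state: rank x n values instead of m x n.
low_rank_state = P.T @ delta

# Mapping back to parameter space gives an update confined to span(P).
reconstructed = P @ low_rank_state
```

In this sketch every update of the form `P @ s` lies in the column space of `P`, which is one way to picture the abstract's remark that a fixed global projection confines the trajectory to a low-rank subspace. The full-rank quasi-hyperbolic update that the abstract proposes as the remedy is not modeled here.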
