Introduction
A rapidly increasing number of satellites is orbiting Earth and collecting massive amounts of data. This data can be utilized in three principal ways: to train a machine learning (ML) model, to perform inference, or to store it for later retrieval. We are interested in the first use case, where the conventional approach is to store the data locally and later aggregate it in a centralized location. However, this requires massive amounts of bandwidth and storage, induces large delays and energy costs, and potentially infringes upon data ownership. Instead, recent developments strive to alleviate these downsides by bringing the training process to the data, which remains distributed within the satellite constellation. This requires performing distributed ML (DML) within the constellation, where each satellite uses its collected data to train an ML model locally. The DML paradigm designed for systems with limited and intermittent connectivity is known as federated learning (FL) [1], [2]. There, the clients work independently on ML model updates based on their local data and exchange intermediate results via a central parameter server (PS). Communication with this PS is an integral part of the training process, and developing efficient communication protocols to support FL is recognized as one of the major design challenges for the sixth generation of communication systems [3]. Besides the evident benefits of training an ML model directly where the data resides, implementing FL within satellite constellations is also an important step towards the seamless integration of terrestrial and non-terrestrial networks [4].
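To make the communication pattern concrete, one round of a generic FL scheme can be sketched as follows; this is a minimal illustration assuming a synchronous, FedAvg-style update, and the notation ($w^{(t)}$, $F_k$, $\mathcal{D}_k$, $\eta$, $K$) is introduced here purely for this sketch. Each of the $K$ satellites refines the current global model $w^{(t)}$ on its local dataset $\mathcal{D}_k$, and the PS aggregates the resulting updates:
\[
  w_k^{(t+1)} = w^{(t)} - \eta \nabla F_k\!\left(w^{(t)}\right), \qquad
  w^{(t+1)} = \sum_{k=1}^{K} \frac{|\mathcal{D}_k|}{\sum_{j=1}^{K} |\mathcal{D}_j|}\, w_k^{(t+1)},
\]
where $F_k$ denotes the local loss of satellite $k$ and $\eta$ the learning rate. Only the model updates traverse the PS; the raw data never leaves the satellites, which is what makes the communication protocol, rather than data transfer, the central design challenge.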