Can Artificial Intelligence steer a satellite? Researchers test Deep Reinforcement Learning in space systems

08/10/2025

A team from the Information Processing and Telecommunications Center (IPTC) at Universidad Politécnica de Madrid has taken a bold step toward applying Artificial Intelligence to one of the most safety-critical tasks in space: keeping a satellite correctly oriented in orbit. Their latest study explores the feasibility of Deep Reinforcement Learning (DRL) to control the attitude of small satellites in real time, focusing on the UPMSat-2 mission as a testbed.

Figure 1 illustrates the fundamental structure of a reinforcement learning system, where an agent learns to interact with the environment through actions, states, and rewards. This framework is the basis for training the controller used in the UPMSat-2 mission.
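As a rough illustration of the loop in Figure 1, the sketch below runs one episode of agent-environment interaction. Everything in it is hypothetical and is not taken from the study: `ToySpinEnv` is a one-axis toy with made-up dynamics, and `placeholder_policy` is a hand-written rule standing in for a trained agent. It only shows how state, action, and reward circulate.

```python
import random


class ToySpinEnv:
    """Hypothetical one-axis toy: the state is a single angular rate (rad/s)."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.rate = 0.0

    def reset(self):
        # Start each episode with a random spin rate.
        self.rate = self.rng.uniform(-1.0, 1.0)
        return self.rate

    def step(self, action):
        # The action is a torque command, limited to [-1, 1].
        torque = max(-1.0, min(1.0, action))
        # Made-up dynamics: the rate changes in proportion to the torque.
        self.rate += 0.1 * torque
        # Reward is higher (closer to zero) the slower the spin.
        reward = -abs(self.rate)
        done = abs(self.rate) < 1e-3
        return self.rate, reward, done


def placeholder_policy(state):
    # Stand-in for a learned policy: push against the current spin.
    return -5.0 * state


def run_episode(env, policy, max_steps=200):
    """Run one episode and return (final state, total reward, steps taken)."""
    state = env.reset()
    total_reward = 0.0
    steps = 0
    while steps < max_steps:
        action = policy(state)                   # agent chooses an action
        state, reward, done = env.step(action)   # environment responds
        total_reward += reward
        steps += 1
        if done:
            break
    return state, total_reward, steps


final_rate, total_reward, steps = run_episode(ToySpinEnv(seed=0), placeholder_policy)
print(f"final rate {final_rate:.5f} rad/s after {steps} steps")
```

In a real reinforcement learning setup, the rewards collected in loops like this one would be used to update the policy rather than leaving it fixed.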

Unlike conventional controllers, which rely on fixed mathematical models, the DRL agent learns its behavior through interaction with its environment. Using the Proximal Policy Optimization (PPO) algorithm, the researchers trained and refined an AI controller that could manage the satellite’s detumbling, stabilization, and steady phases. After extensive simulation, the trained agent was embedded into a real-time processor, where it met strict software requirements such as predictable timing, bounded memory use, and robust performance under uncertainty.
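For readers curious about what PPO optimizes, the following is a minimal sketch of its clipped surrogate objective as it is generally described in the reinforcement learning literature. It is not the authors' implementation: the function name `ppo_clipped_objective`, the sample numbers, and the clipping value of 0.2 are assumptions chosen for illustration, and the study's network, reward design, and hyperparameters are not reproduced here.

```python
import math


def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch of samples.

    new_logp / old_logp: log-probabilities of the taken actions under the
    updated and the data-collecting policy. advantages: estimates of how much
    better each action was than expected. clip_eps is an assumed value.
    """
    terms = []
    for nl, ol, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the new and the old policy.
        ratio = math.exp(nl - ol)
        # Keep the ratio within [1 - eps, 1 + eps] to limit the policy change.
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        terms.append(min(ratio * adv, clipped * adv))
    return sum(terms) / len(terms)


# Illustrative batch: the new policy doubles the probability of a good action
# and halves the probability of a bad one.
value = ppo_clipped_objective(
    new_logp=[math.log(2.0), math.log(0.5)],
    old_logp=[0.0, 0.0],
    advantages=[1.0, -1.0],
)
print(value)
```

The clipping removes the incentive to move the policy far from the one that collected the data in a single update, which is one reason PPO is often regarded as a stable choice for training controllers.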

For real-time validation, a Simulink-based environment was implemented, allowing interaction between sensors, actuators, and the embedded processor. Figure 2 shows the configuration used for these tests.

This marks one of the first demonstrations that AI-based controllers can be compatible with aerospace safety standards, opening the door to a new generation of autonomous, resilient, and adaptable spacecraft. Beyond satellites, the methodology could be transferred to avionics, robotics, and even autonomous vehicles, all domains where controllers must make rapid, reliable decisions under dynamic conditions.

With this work, the IPTC team shows that AI is not just a promising idea for space; it is becoming a viable technology for the next wave of autonomous missions.

Alejandro Alonso Muñoz: GS / ORCID / LinkedIn

Ángel Grover Pérez Muñoz: GS / ORCID / LinkedIn

Bibliographic reference:

Grover-Pérez, Á., López-García, G., García-Villoria, I., Alonso, A., Porras-Hermoso, Á. & Pérez, M. (2025). Feasibility of Deep Reinforcement Learning for the real-time attitude control of a satellite system. Journal of Systems Architecture, 167, 103513. https://doi.org/10.1016/j.sysarc.2025.103513

Images from the article:

Fig.1. Reinforcement learning elements. (p.3)

Fig.2. Configuration of the validation on an embedded system. In the figure, MGM refers to magnetometers and MGT to magnetorquers. (p.13)

For more information: www.iptc.upm.es

LinkedIn: https://www.linkedin.com/company/iptc-upm/