Multi-Armed Bandit Learning for ISAC Systems in Time-Varying Environments / Taricco, Giorgio. - In: IEEE WIRELESS COMMUNICATIONS LETTERS. - ISSN 2162-2337. - (2026). [10.1109/lwc.2026.3696048]
Multi-Armed Bandit Learning for ISAC Systems in Time-Varying Environments
Taricco, Giorgio
2026
Abstract
Integrated sensing and communication (ISAC) systems aim to jointly perform data transmission and environmental sensing, leading to coupled and often conflicting design objectives. This paper investigates a lightweight online learning framework for ISAC based on multi-armed bandit (MAB) algorithms. A finite beam codebook is used for sequential beam selection, while the sensing–communication trade-off is captured through a normalized reward combining communication rate and a CRLB-based sensing metric. To address time-varying environments, a sliding-window UCB strategy is adopted for beam selection and the trade-off parameter is adapted online on a slower timescale. Numerical results under abrupt changes, gradual drift, and multiple change points show that the proposed approach tracks environmental variations and improves dynamic-regret performance over fixed-α and stationary baselines.

| File | Type | License | Size | Format |
|---|---|---|---|---|
| Multi-Armed_Bandit_Learning_for_ISAC_Systems_in_Time-Varying_Environments.pdf (open access) | 2. Post-print / Author's Accepted Manuscript | Creative Commons | 770.79 kB | Adobe PDF |
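As a rough illustration of the sliding-window UCB beam selection mentioned in the abstract, the sketch below runs a generic sliding-window UCB policy over a three-beam codebook with one abrupt change. It is not taken from the paper: the reward form (`normalized_reward`), the constants `rate_max` and `crlb_min`, the window length, the exploration weight `c`, and the toy environment in `run_demo` are all assumptions made for illustration. The slower-timescale adaptation of the trade-off parameter α is omitted, so α stays fixed here.

```python
import math
import random
from collections import deque


def normalized_reward(rate, crlb, alpha, rate_max, crlb_min):
    """Assumed reward: convex mix of a normalized rate and a CRLB-based score."""
    comm = min(rate / rate_max, 1.0)
    sens = min(crlb_min / crlb, 1.0)  # a lower CRLB gives a higher sensing score
    return alpha * comm + (1.0 - alpha) * sens


class SlidingWindowUCB:
    """Generic UCB over a finite beam codebook using only the last `window` plays."""

    def __init__(self, n_beams, window, c=0.5):
        self.n_beams = n_beams
        self.c = c  # exploration weight (assumed value)
        self.history = deque(maxlen=window)  # (beam, reward); old entries drop out

    def select(self):
        counts = [0] * self.n_beams
        sums = [0.0] * self.n_beams
        for beam, reward in self.history:
            counts[beam] += 1
            sums[beam] += reward
        # Any beam with no sample left in the window is replayed first.
        for beam in range(self.n_beams):
            if counts[beam] == 0:
                return beam
        t = len(self.history)

        def index(b):
            bonus = self.c * math.sqrt(math.log(t) / counts[b])
            return sums[b] / counts[b] + bonus

        return max(range(self.n_beams), key=index)

    def update(self, beam, reward):
        self.history.append((beam, reward))


def run_demo(seed=0, steps_per_phase=300, window=50, alpha=0.5):
    """Toy environment (hypothetical): beam qualities are reversed halfway through."""
    rng = random.Random(seed)
    rates = [4.0, 2.0, 1.0]     # per-beam rates, arbitrary units
    crlbs = [0.02, 0.05, 0.10]  # per-beam CRLB values, arbitrary units
    policy = SlidingWindowUCB(n_beams=3, window=window)
    picks = []
    for t in range(2 * steps_per_phase):
        if t == steps_per_phase:  # abrupt change: the best beam moves from 0 to 2
            rates.reverse()
            crlbs.reverse()
        beam = policy.select()
        mean = normalized_reward(rates[beam], crlbs[beam], alpha, 4.0, 0.02)
        reward = min(max(mean + rng.gauss(0.0, 0.05), 0.0), 1.0)
        policy.update(beam, reward)
        picks.append(beam)
    tail = picks[-100:]
    return tail.count(2) / len(tail)  # share of late plays on the new best beam
```

Calling `run_demo()` returns the fraction of the last 100 selections that fall on the beam that became best after the change. Because samples older than the window are discarded, the policy stops relying on pre-change rewards, which is the property that makes a windowed index suitable for time-varying environments.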
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/3011327
