Hermes, a low-latency transactional storage for binary data streams from remote devices / Scaffidi Militone, Gabriele; Apiletti, Daniele; Malnati, Giovanni. - In: DATA & KNOWLEDGE ENGINEERING. - ISSN 0169-023X. - 153:(2024). [10.1016/j.datak.2024.102315]

Hermes, a low-latency transactional storage for binary data streams from remote devices

Scaffidi Militone, Gabriele; Apiletti, Daniele; Malnati, Giovanni
2024

Abstract

In many contexts where data is streamed on a large scale, such as video surveillance systems, there is a dual requirement: secure data storage and continuous access to audio and video content by third parties, such as human operators or specific business logic, even while the media files are still being collected. However, using transactions to ensure data persistence often reduces system throughput and increases latency. This paper presents a solution that enables both high ingestion rates with transactional data persistence and near real-time, low-latency access to the stream during collection. This immediate access enables the prompt application of specialized data engineering algorithms during data acquisition. The proposed solution is particularly suitable for binary data sources such as audio and video recordings in surveillance systems, and it can be extended to various big data scenarios via well-defined general interfaces. The scalability of the approach is based on a microservice architecture. Preliminary results obtained with Apache Kafka and MongoDB replica sets show that the proposed solution provides up to 3 times higher throughput and 2.2 times lower latency compared to standard multi-document transactions.
Files in this item:
02_DKE_Hermes--a-low-latency-transactional-storage-for-binar_2024_Data---Knowledge-.pdf (Adobe PDF, 1.31 MB), restricted access

Description: Version accessible on the publisher's website
Type: 2a Post-print, publisher's version / Version of Record
License: Not public - private/restricted access
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2992504