
Automatic slides generation in the absence of training data / Cagliero, Luca; La Quatra, Moreno. - ELECTRONIC. - (2021), pp. 103-108. (Paper presented at the IEEE Annual International Computer Software and Applications Conference (COMPSAC), held online, July 12-16, 2021) [10.1109/COMPSAC51774.2021.00025].

Automatic slides generation in the absence of training data

Luca Cagliero; Moreno La Quatra
2021

Abstract

Disseminating the main research findings is one of the key requirements to become a successful researcher. Presentation slides are the most common way to present paper content. To support researchers in slide preparation, the NLP research community has explored the use of summarization techniques to automatically generate a draft of the slides consisting of the most salient sentences or phrases. State-of-the-art methods adopt a supervised approach, which first estimates global content relevance using a set of training papers and slides and then performs content selection by also optimizing section-level coverage. However, in several domains and contexts there is a lack of training data, which hinders the use of supervised models. This paper addresses the above issue by applying unsupervised summarization methods. They are used to generate sentence-level summaries of the paper sections, which are then refined by applying an optimization step. Furthermore, the quality of the output slides is evaluated by taking the original paper structure into account as well. The results, achieved on a benchmark collection of papers and slides, show that unsupervised models performed better than supervised ones on specific paper facets, while remaining competitive in terms of overall quality score.
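To make the described pipeline concrete, the sketch below shows how an unsupervised, section-wise extractive summarizer could produce draft slide bullets. The abstract does not name the specific unsupervised methods used in the paper, so a TextRank-style sentence ranking is used here purely as an illustrative stand-in; the function names (textrank_section_summary, draft_slides) and parameters are hypothetical, and the paper's additional optimization/refinement step is deliberately omitted.

```python
# Illustrative sketch only: the paper does not prescribe this exact summarizer;
# a TextRank-like unsupervised ranking is used as a stand-in.
import re
import networkx as nx
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def split_sentences(text):
    # Naive splitter for illustration; a real pipeline would use a proper
    # sentence tokenizer (e.g., spaCy or NLTK punkt).
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def textrank_section_summary(section_text, max_sentences=3):
    """Rank the sentences of one paper section with an unsupervised,
    graph-based scheme and keep the top-scoring ones as slide bullets."""
    sentences = split_sentences(section_text)
    if len(sentences) <= max_sentences:
        return sentences
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(sentences)
    sim = cosine_similarity(tfidf)           # sentence-sentence similarity graph
    graph = nx.from_numpy_array(sim)
    scores = nx.pagerank(graph)              # centrality as a proxy for salience
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:max_sentences])    # restore original sentence order
    return [sentences[i] for i in keep]


def draft_slides(paper_sections, bullets_per_slide=3):
    """One draft slide per section: title = section heading, body = extracted
    sentences. The refinement/optimization step described in the paper
    (e.g., enforcing section-level coverage) is not modeled here."""
    return {
        heading: textrank_section_summary(body, bullets_per_slide)
        for heading, body in paper_sections.items()
    }
```

Used on a dictionary mapping section headings to section text, draft_slides would return one bullet list per section, which is the kind of slide draft the unsupervised baseline in the paper is meant to produce before refinement.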
Files in this product:

COMPSAC_2021.pdf (open access)
Description: Accepted version with conference layout
Type: 2. Post-print / Author's Accepted Manuscript
License: Public - All rights reserved
Size: 432.37 kB
Format: Adobe PDF

Automatic_slides_generation_in_the_absence_of_training_data.pdf (not available)
Type: 2a. Post-print, publisher's version / Version of Record
License: Non-public - Private/restricted access
Size: 2.25 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2919520