Scientific articles can be annotated with short sentences, called highlights, providing readers with an at-a-glance overview of the main findings. Highlights are usually manually specified by the authors. This paper presents a supervised approach, based on regression techniques, with the twofold aim at automatically extracting highlights of past articles with missing annotations and simplifying the process of manually annotating new articles. To this end, regression models are trained on a variety of features extracted from previously annotated articles. The proposed approach extends existing extractive approaches by predicting a similarity score, based on n-gram co-occurrences, between article sentences and highlights. The experimental results, achieved on a benchmark collection of articles ranging over heterogeneous topics, show that the proposed regression models perform better than existing methods, both supervised and not.

Extracting highlights of scientific articles: A supervised summarization approach / Cagliero, L.; La Quatra, M.. - In: EXPERT SYSTEMS WITH APPLICATIONS. - ISSN 0957-4174. - 160 (113659):(2020). [10.1016/j.eswa.2020.113659]

Extracting highlights of scientific articles: A supervised summarization approach

Cagliero L.;La Quatra M.
2020

Abstract

Scientific articles can be annotated with short sentences, called highlights, providing readers with an at-a-glance overview of the main findings. Highlights are usually manually specified by the authors. This paper presents a supervised approach, based on regression techniques, with the twofold aim at automatically extracting highlights of past articles with missing annotations and simplifying the process of manually annotating new articles. To this end, regression models are trained on a variety of features extracted from previously annotated articles. The proposed approach extends existing extractive approaches by predicting a similarity score, based on n-gram co-occurrences, between article sentences and highlights. The experimental results, achieved on a benchmark collection of articles ranging over heterogeneous topics, show that the proposed regression models perform better than existing methods, both supervised and not.
File in questo prodotto:
File Dimensione Formato  
1-s2.0-S0957417420304838-main.pdf

accesso riservato

Tipologia: 2a Post-print versione editoriale / Version of Record
Licenza: Non Pubblico - Accesso privato/ristretto
Dimensione 3.35 MB
Formato Adobe PDF
3.35 MB Adobe PDF   Visualizza/Apri   Richiedi una copia
post_print_eswa_highlights.pdf

Open Access dal 28/06/2022

Descrizione: Post-prints dell'autore without editor layout
Tipologia: 2. Post-print / Author's Accepted Manuscript
Licenza: Creative commons
Dimensione 7.86 MB
Formato Adobe PDF
7.86 MB Adobe PDF Visualizza/Apri
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11583/2840600