Towards an Explorable Conceptual Map of Large Language Models / Bertetto, Lorenzo; Bettinelli, Francesca; Buda, Alessio; Da Mommio, Marco; Di Bari, Simone; Savelli, Claudio; Baralis, Elena Maria; Bernasconi, Anna; Cagliero, Luca; Ceri, Stefano; Pierri, Francesco. - 520:(2024), pp. 82-90. (Paper presented at the 36th International Conference on Advanced Information Systems Engineering, CAiSE 2024, held in Limassol (CYP), June 3–7, 2024) [10.1007/978-3-031-61000-4_10].

Towards an Explorable Conceptual Map of Large Language Models

Lorenzo Bertetto; Simone Di Bari; Claudio Savelli; Elena Maria Baralis; Luca Cagliero; Francesco Pierri
2024

Abstract

Large Language Models (LLMs) have revolutionized the current landscape of Natural Language Processing, enabling unprecedented advances in text generation, translation, summarization, and more. So far, limited effort has been devoted to providing a high-level and systematic description of their properties. Today’s primary source of information is the Hugging Face (HF) catalog, a rich digital repository for researchers and developers. Although it hosts several models, datasets, and applications, its underlying data model supports only limited exploration of linked information. In this work, we propose a conceptual map for describing the landscape of LLMs, organized using the classical entity-relationship model. Our semantically rich data model allows end-users to answer insightful queries regarding, e.g., which metrics are most appropriate for assessing a specific LLM’s performance on a given downstream task. We first model the resources available in HF and then show how this map can be extended to support additional concepts and more insightful relationships. Our proposal is a first step towards developing a well-organized, high-level knowledge base supporting user-friendly interfaces for querying and discovering LLM properties.
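
As a rough illustration of the kind of query mentioned in the abstract, the sketch below encodes a toy entity-relationship structure in Python. It is not the schema proposed in the paper: the entity names (`Model`, `Task`, `Metric`), the two relationships (`supports`, `assesses`), and all instance data are hypothetical and chosen only to show how linked information could be traversed.

```python
from dataclasses import dataclass, field


# Hypothetical entities; the paper's actual schema may differ.
@dataclass(frozen=True)
class Task:
    name: str


@dataclass(frozen=True)
class Metric:
    name: str


@dataclass(frozen=True)
class Model:
    name: str


@dataclass
class ConceptualMap:
    """Toy in-memory store for two assumed relationships."""
    # Relationship: Model supports Task
    supports: set = field(default_factory=set)
    # Relationship: Metric assesses Task
    assesses: set = field(default_factory=set)

    def add_support(self, model: Model, task: Task) -> None:
        self.supports.add((model, task))

    def add_assessment(self, metric: Metric, task: Task) -> None:
        self.assesses.add((metric, task))

    def metrics_for(self, model: Model, task: Task) -> list:
        """Return names of metrics linked to `task`, if `model` supports it."""
        if (model, task) not in self.supports:
            return []
        return sorted(m.name for m, t in self.assesses if t == task)


# Placeholder instances, not taken from the paper or from Hugging Face.
cmap = ConceptualMap()
summarization = Task("summarization")
translation = Task("translation")
model_a = Model("example-model-a")
cmap.add_support(model_a, summarization)
cmap.add_assessment(Metric("ROUGE"), summarization)
cmap.add_assessment(Metric("BLEU"), translation)

print(cmap.metrics_for(model_a, summarization))
```

Here the query joins the two relationships on the shared `Task` entity, which is the sort of linked exploration a flat catalog listing does not directly offer.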
ISBN: 9783031609992
ISBN: 9783031610004
Files in this item:
File: 978-3-031-61000-4_10.pdf (restricted access)
Type: 2a Post-print publisher's version / Version of Record
License: Not public - private/restricted access
Size: 297.67 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2995232