XML is a rather verbose representation of semistructured data, which may require huge amounts of storage space. We propose a summarized representation of XML data, based on the concept of instance pattern, which can both provide succinct information and be directly queried. The physical representation of instance patterns exploits itemsets or association rules to summarize the content of XML datasets. Instance patterns may be used to answer queries, possibly only partially, either when fast and approximate answers are required or when the actual dataset is not available, for example because it is currently unreachable. Experiments on large XML documents show that instance patterns allow a significant reduction in storage space while preserving almost entirely the completeness of the query result. Furthermore, they provide fast query answers and scale well with the size of the dataset, thus overcoming the document size limitation of most current XQuery engines.
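The idea of summarizing XML content with association rules and then querying the summary can be illustrated with a minimal sketch. It is not the paper's algorithm or its instance-pattern format: the toy document, the element names, the thresholds, and the functions `mine_rules` and `answer` are all assumptions made for illustration. Each record becomes a set of (tag, value) items, and only pairwise rules that meet the support and confidence thresholds are kept.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from itertools import permutations

# Hypothetical toy dataset; element names and values are illustrative only.
XML = """
<conferences>
  <paper><venue>VLDB</venue><topic>XML</topic></paper>
  <paper><venue>VLDB</venue><topic>XML</topic></paper>
  <paper><venue>VLDB</venue><topic>Mining</topic></paper>
  <paper><venue>ICDE</venue><topic>Mining</topic></paper>
</conferences>
"""

def mine_rules(xml_text, min_support=0.25, min_confidence=0.6):
    """Return {(body, head): (support, confidence)} for pairwise rules."""
    root = ET.fromstring(xml_text)
    # One transaction per record: the set of (tag, value) items it contains.
    transactions = [
        frozenset((child.tag, child.text.strip()) for child in record)
        for record in root
    ]
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    # Ordered pairs, so body -> head and head -> body are counted separately.
    pair_counts = Counter(
        pair for t in transactions for pair in permutations(sorted(t), 2)
    )
    rules = {}
    for (body, head), count in pair_counts.items():
        support = count / n
        confidence = count / item_counts[body]
        if support >= min_support and confidence >= min_confidence:
            rules[(body, head)] = (support, confidence)
    return rules

def answer(rules, body):
    """Approximate answer: items implied by `body`, with rule confidence."""
    return sorted(
        (head, conf) for (b, head), (_, conf) in rules.items() if b == body
    )

rules = mine_rules(XML)
print(answer(rules, ("topic", "XML")))
print(answer(rules, ("topic", "Mining")))
```

In this sketch the query on `("topic", "Mining")` returns nothing, because both of its candidate rules fall below the confidence threshold and are pruned. This is the kind of partial answer a summary can give: the thresholds trade storage for completeness.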
|Title:||Answering XML Queries by means of Data Summaries|
|Publication date:||2007|
|Digital Object Identifier (DOI):||10.1145/1247715.1247716|
|Appears in types:||1.1 Journal article|