Berro, Alain, Megdiche-Bousarsar, Imen and Teste, Olivier (2015) Graph-Based ETL Processes For Warehousing Statistical Open Data. In: 17th International Conference on Enterprise Information Systems, 27 April 2015 - 30 April 2015, Barcelona, Spain.
Abstract
Warehousing is a promising means to cross and analyse Statistical Open Data (SOD). However, extracting structures from, integrating, and defining a multidimensional schema over several scattered and heterogeneous tables in SOD are major problems that challenge traditional ETL (Extract-Transform-Load) processes. In this paper, we present a three-step ETL process that relies on RDF graphs to address these problems. In the first step, we automatically extract table structures and values using a table anatomy ontology. This phase converts structurally heterogeneous tables into a unified RDF graph representation. The second step performs a holistic integration of several semantically heterogeneous RDF graphs. The optimal integration is computed by an Integer Linear Program (ILP). In the third step, the system interacts with users to incrementally transform the integrated RDF graph into a multidimensional schema.
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Language: | English |
Date: | 2015 |
Uncontrolled Keywords: | Open Data - RDF Graphs - ETL - Holistic Integration - Multidimensional Schema |
Subjects: | H- INFORMATIQUE |
Divisions: | Institut de Recherche en Informatique de Toulouse |
Site: | UT1 |
Date Deposited: | 20 Feb 2019 15:34 |
Last Modified: | 02 Apr 2021 15:59 |
URI: | https://publications.ut-capitole.fr/id/eprint/29436 |