Berro, Alain, Megdiche-Bousarsar, Imen and Teste, Olivier (2015) Graph-Based ETL Processes For Warehousing Statistical Open Data. In: 17th International Conference on Enterprise Information Systems, 27 April 2015 - 30 April 2015, Barcelona, Spain.

Warehousing is a promising means to cross-reference and analyse Statistical Open Data (SOD). However, extracting structures, integrating tables and defining a multidimensional schema from several scattered and heterogeneous tables in the SOD are major problems that challenge traditional ETL (Extract-Transform-Load) processes. In this paper, we present a three-step ETL process that relies on RDF graphs to address these problems. In the first step, we automatically extract table structures and values using a table anatomy ontology. This phase converts structurally heterogeneous tables into a unified RDF graph representation. The second step performs a holistic integration of several semantically heterogeneous RDF graphs. The optimal integration is computed by an Integer Linear Program (ILP). In the third step, the system interacts with users to incrementally transform the integrated RDF graph into a multidimensional schema.
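
To make the first step more concrete, the following is a minimal sketch of how a small statistical table might be flattened into subject-predicate-object triples. It is not the authors' implementation: the function name table_to_triples, the ex: vocabulary (ex:ColumnHeader, ex:RowHeader, ex:NumericCell, ex:hasRowHeader, ex:hasColumnHeader, ex:value) and the toy table are all assumptions made for illustration, and the paper's table anatomy ontology is not reproduced here. Plain Python tuples stand in for an RDF library.

```python
from typing import List, Tuple

# A triple is (subject, predicate, object), all plain strings in this sketch.
Triple = Tuple[str, str, str]


def table_to_triples(
    table_id: str,
    col_headers: List[str],
    rows: List[Tuple[str, List[float]]],
) -> List[Triple]:
    """Flatten a table with one header row and one header column into triples.

    The "ex:" terms are hypothetical placeholders, not the ontology
    described in the paper.
    """
    triples: List[Triple] = []

    # One node per column header.
    for j, header in enumerate(col_headers):
        col = f"{table_id}/col/{j}"
        triples.append((col, "rdf:type", "ex:ColumnHeader"))
        triples.append((col, "rdfs:label", header))

    for i, (row_label, values) in enumerate(rows):
        # One node per row header.
        row = f"{table_id}/row/{i}"
        triples.append((row, "rdf:type", "ex:RowHeader"))
        triples.append((row, "rdfs:label", row_label))

        # One node per numeric cell, linked to its row and column headers.
        for j, value in enumerate(values):
            cell = f"{table_id}/cell/{i}_{j}"
            triples.append((cell, "rdf:type", "ex:NumericCell"))
            triples.append((cell, "ex:hasRowHeader", row))
            triples.append((cell, "ex:hasColumnHeader", f"{table_id}/col/{j}"))
            triples.append((cell, "ex:value", str(value)))

    return triples


# Toy input with invented values, for illustration only.
example = table_to_triples(
    "t1",
    ["2013", "2014"],
    [("Region A", [12.5, 13.0]), ("Region B", [8.0, 9.5])],
)
```

Representing every table in the same triple shape, whatever its original layout, is what would let a later integration step compare header labels across graphs without caring how each source table was structured.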

Item Type: Conference or Workshop Item (Paper)
Language: English
Date: 2015
Uncontrolled Keywords: Open Data - RDF Graphs - ETL - Holistic Integration - Multidimensional Schema
Divisions: Institut de Recherche en Informatique de Toulouse
Site: UT1
Date Deposited: 20 Feb 2019 15:34
Last Modified: 02 Apr 2021 15:59