%0 Conference Paper
%A Berro, Alain
%A Megdiche-Bousarsar, Imen
%A Teste, Olivier
%B 17th International Conference on Enterprise Information Systems
%C Barcelona, Spain
%D 2015
%F publications:29436
%I INSTICC - Institute for Systems and Technologies of Information, Control and Communication
%K Open Data - RDF Graphs - ETL - Holistic Integration - Multidimensional Schema
%P 271-278
%T Graph-Based ETL Processes For Warehousing Statistical Open Data
%U https://publications.ut-capitole.fr/id/eprint/29436/
%X Warehousing is a promising means to cross-reference and analyse Statistical Open Data (SOD). However, extracting structures, integrating, and defining a multidimensional schema from several scattered and heterogeneous tables in SOD are major problems that challenge traditional ETL (Extract-Transform-Load) processes. In this paper, we present a three-step ETL process that relies on RDF graphs to address these problems. In the first step, we automatically extract table structures and values using a table anatomy ontology. This phase converts structurally heterogeneous tables into a unified RDF graph representation. The second step performs a holistic integration of several semantically heterogeneous RDF graphs. The optimal integration is computed by an Integer Linear Program (ILP). In the third step, the system interacts with users to incrementally transform the integrated RDF graph into a multidimensional schema.