A Hybrid Data Warehouse Model to Improve Mining Algorithms

Kadhim B.S. AlJanabi, Rusul Kadhim Meshjal

Abstract


The performance of data mining algorithms, including classification, clustering, association, and prediction, is closely tied to the approach used in data warehouse design and to the way the data is stored (detailed, lightly summarized, or highly summarized). Detailed data is needed to produce detailed reports, but its sheer volume is a major challenge for mining algorithms. Summarized data, on the other hand, improves algorithm performance, but the knowledge lost in summarization may affect the overall mining process.
Knowledge extraction and the performance and complexity of mining algorithms are major challenges in the data analysis field. This paper therefore proposes an approach to improve algorithm performance through a well-designed warehouse and a data reduction technique.
The paper presents a hybrid warehouse galaxy model that stores data in three formats: detailed, summarized, and highly summarized. Time and space complexity are the major criteria in the proposed approach.
Real data about schools, students, and teachers was collected from different cities of AlNajaf AlAshraf. The data was preprocessed, reduced mainly through concept hierarchies, and then converted into dimension and fact tables (the warehouse galaxy model), which in turn were converted into multidimensional cubes. Roll-up and drill-down queries were used extensively to obtain the required information.
The resulting data cubes, and in turn the corresponding warehouse model, showed a reasonable improvement in the performance of knowledge extraction algorithms on the data under study.
The query results showed that roll-up and drill-down queries performed better than queries on the detailed data.
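
The relationship between the three storage levels and the roll-up and drill-down queries can be illustrated with a minimal sketch. It is not the authors' implementation: the school and city names, the row values, and the function names (roll_up_to_school, roll_up_to_city, drill_down) are all hypothetical, and a plain Python dictionary stands in for a concept hierarchy that a real warehouse would hold in dimension tables.

```python
from collections import defaultdict

# Hypothetical concept hierarchy: school -> city. Names are illustrative only,
# not taken from the paper's data set.
SCHOOL_TO_CITY = {
    "School A": "City X",
    "School B": "City X",
    "School C": "City Y",
}

# Detailed fact rows: (school, grade, number_of_students). Values are made up.
DETAILED_FACTS = [
    ("School A", 1, 30),
    ("School A", 2, 25),
    ("School B", 1, 40),
    ("School C", 1, 20),
    ("School C", 2, 35),
]


def roll_up_to_school(facts):
    """Summarized level: total students per school (grade dimension removed)."""
    totals = defaultdict(int)
    for school, _grade, students in facts:
        totals[school] += students
    return dict(totals)


def roll_up_to_city(school_totals, hierarchy):
    """Highly summarized level: climb the hierarchy from school to city."""
    totals = defaultdict(int)
    for school, students in school_totals.items():
        totals[hierarchy[school]] += students
    return dict(totals)


def drill_down(facts, hierarchy, city):
    """Return the detailed rows behind one city's summarized figure."""
    return [row for row in facts if hierarchy[row[0]] == city]


by_school = roll_up_to_school(DETAILED_FACTS)
by_city = roll_up_to_city(by_school, SCHOOL_TO_CITY)
```

In this sketch the highly summarized level is computed from the summarized level rather than from the detailed rows, which is one way a stored summary can reduce the work a query has to do. Whether the proposed model materializes its levels in this manner is an assumption made here for illustration.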


Copyright © 2018 Journal of Kufa for Mathematics and Computer.