International Journal of Advanced and Applied Sciences

Int. j. adv. appl. sci.

EISSN: 2313-3724

Print ISSN: 2313-626X

Volume 3, Issue 8 (August 2016), Pages: 78-84


Title: Data fusion in data federation using modified discriminative Markov logic networks

Authors: M. S. Hema 1,*, M. Nageswara Guptha 2

Affiliation(s):

1 Department of CSE, Sri Venkateshwara College of Engineering, Bengaluru, India
2 Department of ISE, Sri Venkateshwara College of Engineering, Bengaluru, India

http://dx.doi.org/10.21833/ijaas.2016.08.013


Abstract:

High-quality integrated data is crucial for the data mining process. Existing approaches use the "trust your friends" and "cry with the wolves" principles to resolve data conflicts: the former takes the value of a preferred source, and the latter takes the most frequent value. However, choosing the most trustworthy data source is a challenge for data integration, and trusting only certain sources is arbitrary. To mitigate these issues, the Data Fusion in Data Federation using Modified Discriminative Markov Logic Networks (DF-MDMLN) approach is proposed. Data fusion resolves conflicts among data from heterogeneous databases by utilizing multi-angle features and the knowledge of a discriminative Markov Logic Network (MLN). Data fusion is used to improve the precision and recall of the end users’ data set. An e-shopping application for computer peripherals is considered for experimentation to analyze the performance of the DF-MDMLN approach. Experiments on e-shopping data sets show the effectiveness of the DF-MDMLN approach: the precision and recall of data fusion improve by 40% and 27%, respectively.
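
For illustration only, the two baseline conflict-resolution principles named in the abstract (taking the value of a preferred source, and taking the most frequent value) might be sketched as follows. This is a minimal sketch and not the DF-MDMLN method; the function names, the source names, and the sample prices are hypothetical and do not come from the paper.

```python
from collections import Counter


def trust_your_friends(claims, preferred_source):
    """Return the value reported by the preferred source (None if it reports nothing)."""
    return claims.get(preferred_source)


def cry_with_the_wolves(claims):
    """Return the most frequent value across all sources.

    Ties are broken in favour of the value encountered first.
    """
    if not claims:
        return None
    counts = Counter(claims.values())
    return counts.most_common(1)[0][0]


# Hypothetical conflicting prices reported for one product by three sources.
claims = {"shop_a": 49.99, "shop_b": 45.00, "shop_c": 45.00}

print(trust_your_friends(claims, "shop_a"))  # 49.99
print(cry_with_the_wolves(claims))           # 45.0
```

The two strategies can disagree on the same input, as they do here, which is the kind of arbitrariness the abstract points to as motivation for a learned fusion approach.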

© 2016 The Authors. Published by IASE.

This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Keywords: Data federation, Data conflicts, Data fusion, Markov logic networks, Weight learning

Article History: Received 2 July 2016, Received in revised form 10 September 2016, Accepted 12 September 2016

Digital Object Identifier: http://dx.doi.org/10.21833/ijaas.2016.08.013

Citation:

Hema MS and Guptha MN (2016). Data fusion in data federation using modified discriminative Markov logic networks. International Journal of Advanced and Applied Sciences, 3(8): 78-84

http://www.science-gate.com/IJAAS/V3I8/Hema.html

