Approach to collaborative fuzzy clustering in large data analysis

Authors

  • Mai Dinh Sinh
  • Dang Trong Hop
  • Do Viet Duc
  • Ngo Thanh Long

DOI:

https://doi.org/10.54654/isj.v2i17.892

Keywords:

GPUs, collaborative clustering, fuzzy clustering, high-performance computing

Abstract

When data sets share one or more characteristics, the clustering of each data set can influence the clustering of the others. However, for reasons such as data security, these data sets often cannot be stored centrally and instead reside in different places. Collaborative clustering is a technique that performs local clustering on each sub-data set and exchanges information with the other data sets. A collaborative process then adjusts the clustering results on each subset so that every subset obtains a better clustering. This paper presents a collaborative fuzzy clustering approach for big data analysis, built on a high-performance computing model to improve computation speed. Experiments on the Kitsune network attack dataset show that the proposed algorithm significantly improves computation speed compared to the previous method.
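The local-clustering and information-exchange steps described in the abstract can be illustrated with a minimal sketch. It assumes horizontal collaboration in the style of Pedrycz's collaborative fuzzy clustering: every site holds the same objects under different features, the fuzzifier is m = 2, cluster indices correspond across sites, and only membership matrices (never raw data) are exchanged. The function names, the collaboration strength `beta`, and the toy data are hypothetical. The sketch runs on the CPU and does not reproduce the paper's high-performance implementation.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fcm_memberships(data, centers):
    """Standard fuzzy c-means membership update for fuzzifier m = 2."""
    U = []
    for x in data:
        # Clamp distances so a pattern lying on a prototype does not divide by zero.
        inv = [1.0 / max(sq_dist(x, v), 1e-12) for v in centers]
        total = sum(inv)
        U.append([w / total for w in inv])
    return U  # U[k][s] = membership of pattern k in cluster s

def collaborate(U_local, others, beta):
    """Blend local memberships with the partition matrices of other sites."""
    psi = beta * len(others)
    return [
        [(row[s] + beta * sum(O[k][s] for O in others)) / (1.0 + psi)
         for s in range(len(row))]
        for k, row in enumerate(U_local)
    ]

def update_centers(data, U, others, beta):
    """Prototype update; disagreement with other sites adds extra weight."""
    dim = len(data[0])
    centers = []
    for s in range(len(U[0])):
        num, den = [0.0] * dim, 0.0
        for k, x in enumerate(data):
            w = U[k][s] ** 2 + beta * sum((U[k][s] - O[k][s]) ** 2 for O in others)
            den += w
            for t in range(dim):
                num[t] += w * x[t]
        centers.append([n / den for n in num])
    return centers

def local_fcm(data, c, others=(), beta=0.0, iters=50, seed=0):
    """With no collaborators this is plain FCM; otherwise it is the collaborative phase."""
    rng = random.Random(seed)
    centers = [list(data[i]) for i in rng.sample(range(len(data)), c)]
    U = fcm_memberships(data, centers)
    for _ in range(iters):
        U = collaborate(fcm_memberships(data, centers), others, beta)
        centers = update_centers(data, U, others, beta)
    return U, centers

# Toy example: the same six objects described by a different feature at each site.
site_a = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]   # well-separated view
site_b = [[1.0], [2.6], [1.4], [2.4], [3.9], [3.1]]   # noisier view
U_a, _ = local_fcm(site_a, c=2)                        # site A clusters locally
U_b, centers_b = local_fcm(site_b, c=2, others=[U_a], beta=2.0)  # site B uses A's partition
```

Setting `beta` to 0 recovers independent local clustering, while larger values pull a site's partition toward those of its collaborators. Each pattern's update is independent of the others, which is what makes this kind of loop a natural candidate for parallel hardware.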

Published

2023-03-31

How to Cite

Sinh, M. D., Hop, D. T., Duc, D. V., & Long, N. T. (2023). Approach to collaborative fuzzy clustering in large data analysis. Journal of Science and Technology on Information Security, 3(17), 10-16. https://doi.org/10.54654/isj.v2i17.892

Section

Papers