© 2021. This work is published under https://creativecommons.org/licenses/by/3.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Recent advances in the internet, social media, and internet of things (IoT) devices have significantly increased the amount of data generated in a variety of formats. These data must be converted into formats that data analysis techniques can handle easily. Applying machine learning algorithms to large, complex data sets is mathematically and physically expensive: it is a resource-intensive process that requires large amounts of logical and physical resources. Machine learning is a sophisticated data analytics technology that has grown in importance because of the massive amount of data generated daily that needs to be examined. The Apache Spark machine learning library (MLlib) is one of the big data analysis platforms; it provides a variety of functions for machine learning tasks, ranging from classification to regression and dimension reduction. From a computational standpoint, this research investigates Apache Spark MLlib 2.0 as an open source, autonomous, scalable, and distributed learning library. Several real-world machine learning experiments are carried out to evaluate the properties of the platform both qualitatively and quantitatively. Some of the fundamental concepts and approaches for developing a scalable data model in a distributed environment are also discussed.

Details

Title
Large scale data analysis using MLlib
Author
Ali, Ahmed Hussein 1 ; Abbod, Maan Nawaf 2 ; Khaleel, Mohammed Khamees 3 ; Mohammed, Mostafa Abdulghafoor 2 ; Sutikno, Tole 4 

1 ICCI, Informatics Institute for Postgraduate Studies, Baghdad, Iraq
2 Imam Aadham University College, Iraq
3 Department of Computer, College of Education, AL-Iraqia University, Baghdad, Iraq
4 Department of Electrical Engineering, Universitas Ahmad Dahlan, Yogyakarta, Indonesia
Pages
1735-1746
Publication year
2021
Publication date
Oct 2021
Publisher
Ahmad Dahlan University
ISSN
16936930
e-ISSN
23029293
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2582833813