Abstract

This paper presents a high-performance web search system leveraging big data technology. Utilizing a heterogeneous architecture and a parallel distributed computing model based on the MapReduce framework, the system significantly enhances efficiency, scalability, and reliability. The design includes a storage management scheme that integrates cloud storage and grid computing technologies, facilitating efficient storage and rapid access to large-scale data. Key components such as an inverted index structure, vector space model, and semantic analysis models are employed to implement functionalities across the data, logic, and display layers. An experimental environment was set up on the Microsoft Azure cloud platform using the Common Crawl dataset for testing. Performance evaluation, based on metrics including response time, accuracy, and stability, demonstrates the system's superior performance compared to two existing systems, thereby validating its effectiveness.

Details

1009240
Business indexing term
Title
A High Performance Computing Web Search Engine Based on Big Data and Parallel Distributed Models
Author
Ma, Jun

School of Information Engineering, Changsha Medical University, Changsha 410219, China
Publication title
Informatica; Ljubljana
Volume
48
Issue
20
Pages
27-38
Publication year
2024
Publication date
Dec 2024
Publisher
Slovenian Society Informatika / Slovensko drustvo Informatika
Place of publication
Ljubljana
Country of publication
Slovenia
Publication subject
ISSN
03505596
e-ISSN
18543871
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
ProQuest document ID
3163253355
Document URL
https://www.proquest.com/scholarly-journals/high-performance-computing-web-search-engine/docview/3163253355/se-2?accountid=208611
Copyright
© 2024. This work is published under https://creativecommons.org/licenses/by/3.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-08-14
Database
ProQuest One Academic