Abstract

Chinese is difficult for computers to process correctly because of the complexity of the language, and Chinese search technology suffers from many problems as a result. This paper addresses the shortcomings of Chinese word segmentation technology. It proposes a Chinese word segmentation algorithm based on two-way maximum matching and applies it to the Nutch search engine. Test results show that the proposed algorithm improves Nutch's ability to process Chinese text.
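The abstract names two-way maximum matching without giving its details, so the following is only a minimal Python sketch of the general technique, not the authors' implementation. It assumes a plain in-memory dictionary (`vocab`), falls back to single characters for out-of-dictionary text, and resolves disagreements with a commonly used heuristic (fewer words, then fewer single-character words, then the backward result). All function names and the tie-breaking rule are assumptions made for illustration.

```python
def forward_max_match(text, vocab, max_len):
    """Scan left to right, taking the longest dictionary word at each position."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted so the scan cannot stall.
            if size == 1 or piece in vocab:
                words.append(piece)
                i += size
                break
    return words


def backward_max_match(text, vocab, max_len):
    """Scan right to left, taking the longest dictionary word ending at each position."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                words.append(piece)
                j -= size
                break
    words.reverse()  # words were collected from the end of the text
    return words


def count_single_chars(words):
    return sum(1 for w in words if len(w) == 1)


def bidirectional_max_match(text, vocab):
    """Run both scans and keep the result preferred by the assumed heuristic."""
    max_len = max((len(w) for w in vocab), default=1)
    fwd = forward_max_match(text, vocab, max_len)
    bwd = backward_max_match(text, vocab, max_len)
    # Assumed heuristic: fewer words wins.
    if len(fwd) != len(bwd):
        return fwd if len(fwd) < len(bwd) else bwd
    # Then fewer single-character words wins; ties go to the backward result.
    if count_single_chars(fwd) < count_single_chars(bwd):
        return fwd
    return bwd
```

With the hypothetical dictionary `{"研究", "研究生", "生命", "的", "起源"}` and the input `"研究生命的起源"`, the forward scan greedily takes `研究生` and leaves `命` stranded, while the backward scan yields `研究 / 生命 / 的 / 起源`; both contain four words, so the single-character count decides in favour of the backward result.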

Details

Title
Analysis and Improvement of Chinese Index Technology of Open Source Search Engine Nutch
Author
Tan, Min¹; Hong, Lan¹

¹ School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, Jiangxi, China
Publication year
2019
Publication date
Mar 2019
Publisher
IOP Publishing
ISSN
1742-6588
e-ISSN
1742-6596
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2566022046
Copyright
© 2019. This work is published under http://creativecommons.org/licenses/by/3.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.