Part-of-speech (POS) tagging is widely used in natural language processing. Many statistical approaches have been developed for this task, the most popular being the Hidden Markov Model. This paper introduces an alternative approach, linear-chain Conditional Random Fields (CRFs). A CRF is a factor graph approach that can naturally incorporate arbitrary, non-independent features of the input without assuming conditional independence among the features or making distributional assumptions about the inputs. We apply CRFs to part-of-speech tagging of the words in car reviews and then to feature extraction, the output of which can be used as input to an opinion mining system. To reduce computation time, we also propose training the CRFs with the limited-memory BFGS algorithm. Furthermore, we evaluate CRFs against the classical graph approach on the car review dataset and demonstrate that CRFs give more robust results with a smaller training dataset.
Details
; Shen, Gang 3 ; Gao, Di 4 ; Wang, Yu 5
1 Syngenta Seeds, LLC, Basel, Switzerland
2 Louisiana Tech University, Department of Mathematics and Statistics, Ruston, USA (GRID:grid.259237.8) (ISNI:0000 0001 2150 6076)
3 North Dakota State University, Department of Statistics, Fargo, USA (GRID:grid.261055.5) (ISNI:0000 0001 2293 4611)
4 Sam Houston State University, Department of Mathematics and Statistics, Huntsville, USA (GRID:grid.263046.5) (ISNI:0000 0001 2291 1903)
5 Texas A&M University, Department of Statistics, College Station, USA (GRID:grid.264756.4) (ISNI:0000 0004 4687 2082)