Content area
Abstract
Approximate pattern matching not only is more general than exact pattern matching, but also allows some data noise. Most of them adopt the Hamming distance to measure similarity, which indicates the number of different characters in two sequences, but it cannot reflect the approximation between two characters. This paper addresses the approximate pattern matching with a local distance no larger than δ and a global distance no larger than γ, which is named Delta and gamma Pattern matching with gap constraints under One-off condition (DPO). First, we show that the problem is an NP-Hard problem. Therefore, we construct a heuristic algorithm named approximate Nettree for DPO (NetDPO), which transforms the problem into an approximate Nettree based on δ distance which is a specially designed data structure. Then, NetDPO calculates the number of paths that reach the roots within γ distance. To find the maximal occurrences, we employ the rightmost parent strategy and the optimal parent strategy to select the better occurrence which can minimize the influence after removing the occurrence. Iterate this process until there are no occurrences. Finally, we analyze the time and space complexities of NetDPO. Extensive experimental results verify the superiority of the proposed algorithm.
Details
; Wu, Xindong 5 1 Hebei University of Technology, School of Economics and Management, Tianjin, China (GRID:grid.412030.4) (ISNI:0000 0000 9226 1013)
2 Hebei University of Technology, School of Artificial Intelligence, Tianjin, China (GRID:grid.412030.4) (ISNI:0000 0000 9226 1013)
3 Hebei University of Technology, State Key Laboratory of Reliability and Intelligence of Electrical Equipment, Tianjin, China (GRID:grid.412030.4) (ISNI:0000 0000 9226 1013)
4 Hebei University of Technology, School of Artificial Intelligence, Tianjin, China (GRID:grid.412030.4) (ISNI:0000 0000 9226 1013); Hebei Key Laboratory of Big Data Computing, Tianjin, China (GRID:grid.412030.4)
5 Ministry of Education, Key Laboratory of Knowledge Engineering with Big Data (Hefei University of Technology), Hefei, China (GRID:grid.256896.6) (ISNI:0000 0001 0395 8562); Mininglamp Technology, Mininglamp Academy of Sciences, Beijing, China (GRID:grid.513524.4)





