Error detection is an important part of preparing data for analysis. Erroneous data leads to inaccurate analysis, the familiar garbage-in, garbage-out problem. Many current models use qualitative methods, quantitative methods, or both to detect errors in data. However, these methods are still limited in the types of errors they can detect. To address this, FADE (Focused and Attention-based Detector of Errors) was proposed. FADE detects errors in structured data organized into rows and columns. It uses the information in surrounding cells of the same row to help determine whether a cell is erroneous. It also learns the expected structure of the attributes in the dataset and the values expected in each attribute. As a result, FADE detects a much wider range of error types and classifies errors more accurately than other methods. In evaluation, FADE detected these errors with relatively high performance.
Details
1 University of Alberta, Canada
2 IBM Canada, Canada
