Information fraud and investigation: a global perspective
Edited by Dr. Xin (Robert) Luo, Dr. Stephen Burd and Dr. Wei Li
1 Introduction
As the first author of this paper discovered first-hand when he started his career in accounting around 18 years ago, account reconciliation is one of the first skills that entry-level accountants learn. It is also a function performed, in all likelihood, in every internal audit department. The reason is simple: account reconciliation is widely accepted as a key component of the accounting housekeeping necessary to maintain a financial system's accuracy and integrity. It is no surprise, then, that [9] Ge and McVay (2005) find that a common material internal control deficiency is the lack of reconciliation procedures.
In essence, account reconciliation boils down to matching data items. Separate from his accounting career, the first author has a background in research and development in data mining and text analysis, especially with large datasets. Over recent months, while working on a large reconciliation project, he came to realize that even "semi-automated" approaches to reconciliation (such as query-writing or the use of off-the-shelf tools such as Reconart or ReconNet) can be extremely time-consuming and ad hoc. This led him to ask whether there are more scientific, repeatable, and generally applicable approaches that can be used to good effect, avoiding the extensive dataset-specific configuration that even off-the-shelf products require. Our idea, essentially, is to develop algorithms that learn everything that can be learned directly from the data, so that the same algorithm could be "pointed" at any pair of datasets and reconcile them with minimal human intervention. Through experimentation, the first author has found that this kind of approach is indeed possible, and in particular that the fields of statistics, probability theory, and computational linguistics have much to offer in this area.
In this paper, we consider a specific instance of reconciliation: reconciliation of receipts to disbursements. In our use case, money designated for particular individuals is collected in an account and then later distributed to those individuals. The task, then, is to match receipts to corresponding distributions; from a management point of view, we want to know whose money is still in the account, and this can be ascertained from...