Detecting manipulated document images is essential for verifying the authenticity of official records and preventing document forgery. However, forgery artifacts are often subtle and confined to fine-grained regions, such as text boundaries or character outlines, where visual symmetry and structural regularity are normally expected. Because manipulations disrupt this inherent symmetry, detecting the resulting inconsistencies is central to forgery identification. Conventional CNN-based models struggle to capture such edge-level asymmetric features, as edge-related information tends to weaken through repeated convolution and pooling operations. To address this issue, this study proposes an edge-focused method composed of two components: the Edge Attention (EA) layer and the Edge Concatenation (EC) layer. The EA layer dynamically identifies channels in the input feature map that respond strongly to edge features and applies learnable weights to emphasize them, strengthening the representation of structurally significant boundaries. The EC layer then extracts edge maps from the input image using the Sobel filter and concatenates them with the original feature maps along the channel dimension, allowing the model to incorporate edge information explicitly. To evaluate the effectiveness and compatibility of the proposed method, it was first applied to a simple CNN architecture to isolate its impact. It was then integrated into several widely used models, including DenseNet121, ResNet50, Vision Transformer (ViT), and a CAE-SVM-based document forgery detection model. Experiments were conducted on the DocTamper, Receipt, and MIDV-2020 datasets to assess classification accuracy and F1-score using both original and forged text images.
Across all model architectures and datasets, the proposed EA–EC method consistently improved model performance, particularly by increasing sensitivity to asymmetric manipulations around text boundaries. These results demonstrate that the proposed edge-focused approach is not only effective but also highly adaptable, serving as a lightweight and modular extension that can be easily incorporated into existing deep learning-based document forgery detection frameworks. By reinforcing attention to structural inconsistencies often missed by standard convolutional networks, the proposed method provides a practical solution for enhancing the robustness and generalizability of forgery detection systems.
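The two layers described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the replicate padding, the use of mean Sobel magnitude as the per-channel edge score, the softmax normalization, and the residual gate `1 + weights * attention` are all assumptions made for illustration, and the feature maps are assumed to have the same spatial size as the input image.

```python
import numpy as np

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = SOBEL_X.T


def correlate3x3(img, kernel):
    """3x3 cross-correlation with replicate padding; output has the input's shape."""
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


def sobel_edge_map(img):
    """Gradient magnitude of a 2D array under the Sobel operator."""
    gx = correlate3x3(img, SOBEL_X)
    gy = correlate3x3(img, SOBEL_Y)
    return np.hypot(gx, gy)


def edge_concat(features, image):
    """EC-style step (assumed form): append the image's Sobel edge map as a channel.

    features: (C, H, W) feature maps, image: (H, W) grayscale input.
    Returns an array of shape (C + 1, H, W).
    """
    edge = sobel_edge_map(image)
    return np.concatenate([features, edge[None, :, :]], axis=0)


def edge_channel_attention(features):
    """Assumed scoring rule: softmax over each channel's mean Sobel magnitude."""
    scores = np.array([sobel_edge_map(ch).mean() for ch in features])
    scores = scores - scores.max()  # subtract the max for numerical stability
    exp = np.exp(scores)
    return exp / exp.sum()


def edge_attention(features, weights):
    """EA-style step (assumed form): scale channels by 1 + weights * attention.

    weights stands in for the learnable per-channel parameters; in a real
    model they would be trained, here they are passed in as a plain array.
    """
    gate = 1.0 + weights * edge_channel_attention(features)
    return features * gate[:, None, None]
```

For example, with a two-channel feature map in which only the first channel contains a vertical step edge, `edge_channel_attention` assigns that channel the larger weight, and `edge_concat` returns three channels, the last being the Sobel edge map of the input image.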
