
Abstract

The dissertation addresses the problem of recovering highly fragmented data on users' workstations in the absence of file system metadata by developing and improving models and methods of information technology for carving highly fragmented electronic document files. The proposed information technology makes it possible to increase the efficiency of carving such data after cyberattacks or attempts to destroy digital traces.

An analysis of the scientific literature shows that data recovery is a technically complex task that depends on the file system, the nature and manner of the user's actions when deleting data, the time elapsed since the events, and other factors. The most difficult cases arise when files must be recovered without file system metadata. Carving non-fragmented or two-fragment files is generally not a problem in digital forensics. Carving files with a high level of fragmentation, by contrast, is one of the most difficult tasks. At the same time, electronic documents stored on the workstations of persons involved in a case are of great interest in criminal investigations. Therefore, the development of information technology for carving highly fragmented electronic document files is one of the directions pursued in this thesis.

The object of research is the process of recovering highly fragmented files of electronic documents in the absence of file system metadata associated with data blocks.

The subject of the study is advanced models and methods of information technology for carving files of electronic documents with a high level of fragmentation.

The goal of the study is to increase the efficiency of carving highly fragmented files of electronic documents.

The scientific novelty of the results is as follows:

– for the first time, models for file fragment identification based on multilayer convolutional neural networks have been developed, incorporating an additional classifier head with feature space regularization and reconstruction of hyperspherical class containers, which significantly improved classification accuracy for binary data blocks and detection of non-target file types;

– the classification model for binary data blocks was improved through the introduction of adapters that adjust based on marginal entropy estimated at the neural network output during inference, which enabled enhanced accuracy on data block samples that are underrepresented in the training dataset;

– methods for reconstructing OOXML files have been further advanced by introducing syntactical analysis techniques for examining the internal structure and content of files, enabling efficient identification of separate fragments of OOXML files in unallocated space and restoring the original file structure.
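The marginal entropy mentioned in the second item above can be illustrated with a minimal sketch. It assumes one common reading of the term: the Shannon entropy of the class distribution obtained by averaging the softmax outputs over a batch of blocks. The function names and the batch layout are hypothetical, and the abstract does not describe how the adapters actually use this value.

```python
import math


def softmax(logits):
    """Convert raw network outputs into a probability distribution."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def marginal_entropy(batch_logits):
    """Entropy (in nats) of the class distribution averaged over a batch.

    Assumption: batch_logits is a non-empty list of equal-length logit
    vectors, one per binary data block.
    """
    probs = [softmax(row) for row in batch_logits]
    n = len(probs)
    num_classes = len(probs[0])
    # Average the per-block predictions to get the marginal distribution.
    marginal = [sum(p[k] for p in probs) / n for k in range(num_classes)]
    return -sum(p * math.log(p) for p in marginal if p > 0.0)
```

Under this reading, a value near zero means the averaged prediction is concentrated on a single class, while a value near the logarithm of the number of classes means the averaged prediction is close to uniform.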

The dissertation research also produced the following results within the proposed information technology for carving files of electronic documents with a high level of fragmentation:

- an ontological scheme of file carving for the systematization of various aspects and approaches to solving file carving tasks;

- generalized and detailed functional models of the process of carving highly fragmented files of electronic documents;

- a detailed functional model of the process of optimizing the parameters of the model for identifying binary data blocks.

A software implementation of the above information technology was created as part of the dissertation research. The functionality of the developed software product includes the ability to classify binary data blocks by type, reconstruct OOXML document files and/or their contents, search for missing fragments of a fragmented file, and identify OOXML documents that originate from the same source.
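As a rough illustration of how fragments of OOXML documents might be located in raw data, the sketch below relies on the fact that OOXML files are ZIP packages whose entries begin with a local file header carrying the signature `PK\x03\x04`. It is not the dissertation's method: the function name `find_zip_entries` is hypothetical, and a real carver would also have to validate the remaining header fields and handle headers split across fragments.

```python
import struct

LOCAL_HEADER_SIG = b"PK\x03\x04"
# Fixed 30-byte part of a ZIP local file header (little-endian):
# signature, version, flags, compression, mod time, mod date,
# CRC-32, compressed size, uncompressed size, name length, extra length.
HEADER_STRUCT = struct.Struct("<4sHHHHHIIIHH")


def find_zip_entries(raw):
    """Yield (offset, entry_name) for each plausible local file header in raw."""
    pos = raw.find(LOCAL_HEADER_SIG)
    while pos != -1:
        if pos + HEADER_STRUCT.size <= len(raw):
            fields = HEADER_STRUCT.unpack_from(raw, pos)
            name_len = fields[9]
            start = pos + HEADER_STRUCT.size
            name_bytes = raw[start:start + name_len]
            # Skip candidates whose file name is empty or truncated.
            if name_len > 0 and len(name_bytes) == name_len:
                yield pos, name_bytes.decode("utf-8", errors="replace")
        pos = raw.find(LOCAL_HEADER_SIG, pos + 1)
```

Entry names such as `[Content_Types].xml` or `word/document.xml` suggest that a fragment belongs to an OOXML package rather than to an ordinary ZIP archive.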

The research results, in the form of methods and software for working with electronic document files together with reference material on file carving, have been implemented and are used in the work and training of experts of the Expert Service of the Ministry of Internal Affairs of Ukraine. In addition, the Scientific Board of the Expert Service of the Ministry of Internal Affairs of Ukraine approved and recommended for implementation in forensic activities the methodological recommendations "Forensic examination of Microsoft Office documents and their metadata," which use the scientific results of the dissertation research on working with OOXML files (minutes No. 82 of the meeting of the Scientific Board of the Expert Service of the Ministry of Internal Affairs dated 30.11.2023).

In addition, to automate the carving of highly fragmented OOXML documents and the processing and analysis of Microsoft Word files, the National Anti-Corruption Bureau of Ukraine uses the information technology for carving highly fragmented files of electronic documents, the method for reconstructing highly fragmented OOXML files based on analysis of their internal structure and content, and the software for handling OOXML files, both in its digital investigations and for educational purposes.

The introduction substantiates the relevance of the scientific and applied problem of carving files of electronic documents with a high level of fragmentation. It also lists scientific works by international researchers in fields related to the subject of this study.

The first section analyzes the current state and specific features of the use of information technology for data recovery in digital forensics. In particular, it reviews how information technology is used to identify binary data blocks, reconstruct files, recover their contents, and cluster them. The results of this analysis are then summarized.

The second section analyzes the process of carving highly fragmented files and justifies the choice of research directions. It also presents a formalized statement of the research problem, criteria for evaluating the efficiency of the file carving process, new and improved neural network models for identifying binary data blocks, and a method for reconstructing highly fragmented OOXML files. Finally, it draws conclusions from the results obtained.

The third section presents and analyzes the results of optimizing the parameters of the developed neural network models for identifying binary data blocks. It also describes the specifics of classifying fragments of electronic documents as compound file types. Finally, it details the implementation of the method for reconstructing highly fragmented OOXML files, analyzes the results of the method, and summarizes the findings.

The fourth section implements the information technology for carving files of electronic documents with a high level of fragmentation. It presents generalized and detailed functional models of the process of carving highly fragmented files of electronic documents, as well as a detailed functional model of the process of optimizing the parameters of the model for identifying binary data blocks. The section then describes the software implementation of this information technology and summarizes the results.

The conclusions contain the scientific and practical results of this dissertation research.

The appendices contain scientific papers in which the main scientific results of the dissertation are published; scientific papers confirming the approbation of the dissertation materials; scientific papers that additionally show the scientific results of the dissertation; data on the approbation of the dissertation results; documents on the implementation of the dissertation results; software listing.

Details

1010268
Title
Models and Methods of Information Technology of Intelligent Data Analysis for Digital Forensic Examination of Electronic Documents
Number of pages
171
Publication year
2025
Degree date
2025
School code
2251
Source
DAI-A 87/5(E), Dissertation Abstracts International
ISBN
9798265441225
University/institution
Sumy State University
Department
Інформаційні технології (Information Technology)
University location
Ukraine
Degree
Ph.D.
Source type
Dissertation or Thesis
Language
Ukrainian
Document type
Dissertation/Thesis
Dissertation/thesis number
32395262
ProQuest document ID
3276281811
Document URL
https://www.proquest.com/dissertations-theses/models-methods-information-technology-intelligent/docview/3276281811/se-2?accountid=208611
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Database
ProQuest One Academic