Abstract
The rise of AI-generated texts (AIGTs), particularly with the arrival of advanced language models like ChatGPT, has spurred a growing need for effective detection methods. While these models offer various beneficial applications, their potential for misuse, such as facilitating plagiarism and generating fake textual content, raises significant ethical concerns. These concerns have sparked extensive academic research into detecting AIGTs, and efforts to mitigate misuse include commercial platforms such as Turnitin and GPTZero. Most evaluations of current AI detectors, however, have focused on English or other languages written in Latin-derived scripts. The effectiveness of existing AI detectors is notably hampered when processing Arabic texts because of the language's diacritics, which are small marks placed above or below letters to indicate pronunciation. These diacritics can cause human-written texts (HWTs) to be misclassified as AIGTs. Recognizing the limitations of current detectors, this research first established a performance baseline by evaluating existing detection systems, such as OpenAI Text Classifier and GPTZero, on a newly developed benchmark dataset of Arabic texts containing both HWTs and AIGTs. This evaluation highlighted critical weaknesses in the existing detectors' ability to handle diacritics and to differentiate between HWTs and AIGTs, particularly in essay-length texts. To address these limitations, this research introduces a novel AI text detector designed explicitly for Arabic, built on transformer-based pre-trained models trained on several novel datasets. The resulting detector significantly outperforms existing detection models in accurately identifying both HWTs and AIGTs in Arabic. Although the research focused on Arabic because of its unique writing challenges, the detector architecture is adaptable to other languages.
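To make the diacritics issue concrete, the following minimal Python sketch shows how diacritic marks in an Arabic string might be counted or stripped before the text is passed to a detector. It is an illustration only and not the method used in this research: the function names `count_diacritics` and `strip_diacritics` are hypothetical, and the code assumes that "diacritics" means the Unicode tashkeel marks U+064B through U+0652 plus the superscript alef U+0670.

```python
import re

# Assumption: "diacritics" = Arabic tashkeel marks U+064B..U+0652
# plus superscript alef U+0670. Other marks are left untouched.
ARABIC_DIACRITICS = re.compile("[\u064B-\u0652\u0670]")


def count_diacritics(text: str) -> int:
    """Return how many diacritic marks (as defined above) occur in text."""
    return len(ARABIC_DIACRITICS.findall(text))


def strip_diacritics(text: str) -> str:
    """Return text with the diacritic marks removed; base letters are kept."""
    return ARABIC_DIACRITICS.sub("", text)


# The word "kataba" written with a fatha mark after each of its three letters.
vocalized = "\u0643\u064E\u062A\u064E\u0628\u064E"
bare = strip_diacritics(vocalized)

print(count_diacritics(vocalized), len(vocalized), len(bare))
```

A comparison of this kind (the same text with and without marks) is one simple way to probe whether a given detector's output changes with diacritization alone.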