Full text

Turn on search term navigation

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Large language models (LLMs) and generative artificial intelligence (AI) have demonstrated notable capabilities, achieving human-level performance in intelligent tasks like medical exams. Despite the introduction of extensive LLM evaluations and benchmarks in disciplines like education, software development, and general intelligence, a privacy-centric perspective remains underexplored in the literature. We introduce Priv-IQ, a comprehensive multimodal benchmark designed to measure LLM performance across diverse privacy tasks. Priv-IQ measures privacy intelligence by defining eight competencies, including visual privacy, multilingual capabilities, and knowledge of privacy law. We conduct a comparative study evaluating seven prominent LLMs, such as GPT, Claude, and Gemini, on the Priv-IQ benchmark. Results indicate that although GPT-4o performs relatively well across several competencies with an overall score of 77.7%, there is room for significant improvements in capabilities like multilingual understanding. Additionally, we present an LLM-based evaluator to quantify model performance on Priv-IQ. Through a case study and statistical analysis, we demonstrate that the evaluator’s performance closely correlates with human scoring.

Details

Title
Priv-IQ: A Benchmark and Comparative Evaluation of Large Multimodal Models on Privacy Competencies
Author
Sakib Shahriar  VIAFID ORCID Logo  ; Dara, Rozita  VIAFID ORCID Logo 
First page
29
Publication year
2025
Publication date
2025
Publisher
MDPI AG
e-ISSN
26732688
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3170854914
Copyright
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.