ChatGPT: Precision Answer Comparison and Evaluation Model

Aso Mohammed Aladdin; Rebwar Khalid Muhammed; Hemin Sardar Abdulla; Tarik Ahmad Rashid

doi:10.51173/ijds.v3i1.60

Authors

Aso Mohammed Aladdin, Computer Science Department, College of Science, Charmo University, Sulaimani, Chamchamal 46023, KR, Iraq, https://orcid.org/0000-0002-8734-0811
Rebwar Khalid Muhammed, Network Department, Computer Science Institute, Sulaimani Polytechnic University, Sulaimani 46001, KR, Iraq, https://orcid.org/0009-0009-3288-7340
Hemin Sardar Abdulla, Computer Science Department, College of Science, Charmo University, Sulaimani, Chamchamal 46023, KR, Iraq, https://orcid.org/0009-0008-4654-5248
Tarik Ahmad Rashid, Computer Science and Engineering Department, University of Kurdistan Hewler, Erbil 44001, KR, Iraq, https://orcid.org/0000-0002-8661-258X

Keywords:

ChatGPT, PACEM, Answer Comparison, Human-Like Responses

Abstract

Artificial Intelligence (AI) has advanced considerably, and one notable result is ChatGPT, a sophisticated model created by OpenAI. As a conversational system, ChatGPT supports natural interaction and provides human-like responses to queries across a wide range of topics. It is not infallible, however: its accuracy depends on the complexity of the query, the context, and how often the prompt is repeated. This work therefore proposes a new model, the Precision Answer Comparison and Evaluation Model (PACEM), to examine these factors systematically and assess ChatGPT's performance. PACEM assesses the correctness and coherence of ChatGPT's answers across numerous fields, including literature, history, law, ethics, and sports. Through these analyses and comparisons, PACEM gives a detailed picture of ChatGPT's strengths and weaknesses as a source of reliable information. It also assesses response time, considering how quickly ChatGPT produces answers relative to real or expected ones. The findings show that ChatGPT's answers are accurate in most cases and often of higher quality than those written by the user or drawn from other alternatives. Response time generally increases with the complexity or length of the answer. Finally, the study reviews notable takeaways from PACEM's deployment and offers suggestions for future research on the evolving challenges of AI-driven response assessment.
