REVOLUTIONIZING DIGITAL INCLUSION: A GPT-4 POWERED HYBRID EVALUATION OF INTERFACES

A hybrid evaluation of interfaces powered by GPT-4

Authors

DOI:

https://doi.org/10.21728/p2p.2025v12n1e-7624

Keywords:

GPT-4, GenderMag, Inclusive Interfaces

Abstract

This article presents a hybrid approach to evaluating inclusive digital interfaces that combines the GenderMag method with the GPT-4 language model. The research explores Prompt Engineering techniques for identifying inclusion barriers, such as confusing labels and a lack of visual feedback, while maintaining the critical perspective of a human inspector. Applied to the Kahoot platform with the “Abby” persona as a reference, the method showed moderate convergence, measured by Cohen's Kappa, between the automated and traditional analyses. This result reinforces GPT-4’s potential to offer useful insights and significantly reduce evaluation time. Even so, limitations such as the difficulty of capturing all aspects of direct navigation and the limit on the number of images that can be sent to the model highlight the importance of preserving the human specialist’s role in the inspection process.
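
As an illustration of the agreement measure named in the abstract, the sketch below computes Cohen's Kappa for two raters from first principles. It is a minimal example under stated assumptions, not the authors' analysis: the function name `cohen_kappa` and the two label lists are hypothetical, and the labels are invented data in which 1 means "barrier flagged" and 0 means "no barrier" for each evaluated step.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's Kappa for two equally long sequences of nominal labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: share of items on which both raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each label, the product of the two raters'
    # marginal proportions, summed over all labels.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    labels = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in labels) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label throughout; agreement is trivial.
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented labels for ten evaluation steps (hypothetical, for illustration only).
human_inspector = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
gpt4_analysis = [1, 0, 0, 1, 0, 1, 1, 1, 0, 0]

print(round(cohen_kappa(human_inspector, gpt4_analysis), 2))
```

In this toy data the raters agree on 7 of 10 steps, while the marginal proportions predict 50% agreement by chance, so Kappa corrects the raw agreement downward. Libraries such as scikit-learn provide an equivalent `cohen_kappa_score` function for larger analyses.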

Author Biographies

  • Christian Silva, Universidade Federal do Amazonas

    Specialization in Digital Product Design - UX/UI from Universidade Anhanguera - Uniderp (2022) and a bachelor's degree in Design with an emphasis on Digital Interfaces (2018).

  • Tayana Conte, Universidade Federal do Amazonas

    Bachelor's degree in Computer Science from Universidade Federal do Pará (1996), master's degree in Computer Science from Universidade Federal de Minas Gerais (2001), and doctorate in Systems Engineering and Computer Science from Universidade Federal do Rio de Janeiro (2009). Postdoctoral fellow at the University of California, Irvine (UCI) (2019-2020), funded by the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq). Associate Professor at Universidade Federal do Amazonas.

  • Wilson Prata, Universidade Federal do Amazonas

    Bachelor's degree in Industrial Design from UFAM (Universidade Federal do Amazonas, 2003), specialization in Multimedia Art from the same institution (2003), MBA in Marketing from FGV/ISAE AM (2011), and master's (2013) and doctorate (2017) in Design from PUC-Rio. UX Lead at Instituto de Pesquisas Eldorado.

Published

28/07/2025

Issue

Section

Digital Platforms: Governance and Regulation
