Digital Accessibility for Disabilities | Peer reviewed

An Empirical Study on Evaluating Accessible Code Generation Capabilities of LLMs

Hyunjae Suh, Mahan Tafreshipour, Sam Malek, Iftekhar Ahmed

ACM Transactions on Software Engineering and Methodology | Jun 12, 2026

Scollr summary

What this paper is about

This paper presents an empirical study comparing the accessibility of web code generated by GPT-4o, Qwen2.5-Coder-32B-Instruct-AWQ, and Gemini-3-Flash against human-written code. It finds that the LLMs often produce more accessible code than humans do but struggle with complex issues such as ARIA attributes.

Full abstract

Web accessibility is essential for inclusive digital experiences, yet the accessibility of LLM-generated code remains underexplored. This paper presents an empirical study comparing the accessibility of web code generated by GPT-4o, Qwen2.5-Coder-32B-Instruct-AWQ, and Gemini-3-Flash against human-written code. Results show that LLMs often produce more accessible code, especially for basic features like color contrast and alternative text, but struggle with complex issues such as ARIA attributes. We also assess advanced prompting strategies (Zero-Shot, Few-Shot, Self-Criticism) and find that they offer some gains but remain limited. To address these gaps, we introduce FeedA11y, a feedback-driven ReAct-based approach that demonstrates the potential of incorporating accessibility evaluation results into the code generation process. Our work highlights the promise of LLMs for accessible code generation and emphasizes the need for feedback-based techniques to address persistent challenges. We provide the source code and datasets used in our experiments on the companion website [15].
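The general idea of feeding accessibility evaluation results back into code generation can be illustrated with a minimal sketch. This is not the authors' FeedA11y implementation and it is not a full ReAct agent: the checker below is a toy stand-in for a real accessibility evaluator, `stub_model` is a hypothetical placeholder for an LLM call, and all function names and the prompt wording are assumptions made for illustration.

```python
from html.parser import HTMLParser
from typing import Callable, List


class ToyA11yChecker(HTMLParser):
    """Toy stand-in for a real accessibility evaluator (assumption)."""

    def __init__(self) -> None:
        super().__init__()
        self.issues: List[str] = []

    def handle_starttag(self, tag, attrs):
        names = {name for name, _ in attrs}
        if tag == "img" and "alt" not in names:
            self.issues.append("img element is missing an alt attribute")
        if tag == "html" and "lang" not in names:
            self.issues.append("html element is missing a lang attribute")


def check(html: str) -> List[str]:
    """Return the list of issues the toy checker finds in the markup."""
    checker = ToyA11yChecker()
    checker.feed(html)
    checker.close()
    return checker.issues


def generate_with_feedback(
    task: str, generate: Callable[[str], str], max_rounds: int = 3
) -> str:
    """Generate code, evaluate it, and re-prompt with the reported issues."""
    html = generate(task)
    for _ in range(max_rounds):
        issues = check(html)
        if not issues:
            break  # nothing left to fix according to the checker
        feedback = "\n".join(f"- {issue}" for issue in issues)
        prompt = (
            f"{task}\n\nFix these accessibility issues:\n{feedback}"
            f"\n\nPrevious code:\n{html}"
        )
        html = generate(prompt)
    return html


def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM; it fixes issues only when told to."""
    if "Fix these accessibility issues" in prompt:
        return '<html lang="en"><body><img src="cat.png" alt="A cat"></body></html>'
    return '<html><body><img src="cat.png"></body></html>'


final_html = generate_with_feedback("Build a page with a cat image.", stub_model)
```

In practice the `generate` argument would wrap a real model call and `check` would wrap a real evaluation tool; the loop structure is the only part this sketch is meant to convey.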

Authors

Researchers on this paper

Hyunjae Suh

first | University of California, Irvine | ORCID 0009-0008-9487-9365

Mahan Tafreshipour

middle | University of California, Irvine | ORCID 0009-0003-9791-4940

Sam Malek

middle | University of California, Irvine | ORCID 0000-0001-6152-7402

Iftekhar Ahmed

last | University of California, Irvine | ORCID 0000-0001-8221-5352

Research areas

Follow related topics

Text Readability and Simplification

Software Engineering Research

Latest Digital Accessibility for Disabilities research

Citation

BibTeX

@article{Suh2026Empirical,
  title = {An Empirical Study on Evaluating Accessible Code Generation Capabilities of LLMs},
  author = {Hyunjae Suh and Mahan Tafreshipour and Sam Malek and Iftekhar Ahmed},
  journal = {ACM Transactions on Software Engineering and Methodology},
  year = {2026},
  doi = {10.1145/3820782},
  url = {https://doi.org/10.1145/3820782}
}
