Abstract
Many web-based visualizations are deployed as Scalable Vector Graphics (SVG), a format that faithfully preserves visual appearance but typically omits the higher-level semantic structure needed for machine interpretation. Once rendered and published, information about a visualization's components, roles, and encodings is no longer explicitly available, limiting downstream operations such as querying, accessibility augmentation, explanation, personalization, and transformation. To address this gap, we introduce CSL, an AI-enabled, multi-stage pipeline for automatically recovering visualization semantics from deployed SVGs through two complementary mechanisms: (1) cohort-based decomposition, which organizes heterogeneous SVG primitives into structurally coherent subsets that reduce the semantic assignment space, and (2) hybrid semantic grounding, which combines model-based inference with deterministic structural validation and propagation to make labeling both context-sensitive and structurally anchored. CSL produces Semantic SVG (SSVG), a representation in which SVG elements are annotated with graphical mark type, visualization role, and data role. We implemented CSL as an end-to-end prototype and evaluated it on 102 SVG visualizations, achieving global macro-averaged accuracies of 0.822 for mark type, 0.853 for visualization role, and 0.860 for data-role recovery. An ablation against a non-cohort whole-chart baseline showed that cohorting significantly improves accuracy (paired t-test: t > 20, p 2.0), and repeated labeling of a randomly selected SVG over 100 runs yielded mean agreement above 91.9% across all three attributes. These results provide strong evidence that CSL can transform deployed SVGs into machine-usable semantic representations, enabling more accessible, adaptive, and user-steerable visualization systems.
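As a rough illustration of the cohort idea described in the abstract, the sketch below groups SVG primitives that share a simple structural signature, so that labels could later be assigned per group rather than per element. This is not the authors' implementation: the abstract does not give CSL's cohorting criteria, and the signature used here (tag name, parent tag, and `fill` attribute) along with the names `structural_signature` and `group_into_cohorts` are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

SVG_NS = "{http://www.w3.org/2000/svg}"
# Assumed set of drawable primitives; a real pipeline may consider more.
DRAWABLE = {"rect", "circle", "line", "path", "text"}


def structural_signature(element, parent_tag):
    """Hypothetical grouping key: tag name, parent tag, and fill colour."""
    tag = element.tag.replace(SVG_NS, "")
    return (tag, parent_tag, element.get("fill", ""))


def group_into_cohorts(svg_text):
    """Group drawable SVG primitives that share a structural signature."""
    root = ET.fromstring(svg_text)
    cohorts = defaultdict(list)

    def walk(node, parent_tag):
        tag = node.tag.replace(SVG_NS, "")
        if tag in DRAWABLE:
            cohorts[structural_signature(node, parent_tag)].append(node)
        for child in node:
            walk(child, tag)

    walk(root, "")
    return dict(cohorts)


example_svg = """
<svg xmlns="http://www.w3.org/2000/svg">
  <g>
    <rect x="0" y="10" width="5" height="20" fill="steelblue"/>
    <rect x="10" y="5" width="5" height="25" fill="steelblue"/>
  </g>
  <g>
    <text x="0" y="40">A</text>
    <text x="10" y="40">B</text>
  </g>
</svg>
"""

cohorts = group_into_cohorts(example_svg)
cohort_sizes = {key: len(members) for key, members in cohorts.items()}
```

In this toy chart the two bars fall into one cohort and the two text labels into another, so a downstream labeler would need to make two decisions instead of four. That reduction in the assignment space is the intuition behind cohort-based decomposition, although the real pipeline presumably uses richer structural cues than this sketch.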
Citation
BibTeX
@article{Lee2026Cohort,
title = {Cohort-based Semantic Labeling: AI-Enabled Recovery of Visualization Semantics from Deployed SVGs},
author = {Jeongah Lee and Hima Varshini Surisetty and Durga Nirmaleswaran and Jahnavi Sharma and Srikiran Kavuri and Narges Mahyar and Ali Sarvghad},
journal = {arXiv (Cornell University)},
year = {2026},
doi = {10.48550/arxiv.2606.09782},
url = {https://doi.org/10.48550/arxiv.2606.09782}
}