PDF to Accessible Markdown: Introducing Equalify Reflow, an Open Source AI Tool
Accessibility NYC Meetup - April 7, 2026
Speakers: Blake Bertuccelli-Booth (Assistant Director of Digital Accessibility Engineering, University of Illinois Chicago); Dylan Isaac (UIC Digital Accessibility)
Moderator: Ben Ogilvie (A11yNYC)
Blake Bertuccelli-Booth and Dylan Isaac present Equalify Reflow, an open source AI-driven system designed to convert PDFs into accessible Markdown, addressing longstanding accessibility challenges and enabling scalable compliance with regulations such as ADA Title II.
The Problem with PDF Accessibility
PDFs were originally designed for print, not for structured, accessible digital use. As a result, they often lack semantic structure, which makes them difficult for assistive technologies like screen readers to interpret.
Key issues include:
Visual-first format with minimal inherent semantics
High cost of remediation ($5–$35 per page)
Massive scale (e.g., UIC managing 200,000+ PDFs)
Ineffectiveness of existing auto-tagging solutions
This creates both a technical and economic barrier to accessibility.
Rethinking the Approach
Instead of attempting to make PDFs accessible, the team reframed the problem: convert PDFs into a format that is inherently accessible.
They selected Markdown because it is:
Plain text and non-proprietary
Human-readable and easy to edit
Semantically structured
Easily convertible to accessible HTML
Widely used by AI systems
This shift from “accessible PDF” to “accessible content” was the key breakthrough.
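To make the "easily convertible to accessible HTML" point concrete, here is a minimal sketch that turns a tiny Markdown subset (headings, "-" lists, paragraphs) into semantic HTML. It is an illustration only, not Reflow's converter; the function name markdown_to_html is hypothetical, and a real project would more likely rely on an established Markdown library.

```python
import html
import re


def markdown_to_html(markdown_text: str) -> str:
    """Convert a tiny Markdown subset (ATX headings, '-' lists, paragraphs) to HTML."""
    out = []
    in_list = False
    for raw in markdown_text.splitlines():
        line = raw.strip()
        heading = re.match(r"^(#{1,6})\s+(.*)$", line)
        is_item = line.startswith("- ")
        # Close an open list as soon as a non-item line (or blank line) appears.
        if in_list and not is_item:
            out.append("</ul>")
            in_list = False
        if not line:
            continue
        if heading:
            level = len(heading.group(1))
            out.append(f"<h{level}>{html.escape(heading.group(2))}</h{level}>")
        elif is_item:
            if not in_list:
                out.append("<ul>")
                in_list = True
            out.append(f"<li>{html.escape(line[2:])}</li>")
        else:
            out.append(f"<p>{html.escape(line)}</p>")
    if in_list:
        out.append("</ul>")
    return "\n".join(out)


sample = "# Syllabus\n\nOffice hours:\n\n- Monday\n- Wednesday\n"
print(markdown_to_html(sample))
```

Each Markdown construct maps to one HTML element that screen readers already understand (h1 to h6, ul/li, p), which is the property the speakers highlight.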
Why AI is Necessary
Traditional rule-based or programmatic approaches cannot handle the complexity of real-world PDFs due to layout variability and the need to interpret visual meaning.
AI is used as a semantic translator, converting visual presentation into structured meaning. Examples include:
Large centered text → heading
Blue underlined text → link
Proximity and styling → captions
This allows the system to reconstruct meaning rather than simply extract text.
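The three example mappings can be written down as a toy function to show what "visual presentation in, semantic role out" means. This is a deliberately naive, rule-based sketch: the VisualSpan fields, the guess_role name, and the thresholds are all assumptions made for illustration. Reflow relies on AI models precisely because fixed rules like these break on real-world layouts.

```python
from dataclasses import dataclass


@dataclass
class VisualSpan:
    """Hypothetical description of how a run of text looks on the page."""
    text: str
    font_size: float
    centered: bool = False
    underlined: bool = False
    color: str = "black"
    near_image: bool = False


def guess_role(span: VisualSpan, body_font_size: float = 11.0) -> str:
    """Map visual cues to a semantic role, mirroring the three examples in the talk."""
    # Large centered text -> heading (the 1.5x threshold is an arbitrary assumption).
    if span.centered and span.font_size >= 1.5 * body_font_size:
        return "heading"
    # Blue underlined text -> link.
    if span.underlined and span.color == "blue":
        return "link"
    # Small text placed next to an image -> caption.
    if span.near_image and span.font_size < body_font_size:
        return "caption"
    return "paragraph"


print(guess_role(VisualSpan("Annual Report", font_size=24, centered=True)))
print(guess_role(VisualSpan("uic.edu", font_size=11, underlined=True, color="blue")))
print(guess_role(VisualSpan("Figure 1: Enrollment", font_size=9, near_image=True)))
```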
Equalify Reflow Architecture
Reflow is an AI harness that guides models through a controlled pipeline to improve accuracy and reduce hallucinations.
The process includes:
Text extraction using IBM Docling
Document analysis to identify structure and features
Heading hierarchy correction
Page-by-page AI translation into Markdown
Final assembly into a continuous, reflowable document
Each page is handled by a dedicated AI agent to maintain focus and context.
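The five stages can be sketched as a small pipeline. Everything below is a stand-in under stated assumptions: extraction is faked with already-extracted text rather than a Docling call, the per-page agent is a placeholder function instead of a model call, and all function names are hypothetical. The one stage implemented for real is a simple heading hierarchy correction that removes skipped levels.

```python
import re
from typing import Callable, Dict, List


def extract_pages(pdf_pages: List[str]) -> List[str]:
    """Stage 1 stand-in: the talk uses IBM Docling; here pages are already text."""
    return [page.strip() for page in pdf_pages]


def analyze(pages: List[str]) -> Dict[str, int]:
    """Stage 2 stand-in: record simple document-level features."""
    heading_count = sum(
        1 for page in pages for line in page.splitlines() if line.startswith("#")
    )
    return {"page_count": len(pages), "heading_count": heading_count}


def fix_heading_levels(pages: List[str]) -> List[str]:
    """Stage 3: renumber headings so levels start at 1 and never skip a level."""
    stack: List[int] = []  # original levels of the currently open headings
    fixed_pages = []
    for page in pages:
        fixed_lines = []
        for line in page.splitlines():
            match = re.match(r"^(#{1,6})\s+(.*)$", line)
            if match:
                original = len(match.group(1))
                # Close headings at the same or a deeper original level.
                while stack and stack[-1] >= original:
                    stack.pop()
                stack.append(original)
                line = "#" * len(stack) + " " + match.group(2)
            fixed_lines.append(line)
        fixed_pages.append("\n".join(fixed_lines))
    return fixed_pages


def run_pipeline(
    pdf_pages: List[str],
    translate_page: Callable[[str, Dict[str, int]], str],
) -> str:
    pages = extract_pages(pdf_pages)
    context = analyze(pages)
    pages = fix_heading_levels(pages)
    # Stage 4: one agent call per page, each given the shared document context.
    translated = [translate_page(page, context) for page in pages]
    # Stage 5: assemble the pages into one continuous document.
    return "\n\n".join(translated)


def placeholder_agent(page: str, context: Dict[str, int]) -> str:
    """Stand-in for a per-page AI agent; a real agent would call a model here."""
    return page


document = run_pipeline(
    ["## Intro\nText.", "#### Methods\nMore text."], placeholder_agent
)
print(document)
```

Passing the agent in as a function keeps the harness independent of any particular model, which fits the model-agnostic design described in the talk.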
Demonstration Highlights
The demo shows:
Conversion of academic PDFs into structured Markdown
Correction of OCR errors and formatting issues
Handling of complex tables via HTML fallback
Transformation of two-column layouts into linear reading order
Reconstruction of malformed tables
Interpretation of posters into logical document structures
The system also generates alt text through specialized sub-agents that analyze images in context.
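The "HTML fallback" for complex tables mentioned in the list above can be illustrated with a short sketch: simple grids become pipe tables, and anything with merged cells or ragged rows falls back to an HTML table. The cell shape (a dict with "text" and optional "colspan"/"rowspan") and the function names are assumptions for this example, as is treating the first row as the header; the talk does not describe Reflow's actual decision rule.

```python
import html
from typing import Dict, List

Cell = Dict[str, object]  # hypothetical shape: {"text": str, "colspan": int, "rowspan": int}


def is_simple(rows: List[List[Cell]]) -> bool:
    """Simple means: every row has the same width and no cell spans rows or columns."""
    widths = {len(row) for row in rows}
    no_spans = all(
        cell.get("colspan", 1) == 1 and cell.get("rowspan", 1) == 1
        for row in rows
        for cell in row
    )
    return len(widths) == 1 and no_spans


def to_pipe_table(rows: List[List[Cell]]) -> str:
    """Render a pipe table, treating the first row as the header."""
    def fmt(row: List[Cell]) -> str:
        return "| " + " | ".join(str(c["text"]).replace("|", "\\|") for c in row) + " |"

    header, *body = rows
    lines = [fmt(header), "| " + " | ".join("---" for _ in header) + " |"]
    lines += [fmt(row) for row in body]
    return "\n".join(lines)


def to_html_table(rows: List[List[Cell]]) -> str:
    """Render an HTML table that preserves colspan/rowspan attributes."""
    out = ["<table>"]
    for index, row in enumerate(rows):
        tag = "th" if index == 0 else "td"
        cells = []
        for cell in row:
            attrs = ""
            if cell.get("colspan", 1) > 1:
                attrs += f' colspan="{cell["colspan"]}"'
            if cell.get("rowspan", 1) > 1:
                attrs += f' rowspan="{cell["rowspan"]}"'
            cells.append(f"<{tag}{attrs}>{html.escape(str(cell['text']))}</{tag}>")
        out.append("<tr>" + "".join(cells) + "</tr>")
    out.append("</table>")
    return "\n".join(out)


def render_table(rows: List[List[Cell]]) -> str:
    return to_pipe_table(rows) if is_simple(rows) else to_html_table(rows)


simple = [[{"text": "Term"}, {"text": "Credits"}], [{"text": "Fall"}, {"text": "12"}]]
merged = [[{"text": "Schedule", "colspan": 2}], [{"text": "Fall"}, {"text": "12"}]]
print(render_table(simple))
print(render_table(merged))
```

Pipe tables cannot express merged cells, so falling back to HTML keeps the header and span relationships that assistive technology needs.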
Use Cases and Integration
Reflow can be deployed in multiple ways:
Web-based upload and conversion service
Integration with LMS platforms like Canvas
Conversion of PDFs into CMS content (e.g., WordPress posts)
Potential browser extensions or assistive technology integrations
It enables accessibility without disrupting existing workflows.
Economics and Impact
Reflow significantly reduces costs:
Traditional remediation: $5–$35 per page
Reflow: about $0.10 per page
Roughly a 50× cost reduction at the low end of that range (about 350× at $35 per page)
Future use of open source models could reduce costs even further.
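The arithmetic behind these figures can be checked with a short calculation. The per-page prices and the 200,000 PDF count come from the talk; the average of 10 pages per PDF is a purely hypothetical number chosen to make the totals concrete.

```python
# Per-page prices from the talk, stored in cents to keep the arithmetic exact.
TRADITIONAL_LOW_CENTS = 500    # $5 per page
TRADITIONAL_HIGH_CENTS = 3500  # $35 per page
REFLOW_CENTS = 10              # about $0.10 per page

PDF_COUNT = 200_000            # UIC figure cited in the talk
ASSUMED_PAGES_PER_PDF = 10     # hypothetical average, NOT from the talk


def reduction_factor(traditional_cents: int, reflow_cents: int = REFLOW_CENTS) -> float:
    """How many times cheaper Reflow is per page."""
    return traditional_cents / reflow_cents


def total_dollars(per_page_cents: int) -> int:
    """Total cost in dollars for the whole (assumed) page count."""
    total_pages = PDF_COUNT * ASSUMED_PAGES_PER_PDF
    return per_page_cents * total_pages // 100


low = reduction_factor(TRADITIONAL_LOW_CENTS)
high = reduction_factor(TRADITIONAL_HIGH_CENTS)
print(f"Per-page reduction: {low:.0f}x to {high:.0f}x")

for label, cents in [
    ("Traditional (low)", TRADITIONAL_LOW_CENTS),
    ("Traditional (high)", TRADITIONAL_HIGH_CENTS),
    ("Reflow", REFLOW_CENTS),
]:
    print(f"{label}: ${total_dollars(cents):,}")
```

Under that page-count assumption, the totals are $10,000,000 to $70,000,000 for traditional remediation versus $200,000 for Reflow.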
Open Source and Extensibility
The project is released under AGPLv3 and designed for expansion:
Developers can add new AI agents for specific document types
Supports customization for niche use cases
Model-agnostic architecture allows flexibility
The goal is a collaborative ecosystem for accessibility solutions.
Limitations and Future Work
Current challenges include:
Handling mathematical notation (planned improvements)
Refining accuracy across diverse document types
Improving accessibility-aware AI training
Q&A Highlights
Extensibility:
Developers can contribute custom agents for specialized needs.
Math support:
Not yet fully implemented; under development.
Handling different PDFs:
Combines PDF parsing with image-based analysis.
Privacy:
Uses tools like Microsoft Presidio to detect sensitive data before processing.
Accessibility of the tool:
Interface includes semantic structure, keyboard navigation, and skip links; core system is API-based.
Conclusion
Equalify Reflow shifts the accessibility paradigm from repairing PDFs to transforming them into accessible, structured content. By combining AI with a guided workflow, it offers a scalable, cost-effective solution with strong potential for global adoption and open source collaboration.
RESOURCES
Equalify Reflow — open-source agentic AI tool that converts PDFs to accessible Markdown, introduced by Blake Bertuccelli-Booth and Dylan Isaac
Equalify on GitHub — source code and repositories for Equalify Reflow and the broader Equalify platform
Docling — IBM open-source PDF text extraction library used as Reflow’s first-stage pipeline step
Microsoft Presidio — open-source PII detection and anonymization framework used in Reflow’s document upload workflow
ADA Title II Digital Accessibility — UIC guidance — UIC’s overview of the April 24, 2026 DOJ rule requiring WCAG 2.1 compliance for public higher education institutions
Accessibility NYC Meetup (A11yNYC) — monthly meetup series bringing together New York City’s accessibility community
Blake Bertuccelli-Booth — UIC profile — Assistant Director of Digital Accessibility Engineering at UIC, creator of Equalify
Dylan Isaac — UIC profile — AI Accessibility Engineer at UIC, former Lead AI Engineer at Deque Systems where he built axe Assistant


