Automating Iconclass: LLMs and RAG for Large-Scale Classification of Religious Woodcuts

Written By: Drew B. Thomas

Abstract: This article presents a novel methodology for classifying early modern religious images by using Large Language Models (LLMs) and vector databases in combination with Retrieval-Augmented Generation (RAG). The approach leverages the full-page context of book illustrations from the Holy Roman Empire, allowing the LLM to generate detailed descriptions that incorporate both visual and textual elements. These descriptions are then matched to relevant Iconclass codes through a hybrid vector search. This method achieves 87% and 92% precision at five and four levels of classification, respectively, significantly outperforming traditional image and keyword-based searches. By employing full-page descriptions and RAG, the system enhances classification accuracy, offering a powerful tool for large-scale analysis of early modern visual archives. This interdisciplinary approach demonstrates the growing potential of LLMs and RAG in advancing research within art history and digital humanities.

Keywords: book history, Protestant Reformation, computer vision, digital humanities, Iconclass, large language models, vector database, semantic search, retrieval-augmented generation, information retrieval, early modern Europe, woodcuts, printing press, Martin Luther, Bible illustrations
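
As an illustration of what precision "at five and four levels of classification" could mean in practice, the following minimal Python sketch compares predicted and reference Iconclass notations after truncating both to a given depth. It is not the article's evaluation code. It assumes that each character of the base notation counts as one level, that bracketed parts such as "(+3)" or "(JEROME)" are ignored, and that each image has exactly one predicted and one reference code. The function names and the sample notations are hypothetical and are not taken from the article's dataset.

```python
import re


def truncate_notation(notation: str, levels: int) -> str:
    """Return the first `levels` characters of an Iconclass-style notation.

    Assumption: bracketed text and keys, e.g. "(+3)" or "(JEROME)", and
    spaces are dropped, and every remaining character is one hierarchy level.
    """
    base = re.sub(r"\(.*?\)", "", notation).replace(" ", "")
    return base[:levels]


def precision_at_level(predicted: list[str], reference: list[str], levels: int) -> float:
    """Share of predictions that match the reference code down to `levels`."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not predicted:
        return 0.0
    hits = sum(
        truncate_notation(p, levels) == truncate_notation(r, levels)
        for p, r in zip(predicted, reference)
    )
    return hits / len(predicted)


# Made-up notation strings, for illustration only.
predicted = ["73D82", "73D641", "11H(JEROME)", "71A3"]
reference = ["73D82", "73D62", "11H(JEROME)", "73A3"]

for depth in (5, 4):
    print(depth, precision_at_level(predicted, reference, depth))
```

Under these assumptions a prediction that is wrong only at the deepest level still counts as correct at a shallower one, which is why a score at four levels can be higher than the score at five.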

The Illusion of Knowledge: Rethinking AI “Hallucinations” in Islamic Studies and Arabic Digital Librarianship

Written By: Amina El Ganadi

Abstract: Large language models (LLMs) are transforming the digital humanities by automating translation, summarisation, classification, and textual analysis at unprecedented scale. However, their fluency is often mistaken for genuine understanding, creating an illusion of knowledge that can mask unreliable factual grounding. This paper reframes the commonly used term “hallucination” as AI-induced error to emphasise the structural decoupling of linguistic plausibility from accuracy, a problem compounded by the limited interpretability of LLMs (the “black box” problem). The risk is especially acute in fields requiring textual precision, such as Islamic studies and Arabic digital librarianship, where distortions can affect interpretation, misrepresent doctrinal concepts, and undermine metadata reliability, with errors propagating into cataloguing systems and historical research. The paper analyses computational and epistemological factors that generate AI-induced errors, surveys mitigation approaches (retrieval-augmented generation, explainability methods, and domain-specific fine-tuning), and argues that technical safeguards are insufficient in isolation without human-in-the-loop oversight grounded in domain expertise. Drawing on practice-based evidence from the Digital Maktaba Project, it documents recurring error typologies encountered in real institutional workflows, including fabricated Islamic bibliographic categories, misidentification of foundational Islamic figures, fabricated hadith reports, and subject headings generated in unrelated languages. It concludes by advocating for transparent workflows, rigorous human involvement, and interdisciplinary collaboration among AI developers, domain experts, and humanities scholars as necessary conditions for responsible AI integration in culturally sensitive research contexts.

Keywords: AI-induced error, hallucination, large language models, illusion of knowledge, human-in-the-loop oversight, Islamic studies, Arabic digital librarianship, retrieval-augmented generation, Digital Maktaba Project.