ArchiLense: A Framework for Quantitative Analysis of Architectural Styles Based on Vision Large Language Models

  • 2025-06-10 03:26:01
  • Jing Zhong, Jun Yin, Peilin Li, Pengyu Zeng, Miao Zang, Ran Luo, Shuai Lu

Abstract

Architectural cultures across regions are characterized by stylistic diversity, shaped by historical, social, and technological contexts in addition to geographical conditions. Understanding architectural styles requires the ability to describe and analyze the stylistic features of different architects from various regions through visual observations of architectural imagery. However, traditional studies of architectural culture have largely relied on subjective expert interpretations and historical literature reviews, often suffering from regional biases and limited explanatory scope. To address these challenges, this study proposes three core contributions: (1) We construct a professional architectural style dataset named ArchDiffBench, which comprises 1,765 high-quality architectural images and their corresponding style annotations, collected from different regions and historical periods. (2) We propose ArchiLense, an analytical framework grounded in Vision-Language Models and constructed using the ArchDiffBench dataset. By integrating advanced computer vision techniques, deep learning, and machine learning algorithms, ArchiLense enables automatic recognition, comparison, and precise classification of architectural imagery, producing descriptive language outputs that articulate stylistic differences. (3) Extensive evaluations show that ArchiLense achieves strong performance in architectural style recognition, with a 92.4% consistency rate with expert annotations and 84.5% classification accuracy, effectively capturing stylistic distinctions across images. The proposed approach transcends the subjectivity inherent in traditional analyses and offers a more objective and accurate perspective for comparative studies of architectural culture.

 
