Journal of the Korean Society of Civil Engineers
Title Performance Validation of a Hybrid LLM-Based QA System for Bridge Management
Authors Koh, Eunbyul; Sun, Jong-Wan; Park, Kyung-Hoon
DOI https://doi.org/10.12652/Ksce.2026.46.3.0259
Page pp. 259-268
ISSN 1015-6348
Keywords Bridge management; Large Language Model (LLM); QA system; Hallucination mitigation
Abstract As bridge management data grow rapidly, the limitations of traditional database query methods in accessibility and efficiency have become apparent. This study develops a hybrid question-answering (QA) system that combines the natural language understanding of Large Language Models (LLMs) with the computational precision of Python, and validates its effectiveness in suppressing hallucinations. Using South Korea's bridge management data from 2012 to 2025 (474,670 records), we systematically evaluated six models: three cloud-based models (GPT-4o, GPT-5, Gemini 2.0 Flash) and three local open-source models (GPT-OSS 20B, Qwen3 8B, Gemma3 4B). Experiments covered 20 questions categorized into six types and three levels of complexity. GPT-4o and GPT-OSS 20B achieved the highest accuracy, 95%. GPT-4o showed superior responsiveness (7.61 s), making it suitable for real-time services, whereas GPT-OSS 20B performed stably on large-scale data ranking queries without API token constraints. Error analysis showed that intent extraction errors (46%) were the primary cause of failure, and that cloud models in particular faced token limit issues (15%). The results demonstrate that a hybrid architecture separating natural language understanding from numerical computation effectively suppresses hallucinations, and they provide model selection criteria for bridge management system implementation based on practical requirements such as cost, performance, and security.
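The separation the abstract describes, in which a language model only interprets the question and Python performs every calculation, can be illustrated with a minimal sketch. This is not the authors' implementation: the JSON intent schema, the `Intent` fields, the function names, and the toy records are all hypothetical, and the LLM call is replaced by a hand-written JSON string standing in for model output.

```python
import json
from dataclasses import dataclass, field
from typing import Optional

# Toy records; names, fields, and values are invented for illustration only.
BRIDGES = [
    {"name": "A", "region": "Seoul", "year": 2024, "length_m": 120.0},
    {"name": "B", "region": "Seoul", "year": 2024, "length_m": 340.5},
    {"name": "C", "region": "Busan", "year": 2024, "length_m": 95.0},
    {"name": "D", "region": "Busan", "year": 2023, "length_m": 410.0},
]

ALLOWED_OPS = {"count", "sum", "max", "top_n"}


@dataclass
class Intent:
    """Structured query intent (hypothetical schema, assumed to come from an LLM)."""
    operation: str
    column: Optional[str] = None
    filters: dict = field(default_factory=dict)
    n: int = 1


def parse_intent(llm_output: str) -> Intent:
    """Validate the JSON text that a model is assumed to have produced."""
    data = json.loads(llm_output)
    intent = Intent(
        operation=data["operation"],
        column=data.get("column"),
        filters=data.get("filters", {}),
        n=int(data.get("n", 1)),
    )
    if intent.operation not in ALLOWED_OPS:
        raise ValueError(f"unsupported operation: {intent.operation}")
    return intent


def execute(intent: Intent, records: list):
    """Compute the answer in plain Python; the model never produces numbers."""
    rows = [
        r for r in records
        if all(r.get(key) == value for key, value in intent.filters.items())
    ]
    if intent.operation == "count":
        return len(rows)
    if intent.column is None:
        raise ValueError("this operation needs a column")
    if intent.operation == "sum":
        return sum(r[intent.column] for r in rows)
    if intent.operation == "max":
        return max(r[intent.column] for r in rows)
    # Remaining case is "top_n": rank rows by the column, largest first.
    ranked = sorted(rows, key=lambda r: r[intent.column], reverse=True)
    return [r["name"] for r in ranked[: intent.n]]


# Stand-in for model output for a question such as
# "How many bridges were recorded in Seoul in 2024?"
simulated_llm_output = '{"operation": "count", "filters": {"region": "Seoul", "year": 2024}}'
answer = execute(parse_intent(simulated_llm_output), BRIDGES)
```

Because the numeric result comes from `execute` rather than from generated text, a wrong answer in this sketch can only arise from a wrong intent, which is consistent with the abstract's finding that intent extraction errors were the primary cause of failure. Ranking is also done locally, so the full record set never has to be placed in a model prompt.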