Multi-LLM-based augmentation and synthetic data generation of construction schedules and task descriptions with SLM-as-a-judge assessment
Journal
Advanced Engineering Informatics
Journal Volume
69
Start Page
103825
ISSN
1474-0346
Date Issued
2026-01
Author(s)
Singh, Akarsth Kumar
Abstract
The fragmented structure, semantic inconsistency, and limited availability of construction schedule data significantly hinder the development of intelligent planning tools in the architecture, engineering, and construction (AEC) domain. In particular, the absence of high-quality, hierarchically structured Work Breakdown Structure with Task Dependency (WBS-TD) datasets restricts the training and evaluation of AI-based models for automated construction workflows. This study investigates whether Large Language Models (LLMs) can be systematically applied to enhance and generate construction schedule and task description data, and whether lightweight, locally deployed Small Language Models (SLMs) can effectively evaluate these outputs using domain-specific rubrics in a scalable and privacy-preserving manner. To address this, an integrated methodology is proposed, consisting of three components: (1) Role-Guided Modular Prompt Chaining (RGPC), which transforms inconsistent WBS-TD inputs into logically ordered and semantically enriched outputs; (2) synthetic data generation via a multi-LLM pipeline using structured prompt strategies to produce diverse, realistic construction schedules and descriptions; and (3) SLM-as-a-Judge, a rubric-based evaluation approach that uses lightweight, locally deployed SLMs to assess output quality across structural, logical, and domain-specific dimensions without requiring sensitive data to leave secure environments. Experimental results show that Claude-3.5-Sonnet achieved 77 % quality in augmented schedule generation, Gemini-2.0-Flash reached 92 % in synthetic schedule generation, and DeepSeek-R1 provided the best balance of quality and diversity in synthetic construction task description generation, demonstrating strong domain alignment across tasks. The framework generates reusable, machine-readable knowledge graph datasets supporting downstream applications such as AI-assisted planning, progress monitoring, and risk analysis.
This study delivers a scalable, model-agnostic pipeline that advances automation and evaluation in construction informatics.
Subjects
Automation in planning and scheduling
Construction data augmentation
Construction informatics
Large language models (LLMs)
SLM-as-a-judge
Small language models (SLMs)
Synthetic data generation
Publisher
Elsevier Ltd
Type
journal article
