From Zero to Golden: A Self-Validating Multi-Agent Framework to Solve the Text-to-SQL Cold Start
Abstract
Enterprise adoption of Text-to-SQL faces a major barrier: the cost of data. The strongest models require extensive training on expensive "golden data" that must be custom-built for each application's database schema. Without such data, off-the-shelf solutions such as static system prompts still incur operational costs, yet they yield unstable, low-accuracy results and accumulate no value over time. To resolve the cold-start paradox that "without data, the system cannot operate, and without operation, there is no data," we introduce Trustay-AI, an architecture that requires no prior training and uses four coordinated agents to simulate expert reasoning: an Orchestrator Agent for intent understanding, a SQL Generation Agent with self-correction capabilities, a Response Generator Agent that formats results into user-friendly outputs, and a Validator Agent that checks that the final result is logically correct. The core value of Trustay-AI lies in its "knowledge harvesting" capability: each SQL query that the Validator Agent confirms as correct is automatically stored in a canonical SQL knowledge repository, which not only enables faster responses to subsequent queries but, more importantly, serves as a data asset that grows over time. Our experiments show that this architecture achieves an accuracy of approximately 80.0%, well above the 45.8% baseline of a conventional static system prompt approach. The accumulated canonical SQL repository constitutes the "golden data" that enterprises can later use to fine-tune specialized models, addressing both the cost of initial training data and the dependency on it.
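The control flow described above can be illustrated with a minimal sketch. It is not the Trustay-AI implementation: the class name `Pipeline`, the callable signatures, the retry limit, the use of a plain dictionary as the canonical SQL repository, and the exact ordering of the agents are all assumptions made for illustration. Each agent is passed in as an ordinary function so the sketch stays independent of any particular model or database.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Pipeline:
    """Hypothetical four-agent loop with knowledge harvesting (illustrative only)."""

    understand: Callable[[str], str]         # Orchestrator: question -> normalized intent
    generate_sql: Callable[[str, int], str]  # SQL Generation: (intent, attempt) -> SQL
    validate: Callable[[str, str], bool]     # Validator: (intent, sql) -> accepted?
    respond: Callable[[str], str]            # Response Generator: sql -> user-facing answer
    max_attempts: int = 3                    # assumed limit on self-correction retries
    repository: Dict[str, str] = field(default_factory=dict)  # assumed canonical SQL store

    def ask(self, question: str) -> Optional[str]:
        intent = self.understand(question)

        # Reuse a previously validated query when one exists for this intent.
        if intent in self.repository:
            return self.respond(self.repository[intent])

        # Otherwise generate, validate, and retry up to max_attempts times.
        for attempt in range(self.max_attempts):
            sql = self.generate_sql(intent, attempt)
            if self.validate(intent, sql):
                # Knowledge harvesting: only validated SQL is stored.
                self.repository[intent] = sql
                return self.respond(sql)

        # No candidate passed validation, so nothing is stored.
        return None
```

In this sketch the repository only ever receives queries that passed validation, so its contents could later be exported as question and SQL pairs for fine-tuning, which is the role the abstract assigns to the accumulated "golden data".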