Google Unveils Gemini-SQL2: New System Achieves 80.04% Accuracy in Database Query Translation

Google Research has launched Gemini-SQL2, a text-to-SQL system powered by Gemini 3.1 Pro that achieves 80.04% execution accuracy on the BIRD benchmark. That is about 2.8 percentage points above its predecessor, Gemini-SQL (77.2%), and ahead of competitors such as AWS Q-SQL and OpenAI's GPT-5.5 in SQL generation.
Technical Advancements in Database Interaction
Gemini-SQL2 specializes in translating complex natural-language questions into 'execution-ready SQL': generated code that is both syntactically valid and returns the intended results when run. The system targets several recurring difficulties in database querying, particularly:
- Inconsistent data values requiring contextual interpretation
- Multi-table joins with unclear column relationships
- Time-series analysis requiring window functions
- Domain-specific business logic embedded in queries
According to Google Research, "Data subtlety and complex business contexts make SQL generation notoriously difficult." The system's architecture emphasizes schema understanding, a critical challenge identified in previous research on SQL generation.
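To make the time-series item in the list above concrete, here is a minimal sketch of a question that needs a window function. It is not output from Gemini-SQL2: the `daily_sales` table, its data, and the question are invented for illustration, and the query runs on Python's built-in `sqlite3` module (window functions need SQLite 3.25 or later).

```python
import sqlite3

# Hypothetical schema and data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE daily_sales (day TEXT PRIMARY KEY, revenue REAL);
    INSERT INTO daily_sales VALUES
        ('2026-01-01', 100.0),
        ('2026-01-02', 140.0),
        ('2026-01-03', 90.0),
        ('2026-01-04', 130.0);
""")

# Question: "What is the 3-day moving average of revenue?"
# The window averages each row with up to two rows before it.
sql = """
    SELECT day,
           AVG(revenue) OVER (
               ORDER BY day
               ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
           ) AS moving_avg
    FROM daily_sales
    ORDER BY day
"""

rows = conn.execute(sql).fetchall()
for day, moving_avg in rows:
    print(day, round(moving_avg, 2))
```

A plain `GROUP BY` cannot express this question, because every output row depends on a sliding range of neighboring rows. That is one reason such queries are harder to generate correctly.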
Setting New Standards in SQL Generation
BIRD Benchmark: The Ultimate Accuracy Test
The BIRD benchmark (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation) is the leading standard for assessing SQL generation systems. It includes 12,751 question-SQL pairs across 95 databases in 37 professional domains, testing:
- Execution accuracy (EX): Whether the generated SQL actually returns the correct results
- Handling messy data that requires additional context
- Complex queries that extend beyond basic SELECT statements
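The execution accuracy metric above can be sketched as follows. This is a simplified illustration, not the official BIRD evaluation script, which may differ in details such as duplicate handling. The `employees` table, the query pairs, and the helper names `execution_match` and `execution_accuracy` are all hypothetical.

```python
import sqlite3
from collections import Counter


def execution_match(conn, predicted_sql, gold_sql):
    """Return True if both queries run and return the same rows (order ignored)."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False  # a query that fails to execute counts as wrong
    gold_rows = conn.execute(gold_sql).fetchall()
    return Counter(predicted_rows) == Counter(gold_rows)


def execution_accuracy(conn, pairs):
    """Fraction of (predicted, gold) pairs whose results match."""
    matches = sum(execution_match(conn, p, g) for p, g in pairs)
    return matches / len(pairs)


# Hypothetical database and query pairs, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ana', 'eng', 120), ('Ben', 'eng', 100), ('Cy', 'ops', 90);
""")

pairs = [
    # Different SQL text, same result rows: counts as correct.
    ("SELECT name FROM employees WHERE salary >= 100 ORDER BY name DESC",
     "SELECT name FROM employees WHERE salary > 95"),
    # Executes, but returns the wrong rows.
    ("SELECT name FROM employees WHERE dept = 'ops'",
     "SELECT name FROM employees WHERE dept = 'eng'"),
    # Does not execute (unknown column).
    ("SELECT fullname FROM employees",
     "SELECT name FROM employees"),
]

accuracy = execution_accuracy(conn, pairs)
print(f"EX = {accuracy:.2%}")
```

Because results are compared rather than SQL text, a prediction can be phrased very differently from the reference query and still count as correct, while a syntactically valid query that returns the wrong rows does not.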
Gemini-SQL2 narrows the gap with human performance (92.96% accuracy) to 12.92 points and posts the highest single-model score to date on the benchmark.
Competitive Landscape
Gemini-SQL2 leads the BIRD Single-Model Leaderboard, surpassing both general-purpose models and specialized SQL generation systems. Reported scores for the main systems are:
| System | Organization | Accuracy | Date |
|---|---|---|---|
| Gemini-SQL2 | Google | 80.04% | Jun 2026 |
| Gemini-SQL | Google | 77.2% | Mar 2026 |
| Q-SQL | AWS | 76.5% | Dec 2025 |
| GPT-5.5-xhigh | OpenAI | 72.5% | Apr 2026 |
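The margins implied by the table can be checked with a few lines of arithmetic. The sketch below uses only the scores listed in the table and the 92.96% human figure quoted earlier. It reports differences in percentage points, which is not the same as a relative percent improvement. The variable names are illustrative.

```python
from decimal import Decimal

# Scores copied from the leaderboard table (percent execution accuracy).
scores = {
    "Gemini-SQL2": Decimal("80.04"),
    "Gemini-SQL": Decimal("77.2"),
    "Q-SQL": Decimal("76.5"),
    "GPT-5.5-xhigh": Decimal("72.5"),
}
human = Decimal("92.96")  # human performance figure quoted in the article

leader = scores["Gemini-SQL2"]

# Lead of Gemini-SQL2 over each other system, in percentage points.
margins = {name: leader - score
           for name, score in scores.items() if name != "Gemini-SQL2"}

# Remaining distance to human performance, in percentage points.
human_gap = human - leader

for name, margin in margins.items():
    print(f"Gemini-SQL2 leads {name} by {margin} points")
print(f"Gap to human performance: {human_gap} points")
```

`Decimal` is used here so that the subtraction of two-decimal scores stays exact instead of picking up binary floating-point noise.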
Specialized 32B SQL models from Snowflake (Arctic-Text2SQL) and Alibaba (SQLWeaver) also outperform general-purpose models like Claude Opus 4.6, demonstrating the effectiveness of domain-specific SQL training.
Market Reception and Adoption of Gemini-SQL2
The announcement of Gemini-SQL2 garnered significant attention within technical communities:
- Over 144K views on X within the first 3 hours
- 2.8K likes and 1.3K bookmarks
- A 9.3:1 ratio of bookmarks plus likes to replies, which suggests a largely positive reception
While Google hasn't confirmed specific product integrations, analysts anticipate the near-term implementation of Gemini-SQL2 in BigQuery Studio and AlloyDB AI, where Gemini-based SQL generation is already present.
Future Directions for Gemini-SQL2
Google's research roadmap aims to enhance the handling of temporal databases and improve multi-step reasoning capabilities. With enterprise customers seeking better natural language access to complex data ecosystems, Gemini-SQL2 represents a significant advancement in AI-powered query generation and data analytics.