Week 5 - SQL Assistant with Progressive Disclosure
Building Smart Database Agents
Lesson Overview
| Segment | Duration |
|---|---|
| Lecture: Progressive Disclosure Concept | 10 minutes |
| Setup: Environment & Skills | 15 minutes |
| Guided Coding: Building the SQL Agent | 20 minutes |
| Exploration: Advanced Constraints & Multi-turn | 15+ minutes |
Learning Objectives: By the end of this lesson, students will be able to:
- Understand the progressive disclosure pattern for context management
- Build an agent that loads database schemas on-demand
- Create skill-based tools that inject knowledge when needed
- Implement stateful agents that remember loaded context
- Design constrained tools that enforce prerequisites
Colab Notebook for Today:
Week 5 - SQL Assistant with Progressive Disclosure
(It is recommended to download a copy of the notebook to your own Google Colab.)
Based on: LangChain SQL Skills Assistant tutorial
Lecture (10 min): Progressive Disclosure Concept
What is Progressive Disclosure?
Progressive disclosure is a design pattern where you show only the minimum information needed upfront, and load details on-demand. In the context of AI agents and databases:
Traditional Approach (❌ Wasteful):
- Load ALL database schemas into the agent’s context upfront
- For a database with 100 tables, this fills the entire context window
- Extremely high cost and token usage
- Agent is overwhelmed with irrelevant information

Progressive Disclosure (✅ Efficient):
- Show only lightweight skill descriptions in the system prompt
- Agent calls load_skill() on-demand to retrieve full schemas
- Only loads what it needs for the current task
- Scales to massive databases
Real-World Example
Imagine you work at a company with 50 different database schemas (sales, inventory, HR, finance, marketing, etc.).
- Old way: Load all 50 schemas → 500,000 tokens → $10 per query
- New way: Agent sees 50 one-sentence descriptions → loads 1 schema when needed → 10,000 tokens → $0.20 per query
That’s a 50x cost reduction while maintaining the same functionality!
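The arithmetic behind that estimate can be sketched in a few lines. The token counts match the example above, and the per-million-token price is a hypothetical round number chosen so the old-way cost comes out to $10, not real Gemini pricing:

```python
# Back-of-envelope comparison of the two approaches.
# All numbers are illustrative, taken from the example above.

TOKENS_ALL_SCHEMAS = 500_000    # 50 full schemas loaded upfront
TOKENS_ONE_SCHEMA = 10_000      # 50 short descriptions + 1 full schema
COST_PER_MILLION_TOKENS = 20.0  # hypothetical input price in dollars

def query_cost(tokens: int) -> float:
    """Cost in dollars for a single query's input context."""
    return tokens / 1_000_000 * COST_PER_MILLION_TOKENS

old_cost = query_cost(TOKENS_ALL_SCHEMAS)  # 10.0
new_cost = query_cost(TOKENS_ONE_SCHEMA)   # 0.2
print(f"Old: ${old_cost:.2f}, New: ${new_cost:.2f}, "
      f"savings: {old_cost / new_cost:.0f}x")
```

The savings scale with the number of skills: the more schemas you have, the bigger the gap between "load everything" and "load one".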
Core Concept
Instead of this:
System: Here are all 100 database schemas...
[500 pages of schema details]
User: What were our top customers last month?
We do this:
System: You have access to these skills:
- sales_analytics: Customer and order data
- inventory_management: Product and stock data
- ...
User: What were our top customers last month?
Agent: I need sales_analytics. Let me load it.
[Loads only that one schema]
Agent: Here's your query based on the sales schema.
Getting Started: API Key Setup
You can follow along using either Google Colab (recommended for beginners - no setup required!) or local development (requires Python environment setup). The code examples below include instructions for both.
About API keys
An API key gives you access to Google’s Gemini AI models. You’ll need one to run this lesson.
Do This
Access the developer’s site for Gemini here: Gemini API key
Sign in with your Google account
A key should be created automatically. Copy the key (a long string of letters and numbers).
If it doesn’t work
Make sure you are using a personal Google account, not your school email
If it says you don’t have access in your region, you may need to verify your age in your Google account settings
For Google Colab:
In Google Colab, we’ll use Colab’s built-in secrets feature to store your API key securely.
Steps:
1. Click the 🔑 key icon in the left sidebar (“Secrets”)
2. Click “Add a new secret”
3. Name it: GOOGLE_API_KEY
4. Paste your Gemini API key as the value
5. Toggle on “Notebook access” for this notebook
# Set up API key from Colab secrets
import os
from google.colab import userdata
os.environ['GOOGLE_API_KEY'] = userdata.get('GOOGLE_API_KEY')
print("Google API key configured!")

For Local Development:
# Load environment variables from .env file (local development only)
from dotenv import load_dotenv
load_dotenv()

Install Required Packages
!pip install -q -U langchain langchain-google-genai langgraph

Initialize the Gemini Model
from langchain.chat_models import init_chat_model
model = init_chat_model(
model="gemini-2.5-flash",
model_provider="google_genai",
)
# Sanity check
response = model.invoke("Hello world!")
print("✅ Gemini 2.5 Flash model ready!")
print(f"Response: {response.content}")

STOP
If you got things working, great!
Make sure to help your neighbor get there too. Then you can continue.
Understanding Skills
What are Skills?
A skill is a package of domain-specific knowledge that the agent can load on-demand. Each skill has three components:
- name — unique identifier (e.g., "sales_analytics")
- description — short 1-2 sentence summary shown in the system prompt upfront
- content — the full schema and business logic, loaded only when needed
This is the heart of progressive disclosure: descriptions are cheap, full content is loaded only when needed.
Example Skill Structure
Skill(
name="sales_analytics",
description="Database schema and business logic for sales data analysis.",
content="""
# Full database schema here
Tables: customers, orders, order_items
Columns: ...
Business rules: ...
"""
)

The description goes in the system prompt. The content is loaded via the load_skill() tool.
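The Skill type itself is defined in the Colab notebook; a minimal version can be sketched as a frozen dataclass whose fields match the three components above, plus the SKILLS_MAP lookup table that the load_skill tool reads from:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Skill:
    """A package of domain knowledge the agent can load on demand."""
    name: str         # unique identifier
    description: str  # 1-2 sentence summary, always in the system prompt
    content: str      # full schema/business logic, loaded via load_skill()

SKILLS = [
    Skill(
        name="sales_analytics",
        description="Database schema and business logic for sales data analysis.",
        content="Tables: customers, orders, order_items ...",
    ),
]

# Lookup table used by the load_skill tool
SKILLS_MAP = {skill.name: skill for skill in SKILLS}
```

The frozen dataclass keeps skills immutable, so they can safely be shared across conversations and threads.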
Building the SQL Agent
The complete code for building the agent is in the Colab notebook. Here’s the workflow:
Step 1: Define Skills
Create skill objects with lightweight descriptions and full content.
Step 2: Create the load_skill Tool
This tool fetches skill content on-demand:
from langchain_core.tools import tool

@tool
def load_skill(skill_name: str) -> str:
    """Load the full content of a skill into the agent's context."""
    skill = SKILLS_MAP.get(skill_name)
    if skill:
        return f"Loaded skill: {skill_name}\n\n{skill.content}"
    return f"Skill '{skill_name}' not found."

Step 3: Build System Prompt with Skill Descriptions
skills_list = "\n".join(
    f"- **{skill.name}**: {skill.description}"
    for skill in SKILLS
)

SYSTEM_PROMPT = f"""You are a SQL query assistant.

## Available Skills
{skills_list}

Before writing SQL, call load_skill to get the schema."""

Step 4: Create the Agent
from langgraph.prebuilt import create_react_agent
from langgraph.checkpoint.memory import MemorySaver

agent = create_react_agent(
    model=model,
    tools=[load_skill],
    prompt=SYSTEM_PROMPT,
    checkpointer=MemorySaver(),  # Remembers loaded skills
)

Step 5: Test It!
ask_agent("Find customers who made orders over $1000 last month.")

The agent will:
1. Recognize it needs sales data
2. Call load_skill("sales_analytics")
3. Get the full schema
4. Write the SQL query
Exploration: Advanced Features
The Colab notebook includes advanced sections on:
- Multi-turn conversations with persistent memory
- Constrained tools that enforce prerequisites (must load skill before validating query)
- Custom state management to track which skills are loaded
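The constrained-tool idea can be sketched without any LangGraph machinery by tracking loaded skills in a module-level set; the notebook's version stores this in custom agent state instead, but the prerequisite check is the same:

```python
# Simplified sketch: track which skills have been loaded in a plain set.
# The notebook keeps this in agent state; the enforcement logic is identical.
loaded_skills: set[str] = set()

def load_skill(skill_name: str) -> str:
    """Record that a skill's schema is now in context."""
    loaded_skills.add(skill_name)
    return f"Loaded skill: {skill_name}"

def validate_query(skill_name: str, sql: str) -> str:
    """Refuse to validate a query until the relevant skill is loaded."""
    if skill_name not in loaded_skills:
        return f"Error: call load_skill('{skill_name}') first."
    # Real validation (parsing the SQL, checking table and column names
    # against the loaded schema) would go here.
    return f"Query validated against {skill_name}."
```

Returning an error string, rather than raising, lets the agent read the failure and recover by calling load_skill itself.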
Extending This Pattern
Add more skills: Just add entries to the SKILLS list — no other code changes needed.
Large databases: For 100+ tables, use vector search to find relevant skills:
from langchain_community.vectorstores import FAISS
# Embed skill descriptions, retrieve top-k relevant

Real query execution: Combine with LangChain’s SQL Agent to actually run queries against a database.
Remote skill storage: Store skill content in S3, a database, or an API instead of in-memory.
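The vector-search idea for large databases can be illustrated with a dependency-free sketch that ranks skills by keyword overlap between the question and each description; a production version would embed the descriptions (e.g. with FAISS, as above) rather than counting shared words:

```python
# Dependency-free sketch: rank skills by keyword overlap with the question.
# A real deployment would use embeddings instead of word counting.

def find_relevant_skills(question: str, descriptions: dict[str, str],
                         k: int = 2) -> list[str]:
    """Return names of the k skills whose descriptions best match the question."""
    q_words = set(question.lower().split())
    def score(item):
        _name, desc = item
        return len(q_words & set(desc.lower().split()))
    ranked = sorted(descriptions.items(), key=score, reverse=True)
    return [name for name, _ in ranked[:k]]

descriptions = {
    "sales_analytics": "customer and order data for sales analysis",
    "inventory_management": "product and stock level data",
    "hr_records": "employee and payroll data",
}
print(find_relevant_skills("show customer order totals", descriptions, k=1))
# → ['sales_analytics']
```

Only the top-k matching descriptions then go into the system prompt, so even a 500-skill catalog stays within budget.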
Summary
Progressive disclosure is a powerful pattern for managing context in AI agents. By showing lightweight descriptions upfront and loading details on-demand, you can:
- Reduce costs by 10-50x
- Scale to massive databases (100s of tables)
- Keep context focused on what matters
- Maintain accuracy by loading complete schemas when needed
This pattern applies beyond just database schemas — think of it for:
- API documentation
- Code repositories
- Knowledge bases
- Tool libraries
The key insight: Don’t load everything upfront. Load what you need, when you need it.