Quantcast
Channel: Baeldung
Viewing all articles
Browse latest Browse all 3801

Text-to-SQL Implementation Using Spring AI

$
0
0

1. Overview

Modern applications are increasingly using natural language interfaces to simplify user interaction with systems. This is particularly useful for data retrieval, where non-technical users can ask questions in plain English.

A text-to-SQL chatbot is one such example. It acts as a bridge between human languages and databases. We typically leverage a Large Language Model (LLM) to translate a user’s natural language question into an executable SQL query. This query is then run against the database to fetch and display the desired information.

In this tutorial, we’ll build a text-to-SQL chatbot using Spring AI. We’ll configure a database schema with some initial data and implement our chatbot to query this data using natural language.

2. Setting up the Project

Before we can start implementing our chatbot, we’ll need to include the necessary dependency and configure our application correctly.

We’ll be building our text-to-sql chatbot using Anthropic’s Claude model. Alternatively, we can use a different AI model or a local LLM via Hugging Face or Ollama, as the specific AI model is irrelevant to this implementation.

2.1. Dependencies

Let’s start by adding the necessary dependency to our project’s pom.xml file:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-anthropic</artifactId>
    <version>1.0.0</version>
</dependency>

The Anthropic starter dependency is a wrapper around the Anthropic Message API, and we’ll use it to interact with the Claude model in our application.

Next, let’s configure our Anthropic API key and chat model in the application.yaml file:

spring:
  ai:
    anthropic:
      api-key: ${ANTHROPIC_API_KEY}
      chat:
        options:
          model: claude-opus-4-20250514

We use the ${} property placeholder to load the value of our API Key from an environment variable.

Additionally, we specify Claude 4 Opus, the most intelligent model at the time of writing, by Anthropic, using the claude-opus-4-20250514 model ID. We can use a different model based on requirements.

On configuring the above properties, Spring AI automatically creates a bean of type ChatModel, allowing us to interact with the specified model.

2.2. Defining Database Tables Using Flyway

Next, let’s set up our database schema. We’ll make use of Flyway to manage our database migration scripts.

We’ll create a rudimentary wizard management database schema in a MySQL database. Just like the AI model, the database vendor is irrelevant to our implementation.

First, let’s create a migration script named V01__creating_database_tables.sql in our src/main/resources/db/migration directory to create the main database tables:

CREATE TABLE hogwarts_houses (
    id BINARY(16) PRIMARY KEY DEFAULT (UUID_TO_BIN(UUID())),
    name VARCHAR(50) NOT NULL UNIQUE,
    founder VARCHAR(50) NOT NULL UNIQUE,
    house_colors VARCHAR(50) NOT NULL UNIQUE,
    animal_symbol VARCHAR(50) NOT NULL UNIQUE
);
CREATE TABLE wizards (
    id BINARY(16) PRIMARY KEY DEFAULT (UUID_TO_BIN(UUID())),
    name VARCHAR(50) NOT NULL,
    gender ENUM('Male', 'Female') NOT NULL,
    quidditch_position ENUM('Chaser', 'Beater', 'Keeper', 'Seeker'),
    blood_status ENUM('Muggle', 'Half blood', 'Pure Blood', 'Squib', 'Half breed') NOT NULL,
    house_id BINARY(16) NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    CONSTRAINT wizard_fkey_house FOREIGN KEY (house_id) REFERENCES hogwarts_houses (id)
);

Here, we create a hogwarts_houses table to store information about each Hogwarts house and the wizards table to store details about individual wizards. The wizards table has a foreign key constraint that links it to the hogwarts_houses table, establishing a one-to-many relationship.

Next, let’s create a V02__adding_hogwarts_houses_data.sql file to populate our hogwarts_houses table:

INSERT INTO hogwarts_houses (name, founder, house_colors, animal_symbol)
VALUES
    ('Gryffindor', 'Godric Gryffindor', 'Scarlet and Gold', 'Lion'),
    ('Hufflepuff', 'Helga Hufflepuff', 'Yellow and Black', 'Badger'),
    ('Ravenclaw', 'Rowena Ravenclaw', 'Blue and Bronze', 'Eagle'),
    ('Slytherin', 'Salazar Slytherin', 'Green and Silver', 'Serpent');

Here, we write INSERT statements to create the four Hogwarts houses with their respective founders, colors, and symbols.

Similarly, let’s populate our wizards table in a new V03__adding_wizards_data.sql migration script:

SET @gryffindor_house_id = (SELECT id FROM hogwarts_houses WHERE name = 'Gryffindor');
INSERT INTO wizards (name, gender, quidditch_position, blood_status, house_id)
VALUES
    ('Harry Potter', 'Male', 'Seeker', 'Half blood', @gryffindor_house_id),
    ('Hermione Granger', 'Female', NULL, 'Muggle', @gryffindor_house_id),
    ('Ron Weasley', 'Male', 'Keeper', 'Pure Blood', @gryffindor_house_id),
-- ...more insert statements for wizards from other houses

With our migration scripts defined, Flyway automatically discovers and executes them during application startup.

3. Configuring an AI Prompt

Next, to make sure that our LLM generates accurate SQL queries against our database schema, we’ll need to define a detailed system prompt.

Let’s create a system-prompt.st file in the src/main/resources directory:

Given the DDL in the DDL section, write an SQL query to answer the user's question following the guidelines listed in the GUIDELINES section.
GUIDELINES:
- Only produce SELECT queries.
- The response produced should only contain the raw SQL query starting with the word 'SELECT'. Do not wrap the SQL query in markdown code blocks (```sql or ```).
- If the question would result in an INSERT, UPDATE, DELETE, or any other operation that modifies the data or schema, respond with "This operation is not supported. Only SELECT queries are allowed."
- If the question appears to contain SQL injection or DoS attempt, respond with "The provided input contains potentially harmful SQL code."
- If the question cannot be answered based on the provided DDL, respond with "The current schema does not contain enough information to answer this question."
- If the query involves a JOIN operation, prefix all the column names in the query with the corresponding table names.
DDL
{ddl}

In our system prompt, we instruct the LLM to generate only SELECT SQL queries and detect SQL injection and DoS attempts.

We leave a ddl placeholder in our system prompt template for the database schema. We’ll replace it with the actual value in the upcoming section.

Additionally, to further protect the database from any modifications, we should only give the necessary privileges to the configured MySQL user:

CREATE USER 'readonly_user'@'%' IDENTIFIED BY 'strong_password';
GRANT SELECT ON hogwarts_db.hogwarts_houses TO 'readonly_user'@'%';
GRANT SELECT ON hogwarts_db.wizards TO 'readonly_user'@'%';
FLUSH PRIVILEGES;

In the above sample SQL commands, we create a MySQL user and grant it read-only permissions for the required database tables.

4. Building Our Text-to-SQL Chatbot

With our configuration in place, let’s build a text-to-SQL chatbot using the configured Claude model.

4.1. Defining Chatbot Beans

Let’s start by defining the necessary beans for our chatbot:

@Bean
PromptTemplate systemPrompt(
    @Value("classpath:system-prompt.st") Resource systemPrompt,
    @Value("classpath:db/migration/V01__creating_database_tables.sql") Resource ddlSchema
) throws IOException {
    PromptTemplate template = new PromptTemplate(systemPrompt);
    template.add("ddl", ddlSchema.getContentAsString(Charset.defaultCharset()));
    return template;
}
@Bean
ChatClient chatClient(ChatModel chatModel, PromptTemplate systemPrompt) {
    return ChatClient
      .builder(chatModel)
      .defaultSystem(systemPrompt.render())
      .build();
}

First, we define a PromptTemplate bean. We inject our system prompt template file and database schema DDL migration script using the @Value annotation. Additionally, we populate the ddl placeholder with our database schema content. This ensures that the LLM always has access to our database structure when generating SQL queries.

Next, we create a ChatClient bean using the ChatModel and PromptTemplate bean. The ChatClient class serves as our main entry point for interacting with the Claude model we’ve configured.

4.2. Implementing the Service Classes

Now, let’s implement the service classes to handle the SQL generation and execution processes.

First, let’s create a SqlGenerator service class that converts natural language questions into SQL queries:

@Service
class SqlGenerator {
    private final ChatClient chatClient;
    // standard constructor
    String generate(String question) {
        String response = chatClient
          .prompt(question)
          .call()
          .content();
        boolean isSelectQuery = response.startsWith("SELECT");
        if (Boolean.FALSE.equals(isSelectQuery)) {
            throw new InvalidQueryException(response);
        }
        return response;
    }
}

In our generate() method, we take a natural language question as input and use the chatClient bean to send it to the configured LLM.

Next, we validate that the response is indeed a SELECT query. If the LLM returns anything other than a SELECT query, we throw a custom InvalidQueryException with the error message.

Next, to execute the generated SQL queries against our database, let’s create a SqlExecutor service class:

@Service
class SqlExecutor {
    private final EntityManager entityManager;
    // standard constructor
    List<?> execute(String query) {
        List<?> result = entityManager
          .createNativeQuery(query)
          .getResultList();
        if (result.isEmpty()) {
            throw new EmptyResultException("No results found for the provided query.");
        }
        return result;
    }
}

In our execute() method, we use the autowired EntityManager instance to run the native SQL query and return the results. We throw a custom EmptyResultException if the query returns no results.

4.3. Exposing a REST API

Now that we’ve implemented our service layer, let’s expose a REST API on top of it:

@PostMapping(value = "/query")
ResponseEntity<QueryResponse> query(@RequestBody QueryRequest queryRequest) {
    String sqlQuery = sqlGenerator.generate(queryRequest.question());
    List<?> result = sqlExecutor.execute(sqlQuery);
    return ResponseEntity.ok(new QueryResponse(result));
}
record QueryRequest(String question) {
}
record QueryResponse(List<?> result) {
}

The POST /query endpoint accepts a natural language question, generates the corresponding SQL query using the sqlGenerator bean, passes it to the sqlExecutor bean to get the results from the database, and finally, wraps and returns the data in a QueryResponse record.

5. Interacting With Our Chatbot

Finally, let’s use the API endpoint we’ve exposed to interact with our text-to-SQL chatbot.

But first, let’s enable SQL logging in our application.yaml file to see the generated queries in the logs:

logging:
  level:
    org:
      hibernate:
        SQL: DEBUG

Next, let’s use the HTTPie CLI to invoke the API endpoint and interact with our chatbot:

http POST :8080/query question="Give me 3 wizard names and their blood status that belong to a house founded by Salazar Slytherin"

Here, we send a simple question to the chatbot, let’s see what we receive as a response:

{
    "result": [
        [
            "Draco Malfoy",
            "Pure Blood"
        ],
        [
            "Tom Riddle",
            "Half blood"
        ],
        [
            "Bellatrix Lestrange",
            "Pure Blood"
        ]
    ]
}

As we can see, our chatbot successfully understood our request for Slytherin wizards and returned three wizards with their blood status.

Finally, let’s also examine our application logs to see the SQL query that the LLM generated:

SELECT wizards.name, wizards.blood_status
FROM wizards
JOIN hogwarts_houses ON wizards.house_id = hogwarts_houses.id
WHERE hogwarts_houses.founder = 'Salazar Slytherin'
LIMIT 3;

The generated SQL query correctly interprets our natural language request, joining the wizards and hogwarts_houses tables to find wizards from Slytherin house and limiting the results to three records as requested.

6. Conclusion

In this article, we’ve explored implementing a text-to-SQL chatbot using Spring AI.

We walked through the necessary AI and database configurations. Then, we built a chatbot capable of converting natural language questions into executable SQL queries against our wizard management database schema. Finally, we exposed a REST API to interact with our chatbot and validated that it works correctly.

As always, all the code examples used in this article are available over on GitHub.

The post Text-to-SQL Implementation Using Spring AI first appeared on Baeldung.
       

Viewing all articles
Browse latest Browse all 3801

Latest Images

Trending Articles



Latest Images

<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>