Building Smart Data Agents with LLMs

Summary

This article explains how to build data agents powered by Large Language Models (LLMs). These agents handle complex data analysis tasks by combining tools with memory modules. It covers the components of an LLM agent (tools, memory, planning, and the agent core) and walks through a data agent for inventory management.

Understanding LLM Agents

LLM agents are sophisticated systems that combine planning capabilities, memory, and tools to perform tasks requested by users. They are particularly useful for data analysis tasks that require reasoning, search, and planning capabilities.

Types of LLM Agents

There are two main types of LLM agents:

  • Data Agents: These agents are designed for extractive goals: they help users pull information out of various data sources and support reasoning over it, for example by answering questions about inventory levels or financial data.
  • API or Execution Agents: These agents are designed for execution goals, carrying out tasks or sets of tasks requested by users. They can interact with external services or APIs to perform tasks like organizing data in spreadsheets.

Building a Data Agent

To build a data agent, we need to identify the LLM to use, select a use case, and define the necessary tools and memory modules.

Choosing an LLM

For this example, we will use the Mixtral 8x7B LLM available in the NVIDIA NGC catalog. The catalog hosts accelerated versions of various models and makes them available as APIs.

Selecting a Use Case

Our use case is talking to an SQL database for inventory management. We will populate the database with tables containing supplier information, product details, and inventory levels.

Defining Tools and Memory

Our data agent will use two tools:

  • Calculator: For basic calculations needed after querying the data.
  • SQL Query Executor: For querying the database for raw data.

The memory module will be a simple buffer or list to keep track of all the agent’s actions.
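
As a rough illustration of these pieces, the two tools and the memory buffer could be sketched as below. This is a minimal sketch, not the article's implementation: the function names `calculator`, `query_database`, and `remember` are hypothetical, SQLite stands in for the SQL database, and the single inventory row is taken from the example data in this article.

```python
import ast
import operator
import sqlite3

# Arithmetic operators the calculator is allowed to evaluate.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator(expression):
    """Evaluate a basic arithmetic expression such as '100 - 20' without eval()."""
    def _eval(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError(f"unsupported expression: {expression!r}")

    return _eval(ast.parse(expression, mode="eval").body)


def query_database(conn, sql):
    """Run an SQL query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# Memory: a plain list where each entry records one action the agent took.
memory = []


def remember(tool, tool_input, result):
    memory.append({"tool": tool, "input": tool_input, "result": result})


# Usage with one row from the example inventory data (product 7).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (product_id INTEGER, quantity INTEGER, min_required INTEGER)"
)
conn.execute("INSERT INTO inventory VALUES (7, 100, 20)")

sql = "SELECT quantity, min_required FROM inventory WHERE product_id = 7"
rows = query_database(conn, sql)
remember("Query_Database", sql, rows)

expression = f"{rows[0][0]} - {rows[0][1]}"
remember("Calculator", expression, calculator(expression))
```

The calculator walks the parsed expression tree instead of calling `eval`, so text produced by the LLM can only trigger the four listed arithmetic operations.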

Planning Module

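The planning module will use a linear greedy approach: at each step the agent picks the single next tool to run, with no backtracking or look-ahead. We will also create a “faux tool” named “generate the final answer”, so that deciding to stop and respond is just another tool selection.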
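
One way to picture this planning loop is the sketch below. It is written under stated assumptions: `run_agent`, `scripted_choice`, and the canned tool functions are hypothetical names, and a scripted rule stands in for the LLM call that would normally choose the next tool.

```python
FINAL_ANSWER = "Generate Final Answer"  # the "faux tool" that ends the loop


def run_agent(question, choose_tool, tools, max_steps=5):
    """Linear greedy loop: pick one tool per step, run it, record the result."""
    memory = []
    for _ in range(max_steps):
        name = choose_tool(question, memory)  # one greedy decision, no look-ahead
        result = tools[name](question, memory)
        memory.append({"tool": name, "result": result})
        if name == FINAL_ANSWER:
            return result, memory
    raise RuntimeError("no final answer within the step budget")


def scripted_choice(question, memory):
    """Stand-in for the LLM: choose the first tool that has not been used yet."""
    used = [entry["tool"] for entry in memory]
    if "Query_Database" not in used:
        return "Query_Database"
    if "Calculator" not in used:
        return "Calculator"
    return FINAL_ANSWER


# Canned tools so the sketch runs without a database or a model.
tools = {
    # Returns (quantity, min_required) for Google Pixel 6 from the example data.
    "Query_Database": lambda question, memory: (100, 20),
    # Subtracts min_required from quantity using the previous memory entry.
    "Calculator": lambda question, memory: memory[-1]["result"][0] - memory[-1]["result"][1],
    # Formats the last computed value as the answer.
    FINAL_ANSWER: lambda question, memory: f"Excess inventory: {memory[-1]['result']} units",
}

answer, trace = run_agent(
    "How much excess inventory do I have for Google Pixel 6?", scripted_choice, tools
)
print(answer)
```

Treating the final answer as a tool keeps the loop uniform: the selector always returns a tool name, and the loop ends when that name is the faux tool.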

Example: Inventory Management

Let’s consider an example where a user asks, “How much excess inventory do I have for Google Pixel 6?”

To answer this question, the agent will perform the following steps:

  1. SQL Query Executor (Query_Database):

    • Generate an SQL query that retrieves the current quantity and the minimum required quantity for Google Pixel 6.
    • Run the query against the database and store the results in memory.
  2. Calculator:

    • Compute the excess inventory by subtracting the minimum required quantity from the current quantity.
    • Store the result in memory.
  3. Final Answer Generation:

    • Use the results from the previous steps to generate the final answer.
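
The three steps above can be traced by hand with the example data. The sketch below assumes SQLite as the database and uses a hand-written query in place of the one the LLM would generate. The reduced table definitions and the wording of the final answer are illustrative assumptions.

```python
import sqlite3

# Minimal database holding only the rows needed for this question.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE inventory (product_id INTEGER, quantity INTEGER, min_required INTEGER);
    INSERT INTO products VALUES (7, 'Google Pixel 6');
    INSERT INTO inventory VALUES (7, 100, 20);
    """
)
memory = []  # simple list that records each action

# Step 1: query the database (hand-written SQL standing in for LLM output).
sql = (
    "SELECT i.quantity, i.min_required "
    "FROM inventory AS i JOIN products AS p ON p.id = i.product_id "
    "WHERE p.name = 'Google Pixel 6'"
)
quantity, min_required = conn.execute(sql).fetchone()
memory.append(("Query_Database", sql, (quantity, min_required)))

# Step 2: calculator, excess = current quantity - minimum required quantity.
excess = quantity - min_required
memory.append(("Calculator", f"{quantity} - {min_required}", excess))

# Step 3: generate the final answer from what is in memory.
final_answer = f"You have {excess} excess units of Google Pixel 6."
print(final_answer)
```

With a current quantity of 100 and a minimum required quantity of 20, the excess is 100 - 20 = 80 units.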

Agent Core

The agent core will use the following prompt to guide the agent:

You are an agent capable of using a variety of tools to answer a data analytics question.
Always use memory to help select the tools to be used.

Memory:
- Previous Question: How much excess inventory do we have for 'Google Pixel 6'?
- SQL Query: SELECT inventory.id, inventory.quantity, inventory.min_required FROM inventory WHERE product_id = 7

Tools:
- Generate Final Answer: Use if answer to user's question can be given with memory.
- Calculator: Use this tool to solve mathematical problems.
- Query_Database: Write an SQL query to query the database.

Answer Format:
- tool_name: Calculator
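
To show how such a prompt might be assembled from memory and how the reply could be read back, here is a minimal sketch. The helpers `build_prompt` and `parse_tool_name` are hypothetical, the reply string is a made-up stand-in for model output, and the placeholder in the answer-format line is an assumption about how the format would be requested.

```python
import re

# Tool descriptions copied from the prompt above.
TOOLS = {
    "Generate Final Answer": "Use if answer to user's question can be given with memory.",
    "Calculator": "Use this tool to solve mathematical problems.",
    "Query_Database": "Write an SQL query to query the database.",
}


def build_prompt(memory):
    """Assemble the agent-core prompt from (label, value) memory entries."""
    lines = [
        "You are an agent capable of using a variety of tools to answer a data analytics question.",
        "Always use memory to help select the tools to be used.",
        "",
        "Memory:",
    ]
    lines += [f"- {label}: {value}" for label, value in memory]
    lines += ["", "Tools:"]
    lines += [f"- {name}: {description}" for name, description in TOOLS.items()]
    # Assumed placeholder; the model is expected to fill in one tool name.
    lines += ["", "Answer Format:", "- tool_name: <name of the tool to use>"]
    return "\n".join(lines)


def parse_tool_name(reply):
    """Extract the chosen tool from a reply such as '- tool_name: Calculator'."""
    match = re.search(r"tool_name:\s*(.+)", reply)
    if match is None:
        raise ValueError("reply does not contain a tool_name line")
    name = match.group(1).strip()
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name!r}")
    return name


memory = [
    ("Previous Question", "How much excess inventory do we have for 'Google Pixel 6'?"),
    (
        "SQL Query",
        "SELECT inventory.id, inventory.quantity, inventory.min_required "
        "FROM inventory WHERE product_id = 7",
    ),
]
prompt = build_prompt(memory)
chosen = parse_tool_name("- tool_name: Calculator")  # made-up model reply
```

Rejecting tool names that are not in `TOOLS` guards against a reply that names a tool the agent does not have.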

Table: Example Database Schema

Table Name | Columns
suppliers  | name, address, contact
products   | name, description, price, supplier_id
inventory  | product_id, quantity, min_required

Table: Example Inventory Data

Product ID | Quantity | Min Required
1          | 150      | 30
2          | 100      | 20
3          | 120      | 30
4          | 80       | 15
5          | 200      | 40
6          | 150      | 25
7          | 100      | 20
8          | 90       | 18
9          | 170      | 35
10         | 220      | 45

Table: Example Product Data

Product ID | Name                   | Description                            | Price  | Supplier ID
1          | Samsung Galaxy S21     | Samsung flagship smartphone            | 799.99 | 1
2          | Samsung Galaxy Note 20 | Samsung premium smartphone with stylus | 999.99 | 1
3          | iPhone 13 Pro          | Apple flagship smartphone              | 999.99 | 2
4          | iPhone SE              | Apple budget smartphone                | 399.99 | 2
5          | OnePlus 9              | High performance smartphone            | 729.00 | 3
6          | OnePlus Nord           | Mid-range smartphone                   | 499.00 | 3
7          | Google Pixel 6         | Google’s latest smartphone             | 599.00 | 4
8          | Google Pixel 5a        | Affordable Google smartphone           | 449.00 | 4
9          | Xiaomi Mi 11           | Xiaomi high-end smartphone             | 749.99 | 5
10         | Xiaomi Redmi Note 10   | Xiaomi budget smartphone               | 199.99 | 5
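
For readers who want to experiment with this data, the sketch below loads the product and inventory tables into an in-memory SQLite database and computes the excess inventory for every product. SQLite, the column types, and the `id` primary key are assumptions. The suppliers table is left out because the article does not list any supplier rows.

```python
import sqlite3

# Rows copied from the example product and inventory tables.
PRODUCTS = [
    (1, "Samsung Galaxy S21", "Samsung flagship smartphone", 799.99, 1),
    (2, "Samsung Galaxy Note 20", "Samsung premium smartphone with stylus", 999.99, 1),
    (3, "iPhone 13 Pro", "Apple flagship smartphone", 999.99, 2),
    (4, "iPhone SE", "Apple budget smartphone", 399.99, 2),
    (5, "OnePlus 9", "High performance smartphone", 729.00, 3),
    (6, "OnePlus Nord", "Mid-range smartphone", 499.00, 3),
    (7, "Google Pixel 6", "Google’s latest smartphone", 599.00, 4),
    (8, "Google Pixel 5a", "Affordable Google smartphone", 449.00, 4),
    (9, "Xiaomi Mi 11", "Xiaomi high-end smartphone", 749.99, 5),
    (10, "Xiaomi Redmi Note 10", "Xiaomi budget smartphone", 199.99, 5),
]
INVENTORY = [
    (1, 150, 30), (2, 100, 20), (3, 120, 30), (4, 80, 15), (5, 200, 40),
    (6, 150, 25), (7, 100, 20), (8, 90, 18), (9, 170, 35), (10, 220, 45),
]

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (
        id INTEGER PRIMARY KEY,
        name TEXT,
        description TEXT,
        price REAL,
        supplier_id INTEGER
    );
    CREATE TABLE inventory (
        product_id INTEGER REFERENCES products(id),
        quantity INTEGER,
        min_required INTEGER
    );
    """
)
conn.executemany("INSERT INTO products VALUES (?, ?, ?, ?, ?)", PRODUCTS)
conn.executemany("INSERT INTO inventory VALUES (?, ?, ?)", INVENTORY)

# Excess inventory per product: quantity minus minimum required, largest first.
excess_rows = conn.execute(
    """
    SELECT p.name, i.quantity - i.min_required AS excess
    FROM inventory AS i
    JOIN products AS p ON p.id = i.product_id
    ORDER BY excess DESC
    """
).fetchall()

for name, excess in excess_rows:
    print(f"{name}: {excess}")
```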

Conclusion

LLM-powered data agents can extend data analysis beyond single queries. By combining a planning module, memory, and tools, these agents can break a question into steps, query the data, and compute an answer. This article has shown how to assemble a data agent for inventory management as one example of LLM agents in a real-world application.