Integrating PandasAI with LM Studio Local Models for Stock Data Analysis: Evaluating AI-Assisted Financial Data Processing

Francis Benistant
8 min readJul 14, 2024

--

Introduction

In the rapidly evolving landscape of data analysis and artificial intelligence, two tools have emerged that promise to revolutionize how we interact with and analyze data: PandasAI and LM Studio.

PandasAI is an innovative Python library that extends the capabilities of the popular data manipulation library Pandas. It introduces a natural language interface to dataframes, allowing users to query and analyze data using plain English commands. By leveraging large language models (LLMs), PandasAI interprets user queries and generates the corresponding Python code to perform the requested analysis.

LM Studio, on the other hand, is a powerful application that enables users to run large language models locally on their personal computers. It provides a user-friendly interface for downloading, managing, and running various open-source language models. By offering local deployment options, LM Studio addresses concerns related to data privacy and reduces dependency on cloud-based AI services.

The integration of PandasAI with LM Studio’s local models presents an intriguing opportunity for data analysts, particularly in sensitive domains like financial analysis. This combination allows for AI-assisted data processing without the need to send potentially confidential information to external servers. However, as with any emerging technology, it’s crucial to understand both the capabilities and limitations of these tools.

In this article, we’ll explore how PandasAI and LM Studio can be applied to stock data analysis, examining their potential to streamline analytical processes while also discussing the current constraints and challenges of AI-assisted financial data processing.

LM Studio Server

LM Studio Server is a feature of LM Studio that allows you to run large language models (LLMs) locally on your computer and access them via an API.

  1. LM Studio Server runs on your local machine, typically on localhost:1234.
  2. It provides an OpenAI-compatible API interface for interacting with locally loaded LLMs.
  3. Supports endpoints like /v1/chat/completions, /v1/embeddings, and /v1/completions.
  4. Allows you to use local LLMs as drop-in replacements for cloud-based APIs like OpenAI.
  5. You can start the server from the LM Studio app’s “Local Server” tab.
  6. Multiple models can be loaded and accessed through the server simultaneously.
  7. Enables integration of local LLMs into applications, scripts, or other tools.
  8. Provides a way to use powerful language models offline and privately.
  9. Supports various model parameters like temperature, max_tokens, and top_p.
  10. Can be used with curl commands or integrated into code using standard HTTP requests.

PANDASAI LLM library

PandasAI’s LLM library provides a flexible interface for integrating various large language models into data analysis workflows.

  1. Supports multiple LLM providers: Including BambooLLM (PandasAI’s own model), OpenAI, Google PaLM, Azure OpenAI, HuggingFace, LangChain, Amazon Bedrock, IBM watsonx.ai, and local LLM models.
  2. Easy integration: LLMs can be instantiated and passed to SmartDataFrame or SmartDatalake constructors.
  3. Configuration options: LLM settings can be specified in code or via a pandasai.json file.
  4. API key management: Supports both direct API key input and environment variable usage for security.
  5. Local model support: Compatible with local inference servers like Ollama and LM Studio.
  6. Customization: Allows for model-specific parameters and proxy settings.
  7. Token counting: Provides utilities to track token usage and associated costs (for applicable models).
  8. Extensibility: Supports integration with LangChain models for additional flexibility.

The library aims to make it easy for users to leverage powerful language models for data analysis tasks, offering a range of options to suit different needs and preferences.

Installing PANDASAI:

The installation is made for conda environement,

conda create -n pandasai_env python=3.10
conda activate pandasai_env

pip install pandasai
  1. If you want to install additional dependencies, you can do so by specifying them like this:
pip install pandasai[extra-dependency-name]
for example
pip install pandasai[excel]

Replace extra-dependency-name with the specific dependency you need, such as google-ai, excel, langchain, etc

To ensure all dependencies are properly installed and compatible with your Conda environment, you might want to install some common data science packages using Conda first:

conda install pandas numpy matplotlib

PANDASAI and LM Studio

Integrating PANDASAI and LM Studio is taking only few lines of code. First, we need to define Pandasai libraries, then the cvs file path for the stock price and then the agent that will infer with the cvs file.

  1. Import necessary libraries:
import os
import pandas as pd
from pandasai import Agent
from pandasai import SmartDataframe
from pandasai.llm.local_llm import LocalLLM

2. Define the CSV file path and load the data:

file_path = 'MSFT_updated.csv'  # Replace with the actual file path

3. Configure the LLM to use LM Studio’s local server and create the SmartDataframe:

# Configure the local LLM to use LM Studio's local server
lm_studio_llm = LocalLLM(api_base="http://localhost:1234/v1")
stock_data= SmartDataframe("MSFT_updated.csv", config={"llm": lm_studio_llm})

4. Define the trading agent:

# Initialize the trading agent with the stock data and the local LLM
agent = Agent(stock_data, config={"llm": lm_studio_llm})

5. interaction with the data using natural language queries thru the trading agent:

# Example trading queries
agent.chat("Forget all previous context and start a new conversation to analyse the agent.")
print(agent.chat('You are an expert in python coding for stock data analysis from Yahoo dataframe, write a python code using pandas library to return the highest and lowest closing prices for the stock and the the average trading volume for the stock'))

Pandasai and LM Studio model interaction:

In order to interact with pandasai, LM Studio models need to be defined on the LM Studio server:

The model is loaded with the maximum layers running on GPU, then the preset needs to be set up.

We used the Codestral LLM model as it gave the best results. This model is a quantization Q8 model, as by experience it is better to use the highest possible quantization to get meaningful results. Other models can be used, with lower quantization and samller size, but the results are not as good as with large models with high quantization, and require much more details prompt in order to give the desired results.

Then you start the Lm Studio Server and copy the api to the python pandasai code:

lm_studio_llm = LocalLLM(api_base="http://localhost:1234/v1")

Once it is done, you can run the pandasai agent using the LLM locla model.

In the following, we will give two full codes for pandasai and LM Studio interaction:

import os
import pandas as pd
from pandasai import Agent
from pandasai import SmartDataframe
from pandasai.llm.local_llm import LocalLLM

# Load stock price data from a CSV file downloaded from Yahoo Finance
file_path = 'MSFT_updated.csv'

# Configure the local LLM to use LM Studio's local server
lm_studio_llm = LocalLLM(api_base="http://localhost:1234/v1")
stock_data= SmartDataframe("MSFT_updated.csv", config={"llm": lm_studio_llm})

# Initialize the trading agent with the stock data and the local LLM
agent = Agent(stock_data, config={"llm": lm_studio_llm})

print('before agent queries')
# Example trading queries
agent.chat("Forget all previous context and start a new conversation to analyse the agent.")
print(agent.chat('You are an expert in stock data analysis from Yahoo dataframe, search and report for the highest and lowest closing prices for the stock?'))
print(agent.chat('You are an expert in stock data analysis from Yahoo dataframe so calculate using pandas library and report the average trading volume for the stock?'))
response1 = agent.chat("You are an expert in stock data analysis from Yahoo dataframe so Using python library matplotlit Plot a graph of the stock price vs time then on the same plot Using python libraries pandas and matplotlib Plot a graph of the 100 days moving average of the stock price vs time and finally on the same plot add the 30 days moving average of the stock price vs time")
print(response1)

The outputs of this code will be the information fetched in the cvs file, for highest and lowest closing prices and the average trading volume:

The highest closing price is 467.55999755859375 and the lowest closing price is 130.37559509277344.
29481416.41791045

as well as the plot for the stock price and the moving averages:

Plot given by the trading agent using LM Studio local LLM model

The results are exactly what we could expect.

The second code is just a variation of the fist one with different agent queries:

import os
import pandas as pd
from pandasai import Agent
from pandasai import SmartDataframe
from pandasai.llm.local_llm import LocalLLM

# Load stock price data from a CSV file downloaded from Yahoo Finance
file_path = 'MSFT_updated.csv'

# Configure the local LLM to use LM Studio's local server
lm_studio_llm = LocalLLM(api_base="http://localhost:1234/v1")
stock_data= SmartDataframe("MSFT_updated.csv", config={"llm": lm_studio_llm})

# Initialize the trading agent with the stock data and the local LLM
agent = Agent(stock_data, config={"llm": lm_studio_llm})

print('before agent queries')
# Example trading queries
agent.chat("Forget all previous context and start a new conversation to analyse the agent.")
print(agent.chat('You are an expert in python coding for stock data analysis from Yahoo dataframe, write a python code using pandas library to return the highest and lowest closing prices for the stock and the the average trading volume for the stock'))

response1 = agent.chat("You are an expert in stoclk data analysis from Yahoo dataframe so Using python library matplotlit Plot a graph of the stock price vs time then on the same plot Using python libraries pandas and matplotlib Plot a graph of the 100 days moving average of the stock price vs time and finally on the same plot add the 30 days moving average of the stock price vs time")
print(response1)

As previously, we got the information fetched in the cvs file:

The highest closing price is 467.55999755859375, the lowest closing price is 130.37559509277344, and the average trading volume is 29481416.41791045.

Depending how we frame the query we get a different answer shape but the number we are looking for are the same.

The plot is as for the first code:

Plot given by the trading agent using LM Studio local LLM model

Conclusion:

In this work, we show how to use PandasAI with LM Studio. The LLM model follows the agent instructions, creating the code needed for extracting data and plotting graphs as requested by the prompts.

While the results can be satisfactory, they are heavily dependent on the choice of LLM model. Overall, the runs are slow, even when using an RTX-4090 GPU for laptop (16GB).

The main issue with local models is that if the quantization is low (Q4 or below) or the model is too small, the search in the CSV file may return incorrect results. The most common error occurs when asked for the highest and lowest prices; weaker models tend to look for the last and first prices (df[-1] and df[0]) instead of using max and min functions on the dataframe.

Another difficulty is getting three plots on the same graph, as weaker models would return three separate plots instead of one combined plot.

Overall, using PandasAI and LM Studio demonstrates both the potential and limitations of agents using local models.

References:

1- Pandasai documentation

2- Pandasai introduction

3- pandasai LLM models

https://docs.pandas-ai.com/llms

4- Lm Studio :

--

--