Introduction
Generative large language models (LLMs) like ChatGPT and Bard have taken the world by storm, giving people and businesses a conversational way to query vast amounts of data through chatbots.
This article walks you through, step by step and with code examples, how to use LangChain’s natural language processing capabilities to understand and process textual data, and how to integrate it with Streamlit to build a dynamic, user-friendly interface.
Understanding LangChain and Streamlit
LangChain is a framework that simplifies app development using LLMs. It is designed to understand human language and give appropriate responses and insights based on the query context. LangChain helps users interact with data using natural language queries, making data analysis more accessible to a broader range of users.
Streamlit is an open-source Python library that streamlines the creation of interactive web applications for data visualization. Its intuitive API lets even users without web development experience build and deploy interactive data apps effortlessly. It also allows the creation and sharing of powerful visualizations in a user-friendly format.
Project Setup
To start your development, create a virtual environment for the project. In this article, you will use Pipenv as follows:
pipenv install openai langchain streamlit streamlit_chat faiss-cpu tiktoken
This installs the necessary packages and creates a virtual environment all at once.
Once done, activate the virtual environment.
pipenv shell
Building a Chatbot
We will build a chatbot using a conversation CSV dataset from Kaggle.
In your project directory, create a new file and name it chatbot.py. In the file, add the following lines of code to import the previously installed libraries.
# Import necessary libraries and modules
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationalRetrievalChain
from langchain.document_loaders.csv_loader import CSVLoader
from langchain.vectorstores import FAISS
import streamlit as st
from streamlit_chat import message
import tempfile
Navigate to your OpenAI platform and create a new API key.
On the platform dashboard, click the profile icon on the top right of the page.
Next, click the Create new secret key button to get a new API key.
Enter a name for the key, or leave the field as it is.
You will then be shown your new secret key.
Copy and store the API key somewhere safe for future use, as it will not be shown again.
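As an alternative to typing the key into the app each time, you can store it in an environment variable and read it at startup. The snippet below is a minimal sketch of that pattern; `OPENAI_API_KEY` is the variable name the OpenAI client libraries conventionally look for:

```python
import os

# Read the key from the environment, with an empty fallback for local testing
api_key = os.environ.get("OPENAI_API_KEY", "")
key_is_set = bool(api_key)
```

If `key_is_set` is false, you can still fall back to the Streamlit sidebar input shown below.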
Load the OpenAI API Key
Set up the OpenAI API key input using Streamlit’s text_input method.
# Get the OpenAI API key from the user via the Streamlit sidebar
user_api_key = st.sidebar.text_input(
    label="OpenAI API key",
    placeholder="Enter your OpenAI API key",
    type="password")
Load the CSV File
Using Streamlit’s file uploader, you will enable the user to upload a CSV file to the application.
# Enable the user to upload a CSV file using the Streamlit file uploader
uploaded_file = st.sidebar.file_uploader("upload", type="csv")

if uploaded_file:
    # Use tempfile because CSVLoader only accepts a file_path
    with tempfile.NamedTemporaryFile(delete=False) as tmp_file:
        tmp_file.write(uploaded_file.getvalue())
        tmp_file_path = tmp_file.name

    # Load the CSV file using CSVLoader and store the data
    loader = CSVLoader(file_path=tmp_file_path, encoding="utf-8", csv_args={'delimiter': ','})
    data = loader.load()
You will use the CSVLoader class from LangChain to load and store the uploaded CSV file.
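CSVLoader turns each row of the file into one document whose text joins the column headers with their values. As a rough, dependency-free illustration of that row-to-document mapping (not LangChain’s actual implementation), consider:

```python
import csv
import io

def rows_to_documents(csv_text: str) -> list[str]:
    """Approximate how a CSV loader flattens rows into text chunks."""
    reader = csv.DictReader(io.StringIO(csv_text))
    docs = []
    for row in reader:
        # One text chunk per row: "header: value" pairs, newline-separated
        docs.append("\n".join(f"{key}: {value}" for key, value in row.items()))
    return docs

sample = "question,answer\nhi,hello\nbye,goodbye"
docs = rows_to_documents(sample)
```

Each chunk is what later gets embedded and stored in the vector store, so one CSV row becomes one retrievable unit.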
Retrieving the Conversational Model
You will use the provided API key to connect to the OpenAI GPT-3.5 model.
# Initialize embeddings, the vector store, and the conversational retrieval chain
embeddings = OpenAIEmbeddings(openai_api_key=user_api_key)
vectorstore = FAISS.from_documents(data, embeddings)
chain = ConversationalRetrievalChain.from_llm(
    llm=ChatOpenAI(temperature=0.0, model_name='gpt-3.5-turbo'),
    retriever=vectorstore.as_retriever()
)
The code above uses the API key to authenticate with OpenAI and builds a ConversationalRetrievalChain, which combines the user’s question with the conversation history to produce answers grounded in the uploaded CSV data.
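The chain expects the chat history as a list of (question, answer) tuples, which it uses to rephrase follow-up questions before retrieval. A plain-Python sketch of how that history accumulates, with a stubbed-out answer function standing in for the real chain call, looks like this:

```python
def fake_answer(question: str) -> str:
    # Stand-in for the real chain call, which would hit the OpenAI API
    return f"answer to: {question}"

chat_history: list[tuple[str, str]] = []

for question in ["What is in the CSV?", "How many rows?"]:
    answer = fake_answer(question)
    # Each completed turn is appended as a (question, answer) tuple
    chat_history.append((question, answer))
```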
Initialize Chatbot Session
Define a chat function that keeps the conversation history in Streamlit’s st.session_state['history'].
def chat(query):
    # Perform conversational chat using the retrieval chain
    result = chain({"question": query, "chat_history": st.session_state['history']})
    st.session_state['history'].append((query, result["answer"]))
    return result["answer"]
The function takes the user’s query as an argument and passes it, together with the stored chat history, to the chain. The chain ties together the components of a conversational chat system: the question, the retrieved context, and the history used to produce the answer.
Testing the Chatbot
Add code to check and initialize the session state variables.
# Initialize session state variables if they are not present
if 'history' not in st.session_state:
    st.session_state['history'] = []

if 'generated' not in st.session_state:
    st.session_state['generated'] = ["Hello! Ask me about the " + uploaded_file.name + "!"]

if 'past' not in st.session_state:
    st.session_state['past'] = ["Hello"]
This code sets up the initial messages in the chat. On the first run, when the session state is empty, the chatbot shows a “Hello” message on the user’s side and replies with “Hello! Ask me about the <uploaded file name>!”. On subsequent reruns the existing session state is kept, so the chatbot resumes from the last conversation.
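st.session_state behaves like a dictionary that persists across Streamlit reruns. The initialization pattern above can be sketched with a plain dict and setdefault, which only assigns a value when the key is missing:

```python
# A plain dict standing in for st.session_state across reruns
session_state: dict = {}

def init_state(state: dict, file_name: str) -> None:
    # setdefault leaves existing values untouched, like the 'not in' checks above
    state.setdefault("history", [])
    state.setdefault("generated", [f"Hello! Ask me about the {file_name}!"])
    state.setdefault("past", ["Hello"])

init_state(session_state, "data.csv")
# A second call (a "rerun") does not reset the conversation
session_state["history"].append(("q", "a"))
init_state(session_state, "data.csv")
```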
Containers Configurations
You will configure Streamlit containers to enhance the user interface. A Streamlit container is a feature that groups multiple elements together, helping you organize the app and build layouts more complex than the basic widget flow.
# Create containers for the chat history and the user input
response_container = st.container()
input_container = st.container()

with input_container:
    with st.form(key='my_form', clear_on_submit=True):
        # Retrieve user input via Streamlit text input
        user_input = st.text_input("Your query:", placeholder="Query your CSV data", key='input')
        submit_button = st.form_submit_button(label='Ask')

    if submit_button and user_input:
        # Call the chat function with user input and retrieve output
        output = chat(user_input)
        # Update session state with user input and generated output
        st.session_state['past'].append(user_input)
        st.session_state['generated'].append(output)

if st.session_state['generated']:
    # Display chat history in the response container
    with response_container:
        for i in range(len(st.session_state['generated'])):
            message(st.session_state["past"][i], is_user=True, key=str(i) + '_user', avatar_style="big-smile")
            message(st.session_state["generated"][i], key=str(i), avatar_style="thumbs")
The code above lays out the chat interface: the form collects the user’s query, the chat function generates a response, and the loop renders the full conversation as chat bubbles.
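The display loop relies on 'past' and 'generated' staying index-aligned: past[i] holds the user’s i-th message and generated[i] holds the bot’s reply to it. A small sketch of interleaving the two lists into a transcript, with hypothetical sample messages:

```python
past = ["Hello", "How many rows?"]
generated = ["Hello! Ask me about the data.csv!", "There are 42 rows."]

# Pair each user message with the matching bot reply, in display order
transcript = []
for user_msg, bot_msg in zip(past, generated):
    transcript.append(("user", user_msg))
    transcript.append(("bot", bot_msg))
```

This is why both lists are always appended to together in the form handler above.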
Once done, run the application using the command below to launch the chatbot app.
streamlit run chatbot.py
**Note: Make sure to indent all the code after the `if uploaded_file:` line to avoid errors.**