Building an intelligent WhatsApp bot for PDF Question-Answering using Langchain, Twilio, Python and ChatGPT
In today's digital age, messaging platforms have become integral to our daily communication. WhatsApp, with its vast user base and powerful features, offers an excellent opportunity to build intelligent chatbots that can provide instant and personalized assistance. In this article, we will explore how to leverage the power of Flask, Twilio WhatsApp, Langchain and ChatGPT models to create an intelligent WhatsApp bot that excels in PDF question-answering.
Twilio, a leading cloud communications platform, provides an array of tools and services to integrate messaging capabilities into your applications seamlessly. By combining Twilio's messaging API with Flask, a lightweight and flexible Python web framework, we can develop a robust backend for our WhatsApp bot.
To tackle the challenge of processing PDF files and extracting relevant information, we will dive into the realms of language models and natural language processing. OpenAI's advanced language models, such as GPT-3.5 Turbo, offer state-of-the-art capabilities for understanding and generating human-like text. By harnessing the power of OpenAI models through Langchain, we can transform our bot into a knowledgeable assistant that can answer questions based on the contents of uploaded PDF files.
Throughout this article, I will guide you step-by-step, drawing inspiration from Twilio's comprehensive publication guides. You will learn how to set up your development environment, handle incoming messages from WhatsApp, process PDF files, generate document embeddings, and perform question-answering tasks using Twilio, Flask, and powerful language models.
By the end of this tutorial, you will have a fully functional WhatsApp bot capable of providing accurate and insightful answers to questions posed by users. This opens up exciting possibilities for customer support, information retrieval, and automation, empowering businesses and individuals with an intelligent conversational agent at their fingertips.
So, let's dive into the world of Flask, Twilio, OpenAI, and Langchain and embark on a journey to build a remarkable WhatsApp bot that revolutionizes the way we interact with PDFs and obtain instant knowledge.
Prerequisites
Python: Make sure you have Python installed on your system.
Twilio Account: In order to send and receive messages, it is necessary to have a Twilio account. You will require an AUTH_TOKEN, Twilio phone number and ACCOUNT_SID for this purpose.
OpenAI API Key: To use OpenAI's GPT-3.5 model, you need an API key.
Ngrok: Ngrok is a tool that allows us to expose our local Flask server to the internet.
Install dependencies
To get started you'll need the following packages:
dotenv Library: Allows you to load environment variables in your app. Install using
pip install python-dotenv
Langchain: Allows you to use multiple tools for building AI powered apps. Install using
pip install langchain
PyPDF2 Library: You will use the PyPDF2 library to read and extract text from PDF files. Install using
pip install PyPDF2
Twilio Python Library: Allows you to interact with the Twilio API in Python. Install using
pip install twilio
OpenAI: Enables you to have access to OpenAI's GPT models. Install using
pip install openai
However, after installing these packages, you might be prompted to install additional ones in order to run the app.
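If you prefer, you can install everything in one command. Note that the code in this article also imports Flask and Requests, so they are included here as well:

```shell
pip install flask requests python-dotenv langchain PyPDF2 twilio openai
```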
Setting up Twilio Account
To illustrate the process, you will configure your Twilio account to utilize WhatsApp by utilizing the Twilio Sandbox for WhatsApp. Access the WhatsApp Sandbox within your Twilio Console by navigating to the Messaging section on the left sidebar (if you don't see it, click on Explore Products to reveal the product list, where you can find Messaging). Next, expand the "Try it out" dropdown and select "Send a WhatsApp message" from the options. You will then see this:
Next, scan the QR code with your phone. WhatsApp will open with a pre-filled join message (something like "join disease-see"; the two-word code is specific to your sandbox). Send it, and just like that, you're connected to the sandbox.
Setting up our coding environment
Next, open your code editor and create an app.py file and a .env file. In the .env file, include the following:
TWILIO_ACCOUNT_SID = xxxxxxxx
TWILIO_AUTH_TOKEN = xxxxxxxx
TWILIO_PHONE_NUMBER = xxxxxxxx
OPENAI_API_KEY = xxxxxxxx
Substitute the credentials you gathered in the prerequisites in place of xxxxxxxx.
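Under the hood, python-dotenv simply reads KEY=VALUE pairs from this file into the process environment. Here is a rough pure-Python sketch of the idea (the real library also handles quoting, comments, variable expansion, and more):

```python
import os

def load_env_file(path=".env"):
    """Minimal sketch of what python-dotenv's load_dotenv() does:
    read KEY=VALUE lines from a file into os.environ."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments, and malformed lines
            key, _, value = line.partition("=")
            # like load_dotenv's default, don't override existing variables
            os.environ.setdefault(key.strip(), value.strip())
```

After calling load_dotenv() (or this sketch), os.getenv('TWILIO_ACCOUNT_SID') returns the value stored in the file.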
Building the Bot
Now that you've added all your tokens to the .env file and created an app.py file, go ahead and import all the dependencies:
from flask import Flask, request
import os
import requests
from twilio.twiml.messaging_response import MessagingResponse
from twilio.rest import Client
from dotenv import load_dotenv
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings.openai import OpenAIEmbeddings
import tempfile
from PyPDF2 import PdfReader
from langchain.vectorstores import FAISS
from langchain.llms import OpenAI
from langchain.chains.question_answering import load_qa_chain
app = Flask(__name__)

@app.route("/message", methods=["POST", "GET"])
def message():
    return "Hello, world"

if __name__ == "__main__":
    app.run(debug=True)
The above is the bare bones of the PDF Q&A bot. A quick tour of the imports: MessagingResponse sends messages through Twilio; Client gives access to your Twilio account; dotenv loads your environment variables; RecursiveCharacterTextSplitter splits the text extracted from the uploaded PDF; OpenAIEmbeddings creates embeddings from those text chunks; requests fetches the PDF file from Twilio; tempfile creates a temporary file to store the uploaded PDF; PdfReader reads the data from the uploaded PDF; FAISS builds a vector store for finding text in your PDF similar to your questions; and load_qa_chain creates the question-answering chain.
Next, append this code inside the message() function:
load_dotenv()
OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')
account_sid = os.getenv('TWILIO_ACCOUNT_SID')
auth_token = os.getenv('TWILIO_AUTH_TOKEN')
client = Client(account_sid, auth_token)
twilio_phone_number = os.getenv('TWILIO_PHONE_NUMBER')
sender_phone_number = request.values.get('From', '')
pdf_url = request.values.get('MediaUrl0')
media_content_type = request.values.get('MediaContentType0', '')
response = None
This code establishes a connection to the Twilio client using the credentials from your environment variables, and extracts key information from the incoming request: the sender's phone number, the media content type, and the URL of the attached PDF. Twilio stores uploaded media in an S3 bucket, so the file will be fetched from that URL later.
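For context, the webhook request Twilio sends is a form-encoded POST. The field names below are real Twilio webhook parameters, but the values shown are purely illustrative:

```python
# Illustrative shape of the form fields Twilio POSTs to the webhook.
# Field names are real Twilio parameters; the values here are made up.
example_form = {
    "From": "whatsapp:+15551234567",         # sender's WhatsApp number
    "Body": "What is this document about?",  # message text, if any
    "NumMedia": "1",                         # number of attached media items
    "MediaUrl0": "https://api.twilio.com/example-media-url",  # first attachment
    "MediaContentType0": "application/pdf",  # MIME type of first attachment
}

def extract_fields(form):
    """Pull out the same values the webhook handler reads."""
    sender = form.get("From", "")
    pdf_url = form.get("MediaUrl0")
    media_type = form.get("MediaContentType0", "")
    return sender, pdf_url, media_type
```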
Next, add these variables above the @app.route decorator; they will act as global variables to be accessed later.
pdf_exists = False
VectorStore = None
Generating responses from the PDF
Receiving PDF
if media_content_type == 'application/pdf':
    global pdf_exists, VectorStore
    pdf_exists = True
    # Fetch the PDF from Twilio's media URL; use a separate name so we
    # don't clobber the `response` reply text
    pdf_response = requests.get(pdf_url)
    with tempfile.NamedTemporaryFile(suffix='.pdf', delete=False) as temp_file:
        temp_file.write(pdf_response.content)
        temp_file_path = temp_file.name
    # Extract the text from every page
    pdf = PdfReader(temp_file_path)
    text = ""
    for page in pdf.pages:
        text += page.extract_text()
    # Split the text into overlapping chunks
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=1000,
        chunk_overlap=200,
        length_function=len
    )
    chunks = text_splitter.split_text(text=text)
    # Embed the chunks and index them in a FAISS vector store
    embeddings = OpenAIEmbeddings()
    VectorStore = FAISS.from_texts(chunks, embedding=embeddings)
    response = "Received. You can now ask your questions."
Firstly, the code verifies whether a PDF file has been received. If so, the global variable pdf_exists is set to True. Next, the code sends a request to the PDF's URL, retrieves the file, stores it in a temporary file, and reads its contents. It then iterates through the pages of the PDF and splits the full text into chunks of 1,000 characters with an overlap of 200 characters (the chunk size counts characters, not words, since length_function is len).
Afterwards, the code utilizes the OpenAIEmbeddings
function to generate embeddings for the text segments. These embeddings are then passed into a VectorStore. Finally, a notification message is sent to indicate that the code is ready to answer questions related to the PDF.
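To see what the splitter is doing conceptually, here is a simplified sliding-window chunker. This is not langchain's actual algorithm (RecursiveCharacterTextSplitter splits on separators like paragraphs and sentences recursively), but it illustrates how chunk_size and chunk_overlap interact:

```python
def simple_chunks(text, chunk_size=1000, chunk_overlap=200):
    """Naive illustration of overlapping chunks: each chunk starts
    chunk_size - chunk_overlap characters after the previous one."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

With chunk_size=1000 and chunk_overlap=200, consecutive chunks share 200 characters, so an answer that straddles a chunk boundary is still captured whole in at least one chunk.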
Receiving text
elif request.values.get('Body'):
    question = request.values.get('Body')
    if pdf_exists:
        docs = VectorStore.similarity_search(query=question, k=3)
        llm = OpenAI(model_name="gpt-3.5-turbo", temperature=0.4)
        chain = load_qa_chain(llm, chain_type="stuff")
        answer = chain.run(input_documents=docs, question=question)
        message = client.messages.create(
            body=answer,
            from_=twilio_phone_number,
            to=sender_phone_number
        )
        return str(message.sid)
    else:
        response = "No PDF file uploaded."
The provided code begins by checking if a text was received. If a text was indeed received, it further checks if a PDF file was previously sent by examining the variable pdf_exists. Following this, the code utilizes the VectorStore to search for similar texts based on the question provided. It then employs the gpt-3.5-turbo model to generate an answer based on the retrieved information. The generated answer is subsequently sent as a message, and the message SID (unique identifier) is returned.
However, if a text was sent but no PDF file was uploaded beforehand, the code sends a response stating "No PDF file uploaded."
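In essence, similarity_search embeds the question and returns the k chunks whose embedding vectors are closest. Here is a toy version using cosine similarity over made-up 3-dimensional "embeddings" (real OpenAI embeddings have 1,536 dimensions, and FAISS defaults to L2 distance, but the idea is the same):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec, chunk_vecs, k=3):
    """Return the indices of the k chunk vectors most similar to the query."""
    ranked = sorted(range(len(chunk_vecs)),
                    key=lambda i: cosine_similarity(query_vec, chunk_vecs[i]),
                    reverse=True)
    return ranked[:k]
```

The retrieved top-k chunks are then "stuffed" into the prompt (that is what chain_type="stuff" means), and the language model answers using only that context.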
Receiving an invalid format
else:
    response = "The media content type is not application/pdf"

# Send whichever response string was set above
message = client.messages.create(
    body=response,
    from_=twilio_phone_number,
    to=sender_phone_number
)
return str(message.sid)
If the conditions in the if and elif statements are not met, the code will respond with a message stating "The media content type is not application/pdf."
Running our bot
Now go to your terminal and run python app.py; the app will be running on localhost:5000. Then expose it with Ngrok by running ngrok http 5000 so that you can send and receive WhatsApp messages. You should see something like this:
Now, copy the HTTPS forwarding URL from the Ngrok output, go back to your Twilio Sandbox settings, and paste it into the "When a message comes in" field, appending the /message route (for example, https://<your-subdomain>.ngrok.io/message).
There you have it: you can now upload PDFs to your WhatsApp bot and ask it questions.
Conclusion
In conclusion, you have seen how the bot works end to end: how it verifies incoming PDF files, retrieves and processes their contents, generates embeddings, and uses a language model to answer questions, as well as the response messages that are triggered when certain requirements are not met. With these details in hand, you have a clear picture of the code's behavior across the different scenarios.
Happy Building!!