Building an intelligent WhatsApp bot for PDF Question-Answering using Langchain, Twilio, Python and ChatGPT

In today's digital age, messaging platforms have become integral to our daily communication. WhatsApp, with its vast user base and powerful features, offers an excellent opportunity to build intelligent chatbots that can provide instant and personalized assistance. In this article, we will explore how to use Flask, Twilio WhatsApp, Langchain and ChatGPT models to create an intelligent WhatsApp bot that answers questions about PDF files.

Twilio, a leading cloud communications platform, provides an array of tools and services to integrate messaging capabilities into your applications seamlessly. By combining Twilio's messaging API with Flask, a lightweight and flexible Python web framework, we can develop a robust backend for our WhatsApp bot.

To tackle the challenge of processing PDF files and extracting relevant information, we will turn to language models and natural language processing. OpenAI's advanced language models, such as GPT-3.5 Turbo, offer state-of-the-art capabilities for understanding and generating human-like text. By using OpenAI models through Langchain, we can transform our bot into a knowledgeable assistant that can answer questions based on the contents of uploaded PDF files.

Throughout this article, I will guide you step-by-step, drawing inspiration from Twilio's comprehensive publication guides. You will learn how to set up your development environment, handle incoming messages from WhatsApp, process PDF files, generate document embeddings, and perform question-answering tasks using Twilio, Flask, and powerful language models.

By the end of this tutorial, you will have a fully functional WhatsApp bot capable of providing accurate and insightful answers to questions posed by users. This opens up exciting possibilities for customer support, information retrieval, and automation, empowering businesses and individuals with an intelligent conversational agent at their fingertips.

So, let's dive into the world of Flask, Twilio, OpenAI, and Langchain and embark on a journey to build a remarkable WhatsApp bot that revolutionizes the way we interact with PDFs and obtain instant knowledge.

Prerequisites

  • Python: Make sure you have Python installed on your system.

  • Twilio Account: In order to send and receive messages, it is necessary to have a Twilio account. You will require an AUTH_TOKEN, Twilio phone number and ACCOUNT_SID for this purpose.

  • OpenAI API Key: To use OpenAI's GPT-3.5 model, you need an API key.

  • Ngrok: Ngrok is a tool that allows us to expose our local Flask server to the internet.

Install dependencies

To get started you'll need the following packages:

  • dotenv Library: Allows you to load environment variables in your app. Install using pip install python-dotenv

  • Langchain: Allows you to use multiple tools for building AI powered apps. Install using pip install langchain

  • PyPDF2 Library: You will use the PyPDF2 library to read and extract text from PDF files. Install using pip install PyPDF2

  • Twilio Python Library: Allows you to interact with the Twilio API in Python. Install using pip install twilio

  • OpenAI: Enables you to have access to OpenAI's GPT models. Install using pip install openai

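After installing these packages, you might be prompted to install additional ones before the app will run. In particular, the code in this article also imports flask and requests, and Langchain's FAISS vector store needs a FAISS package such as faiss-cpu.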
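Before moving on, it can help to confirm that the packages listed above are importable. The following is a minimal sketch using only the standard library. The helper name missing_modules and the REQUIRED_MODULES list are assumptions made for illustration; note that python-dotenv is imported under the name dotenv.

```python
from importlib.util import find_spec

# Top-level import names assumed for the packages listed above.
# python-dotenv is imported as "dotenv"; adjust the list to your setup.
REQUIRED_MODULES = ["dotenv", "langchain", "PyPDF2", "twilio", "openai"]


def missing_modules(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules are importable.")
```

Running the file directly prints whichever of the listed modules still need to be installed.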

Setting up Twilio Account

To illustrate the process, you will configure your Twilio account for WhatsApp using the Twilio Sandbox for WhatsApp. Open the WhatsApp Sandbox in your Twilio Console by navigating to the Messaging section on the left sidebar (if you don't see it, click on Explore Products to reveal the product list, where you can find Messaging). Next, expand the "Try it out" dropdown and select "Send a WhatsApp message" from the options. You will then see this:

Twilio sandbox

Scan the QR code and a "join disease-see" message will appear in your WhatsApp. Send it, and your WhatsApp number is connected to the sandbox.

Twilio sandbox

Setting up our coding environment

We will then go into our code editor and create an app.py file and a .env file. In our .env file we will go ahead and include the following:

TWILIO_ACCOUNT_SID = xxxxxxxx
TWILIO_AUTH_TOKEN = xxxxxxxx
TWILIO_PHONE_NUMBER = xxxxxxxx
OPENAI_API_KEY = xxxxxxxx

You should have obtained each of these values while working through the prerequisites; substitute them in place of xxxxxxxx.
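A missing or blank value in the .env file tends to surface later as a confusing authentication error, so it may be worth checking the settings up front. The sketch below assumes the four variable names shown above; the helper missing_settings is a hypothetical name introduced here, and it reads from os.environ, so it would be called after load_dotenv() has run.

```python
import os

# Names taken from the .env file shown above.
REQUIRED_SETTINGS = (
    "TWILIO_ACCOUNT_SID",
    "TWILIO_AUTH_TOKEN",
    "TWILIO_PHONE_NUMBER",
    "OPENAI_API_KEY",
)


def missing_settings(env=None):
    """Return the required setting names that are unset or blank in `env`.

    `env` defaults to the process environment (os.environ).
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not (env.get(name) or "").strip()]
```

For example, the app could raise an error at startup if missing_settings() returns a non-empty list.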

Building the Bot

Now that you've added all your tokens to your .env file and created an app.py file, go ahead and import all your dependencies:

import os
import tempfile

import requests  # used later to download the uploaded PDF
from dotenv import load_dotenv
from flask import Flask, request
from langchain.chains.question_answering import load_qa_chain
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from PyPDF2 import PdfReader
from twilio.rest import Client
from twilio.twiml.messaging_response import MessagingResponse

app = Flask(__name__)


# Twilio calls this route whenever a WhatsApp message arrives
@app.route("/message", methods=["POST", "GET"])
def message():
    return "Hello, world"

if __name__ == "__main__":
    app.run(debug=True)

The code above is the bare-bones skeleton of the PDF Q&A bot. Each import has a role:

  • MessagingResponse builds message replies for Twilio.

  • Client gives access to our Twilio account.

  • dotenv loads the environment variables.

  • RecursiveCharacterTextSplitter splits the text of the uploaded PDF into chunks.

  • OpenAIEmbeddings creates embeddings for those chunks.

  • requests downloads our PDF file from Twilio.

  • tempfile creates a temporary file to store the uploaded PDF.

  • PdfReader reads data from the uploaded PDF file.

  • FAISS creates a vector store used to find the parts of your PDF most similar to your question.

  • load_qa_chain creates the question-answering chain.

Next, add this code to the body of the message() function, in place of the placeholder return statement:

    # Load the values from .env into the environment
    load_dotenv()
    OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')
    account_sid = os.getenv('TWILIO_ACCOUNT_SID')
    auth_token = os.getenv('TWILIO_AUTH_TOKEN')
    client = Client(account_sid, auth_token)
    twilio_phone_number = os.getenv('TWILIO_PHONE_NUMBER')
    # Details of the incoming WhatsApp message
    sender_phone_number = request.values.get('From', '')
    pdf_url = request.values.get('MediaUrl0')
    media_content_type = request.values.get('MediaContentType0')
    response = None

This code loads the environment variables and uses them to create the Twilio client. It also extracts details of the incoming message: the sender's number, the content type of any attached media, and the media URL. The PDF itself is not part of the request; it is stored remotely (in an S3 bucket) and will be downloaded from this URL later.
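The message fields read above can also be gathered into one small object, which makes the later branches easier to follow and to test without a live webhook. This is a sketch under assumptions: IncomingMessage and parse_incoming are hypothetical names, and the function accepts any mapping with a get method (such as Flask's request.values or a plain dict).

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class IncomingMessage:
    sender: str                        # e.g. "whatsapp:+10000000000"
    body: str                          # text of the message, may be empty
    media_url: Optional[str]           # URL of the first attachment, if any
    media_content_type: Optional[str]  # MIME type of the first attachment


def parse_incoming(values: Mapping[str, str]) -> IncomingMessage:
    """Pick out the webhook fields the bot needs from a form-like mapping."""
    return IncomingMessage(
        sender=values.get("From", ""),
        body=values.get("Body", ""),
        media_url=values.get("MediaUrl0"),
        media_content_type=values.get("MediaContentType0"),
    )
```

Inside the route this could be called as parse_incoming(request.values); in a test, a plain dictionary works just as well.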

Now add these variables above the @app.route decorator; they act as global variables that are accessed later.

pdf_exists = False
VectorStore = None

Generating responses from the PDF

Receiving PDF

    if media_content_type == 'application/pdf':
        global pdf_exists, VectorStore
        pdf_exists = True
        # Download the PDF from the media URL supplied by Twilio
        pdf_response = requests.get(pdf_url)
        with tempfile.NamedTemporaryFile(suffix='.pdf', delete=False) as temp_file:
            temp_file.write(pdf_response.content)
            temp_file_path = temp_file.name
        # Read the PDF only after the with block has closed (and flushed) the file
        pdf = PdfReader(temp_file_path)
        text = ""
        for page in pdf.pages:
            # extract_text() can yield nothing for image-only pages
            text += page.extract_text() or ""
        # Split into chunks of up to 1000 characters, overlapping by 200
        text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=1000,
            chunk_overlap=200,
            length_function=len
        )
        chunks = text_splitter.split_text(text=text)
        # Embed each chunk and index the vectors in FAISS
        embeddings = OpenAIEmbeddings()
        VectorStore = FAISS.from_texts(chunks, embedding=embeddings)
        response = "Received. You can now ask your questions."

First, the code checks whether a PDF file has been received. If so, it sets the global variable pdf_exists to True, downloads the file from the PDF's URL, and writes it to a temporary file. It then reads the text of every page and splits the combined text into chunks of up to 1000 characters, with an overlap of 200 characters between consecutive chunks (because length_function is len, sizes are measured in characters, not words).
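To see what chunk_size and chunk_overlap mean in practice, here is a deliberately simplified sketch of fixed-size chunking with overlap. It is not the algorithm RecursiveCharacterTextSplitter uses (that splitter also tries to break on separators such as paragraphs and sentences); chunk_text is a hypothetical helper written only to illustrate the two parameters.

```python
def chunk_text(text, chunk_size, chunk_overlap):
    """Split `text` into character windows of `chunk_size`,
    each starting `chunk_size - chunk_overlap` characters after the previous one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks
```

With a chunk size of 4 and an overlap of 2, the string "abcdefghij" becomes "abcd", "cdef", "efgh", "ghij": each chunk repeats the last two characters of the one before it, which is how overlap keeps context from being cut off at a chunk boundary.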

Afterwards, the code uses the OpenAIEmbeddings class to generate an embedding for each chunk. These embeddings are then indexed in a FAISS vector store, which is kept in the global variable VectorStore. Finally, a notification message is sent to indicate that the bot is ready to answer questions related to the PDF.
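To make the role of the embeddings and the vector store concrete, the toy sketch below ranks a few hand-written vectors by their similarity to a query vector. Everything here is an assumption for illustration: real embeddings have far more dimensions, the names cosine_similarity and top_k are hypothetical, and cosine similarity is just one common choice of metric, not necessarily the one the FAISS index uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, indexed, k=3):
    """Return the k texts whose vectors are most similar to `query_vector`.

    `indexed` is a list of (text, vector) pairs.
    """
    ranked = sorted(
        indexed,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]
```

A vector store does the same job at scale: it holds one vector per chunk and returns the chunks whose vectors lie closest to the vector of the question.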

Receiving text

    elif not pdf_url:
        # No media attached, so treat the message body as a question
        question = request.values.get('Body')
        if pdf_exists:
            # Retrieve the 3 chunks most similar to the question
            docs = VectorStore.similarity_search(query=question, k=3)
            llm = OpenAI(model_name="gpt-3.5-turbo", temperature=0.4)
            # "stuff" places all retrieved chunks into a single prompt
            chain = load_qa_chain(llm, chain_type="stuff")
            answer = chain.run(input_documents=docs, question=question)
            message = client.messages.create(
                body=answer,
                from_=twilio_phone_number,
                to=sender_phone_number
            )
            return str(message.sid)
        else:
            response = "No PDF file uploaded."

The provided code begins by checking if a text was received. If a text was indeed received, it further checks if a PDF file was previously sent by examining the variable pdf_exists. Following this, the code utilizes the VectorStore to search for similar texts based on the question provided. It then employs the gpt-3.5-turbo model to generate an answer based on the retrieved information. The generated answer is subsequently sent as a message, and the message SID (unique identifier) is returned.

However, if a text was sent but no PDF file was uploaded beforehand, the code sends a response stating "No PDF file uploaded."

Receiving an invalid format

    else:
        # Fallback: the message did not match either branch above
        print(media_content_type)
        response = "The media content type is not application/pdf"
    # Send whichever response was set by the branches above
    message = client.messages.create(
        body=response,
        from_=twilio_phone_number,
        to=sender_phone_number
    )

    return str(message.sid)

If the conditions in the if and elif statements are not met, the code will respond with a message stating "The media content type is not application/pdf."
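The three cases described in this section can be summarized as one small decision function. This is a sketch of one reading of the intended behaviour, not the route itself: choose_action and the action names it returns are hypothetical, and it assumes that a message without an attachment has no media content type.

```python
from typing import Optional


def choose_action(media_content_type: Optional[str], pdf_exists: bool) -> str:
    """Map an incoming message to the branch that should handle it."""
    if media_content_type == "application/pdf":
        # A PDF was attached: (re)build the vector store from it
        return "index_pdf"
    if media_content_type is None:
        # Plain text: answer it only if a PDF has already been indexed
        return "answer_question" if pdf_exists else "ask_for_pdf"
    # Some other attachment, such as an image or audio file
    return "reject_media"
```

Keeping the decision separate from the Twilio and OpenAI calls means the branching can be checked with plain assertions before any message is sent.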

Running our bot

Now go to your terminal and run python app.py; the app will be running on localhost:5000. Then, in a second terminal, run ngrok http 5000 to expose the server so that Twilio can deliver WhatsApp messages to it. You should see something like this:

Ngrok

Now copy the circled forwarding link, go back to your Twilio sandbox settings, and paste it there with /message appended, since that is the route the Flask app listens on.

Ngrok Image
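If messages do not seem to reach the bot, it can be useful to call the route directly and rule out the Flask side. The sketch below only builds a form-encoded POST resembling a text-only webhook call; build_webhook_request is a hypothetical helper, and the two form fields are the ones the route reads for a text message.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_webhook_request(base_url, sender, body):
    """Build a form-encoded POST to /message resembling a text-only webhook call."""
    form = urlencode({"From": sender, "Body": body}).encode("utf-8")
    url = base_url.rstrip("/") + "/message"
    return Request(url, data=form, method="POST")
```

Passing the returned request to urllib.request.urlopen while app.py is running exercises the route end to end. Keep in mind that the route then makes its real Twilio (and possibly OpenAI) calls, so valid credentials and a sender number that has joined the sandbox are needed.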

There you have it: you can now upload PDFs to your WhatsApp bot and ask it questions.

Whatsapp Chat

Whatsapp Chat

Conclusion

In conclusion, you have built a WhatsApp bot and seen how each part of its code works: how it detects PDF files, retrieves and processes their contents, generates embeddings, and uses a language model to answer questions. You have also seen which response messages are sent when a question arrives before any PDF has been uploaded or when an unsupported file type is received. With these details in hand, you should have a clear picture of how the bot behaves in each scenario.

Happy Building!!