Step-by-Step: PDF Chatbots with Langchain and Ollama

Introduction

As technology reshapes our interaction with information, PDF chatbots introduce unmatched convenience and efficiency. This article explores the creation of a PDF chatbot with Langchain and Ollama, making open-source models easily accessible with minimal setup. Forget the hassle of complex framework choices and model configurations. Instead, discover how to install Ollama, download models, and build a PDF chatbot that intelligently responds to your queries. Join us as we dive into this powerful blend of tech and document processing, simplifying information retrieval like never before.




Learning Objectives

  • Step-by-step guide to installing Ollama on your computer
  • Instructions to download and run open-source models with Ollama
  • Walkthrough for creating a PDF chatbot using Langchain and Ollama

Prerequisites

To get the most out of this article, you'll need:

  • Proficiency in Python programming
  • Basic understanding of Langchain components (chains, vector stores, etc.)
  • Familiarity with command-line tools for installation and model management

What is Ollama?

Ollama provides a powerful and user-friendly platform for downloading and running open-source AI models locally on your machine. By automating model download and setup, Ollama saves you the hassle of configuring everything manually. Additionally, if you have a dedicated GPU, Ollama detects it automatically and uses GPU acceleration to improve model performance, without requiring any manual adjustments.

One of the standout features of Ollama is its simplicity in model customization. You can modify the model's behavior by adjusting the prompt directly, making it easier to tailor the AI to specific use cases without needing to integrate additional frameworks like Langchain.

For more advanced users, Ollama also offers a Docker image, allowing you to deploy your AI models as Docker containers. This makes it incredibly flexible and ideal for various deployment scenarios, whether you're running the models on a personal machine or in a cloud environment.

Ready to get started? Let’s explore the steps to install Ollama on your machine and begin using these powerful AI models with minimal setup.


How to Install Ollama?

Unfortunately, Ollama is only officially available for macOS and Linux. However, Windows users can still use Ollama by leveraging WSL2 (Windows Subsystem for Linux 2). WSL2 allows you to run a Linux environment on your Windows machine, enabling the installation of tools like Ollama that are typically exclusive to Linux or macOS. 

If you don't have WSL2 installed on your computer, follow these steps:

  • Enable WSL: Open PowerShell as Administrator and run the following command:

wsl --install

    This will install WSL2 and the default Linux distribution, which is typically Ubuntu. If you already have an older version of WSL, the command will upgrade it to version 2.

  • Check WSL2 Installation: After installation, verify that WSL2 has been properly installed by running:

wsl --list --verbose

  • Install a Linux Distribution (if Needed): If you haven't installed Ubuntu or another Linux distribution yet, you can do so by running:

wsl --install -d Ubuntu

Once WSL2 is set up and you have Ubuntu (or another Linux distribution) running, follow these steps to install Ollama: 

  • Open Ubuntu (WSL2): Open Ubuntu from the Start Menu or through Windows Terminal.

  • Run the Installation Command: In the Ubuntu terminal, run the following command to install Ollama:

curl https://ollama.ai/install.sh | sh

  • Verify Installation: After the installation is complete, verify that Ollama has been successfully installed by checking its version:

ollama --version

This will install Ollama on WSL2. If you're using macOS, you can find installation instructions on the Ollama website. Now that Ollama is installed, you're ready to download a model. Keep the terminal open, as we're not done yet!

Downloading a Model

After installing Ollama, you can easily download and run various AI models locally on your machine. Ollama supports a variety of models, including Llama2, Codellama, Orca-mini, and others, depending on your needs. To download a model, simply run a command like `ollama run orca-mini`, and the model will be downloaded and started automatically.

ollama run orca-mini

This command will download and run the orca-mini model in the terminal. Ensure that your computer has at least 8GB of RAM before running this model.

Ollama will also handle any necessary GPU acceleration if your system has a dedicated GPU. Once the model is running, you can interact with it directly through the terminal for tasks like text generation or question answering. Models are stored locally, so they don't need to be re-downloaded each time, and you can manage them by removing unused models to save space. Ollama also allows for customization of models, such as changing prompts, making it a flexible tool for various use cases.
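
With the model running locally, you can also query it from Python through Langchain's Ollama wrapper, the same class we will use for the chatbot later on. The snippet below is a minimal sanity check rather than a definitive recipe; it assumes Ollama is running locally with its default settings and that the orca-mini model has already been pulled as shown above.

# Quick sanity check: query the local orca-mini model through Langchain's Ollama wrapper.
# Assumes the Ollama service is running locally (default setup) and that
# `ollama run orca-mini` has already downloaded the model, as described above.
from langchain.llms import Ollama

llm = Ollama(model="orca-mini")

# Send a simple prompt and print the model's reply
print(llm("In one sentence, what is a vector store?"))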

Now that the model is set up and functioning properly, let’s proceed to create the PDF chatbot using it.

Creating the Chatbot

Set up the Project Directory

First, create the project directory and subdirectories, and set up a virtual environment.

# Create the project folder and navigate into it
mkdir pdf-chatbot
cd pdf-chatbot

# Create subdirectories for data, scripts, and logs
mkdir data scripts logs

# Set up a virtual environment
python3 -m venv venv

# Activate the virtual environment
source venv/bin/activate   # On Linux/MacOS
# .\venv\Scripts\activate   # On Windows

Installing Necessary Libraries

If you'd like to install the required libraries directly without creating a requirements.txt file, you can run the following pip command (PyPDF2 is included because the basic extraction function below relies on it):

pip install langchain pymupdf pypdf2 huggingface-hub faiss-cpu sentence-transformers

Creating Necessary Functions

Here are the key steps for creating the necessary functions for the PDF chatbot:

1. PDF Text Extraction Function:

   - Use `PyPDF2` to extract text from PDF files.

   - Implement function `extract_text_from_pdf(pdf_path)` to read the PDF and return its text.

import PyPDF2

def extract_text_from_pdf(pdf_path):
    """
    Extracts text from the provided PDF file.

    Args:
        pdf_path (str): Path to the PDF file.
    
    Returns:
        str: The extracted text from the PDF.
    """
    with open(pdf_path, 'rb') as file:
        reader = PyPDF2.PdfReader(file)
        text = ''
        for page in reader.pages:
            # extract_text() can return None for pages with no extractable text
            text += page.extract_text() or ''
    return text

2. Text Embedding for Storage:

   - Use Langchain’s FAISS vector store for embedding text.

   - Implement function `embed_text(text)` to embed the extracted text and store it in a FAISS vector store.

from langchain.vectorstores import FAISS
from langchain.embeddings import HuggingFaceEmbeddings

def embed_text(text):
    """
    Embeds the extracted text and stores it in a FAISS vector store for fast similarity search.

    Args:
        text (str): The text extracted from the PDF.
    
    Returns:
        FAISS: A FAISS vector store containing the embedded text.
    """
    # Use a local sentence-transformers model, so no API key is needed
    embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
    # from_texts accepts plain strings; from_documents would require Document objects
    vectorstore = FAISS.from_texts([text], embeddings)
    return vectorstore

3. Query Handling Function:

   - Create function `create_chatbot(pdf_path)` to load the PDF, extract text, embed it, and create a chatbot.

   - Implement `answer_query(query)` to answer user questions based on the embedded PDF content.

from langchain.chains.question_answering import load_qa_chain
from langchain.llms import Ollama
from scripts.pdf_parser import extract_text_from_pdf
from scripts.embed_text import embed_text

def create_chatbot(pdf_path):
    """
    Creates a chatbot based on the text extracted from the provided PDF file.

    Args:
        pdf_path (str): Path to the PDF file.
    
    Returns:
        callable: A function that answers queries based on the PDF content.
    """
    # Step 1: Extract text from the PDF
    pdf_text = extract_text_from_pdf(pdf_path)
    
    # Step 2: Embed the extracted text
    vectorstore = embed_text(pdf_text)

    # Step 3: Load the Ollama model
    llm = Ollama(model="orca-mini")  # Adjust the model if necessary
    qa_chain = load_qa_chain(llm, chain_type="map_reduce")

    def answer_query(query):
        """
        Answers the user's query based on the PDF content.

        Args:
            query (str): The question to ask.
        
        Returns:
            str: The answer to the query.
        """
        # Retrieve the most relevant document for the query
        relevant_document = vectorstore.similarity_search(query, k=1)

        # Get the answer using the QA chain
        answer = qa_chain.run(input_documents=relevant_document, question=query)
        return answer
    
    return answer_query

4. Main Function:

   - Implement an interactive loop in `main()` where the user can input queries and receive answers from the chatbot.

from scripts.chatbot import create_chatbot

def main():
    """
    Main function to interact with the PDF-based chatbot.
    """
    # Specify the PDF file path
    pdf_path = 'data/example.pdf'  # Adjust the path to your PDF file

    # Create the chatbot with the PDF content
    chatbot = create_chatbot(pdf_path)

    # Start chatting with the chatbot
    while True:
        query = input("Ask a question about the PDF (or type 'exit' to quit): ")
        if query.lower() == 'exit':
            break
        answer = chatbot(query)
        print("Answer:", answer)

if __name__ == "__main__":
    main()

5. Final Project Structure:

   - Organize the code into separate files for PDF parsing, embedding, chatbot creation, and the main interaction script.

pdf-chatbot/
│
├── data/                  # PDF files (e.g., example.pdf)
│
├── scripts/               # Python scripts
│   ├── pdf_parser.py      # PDF text extraction script
│   ├── embed_text.py      # Text embedding script
│   ├── chatbot.py         # Chatbot script
│   └── main.py            # Main script to run the chatbot
│
├── logs/                  # Log files (optional)
│
└── venv/                  # Virtual environment

Import the Necessary Packages

The steps outlined above provide a foundation for creating a basic PDF-based chatbot. However, if you’re looking to build a more sophisticated chatbot, additional features are necessary to enhance its functionality. These features could include incorporating memory to retain previous interactions, routing to direct specific queries to appropriate modules, and the ability to manage dynamic conversations. 

As the complexity increases, it becomes crucial to structure the code efficiently. To avoid redundancy and simplify testing, we will create dedicated functions for each key step of the process. This modular approach ensures that each part of the chatbot’s functionality is reusable and easily maintainable.

# Importing essential packages to build the PDF-based chatbot
from langchain.embeddings import HuggingFaceEmbeddings  # For creating text embeddings using Hugging Face models
from langchain.document_loaders import PyMuPDFLoader  # For loading and extracting text from PDF documents
from langchain.text_splitter import RecursiveCharacterTextSplitter  # To split large documents into smaller chunks for processing
from langchain.vectorstores import FAISS  # For efficient vector-based similarity search using FAISS
from langchain.chains import RetrievalQA  # For building the question-answering chain with document retrieval capabilities
import textwrap  # For text wrapping to format and display text in a readable manner


Next, we define our first function, which will be responsible for loading the PDF file. In this function, we will use the PyMuPDFLoader from Langchain to read and extract text from the PDF document.

# Function to load the PDF file
def load_pdf_data(file_path):
    # Initialize the PyMuPDFLoader with the provided file path
    loader = PyMuPDFLoader(file_path)
    
    # Load the content of the PDF file
    documents = loader.load()
    
    # Return the loaded document content
    return documents

To split the documents into multiple chunks, we’ll use the RecursiveCharacterTextSplitter from Langchain, a widely-used tool for text segmentation. This splitter divides the text based on character count while respecting logical boundaries within the text, ensuring each chunk is coherent and within the desired size limit.

from langchain.text_splitter import RecursiveCharacterTextSplitter

def split_docs(documents, chunk_size=1000, chunk_overlap=20):
    # Initialize the text splitter with the given chunk size and overlap
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap)
    
    # split_documents works on the Document objects returned by PyMuPDFLoader and
    # preserves their metadata, so the chunks can later be passed to FAISS.from_documents
    chunks = text_splitter.split_documents(documents)
    
    return chunks


To perform the embedding, we'll load the HuggingFaceEmbeddings model from Langchain and use FAISS (Facebook AI Similarity Search) to create a vector store. The model we'll use for embedding is all-MiniLM-L6-v2, which is optimized for generating dense embeddings suitable for tasks like semantic search.

from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS

# Function for loading the embedding model
def load_embedding_model(model_path, normalize_embedding=True):
    """
    Loads the Hugging Face embedding model with optional normalization.

    Parameters:
    - model_path (str): The path or name of the Hugging Face model.
    - normalize_embedding (bool): Whether to normalize the embeddings for cosine similarity.

    Returns:
    - HuggingFaceEmbeddings: The embedding model.
    """
    return HuggingFaceEmbeddings(
        model_name=model_path,
        model_kwargs={'device': 'cpu'},  # Using CPU for model inference
        encode_kwargs={'normalize_embeddings': normalize_embedding}  # Normalize embeddings for cosine similarity
    )

# Function for creating embeddings and saving them using FAISS
def create_embeddings(chunks, embedding_model, storing_path="vectorstore"):
    """
    Creates embeddings for text chunks and stores them in a FAISS vector store.

    Parameters:
    - chunks (list of str): List of text chunks to embed.
    - embedding_model: The model used to generate embeddings.
    - storing_path (str): Path to save the FAISS vector store.

    Returns:
    - vectorstore: The FAISS vector store containing the embeddings.
    """
    # Generate embeddings for the chunks using the embedding model
    vectorstore = FAISS.from_documents(chunks, embedding_model)
    
    # Save the vector store to the specified path
    vectorstore.save_local(storing_path)
    
    return vectorstore
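
The helper functions above are everything needed to turn a PDF into a searchable vector store. As a quick illustration, here is a minimal sketch of how they could be wired together; it assumes a PDF stored at data/example.pdf, matching the project layout shown earlier.

# Sketch: build a FAISS vector store from a PDF using the helper functions defined above.
# Assumes a PDF exists at data/example.pdf, as in the project structure shown earlier.

# Load the PDF into Langchain Document objects
documents = load_pdf_data("data/example.pdf")

# Split the documents into ~1000-character chunks with a small overlap
chunks = split_docs(documents, chunk_size=1000, chunk_overlap=20)

# Load the local embedding model and build (and save) the FAISS vector store
embedding_model = load_embedding_model("sentence-transformers/all-MiniLM-L6-v2")
vectorstore = create_embeddings(chunks, embedding_model, storing_path="vectorstore")

print(f"Indexed {len(chunks)} chunks from the PDF.")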


To customize how the orca-mini model responds, you can define a prompt template that guides it to answer in a specific way based on your use case.

from langchain.prompts import PromptTemplate

# Custom prompt template for Orca-Mini model
custom_prompt = """
You are an AI assistant with expertise in {expertise_area}. Your goal is to assist the user by providing detailed and clear responses to their queries. Be friendly, informative, and concise in your answers.

User's Question: {question}
Your Answer:
"""

# Define the prompt template using Langchain's PromptTemplate class
prompt_template = PromptTemplate(
    input_variables=["expertise_area", "question"],
    template=custom_prompt
)

# Example usage with a question and expertise area
expertise_area = "Python programming"
question = "How can I optimize a Python function for performance?"

# Generate the final prompt
final_prompt = prompt_template.format(expertise_area=expertise_area, question=question)

# Output the final prompt to see how it looks
print(final_prompt)


To create a Question Answering (QA) chain with Langchain using RetrievalQA, the first step is to prepare the documents by embedding them and storing them in a vector store like FAISS, ensuring that the documents are ready for efficient retrieval. Next, you set up the retriever, which searches the document store to find the most relevant information based on the input query and returns the relevant documents. 


Finally, the RetrievalQA chain combines the retriever with the model, passing the retrieved documents to the model to generate the final answer. This approach allows the model to answer questions based on the current query and the relevant documents, but it doesn't retain memory of previous interactions, meaning each query is treated independently.

from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS
from langchain.chains import RetrievalQA
from langchain.llms import Ollama  # Local orca-mini model served by Ollama
from langchain.docstore.document import Document

# Function to load the embedding model
def load_embedding_model(model_path, normalize_embedding=True):
    return HuggingFaceEmbeddings(
        model_name=model_path,
        model_kwargs={'device': 'cpu'},
        encode_kwargs={'normalize_embeddings': normalize_embedding}
    )

# Function to create a vector store using FAISS
def create_embeddings(chunks, embedding_model, storing_path="vectorstore"):
    vectorstore = FAISS.from_documents(chunks, embedding_model)
    vectorstore.save_local(storing_path)
    return vectorstore

# Initialize embedding model and vector store
embedding_model = load_embedding_model("sentence-transformers/all-MiniLM-L6-v2")

# Wrap the raw strings in Document objects, since FAISS.from_documents expects Documents
document_chunks = [
    Document(page_content=text)
    for text in ["This is the first document.", "Here is the second one.", "More document text."]
]
vectorstore = create_embeddings(document_chunks, embedding_model, storing_path="vectorstore")

# Initialize the retriever from the vector store
retriever = vectorstore.as_retriever()

# Initialize the QA model (orca-mini running locally via Ollama)
qa_model = Ollama(model="orca-mini", temperature=0)

# Create the RetrievalQA chain ("stuff" passes the retrieved documents directly to the model)
qa_chain = RetrievalQA.from_chain_type(llm=qa_model, chain_type="stuff", retriever=retriever)

# Example query
query = "What is the content of the first document?"

# Get the answer
response = qa_chain.run(query)

# Print the response
print(response)


Example Query and Response:

query = "What is the content of the first document?"
response = qa_chain.run(query)

# Output might be something like:
# "The first document contains text about an introduction to the system."


This setup allows you to implement a retrieval-based QA system where each question is answered based on the information in the document store, but with no memory of prior queries.
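
If you also need the chatbot to remember earlier turns of the conversation, one of the enhancements mentioned earlier, Langchain's ConversationalRetrievalChain can be combined with a memory object. Here is a minimal sketch, assuming the same qa_model and retriever as above; treat it as a starting point rather than a finished implementation.

# Sketch: adding conversational memory on top of the same retriever and model.
# Assumes qa_model and retriever are already defined as in the RetrievalQA example above.
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

# Stores the chat history so follow-up questions can refer back to earlier answers
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

chat_chain = ConversationalRetrievalChain.from_llm(
    llm=qa_model,
    retriever=retriever,
    memory=memory
)

# Follow-up questions now have access to the conversation so far
print(chat_chain.run("What is the content of the first document?"))
print(chat_chain.run("And how does it relate to the second one?"))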

Conclusion

I hope you now have a clear understanding of how to create a PDF chatbot using Langchain and Ollama. Ollama is a relative newcomer in this space, and it really makes life easier. You have already seen how we initialized the orca-mini model in just one line; without it, you would have to wire up a Hugging Face pipeline through Langchain instead.

Key Takeaways

  • Ollama Simplifies Model Deployment: Ollama simplifies the deployment of open-source models by providing an easy way to download and run them on your local computer.
  • PDF Chatbot Development: Learn the steps involved in creating a PDF chatbot, including loading PDF documents, splitting them into chunks, and creating a chatbot chain.
  • Customization for Better Responses: Understand how to customize prompts and templates to improve the responses of your chatbot.

You can find all the code used in this article here.