
Unlocking the Power of ChatGPT: A Guide to Answering Questions on Own Data

ChatGPT, built on OpenAI’s GPT (Generative Pre-trained Transformer) architecture, is incredibly adept at a wide range of natural language processing tasks. But despite its remarkable capabilities, it comes with a limitation when handling customized tasks – the input token size.

For instance, the input size of GPT-3.5 is capped at 4,096 tokens. This poses a problem when you need to ask questions about a text that far exceeds this limit, such as a book. Fine-tuning GPT on your private data is an option, but it’s a resource-intensive process, demanding both high computational power and a solid understanding of machine learning.

But what if I told you there’s an alternative? This guide will delve into the utilization of an external vector database to augment ChatGPT’s capabilities. This allows ChatGPT to access an ‘infinite external memory,’ greatly enhancing its potential for customized tasks.

The Approach

Our approach involves dividing your extensive text into bite-sized chunks that ChatGPT can handle and storing them in a vector database, where each chunk is indexed by an embedding vector. When you submit a question, it is converted into an embedding vector, which is used to retrieve the most relevant chunks from the database. These chunks are then combined with the original question and fed to ChatGPT, which generates your response.
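The chunking step can be sketched as follows. The chunk size and overlap values here are illustrative, not prescribed by the approach; overlapping chunks help preserve context that would otherwise be cut at chunk borders.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping chunks of roughly chunk_size characters."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step back by `overlap` to share context
    return chunks

# A long document becomes a list of manageable pieces:
chunks = chunk_text("word " * 1000, chunk_size=500, overlap=50)
```

In practice you might split on sentence or paragraph boundaries instead of raw character counts, so that no chunk ends mid-thought.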

Key Terms

Before we proceed, let’s familiarize ourselves with a couple of crucial terms:


Embedding

An embedding is, in essence, a vector – a list of numbers with machine-readable semantic meaning. For instance, if we represent the words “dog” and “cat” as vectors, these vectors would be mathematically close together because both are common pets. In our context, when a question is submitted, it is transformed into a question embedding. By comparing this vector to the ones representing the chunks in the database, we can retrieve the chunks most relevant to the question.
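“Closeness” between embeddings is commonly measured with cosine similarity. The tiny three-dimensional vectors below are made-up values for illustration only; real embedding models produce vectors with hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" (illustrative values, not from a real model):
dog = [0.9, 0.8, 0.1]
cat = [0.85, 0.75, 0.15]
car = [0.1, 0.2, 0.9]

# "dog" is closer to "cat" than to "car":
assert cosine_similarity(dog, cat) > cosine_similarity(dog, car)
```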

Vector Database

A vector database is a special type of database in which information is stored as vectors – arrays of numbers – with each number in the vector corresponding to a specific attribute or feature of the data. A query vector doesn’t need to match the stored vectors exactly; the database engine efficiently retrieves data indexed by similar vectors. In our context, it retrieves the chunks semantically related to your question.
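A minimal in-memory stand-in for a vector database can make the idea concrete. Production systems (e.g. Pinecone, Weaviate, or FAISS-based stores) use approximate nearest-neighbour indexes to scale to millions of vectors; this toy sketch simply does an exact linear scan, and its vectors are hand-picked for illustration rather than produced by an embedding model.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

class InMemoryVectorStore:
    """Toy vector store: exact linear-scan nearest-neighbour search."""

    def __init__(self):
        self.items = []  # list of (vector, chunk_text) pairs

    def add(self, vector, chunk):
        self.items.append((vector, chunk))

    def search(self, query_vector, top_k=2):
        """Return the top_k chunks most similar to the query vector."""
        scored = [(cosine_similarity(query_vector, v), chunk)
                  for v, chunk in self.items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [chunk for _, chunk in scored[:top_k]]

store = InMemoryVectorStore()
store.add([1.0, 0.0], "Chapter on dogs")
store.add([0.9, 0.1], "Chapter on cats")
store.add([0.0, 1.0], "Chapter on engines")

# A query vector near the "animal" direction retrieves the animal chapters:
results = store.search([0.95, 0.05], top_k=2)
```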


Putting the approach into practice involves the following steps:

  1. Preprocessing: Start by cleaning and preprocessing your data corpus, ensuring uniformity and eliminating extraneous information.
  2. Embedding Generation: Apply a pre-trained language model to compute embeddings for each document or textual passage within the corpus.
  3. Vector Database Storage: Store the computed embeddings in a vector database. This step will facilitate efficient searches based on similarity.
  4. Query Processing: For any user query, convert it into an embedding – essentially a sequence of numbers – using the same pre-trained language model.
  5. Similarity Search: Carry out a similarity search within the vector database to pinpoint the most relevant matches for the query embedding.
  6. Prompt Engineering: Skillfully create a prompt that integrates the user’s query with the retrieved text passages. This prompt will guide ChatGPT towards generating a pertinent and precise response.
  7. Response Generation: Present the engineered prompt to ChatGPT and obtain the answer it generates.
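The prompt-engineering and response steps (6 and 7) can be sketched as below. The prompt template is one reasonable choice, not a fixed recipe, and `ask_chatgpt` is a placeholder stub – a production version would send the prompt to OpenAI’s chat completion API instead.

```python
def build_prompt(question, retrieved_chunks):
    """Combine the retrieved passages with the user's question (step 6)."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

def ask_chatgpt(prompt):
    """Placeholder for step 7; a real version would call the OpenAI API."""
    return f"[model response to a {len(prompt)}-character prompt]"

chunks = ["Moby-Dick is narrated by Ishmael.", "The whale sinks the Pequod."]
prompt = build_prompt("Who narrates Moby-Dick?", chunks)
answer = ask_chatgpt(prompt)
```

Grounding the model in retrieved context this way also lets you instruct it to refuse when the answer isn’t in the provided passages, reducing fabricated answers.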

In conclusion, this guide outlines how to significantly expand ChatGPT’s potential for customized tasks using a vector database, breaking through the input token size limitation. With this solution, ChatGPT becomes an even more powerful tool, capable of analyzing extensive text data with relative ease and efficiency.


Shariq Hussain
