Knowledge Base
The KnowledgeBase
class is a static utility for creating, managing, and querying vectorized knowledge stores. Its primary purpose is to ingest information from various sources (files, URLs, text), process it into searchable chunks, and store it in a vector database (ChromaDB).
This class serves as the data foundation for the ChatWithData
class. You first populate a KnowledgeBase
store with information, and then a ChatWithData
instance can use that store to answer questions.
Since all methods in this class are static, you do not need to create an instance of it.
addKnowledge
Adds information from a source into a specific knowledge store. The function automatically detects the source type, processes the content, splits it into manageable chunks, creates vector embeddings, and stores them.
Parameters
store
: A string name for the knowledge store (e.g.,'product-manuals'
).knowledge
: The content to add. This can be:A URL string (e.g.,
'https://my-company.com/faq'
).A file path string (e.g.,
'/path/to/my-document.pdf'
). Supported extensions:pdf
,txt
,csv
,docx
,pptx
,html
.A plain text string.
A JSON object.
name
: A string name to identify this knowledge source, which is useful for later management (e.g.,'Product Manual v2'
).options
: An optional object for text splitting.chunkSize
: The maximum size of each text chunk. Defaults to2000
.chunkOverlap
: The number of characters to overlap between chunks. Defaults to400
.
Example 1: Adding knowledge from a PDF file
# store
'product-manuals'
# knowledge
'/path/to/files/ATR7000-manual.pdf'
# name
'ATR7000 Manual'
Example 2: Adding knowledge from a website
# store
'company-info'
# knowledge
'https://heisenware.com/about-us'
# name
'About Heisenware'
Example 3: Adding knowledge from a plain text string
# store
'faq'
# knowledge
'Question: What are the support hours? Answer: Support is available 24/7 via email.'
# name
'Support Hours FAQ'
Output
Returns true once the knowledge has been successfully added.
similaritySearch
Performs a similarity search against a knowledge store to find text chunks that are most relevant to a given query.
Parameters
store
: The name of the knowledge store.query
: The text query to search for.nDocs
: The maximum number of relevant documents (chunks) to return.
Output
An array of document objects relevant to the query.
getDocuments
Provides a list of all the named documents currently stored in a knowledge base.
Parameters
store
: The name of the knowledge store.
Output
An array of objects, where each object represents a document source you've added.
deleteDocument
Deletes all chunks associated with a specific named document from the knowledge store.
Parameters
store
: The name of the knowledge store.name
: The name of the document source to delete (the same name provided inaddKnowledge
).
getMetaData
Retrieves detailed metadata for all chunks stored in the knowledge base, useful for debugging.
Parameters
store
: The name of the knowledge store.options
: An optional object.showData
: Iftrue
, includes the raw text content of each chunk. Defaults tofalse
.
reset
Completely deletes all information in a given knowledge store.
Parameters
store
: The name of the knowledge store to reset.
Last updated