The Tembo VectorDB Stack is a platform that simplifies the process of working with embeddings in Postgres. It provides tools to automate the process of generating, managing, and working with embeddings from your existing data, giving you vector search capabilities on day one.
Technical specifications
Extensions
- pg_stat_statements provides statistics on SQL statements executed by the database. It helps users analyze query performance and identify areas for optimization.
- pg_vectorize provides a simple interface to generate embeddings from text, store them in Postgres, and then search for similar vectors using pgvector.
- pgvector is a vector similarity search engine for Postgres. It is used to store embeddings, create indexes, and conduct vector searches on that data. pg_vectorize relies on pgvector for indexes and similarity search.
- pgmq is used by pg_vectorize as a job queue that manages and separates the calculation of embeddings from the source data.
- pg_cron is used by pg_vectorize to schedule recurring updates to embeddings.
- Extensions from Trunk can be installed on-demand.
Container services
The VectorDB Stack deploys a container in the same Kubernetes namespace as your Postgres database to host text embedding models.
When embeddings need to be computed, pg_vectorize sends an HTTP request to this container. The container hosts any SentenceTransformers model, as well as any private models uploaded to Hugging Face. Models are downloaded on-demand and cached for all subsequent requests.
The container is private to your Tembo instance.
Additional technical specs of this container can be found in the VectorDB Stack Specification.
Getting started
The VectorDB Stack comes pre-configured for building applications that require embeddings.
The fastest way to build applications on text embeddings is to use the pg_vectorize Postgres extension. It provides a consistent interface for generating embeddings from many common text embedding model sources, including OpenAI and Hugging Face, as well as for searching the embeddings and keeping them up-to-date.
The general flow is to first call vectorize.table() on your source data table to initialize the process; this can be done in a SQL migration. It configures pg_vectorize to generate embeddings from data in that table, keep track of which transformer model was used, and watch for updates to the table so embeddings stay current.
Then, call vectorize.search() to find the embeddings most similar to a query and return the corresponding source data. The extension handles transforming the query into an embedding and searching for similar embeddings in the table.
Let’s get started!
You’ll want to begin by connecting to your Postgres instance.
psql 'postgresql://postgres:<your-password>@<your-host>:5432/postgres'
Using Hugging Face sentence transformers
vectorize.table() works with ANY model that can be loaded via the SentenceTransformers() API, so long as it does not require additional code execution (which is true of most open source sentence transformers). To get started, set up a products table by copying the example data provided by the extension.
CREATE TABLE products (LIKE vectorize.example_products INCLUDING ALL);
INSERT INTO products SELECT * FROM vectorize.example_products;
SELECT * FROM products LIMIT 2;
product_id | product_name | description | last_updated_at
------------+--------------+--------------------------------------------------------+-------------------------------
1 | Pencil | Utensil used for writing and often works best on paper | 2023-07-26 17:20:43.639351-05
2 | Laptop Stand | Elevated platform for laptops, enhancing ergonomics | 2023-07-26 17:20:43.639351-05
Create a job to vectorize the products table. We’ll specify the table’s primary key (product_id) and the columns that we want to search (product_name and description).
SELECT vectorize.table(
job_name => 'product_search_hf',
"table" => 'products',
primary_key => 'product_id',
columns => ARRAY['product_name', 'description'],
schedule => 'realtime',
transformer => 'sentence-transformers/all-MiniLM-L12-v2'
);
Private models from Hugging Face
If you’ve uploaded a private model to Hugging Face, you can still host it on Tembo Cloud. Simply reference your Hugging Face org and model name, and pass your API key in as an arg to vectorize.table().
SELECT vectorize.table(
job_name => 'product_search_hf',
"table" => 'products',
primary_key => 'product_id',
columns => ARRAY['product_name', 'description'],
transformer => 'my-hugging-face-org/my-private-model',
schedule => 'realtime',
args => '{"api_key": "hf_my_private_api_key"}'
);
Then search:
SELECT * FROM vectorize.search(
job_name => 'product_search_hf',
query => 'accessories for mobile devices',
return_columns => ARRAY['product_id', 'product_name'],
num_results => 3
);
search_results
---------------------------------------------------------------------------------------------
{"product_id": 13, "product_name": "Phone Charger", "similarity_score": 0.8147814132322894}
{"product_id": 6, "product_name": "Backpack", "similarity_score": 0.7743061352550308}
{"product_id": 11, "product_name": "Stylus Pen", "similarity_score": 0.7709902653575383}
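Each row returned by vectorize.search() is a single JSON object, so client code can parse it directly. A minimal Python sketch, using rows copied from the output above (fetching them with a driver such as psycopg is assumed to have already happened):

```python
import json

# Each row returned by vectorize.search() is one JSON object.
# These sample rows mirror the query output above.
rows = [
    '{"product_id": 13, "product_name": "Phone Charger", "similarity_score": 0.8147814132322894}',
    '{"product_id": 6, "product_name": "Backpack", "similarity_score": 0.7743061352550308}',
    '{"product_id": 11, "product_name": "Stylus Pen", "similarity_score": 0.7709902653575383}',
]

results = [json.loads(row) for row in rows]

# Results are already ordered by similarity, highest first.
best = results[0]
print(best["product_name"], round(best["similarity_score"], 4))  # Phone Charger 0.8148
```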
Using OpenAI embedding models
pg_vectorize also works with OpenAI’s embedding models, but first you’ll need an OpenAI API key.
Start by setting your API key as a Postgres configuration parameter.
ALTER SYSTEM SET vectorize.openai_key TO '<your api key>';
SELECT pg_reload_conf();
Create an example table if it does not already exist.
CREATE TABLE products (LIKE vectorize.example_products INCLUDING ALL);
INSERT INTO products SELECT * FROM vectorize.example_products;
Then create the job:
SELECT vectorize.table(
job_name => 'product_search_openai',
"table" => 'products',
primary_key => 'product_id',
columns => ARRAY['product_name', 'description'],
transformer => 'openai/text-embedding-ada-002',
schedule => 'realtime'
);
It may take some time to generate embeddings, depending on API latency.
SELECT * FROM vectorize.search(
job_name => 'product_search_openai',
query => 'accessories for mobile devices',
return_columns => ARRAY['product_id', 'product_name'],
num_results => 3
);
search_results
------------------------------------------------------------------------------------------------
{"product_id": 13, "product_name": "Phone Charger", "similarity_score": 0.8564681325237845}
{"product_id": 24, "product_name": "Tablet Holder", "similarity_score": 0.8295988934993099}
{"product_id": 4, "product_name": "Bluetooth Speaker", "similarity_score": 0.8250355616233103}
(3 rows)
Changing the configured database
By default, vectorize is configured to run on the postgres database, but it can run on any database in your Postgres instance. Update the following configuration parameters so that the corresponding background workers connect to the correct database. Both pg_vectorize and pg_cron need their configuration updated, which can be done with the following SQL commands.
ALTER SYSTEM SET cron.database_name TO 'my_new_db';
ALTER SYSTEM SET vectorize.database_name TO 'my_new_db';
Then, restart Postgres to apply the changes and, if you haven’t already, enable the extension in your new database.
CREATE EXTENSION vectorize CASCADE;
Updating embeddings
Embeddings are computed for your data as soon as vectorize.table() is called. However, when rows are later inserted or updated, the corresponding embeddings need to be updated too. pg_vectorize supports two methods of keeping embeddings up-to-date: trigger-based, and on a recurring interval. This behavior is configured via the schedule parameter on vectorize.table(). The default is a cron-like schedule, schedule => '* * * * *', which checks for new or updated rows every minute. In both cases, the schedule parameter determines how new or updated rows are identified; the resulting work is enqueued as jobs in pgmq to update the embeddings.
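The two modes can be pictured as different producers feeding the same pgmq queue. A toy Python sketch of that idea (an illustration only, not the extension’s actual internals; the JobQueue class is a hypothetical stand-in for pgmq):

```python
from dataclasses import dataclass, field

@dataclass
class JobQueue:
    """Hypothetical stand-in for a pgmq queue: holds batches of row ids awaiting embedding."""
    messages: list = field(default_factory=list)

    def send(self, row_ids):
        self.messages.append(list(row_ids))

queue = JobQueue()

# schedule => 'realtime': a trigger enqueues each changed row as it happens.
for changed_row in (1, 7):
    queue.send([changed_row])

# schedule => '0 * * * *': a cron job enqueues all rows changed since its last run.
rows_changed_since_last_run = [2, 3, 5]
queue.send(rows_changed_since_last_run)

print(queue.messages)  # [[1], [7], [2, 3, 5]]
```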
Using triggers
Setting the parameter schedule => 'realtime' creates triggers on the table that enqueue embedding update jobs whenever a row is inserted or updated.
SELECT vectorize.table(
job_name => 'my_search_project',
"table" => 'products',
primary_key => 'product_id',
columns => ARRAY['product_name', 'description'],
transformer => 'sentence-transformers/all-MiniLM-L12-v2',
schedule => 'realtime'
);
Interval updates with pg_cron
The schedule parameter accepts a cron-like syntax to check for updates on a recurring basis.
For example, to check for updates every hour, set the schedule parameter to 0 * * * *. With this method, you must also provide the column that contains the last-updated timestamp; pg_vectorize uses it to determine which rows have changed since embeddings were last updated.
SELECT vectorize.table(
job_name => 'my_search_project',
"table" => 'products',
primary_key => 'product_id',
columns => ARRAY['product_name', 'description'],
transformer => 'sentence-transformers/all-MiniLM-L12-v2',
update_col => 'last_updated_at',
schedule => '0 * * * *'
);
The cron job can then be viewed by running the following SQL command.
SELECT command, jobname FROM cron.job WHERE jobname = 'my_search_project';
command | jobname
---------------------------------------------------+-------------------
select vectorize.job_execute('my_search_project') | my_search_project
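The interval mode’s use of update_col boils down to a timestamp comparison: only rows updated since the last run are re-embedded. A small Python sketch of that selection logic (illustrative only; the column and job names mirror the example above):

```python
from datetime import datetime

# Source rows carrying the column named by update_col ('last_updated_at').
rows = [
    {"product_id": 1, "last_updated_at": datetime(2023, 7, 26, 17, 20)},
    {"product_id": 2, "last_updated_at": datetime(2023, 7, 26, 19, 5)},
    {"product_id": 3, "last_updated_at": datetime(2023, 7, 26, 18, 30)},
]

# Hypothetical timestamp of the previous cron run.
last_run = datetime(2023, 7, 26, 18, 0)

# Only rows updated since the last run need fresh embeddings.
stale = [r["product_id"] for r in rows if r["last_updated_at"] > last_run]
print(stale)  # [2, 3]
```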
On-demand updates
If you need to update embeddings on an ad-hoc basis, call vectorize.job_execute().
SELECT vectorize.job_execute('my_search_project');
Embedding locations
Embeddings can be stored either on the same table as the source data, or in a separate table in the vectorize schema. The table_method parameter determines where the embeddings are stored. The default, table_method => 'join', creates a table in the vectorize schema named _embeddings_<job_name> for each vectorize job. For example, if you create a job named my_search_project, the embeddings will be stored in a table named vectorize._embeddings_my_search_project.
Alternatively, pg_vectorize can create the embeddings on the same table as the source data. Setting table_method => 'append' adds two columns to the source table: one for the embeddings and one for the updated-at timestamp.
Separate table
With the default behavior, table_method => 'join', a new table is created in the vectorize schema to store the embeddings.
SELECT vectorize.table(
job_name => 'my_search_project',
"table" => 'products',
primary_key => 'product_id',
columns => ARRAY['product_name', 'description'],
transformer => 'sentence-transformers/all-MiniLM-L12-v2',
table_method => 'join'
);
postgres=# \d vectorize._embeddings_my_search_project;
Table "vectorize._embeddings_my_search_project"
Column | Type | Collation | Nullable | Default
------------+--------------------------+-----------+----------+---------
product_id | integer | | not null |
embeddings | vector(384) | | not null |
updated_at | timestamp with time zone | | not null | now()
Indexes:
"_embeddings_my_search_project_product_id_key" UNIQUE CONSTRAINT, btree (product_id)
"my_search_project_idx" hnsw (embeddings vector_cosine_ops)
Foreign-key constraints:
"_embeddings_my_search_project_product_id_fkey" FOREIGN KEY (product_id) REFERENCES products(product_id) ON DELETE CASCADE
New columns, same table
To create the embeddings on the same table as the source data, set the table_method parameter to append.
SELECT vectorize.table(
job_name => 'my_search_project',
"table" => 'products',
primary_key => 'product_id',
columns => ARRAY['product_name', 'description'],
transformer => 'sentence-transformers/all-MiniLM-L12-v2',
table_method => 'append'
);
Note that two new columns, my_search_project_embeddings and my_search_project_updated_at, have been added to the table.
postgres=# \d products
Table "public.products"
Column | Type | Collation | Nullable | Default
------------------------------+--------------------------+-----------+----------+----------------------------------------------------------------
product_id | integer | | not null | nextval('vectorize.example_products_product_id_seq'::regclass)
product_name | text | | not null |
description | text | | |
last_updated_at | timestamp with time zone | | | CURRENT_TIMESTAMP
my_search_project_embeddings | vector(384) | | |
my_search_project_updated_at | timestamp with time zone | | |
Indexes:
"products_pkey" PRIMARY KEY, btree (product_id)
"my_search_project_idx" hnsw (my_search_project_embeddings vector_cosine_ops)
Ad-hoc embedding requests
Any text can be transformed into an embedding using vectorize.transform_embeddings(). This works with any of the supported sentence transformers:
SELECT vectorize.transform_embeddings(
input => 'the quick brown fox jumped over the lazy dogs',
model_name => 'sentence-transformers/multi-qa-MiniLM-L6-dot-v1'
);
{-0.2556323707103729,-0.3213586211204529 ..., -0.0951206386089325}
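When fetched through a driver, that embedding arrives as a text literal like the one above. A small Python helper for turning such a literal into a list of floats (a hedged sketch; the exact literal format can vary by driver and column type):

```python
def parse_embedding(literal: str) -> list:
    """Parse a Postgres array/vector literal such as '{-0.25,0.11,0.93}'
    (or pgvector's '[-0.25,0.11,0.93]') into a list of floats."""
    return [float(x) for x in literal.strip("{}[]").split(",")]

# A shortened literal in the same shape as the output above.
emb = parse_embedding("{-0.2556,-0.3213,-0.0951}")
print(len(emb), emb[0])  # 3 -0.2556
```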
Privately hosted models on Hugging Face can be referenced like so:
SELECT vectorize.transform_embeddings(
input => 'the quick brown fox jumped over the lazy dogs',
model_name => 'my-private-org/my-private-model',
api_key => 'your Hugging Face key'
);
For OpenAI requests, you can either set the API key as a Postgres configuration parameter or pass it in as an argument.
Passing it as an argument will override the configuration parameter.
As an argument:
SELECT vectorize.transform_embeddings(
input => 'the quick brown fox jumped over the lazy dogs',
model_name => 'openai/text-embedding-ada-002',
api_key => 'your OpenAI API key'
);
You do not need to provide the API key as an argument if it has already been set via ALTER SYSTEM SET vectorize.openai_key.
SELECT vectorize.transform_embeddings(
input => 'the quick brown fox jumped over the lazy dogs',
model_name => 'openai/text-embedding-ada-002'
);
Filtering Results
vectorize.search() results can be filtered by supplying a WHERE clause to the where_sql parameter. The filter operation happens after the embeddings are searched; to pre-filter the embedding search, you would need to separate embeddings into multiple tables or adopt a partitioning strategy. In the example below, we filter results to return only the product with product_id = 3. Since product_id is unique, at most one result will be returned.
SELECT * FROM vectorize.search(
job_name => 'product_search_hf',
query => 'accessories for mobile devices',
return_columns => ARRAY['product_id', 'product_name'],
num_results => 3,
where_sql => $$product_id = 3$$
);
search_results
----------------------------------------------------------------------------------------
{"product_id": 3, "product_name": "Desk Lamp", "similarity_score": 0.6498761419705363}
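Note that Desk Lamp was not among the unfiltered top three results, yet it is returned here: the predicate is applied to the similarity-ranked rows before num_results are taken. One way to picture that filter-after-search semantics in Python (a sketch; scores are rounded from the earlier examples):

```python
# All rows with their similarity to the query, highest first.
scored = [
    {"product_id": 13, "product_name": "Phone Charger", "similarity_score": 0.81},
    {"product_id": 6, "product_name": "Backpack", "similarity_score": 0.77},
    {"product_id": 11, "product_name": "Stylus Pen", "similarity_score": 0.77},
    {"product_id": 3, "product_name": "Desk Lamp", "similarity_score": 0.65},
]

# where_sql => $$product_id = 3$$: filter the ranked rows,
# then return at most num_results (3) of them.
filtered = [r for r in scored if r["product_id"] == 3][:3]
print([r["product_name"] for r in filtered])  # ['Desk Lamp']
```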
Manually searching embeddings
vectorize.transform_embeddings() is useful when you want to query your embeddings manually. To do this, place the vectorize.transform_embeddings()::vector call in your query and compute the distance using pgvector’s distance operators. Note that you must use the same transformer model that generated the embeddings. The example below assumes the embeddings live in a column named my_search_project_embeddings on the products table.
SELECT
product_name,
description,
1 - (
my_search_project_embeddings <=>
vectorize.transform_embeddings('mobile electronic devices', 'sentence-transformers/all-MiniLM-L12-v2')::vector
) as similarity
FROM products
ORDER BY similarity DESC
LIMIT 3;
product_name | description | similarity
-------------------+------------------------------------------------------------+---------------------
Phone Charger | Device to replenish the battery of mobile phones | 0.5351522883863631
Bluetooth Speaker | Portable audio device with wireless connectivity | 0.38232471837548787
Wireless Mouse | Pointing device without the need for a physical connection | 0.35592426991011383
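pgvector’s <=> operator returns cosine distance, so the query’s 1 - (…) expression yields cosine similarity. The same arithmetic in plain Python (a sketch of the math, not of pgvector’s implementation):

```python
import math

def cosine_similarity(a, b):
    """Equivalent to 1 - (a <=> b), since pgvector's <=> returns cosine distance."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Identical vectors are maximally similar; orthogonal vectors score zero.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(round(cosine_similarity([1.0, 0.0], [0.0, 1.0]), 1))  # 0.0
```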
How it works
When vectorize.table() is executed, the extension creates jobs in pgmq to generate embeddings for your existing data. These jobs are executed by a background worker in Postgres, which calls the appropriate embedding model, whether it comes from Hugging Face or OpenAI. vectorize.search() transforms the raw text query into an embedding using the same model that generated the stored embeddings. It then uses the pgvector extension to find the most similar embeddings in the table and returns the results to the caller as JSON.
Support
Join the Tembo Community in Slack to ask a question or see how others are building on https://cloud.tembo.io.