Powered by HuggingFace sentence transformers
The embeddings API allows you to generate embeddings from your text. It is similar in functionality to OpenAI’s embeddings API, except it is hosted privately and is powered by HuggingFace. Every Tembo instance gets its own embeddings service, and no data passed to the Tembo embeddings API is retained by the service.
You can call the embeddings service directly from your Postgres instance by using the pg_vectorize extension. The embedding service supports all of the HuggingFace sentence-transformers; simply replace model_name with the sentence-transformer of your choice.
select * from vectorize.transform_embeddings(
input => 'the quick brown fox jumped over the lazy dog',
model_name => 'paraphrase-MiniLM-L6-v2'
);
{0.5988337993621826,-0.12069590389728546, .... -0.11859191209077836}
One of the most common use cases for embeddings is to perform vector similarity search on your data. In this guide we will walk through using all_MiniLM_L12_v2 as an alternative to OpenAI’s embeddings API to perform vector similarity search on text data in your Postgres instance.
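To see what “similarity” means here: a search compares the query’s embedding against each stored embedding, most commonly with cosine similarity. A minimal illustrative sketch of that measure in plain Python (not part of the Tembo API):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))  # ~1.0 (parallel vectors)
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0 (orthogonal vectors)
```

Embeddings of semantically related sentences end up close in this sense, which is what lets raw-text queries rank rows by meaning rather than by keyword overlap.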
Enabling embeddings on Tembo Cloud
Via UI
You can enable the embeddings app on your Tembo Cloud instance by navigating to “Apps”, then “Embeddings”. Click “Activate” to enable the embeddings app. This runs a container next to your Tembo Postgres instance that is pre-configured to communicate with Postgres.
Via API
You can also enable the embeddings app by using the Tembo Platform API. First, you will need to generate an API token so that you can communicate with the Tembo platform API. Navigate to cloud.tembo.io/generate-jwt and follow the instructions to generate a token. Alternatively, you can follow the instructions here.
Set your Tembo token as an environment variable, along with your organization ID, instance ID, and instance name. Fetch the TEMBO_DATA_DOMAIN from the “Host” parameter of your Tembo instance.
export TEMBO_TOKEN=<your token>
export TEMBO_ORG=<your organization id>
export TEMBO_INST_ID=<your instance id>
export TEMBO_INST_NAME=<your instance name>
export TEMBO_DATA_DOMAIN=<your Tembo domain>
Patch your existing Tembo instance using the Tembo Cloud Platform API to enable the embeddings app. We’ll set the configuration to None so that the defaults are assigned.
Python:
import os

import requests

TEMBO_ORG = os.environ["TEMBO_ORG"]
TEMBO_INST_ID = os.environ["TEMBO_INST_ID"]
TEMBO_TOKEN = os.environ["TEMBO_TOKEN"]

resp = requests.patch(
    url=f"https://api.tembo.io/api/v1/orgs/{TEMBO_ORG}/instances/{TEMBO_INST_ID}",
    headers={"Authorization": f"Bearer {TEMBO_TOKEN}"},
    json={
        "app_services": [
            {"embeddings": None},  # default configuration
        ]
    },
)
Curl:
curl -X PATCH \
-H "Authorization: Bearer ${TEMBO_TOKEN}" \
-H "Content-Type: application/json" \
-d '{"app_services": [{"embeddings": null}]}' \
"https://api.tembo.io/api/v1/orgs/${TEMBO_ORG}/instances/${TEMBO_INST_ID}"
Using the embeddings API for vector similarity search
Connect to your Tembo Postgres instance.
psql postgres://<your user>:<your password>@${TEMBO_DATA_DOMAIN}:5432/postgres
If you already have a table, you can start with that. You could also start with an example dataset, which is what we will use for this example.
CREATE TABLE products (LIKE vectorize.example_products INCLUDING ALL);
INSERT INTO products SELECT * FROM vectorize.example_products;
Initialize a table for automated vector search
We need to specify a job_name. There can be more than one job per table, but each job_name must be unique; generally people will have just one job per table. Specify the table name and the primary_key of the table that you want to search.

The columns parameter specifies the exact columns within that table that you want to search. In our example, we’ll search the product_name and description columns. The transformer parameter specifies the transformer model that you want to use (for now, we only support all_MiniLM_L12_v2 for privately hosted open source models).

The schedule parameter specifies how often you want to update the embeddings. In our example, we’ll update the embeddings every minute.
SELECT vectorize.table(
job_name => 'product_search',
"table" => 'products',
primary_key => 'product_id',
columns => ARRAY['product_name', 'description'],
transformer => 'all_MiniLM_L12_v2',
schedule => '* * * * *'
);
We can start the initial load of embeddings immediately by running the following command:
SELECT vectorize.job_execute('product_search');
Or we can simply wait for the cron job to complete.
Search the table with raw text
To search that table’s job, use the same job_name that we specified in the previous step. Provide a raw text query, and use return_columns to specify which columns you want back in the response.

Finally, specify the num_results to return; this effectively amounts to a limit statement.

vectorize.search will automatically use the same embedding model that was used during the vectorize.table call.
SELECT * FROM vectorize.search(
job_name => 'product_search',
query => 'accessories for mobile devices',
return_columns => ARRAY['product_id', 'product_name'],
num_results => 3
);
search_results
------------------------------------------------------------------------------------------------
{"product_id": 13, "product_name": "Phone Charger", "similarity_score": 0.8564774308489237}
{"product_id": 24, "product_name": "Tablet Holder", "similarity_score": 0.8295404213393001}
{"product_id": 4, "product_name": "Bluetooth Speaker", "similarity_score": 0.8248579643539758}
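Each row of search_results is a single JSONB value. In application code, depending on your Postgres driver, these may arrive as JSON strings to decode client-side. A small sketch of ranking the parsed rows, using the output above as sample data:

```python
import json

# Rows copied from the vectorize.search output above; a driver may return
# them as JSON strings rather than dicts.
raw_rows = [
    '{"product_id": 13, "product_name": "Phone Charger", "similarity_score": 0.8564774308489237}',
    '{"product_id": 24, "product_name": "Tablet Holder", "similarity_score": 0.8295404213393001}',
    '{"product_id": 4, "product_name": "Bluetooth Speaker", "similarity_score": 0.8248579643539758}',
]
results = [json.loads(r) for r in raw_rows]

# Rows are already ordered by similarity, but taking the max makes the
# ranking criterion explicit.
best = max(results, key=lambda r: r["similarity_score"])
print(best["product_name"])  # Phone Charger
```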
Get started now at cloud.tembo.io!
Using the embeddings API directly
The embeddings service can also be used directly via your Tembo instance’s API. For example, to generate embeddings for a single sentence, use the following as a reference.
Export your Tembo host domain into an environment variable.
export TEMBO_DATA_DOMAIN=org-yourOrg-inst-yourInst.prd.data-1.use1.tembo.io
Python:
import os

import requests

TEMBO_TOKEN = os.environ["TEMBO_TOKEN"]
TEMBO_DATA_DOMAIN = os.environ["TEMBO_DATA_DOMAIN"]

resp = requests.post(
    url=f"https://{TEMBO_DATA_DOMAIN}/embeddings/v1/embeddings",
    headers={"Authorization": f"Bearer {TEMBO_TOKEN}"},
    json={
        "input": [
            "I enjoy taking long walks along the beach with my dog.",
            "I enjoy playing video games.",
        ],
        "model": "sentence-transformers/all-MiniLM-L6-v2",
    },
)
Curl:
curl -X POST "https://${TEMBO_DATA_DOMAIN}/embeddings/v1/embeddings" \
-H "Authorization: Bearer ${TEMBO_TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"input": [
"I enjoy taking long walks along the beach with my dog.",
"I enjoy playing video games."
],
"model": "sentence-transformers/all-MiniLM-L6-v2"
}'
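The response body is not shown in this guide. Assuming the service mirrors the OpenAI-style embeddings response shape (a "data" array with one "embedding" per input — an assumption to verify against your instance, e.g. via resp.json()), extracting the vectors might look like:

```python
# Hypothetical payload, assuming an OpenAI-style response shape; in the
# request above this would come from resp.json().
payload = {
    "data": [
        {"index": 0, "embedding": [0.12, -0.03, 0.88]},
        {"index": 1, "embedding": [0.07, 0.22, -0.41]},
    ]
}

# One embedding vector per input sentence, in request order.
vectors = [item["embedding"] for item in payload["data"]]
print(len(vectors), len(vectors[0]))  # 2 3
```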