Hybrid search
Combine keyword search with semantic search.
Hybrid search combines full text search (searching by keyword) with semantic search (searching by meaning) to identify results that are both directly and contextually relevant to the user's query.
Why would I want to use hybrid search?
Sometimes a single search method doesn't quite capture what a user is really looking for. For example, if a user searches for "Italian recipes with tomato sauce" on a cooking app, a keyword search would pull up recipes that specifically mention "Italian," "recipes," and "tomato sauce" in the text. However, it might miss out on dishes that are quintessentially Italian and use tomato sauce but don't explicitly label themselves with these words, or use variations like "pasta sauce" or "marinara." On the other hand, a semantic search might understand the culinary context and find recipes that match the intent, such as a traditional "Spaghetti Marinara," even if they don't match the exact keyword phrase. However, it could also suggest recipes that are contextually related but not what the user is looking for, like a "Mexican salsa" recipe, because it understands the context to be broadly about tomato-based sauces.
Hybrid search combines the strengths of both these methods. It would ensure that recipes explicitly mentioning the keywords are prioritized, thus capturing direct hits that satisfy the keyword criteria. At the same time, it would include recipes identified through semantic understanding as being related in meaning or context, like different Italian dishes that traditionally use tomato sauce but might not have been tagged explicitly with the user's search terms. It identifies results that are both directly and contextually relevant to the user's query while ideally minimizing misses and irrelevant suggestions.
When would I want to use hybrid search?
The decision to use hybrid search depends on what your users are looking for in your app. For a code repository where developers need to find exact lines of code or error messages, keyword search is likely ideal because it matches specific terms. In a mental health forum where users search for advice or experiences related to their feelings, semantic search may be better because it finds results based on the meaning of a query, not just specific words. For a shopping app where customers might search for specific product names yet also be open to related suggestions, hybrid search combines the best of both worlds - finding exact matches while also uncovering similar products based on the shopping context.
How to combine search methods
Hybrid search merges keyword search and semantic search, but how does this process work?
First, each search method is executed separately. Keyword search, which involves searching by specific words or phrases present in the content, will yield its own set of results. Similarly, semantic search, which involves understanding the context or meaning behind the search query rather than the specific words used, will generate its own unique results.
Now with these separate result lists available, the next step is to combine them into a single, unified list. This is achieved through a process known as “fusion”. Fusion takes the results from both search methods and merges them together based on a certain ranking or scoring system. This system may prioritize certain results based on factors like their relevance to the search query, their ranking in the individual lists, or other criteria. The result is a final list that integrates the strengths of both keyword and semantic search methods.
Reciprocal Rank Fusion (RRF)
One of the most common fusion methods is Reciprocal Rank Fusion (RRF). The key idea behind RRF is to give more weight to the top-ranked items in each individual result list when building the final combined list.
In RRF, we iterate over each record and assign a score (noting that each record could exist in one or both lists). The score is calculated as 1 divided by that record's rank in each list, summed across both lists. For example, if a record with an ID of `123` was ranked third in the keyword search and ninth in semantic search, it would receive a score of $\frac{1}{3} + \frac{1}{9} \approx 0.444$. If the record was found in only one list and not the other, it would receive a score of 0 for the other list. The records are then sorted by this score to create the final list, with the highest scores ranked first and the lowest scores ranked last.
This method ensures that items that are ranked high in multiple lists are given a high rank in the final list. It also ensures that items that are ranked high in only a few lists but low in others are not given a high rank in the final list. Placing the rank in the denominator when calculating score helps penalize the low ranking records.
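The scoring described above can be sketched in a few lines of TypeScript. This is a minimal illustration rather than a reference implementation: `reciprocalRankFusion` is a hypothetical helper, the result lists are made up, and no smoothing constant is applied.

```typescript
// Hypothetical helper: fuse ranked lists of record IDs using plain reciprocal rank.
// Each input list is ordered best-first. No smoothing constant is applied here.
function reciprocalRankFusion(lists: number[][]): { id: number; score: number }[] {
  const scores = new Map<number, number>()

  for (const list of lists) {
    list.forEach((id, index) => {
      const rank = index + 1 // ranks are 1-based
      // A record missing from a list simply contributes nothing (a score of 0) for it
      scores.set(id, (scores.get(id) ?? 0) + 1 / rank)
    })
  }

  // Highest combined score first
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score)
}

// Made-up results: record 123 is third in the keyword list and ninth in the semantic list
const keywordResults = [7, 42, 123, 8]
const semanticResults = [42, 1, 2, 3, 4, 5, 6, 9, 123]

const fused = reciprocalRankFusion([keywordResults, semanticResults])
const record123 = fused.find((record) => record.id === 123) // score is 1/3 + 1/9
```

Record 42 ends up first here because it ranks near the top of both lists, while record 123 receives the 1/3 + 1/9 score from the example above.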
Smoothing constant k
To prevent extremely high scores for items that are ranked first (since we're dividing by the rank), a `k` constant is often added to the denominator to smooth the score:

$$\text{score} = \frac{1}{k + \text{rank}}$$

This constant can be any positive number, but is typically small. A constant of 1 would mean that a record ranked first would have a score of $\frac{1}{1 + 1} = 0.5$ instead of $\frac{1}{1} = 1$. This adjustment can help balance the influence of items that are ranked very high in individual lists when creating the final combined list.
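To get a feel for the effect of the constant, the sketch below compares how much more a first-place record scores than a tenth-place record for a few values of `k`. The helper name `smoothedScore` and the chosen values of `k` are illustrative assumptions.

```typescript
// Hypothetical helper: smoothed reciprocal rank, 1 / (k + rank)
function smoothedScore(rank: number, k: number): number {
  return 1 / (k + rank)
}

// Ratio between the score of rank 1 and the score of rank 10
const ratioWithoutSmoothing = smoothedScore(1, 0) / smoothedScore(10, 0) // 10
const ratioWithK1 = smoothedScore(1, 1) / smoothedScore(10, 1) // about 5.5
const ratioWithK50 = smoothedScore(1, 50) / smoothedScore(10, 50) // about 1.18
```

The larger the constant, the flatter the curve: the top-ranked record still scores highest, but it dominates the fused list less.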
Hybrid search in Postgres
Let's implement hybrid search in Postgres using `tsvector` (keyword search) and `pgvector` (semantic search).
First we'll create a `documents` table to store the documents that we will search over. This is just an example - adjust this to match the structure of your application.
```sql
create table documents (
  id bigint primary key generated always as identity,
  content text,
  fts tsvector generated always as (to_tsvector('english', content)) stored,
  embedding vector(512)
);
```
The table contains 4 columns:
- `id` is an auto-generated unique ID for the record. We'll use this later to match records when performing RRF.
- `content` contains the actual text we will be searching over.
- `fts` is an auto-generated `tsvector` column that is generated using the text in `content`. We will use this for full text search (search by keyword).
- `embedding` is a vector column that stores the vector generated from our embedding model. We will use this for semantic search (search by meaning). We chose 512 dimensions for this example, but adjust this to match the size of the embedding vectors generated from your preferred model.
Next we'll create indexes on the `fts` and `embedding` columns so that their individual queries will remain fast at scale:
```sql
-- Create an index for the full-text search
create index on documents using gin(fts);

-- Create an index for the semantic vector search
create index on documents using hnsw (embedding vector_ip_ops);
```
For full text search we use a generalized inverted (GIN) index which is designed for handling composite values like those stored in a `tsvector`.
For semantic vector search we use an HNSW index, which is a high performing approximate nearest neighbor (ANN) search algorithm. Note that we are using the `vector_ip_ops` (inner product) operator class with this index because we plan on using the inner product (`<#>`) operator later in our query. If you plan to use a different operator like cosine distance (`<=>`), be sure to update the index accordingly. For more information, see distance operators.
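One way to reason about the choice of operator: for vectors normalized to unit length, the inner product equals cosine similarity, so both produce the same ordering. The sketch below checks this on two made-up vectors. It also mirrors the fact that pgvector's `<#>` returns the negative inner product, which is why ordering ascending puts the most similar records first. Whether your embedding model returns normalized vectors is an assumption to verify, and the helper names here are hypothetical.

```typescript
// Hypothetical helpers for comparing inner product and cosine similarity
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, value, i) => sum + value * (b[i] ?? 0), 0)
}

function normalize(v: number[]): number[] {
  const length = Math.sqrt(dot(v, v))
  return v.map((value) => value / length)
}

function cosineSimilarity(a: number[], b: number[]): number {
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)))
}

// Made-up 2-dimensional "embeddings", normalized to unit length
const queryVector = normalize([3, 4]) // [0.6, 0.8]
const docVector = normalize([4, 3]) // [0.8, 0.6]

const innerProduct = dot(queryVector, docVector) // about 0.96
const cosine = cosineSimilarity(queryVector, docVector) // about 0.96

// The value a negative-inner-product operator would sort by (smaller = more similar)
const negativeInnerProduct = -innerProduct
```

If your vectors are not normalized, the two measures can rank records differently, in which case the index operator class and the query operator should both be switched to match the measure you want.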
Finally we'll create our `hybrid_search` function:
```sql
create or replace function hybrid_search(
  query_text text,
  query_embedding vector(512),
  match_count int,
  full_text_weight float = 1,
  semantic_weight float = 1,
  rrf_k int = 50
)
returns setof documents
language sql
as $$
with full_text as (
  select
    id,
    -- Note: ts_rank_cd is not indexable but will only rank matches of the where clause
    -- which shouldn't be too big
    row_number() over(order by ts_rank_cd(fts, websearch_to_tsquery(query_text)) desc) as rank_ix
  from
    documents
  where
    fts @@ websearch_to_tsquery(query_text)
  order by rank_ix
  limit least(match_count, 30) * 2
),
semantic as (
  select
    id,
    row_number() over (order by embedding <#> query_embedding) as rank_ix
  from
    documents
  order by rank_ix
  limit least(match_count, 30) * 2
)
select
  documents.*
from
  full_text
  full outer join semantic
    on full_text.id = semantic.id
  join documents
    on coalesce(full_text.id, semantic.id) = documents.id
order by
  coalesce(1.0 / (rrf_k + full_text.rank_ix), 0.0) * full_text_weight +
  coalesce(1.0 / (rrf_k + semantic.rank_ix), 0.0) * semantic_weight
  desc
limit
  least(match_count, 30)
$$;
```
Let's break this down:
- Parameters: The function accepts quite a few parameters, but the main (required) ones are `query_text`, `query_embedding`, and `match_count`.
  - `query_text` is the user's query text (more on this shortly).
  - `query_embedding` is the vector representation of the user's query produced by the embedding model. We chose 512 dimensions for this example, but adjust this to match the size of the embedding vectors generated from your preferred model. This must match the size of the `embedding` vector on the `documents` table (and use the same model).
  - `match_count` is the number of records returned in the `limit` clause. The function caps this at 30.

  The other parameters are optional, but give more control over the fusion process.
  - `full_text_weight` and `semantic_weight` decide how much weight each search method gets in the final score. These are both 1 by default, which means they contribute equally towards the final rank. A `full_text_weight` of 2 and `semantic_weight` of 1 would give full-text search twice as much weight as semantic search.
  - `rrf_k` is the `k` smoothing constant added to the reciprocal rank. The default is 50.
- Return type: The function returns a set of records from our `documents` table.
- CTE: We create two common table expressions (CTE), one for full-text search and one for semantic search. These perform each query individually prior to joining them.
- RRF: The final query combines the results from the two CTEs using reciprocal rank fusion (RRF).
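To see how the weights and the smoothing constant interact, here is a rough TypeScript mirror of the final fusion step. It is only a sketch: `fuseRankedIds` and `FusionOptions` are hypothetical names, the ID lists are made up, and it skips the per-CTE `limit` that the SQL applies before joining.

```typescript
// Hypothetical options mirroring the SQL function's parameters and defaults
interface FusionOptions {
  matchCount: number
  fullTextWeight?: number
  semanticWeight?: number
  rrfK?: number
}

// Fuse two best-first lists of record IDs using weighted, smoothed reciprocal rank
function fuseRankedIds(
  fullTextIds: number[],
  semanticIds: number[],
  options: FusionOptions
): number[] {
  const { matchCount, fullTextWeight = 1, semanticWeight = 1, rrfK = 50 } = options
  const scores = new Map<number, number>()

  const addScores = (ids: number[], weight: number) => {
    ids.forEach((id, index) => {
      const rankIx = index + 1 // 1-based, like row_number()
      // IDs missing from a list contribute 0, like the coalesce(..., 0.0) in the SQL
      scores.set(id, (scores.get(id) ?? 0) + weight / (rrfK + rankIx))
    })
  }

  addScores(fullTextIds, fullTextWeight)
  addScores(semanticIds, semanticWeight)

  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1]) // highest fused score first
    .slice(0, Math.min(matchCount, 30)) // same cap as least(match_count, 30)
    .map(([id]) => id)
}

// Made-up rankings: record 1 leads the keyword list, record 2 leads the semantic list
const fullTextIds = [1, 2]
const semanticIds = [2, 3, 1]

const balanced = fuseRankedIds(fullTextIds, semanticIds, { matchCount: 10 })
const keywordHeavy = fuseRankedIds(fullTextIds, semanticIds, {
  matchCount: 10,
  fullTextWeight: 2,
})
```

With equal weights, record 2 comes out ahead of record 1. Doubling the full-text weight is enough to flip the two, since record 1 is the top keyword match.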
Running hybrid search
To use this function in SQL, we can run:
```sql
select
  *
from
  hybrid_search(
    'Italian recipes with tomato sauce', -- user query
    '[...]'::vector(512), -- embedding generated from user query
    10
  );
```
In practice, you will likely be calling this from the Supabase client or through a custom backend layer. Here is a quick example of how you might call this from an Edge Function using JavaScript:
```typescript
import { createClient } from 'npm:@supabase/supabase-js'
import OpenAI from 'npm:openai'

const supabaseUrl = Deno.env.get('SUPABASE_URL')!
const supabaseServiceRoleKey = Deno.env.get('SUPABASE_SERVICE_ROLE_KEY')!
const openaiApiKey = Deno.env.get('OPENAI_API_KEY')!

Deno.serve(async (req) => {
  // Grab the user's query from the JSON payload
  const { query } = await req.json()

  // Instantiate OpenAI client
  const openai = new OpenAI({ apiKey: openaiApiKey })

  // Generate a one-time embedding for the user's query
  const embeddingResponse = await openai.embeddings.create({
    model: 'text-embedding-3-large',
    input: query,
    dimensions: 512,
  })

  const [{ embedding }] = embeddingResponse.data

  // Instantiate the Supabase client
  // (replace service role key with user's JWT if using Supabase auth and RLS)
  const supabase = createClient(supabaseUrl, supabaseServiceRoleKey)

  // Call hybrid_search Postgres function via RPC
  const { data: documents } = await supabase.rpc('hybrid_search', {
    query_text: query,
    query_embedding: embedding,
    match_count: 10,
  })

  return new Response(JSON.stringify(documents), {
    headers: { 'Content-Type': 'application/json' },
  })
})
```
This uses OpenAI's `text-embedding-3-large` model to generate embeddings (shortened to 512 dimensions for faster retrieval). Swap in your preferred embedding model (and dimension size) accordingly.
To test this, make a `POST` request to the function's endpoint while passing in a JSON payload containing the user's query. Here is an example `POST` request using cURL:
```bash
curl -i --location --request POST \
  'http://127.0.0.1:54321/functions/v1/hybrid-search' \
  --header 'Authorization: Bearer <anonymous key>' \
  --header 'Content-Type: application/json' \
  --data '{"query":"Italian recipes with tomato sauce"}'
```
For more information on how to create, test, and deploy edge functions, see Getting started.
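If you want to tune the fusion from the client, the optional parameters of `hybrid_search` can be passed by name in the same RPC call. The sketch below builds the argument object and checks the embedding size up front. `buildHybridSearchArgs` and `HybridSearchOptions` are hypothetical names, and the 512-dimension check assumes the table definition used in this guide.

```typescript
// Hypothetical options for tuning the hybrid_search call from the client
interface HybridSearchOptions {
  matchCount?: number
  fullTextWeight?: number
  semanticWeight?: number
  rrfK?: number
}

// Assumed to match the vector(512) column and function signature in this guide
const EMBEDDING_DIMENSIONS = 512

// Build the named arguments expected by the hybrid_search Postgres function
function buildHybridSearchArgs(
  query: string,
  embedding: number[],
  options: HybridSearchOptions = {}
) {
  if (embedding.length !== EMBEDDING_DIMENSIONS) {
    throw new Error(
      `Expected ${EMBEDDING_DIMENSIONS} dimensions but got ${embedding.length}`
    )
  }

  return {
    query_text: query,
    query_embedding: embedding,
    match_count: options.matchCount ?? 10,
    // Defaults below mirror the defaults declared in the SQL function
    full_text_weight: options.fullTextWeight ?? 1,
    semantic_weight: options.semanticWeight ?? 1,
    rrf_k: options.rrfK ?? 50,
  }
}

// Usage inside the Edge Function (not executed here, requires a Supabase client):
// const { data, error } = await supabase.rpc(
//   'hybrid_search',
//   buildHybridSearchArgs(query, embedding, { fullTextWeight: 2 })
// )
```

Failing fast on a mismatched embedding size gives a clearer error than the one Postgres would raise when casting the argument to `vector(512)`.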