Setting up Oracle Database 23ai with AI Vector Search

Jan Leemans
8 min read · Jun 12, 2024


On May 2nd, Oracle released the latest version of the Oracle Database, version 23ai. This article describes the setup I performed to configure automated generation of vector embeddings for structured data and documents in the database, creating the basis for RAG applications.

Oracle Database 23ai

Introduction

Oracle released the latest version of its flagship database, version 23ai, with a large set of new features. One of the most interesting is the inclusion of a wide range of vector embedding capabilities within the database. These allow you to create, manipulate and search data in the database using vector embeddings, opening up a whole range of new AI applications, most notably RAG (Retrieval Augmented Generation) for answering questions about your business based on your corporate data and documents.

The key advantage is that this capability is now included as part of the standard database environment, allowing you to simply add an extra column to an existing table of the type “vector”, and directly generate the embedding of your data into that column. Once this has been set up, you can use simple SQL queries to do semantic searches on your data.
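For example, adding a vector column to an existing table is a single statement (the products table name below is just a hypothetical illustration):

ALTER TABLE products ADD (embedding VECTOR);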

An important element is the capability to upload an embedding model of your choice into the database and perform the vector generation completely inside it, for example using an “on insert” trigger. Calling an external embedding API is also possible. The same logic applies to the “search” part: you can use standard SQL to calculate the vector representation of your question before performing the actual vector search.

Documents can be handled in the same way. Whether you store your documents in the database itself or elsewhere, with the new AI Vector capabilities a single SQL command cuts a PDF document into text chunks and stores the embedding of each chunk in the database. This again allows for easy proximity search, and for example passing the matching chunks on in the prompt of an (external) LLM system to generate a nicely formulated answer to your question, based on your data and documents.

The basis for RAG

RAG (Retrieval Augmented Generation) is a mechanism that allows you to combine the text generating capabilities of a generic LLM with the private content of your enterprise, by first performing a contextual search on your (private) corporate information, and then adding that information into the prompt you will send to the LLM, allowing for more precise and up-to-date answers to your questions.

With the vector capabilities now available in the database, all the preparation work for generating a contextual prompt can be done inside your database, using the latest information and any private corporate documents you own. Only the actual call, with the question and the added contextual elements, is passed as a prompt to an external generative AI system, for example the Oracle OCI GenAI service, as in the drawing below:

Hybrid RAG Architecture with on-premise Oracle Database 23ai
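As a rough sketch of the “retrieval” part of this flow, the query below gathers the chunks closest to a question and concatenates them into a context string that would then be sent, together with the question, to the external LLM. The doc_chunks table and its chunk_text and embedding columns are hypothetical names used only for illustration; doc_model is an embedding model loaded into the database, as set up later in this article:

SELECT LISTAGG(chunk_text, CHR(10)) WITHIN GROUP (ORDER BY dist) AS context
FROM (
  SELECT chunk_text,
         VECTOR_DISTANCE(
           embedding,
           VECTOR_EMBEDDING(doc_model USING :question AS data)) AS dist
  FROM doc_chunks
  ORDER BY dist
  FETCH FIRST 4 ROWS ONLY
);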

Let’s get practical

So much for the theory. What does this mean in practice, how easy is it to set up such an environment, and what are the dependencies on external tools?

As of May 2nd, 2024, the Oracle Database 23ai Free version is generally available, so you can just download it and start experimenting!

In this article I will use a vanilla Oracle Linux 8 environment to install all the components of the solution, running on a simple 1 OCPU virtual machine on Oracle Cloud.

I downloaded the Database software for Oracle Linux 8 from this location and followed the simple instructions to install the software … in a few minutes I had my listener and database 23ai environment up and running!

If you need to get a basic understanding of what the database vector type and vector distances are, you could run some of the steps of the LiveLab on this topic.

  • You’ll create a user with the appropriate privileges, and create some simple tables with a vector type column.
  • You’ll use 2-dimensional vector examples to visually showcase vector search and the ability to locate “nearby” results, along the lines of the small sketch below.
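A minimal sketch of that idea, with made-up 2-dimensional points (table name and values are purely illustrative): points that lie close together have a small vector distance.

CREATE TABLE points (id NUMBER, v VECTOR(2, FLOAT32));
INSERT INTO points VALUES (1, '[1, 1]');
INSERT INTO points VALUES (2, '[1.2, 1.1]');
INSERT INTO points VALUES (3, '[10, 10]');

SELECT id, VECTOR_DISTANCE(v, TO_VECTOR('[1, 1]'), EUCLIDEAN) AS dist
FROM points
ORDER BY dist;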

But this is just the simple stuff: how do we create “real” embeddings of items in the database, and which LLMs can we use?

The key document to get started is the “Oracle AI Vector Search User Guide”. Here you’ll find all the details on how to generate vector embeddings from within the database, the possible options, and plenty of examples.

Loading the model

For my experiment I chose to upload an embedding model directly into the database. This allows you to run the bulk of your RAG solution on-premises and inside the database, using standard SQL. There are other options, like calling an external REST API endpoint of one of the cloud AI providers: Oracle OCI Generative AI, but also Cohere, OpenAI, etc.

There are a few constraints on the models you are able to use: you need to obtain your chosen model in the ONNX format, and the input and output parameters must match the requirements imposed by the database environment: a single input dimension, and a single output which is the embedding vector.

To keep things simple, you can use the OML4Py Release 2 API (Oracle Machine Learning for Python) to obtain several preconfigured models that are of the right type and format to be uploaded directly into the database.

This requires installing a set of components in your environment, taking particular care of the required versions. All the details are in the “Install OML4Py for On-Premises Databases” manual:

  • Python 3.12.0 with a list of packages like pandas, torch, etc.
  • Oracle Instant Client version 23.4
  • OML4Py Client for on-premises Databases

Once you are through the installation, you should be able to launch python in interactive mode and do “import oml”. If this works, your installation was successful!

Now you can list the available Embedding models with the EmbeddingModelConfig.show_preconfigured command. This will produce the following list of models:

EmbeddingModelConfig.show_preconfigured()
[
 "sentence-transformers/all-mpnet-base-v2",
 "sentence-transformers/all-MiniLM-L6-v2",
 "sentence-transformers/multi-qa-MiniLM-L6-cos-v1",
 "ProsusAI/finbert",
 "medicalai/ClinicalBERT",
 "sentence-transformers/distiluse-base-multilingual-cased-v2",
 "sentence-transformers/all-MiniLM-L12-v2",
 "BAAI/bge-small-en-v1.5",
 "BAAI/bge-base-en-v1.5",
 "taylorAI/bge-micro-v2",
 "intfloat/e5-small-v2",
 "intfloat/e5-base-v2",
 "prajjwal1/bert-tiny",
 "thenlper/gte-base",
 "thenlper/gte-small",
 "TaylorAI/gte-tiny",
 "infgrad/stella-base-en-v2",
 "sentence-transformers/paraphrase-multilingual-mpnet-base-v2",
 "intfloat/multilingual-e5-base",
 "intfloat/multilingual-e5-small",
 "sentence-transformers/stsb-xlm-r-multilingual"
]

To download the ONNX model of your choice, execute the following commands in Python:

from oml.utils import EmbeddingModel, EmbeddingModelConfig

em = EmbeddingModel(model_name="sentence-transformers/all-MiniLM-L6-v2")
em.export2file("all-MiniLM-L6-v2.onnx", output_dir=".")

You now have the ONNX model locally on your system. Connect to the database with SQL*Plus and upload the model into the database:

EXECUTE DBMS_VECTOR.LOAD_ONNX_MODEL('DM_DUMP', 'all-MiniLM-L6-v2.onnx', 'doc_model');
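To check that the model arrived in the database, you can query the mining model views (a quick verification step, not part of the original instructions):

SELECT model_name, mining_function, algorithm
FROM user_mining_models;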

Make sure to follow the detailed steps in the manual to give your user the appropriate privileges (DB_DEVELOPER_ROLE and CREATE MINING MODEL) and to create the DM_DUMP directory object pointing to the filesystem location where your ONNX model is stored.
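As a sketch of what that setup looks like, run as a privileged user (the user name and filesystem path below are placeholders):

GRANT DB_DEVELOPER_ROLE TO vector_user;
GRANT CREATE MINING MODEL TO vector_user;
CREATE OR REPLACE DIRECTORY DM_DUMP AS '/home/oracle/models';
GRANT READ ON DIRECTORY DM_DUMP TO vector_user;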

Working with embeddings

Now that we have the model uploaded in the database, let’s test if we can call it to embed something:

SELECT TO_VECTOR(VECTOR_EMBEDDING(doc_model USING 'hello world' as data)) 
AS embedding;

This should produce a large vector, representing the 'hello world' string. We can now start playing around with tables and triggers as suggested in the vector documentation.

Embedding structured data in the database

You can simply add a column of the type “vector” to any database table and use the model we uploaded to create a vector embedding for each record.

  • Create a table my_items with a column of the type VECTOR
  • Insert a series of items in the table, with the vector for the moment empty. As an example, I insert a series of records with some pets (cat, dog, …) as well as some objects (table, chair, etc.)
  • Once your table is populated, you can update the embedding field, using the VECTOR_EMBEDDING function
CREATE TABLE my_items (id int, my_object varchar2(4000), v vector);

INSERT INTO my_items values (1, 'cat', NULL);
...

BEGIN
  FOR rec IN (SELECT id, my_object FROM my_items) LOOP
    UPDATE my_items
       SET v = TO_VECTOR(VECTOR_EMBEDDING(
                 doc_model USING rec.my_object AS data))
     WHERE id = rec.id;
  END LOOP;
END;
/

To illustrate how to find pets in this mixed list of animals and objects, a query with the question “give me some pet names” returns cat and dog at the top of the list, without the words cat or dog ever appearing in the question:

SELECT my_object
FROM my_items
ORDER BY VECTOR_DISTANCE(
           TO_VECTOR(VECTOR_EMBEDDING(
             doc_model USING 'give me some pet names' AS data)),
           v)
FETCH FIRST 4 ROWS ONLY;

Handling documents

So far we’ve used structured data, but we can do the same for documents. Depending on the type of document, you might want to create one embedding per document, or, more likely, you will want to refer to specific paragraphs or phrases in each document, creating an individual embedding for each document chunk. These chunks are then the prompt context that you provide to an LLM to formulate an answer based on your document.

To illustrate this, I followed a few of the examples from the documentation:

  • Create a table to store whole documents
  • Upload a pdf, for example one of the database documentation chapters
  • Create a trigger on the table my_items to generate the embedding for each record inserted
CREATE TABLE documentation_tab (id number, data blob);
INSERT INTO documentation_tab values(1, to_blob(
bfilename('DM_DUMP', 'database-concepts23ai.pdf')));
COMMIT;

CREATE OR REPLACE TRIGGER UPDATE_items_VECTOR
BEFORE INSERT ON my_items FOR EACH ROW
DECLARE
  the_v VECTOR;
BEGIN
  SELECT TO_VECTOR(VECTOR_EMBEDDING(
           doc_model USING :new.my_object AS data))
    INTO the_v
    FROM dual;
  :new.v := the_v;
END;
/

And finally you can use the functions that are part of the vector utilities (the DBMS_VECTOR_CHAIN package) to split the original PDF into chunks of text:

INSERT INTO my_items
SELECT et.chunk_id, et.chunk_data, '[1]'
FROM documentation_tab dt,
     dbms_vector_chain.utl_to_chunks(
       dbms_vector_chain.utl_to_text(dt.data),
       json('{"by"        : "words",
              "max"       : "100",
              "overlap"   : "0",
              "split"     : "recursively",
              "language"  : "american",
              "normalize" : "all"}')
     ) t,
     JSON_TABLE(
       t.column_value,
       '$[*]' COLUMNS (chunk_id   NUMBER         PATH '$.chunk_id',
                       chunk_data VARCHAR2(4000) PATH '$.chunk_data')
     ) et;

You now have a table with all the pieces of text from the document, and the associated embedding (generated by the insert trigger) in the column v!

This is the basis for doing contextual searches on the document content, in the same way as we did before for the structured data: use your local model to compute the vector for the search query, and the vector distance function to find the closest matches.
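For example, a search over the document chunks follows the same pattern as the earlier pet query, here also returning the distance of each match (the question text is of course just an illustration):

SELECT my_object,
       VECTOR_DISTANCE(
         v,
         TO_VECTOR(VECTOR_EMBEDDING(
           doc_model USING 'what is a pluggable database?' AS data))) AS dist
FROM my_items
ORDER BY dist
FETCH FIRST 4 ROWS ONLY;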

Conclusions

In this article I explored the basic elements to get started with vector embeddings in the Oracle database, and demonstrated the resulting vector search capabilities.

There is a lot more to the vector capabilities of the database: using external LLM REST APIs, vector indexing for performance, multi-vector searches, etc.

And of course this is just the foundation for an actual RAG application that would answer questions based on private uploaded documents … so stay tuned for follow-up articles to explore this topic further!

Written by Jan Leemans

Technology enthusiast, amateur cyclist, and Business Development Director for Oracle EMEA, focusing on Application Development and Data Management
