What is Weaviate

Weaviate is an open-source, cloud-native vector database that stores both objects and their vectors, which lets it combine vector search with traditional structured filtering. Weaviate can run in the following ways:

Cloud: Weaviate can run on Weaviate Cloud, a managed service that abstracts hardware and deployment away from the user.
Local (Docker): Weaviate can also run locally as a Docker container.

Core Architecture

Weaviate is organized into a three-layer architecture designed for efficient vector search and durable storage.

1. API Layer (Top)

The client-facing layer, closest to the user, which handles all external communication.

REST API: Standard HTTP endpoints for CRUD operations.
GraphQL: A query language that lets clients request exactly the fields they need in a single call, at the cost of more complex queries than REST.
gRPC: A high-performance binary protocol, originally developed at Google, for low-latency applications.

2. Search Layer (Middle)

The primary engine of Weaviate, responsible for executing queries.

Vector search (HNSW): Hierarchical Navigable Small World is an approximate nearest-neighbor algorithm. It searches a layered graph to find points very close to the query point, though not necessarily the absolute closest one, in exchange for much faster lookups.
BM25: Best Matching 25 is a ranking function for keyword search. It refines the traditional TF-IDF method with term-frequency saturation and document-length normalization.
Hybrid fusion: The vector search and BM25 each rank the documents independently, and a fusion algorithm such as Reciprocal Rank Fusion (RRF) aggregates these rankings into one unified ranking.

\mathrm{RRF}(d)=\sum_{i=1}^{N}\frac{1}{k+\mathrm{rank}_i(d)}
d = a document or item.
N = the number of rankings, usually equal to the number of independent retrievers.
k = a smoothing constant (commonly 60) that keeps the top positions of any single ranking from dominating the fused score.
rank_i(d) = the position of document d in the i-th ranking, with 1 being the most relevant.
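
As a rough illustration of the formula, the following sketch fuses two ranked lists in plain Python. The function name reciprocal_rank_fusion and the sample document IDs are made up for this example, ranks start at 1, and a document missing from a list simply contributes nothing for that list. It is not Weaviate's internal implementation.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs using RRF.

    rankings: list of lists, each ordered from most to least relevant.
    Returns (doc_id, score) pairs sorted by descending fused score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top document of each list gets rank 1.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score (descending); break ties by doc_id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from a vector search and a keyword search.
vector_hits = ["matrix", "lotr", "spirited_away"]
keyword_hits = ["matrix", "spirited_away"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

A document that appears in both lists accumulates one term per list, so here "matrix" ends up first, and "spirited_away" overtakes "lotr" even though "lotr" ranked higher in the vector list.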

3. Storage Layer (Bottom)

The persistence layer, made up of three specialized stores.

Object store: Stores the actual objects and their metadata; it is the main persistence layer.
Inverted index: Enables property filtering and keyword search.
Vector index: Holds the HNSW graph built over the stored vectors and is used for vector similarity search.
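
To make the role of the inverted index concrete, here is a toy version in plain Python: a mapping from each term to the set of object IDs containing it, plus a second mapping from property values to IDs for filtering. The class name TinyInvertedIndex and its methods are hypothetical and far simpler than Weaviate's on-disk structures.

```python
from collections import defaultdict


class TinyInvertedIndex:
    """Toy inverted index: term -> object IDs, and (property, value) -> object IDs."""

    def __init__(self):
        self.term_to_ids = defaultdict(set)
        self.prop_to_ids = defaultdict(set)

    def add(self, obj_id, text, **props):
        # Index every lowercase, whitespace-separated token of the text.
        for token in text.lower().split():
            self.term_to_ids[token].add(obj_id)
        # Index exact property values for filtering.
        for name, value in props.items():
            self.prop_to_ids[(name, value)].add(obj_id)

    def keyword_search(self, query):
        """Return IDs of objects containing every query term."""
        tokens = query.lower().split()
        if not tokens:
            return set()
        id_sets = [self.term_to_ids.get(token, set()) for token in tokens]
        return set.intersection(*id_sets)

    def filter(self, name, value):
        """Return IDs of objects whose property equals the given value."""
        return set(self.prop_to_ids.get((name, value), set()))


index = TinyInvertedIndex()
index.add(1, "A computer hacker discovers the true nature of reality", genre="Science Fiction")
index.add(2, "A young girl is trapped in a strange spirit realm", genre="Animation")

print(index.keyword_search("hacker reality"))
print(index.filter("genre", "Animation"))
```

Both lookups avoid scanning every stored object, which is the reason filters and keyword queries are served from an index of this kind and not from the object store directly.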

Implementation

Let's run Weaviate in Docker and perform a similarity search over a few sample objects and their embeddings.

Step 1: Check that Docker is installed, install the required library and create the config file:

docker --version : Shows the installed Docker version
docker info : Displays Docker system and runtime details
pip install -U weaviate-client[agents] : Installs or updates the Weaviate Python client with agent support
touch docker-compose.yaml : Creates an empty docker-compose configuration file

Python

%%shell
docker --version
docker info
pip install -U weaviate-client[agents]
touch docker-compose.yaml # Create a .yaml file for config

Step 2: Configure Weaviate and Ollama with Docker Compose by pasting the config below into the docker-compose.yaml file.

YAML

services:
  weaviate:
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: cr.weaviate.io/semitechnologies/weaviate:1.35.1
    ports:
    - 8080:8080
    - 50051:50051
    volumes:
    - weaviate_data:/var/lib/weaviate
    restart: on-failure:0
    environment:
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      ENABLE_MODULES: 'text2vec-ollama,generative-ollama'
      CLUSTER_HOSTNAME: 'node1'
      OLLAMA_API_ENDPOINT: 'http://ollama:11434'
    depends_on:
      - ollama

  ollama:
    image: ollama/ollama:0.12.9
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama

volumes:
  weaviate_data:
  ollama_data:

Step 3: Start the Weaviate and Ollama containers and pull the embedding model:

Python

!docker compose up -d
!docker compose exec ollama ollama pull nomic-embed-text # Embedding model

Step 4: Create a Weaviate collection and generate embeddings using Ollama's nomic-embed-text model:

Python

import weaviate
from weaviate.classes.config import Configure

def seed_movies():
    movie_records = [
        {
            "title": "The Matrix",
            "description": (
                "A computer hacker discovers the true nature of reality and "
                "his part in the conflict against its hidden rulers."
            ),
            "genre": "Science Fiction",
        },
        {
            "title": "Spirited Away",
            "description": (
                "A young girl is trapped in a strange spirit realm and must "
                "rescue her parents to return home."
            ),
            "genre": "Animation",
        },
        {
            "title": "The Lord of the Rings: The Fellowship of the Ring",
            "description": (
                "A humble Hobbit and his allies embark on a dangerous quest "
                "to destroy a powerful ring and save Middle-earth."
            ),
            "genre": "Fantasy",
        },
    ]
    return movie_records


# Establish connection to the local Weaviate instance
with weaviate.connect_to_local() as client:
    # Create the Movie collection with Ollama-based vectorization
    client.collections.create(
        name="Movie2",
        vector_config=Configure.Vectors.text2vec_ollama(
            model="nomic-embed-text",
            api_endpoint="http://ollama:11434",  # URL as seen from the Weaviate container; adjust if Weaviate runs outside Docker Compose
        ),
    )

    movie_collection = client.collections.use("Movie2")
    movies_to_insert = seed_movies()

    # Batch insert objects
    with movie_collection.batch.fixed_size(batch_size=200) as batch:
        for movie in movies_to_insert:
            batch.add_object(properties=movie)

    print(
        f"Imported and vectorized {len(movie_collection)} "
        f"records into the Movie2 collection"
    )

Output:

[Screenshot: vectors created and stored in the storage layer]

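Step 5: Perform a vector search. near_text embeds the query text with the collection's vectorizer and searches the HNSW index, returning the closest 'k' results. BM25 is not involved in this call; it is used by keyword and hybrid queries.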

Python

import json
import weaviate

def run_semantic_search(collection, text_query, top_k=2):
    result = collection.query.near_text(
        query=text_query,
        limit=top_k,
    )
    return result.objects

# Open a connection to the local Weaviate instance
with weaviate.connect_to_local() as client:
    movie_collection = client.collections.use("Movie2")

    hits = run_semantic_search(
        collection=movie_collection,
        text_query="sci-fi",
        top_k=2,
    )

    for item in hits:
        print(json.dumps(item.properties, indent=2))

You can find the source code here
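
The near_text call in Step 5 queries the vector index. When keyword matches should count as well, the Python client also offers query.hybrid, which blends BM25 and vector results; alpha weights the two (0 is keyword only, 1 is vector only). The helper below is a minimal sketch that mirrors run_semantic_search; the name run_hybrid_search and the default alpha=0.5 are choices made for this example.

```python
def run_hybrid_search(collection, text_query, alpha=0.5, top_k=2):
    """Run a hybrid (BM25 + vector) query and return the matching objects.

    alpha=0 uses keyword scores only, alpha=1 uses vector scores only.
    """
    result = collection.query.hybrid(
        query=text_query,
        alpha=alpha,
        limit=top_k,
    )
    return result.objects
```

Inside the `with weaviate.connect_to_local() as client:` block from Step 5, it can be called as run_hybrid_search(movie_collection, "sci-fi") in place of run_semantic_search.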

Using Weaviate for RAG

Weaviate for RAG (Retrieval-Augmented Generation) means using Weaviate as the vector database / retriever in a RAG pipeline so an LLM can ground its answers in your data.

In a standard RAG setup:

1. Ingest data: chunk the documents and create embeddings.
2. Store the embeddings: this is where Weaviate is used.
3. Retrieve the relevant chunks for a user query.
4. Generate an answer using an LLM and the retrieved context.

Weaviate handles steps 2 and 3 with its built-in embedding storage and vector search. It also integrates well with frameworks such as LangChain and LlamaIndex.
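
The four steps can be sketched end to end in plain Python. This is a toy, dependency-free stand-in, not Weaviate code: embed is a hypothetical bag-of-words embedding, the store list plays the role of the vector database, and the final LLM call is left out so that the sketch only builds the prompt.

```python
import math
from collections import Counter


def chunk(text, size=12):
    """Step 1a: split a document into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Step 1b: toy embedding, a word-count vector (real systems use a model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(store, query, top_k=1):
    """Step 3: return the top_k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item["vector"]), reverse=True)
    return [item["text"] for item in ranked[:top_k]]


def build_prompt(question, context_chunks):
    """Step 4 (without the LLM call): assemble a grounded prompt."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


documents = [
    "A computer hacker discovers the true nature of reality.",
    "A young girl is trapped in a strange spirit realm.",
]

# Step 2: store each chunk together with its embedding.
store = [{"text": c, "vector": embed(c)} for doc in documents for c in chunk(doc)]

question = "Who discovers the nature of reality?"
prompt = build_prompt(question, retrieve(store, question))
print(prompt)
```

In a Weaviate-backed pipeline, store and retrieve would be replaced by a collection and a query such as near_text, as in the implementation above, while the prompt-building step stays the same.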

Use Cases

Semantic Search: Enables similarity search over text embeddings from models like OpenAI, Cohere or SentenceTransformers.
Recommendation Systems: Matches user embeddings to item embeddings for personalized recommendations.
Image & Video Retrieval: Finds visually similar content using feature embeddings from CNNs or CLIP models.
Anomaly Detection: Identifies unusual data points in high-dimensional feature space.
Generative AI & RAG Systems: Integrates with LangChain, LlamaIndex and LLMs for embedding-based retrieval augmentation.

Weaviate vs. Other Vector Databases

Database	Best For	Key Strength
Weaviate	Hybrid Search & RAG	Native hybrid search
Pinecone	Production apps	Ease of use & reliability
Milvus	Large-scale performance	Horizontal scaling
Qdrant	Advanced filtering	Rust performance
Chroma	Prototyping & LLMs	Developer experience

Advantages

Native hybrid search: Combines semantic and keyword-based search in a single query, without a separate search engine.
Open-source flexibility: Because Weaviate is open source, it can be self-hosted, which avoids vendor lock-in.
Multi-modal support: Supports text, images and video through its vectorizer modules.

Limitations

Operational complexity (self-hosted): Running Weaviate yourself means handling deployment, scaling and upgrades, which usually calls for dedicated expertise.
Smaller ecosystem: Weaviate is relatively new, so it has seen less large-scale adoption and is less battle-tested than older databases.
Limited free trial (cloud): Weaviate Cloud only offers a 14-day free trial, which may not be enough for many use cases.
