How to Use Embeddings Model API in Spring AI

Omozegie Aziegbe, June 13th, 2025 (Last Updated: June 13th, 2025)


Embeddings are numerical vector representations of data (often text) that capture semantic meaning in a format suitable for similarity search, clustering, and downstream machine-learning tasks. Spring AI provides a clean abstraction over embedding models, enabling us to integrate powerful language model features into Spring Boot applications. This article will guide you through the Embeddings Model API in Spring AI and demonstrate how to integrate it with an Ollama model.

1. What Are Embeddings?

In this article, embeddings represent text as vectors of floating-point numbers, positioned so that texts with similar meaning end up close to each other. Instead of comparing raw strings, an application compares these vectors to perform tasks such as search, recommendation, classification, and clustering.

For example, the words “king”, “queen”, and “royalty” might be mapped to vectors that are close together in a high-dimensional space, reflecting their related meanings. Embeddings are fundamental to tasks like:

Semantic search (finding similar documents)
Text classification
Question-answering systems
Clustering or grouping similar content
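To make "close together" concrete, here is a minimal sketch of cosine similarity, a common way to compare two embedding vectors. The class name and the three-dimensional vectors are made up for illustration; real embeddings have hundreds of dimensions and come from a model.

```java
final class CosineSimilarityDemo {

    // Cosine similarity: dot(a, b) / (|a| * |b|).
    // Values near 1.0 mean the vectors point in nearly the same direction.
    static double cosineSimilarity(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // convention chosen in this sketch for zero vectors
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Hypothetical vectors, not produced by any real model.
        float[] king = {0.9f, 0.8f, 0.1f};
        float[] queen = {0.85f, 0.82f, 0.15f};
        float[] banana = {0.1f, 0.05f, 0.95f};

        System.out.println("king vs queen:  " + cosineSimilarity(king, queen));
        System.out.println("king vs banana: " + cosineSimilarity(king, banana));
    }
}
```

With these sample values, the "king" and "queen" vectors score much higher than "king" and "banana", which is the behaviour a semantic search relies on.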

Spring AI provides a clean abstraction over embedding generation through the EmbeddingModel API, which integrates seamlessly with providers like OpenAI, Hugging Face, and Ollama.

1.1 Introduction to EmbeddingModel API in Spring AI

At the heart of Spring AI’s Embeddings API is the EmbeddingModel interface. It defines the contract for generating embeddings, along with convenient methods for handling both individual and batch requests. We do not implement this interface ourselves: Spring Boot auto-configuration supplies a provider-specific implementation (e.g., for Ollama or OpenAI) based on the starter on the classpath and the configuration provided. Here is the source structure of the EmbeddingModel interface:

EmbeddingModel

The EmbeddingModel interface extends Spring AI’s generic Model<EmbeddingRequest, EmbeddingResponse> contract and represents the core abstraction for all embedding model implementations.

public interface EmbeddingModel extends Model<EmbeddingRequest, EmbeddingResponse> {

	@Override
	EmbeddingResponse call(EmbeddingRequest request);

	// other methods omitted
}

Full source code available at: Spring AI Embeddings API Documentation

The call(EmbeddingRequest) method is the primary entry point for an embedding operation. The interface also offers convenience methods: embed(String) returns the vector for a single input as a float[], embed(List<String>) returns one vector per input string, and dimensions() reports the size of the vectors the model produces.
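The relationship between the single-input, batch, and dimensions methods can be sketched without any Spring dependency. The SimpleEmbedder interface and the LetterCountEmbedder class below are hypothetical stand-ins written for this illustration; they are not part of Spring AI and only mirror the shape of the methods described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in, NOT the Spring AI interface.
interface SimpleEmbedder {

    // Single input: one text in, one vector out.
    float[] embed(String text);

    // Batch input, expressed here in terms of the single-input method.
    default List<float[]> embed(List<String> texts) {
        List<float[]> vectors = new ArrayList<>();
        for (String text : texts) {
            vectors.add(embed(text));
        }
        return vectors;
    }

    // Vector size, discovered in this sketch by embedding a probe string.
    default int dimensions() {
        return embed("probe").length;
    }
}

// Toy "model" that counts the letters a-z. It captures no semantics at all;
// it exists only so the example runs without a real embedding model.
final class LetterCountEmbedder implements SimpleEmbedder {

    @Override
    public float[] embed(String text) {
        float[] vector = new float[26];
        for (char c : text.toLowerCase(Locale.ROOT).toCharArray()) {
            if (c >= 'a' && c <= 'z') {
                vector[c - 'a']++;
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        SimpleEmbedder embedder = new LetterCountEmbedder();
        System.out.println("dimensions: " + embedder.dimensions());
        System.out.println("batch size: " + embedder.embed(List.of("spring", "ollama")).size());
    }
}
```

In a real application the injected EmbeddingModel plays the role of the embedder, and the vectors come from the configured provider rather than from letter counts.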

EmbeddingRequest

public class EmbeddingRequest implements ModelRequest<List<String>> {
	private final List<String> inputs;
	private final EmbeddingOptions options;
	// other methods omitted
}

This class wraps the list of strings to embed and optional configuration options for the model.

EmbeddingResponse

public class EmbeddingResponse implements ModelResponse<Embedding> {
	private List<Embedding> embeddings;
	private EmbeddingResponseMetadata metadata = new EmbeddingResponseMetadata();
	// other methods omitted
}

It holds the results returned by the embedding model, including the vector data and any optional metadata.

Embedding

public class Embedding implements ModelResult<float[]> {
    private float[] embedding;
    private Integer index;
    private EmbeddingResultMetadata metadata;
    // other methods omitted
}

Each Embedding object contains a float[] vector representing the input string’s meaning in high-dimensional space, along with its metadata and an index that identifies the position of the corresponding input in the request.
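The index becomes useful in batch requests, where each vector has to be matched back to the text it was computed from. Below is a minimal sketch of that pairing. The IndexedVector record is a hypothetical stand-in for a result item (a vector plus an index) and is not the Spring AI Embedding class.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class IndexAlignmentDemo {

    // Hypothetical stand-in for a result item: a vector plus the position
    // of the input it belongs to.
    record IndexedVector(float[] vector, int index) {}

    // Pairs each input string with its vector by index instead of relying on
    // the order of the result list. Duplicate input strings would overwrite
    // each other in the map; that case is ignored in this sketch.
    static Map<String, float[]> alignByIndex(List<String> inputs, List<IndexedVector> results) {
        List<IndexedVector> sorted = new ArrayList<>(results);
        sorted.sort(Comparator.comparingInt(IndexedVector::index));

        Map<String, float[]> aligned = new LinkedHashMap<>();
        for (IndexedVector result : sorted) {
            aligned.put(inputs.get(result.index()), result.vector());
        }
        return aligned;
    }

    public static void main(String[] args) {
        List<String> inputs = List.of("first text", "second text");
        // Results deliberately listed out of order; the vectors are made up.
        List<IndexedVector> results = List.of(
                new IndexedVector(new float[] {0.2f, 0.7f}, 1),
                new IndexedVector(new float[] {0.9f, 0.1f}, 0));

        alignByIndex(inputs, results).forEach(
                (text, vector) -> System.out.println(text + " has " + vector.length + " dimensions"));
    }
}
```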

2. Project Setup

To begin, create a Spring Boot project with the required dependencies. For this article, we will use Maven and integrate with Ollama.

pom.xml

This pom.xml includes Spring Boot and the Spring AI Ollama starter, which supports generating embeddings from Ollama models. Ollama must be installed and running locally.

	<properties>
		<java.version>21</java.version>
		<spring-ai.version>1.0.0</spring-ai.version>
	</properties>
	<dependencies>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-web</artifactId>
		</dependency>
		<dependency>
			<groupId>org.springframework.ai</groupId>
			<artifactId>spring-ai-starter-model-ollama</artifactId>
		</dependency>
	</dependencies>
	<dependencyManagement>
		<dependencies>
			<dependency>
				<groupId>org.springframework.ai</groupId>
				<artifactId>spring-ai-bom</artifactId>
				<version>${spring-ai.version}</version>
				<type>pom</type>
				<scope>import</scope>
			</dependency>
		</dependencies>
	</dependencyManagement>

Configuration for Ollama

For embedding models, we can configure the application.yml to point to our Ollama server and preferred model.

spring:
  ai:
    ollama:
      # Connection settings apply to the Ollama client as a whole
      base-url: "http://localhost:11434"
      embedding:
        options:
          model: "nomic-embed-text"

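This YAML file tells Spring AI to use the nomic-embed-text model served by Ollama on port 11434. The model has to be available in the local Ollama installation, for example by running ollama pull nomic-embed-text beforehand. We can change the model name if using a different embedding model available in Ollama.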

3. Creating the Embedding Service

Spring AI automatically configures the EmbeddingModel based on the provided Ollama settings. The following code snippet shows a service that uses the EmbeddingModel interface to generate embeddings from input strings.

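import java.util.List;

import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.ai.embedding.EmbeddingRequest;
import org.springframework.ai.embedding.EmbeddingResponse;
import org.springframework.stereotype.Service;

@Service
public class EmbeddingService {

    private final EmbeddingModel embeddingModel;

    // The auto-configured EmbeddingModel bean is injected through the constructor
    public EmbeddingService(EmbeddingModel embeddingModel) {
        this.embeddingModel = embeddingModel;
    }

    public EmbeddingResponse generateEmbedding(String input) {
        // Wrap the single input in a list; no per-request options are passed (null)
        EmbeddingRequest request = new EmbeddingRequest(List.of(input), null);
        return embeddingModel.call(request);
    }
}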

In this code, the generateEmbedding(String input) method creates an EmbeddingRequest with the provided input text and invokes the call() method on the EmbeddingModel to obtain an EmbeddingResponse.

4. Exposing a REST Endpoint

Next, let’s expose the embedding functionality through a REST controller.

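import org.springframework.ai.embedding.EmbeddingResponse;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/embeddings")
public class EmbeddingController {

    // Maps the JSON body {"text": "..."} so that only the text itself is embedded,
    // rather than the raw JSON string
    public record EmbeddingInput(String text) {}

    private final EmbeddingService embeddingService;

    public EmbeddingController(EmbeddingService embeddingService) {
        this.embeddingService = embeddingService;
    }

    @PostMapping
    public ResponseEntity<EmbeddingResponse> generateEmbedding(@RequestBody EmbeddingInput input) {
        EmbeddingResponse response = embeddingService.generateEmbedding(input.text());
        return ResponseEntity.ok(response);
    }
}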

The generateEmbedding method handles POST requests, delegates the embedding generation to the EmbeddingService and returns the resulting EmbeddingResponse wrapped in a ResponseEntity.

Sample Request and Output

Use curl or Postman to test the embedding service once the application is running.

curl -X POST http://localhost:8080/api/embeddings \
  -H "Content-Type: application/json" \
  -d '{"text": "Spring AI is amazing!"}'
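For readers who prefer to call the endpoint from Java, the same request can be sketched with the JDK’s built-in java.net.http client. The class name and the "--send" argument are choices made for this illustration, and the URL assumes the application runs on localhost:8080 as in the curl command.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class EmbeddingRequestBuilder {

    // Builds (but does not send) the same POST request as the curl command.
    static HttpRequest buildRequest(String baseUrl, String jsonBody) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/api/embeddings"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    public static void main(String[] args) throws Exception {
        HttpRequest request = buildRequest(
                "http://localhost:8080", "{\"text\": \"Spring AI is amazing!\"}");
        System.out.println(request.method() + " " + request.uri());

        // Only contact the server when explicitly asked to, because sending
        // requires the Spring Boot application to be running locally.
        if (args.length > 0 && args[0].equals("--send")) {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode());
            System.out.println(response.body());
        }
    }
}
```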

Sample Response

Below is a truncated version of the embedding response (the full output is much larger):

{
  "metadata": {
    "model": "nomic-embed-text",
    "usage": {
      "promptTokens": 12,
      "completionTokens": 0,
      "totalTokens": 12
    }
  },
  "result": {
    "index": 0,
    "output": [0.04198507, 0.023562618, -0.17958216, …, -0.04682972, -0.035352502]
  }
}

What This Output Means

metadata:
- "model": "nomic-embed-text" indicates which embedding model was used.
- "usage" shows how many tokens were consumed during request processing.
result:
- "index": 0 signals this is the first (and only) item in the batch request.
- The "output" array is the float vector embedding for our input text.
- We typically use this high-dimensional vector for downstream tasks like similarity searches or clustering.
- Each value in this vector represents one dimension of the semantic embedding space.
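As an illustration of such a downstream task, the following sketch ranks a few stored vectors by cosine similarity to a query vector and keeps the top matches. All names and vectors here are hypothetical; in practice the vectors would be the "output" arrays returned by the embedding endpoint, and a vector store would usually replace the in-memory map.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

final class NearestNeighborDemo {

    // A document id together with its similarity to the query.
    record Match(String id, double score) {}

    static double cosine(float[] a, float[] b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        return (normA == 0.0 || normB == 0.0) ? 0.0 : dot / Math.sqrt(normA * normB);
    }

    // Brute-force search: score every stored vector, sort by score, keep the best k.
    static List<Match> topK(float[] query, Map<String, float[]> corpus, int k) {
        List<Match> matches = new ArrayList<>();
        for (Map.Entry<String, float[]> entry : corpus.entrySet()) {
            matches.add(new Match(entry.getKey(), cosine(query, entry.getValue())));
        }
        matches.sort(Comparator.comparingDouble(Match::score).reversed());
        return List.copyOf(matches.subList(0, Math.min(k, matches.size())));
    }

    public static void main(String[] args) {
        // Made-up 3-dimensional vectors standing in for real embeddings.
        Map<String, float[]> corpus = Map.of(
                "spring-docs", new float[] {0.9f, 0.1f, 0.0f},
                "java-guide", new float[] {0.8f, 0.3f, 0.1f},
                "cooking", new float[] {0.0f, 0.2f, 0.9f});
        float[] query = {1.0f, 0.2f, 0.0f};

        for (Match match : topK(query, corpus, 2)) {
            System.out.println(match.id() + " " + match.score());
        }
    }
}
```

A brute-force scan like this is fine for a handful of vectors; larger collections generally call for an index or a dedicated vector store.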

5. Conclusion

The EmbeddingModel API in Spring AI provides a straightforward way to integrate vector representations into our applications. Because the application code depends only on the EmbeddingModel abstraction, switching to a different model or provider (for example, another embedding model served by Ollama) is largely a matter of configuration rather than rewriting business logic. This article covered the configuration, implementation, and usage of embeddings using Spring AI integrated with Ollama.