Mixing Java and Python: Building Polyglot Apps for AI and Data Science

Eleftheria Drosopoulou | October 9th, 2025 | Last Updated: September 30th, 2025

Enterprise software runs on Java. Data science thrives in Python. Rather than forcing an either-or choice, smart teams build polyglot architectures that leverage both languages where they excel.

This isn’t about abandoning your Java stack or rewriting everything in Python. It’s about surgical integration—letting Python handle ML and data science while Java manages business logic, transactions, and enterprise concerns.

Why Polyglot Makes Sense

The reality: Your Spring Boot application handles millions of transactions daily. Your data scientists just built a sophisticated recommendation model in PyTorch. You need both working together.

Monoglot limitations:

Pure Java misses Python’s ML ecosystem richness
Pure Python struggles with enterprise requirements—complex transactions, strict typing, mature frameworks
Microservices for everything adds latency and operational overhead

Polyglot advantages:

Use each language for its strengths
Preserve existing investments
Enable parallel development—Java and Python teams work independently

Approach 1: JNI and Native Calls

Java Native Interface lets Java code call native libraries directly. Python can compile to native code using tools like Cython, creating a direct bridge.

When to use: Maximum performance, minimal latency, tight coupling needed.

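Java calling Python via JNI (sketched with JPype-style names; JPype is primarily a Python package that starts a JVM from the Python side, so check the Java-side calls below against the bridge you actually adopt, such as Jep):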

First, add the dependency:

<dependency>
    <groupId>org.jpype</groupId>
    <artifactId>jpype</artifactId>
    <version>1.4.1</version>
</dependency>

Java implementation:

import org.jpype.JPype;
import org.jpype.JPypeContext;

public class PythonBridge {
    private JPypeContext context;
    
    public PythonBridge() {
        context = JPype.startJVM();
    }
    
    public double[] runPrediction(double[] features) {
        try {
            // Import Python module
            var py = context.importModule("builtins");
            var module = py.getAttr("__import__")
                .call("ml_model");
            
            // Call Python function
            var predictor = module.getAttr("predict");
            var result = predictor.call(features);
            
            // Convert result back to Java
            return (double[]) result.getValue();
            
        } catch (Exception e) {
            throw new RuntimeException("Prediction failed", e);
        }
    }
    
    public void cleanup() {
        JPype.shutdownJVM();
    }
}

Usage in Spring service:

@Service
public class RecommendationService {
    private final PythonBridge bridge;
    
    public RecommendationService() {
        this.bridge = new PythonBridge();
    }
    
    public List<Product> getRecommendations(User user) {
        double[] userFeatures = extractFeatures(user);
        double[] scores = bridge.runPrediction(userFeatures);
        
        return Arrays.stream(scores)
            .mapToObj(score -> findProductByScore(score))
            .collect(Collectors.toList());
    }
    
    private double[] extractFeatures(User user) {
        return new double[]{
            user.getAge(),
            user.getPurchaseHistory().size(),
            user.getAverageOrderValue(),
            user.getDaysSinceLastPurchase()
        };
    }
}

Trade-offs:

Pros: Fast, in-process, no network overhead
Cons: Complex setup, version compatibility issues, debugging difficulties, JVM crashes affect entire app
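
The Java bridge above imports a Python module named ml_model and calls its predict function, but the module itself is not shown. A minimal sketch of what it might look like follows. It assumes a hand-written logistic scorer standing in for a real trained model; the weights, bias, and two-element return value are hypothetical placeholders, not part of the original design.

```python
# ml_model.py - hypothetical module loaded by the Java bridge
import math
from typing import Sequence

# Placeholder parameters; a real module would load a trained model instead
_WEIGHTS = [0.02, 0.10, 0.005, -0.01]
_BIAS = -1.0


def predict(features: Sequence[float]) -> list[float]:
    """Return [p_negative, p_positive] for one feature vector."""
    if len(features) != len(_WEIGHTS):
        raise ValueError(
            f"expected {len(_WEIGHTS)} features, got {len(features)}"
        )
    # Weighted sum of the inputs followed by a logistic squash
    z = _BIAS + sum(w * float(x) for w, x in zip(_WEIGHTS, features))
    p = 1.0 / (1.0 + math.exp(-z))
    return [1.0 - p, p]
```

Returning a plain list of floats keeps the conversion back to a Java double[] as simple as the bridge allows, and the length check surfaces a feature-count mismatch between the two languages as a clear error rather than a silent wrong score.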

Approach 2: REST APIs for Loose Coupling

The most common pattern: Python services expose REST endpoints, Java services consume them.

When to use: Need independence, separate deployment cycles, scalability requirements differ.

Python service (Flask example):

from flask import Flask, request, jsonify
import numpy as np
import joblib

app = Flask(__name__)
model = joblib.load('recommendation_model.pkl')

@app.route('/predict', methods=['POST'])
def predict():
    data = request.get_json()
    features = np.array(data['features']).reshape(1, -1)
    predictions = model.predict_proba(features)[0]
    
    return jsonify({
        'predictions': predictions.tolist(),
        'model_version': '1.2.0'
    })

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)

Java client (using Spring RestTemplate):

@Service
public class MLPredictionClient {
    private final RestTemplate restTemplate;
    private final String mlServiceUrl;
    
    public MLPredictionClient(
            RestTemplateBuilder builder,
            @Value("${ml.service.url}") String mlServiceUrl) {
        this.restTemplate = builder
            .setConnectTimeout(Duration.ofSeconds(5))
            .setReadTimeout(Duration.ofSeconds(10))
            .build();
        this.mlServiceUrl = mlServiceUrl;
    }
    
    public PredictionResponse predict(double[] features) {
        PredictionRequest request = new PredictionRequest(features);
        
        try {
            ResponseEntity<PredictionResponse> response = 
                restTemplate.postForEntity(
                    mlServiceUrl + "/predict",
                    request,
                    PredictionResponse.class
                );
            
            return response.getBody();
            
        } catch (RestClientException e) {
            log.error("ML service call failed", e);
            return fallbackPrediction();
        }
    }
    
    private PredictionResponse fallbackPrediction() {
        // Return cached or default predictions
        return new PredictionResponse(
            new double[]{0.5, 0.3, 0.2},
            "fallback"
        );
    }
}

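record PredictionRequest(double[] features) {}
// The Flask service returns "model_version", so map it explicitly
// (requires com.fasterxml.jackson.annotation.JsonProperty)
record PredictionResponse(
    double[] predictions,
    @JsonProperty("model_version") String modelVersion) {}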

Production patterns:

@Configuration
public class MLClientConfig {
    
    @Bean
    public RestTemplate mlRestTemplate(RestTemplateBuilder builder) {
        return builder
            .setConnectTimeout(Duration.ofSeconds(5))
            .setReadTimeout(Duration.ofSeconds(10))
            .interceptors(new MLServiceInterceptor())
            .errorHandler(new MLServiceErrorHandler())
            .build();
    }
}

// Circuit breaker pattern
@Service
public class ResilientMLService {
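    private final MLPredictionClient client;
    
    public ResilientMLService(MLPredictionClient client) {
        this.client = client;
    }
    
    // The breaker only counts failures that propagate as exceptions, so the
    // wrapped client must rethrow rather than return its own fallback
    @CircuitBreaker(name = "mlService", fallbackMethod = "fallbackPredict")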
    public PredictionResponse predict(double[] features) {
        return client.predict(features);
    }
    
    private PredictionResponse fallbackPredict(
            double[] features, 
            Exception e) {
        log.warn("Using fallback predictions due to: {}", 
            e.getMessage());
        return getCachedPrediction(features);
    }
}

Trade-offs:

Pros: Language independence, separate scaling, easier debugging, clear boundaries
Cons: Network latency (5-50ms typical), serialization overhead, requires service discovery/management
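
The Flask handler above reads data['features'] without checking it, so a malformed request from the Java side surfaces as an opaque 500 error. One way to tighten the boundary is a small validation helper that the endpoint calls before touching the model. The sketch below uses only the standard library; the names BadRequest and parse_features, and the idea of a fixed expected length, are assumptions made for illustration.

```python
import math
from typing import Any


class BadRequest(ValueError):
    """Raised when the request body cannot be turned into a feature vector."""


def parse_features(payload: Any, expected_len: int) -> list[float]:
    """Validate a decoded JSON body and return the features as floats."""
    if not isinstance(payload, dict) or "features" not in payload:
        raise BadRequest("body must be a JSON object with a 'features' key")
    raw = payload["features"]
    if not isinstance(raw, list) or len(raw) != expected_len:
        raise BadRequest(f"'features' must be a list of {expected_len} numbers")
    features = []
    for value in raw:
        # bool is a subclass of int, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise BadRequest("every feature must be a number")
        try:
            number = float(value)
        except OverflowError:
            raise BadRequest("feature value out of range") from None
        if not math.isfinite(number):
            raise BadRequest("features must be finite")
        features.append(number)
    return features
```

In the endpoint, catching BadRequest and returning a 400 response with the message gives the Java client something it can log and act on, instead of a generic server error.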

Approach 3: gRPC for High Performance

When REST is too slow but you need loose coupling, gRPC delivers binary protocols with strong typing.

Protocol definition (prediction.proto):

syntax = "proto3";

service MLService {
    rpc Predict (PredictRequest) returns (PredictResponse);
    rpc BatchPredict (stream PredictRequest) returns (stream PredictResponse);
}

message PredictRequest {
    repeated double features = 1;
    string model_id = 2;
}

message PredictResponse {
    repeated double predictions = 1;
    string model_version = 2;
    double confidence = 3;
}

Java client implementation:

@Service
public class GrpcMLClient {
    private final MLServiceGrpc.MLServiceBlockingStub blockingStub;
    private final MLServiceGrpc.MLServiceStub asyncStub;
    
    public GrpcMLClient(
            @Value("${ml.grpc.host}") String host,
            @Value("${ml.grpc.port}") int port) {
        
        ManagedChannel channel = ManagedChannelBuilder
            .forAddress(host, port)
            .usePlaintext()
            .build();
        
        this.blockingStub = MLServiceGrpc.newBlockingStub(channel);
        this.asyncStub = MLServiceGrpc.newStub(channel);
    }
    
    public PredictResponse predict(List<Double> features, String modelId) {
        PredictRequest request = PredictRequest.newBuilder()
            .addAllFeatures(features)
            .setModelId(modelId)
            .build();
        
        try {
            return blockingStub
                .withDeadlineAfter(500, TimeUnit.MILLISECONDS)
                .predict(request);
                
        } catch (StatusRuntimeException e) {
            log.error("gRPC call failed: {}", e.getStatus());
            throw new MLServiceException("Prediction failed", e);
        }
    }
    
    // Async batch processing
    public CompletableFuture<List<PredictResponse>> batchPredict(
            List<List<Double>> featureSets) {
        
        CompletableFuture<List<PredictResponse>> future = 
            new CompletableFuture<>();
        List<PredictResponse> responses = new ArrayList<>();
        
        StreamObserver<PredictResponse> responseObserver = 
            new StreamObserver<>() {
                @Override
                public void onNext(PredictResponse response) {
                    responses.add(response);
                }
                
                @Override
                public void onCompleted() {
                    future.complete(responses);
                }
                
                @Override
                public void onError(Throwable t) {
                    future.completeExceptionally(t);
                }
            };
        
        StreamObserver<PredictRequest> requestObserver = 
            asyncStub.batchPredict(responseObserver);
        
        featureSets.forEach(features -> {
            PredictRequest request = PredictRequest.newBuilder()
                .addAllFeatures(features)
                .build();
            requestObserver.onNext(request);
        });
        
        requestObserver.onCompleted();
        return future;
    }
}

Usage in service layer:

@Service
public class ProductRecommendationService {
    private final GrpcMLClient mlClient;
    private final ProductRepository productRepository;
    
    public List<Product> recommendProducts(
            User user, 
            int count) {
        
        List<Double> features = buildFeatureVector(user);
        PredictResponse response = mlClient.predict(
            features, 
            "product-recommender-v2"
        );
        
        return response.getPredictionsList().stream()
            .limit(count)
            .map(score -> findProductByScore(score))
            .filter(Objects::nonNull)
            .collect(Collectors.toList());
    }
    
    private List<Double> buildFeatureVector(User user) {
        return List.of(
            (double) user.getAge(),
            user.getLifetimeValue(),
            (double) user.getDaysSinceRegistration(),
            user.getAveragePurchaseFrequency()
        );
    }
}

Trade-offs:

Pros: 2-5x faster than REST, streaming support, strong typing, efficient binary protocol
Cons: More complex setup, less human-readable, requires .proto file coordination

Approach 4: Message Queues for Async Processing

When immediate responses aren’t needed, message queues decouple Java and Python completely.

When to use: Batch processing, background jobs, event-driven architectures.

Java producer (RabbitMQ):

@Service
public class MLJobProducer {
    private final RabbitTemplate rabbitTemplate;
    
    @Value("${ml.queue.name}")
    private String mlQueueName;
    
    public MLJobProducer(RabbitTemplate rabbitTemplate) {
        this.rabbitTemplate = rabbitTemplate;
    }
    
    public String submitPredictionJob(PredictionJob job) {
        String jobId = UUID.randomUUID().toString();
        
        MLJobMessage message = MLJobMessage.builder()
            .jobId(jobId)
            .features(job.getFeatures())
            .modelId(job.getModelId())
            .priority(job.getPriority())
            .timestamp(Instant.now())
            .build();
        
        rabbitTemplate.convertAndSend(
            mlQueueName,
            message,
            msg -> {
                msg.getMessageProperties()
                    .setCorrelationId(jobId);
                msg.getMessageProperties()
                    .setPriority(job.getPriority());
                return msg;
            }
        );
        
        log.info("Submitted ML job: {}", jobId);
        return jobId;
    }
}

Java consumer for results:

@Service
public class MLResultConsumer {
    private final Map<String, CompletableFuture<PredictionResult>> 
        pendingJobs = new ConcurrentHashMap<>();
    
    @RabbitListener(queues = "${ml.results.queue}")
    public void handleResult(MLResultMessage result) {
        String jobId = result.getJobId();
        
        CompletableFuture<PredictionResult> future = 
            pendingJobs.remove(jobId);
        
        if (future != null) {
            PredictionResult predictionResult = PredictionResult.builder()
                .predictions(result.getPredictions())
                .confidence(result.getConfidence())
                .modelVersion(result.getModelVersion())
                .processingTime(result.getProcessingTime())
                .build();
            
            future.complete(predictionResult);
        } else {
            log.warn("Received result for unknown job: {}", jobId);
        }
    }
    
    public CompletableFuture<PredictionResult> awaitResult(String jobId) {
        CompletableFuture<PredictionResult> future = new CompletableFuture<>();
        pendingJobs.put(jobId, future);
        
        // Time out after 30 seconds; remove(key, value) is atomic, so a
        // result that arrives concurrently is not lost to this cleanup
        CompletableFuture.delayedExecutor(30, TimeUnit.SECONDS)
            .execute(() -> {
                if (pendingJobs.remove(jobId, future)) {
                    future.completeExceptionally(
                        new TimeoutException("Job timeout: " + jobId)
                    );
                }
            });
        
        return future;
    }
}

Orchestration service:

@Service
public class AsyncMLService {
    private final MLJobProducer producer;
    private final MLResultConsumer consumer;
    
    public CompletableFuture<PredictionResult> predictAsync(
            double[] features, 
            String modelId) {
        
        PredictionJob job = PredictionJob.builder()
            .features(features)
            .modelId(modelId)
            .priority(5)
            .build();
        
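        // Caveat: the future is registered only after the job is published,
        // so a very fast worker could reply before awaitResult runs;
        // registering the job ID before sending would close that window
        String jobId = producer.submitPredictionJob(job);
        return consumer.awaitResult(jobId);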
    }
    
    public List<PredictionResult> predictBatch(
            List<double[]> featuresList,
            String modelId) throws Exception {
        
        List<CompletableFuture<PredictionResult>> futures = 
            featuresList.stream()
                .map(features -> predictAsync(features, modelId))
                .collect(Collectors.toList());
        
        return CompletableFuture.allOf(
            futures.toArray(new CompletableFuture[0])
        )
        .thenApply(v -> futures.stream()
            .map(CompletableFuture::join)
            .collect(Collectors.toList())
        )
        .get(60, TimeUnit.SECONDS);
    }
}

Trade-offs:

Pros: Complete decoupling, automatic retries, load leveling, handles service failures gracefully
Cons: Added complexity, eventual consistency, requires message broker infrastructure
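
The Java producer and consumer above imply a Python worker in between that reads each job and publishes a result. Its core can be sketched as a pure function, leaving the broker wiring to whichever client library is used. This assumes messages are JSON with camelCase keys matching the Java getters (jobId, features, predictions, confidence, modelVersion, processingTime). The error key, the millisecond unit, and using the largest score as the confidence are further assumptions, not part of the original design.

```python
import json
import time
from typing import Callable, Sequence

Predictor = Callable[[Sequence[float]], Sequence[float]]


def handle_job(body: bytes, predict: Predictor, model_version: str) -> bytes:
    """Turn one job message into one result message.

    Raises if the body is not valid JSON or has no jobId; the caller
    should reject or dead-letter such messages.
    """
    job = json.loads(body)
    job_id = job["jobId"]  # echoed back so Java can match its pending future
    started = time.perf_counter()
    try:
        predictions = [float(p) for p in predict(job["features"])]
        result = {
            "jobId": job_id,
            "predictions": predictions,
            # Assumption: report the largest score as the confidence
            "confidence": max(predictions, default=0.0),
            "modelVersion": model_version,
        }
    except Exception as exc:
        # Hypothetical error shape: reply anyway so the Java side
        # does not have to wait for its timeout
        result = {"jobId": job_id, "error": str(exc)}
    # Assumption: processing time is reported in whole milliseconds
    result["processingTime"] = round((time.perf_counter() - started) * 1000)
    return json.dumps(result).encode("utf-8")
```

Keeping the handler free of broker calls makes it easy to unit-test, and the consumer callback then only has to pass the message body in and publish the returned bytes to the results queue.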

Approach 5: Embedded Python with GraalVM

GraalVM allows running Python code directly in the JVM using its polyglot capabilities.

When to use: Simple Python scripts, need tight integration, want single deployment artifact.

Java implementation:

import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Value;

@Service
public class EmbeddedPythonService {
    private final Context pythonContext;
    
    public EmbeddedPythonService() {
        this.pythonContext = Context.newBuilder("python")
            .allowAllAccess(true)
            .build();
        
        // Load Python module at startup
        pythonContext.eval("python", 
            "import numpy as np\n" +
            "def normalize(data):\n" +
            "    return (data - np.mean(data)) / np.std(data)"
        );
    }
    
    public double[] normalizeData(double[] data) {
        Value pythonFunc = pythonContext
            .getBindings("python")
            .getMember("normalize");
        
        Value result = pythonFunc.execute((Object) data);
        
        double[] normalized = new double[data.length];
        for (int i = 0; i < data.length; i++) {
            normalized[i] = result.getArrayElement(i).asDouble();
        }
        
        return normalized;
    }
    
    public Map<String, Object> runDataAnalysis(List<Double> values) {
        String script = String.format(
            "import statistics\n" +
            "data = %s\n" +
            "result = {\n" +
            "    'mean': statistics.mean(data),\n" +
            "    'median': statistics.median(data),\n" +
            "    'stdev': statistics.stdev(data)\n" +
            "}",
            values.toString()
        );
        
        pythonContext.eval("python", script);
        Value resultDict = pythonContext
            .getBindings("python")
            .getMember("result");
        
        // A Python dict is exposed through the polyglot hash API,
        // so entries are read with getHashValue rather than getMember
        return Map.of(
            "mean", resultDict.getHashValue("mean").asDouble(),
            "median", resultDict.getHashValue("median").asDouble(),
            "stdev", resultDict.getHashValue("stdev").asDouble()
        );
    }
    
    @PreDestroy
    public void cleanup() {
        pythonContext.close();
    }
}

Trade-offs:

Pros: Single deployment, no network calls, simple for scripts
Cons: Limited library support, performance overhead, GraalVM setup complexity, not suitable for heavy ML
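
Because library support is the main constraint here, scripts that stay within the standard library are the safest fit for embedding. As a sketch, the normalize function from the example could be written without NumPy as shown below. Returning zeros for constant input is a choice made for this illustration, not behaviour taken from the original.

```python
import statistics
from typing import Sequence


def normalize(data: Sequence[float]) -> list[float]:
    """Z-score each value using the population standard deviation."""
    mean = statistics.fmean(data)
    # pstdev matches np.std's default (ddof=0); statistics.stdev would not
    std = statistics.pstdev(data)
    if std == 0:
        # All values identical: avoid dividing by zero
        return [0.0 for _ in data]
    return [(x - mean) / std for x in data]
```

The choice between pstdev and stdev matters when porting: NumPy's std defaults to the population formula, so using the sample version would give slightly different numbers on the Java side.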

Production Architecture Example

Real-world polyglot system combining approaches:

@Configuration
public class MLInfrastructureConfig {
    
    // Fast, synchronous predictions via gRPC
    @Bean
    public GrpcMLClient realtimeMLClient() {
        return new GrpcMLClient("ml-service.internal", 9000);
    }
    
    // Batch processing via message queue
    @Bean
    public MLJobProducer batchMLProducer(RabbitTemplate template) {
        return new MLJobProducer(template);
    }
    
    // Fallback REST client with circuit breaker
    @Bean
    public MLPredictionClient fallbackMLClient(RestTemplateBuilder builder) {
        return new MLPredictionClient(builder, "http://ml-fallback");
    }
}

@Service
public class HybridMLService {
    private final GrpcMLClient primaryClient;
    private final MLPredictionClient fallbackClient;
    private final MLJobProducer batchProducer;
    
    // Real-time: gRPC with REST fallback
    @CircuitBreaker(name = "primaryML", fallbackMethod = "fallbackPredict")
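    public PredictionResponse predict(double[] features) {
        PredictResponse grpcResponse = primaryClient.predict(
            Arrays.stream(features).boxed().collect(Collectors.toList()),
            "default-model"
        );
        // Map the gRPC message onto the REST DTO so both paths
        // share one return type
        return new PredictionResponse(
            grpcResponse.getPredictionsList().stream()
                .mapToDouble(Double::doubleValue)
                .toArray(),
            grpcResponse.getModelVersion()
        );
    }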
    
    private PredictionResponse fallbackPredict(
            double[] features, 
            Exception e) {
        log.warn("Primary ML failed, using REST fallback");
        return fallbackClient.predict(features);
    }
    
    // Batch: message queue
    // Assumes PredictionJob's builder also accepts a batch of feature vectors
    public String submitBatchJob(List<double[]> featureSets) {
        return batchProducer.submitPredictionJob(
            PredictionJob.builder()
                .features(featureSets)
                .modelId("batch-model-v3")
                .build()
        );
    }
}

Monitoring and Observability

Track polyglot interactions:

@Component
@Aspect
public class MLServiceMonitoring {
    private final MeterRegistry registry;
    
    @Around("@annotation(MLOperation)")
    public Object monitorMLCall(ProceedingJoinPoint joinPoint) 
            throws Throwable {
        
        Timer.Sample sample = Timer.start(registry);
        String operation = joinPoint.getSignature().getName();
        
        try {
            Object result = joinPoint.proceed();
            
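            // Keep the same tag keys on success and error; some registries
            // (Prometheus, for example) reject one meter name with
            // differing tag sets
            sample.stop(Timer.builder("ml.operation.duration")
                .tag("operation", operation)
                .tag("status", "success")
                .tag("error", "none")
                .register(registry));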
            
            return result;
            
        } catch (Exception e) {
            sample.stop(Timer.builder("ml.operation.duration")
                .tag("operation", operation)
                .tag("status", "error")
                .tag("error", e.getClass().getSimpleName())
                .register(registry));
            
            registry.counter("ml.operation.errors",
                "operation", operation,
                "error", e.getClass().getSimpleName()
            ).increment();
            
            throw e;
        }
    }
}
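
The aspect above covers only the Java half of each call. To see where time is spent, the Python side can record the same operation and status dimensions. A minimal sketch follows, using an in-memory dictionary as a stand-in for a real metrics backend; the names TIMINGS and ml_operation are hypothetical.

```python
import functools
import time
from collections import defaultdict

# operation name -> list of (status, duration in seconds)
TIMINGS = defaultdict(list)


def ml_operation(func):
    """Record how long each call takes and whether it succeeded."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        started = time.perf_counter()
        status = "success"
        try:
            return func(*args, **kwargs)
        except Exception:
            status = "error"
            raise  # the caller still sees the original exception
        finally:
            TIMINGS[func.__name__].append(
                (status, time.perf_counter() - started)
            )
    return wrapper
```

Decorating the prediction function with @ml_operation yields a Python-side duration that can be compared with the Java-side timer; the gap between the two is roughly the transport and serialization cost.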

Decision Framework

Choose JNI/JPype when:

Need sub-millisecond latency
Tight coupling acceptable
Small, stable Python modules

Choose REST when:

Services owned by different teams
Need language/framework independence
Occasional calls (< 100 QPS)

Choose gRPC when:

High throughput requirements (1000+ QPS)
Need streaming or bidirectional communication
Performance critical

Choose Message Queues when:

Async processing acceptable
Need reliability and retry logic
Variable load patterns

Choose GraalVM when:

Simple Python scripts
Want single deployment
Limited external dependencies
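
The framework above can be condensed into a small rule function. This is a sketch under assumptions: the field names are hypothetical, the order in which the rules are checked is one reasonable reading rather than something the lists prescribe, and workloads between 100 and 1000 QPS fall through to REST as the simplest default.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    peak_qps: int
    async_ok: bool = False              # callers can wait for a later result
    needs_streaming: bool = False
    needs_sub_ms_latency: bool = False
    tight_coupling_ok: bool = False
    simple_script: bool = False         # few or no external Python dependencies


def choose_integration(w: Workload) -> str:
    """Map workload traits to one of the five approaches."""
    if w.needs_sub_ms_latency and w.tight_coupling_ok:
        return "JNI/JPype"
    if w.async_ok:
        return "message queue"
    if w.simple_script:
        return "GraalVM"
    if w.needs_streaming or w.peak_qps >= 1000:
        return "gRPC"
    # Default: start with the simplest option
    return "REST"
```

For example, choose_integration(Workload(peak_qps=50)) returns "REST", while a streaming workload at the same rate returns "gRPC".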

The Bottom Line

Polyglot architecture isn’t about choosing sides—it’s about pragmatic engineering. Java handles what Java does best: enterprise integration, transaction management, type safety. Python excels at ML and data science with unmatched libraries and flexibility.

Build bridges, not walls. Your architecture should enable collaboration between languages, not force artificial choices. Start with the simplest approach that meets your requirements—usually REST APIs. Optimize to gRPC or message queues only when you have concrete performance needs.

The best system is the one that ships and scales. Polyglot done right gives you both.
