Ollamac Java | Work
Introduction: The Shift Toward Private, On-Premise AI

For the past two years, the software engineering world has been obsessed with cloud-based large language models (LLMs) like GPT-4, Claude, and Gemini. However, a quiet revolution is taking place in enterprise Java departments: concerns over data privacy, latency, and API costs are driving developers to run LLMs locally. Enter Ollama, the tool that makes running models like Llama 3, Mistral, and Phi-3 as easy as ollama run llama3. But Java developers face a critical question: how do we bridge the gap between Ollama's Go-based HTTP server and a production-grade JVM application?
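    import okhttp3.*;
    import com.fasterxml.jackson.core.JsonProcessingException;
    import com.fasterxml.jackson.databind.JsonNode;
    import com.fasterxml.jackson.databind.ObjectMapper;
    import org.springframework.web.reactive.function.client.WebClient;
    import reactor.core.publisher.Flux;
    import java.io.IOException;
    import java.util.Map;

    public class OllamaHttpClient {
        private static final String OLLAMA_URL = "http://localhost:11434/api/generate";
        private final OkHttpClient client = new OkHttpClient();
        private final ObjectMapper mapper = new ObjectMapper();

        // Blocking call (method name and request construction are assumed):
        // "stream": false makes Ollama return a single JSON object
        public String generate(String model, String prompt) throws IOException {
            String json = mapper.writeValueAsString(
                    Map.of("model", model, "prompt", prompt, "stream", false));
            Request request = new Request.Builder()
                    .url(OLLAMA_URL)
                    .post(RequestBody.create(json, MediaType.get("application/json")))
                    .build();
            try (Response response = client.newCall(request).execute()) {
                JsonNode root = mapper.readTree(response.body().string());
                return root.get("response").asText();
            }
        }

        // Streaming call: Ollama emits one JSON object per line
        public Flux<String> streamGenerate(String model, String prompt) {
            return WebClient.create("http://localhost:11434")
                    .post()
                    .uri("/api/generate")
                    .bodyValue(Map.of("model", model, "prompt", prompt, "stream", true))
                    .retrieve()
                    .bodyToFlux(String.class)
                    .map(this::extractToken);
        }

        private String extractToken(String chunk) {
            // Parse JSON lines, extract "response" field
            try {
                return mapper.readTree(chunk).path("response").asText();
            } catch (JsonProcessingException e) {
                throw new IllegalStateException("Malformed chunk: " + chunk, e);
            }
        }
    }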
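For teams that would rather not add OkHttp or Spring just to try this out, the same POST to /api/generate can be sketched with only the JDK's built-in java.net.http API (Java 11+). This is a minimal sketch, not a definitive implementation: the class name OllamaRequestSketch and the helpers jsonEscape, buildBody, and buildGenerateRequest are hypothetical names introduced here, the 60-second timeout is an arbitrary assumption, and the hand-rolled JSON escaping only stands in for a real JSON library such as Jackson.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

public class OllamaRequestSketch {
    // Same endpoint the article's clients use
    static final String OLLAMA_URL = "http://localhost:11434/api/generate";

    // Minimal JSON string escaping: quotes, backslashes, and control characters
    static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.toString();
    }

    // Builds the JSON body with the three fields the article sends
    static String buildBody(String model, String prompt, boolean stream) {
        return "{\"model\":\"" + jsonEscape(model)
                + "\",\"prompt\":\"" + jsonEscape(prompt)
                + "\",\"stream\":" + stream + "}";
    }

    // Builds (but does not send) a non-streaming generate request
    static HttpRequest buildGenerateRequest(String model, String prompt) {
        return HttpRequest.newBuilder(URI.create(OLLAMA_URL))
                .timeout(Duration.ofSeconds(60)) // assumed value, tune as needed
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(buildBody(model, prompt, false)))
                .build();
    }

    public static void main(String[] args) {
        // Prints the request body without contacting a server
        System.out.println(buildBody("llama3", "Say \"hi\"", false));
    }
}
```

With Ollama running locally, the request could then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) and the "response" field read from the returned JSON, just as the OkHttp version does.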