Irish inference · EU GDPR · Nvidia H200
Free, local, European
inference.
LeemerLabs runs frontier open-weight models on Nvidia H200s in Ireland. An OpenAI-compatible gateway, generous free quotas, and a data path that never leaves the EU. Built to serve the Leemer Group — opened up for everyone else.
EU GDPR
Native, not bolted on
Ireland-hosted
Waterford + Dublin
Nvidia H200
Frontier inference
Free tier
No card required
POST /v1/chat/completions
{
"model": "lfm2.5-350m-free",
"messages": [
{ "role": "user",
"content": "Parse this JSON →" }
],
"stream": true
}
Throughput
200 tok/s free
Capable of 40,400 tok/s on an H100
Hardware
H200 · 141GB HBM3e
LFM2.5-350M
Our first hosted model
The Leemer Group
An ecosystem of products — built on one piece of infrastructure.
LeemerLabs exists because our own products outgrew generic APIs. Today the same infrastructure that powers LeemerChat, Foundry, and Critique is available to developers across Europe.
Flagship
LeemerChat
The flagship product of the Leemer Group. Every frontier model in one workspace — GPT-5.4, Claude Opus, Gemini 3, Kimi K2.5, GLM-5.1 — answering together through the KingLeemer consensus architecture.
10B
tokens processed
20+
frontier models
The reason for LeemerLabs
LeemerFoundry
Ireland's first custom LLM creation studio. Your data, your model, our GPUs. Fine-tune open-weights from 1B to 235B parameters on distributed infrastructure. We built LeemerLabs to serve Foundry at scale.
4 wks
data → deployment
235B
MoE supported
Newest from the Group
Critique.sh
GitHub-native AI pull request review with sandbox-backed analysis. The sandbox writes the final artifact. The app publishes it. Reviews become infrastructure — inspectable, repeatable, owned.
v3.1
sandbox-native
48h
release cadence
Why LeemerLabs
Inference built on the assumption that Europe matters.
Nvidia H200 · 141GB HBM3e
Hosted in Ireland · Zero trans-Atlantic routing
Primitive
Local inference in Ireland
Every request is served from European data centres. No trans-Atlantic hops, no silent routing through third-party clouds. Your data stays where the law says it should.
Primitive
EU GDPR protected by default
The architecture is GDPR-native. Data residency, purpose limitation, and deletion rights are primitives in the gateway — not policy documents.
Primitive
Powered by Nvidia H200
Frontier-grade accelerators with 141GB of HBM3e per card. Enough memory to run modern MoE models without the latency penalty of weight streaming.
Primitive
Free inference tier
A real free tier, not a trial. Generous limits, an OpenAI-compatible gateway, and a single public model alias to start against. Paid tiers are opt-in when scale demands it.
How a request flows
Five steps. Zero trans-Atlantic hops.
- 01
TLS 1.3 terminates in Dublin
Inside the EU boundary. Nothing cached at the edge.
- 02
Auth + rate limit
Metered record opens — token counts only, never body.
- 03
Dispatched to an H200 worker
Waterford or Dublin. Prompt is held in memory only.
- 04
Response streams back
Straight to your client. Worker memory is freed at end of request.
- 05
Record finalised
Zero-retention mode purges even the metered row after reconciliation.
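On the client side, step 04 arrives as OpenAI-style server-sent events: each chunk is a `data:` line carrying a JSON delta, terminated by a `data: [DONE]` sentinel. A minimal parser sketch, assuming the gateway mirrors the OpenAI chat-completions chunk shape exactly (the field names below are the OpenAI ones, not confirmed against the live gateway):

```python
import json

def extract_deltas(sse_lines):
    """Pull content fragments out of OpenAI-style SSE stream lines.

    Assumes the standard chat-completions chunk shape; field names
    here follow the OpenAI convention the gateway advertises.
    """
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alives and comment lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]

# Synthetic chunks shaped like a streamed response:
lines = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
print("".join(extract_deltas(lines)))  # → Hello
```

Because the worker frees its memory at end of request, the client's reassembled string is the only copy of the response that survives the flow.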
Founder note
“The future of AI is not a single super-intelligence. It is systems that coordinate intelligence well.”
— Repath ‘Ray’ Khan · KingLeemer launch, Feb 2026
LeemerLabs was founded by Ray in Waterford: an Ireland-based builder who has shipped seven AI products since 2023, and who argues that coordination, sovereignty, and open weights are how serious organisations should operate.
Founding lineup
Our starting models.
Two founding models, one for speed and one for depth. Liquid LFM2.5-350M handles every hot-path request on LeemerChat and will soon run fully offline in your browser. Gemma 4 26B A4B picks up where Liquid stops — more than 10× the active parameters, still served at 40+ tokens per second.
Liquid LFM2.5-350M
alias · lfm2.5-350m-free
Routing, titles, safety gating — every fast path on LeemerChat. Capable of 40,400 tok/s on a single H100; we serve the free tier at a throttled but still blazing 200 tok/s. Offline mode in the browser is next.
350M
params
32K
context
200
tok/s free
Gemma 4 26B A4B
alias · gemma4-26b-a4b
Google DeepMind's MoE Gemma 4. 25.2B total parameters, 3.8B active, more than 10× the active parameters of our speed model. Multimodal, 256K context, native thinking mode, sustained 40+ tok/s.
25.2B
total
3.8B
active
256K
context
Start building
OpenAI-compatible. EU-hosted. Free to start.
Point your client at https://api.leemerlabs.ie/v1, use your existing OpenAI SDK, and stay on European compute.
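For clients without an SDK, the same request can be composed with nothing but the standard library, assuming the gateway follows the OpenAI chat-completions wire format (as the example at the top of this page suggests). The `Authorization: Bearer` header is an assumption carried over from the OpenAI convention, not something this page specifies:

```python
import json
import urllib.request

BASE_URL = "https://api.leemerlabs.ie/v1"  # EU-hosted gateway from this page

def build_chat_request(prompt, api_key, model="lfm2.5-350m-free"):
    """Assemble a chat-completions request against the LeemerLabs gateway.

    The Bearer-token header is assumed from the OpenAI convention;
    check the real gateway docs before relying on it.
    """
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

req = build_chat_request("Parse this JSON →", api_key="YOUR_KEY")
# urllib.request.urlopen(req) would send it; omitted so the sketch
# stays runnable without network access or a real key.
print(req.full_url)  # → https://api.leemerlabs.ie/v1/chat/completions
```

Swapping `"stream": False` for `True` and reading the response line by line gives the SSE stream described in the request-flow section.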