Irish inference · EU GDPR · Nvidia H200

Free, local, European
inference.

LeemerLabs runs frontier open-weight models on Nvidia H200s in Ireland. An OpenAI-compatible gateway, generous free quotas, and a data path that never leaves the EU. Built to serve the Leemer Group — opened up for everyone else.

EU GDPR

Native, not bolted on

Ireland-hosted

Waterford + Dublin

Nvidia H200

Frontier inference

Free tier

No card required

api.leemerlabs.ie

POST /v1/chat/completions

{
  "model": "lfm2.5-350m-free",
  "messages": [
    { "role": "user",
      "content": "Parse this JSON →" }
  ],
  "stream": true
}
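The request above maps onto a plain HTTP POST. A minimal sketch with Python's standard library (the endpoint and model alias come from this page; the `LEEMER_API_KEY` variable name and Bearer-auth header are assumptions based on common OpenAI-compatible conventions):

```python
import json
import os
import urllib.request

API_URL = "https://api.leemerlabs.ie/v1/chat/completions"

def build_chat_request(prompt: str, stream: bool = True) -> dict:
    """Assemble the request body shown above for the free model alias."""
    return {
        "model": "lfm2.5-350m-free",
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }

def send(prompt: str):
    """POST the request; the response streams chunks when stream=True."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumed: Bearer auth, as in most OpenAI-compatible gateways.
            "Authorization": f"Bearer {os.environ.get('LEEMER_API_KEY', '')}",
        },
        method="POST",
    )
    return urllib.request.urlopen(req)
```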

Throughput

200 tok/s on the free tier

up to 40,400 tok/s on H100

Hardware

Nvidia

H200 · 141GB HBM3e

Liquid

LFM2.5-350M

Our first hosted model

Read
10B tokens served
H200 EU-hosted GPUs
Shipping since 2023
0 ms trans-Atlantic hops

The Leemer Group

An ecosystem of products — built on one piece of infrastructure.

LeemerLabs exists because our own products outgrew generic APIs. Today the same infrastructure that powers LeemerChat, Foundry, and Critique is available to developers across Europe.

Why LeemerLabs

Inference built on the assumption that Europe matters.

Nvidia

Nvidia H200 · 141GB HBM3e

Hosted in Ireland · Zero trans-Atlantic routing

Primitive

Local inference in Ireland

Every request is served from European data centres. No trans-Atlantic hops, no silent routing through third-party clouds. Your data stays where the law says it should.

Primitive

EU GDPR protected by default

The architecture is GDPR-native. Data residency, purpose limitation, and deletion rights are primitives in the gateway — not policy documents.

Primitive

Powered by Nvidia H200

Frontier-grade accelerators with 141GB of HBM3e per card. Enough memory to run modern MoE models without the latency penalty of weight streaming.

Primitive

Free inference tier

A real free tier, not a trial. Generous limits, an OpenAI-compatible gateway, and a single public model alias to start against. Paid tiers are opt-in when scale demands it.

How a request flows

Five steps. Zero trans-Atlantic hops.

  1. TLS 1.3 terminates in Dublin

     Inside the EU boundary. Nothing cached at the edge.

  2. Auth + rate limit

     Metered record opens — token counts only, never body.

  3. Dispatched to an H200 worker

     Waterford or Dublin. Prompt is held in memory only.

  4. Response streams back

     Straight to your client. Worker memory is freed at end of request.

  5. Record finalised

     Zero-retention mode purges even the metered row after reconciliation.

Read the full privacy page

Founder note

“The future of AI is not a single super-intelligence. It is systems that coordinate intelligence well.”

— Repath ‘Ray’ Khan · KingLeemer launch, Feb 2026

LeemerLabs was founded in Waterford by Ray, an Ireland-based builder who has shipped seven AI products since 2023 — and who argues that coordination, sovereignty, and open weights are how serious organisations should operate.

Founding lineup

Our starting models.

Two founding models, one for speed and one for depth. Liquid LFM2.5-350M handles every hot-path request on LeemerChat and will soon run fully offline in your browser. Gemma 4 26B A4B picks up where Liquid stops — more than 10× the active parameters, still served at 40+ tokens per second.

Read the full brief

Start building

OpenAI-compatible. EU-hosted. Free to start.

Point your client at https://api.leemerlabs.ie/v1, use your existing OpenAI SDK, and stay on European compute.
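Because the gateway is OpenAI-compatible, switching is mostly a matter of swapping the base URL. A minimal sketch of that wiring with the standard library (the Bearer-auth header format and `LEEMER_API_KEY` variable name are assumptions):

```python
import os

BASE_URL = "https://api.leemerlabs.ie/v1"

def endpoint(path: str, base: str = BASE_URL) -> str:
    """Join an API path onto the EU-hosted base URL."""
    return base.rstrip("/") + "/" + path.lstrip("/")

def default_headers(api_key: str = "") -> dict:
    """Headers an OpenAI-style client would send; Bearer auth is assumed."""
    key = api_key or os.environ.get("LEEMER_API_KEY", "")
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {key}",
    }
```

With the official OpenAI SDKs, the equivalent is typically just passing the base URL when constructing the client, with everything else left unchanged.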