Born-9B dossier · preview release status
The preview winner is still v2.
This page is the single release record for Born-9B: what the current preview actually is, how it beat base Qwen, where later runs failed to replace it, and what still has to change before a general release deserves a stronger claim.
Public preview
rk500/Born-9B-Qwen3.5-9B-Preview
0.9244 · local weighted score
23 / 25 · fixed held-out pass count
0.8559 · SWE proxy score
22 / 25 · SWE proxy pass count
The public preview is the v2 generated-expanded LoRA on top of Qwen/Qwen3.5-9B, not v3, v4, v2-recovery, or preview recovery.
The two numbers that matter most are 0.9244 on the fixed local gate and 0.8559 on the local SWE proxy. Both are project-local probes, not official benchmark results.
Born-9B is optimized for coding-agent closure: Plan, Patch or Code, Checks, and Result, with compact visible rationale and no hidden <think> targets.
Later runs are transparent diagnostics, not silent upgrades. That honesty is part of the value of this page.
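The closure format described above can be checked mechanically. The following is a minimal sketch, not the project's own tooling: `check_closure` is a hypothetical helper, and it assumes each section starts on its own line with its label followed by a colon. Only the "Patch or Code:" label is quoted verbatim on this page; the other three labels are inferred from the section names.

```python
import re

# Section names taken from this page; the "<name>:" line format is an assumption
# for every label except "Patch or Code:", which the dossier quotes directly.
SECTIONS = ["Plan", "Patch or Code", "Checks", "Result"]


def check_closure(response: str) -> list[str]:
    """Return a list of problems; an empty list means the response is closed."""
    problems = []
    # The target format has no hidden <think> blocks.
    if re.search(r"</?think>", response, flags=re.IGNORECASE):
        problems.append("contains <think> tags")
    positions = []
    for name in SECTIONS:
        match = re.search(rf"^{re.escape(name)}:", response, flags=re.MULTILINE)
        if match is None:
            problems.append(f"missing section: {name}")
        else:
            positions.append(match.start())
    # Sections that are present must appear in the documented order.
    if positions != sorted(positions):
        problems.append("sections out of order")
    return problems
```

An empty return value means all four sections are present, in order, with no `<think>` tags. The check is purely structural and says nothing about whether the patch is correct.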
Full timeline
Every important turn, in one sequence.
13 May 2026
v0 proved the loop, not the quality
The first public LoRA shipped on top of Qwen/Qwen3.5-9B after a small 225-row, roughly 48k-token proof mix trained on an RTX 6000 Ada. It tied base Qwen on the tiny sanity probe. That result was useful because it proved the pipeline ran end to end, not because the score was strong.
15 May 2026
v1 got bigger and still lost
The training mix jumped to 6,673 SFT rows and roughly 8.45M tokens. A TeichAI continuation was added, but the held-out 25-task gate still came in below base Qwen: 0.8166 versus 0.8511, with both at 22/25 passes.
15–16 May 2026
v2 became the preview winner
Generated-expanded v2 reached 7,097 rows and roughly 9.11M tokens, plus 85 targeted recovery rows and 14 real DPO pairs. After fixing the evaluator to score the implementation in “Patch or Code:” instead of later fenced checks, v2 became the promoted preview at 0.9244 and 23/25.
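The evaluator fix amounts to choosing which part of the response gets scored. The actual evaluator is not shown on this page, so the following is only a sketch of the idea: `extract_implementation` is a hypothetical function, and the "Checks:" and "Result:" labels that end the section are assumptions based on the format this page describes. It takes the implementation from the "Patch or Code:" section and ignores any fenced block that appears later.

```python
import re

FENCE = "`" * 3  # built this way so the example contains no literal fence

# Labels assumed to end the "Patch or Code:" section (not confirmed by the source).
NEXT_LABELS = ("Checks:", "Result:")


def extract_implementation(response: str) -> str:
    """Return the code under 'Patch or Code:', ignoring fenced blocks in later sections."""
    marker = "Patch or Code:"
    start = response.find(marker)
    if start == -1:
        return ""
    body = response[start + len(marker):]
    # Cut the section off at the first following label, if any.
    ends = [i for i in (body.find("\n" + label) for label in NEXT_LABELS) if i != -1]
    if ends:
        body = body[:min(ends)]
    # Prefer the first fenced block inside the section; otherwise use the raw section text.
    pattern = re.escape(FENCE) + r"[^\n]*\n(.*?)" + re.escape(FENCE)
    fenced = re.search(pattern, body, flags=re.DOTALL)
    return (fenced.group(1) if fenced else body).strip()
```

A scorer that takes the last fenced block in the response would grade the check code instead of the patch whenever "Checks:" contains its own fence, which is the kind of misreading the fix describes.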
16 May 2026
The first post-v2 attempts failed promotion
v3 repair-focus regressed to 0.8003 and 20/25. v4 DPO recovery regressed to 0.8291 and 18/25. v2-recovery improved pass count to 24/25 but only reached 0.9119, so it was preserved and not promoted.
16 May 2026
The exact-code hotfixes still missed the same narrow failure
v2.2 and v2.3 targeted the remaining exact-code contract miss directly, but the chunked iterable/string materialization behavior still failed. The repo keeps those runs as evidence that a same-style SFT patch was not enough.
16 May 2026
v2 also won on the local SWE proxy
On the project’s SWE-bench Verified issue-resolution proxy, Born-9B v2 scored 0.8559 and 22/25 versus base Qwen at 0.7561 and 19/25. This is explicitly a local proxy, not an official Docker-harness SWE result.
17 May 2026
Preview recovery trained cleanly and still stayed below v2
A targeted DeepSeek recovery pass trained cleanly on an RTX 6000 Ada, but the held-out eval was interrupted and only reached 0.8703 over 22 completed tasks. Even a perfect result on the three unfinished tasks could not mathematically have lifted it past v2, so v2 remained the public preview.
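The "could not mathematically beat v2" claim can be reproduced with a best-case bound. This sketch makes assumptions the page does not confirm: that 0.8703 is a plain mean over the 22 completed tasks, that all 25 tasks carry equal weight, and that a single task scores at most 1.0. The real gate is weighted, so the exact figure may differ.

```python
def best_case_final(partial_mean: float, done: int, total: int,
                    max_task_score: float = 1.0) -> float:
    """Upper bound on the final mean if every remaining task gets max_task_score.

    Assumes equal task weights, which is an illustration-only simplification.
    """
    remaining = total - done
    return (partial_mean * done + max_task_score * remaining) / total


bound = best_case_final(0.8703, done=22, total=25)
print(f"best case: {bound:.4f}")  # best case: 0.8859
print(bound < 0.9244)             # True: below the v2 score
```

Under those assumptions the ceiling is (0.8703 × 22 + 3) / 25 ≈ 0.8859, which is still below v2's 0.9244.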
17 May 2026
Public preview identity was locked
The release decision stayed simple: upload the proven v2 adapter as rk500/Born-9B-Qwen3.5-9B-Preview, keep later runs transparent, and shift the general-release plan toward narrow executable recovery work rather than another broad data swell.
Scoreboard
The page keeps the misses.
That is the right way to read the model. Later adapters can be useful diagnostics and still fail to replace the public preview.
Qwen3.5-9B base · Corrected internal baseline. · 0.8511 · 22 / 25
Born-9B v0 · Proof artifact, not a win claim. · tiny probe tie
Born-9B v1 + TeichAI · Larger, but still below base. · 0.8166 · 22 / 25
Born-9B Preview v2 · Current promoted preview checkpoint. · 0.9244 · 23 / 25
Born-9B v2-recovery · Higher pass count, lower weighted score. · 0.9119 · 24 / 25
Preview recovery · Clean train, non-promoted result. · 0.8703 partial · 19 / 22
The SWE figure is a project-local proxy, not an official Docker-harness SWE-bench score. The internal gate is a fixed held-out pack. Both remain useful because they are stable enough to block fake promotion.
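The promotion decisions on this scoreboard are consistent with a simple rule, sketched below. The page never states the rule, so `should_promote` is a hypothetical reconstruction: it assumes a candidate must finish the whole held-out pack and strictly beat the anchor's weighted score, with pass count not acting as a tie-breaker. The numbers are the ones listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    name: str
    score: float    # local weighted score
    passed: int     # tasks passed
    completed: int  # tasks actually evaluated
    total: int = 25


def should_promote(candidate: GateResult, anchor: GateResult) -> bool:
    """Hypothetical rule: complete eval and a strictly higher weighted score."""
    if candidate.completed < candidate.total:
        return False  # a partial eval never promotes
    return candidate.score > anchor.score


anchor = GateResult("Born-9B Preview v2", 0.9244, 23, 25)
candidates = [
    GateResult("Qwen3.5-9B base", 0.8511, 22, 25),
    GateResult("Born-9B v1 + TeichAI", 0.8166, 22, 25),
    GateResult("Born-9B v2-recovery", 0.9119, 24, 25),
    GateResult("Preview recovery", 0.8703, 19, 22),
]
for c in candidates:
    print(f"{c.name}: promote={should_promote(c, anchor)}")
```

Every listed candidate comes out as `promote=False`, including v2-recovery, whose 24 / 25 pass count does not offset its lower weighted score under this reading.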
Remaining gaps
Where the preview still needs work.
A narrow exact-code contract failure remains, especially around chunked iterable and string materialization behavior.
Tool-use closure is better than it was, but still weaker than the exact-code slice and still important for the general-release path.
Generic “more data” continuation is now considered a lower-value path than targeted executable recovery work.
General release path
Precision, not scale.
Keep v2 as the immutable anchor instead of treating every later run as the new baseline.
Scale executable exact-code recovery rows with real tests, not broad style-heavy continuation data.
Scale tool-use and provider-closure rows around concrete commands, stop conditions, output parsing, and final-result reporting.
Keep rehearsal rows from the promoted v2 mix so recovery work does not erase the current win.
Cap visible-thinking and generic reasoning imports unless they prove themselves on the same held-out gate.
Origin story
The lore now lives inside the release.
The first meaningful Born-9B story is not triumph. It is restraint: the project published the tie, kept the caveat visible, and treated that honesty as a prerequisite for the next run.
The second story is that the model got better when the work got narrower. v2 became the turning point not because the project found a grand theory, but because it found a better mix of executable, high-signal data and then fixed the evaluator bug that had been misreading the answer format.
The third story is that Born now behaves like a real release system. Later runs can train successfully, look interesting, and still fail promotion. That is the discipline the preview page is trying to show.
Artifacts and trail
Download the thing, or inspect the paper trail.
HF preview adapter
The public preview adapter on top of Qwen/Qwen3.5-9B.
Open Hugging Face
GitHub release
Release trail for the original Born-9B artifact and attached assets.
Open release
Adapter archive
The original downloadable adapter tarball from the public release trail.
Download adapter
Provenance pack
Run notes, benchmark notes, model card, and related release documentation.
Download provenance