Your AI, Your Machine.
No internet. No subscription. No compromises.
GrimoireScribe will ship with a purpose-built 4.5 GB writing model that runs entirely on your hardware. Built on an 18-trillion-token foundation, sharpened for creative fiction, then refined with 1,500 hand-crafted examples from every GrimoireScribe preset. This is not a generic chatbot stuffed into a writing app. This is a model built for novelists.
Built on giants. Sharpened for fiction. Refined for you.
Our model is not trained from scratch. It stands on top of two powerful existing layers, then adds a third that is entirely ours.
Qwen2.5-7B
Alibaba's Qwen2.5 is a 7-billion-parameter model pre-trained on 18 trillion tokens of web text, books, code, and academic papers across 29+ languages. That is over twice the training data of its predecessor. It outperforms both Llama 3.1-8B and Gemma 2-9B on many standard benchmarks. This gives our model its raw intelligence, reasoning, and language understanding.
EVA Fine-Tune
EVA-UNIT-01 took that foundation and performed a full-parameter fine-tune using the Celeste data mixture: 70% creative writing and fiction, 30% instruction-following. This is what unlocked the model for uncensored creative prose, character voice, and long-form narrative. EVA turns a general-purpose model into a fiction writer.
GrimoireScribe
Our layer. We fine-tune EVA further with 1,000+ curated literary works and 1,500 hand-crafted Golden Examples built from every GrimoireScribe writing preset. This is where the model learns our specific craft philosophy: show, don't tell; subtext-driven dialogue; preset-native style. We train at Q5 precision for maximum quality, then quantize to Q4 for the ~4.5 GB file you download.
The result: a compact model that carries general knowledge from 18 trillion tokens of pre-training, creative fiction instincts from the Celeste mixture, and GrimoireScribe's specific writing craft baked into its weights. It will download once through the app, then live on your machine permanently. No internet required. No subscription. No cloud calls.
Context window: 128,000 tokens, roughly 96,000 words. For most novels, that means the model can read your entire manuscript in a single pass.
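The arithmetic behind that estimate can be sketched in a few lines. This is a rough illustration, not GrimoireScribe's code: the 0.75 words-per-token ratio is simply what the two figures above imply (96,000 / 128,000), and the function names and the `reserved_tokens` parameter are hypothetical. Real ratios vary with the tokenizer and the text.

```python
# Assumption: a flat words-per-token ratio implied by the figures above
# (96,000 words / 128,000 tokens = 0.75). Actual ratios vary by text.
CONTEXT_TOKENS = 128_000
WORDS_PER_TOKEN = 0.75


def words_that_fit(context_tokens: int = CONTEXT_TOKENS,
                   reserved_tokens: int = 0) -> int:
    """Estimate how many manuscript words fit in the context window.

    reserved_tokens is a hypothetical allowance for the system prompt
    and the model's reply, which share the same window.
    """
    usable = max(context_tokens - reserved_tokens, 0)
    return int(usable * WORDS_PER_TOKEN)


def fits_in_one_pass(manuscript_words: int, reserved_tokens: int = 0) -> bool:
    """True if the whole manuscript is estimated to fit in a single pass."""
    return manuscript_words <= words_that_fit(reserved_tokens=reserved_tokens)
```

Under these assumptions an 80,000-word novel fits in one pass, while a 120,000-word epic does not and would need to be read in parts.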
1,000+ literary works. 1,500 Golden Examples.
Our fine-tuning layer combines two datasets. The first is a curated corpus of over a thousand literary works spanning centuries and genres: classic realism from Hardy, Austen, and Dickens; Russian literary giants like Dostoevsky and Tolstoy; Gothic and fantasy precursors from Dunsany, Morris, and MacDonald; French masters like Flaubert and Hugo. This deepens the model's sense of vivid, varied prose across traditions.
The second dataset is what makes GrimoireScribe's model unique.
Golden Examples. Hand-crafted. Preset by preset.
Golden Examples are hand-crafted training samples that encode the exact writing quality we want the model to produce. Each one is a pair: an instruction ("Write a scene where...") and a prose output that meets GrimoireScribe's standards. All 1,500 examples were built against our writing style presets, ensuring the model learns our craft philosophy from the inside out.
Think of it this way: Qwen gives the model intelligence. EVA gives it creative instincts. The Golden Examples teach it how GrimoireScribe prose works specifically.
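For readers curious what an instruction-and-output pair might look like as data, here is a minimal sketch. The field names, the `preset` tag, and the JSON Lines storage are assumptions made for illustration; the actual Golden Examples format is not described here.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class GoldenExample:
    preset: str       # hypothetical tag for the writing preset it was built against
    instruction: str  # e.g. "Write a scene where..."
    output: str       # the prose that meets the preset's standards


def to_jsonl(examples: list[GoldenExample]) -> str:
    """Serialize examples as JSON Lines, one training pair per line."""
    return "\n".join(json.dumps(asdict(e), ensure_ascii=False) for e in examples)


def from_jsonl(text: str) -> list[GoldenExample]:
    """Parse JSON Lines back into GoldenExample records, skipping blank lines."""
    return [GoldenExample(**json.loads(line))
            for line in text.splitlines() if line.strip()]
```

One record per line keeps the set easy to diff, filter by preset, and extend as new examples are added.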
Show, Don't Tell
Physical anchoring, sensory triggers, backstory revealed through action rather than exposition. The model learns to convey emotion through what characters do, not by naming what they feel.
Dialogue with Subtext
Natural conversation where characters talk around what they mean. Minimal tags. What goes unsaid carries as much weight as what is spoken.
Supernatural Atmosphere
Wrongness-first worldbuilding. The strange is presented as fact, not explained. Magic systems conveyed through constraint and consequence, not exposition dumps.
Character Psychology
Internal states revealed through behavior, micro-decisions, and physical response. No labeling emotions. The reader infers the psychology from what the character does.
Tonal Range
Register shifts between intimacy and grandeur. Sentence structure that mirrors emotional state. The model learns to match tone to moment, not default to one register.
Mature and Dark Content
Violence with consequence. Intimacy with honesty. Moral complexity without judgment. The model does not shy away from the parts of fiction that make stories real.
Cultural Authenticity
Rituals shown as automatic habit, not anthropological lecture. Spiritual practices embedded in character behavior. Worlds feel lived-in because the details are woven, not announced.
Foreshadowing
Environmental logic and unremarked details that pay off later. The model learns to plant seeds without highlighting them. Subtlety is the skill being trained here.
Edit Mode
Not just generation. The model also learns to take weak prose and improve it: tightening language, removing crutch words, and elevating passages to match your chosen preset.
Preset-native. Not prompt-fighting.
Most local AI models ignore detailed style instructions. You write a careful system prompt describing your voice, your tone, your rules. The model reads it, nods politely, and produces the same generic output it always does. This happens because your prompt conflicts with what the model learned during training. The trained behavior wins.
GrimoireScribe's model was trained on the presets. The writing style rules are baked into the model's weights, not fighting against them. When you select a preset and write a prompt, the system prompt and the model's training reinforce each other instead of competing. The prompt can focus on the creative direction of your scene because the craft principles are already handled.
The result: a local model that actually follows your style instructions. Not perfectly. Not at the level of Claude or GPT with a 200K-token context window. But meaningfully, noticeably better than any off-the-shelf 7B model trying to parse a style guide from a system prompt alone.
The AI reads your book. Not just your prompt.
A writing model is only as good as the context it receives. Most AI writing tools either dump your entire manuscript into the prompt (expensive, slow, often truncated) or give the AI nothing and hope for the best. GrimoireScribe takes a different approach.
The app builds a structured knowledge system for every book: a fast structural index of your chapters, scenes, and characters; a full-text search layer that can find any passage in your manuscript instantly; and AI-generated summaries of character arcs and chapter events that update automatically as you write.
When you ask the AI anything, a query router determines what context is needed and assembles it. A question about a character pulls their arc summary and scene appearances. A question about your magic system searches your prose and world notes. A request to write a new scene loads the prior scene, your style guide, character profiles, and relevant world-building notes.
This architecture is the same whether you use cloud providers or Offline Mode. The knowledge layers, full-text search, and query routing are all local. They run on your machine, in milliseconds, at zero cost. The only thing that changes between online and offline is which model generates the response.
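The query routing described above can be pictured with a small rule-based sketch. Everything in it is hypothetical: the route names, the context-source labels, and the keyword rules are stand-ins chosen to mirror the three cases in the text, and the real router may work quite differently.

```python
# Hypothetical context sources, named after the three cases described above.
ROUTES = {
    "character": ["arc_summary", "scene_appearances"],
    "world": ["prose_search", "world_notes"],
    "write_scene": ["prior_scene", "style_guide",
                    "character_profiles", "world_notes"],
}


def classify(query: str, character_names: set[str]) -> str:
    """Pick a route with simple keyword rules (an illustrative stand-in)."""
    text = query.lower()
    if text.startswith(("write", "draft", "continue")):
        return "write_scene"
    if any(name.lower() in text for name in character_names):
        return "character"
    # Anything else is treated as a question about the world or the prose.
    return "world"


def route(query: str, character_names: set[str]) -> list[str]:
    """Return the list of context sources to assemble for this query."""
    return ROUTES[classify(query, character_names)]
```

Because the routing step only decides which local sources to load, the same plan can be handed to a cloud model or to the offline model without change.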
Structural Index
Chapter titles, scene lists, word counts, character appearances. Built from your files in milliseconds. No AI needed. Always current.
Full-Text Search
Find any passage in your manuscript by content, not just title. "Find scenes about the magic system" returns matching scenes instantly. Runs locally via SQLite.
World Doc Retrieval
Your lore bibles and research notes are searchable by keyword locally. Semantic search is available through optional cloud or local embedding models for writers with large world-building collections.
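A local full-text search of this kind can be sketched with Python's built-in `sqlite3` module. This assumes SQLite's FTS5 extension, which is an illustrative choice and is not confirmed as what the app uses; the table layout and function names are hypothetical, and FTS5 must be compiled into the local SQLite build (it is in most standard Python distributions).

```python
import sqlite3


def build_index(scenes):
    """Index (title, body) pairs in an in-memory FTS5 table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE scenes USING fts5(title, body)")
    conn.executemany("INSERT INTO scenes (title, body) VALUES (?, ?)", scenes)
    conn.commit()
    return conn


def search(conn, query, limit=5):
    """Return titles of scenes matching the query, best match first."""
    rows = conn.execute(
        "SELECT title FROM scenes WHERE scenes MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return [title for (title,) in rows]
```

Searching for `magic` against an index of scenes returns the titles of every scene whose title or body contains the word, with no network call involved.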
Uncensored. Honestly.
We chose an uncensored base model because fiction requires freedom. Dark fantasy needs violence with weight. Romance needs intimacy with honesty. Grimdark needs moral ambiguity without the AI inserting a lesson. Horror needs to unsettle without pulling its punches.
Through testing, we confirmed that the model handles mature fiction comfortably: adult content, dark themes, morally complex scenarios, and difficult subject matter. If you value the creative freedom that platforms like Grok have championed, you will feel at home here.
We are transparent about one thing: we cannot guarantee the complete absence of all guardrails in every edge case. Language models are probabilistic, and occasional refusals on extreme content are possible. But in our testing across hundreds of prompts covering violence, intimacy, horror, and moral complexity, the model consistently delivered without censoring the work.
A living model. Free updates, always.
This is not a ship-and-forget model. As GrimoireScribe launches and real writers put it through real workflows, we will gather feedback, expand our Golden Examples, and release improved versions of the model for free. Every update ships through the app. Your local AI gets better over time at no extra cost.
Improvements will be documented on our blog as we release them: what changed, why it changed, and what to expect. We believe in showing our work.
Our roadmap includes expanding genre-specific training (more literary fiction, more thriller and mystery craft, more nonfiction techniques), improving edit mode accuracy, and refining the model's ability to match established voice across longer passages. The 1,500 Golden Examples we ship with are the beginning, not the ceiling.
How Offline Mode compares to cloud AI.
We believe in honesty. A 4.5 GB model running on your laptop will not match the output quality of Claude Sonnet or GPT-4, models with hundreds of billions of parameters running in a data center. Those models are larger, more capable, and benefit from massive computational resources.
What Offline Mode offers is a genuinely capable local alternative for writers who prioritize privacy, who work without reliable internet, or who simply do not want a subscription. The prose quality is good. It follows your presets. It handles mature content. For many writers and many scenes, it will be everything you need.
For writers who want the absolute best AI output available, GrimoireScribe also supports Claude, GPT, Grok, Gemini, and Mistral through your own API keys. Offline Mode and cloud providers are not mutually exclusive. Use whichever fits the moment.
Will it run on my machine?
Offline Mode runs a compact 4.5 GB AI model entirely on your computer. No internet. No subscription. Here is how it performs on common hardware.
| Device | Year | Price or Age | Speed | 200-Word Scene | 16 GB+ Safe? |
|---|---|---|---|---|---|
| MacBook Pro M5 Pro/Max | 2026 | $1,999+ | 53-90 words/sec | 2-4 sec | Yes (24 GB+) |
| MacBook Air M5 | 2026 | $1,099 | 34-49 words/sec | 4-6 sec | Yes (16 GB) |
| MacBook Pro M3 Pro | ~2023 | 3 years old | 29-41 words/sec | 5-7 sec | Yes (18 GB+) |
| MacBook Air M4 | 2025 | ~$999 | 25-36 words/sec | 6-8 sec | Yes (16 GB) |
| MacBook Air M2 (16 GB) | ~2023 | 3 years old | 17-23 words/sec | 9-12 sec | Yes |
| MacBook Air M2 (8 GB) | ~2023 | 3 years old | 14-19 words/sec | 11-15 sec | Tight |
| MacBook Neo | 2026 | $599 | 9-14 words/sec | 15-22 sec | Tight (8 GB) |
| Windows Laptop (current) | 2025-2026 avg | $700-$1,100 | 6-11 words/sec | 18-33 sec | Mostly (16 GB) |
| Windows Laptop (older) | ~2023 avg | 3 years old | 4-8 words/sec | 30-50+ sec | 8 GB: No |
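The scene times follow from the speeds by simple division. As a small sketch (the function name is hypothetical, and the model is assumed to generate at a steady rate with no startup delay):

```python
def seconds_for_scene(words_per_sec_low: float,
                      words_per_sec_high: float,
                      scene_words: int = 200) -> tuple[float, float]:
    """Return (fastest, slowest) seconds to generate a scene of scene_words.

    Assumes a constant generation speed; prompt processing time is ignored.
    """
    fastest = scene_words / words_per_sec_high
    slowest = scene_words / words_per_sec_low
    return fastest, slowest
```

For a device generating 25-36 words per second, this gives roughly 5.6 to 8 seconds for a 200-word scene.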
A note about 8 GB machines
Offline Mode will run on 8 GB, but it is a tight fit. The model itself takes about 4.5 GB, leaving limited room for the OS and other apps. We recommend closing browsers and background apps before using Offline Mode on 8 GB hardware. If performance is not satisfactory, GrimoireScribe will suggest switching to cloud API mode for the best experience. The app shows this recommendation automatically when it detects constrained memory.
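The memory check can be pictured as a simple budget. This is a sketch under stated assumptions, not the app's actual logic: the 1 GB headroom threshold and the function name are hypothetical, and memory figures are passed in by the caller rather than read from the operating system.

```python
MODEL_GB = 4.5  # approximate model size, from the text above


def recommend_mode(total_ram_gb: float,
                   other_usage_gb: float,
                   headroom_gb: float = 1.0) -> str:
    """Suggest "offline" or "cloud" from a rough memory budget.

    other_usage_gb is memory already used by the OS and other apps.
    headroom_gb is a hypothetical safety margin left after loading the model.
    """
    free_after_model = total_ram_gb - other_usage_gb - MODEL_GB
    return "offline" if free_after_model >= headroom_gb else "cloud"
```

On a 16 GB machine with 4 GB in use, the budget leaves ample room for Offline Mode. On an 8 GB machine with 3.5 GB in use, nothing is left over, which is why closing browsers and background apps matters.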
The models behind our model.
We believe in transparency. Here are the open-source projects and research that our local AI is built on.
V1 today. Offline Mode when V2 ships.
Every V1 license includes V2 at no extra cost. Download the 14-day free trial and get everything that ships, forever.