Coming in V2

Your AI, Your Machine.
No internet. No subscription. No compromises.

GrimoireScribe will ship with a purpose-built 4.5 GB writing model that runs entirely on your hardware. Built on an 18-trillion-token foundation, sharpened for creative fiction, then refined with 1,500 hand-crafted examples from every GrimoireScribe preset. This is not a generic chatbot stuffed into a writing app. This is a model built for novelists.

V2 feature. Included in all V1 licenses at no extra cost.

Three Layers Deep

Built on giants. Sharpened for fiction. Refined for you.

Our model is not trained from scratch. It stands on top of two powerful existing layers, then adds a third that is entirely ours.

1. Foundation

Qwen2.5-7B

Alibaba's Qwen2.5 is a 7-billion-parameter model pre-trained on 18 trillion tokens of web text, books, code, and academic papers across more than 29 languages. That is over twice the training data of its predecessor. In the Qwen team's published benchmarks, it outperforms both Llama 3.1-8B and Gemma 2-9B on most standard tasks. This gives our model its raw intelligence, reasoning, and language understanding.

2. Creative Unlock

EVA Fine-Tune

EVA-UNIT-01 took that foundation and performed a full-parameter fine-tune using the Celeste data mixture: 70% creative writing and fiction, 30% instruction-following. This is what unlocked the model for uncensored creative prose, character voice, and long-form narrative. EVA turns a general-purpose model into a fiction writer.

3. Craft Refinement

GrimoireScribe

Our layer. We fine-tune EVA further with 1,000+ curated literary works and 1,500 hand-crafted Golden Examples built from every GrimoireScribe writing preset. This is where the model learns our specific craft philosophy: show, don't tell; subtext-driven dialogue; preset-native style. We train at Q5 precision for maximum quality, then quantize to Q4 for the ~4.5 GB file you download.

The result: a compact model that carries 18 trillion tokens of general knowledge, creative fiction instincts from the Celeste mixture, and GrimoireScribe's specific writing craft baked into its weights. It will download once through the app, then live on your machine permanently. No internet required. No subscription. No cloud calls.

Context window: 128,000 tokens, roughly 96,000 words. For most novels, that means the model can read your entire manuscript in a single pass.
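
The word estimate follows from a rule of thumb of about 0.75 English words per token (96,000 / 128,000). Here is a minimal sketch of that arithmetic, assuming the ratio holds for your manuscript. The function names are illustrative and not part of GrimoireScribe.

```python
CONTEXT_TOKENS = 128_000   # context window stated above
WORDS_PER_TOKEN = 0.75     # assumption implied by 96,000 words / 128,000 tokens

def estimated_tokens(word_count: int) -> int:
    """Rough token estimate for English prose; real tokenizers vary by text."""
    return round(word_count / WORDS_PER_TOKEN)

def fits_in_context(word_count: int, reserve_tokens: int = 0) -> bool:
    """True if the manuscript, plus tokens reserved for the prompt and reply, fits."""
    return estimated_tokens(word_count) + reserve_tokens <= CONTEXT_TOKENS

print(estimated_tokens(96_000))   # 128000
print(fits_in_context(80_000))    # True
print(fits_in_context(120_000))   # False
```

In practice some of the window has to be reserved for the prompt and the reply, which is what the hypothetical `reserve_tokens` parameter stands for.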

Offline AI Model Architecture

Layer 1: Qwen2.5-7B Foundation
18 trillion tokens  ·  7B parameters  ·  128K context
    ↓ full-parameter fine-tune
Layer 2: EVA Creative Unlock
Celeste mixture: 70% fiction  ·  30% instruction
    ↓ GrimoireScribe fine-tune
Layer 3: GrimoireScribe Craft Layer
1,000+ literary works  ·  1,500 Golden Examples  ·  Q5 precision
    ↓
Deliverable: EVA-GrimoireScribe-Q4.gguf  ·  ~4.5 GB

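
As a rough check on the ~4.5 GB figure, a quantized file's size is approximately parameters × bits per weight ÷ 8. The sketch below assumes about 7.6 billion parameters (the count listed on the Qwen2.5-7B model card for this nominally 7B model) and an average of roughly 4.8 bits per weight for a Q4_K_M file. Both numbers are approximations, not measurements of the shipped file.

```python
def gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate file size in GB (10^9 bytes), ignoring metadata overhead."""
    return n_params * bits_per_weight / 8 / 1e9

# Assumed values: ~7.6e9 parameters, ~4.8 bits per weight for Q4_K_M.
print(round(gguf_size_gb(7.6e9, 4.8), 2))  # 4.56
# The same weights stored at 16 bits per weight, for comparison.
print(round(gguf_size_gb(7.6e9, 16), 1))   # 15.2
```
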
Our Training Data

1,000+ literary works. 1,500 Golden Examples.

Our fine-tuning layer combines two datasets. The first is a curated corpus of over a thousand literary works spanning centuries and genres: classic realism from Hardy, Austen, and Dickens; Russian literary giants like Dostoevsky and Tolstoy; Gothic and fantasy precursors from Dunsany, Morris, and MacDonald; and French masters like Flaubert and Hugo. This deepens the model's sense of vivid, varied prose across traditions.

The second dataset is what makes GrimoireScribe's model unique.

The Secret Ingredient

Golden Examples. Hand-crafted. Preset by preset.

Golden Examples are hand-crafted training samples that encode the exact writing quality we want the model to produce. Each one is a pair: an instruction ("Write a scene where...") and a prose output that meets GrimoireScribe's standards. All 1,500 examples were built against our writing style presets, ensuring the model learns our craft philosophy from the inside out.
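
One common way to store instruction/output pairs like these is one JSON object per line (JSONL). The sketch below is illustrative only: the field names and the sample text are assumptions, not GrimoireScribe's actual schema or training data.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class GoldenExample:
    preset: str        # hypothetical field: which writing preset the pair targets
    instruction: str   # what the model is asked to write
    output: str        # prose that meets the preset's standard

def to_jsonl(examples: list[GoldenExample]) -> str:
    """Serialize examples as JSONL, one training pair per line."""
    return "\n".join(json.dumps(asdict(e), ensure_ascii=False) for e in examples)

def from_jsonl(text: str) -> list[GoldenExample]:
    """Parse JSONL back into examples, skipping blank lines."""
    return [GoldenExample(**json.loads(line)) for line in text.splitlines() if line.strip()]

sample = GoldenExample(
    preset="Atmospheric Gothic",
    instruction="Write a scene where a caretaker finds a door that was not there yesterday.",
    output="(placeholder for hand-written prose)",
)
print(to_jsonl([sample]))
```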

Think of it this way: Qwen gives the model intelligence. EVA gives it creative instincts. The Golden Examples teach it how GrimoireScribe prose works specifically.

Show, Don't Tell

Physical anchoring, sensory triggers, backstory revealed through action rather than exposition. The model learns to convey emotion through what characters do, not by stating what they feel.

Dialogue with Subtext

Natural conversation where characters talk around what they mean. Minimal tags. What goes unsaid carries as much weight as what is spoken.

Supernatural Atmosphere

Wrongness-first worldbuilding. The strange is presented as fact, not explained. Magic systems conveyed through constraint and consequence, not exposition dumps.

Character Psychology

Internal states revealed through behavior, micro-decisions, and physical response. No labeling emotions. The reader infers the psychology from what the character does.

Tonal Range

Register shifts between intimacy and grandeur. Sentence structure that mirrors emotional state. The model learns to match tone to moment, not default to one register.

Mature and Dark Content

Violence with consequence. Intimacy with honesty. Moral complexity without judgment. The model does not shy away from the parts of fiction that make stories real.

Cultural Authenticity

Rituals shown as automatic habit, not anthropological lecture. Spiritual practices embedded in character behavior. Worlds feel lived-in because the details are woven, not announced.

Foreshadowing

Environmental logic and unremarked details that pay off later. The model learns to plant seeds without highlighting them. Subtlety is the skill being trained here.

Edit Mode

Not just generation. The model also learns to take weak prose and improve it: tightening language, removing crutch words, and elevating passages to match your chosen preset.

Why This Matters

Preset-native. Not prompt-fighting.

Most local AI models ignore detailed style instructions. You write a careful system prompt describing your voice, your tone, your rules. The model reads it, nods politely, and produces the same generic output it always does. This happens because your prompt conflicts with what the model learned during training. The trained behavior wins.

GrimoireScribe's model was trained on the presets. The writing style rules are baked into the model's weights, not fighting against them. When you select a preset and write a prompt, the system prompt and the model's training reinforce each other instead of competing. The prompt can focus on the creative direction of your scene because the craft principles are already handled.

The result: a local model that actually follows your style instructions. Not perfectly. Not at the level of Claude or GPT with a 200K-token context window. But meaningfully, noticeably better than any off-the-shelf 7B model trying to parse a style guide from a system prompt alone.

AI Settings

Modes: Cloud API  ·  Offline Mode
Model: EVA-GrimoireScribe-Q4  ·  4.5 GB  ·  Downloaded  ·  Q4_K_M  ·  Active
Device: MacBook Air M4
Inference: 29 words/sec
RAM used: 5.1 GB / 16 GB
Context window: 128,000 tokens
Preset: Atmospheric Gothic (Trained)
This preset is baked into the model's weights. Style rules reinforce rather than compete with your prompt.

Context, Not Guessing

The AI reads your book. Not just your prompt.

A writing model is only as good as the context it receives. Most AI writing tools either dump your entire manuscript into the prompt (expensive, slow, often truncated) or give the AI nothing and hope for the best. GrimoireScribe takes a different approach.

The app builds a structured knowledge system for every book: a fast structural index of your chapters, scenes, and characters; a full-text search layer that can find any passage in your manuscript instantly; and AI-generated summaries of character arcs and chapter events that update automatically as you write.

When you ask the AI anything, a query router determines what context is needed and assembles it. A question about a character pulls their arc summary and scene appearances. A question about your magic system searches your prose and world notes. A request to write a new scene loads the prior scene, your style guide, character profiles, and relevant world-building notes.
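
The routing idea can be sketched as a small lookup from query type to context sources. This is a toy illustration, not GrimoireScribe's router: the keyword classifier, the source names, and the character name are all hypothetical, and only the three pairings mirror the examples in the text.

```python
from enum import Enum

class QueryKind(Enum):
    CHARACTER = "character"
    WORLD = "world"
    WRITE_SCENE = "write_scene"

# Context sources per query kind, following the three examples in the text.
CONTEXT_PLAN = {
    QueryKind.CHARACTER: ["arc_summary", "scene_appearances"],
    QueryKind.WORLD: ["prose_search", "world_notes"],
    QueryKind.WRITE_SCENE: ["prior_scene", "style_guide", "character_profiles", "world_notes"],
}

def classify(query: str, character_names: set[str]) -> QueryKind:
    """Toy keyword classifier; naive substring matching, for illustration only."""
    text = query.lower()
    if text.startswith(("write", "draft", "continue")):
        return QueryKind.WRITE_SCENE
    if any(name.lower() in text for name in character_names):
        return QueryKind.CHARACTER
    return QueryKind.WORLD

def plan_context(query: str, character_names: set[str]) -> list[str]:
    """Return the names of the context sources to load for this query."""
    return CONTEXT_PLAN[classify(query, character_names)]

print(plan_context("How has Mara changed since chapter 3?", {"Mara"}))
print(plan_context("How does the magic system work?", {"Mara"}))
print(plan_context("Write the next scene in the crypt.", {"Mara"}))
```

Keeping the plan in a plain table separates "what kind of question is this" from "what should be loaded", so either half could be swapped out independently.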

This architecture is the same whether you use cloud providers or Offline Mode. The knowledge layers, full-text search, and query routing are all local. They run on your machine, in milliseconds, at zero cost. The only thing that changes between online and offline is which model generates the response.

Structural Index

Chapter titles, scene lists, word counts, character appearances. Built from your files in milliseconds. No AI needed. Always current.
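
An index like this needs only plain string processing. Below is a minimal sketch of one entry, assuming scene text is already loaded and a list of character names is known. The `SceneEntry` type, the function name, and the sample text are hypothetical.

```python
import re
from dataclasses import dataclass

@dataclass
class SceneEntry:
    title: str
    word_count: int
    characters: list[str]

def index_scene(title: str, text: str, known_characters: list[str]) -> SceneEntry:
    """Index one scene with plain string processing; no model involved."""
    # A word starts with a letter or digit and may contain apostrophes or hyphens.
    words = re.findall(r"\w[\w'-]*", text)
    # A character "appears" if their name occurs as a whole word in the scene.
    present = [
        name for name in known_characters
        if re.search(rf"\b{re.escape(name)}\b", text)
    ]
    return SceneEntry(title, len(words), present)

scene = index_scene(
    "The Crypt",
    "Mara lit the lamp. The door behind her had not been there yesterday.",
    ["Mara", "Ilya"],
)
print(scene)
```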

Full-Text Search

Find any passage in your manuscript by content, not just title. "Find scenes about the magic system" returns matching scenes instantly. Runs locally via SQLite.
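
For a sense of how local full-text search works with SQLite, here is a minimal sketch using SQLite's FTS5 extension through Python's standard `sqlite3` module. It assumes your SQLite build includes FTS5, as most do. The table layout and sample scenes are hypothetical, not GrimoireScribe's schema.

```python
import sqlite3

def build_index(scenes: dict[str, str]) -> sqlite3.Connection:
    """Create an in-memory FTS5 index over scene titles and text."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE scenes USING fts5(title, body)")
    conn.executemany("INSERT INTO scenes (title, body) VALUES (?, ?)", scenes.items())
    return conn

def search(conn: sqlite3.Connection, query: str) -> list[str]:
    """Return titles of scenes matching every query term, best match first."""
    rows = conn.execute(
        "SELECT title FROM scenes WHERE scenes MATCH ? ORDER BY rank", (query,)
    )
    return [title for (title,) in rows]

conn = build_index({
    "Ch1: The Binding": "The magic system demands a price for every spell.",
    "Ch2: Market Day": "She haggled over bread and salt.",
})
print(search(conn, "magic system"))  # ['Ch1: The Binding']
```

Note that FTS5 has its own query syntax, so raw user input containing quotes or other punctuation would need escaping before being passed to `MATCH`.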

World Doc Retrieval

Your lore bibles and research notes are searchable by keyword locally. Semantic search is available through optional cloud or local embedding models for writers with large world-building collections.

Creative Freedom

Uncensored. Honestly.

We chose an uncensored base model because fiction requires freedom. Dark fantasy needs violence with weight. Romance needs intimacy with honesty. Grimdark needs moral ambiguity without the AI inserting a lesson. Horror needs to unsettle without pulling its punches.

Through testing, we confirmed that the model handles mature fiction comfortably: adult content, dark themes, morally complex scenarios, and difficult subject matter. If you value the creative freedom that platforms like Grok have championed, you will feel at home here.

We are transparent about one thing: we cannot guarantee the complete absence of all guardrails in every edge case. Language models are probabilistic, and occasional refusals on extreme content are possible. But in our testing across hundreds of prompts covering violence, intimacy, horror, and moral complexity, the model consistently delivered without censoring the work.

Always Improving

A living model. Free updates, always.

This is not a ship-and-forget model. As GrimoireScribe launches and real writers put it through real workflows, we gather feedback, expand our Golden Examples, and release improved versions of the model for free. Every update ships through the app. Your local AI gets better over time at no extra cost.

Improvements will be documented on our blog as we release them: what changed, why it changed, and what to expect. We believe in showing our work.

Our roadmap includes expanding genre-specific training (more literary fiction, more thriller and mystery craft, more nonfiction techniques), improving edit mode accuracy, and refining the model's ability to match established voice across longer passages. The 1,500 Golden Examples we ship with are the beginning, not the ceiling.

Setting Expectations

How Offline Mode compares to cloud AI.

We believe in honesty. A 4.5 GB model running on your laptop will not match the output quality of Claude Sonnet or GPT-4, which are far larger models running in data centers. Those models are more capable and benefit from massive computational resources.

What Offline Mode offers is a genuinely capable local alternative for writers who prioritize privacy, who work without reliable internet, or who simply do not want a subscription. The prose quality is good. It follows your presets. It handles mature content. For many writers and many scenes, it will be everything you need.

For writers who want the absolute best AI output available, GrimoireScribe also supports Claude, GPT, Grok, Gemini, and Mistral through your own API keys. Offline Mode and cloud providers are not mutually exclusive. Use whichever fits the moment.

Hardware Compatibility

Will it run on my machine?

Offline Mode runs a compact 4.5 GB AI model entirely on your computer. No internet. No subscription. Here is how it performs on common hardware.

Device                    Released       Price / Age  Speed (words/sec)  200-Word Scene  Memory Safe?    Rating
MacBook Pro M5 Pro/Max    2026           $1,999+      53-90              2-4 sec         Yes (24 GB+)    Exceptional
MacBook Air M5            2026           $1,099       34-49              4-6 sec         Yes (16 GB)     Excellent
MacBook Pro M3 Pro        ~2023          3 years old  29-41              5-7 sec         Yes (18 GB+)    Very Good
MacBook Air M4            2025           ~$999        25-36              6-8 sec         Yes (16 GB)     Great
MacBook Air M2 (16 GB)    ~2023          3 years old  17-23              9-12 sec        Yes             Good
MacBook Air M2 (8 GB)     ~2023          3 years old  14-19              11-15 sec       Tight           OK
MacBook Neo               2026           $599         9-14               15-22 sec       Tight (8 GB)    Limited
Windows Laptop (current)  2025-2026 avg  $700-$1,100  6-11               18-33 sec       Mostly (16 GB)  Moderate
Windows Laptop (older)    ~2023 avg      3 years old  4-8                30-50+ sec      8 GB: No        Slow
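
The 200-word scene times follow from the speed figures: time = words ÷ words per second. Here is a quick sketch of that arithmetic. It ignores model load and prompt-processing time, which is an assumption, since real timings include both.

```python
def scene_seconds(words: int, slow_wps: float, fast_wps: float) -> tuple[float, float]:
    """Return (best case, worst case) generation time in seconds."""
    return (round(words / fast_wps, 1), round(words / slow_wps, 1))

print(scene_seconds(200, 25, 36))  # (5.6, 8.0)  MacBook Air M4 row
print(scene_seconds(200, 53, 90))  # (2.2, 3.8)  MacBook Pro M5 Pro/Max row
```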

A note about 8 GB machines

Offline Mode will run on 8 GB, but it is a tight fit. The model itself takes about 4.5 GB, leaving limited room for the OS and other apps. We recommend closing browsers and background apps before using Offline Mode on 8 GB hardware. If performance is not satisfactory, GrimoireScribe will suggest switching to cloud API mode for the best experience. The app shows this recommendation automatically when it detects constrained memory.
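
The memory squeeze is simple arithmetic: total RAM minus what the OS and other apps use, minus the model. The sketch below shows one way such a check might look. The 1 GB headroom threshold and the return labels are hypothetical, not the app's actual detection logic.

```python
MODEL_GB = 4.5  # approximate model size stated above

def memory_advice(total_ram_gb: float, other_usage_gb: float,
                  min_headroom_gb: float = 1.0) -> str:
    """Suggest a mode from a simple headroom estimate (threshold is hypothetical)."""
    headroom = total_ram_gb - other_usage_gb - MODEL_GB
    if headroom >= min_headroom_gb:
        return "offline"
    return "suggest-cloud"

print(memory_advice(16, 6))    # offline
print(memory_advice(8, 3.5))   # suggest-cloud
```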

Further Reading

The models behind our model.

We believe in transparency. Here are the open-source projects and research that our local AI is built on.

Qwen2.5 Announcement (qwenlm.github.io): Official blog post introducing the Qwen2.5 model family and 18T-token training corpus.
Qwen2.5 Technical Report (arxiv.org): Full benchmarks, training methodology, and architecture details (December 2024).
Qwen2.5-7B on HuggingFace (huggingface.co): The base 7B model. Apache 2.0 license. 128K context window.
EVA-Qwen2.5-7B-v0.1 (huggingface.co): The uncensored creative fiction fine-tune by EVA-UNIT-01. Full-parameter training on the Celeste data mixture.
Celeste Data Mixture (huggingface.co): The training data blend used by EVA: 70% creative writing (Reddit WP, fiction, roleplay), 30% instruction-following.

V1 today. Offline Mode when V2 ships.

Every V1 license includes V2 at no extra cost. Start with the 14-day free trial; once licensed, you get everything that ships, forever.

Download for Windows  ·  Pricing ($79)