# Model card — GOAT 1.0 Nano (AIARCO-Lite-7B-Instruct v0.1)
## Public name

GOAT 1.0 Nano — the first model in the GOAT family by AIARCO. Internal codename: `aiarco-lite` (used in code, ASC app name, HF repo, R2 paths, Postgres tables, env vars). Do not rename internal artefacts — only the public/customer-facing surfaces (gateway model id, model card title, marketing copy) use `goat-1.0-nano`.

Public gateway model id: `aiarco/goat-1.0-nano` (alias of `aiarco/aiarco-lite-7b-awq`). Family roadmap: Nano → Mini → Standard → Pro → Ultra. See the "GOAT model roadmap" section in `AIARVA MASTER/REVAMP_PLAN.md` (Phase 15+).
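Because only the public surface is renamed, the gateway needs a single translation point from the public id to the internal artefact. A minimal sketch of that lookup, assuming a plain in-process table; `ALIASES` and `resolve_model` are illustrative names, not the actual gateway code:

```python
# Hypothetical gateway-side alias table: public model ids map to
# internal artefact names; internal code never sees the public id.
ALIASES: dict[str, str] = {
    "aiarco/goat-1.0-nano": "aiarco/aiarco-lite-7b-awq",
}

def resolve_model(public_id: str) -> str:
    """Translate a customer-facing model id into the internal repo name."""
    return ALIASES.get(public_id, public_id)

assert resolve_model("aiarco/goat-1.0-nano") == "aiarco/aiarco-lite-7b-awq"
```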
## Model details
- Developer: AIARCO Inc.
- Family: GOAT (Generalist Open Apache-licensed Transformer).
- Tier: Nano — the smallest, cheapest, fastest tier; targets free + low-latency agent-reasoning workloads. Larger tiers (Mini/Standard/Pro/Ultra) are on the roadmap and are NOT shipped in v0.1.
- Base model: `mistralai/Mistral-7B-Instruct-v0.3` (Apache-2.0).
- Method: QLoRA fine-tune (r=16, NF4 + double-quant) → DPO alignment pass → AWQ int4 / GGUF Q4_K_M quantization. See the sketch after this list.
- Languages: English only.
- Licence: Apache-2.0 (matches base; LoRA adapter weights also Apache-2.0).
- Release: v0.1 — May 2026 (rebranded as GOAT 1.0 Nano).
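A minimal sketch of the QLoRA stage described above, using `transformers` and `peft`. Only r=16, NF4, and double quantization come from this card; the compute dtype, LoRA alpha/dropout, and target modules are placeholder choices, not the release-run values:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# NF4 4-bit base weights with double quantization, as stated above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumed dtype
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# r=16 comes from this card; alpha, dropout, and target modules are
# illustrative defaults.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```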
## Intended use
- AIARCO-product-internal RAG over our curated, license-clean corpus.
- Customer-facing chat completion through `/v1/lite/chat` on ASC (`asc functions`, `gpu=a10`; RunPod Serverless interim until Phase E.2). A request sketch follows this list.
- Drop-in low-cost alternative to gateway-routed Anthropic/OpenAI for use cases where lower quality and freshness are acceptable.
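A request sketch against the served endpoint. Only the path `/v1/lite/chat`, the public model id, and the `rag_k ≥ 3` recommendation (see Risks & limitations) come from this card; the host, auth header, and OpenAI-style message shape are assumptions:

```python
import requests

BASE_URL = "https://gateway.example.com"  # placeholder, not a real host
API_KEY = "sk-..."                        # placeholder credential

payload = {
    "model": "aiarco/goat-1.0-nano",
    "messages": [
        {"role": "user", "content": "Summarise the retrieved docs."},
    ],
    "rag_k": 3,  # card recommends rag_k >= 3 to curb hallucinations
}

resp = requests.post(
    f"{BASE_URL}/v1/lite/chat",
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```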
## Out of scope
- Medical, legal, or financial advice (not evaluated for any of these).
- Languages other than English.
- Generation of code targeting niche/private APIs.
- Anything safety-critical.
## Training data

License-clean. No copyright-infringing sources (no Sci-Hub, LibGen, Anna's Archive, books3 derivatives, etc.). Per-source breakdown, licences, and SHA-256 manifest are in `AIARCO_AI/lite/manifests/sources.yml` and the Postgres `lite_corpus_manifest` ledger.
| Source | Licence | ~Tokens (post-dedup) |
|---|---|---|
| Wikipedia EN | CC-BY-SA-4.0 | 4 B |
| C4 EN (5% sample) | ODC-BY | 6 B |
| RedPajama-V2 (1%) | Apache-2.0 | 6 B |
| OpenWebText | CC0-1.0 | 7 B |
| Pile-uncopyrighted (25%) | MIT (subset) | 12 B |
| Project Gutenberg | Public domain | 2 B |
| Institutional Books 1.0 | Public domain | 9 B |
| Harvard PD Corpus | Public domain | 5 B |
| arXiv metadata + abstracts | CC0 | 1 B |
| S2ORC OA-only (10%) | CC-BY (mostly) | 9 B |
| OpenAlex (5%) | CC0-1.0 | 14 B |
| CORD-19 | CC-BY (mostly) | 0.5 B |
| unarXive | CC-BY-SA | 1 B |
| GitHub permissive (20%) | Apache/MIT/BSD/ISC | 4 B |
| Total | | ≈ 80 B |
## Evaluation

Run `make eval`. Targets vs base Mistral-7B-Instruct-v0.3:
| Benchmark | Target |
|---|---|
| MMLU subset (200) | ≥ base − 1pp |
| ARC-easy | ≥ base |
| HellaSwag | ≥ base |
| AIARCO-domain prompts | ≥ base + 5pp (Claude grader) |
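Since every target is a margin over the base model, the pass/fail logic reduces to a per-benchmark comparison. A sketch of that check with placeholder benchmark keys and scores; `make eval` remains the authoritative harness:

```python
# Margins in percentage points relative to base Mistral-7B-Instruct-v0.3,
# taken from the table above. Benchmark keys are illustrative.
TARGET_MARGIN_PP = {
    "mmlu_subset_200": -1.0,   # >= base - 1pp
    "arc_easy": 0.0,           # >= base
    "hellaswag": 0.0,          # >= base
    "aiarco_domain": 5.0,      # >= base + 5pp (Claude grader)
}

def check_targets(base: dict[str, float], lite: dict[str, float]) -> bool:
    """Return True iff every benchmark meets its margin over base."""
    ok = True
    for bench, margin in TARGET_MARGIN_PP.items():
        passed = lite[bench] >= base[bench] + margin
        print(f"{bench:18s} base={base[bench]:5.1f} "
              f"lite={lite[bench]:5.1f} {'PASS' if passed else 'FAIL'}")
        ok &= passed
    return ok
```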
Latency at int4 on ASC A10 (RunPod Serverless A10G interim):
- p50 < 1.5 s for 256 tokens
- p95 < 4.0 s
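To reproduce the latency targets, time repeated 256-token completions and read off the percentiles. A sketch, assuming some callable that issues one completion request (e.g. the request shown under Intended use); the sample count is arbitrary:

```python
import statistics
import time

def measure_latency(send_request, n_samples: int = 50) -> tuple[float, float]:
    """Time send_request (one 256-token completion) n_samples times and
    return (p50, p95) in seconds."""
    latencies = []
    for _ in range(n_samples):
        t0 = time.perf_counter()
        send_request()  # e.g. the requests.post call shown earlier
        latencies.append(time.perf_counter() - t0)
    qs = statistics.quantiles(latencies, n=100)  # 99 percentile cut points
    return qs[49], qs[94]  # p50, p95

# Targets from this card: p50 < 1.5 s, p95 < 4.0 s for 256 tokens.
```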
Risks & limitations
- Hallucinations: model will fabricate facts not present in the
retrieved context. RAG is strongly recommended (
rag_k ≥ 3). - Bias: inherits any biases in the base model and corpus.
- Recency: the training cutoff matches the corpus snapshot date; for v0.1 that is March 2025.
- No HIPAA / no PII: do not send Protected Health Information or personal data — RAG context is logged for safety review.
Cost & sustainability
End-to-end ops cost cap: $100/month. Run make cost to see MTD
spend across AWS, RunPod, ASC, R2. Train→deploy refresh cycle is
once a month.
## Provenance

Every training row traces to (source_url, sha256, fetched_at) via the `lite_corpus_manifest` table. The auditor sampling procedure is documented in `compliance/procedures/access-review.md`.
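An auditor can recompute the hash of a sampled document and compare it against the ledger. A sketch, assuming a psycopg2 connection and an integer primary key on `lite_corpus_manifest`; both are assumptions, only the (source_url, sha256, fetched_at) columns come from this card:

```python
import hashlib

import psycopg2  # assumed driver; any Postgres client works

def verify_row(conn, row_id: int, raw_bytes: bytes) -> bool:
    """Recompute SHA-256 of the fetched document and compare it to the
    ledger entry in lite_corpus_manifest."""
    with conn.cursor() as cur:
        cur.execute(
            "SELECT source_url, sha256, fetched_at "
            "FROM lite_corpus_manifest WHERE id = %s",  # 'id' is assumed
            (row_id,),
        )
        source_url, expected_sha, fetched_at = cur.fetchone()
    actual_sha = hashlib.sha256(raw_bytes).hexdigest()
    print(f"{source_url} fetched {fetched_at}: "
          f"{'OK' if actual_sha == expected_sha else 'MISMATCH'}")
    return actual_sha == expected_sha
```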
## Contact
trust@aiarco.com for licence and dataset questions; security@aiarco.com
for vulnerability reports against the served endpoint.