Jarvis: Hermes AI Operating System
A persistent, always-on AI operating system built on the Hermes Agent framework. It combines layered memory, a growing library of reusable skills, an Obsidian knowledge vault backed by GitHub, and cron-driven autonomous workflows, and it is accessible from anywhere via Telegram.
"Most agent systems solve for capability. This one solves for continuity. An agent that keeps getting smarter, more autonomous, and more useful the longer it runs."
Contrast with Multi-Agent OpenClaw
OpenClaw solves the coordination problem: how a team of specialized agents route tasks, hand off work, and avoid role confusion. Jarvis solves the persistence problem: how a single agent accumulates knowledge, builds reusable procedures, and operates autonomously over weeks and months without losing context. Two different design philosophies for two different failure modes.
01 The Problem
Capable AI agents have a fundamental operational problem: they forget. Every session starts cold. Context from last week's research is gone. A workflow that worked once has to be re-explained. Insights get buried in chat history no one will read again.
The result is an assistant that is impressive in a single conversation but does not compound. You cannot build on it, and it does not get better. It also does nothing unless asked: no scheduling, no proactive monitoring, no autonomous habits.
That gap between a capable AI and a reliable AI operator is what Jarvis was built to close.
02 What I Built
Jarvis is a personal AI operating system running on a VPS, reachable via Telegram 24/7, with a layered architecture designed around durability, knowledge accumulation, and autonomous operation.
Layered Memory
USER.md for personal preferences, MEMORY.md for system facts. Compact, high-value state. Not a dump.
Skills Library
125 reusable procedure files across 28 categories. The agent gets smarter by accumulating playbooks, not just better prompts.
Obsidian Knowledge Vault
590 markdown files, git-backed and synced daily. Research, reports, and project notes persist as durable assets.
Cron Automation
Daily AI idea briefs, weekly backlog synthesis, vault git syncs. Scheduled jobs run without being asked.
The interface is Telegram. Mobile-first, always accessible, no terminal required. The underlying model is GPT-5.4 (primary) with fallback routing through DeepSeek and Claude via OpenRouter. Google Workspace integration handles report delivery via Gmail, Docs, and Calendar hooks.
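The primary-then-fallback routing described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the routing Hermes or OpenRouter actually performs: `complete_with_fallback`, `AllModelsFailed`, and the stub callables are hypothetical names, and each model is represented by a plain callable so that no real API is implied.

```python
from typing import Callable, List, Sequence, Tuple

ModelCall = Callable[[str], str]  # hypothetical: takes a prompt, returns text


class AllModelsFailed(RuntimeError):
    """Raised when every model in the chain has failed."""


def complete_with_fallback(
    prompt: str, chain: Sequence[Tuple[str, ModelCall]]
) -> Tuple[str, str]:
    """Try each (name, call) pair in order; return the first success."""
    errors: List[str] = []
    for name, call in chain:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real router would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise AllModelsFailed("; ".join(errors))


# Stub models standing in for the primary and a fallback (assumptions).
def primary_down(prompt: str) -> str:
    raise TimeoutError("primary unavailable")


def fallback_echo(prompt: str) -> str:
    return "echo: " + prompt


used, reply = complete_with_fallback(
    "hi", [("primary", primary_down), ("fallback", fallback_echo)]
)
```

Keeping the chain as ordered data means the fallback order can change without touching the calling code.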
03 A Different Design Philosophy
Multi-Agent OpenClaw is built around a team. Jeeves orchestrates, Ada codes, Hermes researches, Scribe documents. The design challenge is coordination: how do specialized agents route work, hand off context, and avoid role confusion?
Jarvis is built around a single operator that does not forget. The design challenge is entirely different: how does one agent accumulate knowledge across months, build reusable procedures instead of re-explaining the same task every week, and eventually run jobs without being asked at all?
OpenClaw approach
- Team of specialists, explicit routing
- Coordination doctrine (baton files, role boundaries)
- Claude Code desktop environment
- Strength: parallelism, specialization
- Challenge: handoff reliability
Jarvis/Hermes approach
- Single persistent operator, layered memory
- Knowledge accumulation (skills, vault, sessions)
- VPS + Telegram, always-on mobile access
- Strength: continuity, autonomous scheduling
- Challenge: memory discipline, signal vs. noise
Neither is universally better. They solve different problems. OpenClaw is the right model when a task needs parallel specialization. Jarvis is the right model when you need an operator that compounds knowledge and works without constant prompting.
04 Key Design Decisions
Memory as compact high-value state. USER.md holds personal preferences and constraints (~1,400 chars). MEMORY.md holds system facts: mount paths, OAuth quirks, deployment notes (~2,200 chars). The discipline is what gets left out. No task logs, no raw conversation summaries, no ambient noise. Memory that grows unbounded becomes useless.
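One way to make that discipline mechanical is a size check that flags a memory file once it outgrows a ceiling. The sketch below is an assumption, not part of the deployed system: the ceilings in `BUDGETS` are hypothetical values set slightly above the approximate sizes quoted above, and `check_budgets` is an illustrative helper.

```python
from typing import Dict

# Hypothetical ceilings, a little above the ~1,400 and ~2,200 chars cited.
BUDGETS: Dict[str, int] = {"USER.md": 1500, "MEMORY.md": 2500}


def check_budgets(
    contents: Dict[str, str], budgets: Dict[str, int] = BUDGETS
) -> Dict[str, int]:
    """Return {filename: chars_over_budget} for files exceeding their ceiling."""
    return {
        name: len(text) - budgets[name]
        for name, text in contents.items()
        if name in budgets and len(text) > budgets[name]
    }


# Example: USER.md is within budget, MEMORY.md is 100 chars over.
overages = check_budgets({"USER.md": "x" * 1400, "MEMORY.md": "y" * 2600})
```

A non-empty result is a prompt to curate, not a reason to raise the ceiling.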
Skills as procedural memory. 125 documented procedures across 28 categories mean the agent doesn't have to reason about how to do a GitHub PR or generate a report. It follows a playbook. This shifts improvement from "better prompting" to "accumulated procedures," which compounds over time in a way that prompt tweaks don't.
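To illustrate how a playbook library organized by category might be indexed, here is a minimal sketch. The `category/name.md` layout, the example file names, and the `build_skill_index` helper are assumptions for illustration; the actual on-disk structure of the skills library is not described here.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, Iterable, List


def build_skill_index(paths: Iterable[str]) -> Dict[str, List[str]]:
    """Group skill files by their parent directory (treated as the category)."""
    index: Dict[str, List[str]] = defaultdict(list)
    for raw in paths:
        path = PurePosixPath(raw)
        # Skip non-markdown files and files that sit outside any category.
        if path.suffix != ".md" or len(path.parts) < 2:
            continue
        index[path.parts[-2]].append(path.stem)
    return {category: sorted(names) for category, names in sorted(index.items())}


# Hypothetical file names, used only to show the shape of the result.
skill_index = build_skill_index(
    [
        "github/open-pr.md",
        "reports/weekly-report.md",
        "github/review-pr.md",
        "README.md",
    ]
)
```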
Obsidian as the knowledge surface. Rather than letting outputs disappear into chat history, work products get routed into a git-backed Obsidian vault. 590 files, synced daily at 4:30 AM UTC. Research done last month is retrievable. Reports from three weeks ago have a permanent home.
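A daily vault sync of this kind could look roughly like the sketch below. It is an assumption about the shape of such a job, not the actual sync script: `sync_vault` and `_run_git` are hypothetical helpers, the vault path and commit message are placeholders, and the runner is injectable so the logic can be exercised without a real repository.

```python
import subprocess
from typing import Callable, Sequence

GitRunner = Callable[[Sequence[str], str], str]


def _run_git(args: Sequence[str], cwd: str) -> str:
    """Run a git subcommand in cwd and return its stdout; raise on failure."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout


def sync_vault(vault_dir: str, message: str, run: GitRunner = _run_git) -> bool:
    """Commit and push vault changes. Returns False if nothing changed."""
    # `git status --porcelain` prints nothing when the working tree is clean.
    if not run(["status", "--porcelain"], vault_dir).strip():
        return False
    run(["add", "-A"], vault_dir)
    run(["commit", "-m", message], vault_dir)
    run(["push"], vault_dir)
    return True
```

Checking for changes first avoids the non-zero exit that `git commit` returns when there is nothing to commit.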
Cron as an autonomy layer. Scheduled jobs (daily idea briefs, weekly backlog synthesis, vault syncs) run without user prompting. This is the difference between a reactive assistant and an agent that has habits: it operates on its own schedule, not just yours.
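The schedule behind these jobs can be pictured as a small table of cron expressions. In the sketch below, only the 04:30 UTC vault sync time comes from the text; the other two times and all job names are hypothetical. The matcher is deliberately minimal: it handles only `*` and single integers, not ranges, lists, steps, or cron's special day-of-month/day-of-week rule.

```python
from datetime import datetime, timezone
from typing import Dict, List

JOBS: Dict[str, str] = {
    "vault-sync": "30 4 * * *",         # 04:30 UTC daily, as stated above
    "idea-brief": "0 7 * * *",          # hypothetical time
    "backlog-synthesis": "0 8 * * 1",   # hypothetical: Mondays at 08:00
}


def matches(expr: str, when: datetime) -> bool:
    """Check a five-field cron expression (only '*' or integers) against a time."""
    fields = expr.split()
    # Cron counts Sunday as 0; Python's weekday() counts Monday as 0.
    actual = (
        when.minute,
        when.hour,
        when.day,
        when.month,
        (when.weekday() + 1) % 7,
    )
    return all(f == "*" or int(f) == value for f, value in zip(fields, actual))


def due_jobs(when: datetime, jobs: Dict[str, str] = JOBS) -> List[str]:
    """Return the names of jobs scheduled for the given minute, sorted."""
    return sorted(name for name, expr in jobs.items() if matches(expr, when))


# 2025-01-06 is a Monday.
monday_sync = due_jobs(datetime(2025, 1, 6, 4, 30, tzinfo=timezone.utc))
```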
05 Challenges
Memory discipline turned out to be harder than memory capacity. The failure mode is not "the agent forgets things." It is "the agent's memory fills up with things that should not be in memory." Keeping USER.md and MEMORY.md small and high-signal required active curation, not just appending.
Skills library maintenance has similar dynamics. A 125-procedure library is only useful if its procedures stay current. An outdated playbook that silently misleads is worse than no playbook at all, because the agent follows it with full confidence.
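A simple guard against stale playbooks is to track when each procedure was last reviewed and flag the ones past a cutoff. The sketch below assumes such review dates are recorded somewhere. The 90-day threshold, the skill names, and the `stale_skills` helper are all hypothetical.

```python
from datetime import date, timedelta
from typing import Dict, List


def stale_skills(
    last_reviewed: Dict[str, date], today: date, max_age_days: int = 90
) -> List[str]:
    """Return skill names whose last review is older than max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(
        name for name, reviewed in last_reviewed.items() if reviewed < cutoff
    )


# Hypothetical review dates, for illustration only.
needs_review = stale_skills(
    {"open-pr": date(2025, 5, 20), "weekly-report": date(2025, 1, 15)},
    today=date(2025, 6, 1),
)
```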
The gbrain/QMD integration (a planned knowledge graph layer) is documented in the repo but not yet active in the deployed container. Listing it as "documented, not live" is a deliberate choice. This portfolio shows real systems, not inflated claims.
06 What I Learned
Compounding beats capability
An agent that accumulates knowledge and procedures over months is more useful than a more capable agent that starts fresh every session.
Memory needs curation, not just storage
The discipline is what you leave out. Memory that grows unbounded degrades into noise. Compact and high-signal beats comprehensive and cluttered.
Scheduled autonomy changes the relationship
When an agent runs jobs on its own schedule, it stops being a tool you use and starts being infrastructure you rely on.
Honest docs build more trust
Stating plainly what is live and what is only planned (here, the gbrain integration) matters. Systems that overclaim erode trust when reality doesn't match.
07 Why This Matters
Most AI demos show what a model can do in a single impressive session. This project shows what an agent system looks like after months of deliberate operational design: structured memory, a growing skills library, autonomous scheduled workflows, and a knowledge vault that compounds over time.
Separating profile memory from system facts, building procedural playbooks, routing outputs to durable storage, scheduling jobs that run without prompting: these are the same patterns any organization needs when AI agents stop being demos and start being infrastructure.