Architecture Deep Dive¶
This page provides a detailed technical explanation of how Hermes Agent works internally.
System Overview¶
Hermes Agent is built with a modular architecture that separates concerns:
┌─────────────────────────────────────────────────────────────┐
│                      System Components                      │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   ┌─────────────────────────────────────────────────────┐   │
│   │                    Entry Points                     │   │
│   │      cli.py │ gateway/ │ acp_server.py │ cron/      │   │
│   └─────────────────────┬───────────────────────────────┘   │
│                         │                                   │
│                         ▼                                   │
│   ┌─────────────────────────────────────────────────────┐   │
│   │                     Core Agent                      │   │
│   │       run_agent.py │ agent/ │ model_tools.py        │   │
│   └─────────────────────┬───────────────────────────────┘   │
│                         │                                   │
│                         ▼                                   │
│   ┌─────────────────────────────────────────────────────┐   │
│   │                   Tools & Skills                    │   │
│   │           tools/ │ skills/ │ toolsets.py            │   │
│   └─────────────────────┬───────────────────────────────┘   │
│                         │                                   │
│                         ▼                                   │
│   ┌─────────────────────────────────────────────────────┐   │
│   │                       Storage                       │   │
│   │       hermes_state.py │ memories/ │ sessions/       │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
└─────────────────────────────────────────────────────────────┘
Core Components¶
1. AIAgent (run_agent.py)¶
The AIAgent class is the heart of Hermes:
class AIAgent:
    def __init__(self, config, tools, memory):
        self.config = config
        self.tools = tools
        self.memory = memory
        self.messages = []
        self.iterations = 0
        self.max_turns = 90  # default; see agent.max_turns in config.yaml

    def run_conversation(self, user_input):
        """Main conversation loop"""
        # 1. Build system prompt
        system_prompt = self.build_system_prompt()
        # 2. Add user message
        self.messages.append({"role": "user", "content": user_input})
        # 3. Loop until the model answers in text or the turn budget runs out
        self.iterations = 0
        while self.iterations < self.max_turns:
            self.iterations += 1
            # Call LLM
            response = self.call_llm(system_prompt, self.messages)
            # Process response
            if response.tool_calls:
                # Execute tools and feed each result back to the model
                for tool_call in response.tool_calls:
                    result = self.execute_tool(tool_call)
                    self.messages.append({
                        "role": "tool",
                        "content": result
                    })
            else:
                # Return text response
                return response.content
        # Turn budget exhausted without a final text response
        return None
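To make the loop concrete, here is a minimal, self-contained sketch that runs it against a scripted stand-in for the model. `ToolCall`, `Reply`, `FakeLLM`, and `run_loop` are hypothetical names invented for this illustration and are not part of Hermes; the message shape is also an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    id: str
    name: str
    arguments: dict


@dataclass
class Reply:
    content: str = ""
    tool_calls: list = field(default_factory=list)


class FakeLLM:
    """Scripted stand-in for a model: requests one tool call, then answers."""

    def __call__(self, messages):
        tool_results = [m for m in messages if m["role"] == "tool"]
        if not tool_results:
            return Reply(tool_calls=[ToolCall("call_1", "add", {"a": 2, "b": 3})])
        return Reply(content=f"The sum is {tool_results[-1]['content']}.")


def run_loop(llm, tools, user_input, max_turns=5):
    """Call the model until it answers in plain text or the turn budget runs out."""
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_turns):
        reply = llm(messages)
        if not reply.tool_calls:
            return reply.content
        for call in reply.tool_calls:
            result = tools[call.name](**call.arguments)
            messages.append(
                {"role": "tool", "tool_call_id": call.id, "content": str(result)}
            )
    return None  # turn budget exhausted without a final answer


answer = run_loop(FakeLLM(), {"add": lambda a, b: a + b}, "What is 2 + 3?")
print(answer)
```

The turn budget matters because a model that keeps requesting tools would otherwise loop forever; here the caller receives `None` when the budget is spent.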
2. Prompt Builder (agent/prompt.py)¶
The prompt builder constructs the system prompt:
def build_system_prompt(self):
    """Build the complete system prompt"""
    sections = []
    # 1. Base persona
    sections.append(self.load_persona())
    # 2. Memory injection
    sections.append(self.inject_memory())
    # 3. Skills context
    sections.append(self.load_skills())
    # 4. Tool schemas
    sections.append(self.get_tool_schemas())
    # 5. Context files (AGENTS.md, etc.)
    sections.append(self.load_context_files())
    # Skip empty sections (e.g. no stored memory) so the prompt has no blank gaps
    return "\n\n".join(s for s in sections if s)
3. Tool Dispatch (model_tools.py)¶
Tools are dispatched through a central registry:
# tools/registry.py
class ToolRegistry:
    def __init__(self):
        self.tools = {}

    def register(self, name, schema, handler, toolset=None, check_fn=None):
        """Register a tool"""
        self.tools[name] = {
            "toolset": toolset,
            "schema": schema,
            "handler": handler,
            "check_fn": check_fn
        }

    def dispatch(self, name, arguments, **kwargs):
        """Execute a tool"""
        tool = self.tools.get(name)
        if tool is None:
            return {"error": f"Unknown tool: {name}"}
        # Check requirements
        if tool["check_fn"] and not tool["check_fn"]():
            return {"error": "Requirements not met"}
        # Execute handler
        return tool["handler"](arguments, **kwargs)


# Shared instance that tool modules import (from tools.registry import registry)
registry = ToolRegistry()
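As a usage illustration, the sketch below registers two tools on a compact variant of the registry and dispatches to them. The `echo` and `gpu_only` tools, and the unknown-tool error message, are hypothetical and exist only for this example.

```python
import json


class ToolRegistry:
    """Compact variant of the registry pattern so this example runs on its own."""

    def __init__(self):
        self.tools = {}

    def register(self, name, schema, handler, toolset=None, check_fn=None):
        self.tools[name] = {
            "toolset": toolset,
            "schema": schema,
            "handler": handler,
            "check_fn": check_fn,
        }

    def dispatch(self, name, arguments, **kwargs):
        tool = self.tools.get(name)
        if tool is None:
            return {"error": f"Unknown tool: {name}"}
        if tool["check_fn"] and not tool["check_fn"]():
            return {"error": "Requirements not met"}
        return tool["handler"](arguments, **kwargs)


registry = ToolRegistry()

# A tool that is always available
registry.register(
    name="echo",
    schema={"name": "echo", "description": "Return the input text"},
    handler=lambda arguments, **kwargs: json.dumps({"output": arguments.get("text", "")}),
)

# A tool whose requirement check fails, so dispatch refuses to run it
registry.register(
    name="gpu_only",
    schema={"name": "gpu_only", "description": "Needs hardware that is absent"},
    handler=lambda arguments, **kwargs: json.dumps({"output": "ran"}),
    check_fn=lambda: False,
)

ok = registry.dispatch("echo", {"text": "hi"})
blocked = registry.dispatch("gpu_only", {})
missing = registry.dispatch("nope", {})
print(ok, blocked, missing)
```

Keeping the availability check next to the handler lets the registry hide or refuse tools whose dependencies are missing without the agent loop needing to know why.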
4. Context Compression (agent/compression.py)¶
When conversations get long, Hermes compresses context:
def compress_context(self, messages, target_ratio=0.2):
    """Compress conversation context"""
    # 1. Calculate current size
    current_size = self.calculate_token_count(messages)
    # Budget the summarizer should aim for; it is not enforced in this function
    target_size = int(current_size * target_ratio)
    # 2. Protect recent messages (slicing with -0 would protect everything)
    n = self.protect_last_n
    protected = messages[-n:] if n > 0 else []
    compressible = messages[:-n] if n > 0 else list(messages)
    if not compressible:
        # Nothing older than the protected tail, so there is nothing to summarize
        return messages
    # 3. Summarize older messages
    summary = self.summarize_messages(compressible)
    # 4. Reconstruct messages
    compressed = [
        {"role": "system", "content": f"Previous context summary: {summary}"},
        *protected
    ]
    return compressed
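The following sketch shows the same summarize-and-protect flow end to end with stand-in helpers. The word-count token estimate and the truncating `summarize` function are assumptions made so the example runs without a model; a real implementation would use a tokenizer and an LLM-generated summary.

```python
def count_tokens(messages):
    """Crude token estimate: whitespace-separated words (a real tokenizer differs)."""
    return sum(len(m["content"].split()) for m in messages)


def summarize(messages, max_words):
    """Placeholder summarizer: keeps the first few words of each message."""
    per_message = max(1, max_words // max(1, len(messages)))
    parts = [" ".join(m["content"].split()[:per_message]) for m in messages]
    return " | ".join(parts)


def compress(messages, target_ratio=0.2, protect_last_n=2):
    """Replace everything except the last `protect_last_n` messages with one summary."""
    if protect_last_n <= 0:
        protected, older = [], list(messages)
    else:
        protected, older = messages[-protect_last_n:], messages[:-protect_last_n]
    if not older:
        return list(messages)  # nothing to compress
    budget = int(count_tokens(older) * target_ratio)
    summary = summarize(older, budget)
    head = {"role": "system", "content": f"Previous context summary: {summary}"}
    return [head, *protected]


history = [
    {"role": "user", "content": "word " * 50},
    {"role": "assistant", "content": "reply " * 50},
    {"role": "user", "content": "latest question"},
    {"role": "assistant", "content": "latest answer"},
]
compressed = compress(history)
print(count_tokens(history), "->", count_tokens(compressed))
```

The most recent messages pass through untouched, which keeps the immediate exchange verbatim while older turns collapse into a single system message.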
Tool System Architecture¶
Tool Registration¶
Each tool is defined in the tools/ directory and registers itself with the shared registry:
# tools/terminal.py
import json
import subprocess

from tools.registry import registry


def terminal_handler(arguments, **kwargs):
    """Execute shell command"""
    command = arguments.get("command", "")
    # Run the command through the shell and capture both output streams as text
    result = subprocess.run(
        command,
        shell=True,
        capture_output=True,
        text=True
    )
    return json.dumps({
        "output": result.stdout,
        "error": result.stderr,
        "exit_code": result.returncode
    })


registry.register(
    name="terminal",
    toolset="terminal",
    schema={
        "name": "terminal",
        "description": "Execute shell commands",
        "parameters": {
            "type": "object",
            "properties": {
                "command": {
                    "type": "string",
                    "description": "Command to execute"
                }
            },
            "required": ["command"]
        }
    },
    handler=terminal_handler,
    check_fn=lambda: True  # Always available
)
Toolset Organization¶
Tools are grouped into toolsets:
# toolsets.py
_HERMES_CORE_TOOLS = [
    "terminal",
    "file",
    "web",
    "browser",
    "vision",
    "memory",
    "skills",
    "delegation",
    "cronjob",
    # ...
]

# Per-platform overrides (entries omitted here); platforms without an
# entry fall back to the core tools.
_PLATFORM_TOOLSETS = {}


def get_toolsets(platform="cli"):
    """Get available toolsets for platform"""
    return _PLATFORM_TOOLSETS.get(platform, _HERMES_CORE_TOOLS)
Memory System¶
Memory Storage¶
Memory is stored in ~/.hermes/memories/:
~/.hermes/memories/
├── user.json # User profile
├── memory.json # Environment facts
└── sessions/ # Session transcripts
Memory Injection¶
Memory is injected into every conversation:
def inject_memory(self):
    """Inject memory into system prompt"""
    sections = []
    # User profile
    user = self.load_user_profile()
    if user:
        sections.append(f"## User Profile\n{user}")
    # Memory facts
    memory = self.load_memory()
    if memory:
        sections.append(f"## Memory\n{memory}")
    return "\n\n".join(sections)
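One way the two loaders could read the files listed under Memory Storage is sketched below. The file names `user.json` and `memory.json` come from the directory listing; `load_json_or_empty` and `render_memory` are hypothetical helpers, and rendering the JSON with `json.dumps` is an assumption about the prompt format.

```python
import json
import tempfile
from pathlib import Path


def load_json_or_empty(path):
    """Return parsed JSON, or {} when the file is missing or malformed."""
    try:
        return json.loads(Path(path).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return {}


def render_memory(memory_dir):
    """Build the memory portion of a system prompt from user.json and memory.json."""
    memory_dir = Path(memory_dir)
    sections = []
    user = load_json_or_empty(memory_dir / "user.json")
    if user:
        sections.append("## User Profile\n" + json.dumps(user, indent=2))
    facts = load_json_or_empty(memory_dir / "memory.json")
    if facts:
        sections.append("## Memory\n" + json.dumps(facts, indent=2))
    return "\n\n".join(sections)


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "user.json").write_text(json.dumps({"name": "Ada"}), encoding="utf-8")
    prompt_part = render_memory(tmp)  # memory.json is absent, so only one section
    empty_part = render_memory(Path(tmp) / "missing")
print(prompt_part)
```

Treating a missing or unreadable file as empty memory means a fresh install still produces a valid prompt.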
Skills System¶
Skill Loading¶
Skills are loaded from ~/.hermes/skills/:
def load_skills(self, skill_names=None):
    """Load skills into context"""
    skills = []
    for skill_dir in self.skills_dir.iterdir():
        # Optionally restrict loading to the requested skills
        if skill_names is not None and skill_dir.name not in skill_names:
            continue
        skill_file = skill_dir / "SKILL.md"
        if skill_file.exists():
            # Parse frontmatter
            content = skill_file.read_text()
            frontmatter, body = parse_frontmatter(content)
            skills.append({
                "name": frontmatter["name"],
                "description": frontmatter["description"],
                "content": body
            })
    return skills
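`parse_frontmatter` is called above but not shown. A minimal sketch of what such a helper might look like follows; it assumes flat `key: value` front matter between two `---` markers, and the `git-bisect` skill is an invented example. Nested YAML would need a real YAML parser.

```python
def parse_frontmatter(content):
    """Split a document into (frontmatter dict, body).

    Handles only flat `key: value` lines between two `---` markers.
    """
    lines = content.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, content
    for end, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            break
    else:
        return {}, content  # no closing marker: treat everything as body
    frontmatter = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            frontmatter[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return frontmatter, body


doc = """---
name: git-bisect
description: Find the commit that introduced a bug
---
Run `git bisect start`, then mark good and bad commits.
"""
meta, body = parse_frontmatter(doc)
print(meta["name"], "|", body)
```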
Skill Creation¶
Skills are created when Hermes learns something new:
def create_skill(self, name, description, content):
    """Create a new skill"""
    skill_dir = self.skills_dir / name
    skill_dir.mkdir(parents=True, exist_ok=True)
    skill_file = skill_dir / "SKILL.md"
    skill_file.write_text(f"""---
name: {name}
description: {description}
---
{content}
""")
Gateway Architecture¶
Platform Adapters¶
Each platform has an adapter:
# gateway/platforms/telegram.py
import telegram


class TelegramAdapter:
    def __init__(self, config, agent):
        self.bot = telegram.Bot(token=config["token"])
        self.agent = agent  # agent that handle_message forwards text to

    async def handle_message(self, update, context):
        """Handle incoming message"""
        user_id = update.effective_user.id
        message = update.message.text
        # Process through agent
        response = await self.agent.process(message)
        # Send response
        await self.bot.send_message(
            chat_id=update.effective_chat.id,
            text=response
        )
Message Routing¶
Messages are routed through the gateway:
class Gateway:
    def __init__(self, agent):
        self.adapters = {}
        # The agent arrives already configured, since AIAgent needs config, tools, and memory
        self.agent = agent

    def register_adapter(self, name, adapter):
        """Register a platform adapter"""
        self.adapters[name] = adapter

    async def route_message(self, platform, message):
        """Route message to agent"""
        response = await self.agent.process(message)
        # Send back through platform
        await self.adapters[platform].send_response(response)
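The routing path can be exercised without any real platform by substituting stand-ins. In this sketch `EchoAgent` and `RecordingAdapter` are hypothetical test doubles, and the explicit check for an unregistered platform is an addition made for the example.

```python
import asyncio


class EchoAgent:
    """Stand-in agent; a real one would call the model."""

    async def process(self, message):
        return f"echo: {message}"


class RecordingAdapter:
    """Stand-in platform adapter that stores replies instead of sending them."""

    def __init__(self):
        self.sent = []

    async def send_response(self, response):
        self.sent.append(response)


class Gateway:
    def __init__(self, agent):
        self.agent = agent
        self.adapters = {}

    def register_adapter(self, name, adapter):
        self.adapters[name] = adapter

    async def route_message(self, platform, message):
        if platform not in self.adapters:
            raise KeyError(f"No adapter registered for platform {platform!r}")
        response = await self.agent.process(message)
        await self.adapters[platform].send_response(response)


async def main():
    gateway = Gateway(EchoAgent())
    telegram, discord = RecordingAdapter(), RecordingAdapter()
    gateway.register_adapter("telegram", telegram)
    gateway.register_adapter("discord", discord)
    # Messages from different platforms are handled concurrently
    await asyncio.gather(
        gateway.route_message("telegram", "hello"),
        gateway.route_message("discord", "hi"),
    )
    return telegram.sent, discord.sent


telegram_sent, discord_sent = asyncio.run(main())
print(telegram_sent, discord_sent)
```

Because each reply goes back through the adapter registered under the originating platform name, responses cannot cross between platforms even when messages are processed at the same time.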
Credential Pool¶
Key Rotation¶
Multiple API keys rotate automatically:
class CredentialPool:
    def __init__(self, provider, keys):
        self.provider = provider
        self.keys = keys
        self.current_index = 0

    def get_next_key(self):
        """Get next available key"""
        key = self.keys[self.current_index]
        self.current_index = (self.current_index + 1) % len(self.keys)
        return key

    def mark_exhausted(self, key):
        """Mark a key as rate-limited"""
        # Move to next key
        pass
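`mark_exhausted` is left as a stub above. One possible way to fill it in is a cooldown: a rate-limited key is skipped until a timer expires. The sketch below is an assumption about how that could work, not the Hermes implementation; `CooldownPool`, `cooldown_seconds`, and the injectable `clock` are hypothetical.

```python
import time


class CooldownPool:
    """Round-robin key pool that skips keys marked as rate-limited."""

    def __init__(self, keys, cooldown_seconds=60.0, clock=time.monotonic):
        if not keys:
            raise ValueError("at least one key is required")
        self.keys = list(keys)
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock  # injectable so tests can control time
        self.current_index = 0
        self.exhausted_until = {}  # key -> time when it becomes usable again

    def mark_exhausted(self, key):
        """Take a key out of rotation for `cooldown_seconds`."""
        self.exhausted_until[key] = self.clock() + self.cooldown_seconds

    def get_next_key(self):
        """Return the next usable key, or None if every key is cooling down."""
        now = self.clock()
        for _ in range(len(self.keys)):
            key = self.keys[self.current_index]
            self.current_index = (self.current_index + 1) % len(self.keys)
            if self.exhausted_until.get(key, float("-inf")) <= now:
                return key
        return None


fake_now = [0.0]
pool = CooldownPool(["key-a", "key-b"], cooldown_seconds=30, clock=lambda: fake_now[0])
first = pool.get_next_key()    # normal rotation starts at key-a
pool.mark_exhausted("key-b")
second = pool.get_next_key()   # key-b is cooling down, so key-a is returned again
fake_now[0] = 31.0
third = pool.get_next_key()    # cooldown elapsed, key-b is back in rotation
print(first, second, third)
```

Returning `None` when every key is cooling down leaves the decision to wait or fail with the caller.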
Configuration System¶
Config Structure¶
Configuration is stored in ~/.hermes/config.yaml:
model:
  default: claude-sonnet-4
  provider: anthropic
  api_key: ${ANTHROPIC_API_KEY}
agent:
  max_turns: 90
  tool_use_enforcement: auto
terminal:
  backend: local
  timeout: 180
compression:
  enabled: true
  threshold: 0.5
  target_ratio: 0.2
memory:
  memory_enabled: true
  user_profile_enabled: true
Environment Variables¶
Secrets are stored in ~/.hermes/.env and referenced from config.yaml through placeholders such as ${ANTHROPIC_API_KEY}.
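The config file refers to `${ANTHROPIC_API_KEY}` rather than holding the key itself. A loader could resolve such placeholders along the lines of the sketch below. The `KEY=VALUE` layout of the .env file is an assumption based on the common dotenv convention, and `load_env_file` and `expand` are hypothetical helpers.

```python
import re

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def load_env_file(text):
    """Parse KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def expand(value, env):
    """Recursively replace ${VAR} placeholders in strings, dicts, and lists."""
    if isinstance(value, str):
        def substitute(match):
            name = match.group(1)
            if name not in env:
                raise KeyError(f"Missing environment variable: {name}")
            return env[name]
        return _PLACEHOLDER.sub(substitute, value)
    if isinstance(value, dict):
        return {k: expand(v, env) for k, v in value.items()}
    if isinstance(value, list):
        return [expand(v, env) for v in value]
    return value  # numbers, booleans, None pass through unchanged


env = load_env_file('# secrets\nANTHROPIC_API_KEY="example-not-a-real-key"\n')
config = {
    "model": {"provider": "anthropic", "api_key": "${ANTHROPIC_API_KEY}"},
    "agent": {"max_turns": 90},
}
resolved = expand(config, env)
print(resolved["model"]["api_key"])
```

Failing loudly on a missing variable avoids sending a literal `${...}` string to the provider as if it were a key.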
Data Flow¶
Request Processing¶
     User Input
         │
         ▼
┌──────────────────┐
│ Platform Adapter │ (Telegram, Discord, etc.)
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│     Gateway      │ (Message routing)
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│     AIAgent      │ (Core agent loop)
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  Prompt Builder  │ (System prompt construction)
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│     LLM API      │ (Model inference)
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  Tool Dispatch   │ (Execute tools)
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│   Response Gen   │ (Generate response)
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Platform Adapter │ (Send response)
└──────────────────┘
Performance Considerations¶
Token Optimization¶
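- With the default target_ratio of 0.2, context compression aims to shrink the conversation to about 20% of its token count (roughly an 80% reduction)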
- Skill loading is lazy (only when needed)
- Tool schemas are cached after first load
Memory Management¶
- Session transcripts are stored in SQLite
- Memory facts are compact JSON
- Skills are loaded on-demand
Concurrency¶
- Gateway handles multiple platforms concurrently
- Delegation spawns subagents for parallel work
- Cron jobs run in separate processes
Security Model¶
Command Approval¶
import re


def should_approve(command):
    """Check if command needs approval"""
    # Regex patterns for commands that can destroy data
    dangerous_patterns = [
        r'rm\s+-rf',
        r'git\s+reset\s+--hard',
        r'dd\s+if=',
        r'mkfs',
    ]
    for pattern in dangerous_patterns:
        if re.search(pattern, command):
            return True
    return False
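A check like this is typically placed in front of command execution. The sketch below shows one possible gate; `gate` and the `ask_user` callback are hypothetical names, and the final line demonstrates a limitation of pattern matching rather than a recommended input.

```python
import re

DANGEROUS_PATTERNS = [r"rm\s+-rf", r"git\s+reset\s+--hard", r"dd\s+if=", r"mkfs"]


def should_approve(command):
    """Return True when the command matches a known destructive pattern."""
    return any(re.search(pattern, command) for pattern in DANGEROUS_PATTERNS)


def gate(command, ask_user):
    """Return True if the command may run; consult `ask_user` only for risky ones."""
    if should_approve(command):
        return bool(ask_user(command))
    return True


asked = []


def deny(command):
    """Approval callback that records the request and refuses it."""
    asked.append(command)
    return False


safe = gate("ls -la", deny)               # no pattern matches, runs without asking
risky = gate("rm  -rf /tmp/build", deny)  # matches rm\s+-rf, so the user is asked
missed = should_approve("rm -fr /tmp/build")  # reordered flags slip past the regex
print(safe, risky, missed)
```

As the last line shows, a regex denylist is best-effort: equivalent spellings such as reordered flags are not caught, so it works as a safety net rather than a security boundary.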
Secret Redaction¶
import re


def redact_secrets(text):
    """Remove secrets from text"""
    patterns = [
        r'sk-[a-zA-Z0-9]{48}',   # API keys
        r'ghp_[a-zA-Z0-9]{36}',  # GitHub tokens
        r'AKIA[0-9A-Z]{16}',     # AWS keys
    ]
    for pattern in patterns:
        text = re.sub(pattern, '[REDACTED]', text)
    return text
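A short usage sketch follows. The tokens are fabricated placeholders that only match the shapes of the patterns above; the AWS value is the example access key ID used in AWS documentation.

```python
import re

SECRET_PATTERNS = [
    r"sk-[a-zA-Z0-9]{48}",   # API keys
    r"ghp_[a-zA-Z0-9]{36}",  # GitHub tokens
    r"AKIA[0-9A-Z]{16}",     # AWS keys
]


def redact_secrets(text):
    """Replace every substring that matches a secret pattern."""
    for pattern in SECRET_PATTERNS:
        text = re.sub(pattern, "[REDACTED]", text)
    return text


# Fabricated values with the right shape; none of these are real credentials
fake_api_key = "sk-" + "a" * 48
fake_github_token = "ghp_" + "B" * 36
log_line = f"api={fake_api_key} gh={fake_github_token} aws=AKIAIOSFODNN7EXAMPLE user=ada"

clean = redact_secrets(log_line)
print(clean)
```

Only text that matches one of the fixed patterns is removed, so secrets in other formats pass through unchanged.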
Next Steps¶
- Tool System — How tools work in detail
- Skills System — Creating and managing skills
- Memory System — Persistent memory
- Contributing — Extend Hermes