Architecture Deep Dive

This page provides a detailed technical explanation of how Hermes Agent works internally.

System Overview

Hermes Agent uses a layered, modular architecture: entry points call into the core agent, which dispatches to tools and skills, and everything is backed by a storage layer:

┌─────────────────────────────────────────────────────────────┐
│                    System Components                        │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌─────────────────────────────────────────────────────┐   │
│  │              Entry Points                           │   │
│  │  cli.py │ gateway/ │ acp_server.py │ cron/          │   │
│  └─────────────────────┬───────────────────────────────┘   │
│                        │                                   │
│                        ▼                                   │
│  ┌─────────────────────────────────────────────────────┐   │
│  │              Core Agent                             │   │
│  │  run_agent.py │ agent/ │ model_tools.py             │   │
│  └─────────────────────┬───────────────────────────────┘   │
│                        │                                   │
│                        ▼                                   │
│  ┌─────────────────────────────────────────────────────┐   │
│  │              Tools & Skills                         │   │
│  │  tools/ │ skills/ │ toolsets.py                      │   │
│  └─────────────────────┬───────────────────────────────┘   │
│                        │                                   │
│                        ▼                                   │
│  ┌─────────────────────────────────────────────────────┐   │
│  │              Storage                                │   │
│  │  hermes_state.py │ memories/ │ sessions/             │   │
│  └─────────────────────────────────────────────────────┘   │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Core Components

1. AIAgent (run_agent.py)

The AIAgent class is the heart of Hermes:

class AIAgent:
    def __init__(self, config, tools, memory):
        self.config = config
        self.tools = tools
        self.memory = memory
        self.messages = []
        self.iterations = 0
        # Turn budget (agent.max_turns in config.yaml); assumes a dict-like config
        self.max_turns = config["agent"]["max_turns"]

    def run_conversation(self, user_input):
        """Main conversation loop"""
        # 1. Build system prompt
        system_prompt = self.build_system_prompt()

        # 2. Add user message
        self.messages.append({"role": "user", "content": user_input})

        # 3. Loop until the model replies with text or the turn budget runs out
        while self.iterations < self.max_turns:
            self.iterations += 1

            # Call LLM
            response = self.call_llm(system_prompt, self.messages)

            # Process response
            if response.tool_calls:
                # Execute tools and feed each result back to the model
                for tool_call in response.tool_calls:
                    result = self.execute_tool(tool_call)
                    self.messages.append({
                        "role": "tool",
                        "content": result
                    })
            else:
                # Return text response
                return response.content

        # Turn budget exhausted without a final text response
        return None
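
To see the control flow without a real model, here is a minimal sketch that drives the same tool-call/text-reply cycle with a scripted stand-in. FakeResponse, scripted_llm, and run_loop are hypothetical names used for illustration only; they are not part of Hermes.

```python
from dataclasses import dataclass, field


@dataclass
class FakeResponse:
    """Hypothetical stand-in for an LLM response object."""
    content: str = ""
    tool_calls: list = field(default_factory=list)


def run_loop(call_llm, execute_tool, user_input, max_turns=5):
    """Alternate between model calls and tool execution until a text reply."""
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_turns):
        response = call_llm(messages)
        if not response.tool_calls:
            return response.content, messages
        for tool_call in response.tool_calls:
            messages.append({"role": "tool", "content": execute_tool(tool_call)})
    return None, messages  # turn budget exhausted without a text reply


def scripted_llm(messages):
    """Request one tool call first, then answer once a tool result exists."""
    if any(m["role"] == "tool" for m in messages):
        return FakeResponse(content="Done: " + messages[-1]["content"])
    return FakeResponse(tool_calls=[{"name": "echo", "arguments": {"text": "hi"}}])


reply, history = run_loop(
    scripted_llm,
    lambda call: call["arguments"]["text"],  # trivial "echo" tool
    "say hi",
)
print(reply)
```

The first model call asks for a tool, the tool result is appended to the history, and the second call returns text, which ends the loop.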

2. Prompt Builder (agent/prompt.py)

The prompt builder constructs the system prompt:

def build_system_prompt(self):
    """Build the complete system prompt"""
    sections = []

    # 1. Base persona
    sections.append(self.load_persona())

    # 2. Memory injection
    sections.append(self.inject_memory())

    # 3. Skills context
    sections.append(self.load_skills())

    # 4. Tool schemas
    sections.append(self.get_tool_schemas())

    # 5. Context files (AGENTS.md, etc.)
    sections.append(self.load_context_files())

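    # Skip empty sections (e.g. no stored memory) to avoid stray blank blocks
    return "\n\n".join(section for section in sections if section)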
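
As a small illustration of the assembly step, the sketch below joins titled sections and drops any that are empty. The function name assemble_prompt, the "## Title" layout, and the sample persona text are assumptions made for this example, not the actual Hermes prompt format.

```python
def assemble_prompt(sections):
    """Join (title, text) sections under headings, skipping empty ones."""
    parts = []
    for title, text in sections:
        text = (text or "").strip()
        if text:
            parts.append(f"## {title}\n{text}")
    return "\n\n".join(parts)


prompt = assemble_prompt([
    ("Persona", "You are Hermes, a helpful agent."),  # placeholder persona
    ("Memory", ""),                                   # nothing stored, so omitted
    ("Skills", "- git-bisect: Find a bad commit"),
])
print(prompt)
```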

3. Tool Dispatch (model_tools.py)

Tools are dispatched through a central registry:

# tools/registry.py
class ToolRegistry:
    def __init__(self):
        self.tools = {}

    def register(self, name, schema, handler, toolset=None, check_fn=None):
        """Register a tool"""
        self.tools[name] = {
            "toolset": toolset,
            "schema": schema,
            "handler": handler,
            "check_fn": check_fn
        }

    def dispatch(self, name, arguments, **kwargs):
        """Execute a tool"""
        tool = self.tools.get(name)
        if tool is None:
            return {"error": f"Unknown tool: {name}"}

        # Check requirements
        if tool["check_fn"] and not tool["check_fn"]():
            return {"error": "Requirements not met"}

        # Execute handler
        return tool["handler"](arguments, **kwargs)

# Shared instance that tool modules import to register themselves
registry = ToolRegistry()
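
The register/dispatch flow can be exercised end to end with a trimmed-down registry. MiniRegistry and the two sample tools are hypothetical; the sketch only shows the three outcomes a caller may see (success, failed requirement check, unknown tool).

```python
import json


class MiniRegistry:
    """Hypothetical, reduced registry mirroring the register/dispatch flow."""

    def __init__(self):
        self.tools = {}

    def register(self, name, handler, check_fn=None):
        self.tools[name] = {"handler": handler, "check_fn": check_fn}

    def dispatch(self, name, arguments):
        tool = self.tools.get(name)
        if tool is None:
            return json.dumps({"error": f"Unknown tool: {name}"})
        # A failing check_fn means the tool's requirements are not met.
        if tool["check_fn"] and not tool["check_fn"]():
            return json.dumps({"error": "Requirements not met"})
        return tool["handler"](arguments)


registry = MiniRegistry()
registry.register("add", lambda args: json.dumps({"sum": args["a"] + args["b"]}))
registry.register("gpu_only", lambda args: "{}", check_fn=lambda: False)

ok = json.loads(registry.dispatch("add", {"a": 2, "b": 3}))
blocked = json.loads(registry.dispatch("gpu_only", {}))
missing = json.loads(registry.dispatch("nope", {}))
```

Returning errors as data rather than raising keeps a failed tool call inside the conversation, so the model can read the error and try something else.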

4. Context Compression (agent/compression.py)

When conversations get long, Hermes compresses context:

def compress_context(self, messages, target_ratio=0.2):
    """Compress conversation context"""

    # 1. Calculate current size
    current_size = self.calculate_token_count(messages)
    target_size = int(current_size * target_ratio)

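    # 2. Protect recent messages (use an explicit split index: a plain [-n:]
    #    slice would protect every message when protect_last_n is 0)
    split = max(len(messages) - self.protect_last_n, 0)
    protected = messages[split:]
    compressible = messages[:split]
    if not compressible:
        return messages  # nothing old enough to summarize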

    # 3. Summarize older messages
    summary = self.summarize_messages(compressible)

    # 4. Reconstruct messages
    compressed = [
        {"role": "system", "content": f"Previous context summary: {summary}"},
        *protected
    ]

    return compressed
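
A worked example makes the effect concrete. This sketch uses a crude token estimate of about four characters per token and a placeholder summarizer; both are assumptions, as are the names estimate_tokens and compress.

```python
def estimate_tokens(messages):
    """Rough estimate: about four characters per token (an assumption)."""
    return sum(len(m["content"]) for m in messages) // 4


def compress(messages, protect_last_n=2, summarize=None):
    """Replace all but the last protect_last_n messages with one summary."""
    split = max(len(messages) - protect_last_n, 0)
    older, recent = messages[:split], messages[split:]
    if not older:
        return list(messages)  # nothing old enough to compress
    # Placeholder summarizer; a real one would call a model.
    summarize = summarize or (lambda msgs: f"{len(msgs)} earlier messages omitted")
    summary = {
        "role": "system",
        "content": f"Previous context summary: {summarize(older)}",
    }
    return [summary, *recent]


history = [{"role": "user", "content": "x" * 400} for _ in range(10)]
compressed = compress(history, protect_last_n=2)
before, after = estimate_tokens(history), estimate_tokens(compressed)
print(before, after, len(compressed))
```

Ten 400-character messages collapse to one summary plus the two protected messages, so most of the token budget is recovered while the latest exchange stays verbatim.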

Tool System Architecture

Tool Registration

Each tool is defined in the tools/ directory:

# tools/terminal.py
import json
import subprocess
from tools.registry import registry

def terminal_handler(arguments, **kwargs):
    """Execute shell command"""
    command = arguments.get("command", "")

    # Execute command
    result = subprocess.run(
        command,
        shell=True,
        capture_output=True,
        text=True
    )

    return json.dumps({
        "output": result.stdout,
        "error": result.stderr,
        "exit_code": result.returncode
    })

registry.register(
    name="terminal",
    toolset="terminal",
    schema={
        "name": "terminal",
        "description": "Execute shell commands",
        "parameters": {
            "type": "object",
            "properties": {
                "command": {
                    "type": "string",
                    "description": "Command to execute"
                }
            },
            "required": ["command"]
        }
    },
    handler=terminal_handler,
    check_fn=lambda: True  # Always available
)
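
The handler above has no time limit. As a sketch of how a limit such as the terminal.timeout setting might be enforced, the example below wraps subprocess.run with its timeout argument and returns a structured error instead of raising. The function name run_with_timeout and the error payload shape are assumptions.

```python
import json
import subprocess
import sys


def run_with_timeout(argv, timeout=180):
    """Run a command; report its output, or a structured error on timeout."""
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return json.dumps(
            {"output": "", "error": f"Timed out after {timeout}s", "exit_code": None}
        )
    return json.dumps({
        "output": result.stdout,
        "error": result.stderr,
        "exit_code": result.returncode,
    })


# Use the current interpreter so the example does not depend on a shell.
fast = json.loads(
    run_with_timeout([sys.executable, "-c", "print('hello')"], timeout=30)
)
slow = json.loads(
    run_with_timeout([sys.executable, "-c", "import time; time.sleep(5)"], timeout=0.5)
)
```

Passing an argument list instead of a shell string also avoids shell interpretation of the command, which matters when the command text comes from a model.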

Toolset Organization

Tools are grouped into toolsets:

# toolsets.py
_HERMES_CORE_TOOLS = [
    "terminal",
    "file",
    "web",
    "browser",
    "vision",
    "memory",
    "skills",
    "delegation",
    "cronjob",
    # ...
]

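# Per-platform overrides; platforms without an entry fall back to the core tools.
# (Entries omitted here.)
_PLATFORM_TOOLSETS = {}

def get_toolsets(platform="cli"):
    """Get available toolsets for platform"""
    return _PLATFORM_TOOLSETS.get(platform, _HERMES_CORE_TOOLS)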

Memory System

Memory Storage

Memory is stored in ~/.hermes/memories/:

~/.hermes/memories/
├── user.json        # User profile
├── memory.json      # Environment facts
└── sessions/        # Session transcripts

Memory Injection

Memory is injected into every conversation:

def inject_memory(self):
    """Inject memory into system prompt"""
    sections = []

    # User profile
    user = self.load_user_profile()
    if user:
        sections.append(f"## User Profile\n{user}")

    # Memory facts
    memory = self.load_memory()
    if memory:
        sections.append(f"## Memory\n{memory}")

    return "\n\n".join(sections)
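
The loaders used above are not shown. A minimal sketch, assuming user.json and memory.json hold plain JSON objects (the actual file schema is not documented here), could read and render them like this; load_json and render_memory are hypothetical names.

```python
import json
import tempfile
from pathlib import Path


def load_json(path):
    """Return parsed JSON, or None when the file is missing or blank."""
    if not path.exists():
        return None
    text = path.read_text().strip()
    return json.loads(text) if text else None


def render_memory(memories_dir):
    """Render each available memory file as a titled prompt section."""
    sections = []
    for title, filename in (("User Profile", "user.json"), ("Memory", "memory.json")):
        data = load_json(memories_dir / filename)
        if data:
            sections.append(f"## {title}\n{json.dumps(data, indent=2)}")
    return "\n\n".join(sections)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "user.json").write_text(json.dumps({"name": "Ada"}))
    # memory.json is absent, so only the profile section is rendered.
    rendered = render_memory(root)
print(rendered)
```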

Skills System

Skill Loading

Skills are loaded from ~/.hermes/skills/:

def load_skills(self, skill_names=None):
    """Load skills into context"""
    skills = []

    for skill_dir in self.skills_dir.iterdir():
        skill_file = skill_dir / "SKILL.md"
        if skill_file.exists():
            # Parse frontmatter
            content = skill_file.read_text()
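            frontmatter, body = parse_frontmatter(content)

            # Honor the optional filter; None means load every skill
            if skill_names is not None and frontmatter["name"] not in skill_names:
                continue
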
            skills.append({
                "name": frontmatter["name"],
                "description": frontmatter["description"],
                "content": body
            })

    return skills
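
The loader relies on parse_frontmatter, which is not shown. A minimal stdlib-only sketch, assuming the frontmatter contains only flat "key: value" lines as in the SKILL.md layout, might look like this (nested YAML would need a real YAML parser):

```python
def parse_frontmatter(content):
    """Split a SKILL.md-style document into (frontmatter dict, body).

    Handles only flat `key: value` lines between two `---` delimiters.
    """
    lines = content.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, content
    try:
        end = next(
            i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"
        )
    except StopIteration:
        return {}, content  # no closing delimiter: treat everything as body
    frontmatter = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            frontmatter[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return frontmatter, body


doc = "---\nname: git-bisect\ndescription: Find a bad commit\n---\n\nRun git bisect start.\n"
meta, body = parse_frontmatter(doc)
```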

Skill Creation

Skills are created when Hermes learns something new:

def create_skill(self, name, description, content):
    """Create a new skill"""
    skill_dir = self.skills_dir / name
    skill_dir.mkdir(parents=True, exist_ok=True)

    skill_file = skill_dir / "SKILL.md"
    skill_file.write_text(f"""---
name: {name}
description: {description}
---

{content}
""")

Gateway Architecture

Platform Adapters

Each platform has an adapter:

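# gateway/platforms/telegram.py
import telegram  # python-telegram-bot

class TelegramAdapter:
    def __init__(self, config, agent):
        self.bot = telegram.Bot(token=config["token"])
        self.agent = agent  # shared agent instance used by handle_message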

    async def handle_message(self, update, context):
        """Handle incoming message"""
        user_id = update.effective_user.id
        message = update.message.text

        # Process through agent
        response = await self.agent.process(message)

        # Send response
        await self.bot.send_message(
            chat_id=update.effective_chat.id,
            text=response
        )

Message Routing

Messages are routed through the gateway:

class Gateway:
    def __init__(self, agent):
        self.adapters = {}
        # AIAgent needs config, tools, and memory, so the caller constructs it
        self.agent = agent

    def register_adapter(self, name, adapter):
        """Register a platform adapter"""
        self.adapters[name] = adapter

    async def route_message(self, platform, message):
        """Route message to agent"""
        response = await self.agent.process(message)

        # Send back through platform
        await self.adapters[platform].send_response(response)
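
The routing path can be tried without any platform SDK. In this sketch, EchoAgent and RecordingAdapter are hypothetical stubs, and MiniGateway is a reduced copy of the class above with the agent passed in. Two platforms are routed concurrently with asyncio.gather.

```python
import asyncio


class EchoAgent:
    """Hypothetical agent stub; a real agent would call the model here."""

    async def process(self, message):
        return f"echo: {message}"


class RecordingAdapter:
    """Hypothetical adapter that stores replies instead of sending them."""

    def __init__(self):
        self.sent = []

    async def send_response(self, response):
        self.sent.append(response)


class MiniGateway:
    def __init__(self, agent):
        self.agent = agent
        self.adapters = {}

    def register_adapter(self, name, adapter):
        self.adapters[name] = adapter

    async def route_message(self, platform, message):
        if platform not in self.adapters:
            raise KeyError(f"No adapter registered for {platform!r}")
        response = await self.agent.process(message)
        await self.adapters[platform].send_response(response)


async def demo():
    gateway = MiniGateway(EchoAgent())
    telegram, discord = RecordingAdapter(), RecordingAdapter()
    gateway.register_adapter("telegram", telegram)
    gateway.register_adapter("discord", discord)
    # Route messages from two platforms concurrently.
    await asyncio.gather(
        gateway.route_message("telegram", "hi"),
        gateway.route_message("discord", "yo"),
    )
    return telegram.sent, discord.sent


telegram_sent, discord_sent = asyncio.run(demo())
```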

Credential Pool

Key Rotation

Multiple API keys rotate automatically:

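class CredentialPool:
    def __init__(self, provider, keys):
        self.provider = provider
        self.keys = list(keys)
        self.current_index = 0
        self.exhausted = set()

    def get_next_key(self):
        """Get next available key, skipping rate-limited ones"""
        for _ in range(len(self.keys)):
            key = self.keys[self.current_index]
            self.current_index = (self.current_index + 1) % len(self.keys)
            if key not in self.exhausted:
                return key
        raise RuntimeError(f"All {self.provider} keys are exhausted")

    def mark_exhausted(self, key):
        """Mark a key as rate-limited so rotation skips it"""
        self.exhausted.add(key)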
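
Rate limits usually lift after a while, so one possible refinement is to let exhausted keys recover after a cooldown. The sketch below is an assumption about how that could work, not the Hermes implementation; CooldownPool and its clock parameter (injected so the example runs without sleeping) are hypothetical.

```python
import time


class CooldownPool:
    """Hypothetical round-robin pool whose rate-limited keys recover later."""

    def __init__(self, keys, cooldown_seconds=60.0, clock=time.monotonic):
        self.keys = list(keys)
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock
        self.index = 0
        self.exhausted_at = {}  # key -> time it was marked rate-limited

    def mark_exhausted(self, key):
        self.exhausted_at[key] = self.clock()

    def _available(self, key):
        marked = self.exhausted_at.get(key)
        return marked is None or self.clock() - marked >= self.cooldown_seconds

    def get_next_key(self):
        # Try each key at most once per call, starting at the rotation index.
        for _ in range(len(self.keys)):
            key = self.keys[self.index]
            self.index = (self.index + 1) % len(self.keys)
            if self._available(key):
                self.exhausted_at.pop(key, None)
                return key
        raise RuntimeError("all keys are rate-limited")


now = [0.0]  # fake clock so the cooldown can be advanced manually
pool = CooldownPool(["key-a", "key-b"], cooldown_seconds=60, clock=lambda: now[0])
first = pool.get_next_key()
pool.mark_exhausted("key-a")
second = pool.get_next_key()
third = pool.get_next_key()   # key-a is cooling down, so key-b is reused
now[0] = 61.0
fourth = pool.get_next_key()  # cooldown elapsed, key-a is back in rotation
```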

Configuration System

Config Structure

Configuration is stored in ~/.hermes/config.yaml:

model:
  default: claude-sonnet-4
  provider: anthropic
  api_key: ${ANTHROPIC_API_KEY}

agent:
  max_turns: 90
  tool_use_enforcement: auto

terminal:
  backend: local
  timeout: 180

compression:
  enabled: true
  threshold: 0.5
  target_ratio: 0.2

memory:
  memory_enabled: true
  user_profile_enabled: true
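
The api_key value above references an environment variable with ${...} syntax. How Hermes resolves such references is not described here; the following is one plausible sketch that expands them after the YAML has been parsed into a dict. The name expand_env and the choice to leave unset variables untouched are assumptions.

```python
import os
import re

_ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env=None):
    """Recursively replace ${NAME} references in strings, dicts, and lists."""
    env = os.environ if env is None else env
    if isinstance(value, str):
        # Unset variables are left as-is so the gap stays visible downstream.
        return _ENV_REF.sub(lambda m: env.get(m.group(1), m.group(0)), value)
    if isinstance(value, dict):
        return {k: expand_env(v, env) for k, v in value.items()}
    if isinstance(value, list):
        return [expand_env(v, env) for v in value]
    return value  # numbers, booleans, None


config = {
    "model": {"default": "claude-sonnet-4", "api_key": "${ANTHROPIC_API_KEY}"},
    "agent": {"max_turns": 90},
}
resolved = expand_env(config, env={"ANTHROPIC_API_KEY": "example-key"})
```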

Environment Variables

Secrets are stored in ~/.hermes/.env:

ANTHROPIC_API_KEY=sk-ant-...
OPENROUTER_API_KEY=sk-or-...
TELEGRAM_TOKEN=123456:ABC...
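
For illustration, a file in this KEY=VALUE form can be read with a few lines of standard-library Python. This is a simplified sketch under the assumption that values are single-line; parse_env is a hypothetical helper, and it does not handle escapes or multi-line values.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not an assignment; ignore
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key.strip()] = value
    return values


sample = (
    "# provider keys\n"
    "ANTHROPIC_API_KEY=example-key\n"
    "TELEGRAM_TOKEN='123:abc'\n"
    "\n"
    "not an assignment\n"
)
env = parse_env(sample)
```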

Data Flow

Request Processing

User Input
         │
         ▼
┌──────────────────┐
│ Platform Adapter │ (Telegram, Discord, etc.)
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│   Gateway        │ (Message routing)
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│   AIAgent        │ (Core agent loop)
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Prompt Builder   │ (System prompt construction)
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│   LLM API        │ (Model inference)
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Tool Dispatch    │ (Execute tools)
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Response Gen     │ (Generate response)
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Platform Adapter │ (Send response)
└──────────────────┘

Performance Considerations

Token Optimization

  • Context compression targets about an 80% reduction in context tokens (target_ratio: 0.2)
  • Skill loading is lazy (only when needed)
  • Tool schemas are cached after first load

Memory Management

  • Session transcripts are stored in SQLite
  • Memory facts are compact JSON
  • Skills are loaded on-demand

Concurrency

  • Gateway handles multiple platforms concurrently
  • Delegation spawns subagents for parallel work
  • Cron jobs run in separate processes

Security Model

Command Approval

import re

def should_approve(command):
    """Return True if the command matches a dangerous pattern and needs user approval"""
    dangerous_patterns = [
        r'rm\s+-rf',
        r'git\s+reset\s+--hard',
        r'dd\s+if=',
        r'mkfs',
    ]

    for pattern in dangerous_patterns:
        if re.search(pattern, command):
            return True

    return False
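
A quick check of the same patterns against a few commands shows both what is caught and what slips through. The names DANGEROUS_PATTERNS and needs_approval are used here for illustration only.

```python
import re

DANGEROUS_PATTERNS = [
    r"rm\s+-rf",
    r"git\s+reset\s+--hard",
    r"dd\s+if=",
    r"mkfs",
]


def needs_approval(command):
    """True when any dangerous pattern occurs anywhere in the command."""
    return any(re.search(pattern, command) for pattern in DANGEROUS_PATTERNS)


checks = {
    command: needs_approval(command)
    for command in [
        "ls -la",
        "rm -rf build/",
        "git  reset   --hard HEAD~1",  # extra whitespace still matches \s+
        "rm -fr build/",               # same effect as -rf, but flag order differs
    ]
}
for command, flagged in checks.items():
    print(f"{flagged!s:5}  {command}")
```

The last command is not flagged even though it is equivalent to rm -rf, which illustrates that pattern lists like this are a safety net rather than a complete defense.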

Secret Redaction

import re

def redact_secrets(text):
    """Remove secrets from text"""
    patterns = [
        r'sk-[a-zA-Z0-9_-]{20,}',  # API keys, including hyphenated prefixes such as sk-ant-
        r'ghp_[a-zA-Z0-9]{36}',    # GitHub tokens
        r'AKIA[0-9A-Z]{16}',       # AWS keys
    ]

    for pattern in patterns:
        text = re.sub(pattern, '[REDACTED]', text)

    return text
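
The example below applies the same substitution approach to a log line built from synthetic tokens. The pattern list is an assumption for illustration: the sk- pattern here is broadened to allow hyphens and underscores so that prefixed keys such as sk-ant-... are covered.

```python
import re

SECRET_PATTERNS = [
    r"sk-[A-Za-z0-9_-]{20,}",  # provider API keys, including hyphenated prefixes
    r"ghp_[A-Za-z0-9]{36}",    # GitHub personal access tokens
    r"AKIA[0-9A-Z]{16}",       # AWS access key IDs
]


def redact(text):
    """Replace every match of every secret pattern with a fixed marker."""
    for pattern in SECRET_PATTERNS:
        text = re.sub(pattern, "[REDACTED]", text)
    return text


fake_key = "sk-ant-" + "a" * 40   # synthetic, not a real credential
fake_token = "ghp_" + "b" * 36    # synthetic, not a real credential
log_line = f"auth with {fake_key} then push using {fake_token}"
clean = redact(log_line)
print(clean)
```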

Next Steps