Integrate Langfuse with Hermes Agent

This notebook shows how to integrate Langfuse with Hermes Agent to trace, debug, and evaluate your agent's conversations, LLM calls, and tool usage.

What is Hermes Agent? Hermes Agent is a self-improving AI agent built by Nous Research. It features a built-in learning loop, persistent memory, autonomous skill creation, and support for any LLM provider. Hermes ships a bundled Langfuse observability plugin that traces every conversation turn, LLM request, and tool call.

What is Langfuse? Langfuse is an open-source LLM engineering platform that helps teams trace, debug, and evaluate their LLM applications.

The steps below follow Hermes' official Langfuse plugin docs; refer to them for the latest details.

Step 1: Install Dependencies

%pip install git+https://github.com/NousResearch/hermes-agent.git langfuse -U

Step 2: Set Up Environment Variables

Get your Langfuse API keys from the project settings in Langfuse Cloud, or from your own instance if you self-host Langfuse.

Hermes reads credentials from ~/.hermes/.env (the canonical location per the Hermes docs). Create the file with:

# ~/.hermes/.env
HERMES_LANGFUSE_PUBLIC_KEY=pk-lf-...
HERMES_LANGFUSE_SECRET_KEY=sk-lf-...
HERMES_LANGFUSE_BASE_URL=https://cloud.langfuse.com   # or your self-hosted URL

The plugin also accepts the standard SDK env vars (LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, LANGFUSE_BASE_URL); the HERMES_LANGFUSE_* variants win when both are set.
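
The precedence rule above can be pictured with a small lookup function. This is a minimal sketch, not the plugin's actual code: the function name resolve_langfuse_setting is hypothetical, and treating an empty value as unset is an assumption.

```python
import os
from typing import Mapping, Optional

# Variable names come from the text above. The helper itself is illustrative
# and is NOT taken from the Hermes plugin source.
SETTING_NAMES = ("PUBLIC_KEY", "SECRET_KEY", "BASE_URL")


def resolve_langfuse_setting(
    name: str, env: Optional[Mapping[str, str]] = None
) -> Optional[str]:
    """Return HERMES_LANGFUSE_<name> if set, otherwise LANGFUSE_<name>."""
    env = os.environ if env is None else env
    for prefix in ("HERMES_LANGFUSE_", "LANGFUSE_"):
        value = env.get(prefix + name)
        if value:  # assumption: an empty string counts as "not set"
            return value
    return None


example_env = {
    "LANGFUSE_PUBLIC_KEY": "pk-lf-standard",
    "HERMES_LANGFUSE_PUBLIC_KEY": "pk-lf-hermes",
    "LANGFUSE_BASE_URL": "https://cloud.langfuse.com",
}
resolved = {name: resolve_langfuse_setting(name, example_env) for name in SETTING_NAMES}
print(resolved)
```

Here the public key resolves to the HERMES_LANGFUSE_* value because both variants are present, the base URL falls back to the standard variable, and the secret key stays unset.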

The cell below sets the same credentials inside this Python kernel so we can quickly verify them with the Langfuse SDK. Note: these os.environ values are scoped to the notebook process and will not be visible to a hermes chat command run in a separate terminal; use ~/.hermes/.env for that.

import os

# Get keys for your project from the project settings page: https://langfuse.com/cloud
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-..."
os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-..."
os.environ["LANGFUSE_BASE_URL"] = "https://cloud.langfuse.com"  # 🇪🇺 EU region
# Other Langfuse data regions include 🇺🇸 US: https://us.cloud.langfuse.com, 🇯🇵 Japan: https://jp.cloud.langfuse.com and ⚕️ HIPAA: https://hipaa.cloud.langfuse.com

# Reminder: for the Hermes CLI itself, place the same credentials in ~/.hermes/.env
# (as HERMES_LANGFUSE_PUBLIC_KEY / HERMES_LANGFUSE_SECRET_KEY / HERMES_LANGFUSE_BASE_URL).
# The plugin also accepts the standard LANGFUSE_* variables above.
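
Because the notebook kernel and the Hermes CLI read credentials from different places, it can help to treat ~/.hermes/.env as the single source of truth. The sketch below parses .env-style text and maps the HERMES_LANGFUSE_* keys to the standard LANGFUSE_* names. The helper names parse_env_text and to_sdk_env are hypothetical, and the parser is deliberately simplified (KEY=VALUE lines, # comments, inline comments preceded by a space); it is not a full dotenv implementation.

```python
from typing import Dict


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and '#' comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop an inline comment such as "URL   # or your self-hosted URL".
        value = value.split(" #", 1)[0].strip().strip('"').strip("'")
        values[key.strip()] = value
    return values


def to_sdk_env(values: Dict[str, str]) -> Dict[str, str]:
    """Map HERMES_LANGFUSE_* keys to the standard LANGFUSE_* names."""
    prefix = "HERMES_"
    return {
        key[len(prefix):]: value
        for key, value in values.items()
        if key.startswith("HERMES_LANGFUSE_")
    }


sample = """
# ~/.hermes/.env
HERMES_LANGFUSE_PUBLIC_KEY=pk-lf-example
HERMES_LANGFUSE_SECRET_KEY=sk-lf-example
HERMES_LANGFUSE_BASE_URL=https://cloud.langfuse.com   # or your self-hosted URL
"""
sdk_env = to_sdk_env(parse_env_text(sample))
print(sdk_env)
```

In a notebook you could read the real file with pathlib.Path.home() / ".hermes" / ".env" and pass the result to os.environ.update(...) instead of pasting keys into a cell.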

With the environment variables set, initialize the Langfuse client to confirm your credentials work. Hermes uses its own internal client, so this step is purely a sanity check that your keys are valid.

from langfuse import get_client

langfuse = get_client()

# Verify connection
if langfuse.auth_check():
    print("Langfuse client is authenticated and ready!")
else:
    print("Authentication failed. Please check your credentials and host.")

Step 3: Enable the Langfuse Plugin

Hermes ships a bundled Langfuse observability plugin under plugins/observability/langfuse. Bundled plugins are discovered automatically but are opt-in: they don't load until you explicitly enable them.

The plugin hooks into Hermes lifecycle events (pre_api_request / post_api_request, pre_tool_call / post_tool_call) to automatically capture:

  • One root span per conversation turn ("Hermes turn")
  • One generation observation per LLM API call
  • One tool observation per tool call

Session grouping uses the Hermes session ID (or task ID for sub-agents), so every turn within a hermes chat session lives under one Langfuse session. The plugin is also fail-open: a missing SDK, missing credentials, or a transient Langfuse error all turn into a silent no-op, so the agent loop is never affected.
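
To make the fail-open idea concrete, here is a minimal sketch of the general pattern: a decorator that swallows any exception raised by a tracing hook so the caller keeps running. It illustrates the concept only. The names fail_open and record_tool_call are hypothetical and are not taken from the Hermes plugin.

```python
import functools
import logging
from typing import Any, Callable, Optional, TypeVar

T = TypeVar("T")
logger = logging.getLogger("tracing_sketch")


def fail_open(func: Callable[..., T]) -> Callable[..., Optional[T]]:
    """Run a tracing hook, but swallow any exception so the caller is unaffected."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Optional[T]:
        try:
            return func(*args, **kwargs)
        except Exception:  # deliberately broad: tracing must never break the agent
            logger.debug("tracing hook %s failed; continuing", func.__name__, exc_info=True)
            return None

    return wrapper


@fail_open
def record_tool_call(tool_name: str, client: Optional[object]) -> str:
    """Hypothetical hook body that fails when no tracing client is configured."""
    if client is None:
        raise RuntimeError("tracing client not configured")
    return f"recorded {tool_name}"


print(record_tool_call("search", client=None))      # failure is swallowed
print(record_tool_call("search", client=object()))  # hook runs normally
```

The trade-off of this design is that tracing failures are invisible unless debug logging is on, which is why a verbose logging switch is useful alongside it.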

# Enable the Langfuse plugin (run this in your terminal, not in a notebook)
# hermes plugins enable observability/langfuse

# Or check the box in the interactive plugin manager:
# hermes plugins

# Or add it to ~/.hermes/config.yaml:
# plugins:
#   enabled:
#     - observability/langfuse

# Verify it is enabled:
# hermes plugins list  # observability/langfuse should show "enabled"
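
If you prefer to check the config file programmatically, the sketch below inspects an already-parsed config for the plugins.enabled list shown above. The function name is_plugin_enabled is hypothetical, and loading the YAML file is left to you (for example with yaml.safe_load, assuming PyYAML is installed).

```python
from typing import Any, Mapping

PLUGIN_NAME = "observability/langfuse"


def is_plugin_enabled(config: Mapping[str, Any], name: str = PLUGIN_NAME) -> bool:
    """Return True if `name` is listed under plugins.enabled in a parsed config."""
    plugins = config.get("plugins")
    if not isinstance(plugins, Mapping):
        return False
    enabled = plugins.get("enabled")
    return isinstance(enabled, list) and name in enabled


# Parsed equivalent of the config.yaml snippet above.
parsed_config = {"plugins": {"enabled": ["observability/langfuse"]}}
print(is_plugin_enabled(parsed_config))
print(is_plugin_enabled({"plugins": {"enabled": []}}))
```

The hermes plugins list command remains the authoritative check; this only confirms what the config file says.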

Step 4: Run Hermes and Generate a Trace

With the plugin enabled and credentials set, every Hermes conversation turn is automatically traced to Langfuse. Each trace captures:

  • Conversation turns as the root span ("Hermes turn")
  • LLM calls as generation observations with model, usage, cost, and latency
  • Tool calls as tool observations with input arguments and results
  • Token usage and cost broken down by input, output, cache, and reasoning tokens

You can start a conversation from the CLI:

# Send a one-off message (traces are sent automatically):
# hermes chat -q "hello"

# Or start a full interactive session:
# hermes chat

Optional: Tune Tracing Behavior

The Hermes Langfuse plugin supports several optional environment variables:

| Variable | Description | Default |
| --- | --- | --- |
| HERMES_LANGFUSE_ENV | Environment tag (e.g. production, staging) | none |
| HERMES_LANGFUSE_RELEASE | Release/version tag | none |
| HERMES_LANGFUSE_SAMPLE_RATE | Sampling rate, 0.0 to 1.0 | 1.0 |
| HERMES_LANGFUSE_MAX_CHARS | Max characters per traced field | 12000 |
| HERMES_LANGFUSE_DEBUG | Verbose plugin logging (true/false) | false |

Set these in ~/.hermes/.env or export them in your shell before starting Hermes.
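
The following sketch shows one way these settings could be read and applied, mainly to clarify what each variable means. The variable names and defaults come from the table above; the TracingSettings class, the helper names, and the choice to reject an out-of-range sample rate are assumptions, and the plugin's real parsing may differ.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class TracingSettings:
    environment: Optional[str]
    release: Optional[str]
    sample_rate: float
    max_chars: int
    debug: bool


def load_settings(env: Mapping[str, str]) -> TracingSettings:
    """Read the HERMES_LANGFUSE_* tuning variables, applying the documented defaults."""
    rate = float(env.get("HERMES_LANGFUSE_SAMPLE_RATE", "1.0"))
    if not 0.0 <= rate <= 1.0:
        # Assumption: this sketch rejects bad values instead of clamping them.
        raise ValueError(f"sample rate must be within 0.0 to 1.0, got {rate}")
    return TracingSettings(
        environment=env.get("HERMES_LANGFUSE_ENV"),
        release=env.get("HERMES_LANGFUSE_RELEASE"),
        sample_rate=rate,
        max_chars=int(env.get("HERMES_LANGFUSE_MAX_CHARS", "12000")),
        debug=env.get("HERMES_LANGFUSE_DEBUG", "false").strip().lower() == "true",
    )


def truncate_field(text: str, max_chars: int) -> str:
    """Cap a traced field at max_chars characters."""
    return text if len(text) <= max_chars else text[:max_chars]


settings = load_settings(
    {"HERMES_LANGFUSE_SAMPLE_RATE": "0.25", "HERMES_LANGFUSE_ENV": "staging"}
)
print(settings)
print(len(truncate_field("x" * 20000, settings.max_chars)))
```

With a sample rate of 0.25, roughly one in four traces would be kept, and any traced field longer than the character limit would be cut to that limit.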

Step 5: View Traces in Langfuse

After running a Hermes conversation, open Langfuse Cloud (or your self-hosted instance) to see the full trace, including prompts, completions, tool calls, token usage, and latency.

Further Reading

Interoperability with the Python SDK

You can use this integration together with the Langfuse SDKs to add additional attributes to the observation.

The @observe() decorator provides a convenient way to automatically wrap your instrumented code and add additional attributes to the observation.

from langfuse import observe, propagate_attributes, get_client

langfuse = get_client()

def call_llm(prompt):
    # Placeholder so the example runs; replace with your own LLM call
    return f"echo: {prompt}"

@observe()
def my_llm_pipeline(prompt):
    # Add additional attributes (user_id, session_id, metadata, version, tags) to all spans created within this execution scope
    with propagate_attributes(
        user_id="user_123",
        session_id="session_abc",
        tags=["agent", "my-observation"],
        metadata={"email": "user@langfuse.com"},
        version="1.0.0"
    ):

        # YOUR APPLICATION CODE HERE
        result = call_llm(prompt)

        return result

# Run the function
my_llm_pipeline("Hi")

Learn more about using the Decorator in the Langfuse SDK instrumentation docs.

The context manager lets you wrap your instrumented code in a Python with statement and add additional attributes to the observation.

from langfuse import get_client, propagate_attributes

langfuse = get_client()

with langfuse.start_as_current_observation(
    as_type="span",
    name="my-observation",
    trace_context={"trace_id": "abcdef1234567890abcdef1234567890"},  # Must be 32 hex chars
) as observation:

    # Add additional attributes (user_id, session_id, metadata, version, tags)
    # to all observations created within this execution scope
    with propagate_attributes(
        user_id="user_123",
        session_id="session_abc",
        metadata={"experiment": "variant_a", "env": "prod"},
        version="1.0",
    ):
        # YOUR APPLICATION CODE HERE (call_llm stands in for your own function)
        result = call_llm("some input")

# Flush events in short-lived applications
langfuse.flush()

Learn more about using the Context Manager in the Langfuse SDK instrumentation docs.
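
The trace_context example above requires a 32-character hex trace ID. The sketch below shows one way to validate such an ID and to derive a stable one from your own identifier, using only the standard library. The helper names are hypothetical. The lowercase and non-zero checks follow the W3C Trace Context format, and the seed-hashing scheme is an assumption rather than the SDK's method; if your SDK version ships its own trace ID helper, prefer that.

```python
import hashlib
import re
import secrets
from typing import Optional

_TRACE_ID_RE = re.compile(r"[0-9a-f]{32}")


def is_valid_trace_id(trace_id: str) -> bool:
    """True if trace_id is 32 lowercase hex characters and not all zeros."""
    return _TRACE_ID_RE.fullmatch(trace_id) is not None and trace_id != "0" * 32


def make_trace_id(seed: Optional[str] = None) -> str:
    """Random ID when no seed is given; otherwise a deterministic ID from the seed."""
    if seed is None:
        return secrets.token_hex(16)  # 16 random bytes -> 32 hex characters
    return hashlib.sha256(seed.encode("utf-8")).hexdigest()[:32]


print(is_valid_trace_id("abcdef1234567890abcdef1234567890"))
print(make_trace_id("order-42") == make_trace_id("order-42"))
```

A deterministic ID is handy when you want to attach later observations or scores to a trace using only an identifier from your own system.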

Troubleshooting

No observations appearing

First, enable debug mode in the Python SDK:

export LANGFUSE_DEBUG="True"

Then run your application and check the debug logs:

  • OTel observations appear in the logs: Your application is instrumented correctly but observations are not reaching Langfuse. To resolve this:
    1. Call langfuse.flush() at the end of your application to ensure all observations are exported.
    2. Verify that you are using the correct API keys and base URL.
  • No OTel spans in the logs: Your application is not instrumented correctly. Make sure the instrumentation runs before your application code.
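
For the flush step above, one option is to wrap your run in a small context manager so flush() is called even when the application raises. This is a minimal sketch that assumes only a client object with a flush() method, such as the one returned by get_client(). The names flushed and FakeClient are hypothetical; the fake client exists so the example runs without credentials.

```python
from contextlib import contextmanager
from typing import Any, Iterator


@contextmanager
def flushed(client: Any) -> Iterator[Any]:
    """Yield the client and always call client.flush() on exit, even after an error."""
    try:
        yield client
    finally:
        client.flush()


class FakeClient:
    """Stand-in for a tracing client so the sketch runs without credentials."""

    def __init__(self) -> None:
        self.flush_calls = 0

    def flush(self) -> None:
        self.flush_calls += 1


fake = FakeClient()
try:
    with flushed(fake):
        raise RuntimeError("application crashed mid-run")
except RuntimeError:
    pass
print(fake.flush_calls)
```

In a real script you would write with flushed(get_client()) as langfuse: and put your application code inside the block.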

Unwanted observations in Langfuse

The Langfuse SDK is based on OpenTelemetry. Other libraries in your application may emit OTel spans that are not relevant to you. These still count toward your billable units, so you should filter them out. See Unwanted spans in Langfuse for details.

Missing attributes

Some attributes may be stored in the metadata object of the observation rather than being mapped to the Langfuse data model. If a mapping or integration does not work as expected, please raise an issue on GitHub.

Next Steps

Once you have instrumented your code, you can manage, evaluate and debug your application in Langfuse.

