Deploying an Ambient AI Email Assistant

A Practical Guide for Teams Considering Edge and Cloud Deployment


Overview

This guide covers deployment options for ambient AI agents, specifically focusing on the LangChain/LangGraph email assistant architecture. We’ll address the common question: “Can I deploy this on Cloudflare Workers?”

Short answer: Not directly. LangGraph requires stateful infrastructure that Cloudflare Workers doesn’t provide. However, there are hybrid approaches and alternatives worth considering.


Understanding the Architecture

What Makes Ambient Agents Different

Ambient agents aren’t simple request-response APIs. They require:

  1. Persistent State – Agents maintain conversation history, memory, and workflow state
  2. Scheduled Execution – Cron jobs to check for new emails (e.g., every 10 minutes)
  3. Long-Running Processes – Complex workflows that may take minutes to complete
  4. Human-in-the-Loop Checkpoints – Ability to pause and wait for human input
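The pause/resume requirement in point 4 is the key architectural constraint. A minimal, framework-free sketch illustrates the pattern: run until human input is needed, persist a checkpoint, resume later. (This is a toy illustration in plain Python, not LangGraph's actual checkpointer API.)

```python
# Toy sketch of a pausable workflow: the agent advances until it needs
# human review, serializes its state, and later resumes from the saved
# checkpoint. LangGraph implements this with checkpointers/interrupts;
# the names below are illustrative only.
import json

def run_until_interrupt(state: dict) -> dict:
    """Advance the workflow; stop when human review is required."""
    if state["step"] == "triage":
        state["classification"] = "respond"
        state["step"] = "draft"
    if state["step"] == "draft":
        state["draft"] = "Thanks for reaching out - how about Tuesday?"
        state["step"] = "awaiting_review"  # pause point: human must approve
    return state

def resume_with_feedback(checkpoint: str, approved: bool) -> dict:
    """Reload the saved checkpoint and continue after human input."""
    state = json.loads(checkpoint)
    state["step"] = "send" if approved else "draft"
    return state

state = run_until_interrupt({"step": "triage"})
checkpoint = json.dumps(state)  # persisted between invocations
resumed = resume_with_feedback(checkpoint, approved=True)
```

The point of the sketch: between `run_until_interrupt` and `resume_with_feedback` the process can exit entirely, which is why stateless platforms struggle with this workload.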

The LangGraph Email Assistant Components

Based on LangChain’s Executive AI Assistant (EAIA):

  • Email Triage – Classifies emails: ignore, notify, or respond
  • Draft Response – Generates email drafts for review
  • Calendar Integration – Checks availability, schedules meetings
  • Memory Store – Learns from user feedback over time
  • Agent Inbox UI – Human-in-the-loop review interface

Human-in-the-Loop Flow


Deployment Options

Deployment Options Comparison

Option 1: LangGraph Platform (Recommended)

The simplest path to production. LangChain provides managed infrastructure specifically designed for ambient agents.

Setup Steps:

  1. Fork the EAIA repository
  2. Create a LangSmith Plus account
  3. Deploy via LangSmith dashboard
  4. Configure environment variables (API keys)
  5. Set up the cron job for email polling
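Step 5 is typically done through the LangGraph Platform's cron scheduling rather than a system crontab. The helper below only constructs the request body; the field names are assumptions modeled on the platform's cron API, so verify them against your deployment's documentation:

```python
# Build the request body for scheduling the email-polling cron run.
# Field names are illustrative assumptions - check them against the
# LangGraph Platform cron API for your deployment.
def build_cron_payload(assistant_id: str, every_minutes: int,
                       minutes_since: int) -> dict:
    return {
        "assistant_id": assistant_id,
        # Standard five-field cron syntax: run every `every_minutes` minutes.
        "schedule": f"*/{every_minutes} * * * *",
        "input": {"minutes_since": minutes_since},
    }

payload = build_cron_payload("main", every_minutes=10, minutes_since=15)
```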

Advantages:

  • Built-in persistence and checkpointing
  • Native cron job support
  • Integrated with Agent Inbox UI
  • Handles scaling automatically

Cost: Requires LangSmith Plus subscription


Option 2: Self-Hosted (Traditional Server)

For teams wanting full control or with specific compliance requirements.

Infrastructure Required:

  • Compute: Any cloud VM (AWS EC2, GCP Compute, Azure VM, DigitalOcean Droplet)
  • Database: PostgreSQL for state persistence
  • Redis: For caching and task queuing (optional)
  • Scheduler: systemd timer, cron, or Celery beat

Basic Setup:

# Clone the repository
git clone https://github.com/langchain-ai/executive-ai-assistant
cd executive-ai-assistant

# Create virtual environment
python -m venv venv
source venv/bin/activate
pip install -e .

# Configure
cp eaia/main/config.yaml.example eaia/main/config.yaml
# Edit config with your details

# Set up Google OAuth
python scripts/setup_gmail.py

# Run the development server
pip install -U "langgraph-cli[inmem]"
langgraph dev

# Set up cron for production
crontab -e
# Add: */10 * * * * /path/to/venv/bin/python /path/to/scripts/run_ingest.py --minutes-since 15
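Note that `--minutes-since 15` deliberately looks back further than the 10-minute cron interval, so an email arriving while a run is delayed or skipped is still picked up on the next pass. The cutoff computation amounts to (hypothetical helper, not the script's actual code):

```python
# Each ingest run fetches emails newer than `now - minutes_since`.
# With a 10-minute cron and a 15-minute window, consecutive windows
# overlap by 5 minutes, trading duplicate fetches for zero gaps.
from datetime import datetime, timedelta, timezone

def ingest_cutoff(now: datetime, minutes_since: int) -> datetime:
    """Earliest email timestamp the ingest run should fetch."""
    return now - timedelta(minutes=minutes_since)

now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
cutoff = ingest_cutoff(now, 15)
```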

Advantages:

  • Full control over infrastructure
  • No vendor lock-in
  • Can be deployed in private networks

Challenges:

  • Must manage uptime, scaling, backups
  • More operational overhead

Option 3: Hybrid Architecture with Cloudflare Workers

Why Direct LangGraph on Workers Doesn’t Work:

Cloudflare Workers limitations:

  • 128 MB memory limit
  • Short execution windows (CPU time limits)
  • No persistent state between invocations
  • V8 isolate restrictions (JavaScript/TypeScript/WASM only)

LangGraph requirements:

  • Persistent state for conversation history
  • Long-running workflows (can take minutes)
  • Complex memory and checkpointing

Hybrid Approach:

Use Cloudflare Workers for lightweight, edge-native tasks while keeping the core agent on traditional infrastructure.

Hybrid Architecture Diagram

Cloudflare Worker Code Example (Webhook Handler):

// wrangler.toml
// name = "email-agent-gateway"
// main = "src/index.ts"

export interface Env {
  LANGGRAPH_API_URL: string;
  LANGGRAPH_API_KEY: string;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Validate incoming webhook
    if (request.method !== 'POST') {
      return new Response('Method not allowed', { status: 405 });
    }

    // Rate limiting (using Workers KV or Durable Objects)
    // ... rate limit logic here ...

    // Forward to LangGraph backend
    const payload = await request.json() as { thread_id: string };

    const backendResponse = await fetch(`${env.LANGGRAPH_API_URL}/runs`, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': `Bearer ${env.LANGGRAPH_API_KEY}`,
      },
      body: JSON.stringify({
        assistant_id: 'main',
        input: payload,
        config: { configurable: { thread_id: payload.thread_id } },
      }),
    });

    // Surface backend failures instead of silently reporting success
    if (!backendResponse.ok) {
      return new Response('Upstream error', { status: 502 });
    }

    return new Response(JSON.stringify({ status: 'queued' }), {
      headers: { 'Content-Type': 'application/json' },
    });
  },
};

When to Use This Pattern:

  • You need global edge presence for webhooks
  • DDoS protection and rate limiting at the edge
  • Low-latency request validation before hitting your backend
  • Geographic distribution for initial request handling
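The rate-limiting step stubbed out in the Worker above is usually a token bucket keyed by client IP, with its counters kept in Workers KV or a Durable Object. The algorithm itself is small; it is sketched here in Python for clarity, and the Worker version would be the same logic in TypeScript:

```python
# Token-bucket rate limiter: each client holds up to `capacity` tokens
# that refill at `rate` tokens/second; a request is allowed only if a
# token is available. Per-client buckets map naturally onto Durable
# Objects or KV entries.
from dataclasses import dataclass

@dataclass
class Bucket:
    capacity: float
    rate: float    # tokens added per second
    tokens: float
    last: float    # timestamp of last refill

def allow(bucket: Bucket, now: float) -> bool:
    elapsed = now - bucket.last
    bucket.tokens = min(bucket.capacity, bucket.tokens + elapsed * bucket.rate)
    bucket.last = now
    if bucket.tokens >= 1.0:
        bucket.tokens -= 1.0
        return True
    return False

b = Bucket(capacity=5, rate=1.0, tokens=5, last=0.0)
burst = [allow(b, 0.0) for _ in range(6)]  # sixth call in the burst is denied
```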

Option 4: Alternative Edge-Native Approaches

If you specifically need edge-native deployment, consider these alternatives to LangGraph:

1. Cloudflare Workers AI (Simple Inference)

For simple, stateless LLM inference at the edge (not full agent workflows):

// Cloudflare Worker with Workers AI
export interface Env {
  AI: Ai; // Workers AI binding configured in wrangler.toml
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { messages } = await request.json() as { messages: unknown[] };

    const response = await env.AI.run('@cf/meta/llama-3.1-8b-instruct', {
      messages,
    });

    return new Response(JSON.stringify(response));
  },
};

Limitations: No persistence, no complex workflows, no human-in-the-loop. Suitable for simple Q&A but not for ambient agent patterns.

2. Cloudflare Durable Objects + Workers AI

For simple stateful agents:

export class EmailAgentDO {
  state: DurableObjectState;

  constructor(state: DurableObjectState) {
    this.state = state;
  }

  async fetch(request: Request): Promise<Response> {
    // Persistent state within the Durable Object
    const history = (await this.state.storage.get<string[]>('history')) ?? [];
    const { message } = await request.json() as { message: string };

    // Process with AI (simplified example)
    const aiResponse = await this.processMessage(message);

    // Save state
    await this.state.storage.put('history', [...history, message, aiResponse]);

    return new Response(JSON.stringify({ response: aiResponse }));
  }

  async processMessage(message: string): Promise<string> {
    // Your AI logic here
    return `Processed: ${message}`;
  }
}

Limitations: Still constrained by Worker execution time limits, and there are no built-in human-in-the-loop patterns.


Configuration Guide

Required Environment Variables

# LLM API Keys
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...

# Google OAuth (for Gmail access)
GOOGLE_CLIENT_ID=...
GOOGLE_CLIENT_SECRET=...

# LangSmith (for observability)
LANGCHAIN_API_KEY=...
LANGCHAIN_TRACING_V2=true
LANGCHAIN_PROJECT=email-assistant

# For hybrid Cloudflare setup
LANGGRAPH_API_URL=https://your-deployment.langchain.app
LANGGRAPH_API_KEY=...
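Whichever deployment option you choose, it pays to fail fast at startup when a required variable is missing rather than at the first API call. A minimal check, using the variable names from the list above (a stub environment is passed in for illustration; in production you would pass `dict(os.environ)`):

```python
# Fail fast if required configuration is missing, instead of failing
# later on the first API call. Names match the environment variable
# list above.
REQUIRED_VARS = [
    "OPENAI_API_KEY",
    "ANTHROPIC_API_KEY",
    "GOOGLE_CLIENT_ID",
    "GOOGLE_CLIENT_SECRET",
    "LANGCHAIN_API_KEY",
]

def missing_vars(env: dict) -> list:
    """Return the required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

# Stub environment for illustration; most variables are absent.
missing = missing_vars({"OPENAI_API_KEY": "sk-test"})
```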

Email Assistant Configuration

Create eaia/main/config.yaml:

email: your-email@domain.com
full_name: Your Full Name
name: FirstName
background: |
  Senior Product Manager at TechCorp.
  Focus areas: AI products, enterprise solutions.
timezone: America/New_York

schedule_preferences: |
  - Default meeting length: 30 minutes
  - Prefer afternoon meetings
  - No meetings before 9am or after 6pm

response_preferences: |
  - Include calendar link for scheduling requests
  - CC assistant on important threads

triage_no: |
  - Marketing newsletters
  - Automated notifications
  - Out of office replies

triage_notify: |
  - Legal documents requiring signature
  - Messages from executives

triage_email: |
  - Direct questions requiring response
  - Meeting requests
  - Project updates needing acknowledgment
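The triage_* blocks above are prompt guidance for the LLM rather than literal rules, but the decision they drive is a three-way classification. A hypothetical rule-based stand-in shows the shape of the output the rest of the pipeline consumes (the real system prompts an LLM with these preferences; the keyword rules here are purely illustrative):

```python
# Toy stand-in for the LLM triage step: maps an email to one of the
# three categories the assistant acts on. Keyword rules are illustrative
# only - the real classifier is an LLM prompted with the triage_* config.
def triage(subject: str, sender: str) -> str:
    s = subject.lower()
    if "newsletter" in s or sender.endswith("noreply@example.com"):
        return "no"      # ignore entirely
    if "signature required" in s:
        return "notify"  # surface to the user, no reply drafted
    return "email"       # draft a response for review

decision = triage("Meeting request: project sync", "pm@client.com")
```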

Connecting to Agent Inbox

The Agent Inbox provides the human-in-the-loop interface for reviewing agent actions.

Setup:

  1. Go to agentinbox.ai
  2. Log in with Google
  3. Click Settings → Add Inbox
  4. Configure:
    • Assistant/Graph ID: main
    • Deployment URL: Your LangGraph deployment URL
    • Name: Production Email Assistant

Interrupt Types:

  • Notify – Flags important emails. Actions: Ignore, Mark Resolved, Respond
  • Question – Agent needs clarification. Actions: Respond, Ignore
  • ResponseEmailDraft – Draft ready for review. Actions: Edit/Accept, Respond, Ignore
  • Schedule – Meeting invite ready. Actions: Edit/Accept, Respond, Ignore
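If you build any automation around the inbox (for example, auto-resolving stale interrupts via the API), it helps to validate that an action is legal for a given interrupt type before submitting it. A small lookup based on the table above (type and action names as listed; the mapping itself is an assumption about your workflow):

```python
# Allowed human actions per interrupt type, per the table above.
ALLOWED_ACTIONS = {
    "Notify": {"Ignore", "Mark Resolved", "Respond"},
    "Question": {"Respond", "Ignore"},
    "ResponseEmailDraft": {"Edit/Accept", "Respond", "Ignore"},
    "Schedule": {"Edit/Accept", "Respond", "Ignore"},
}

def is_valid_action(interrupt_type: str, action: str) -> bool:
    """True if `action` is permitted for `interrupt_type`."""
    return action in ALLOWED_ACTIONS.get(interrupt_type, set())
```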

Production Checklist

Before Go-Live

  • [ ] OAuth tokens refreshing correctly
  • [ ] Cron job running reliably (monitor for failures)
  • [ ] Error alerting configured (Slack, email, PagerDuty)
  • [ ] Rate limits in place for API calls
  • [ ] Memory/storage limits understood and monitored
  • [ ] Human review process documented for team
  • [ ] Rollback plan documented

Monitoring

# Example: Log agent actions for monitoring
from datetime import datetime, timedelta
from langsmith import Client

client = Client()

# Get runs for your project
runs = client.list_runs(
    project_name="email-assistant",
    execution_order=1,
    start_time=datetime.now() - timedelta(hours=24)
)

# Check for errors
errors = [r for r in runs if r.error]
if errors:
    # Replace with your alerting mechanism (Slack, email, PagerDuty, etc.)
    print(f"ALERT: Found {len(errors)} agent errors in last 24h")

Summary

  • LangGraph Platform – Best for: quick start, managed. Cloudflare involvement: none (or edge gateway)
  • Self-Hosted – Best for: full control, compliance. Cloudflare involvement: optional edge gateway
  • Hybrid – Best for: global presence + AI backend. Cloudflare involvement: edge webhooks + rate limiting
  • Pure Edge (Workers AI) – Best for: simple, stateless inference. Cloudflare involvement: full (but limited AI capabilities)

Recommendation: Start with LangGraph Platform for fastest time-to-value. Consider hybrid architecture only if you need edge-specific capabilities like global webhook handling or DDoS protection.


Resources


Guide prepared by Emerge Digital