Haymaker M365 Workloads

M365 Knowledge Worker workloads for Agent Haymaker platform

View the Project on GitHub rysweet/haymaker-m365-workloads

Architecture

Technical architecture of the M365 Knowledge Worker workload.

Module Structure

haymaker_m365_workloads/
├── __init__.py                  Package entry; exports M365KnowledgeWorkerWorkload
├── workload.py                  Main workload class (deploy, stop, cleanup, status)
├── identity/
│   ├── __init__.py
│   └── user_manager.py          EntraUserManager -- creates/deletes Entra ID users
├── operations/
│   ├── __init__.py
│   └── orchestrator.py          ActivityOrchestrator -- runs the activity loop
├── content/
│   ├── __init__.py
│   ├── email_generator.py       EmailGenerator -- LLM + template fallback
│   └── prompts.py               Department-aware prompt templates
└── models/
    ├── __init__.py
    └── worker.py                WorkerIdentity, WorkerConfig, Department, ActivityPattern

workload.py

M365KnowledgeWorkerWorkload extends WorkloadBase from the agent-haymaker platform. It is the top-level entry point registered via the workload.yaml manifest:

package:
  entrypoint: haymaker_m365_workloads:M365KnowledgeWorkerWorkload

Responsibilities:

identity/user_manager.py

EntraUserManager handles all Entra ID (Azure AD) operations through the Microsoft Graph API:

Worker credentials are ephemeral: passwords are generated at creation time and never stored.

operations/orchestrator.py

ActivityOrchestrator runs the continuous activity generation loop:

content/email_generator.py

EmailGenerator produces realistic email content:

models/worker.py

Data models used across the workload:

DEPARTMENT_PATTERNS is a dictionary mapping each Department to its ActivityPattern.

Platform Integration

The workload integrates with agent-haymaker through the WorkloadBase interface:

agent-haymaker platform
  └── WorkloadBase (abstract)
        └── M365KnowledgeWorkerWorkload
              ├── deploy(config) -> deployment_id
              ├── get_status(deployment_id) -> DeploymentState
              ├── stop(deployment_id) -> bool
              ├── cleanup(deployment_id) -> CleanupReport
              ├── get_logs(deployment_id) -> AsyncIterator[str]
              ├── validate_config(config) -> list[str]
              └── list_deployments() -> list[DeploymentState]

The platform handles:

Deployment Lifecycle

A deployment moves through the following phases:

PENDING
  │
  ▼
creating_workers    Create Entra ID users via Graph API
  │
  ▼
executing           Activity loop running (email, Teams, calendar, documents)
  │
  ├──▶ stopped      Manual stop via `haymaker stop`
  │
  ▼
cleanup             Delete Entra users and resources via `haymaker cleanup`
  │
  ├──▶ completed    All resources removed
  │
  └──▶ failed       Cleanup error (resources may remain)

State transitions are persisted through the platform’s save_state mechanism, allowing deployments to be tracked across process restarts.

Activity Generation Loop

The orchestrator runs a continuous loop with one-minute cycles:

while running:
    if duration_hours exceeded:
        break

    for each worker:
        if outside work hours (08:00-17:00 UTC):
            skip

        if random() < email_per_hour / 60:
            send email (via EmailGenerator if available)

        if random() < teams_per_hour / 60:
            send Teams message

        if random() < docs_per_day / 480:
            create document

    sleep 60 seconds

Each activity:

  1. Is probabilistically triggered based on the department’s per-hour rate
  2. Generates content (AI or template for email; placeholder for Teams/documents)
  3. Logs the activity with timestamp and worker identity
  4. Fires the on_activity callback for telemetry aggregation

The per-minute probability is rate_per_hour / 60, which produces the expected hourly rate over time while maintaining natural variance between cycles.