EMR on EC2
Model Context Protocol for EMR operations
Initial diagnosis for EMR failures, built for AWS DevOps Agent.
Harrier collects bounded evidence from EMR APIs, Spark logs, CloudWatch, S3, and optional Kubernetes diagnostics, then turns the run into a readable triage report instead of another raw log dump.
Initial Triage
- Infrastructure NOT CHECKED
- Data NOT CHECKED
- Spark Runtime ISSUE
- Driver NOT CHECKED
- Shuffle ISSUE
- Executors NOT CHECKED
- Observability PASS
Runtime-aware, one report contract
Different EMR shapes. Same investigation language.
EMR Serverless
Job runs without cluster guesswork
Serverless application metadata, job run state, monitoring config, S3 logs, CloudWatch logs, and worker sizing evidence stay tied to the same report model.
EMR on EKS
Spark plus pod reality
EMR Containers evidence is combined with optional read-only Kubernetes pod diagnostics so scheduling, image pull, and eviction failures do not hide behind Spark errors.
Readable before it is exhaustive
Harrier separates checked evidence from open questions.
The report is explicit that these are initial checks. Items that were not evaluated are marked NOT CHECKED, while attempted checks with incomplete evidence become INCONCLUSIVE.
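The status rules above can be sketched as a small function. This is an illustration under stated assumptions, not Harrier's actual code: the `Status` enum, the `CheckResult` fields, and `resolve_status` are hypothetical names, and only the four status labels come from the report itself.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PASS = "PASS"
    ISSUE = "ISSUE"
    INCONCLUSIVE = "INCONCLUSIVE"
    NOT_CHECKED = "NOT CHECKED"


@dataclass
class CheckResult:
    """Hypothetical record of one triage check."""
    area: str
    attempted: bool = False          # was the check run at all?
    evidence_complete: bool = False  # did collectors return enough to decide?
    issue_found: bool = False        # did the evidence show a problem?


def resolve_status(check: CheckResult) -> Status:
    # A check that never ran is reported as such, never as a pass.
    if not check.attempted:
        return Status.NOT_CHECKED
    # An attempted check without enough evidence stays an open question.
    if not check.evidence_complete:
        return Status.INCONCLUSIVE
    return Status.ISSUE if check.issue_found else Status.PASS
```

The ordering is the point of the sketch: a missing check can never be read as PASS, so an unexamined area always shows up as an open question.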
View production report screenshots

| Area | Status | Initial read |
|---|---|---|
| Infrastructure | NOT CHECKED | IAM, S3, KMS, bootstrap, cluster capacity |
| Data | NOT CHECKED | Input path, schema, bad records, SQL, output path |
| Spark Runtime | ISSUE | Shuffle spill found in executor logs |
| Observability | PASS | Driver and executor log evidence available |
| Configuration | NOT CHECKED | Spark config and sizing need follow-up |
Task reported memory bytes spilled and disk bytes spilled.
Spark shuffle spill is slowing the job or exhausting local disk.
Inspect failed stages, spill volume, skew, and shuffle partitions.
Built as a headless MCP server
AWS DevOps Agent owns the conversation. Harrier owns the evidence model, collectors, classifier, human diagnosis report, and dry-run recommendation preview.
Explore the demo lab
Operational safety
Designed for investigation, not surprise mutation.
Bounded reads
Harrier reads scoped operational evidence instead of crawling entire buckets or clusters.
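One way to picture a bounded read is a pair of helpers that stop as soon as a key count or byte budget is spent. This is a sketch only: `bounded_keys`, `bounded_lines`, and their limits are hypothetical and do not describe Harrier's real collectors. They work on plain iterables so the idea can be tried without AWS access.

```python
from itertools import islice
from typing import Iterable, List, Tuple


def bounded_keys(keys: Iterable[str], prefix: str, max_keys: int) -> List[str]:
    """Return at most max_keys keys under prefix, stopping early."""
    scoped = (k for k in keys if k.startswith(prefix))
    return list(islice(scoped, max_keys))


def bounded_lines(lines: Iterable[str], max_bytes: int) -> Tuple[List[str], bool]:
    """Collect lines until the byte budget is spent; report truncation."""
    kept: List[str] = []
    used = 0
    for line in lines:
        size = len(line.encode("utf-8"))
        if used + size > max_bytes:
            return kept, True   # budget exhausted, remaining lines skipped
        kept.append(line)
        used += size
    return kept, False
```

Returning the truncation flag alongside the lines lets a report say that the evidence was cut short instead of implying the whole log was read.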
Redaction
Common secret patterns are redacted, and log text is treated as untrusted input.
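As an illustration of pattern-based redaction, the sketch below masks two kinds of secret before a log line is shown. The patterns and the `redact` helper are assumptions chosen for the example, not Harrier's actual rule set. The output is only ever handled as plain text, in keeping with treating logs as untrusted input.

```python
import re

# Assumed patterns, for illustration only.
# AWS access key IDs: a known 4-letter prefix followed by 16 characters.
_ACCESS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")
# Simple key=value or key: value pairs with a sensitive-looking key.
_KEY_VALUE = re.compile(r"\b(password|secret|token)(\s*[=:]\s*)(\S+)", re.IGNORECASE)


def redact(line: str) -> str:
    """Mask matching secrets; the result is data to display, never to execute."""
    line = _ACCESS_KEY_ID.sub("[REDACTED_ACCESS_KEY_ID]", line)
    # Keep the key and separator so the log stays readable, drop the value.
    return _KEY_VALUE.sub(r"\1\2[REDACTED]", line)
```

A real rule set would need more patterns than these two; the sketch only shows the shape of the approach.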
Dry-run first
PR recommendations are advisory unless explicit repository write guardrails are enabled.
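The dry-run default can be sketched as a guard that keeps every recommendation advisory unless a write flag is set explicitly. The `Guardrails` class, its `allow_repo_writes` field, and `plan_recommendation` are hypothetical names for this example, not Harrier's configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guardrails:
    """Hypothetical settings; writes stay off unless explicitly enabled."""
    allow_repo_writes: bool = False


def plan_recommendation(title: str, guardrails: Guardrails = Guardrails()) -> dict:
    # Without an explicit opt-in, the recommendation is a preview only.
    mode = "write" if guardrails.allow_repo_writes else "dry-run"
    return {"title": title, "mode": mode, "advisory": mode == "dry-run"}
```

Calling `plan_recommendation("Raise shuffle partitions")` with no second argument yields a dry-run preview; a write plan requires passing `Guardrails(allow_repo_writes=True)` on purpose.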