Coming Soon to AWS Marketplace
Set it and forget it — no threshold tuning

ML-Powered Anomaly Detection
for AWS Metrics

Automatically learns normal behavior per metric and alerts on deviations. Correlates anomalies with other metrics for instant context. AI-powered discovery wizard finds what to monitor. No static thresholds. No manual tuning. Deploy in minutes.

AI Monitor tells you something is wrong. Log Processor tells you why.

View Pricing Learn More

The Problem

  • Static thresholds break when traffic patterns change
  • Manual threshold tuning wastes engineering time
  • Alert fatigue from false positives kills response time
  • Data leaves your account to third-party SaaS
  • "CPU spiked" — but why? Metrics alone lack context
  • Maintenance deployments trigger noise every release
  • Per-metric SaaS pricing scales unpredictably, are you tired of surprise bills from per-metric SaaS subscriptions?
  • Multi-account visibility requires custom glue
  • Setting up monitoring requires knowing exact metric names and dimensions

The Solution

  • ML baseline learns normal — adapts to daily and day-of-week cycles
  • Zero configuration after subscription — no thresholds to set
  • Configurable sensitivity per metric reduces false positives
  • Runs entirely in your VPC — no data egress
  • Anomaly alerts include correlated metrics for incident context
  • Maintenance windows suppress alerts during deployments
  • Flat monthly fee — no per-metric charges
  • Cross-account metrics in one view (advanced/enterprise)
  • AI discovery wizard — Describe what you want to monitor, get a ready-to-use subscription, and receive natural language explanations for every anomaly (advanced/enterprise)
AI Monitor Architecture Overview
details...

Features

Subscribe to metrics, let ML learn, get alerted when something deviates.

Adaptive Baselines

Per-day-of-week Random Cut Forest models learn that "Mondays are busier than Sundays." Separate baselines per day eliminate weekly pattern false positives. Adapts continuously — no retraining needed.

Metric Stream Ingestion

CloudWatch Metric Streams → Firehose → S3 → Lambda → OpenSearch. Sub-minute latency. Editor syncs stream filters — only pay for namespaces you subscribe to.

Metric-to-Metric Correlation

When an anomaly fires, automatically checks what other metrics also deviated in the same time window. Alert includes correlated metrics — "CPU spiked and 3 other metrics were also anomalous."

Maintenance Suppression

Schedule recurring or one-time maintenance windows. Anomalies during these windows are suppressed — no alert noise during planned deployments.

Subscription Editor

Web UI to browse CloudWatch namespaces, select metrics + dimensions, set sensitivity thresholds, generate/schedule reports and manage maintenance windows. Secured with Cognito authentication — one subscription can monitor all resources independently.

Cross-Account Metrics

Remote accounts stream metrics into your central Firehose via a cross-account IAM role. S3 partitioned by account/region/date/hour — full visibility from one pane.

Dynamic Stream Filtering

Editor save updates the Metric Stream's IncludeFilters to only stream subscribed namespaces. Zero subscriptions = stream stopped. Minimizes Firehose/S3 costs.

Pairs with Log Processor

Two products, one story. AI Monitor detects metric anomalies; Log Processor surfaces the root cause from your logs. Deploys as a guest on Log Processor's VPC and OpenSearch — shared infrastructure, zero duplication, single-pane visibility across metrics and logs.

Cost Anomaly Detection

Daily billing processor computes day-over-day spend deltas per AWS service. Detects unusual cost spikes using day-of-week baselines — "RDS costs 3x more than a normal Thursday."

AI Discovery Wizard

Scans your account for all available CloudWatch metrics. Browse namespaces, see what's active, click to subscribe. Suggested prompts pre-fill subscriptions with smart defaults — or type what you want in natural language and let AI configure it (advanced+).

Comprehensive Reports

Scheduled or on-demand HTML reports with severity breakdown, day-of-week trends, top anomalous metrics, per-account breakdown, cost analysis, maintenance window stats, baseline training status, and system health checks.

Cross-Account Metric Ingestion - Advanced and Above Tiers

Centralize Metrics from multiple AWS accounts into a single pipeline.

Cross-Account Metric Architecture

How It Works

1

Deploy

Launch the CloudFormation template. Provide your Log Processor stack name and tier.

2

Subscribe

Open the editor, browse namespaces, select metrics. The Metric Stream starts flowing.

3

Baseline

The detector builds a model per metric over the training period (7-14 days). No alerts during this phase.

4

Detect

Once baseline is ready, deviations trigger alerts with anomaly score + correlated metrics.

Simple, Predictable Pricing

Software fee only. AWS infrastructure costs (Firehose, S3, Lambda) billed directly by AWS.

Basic

$49/mo
  • 15 metric subscriptions
  • 5-minute collection interval
  • 15-minute detection interval
  • 7-day baseline period
  • SNS email alerts
  • Subscription editor
  • Metric preview
  • Supports company SSO (SAML/OIDC)

Essential

$99/mo
  • Everything in Basic, plus:
  • 25 metric subscriptions
  • 1-minute collection
  • 5-minute detection

Advanced

$299/mo
  • Everything in Compact, plus:
  • 200 metric subscriptions
  • 14-day baseline period
  • Cross-account metrics
  • Anomaly reports (scheduled + on-demand)
  • Per-subscription retention override
  • Editor roles (RBAC)
  • AI-powered metric discovery & suggestions
  • AI anomaly explanation in alerts
  • DynamoDB point-in-time recovery
  • Email support (5 business days)

Enterprise

$499/mo
  • Everything in Advanced, plus:
  • 1,000 metric subscriptions
  • Full cross-account delivery
  • Email support (2 business days)
  • Onboarding call + quarterly review

Frequently Asked Questions

Does AI Monitor require Log Processor?

Yes. AI Monitor deploys as a guest on a running Log Processor stack — it uses the same VPC, subnets, and OpenSearch domain via SSM parameter lookup. The Log Processor owns the cluster; AI Monitor writes to its own index pattern (metrics-*) and reads other metric baselines for correlation.

Can I use my company's SSO (Okta, Azure AD, etc.)?

Yes. Add your SAML or OIDC identity provider to the Cognito User Pool using the sso helper script. Users will see a “Sign in with [Provider]” button on the login page. After first login, assign groups with groups for role-based access. See the helper scripts README for details.

How does anomaly detection work without thresholds?

Each metric subscription gets its own Random Cut Forest model that maintains a sliding window of recent values. After the baseline period, the detector computes a z-score based anomaly score (0-10) for each new datapoint. Scores above the configurable threshold (default 3.0) trigger alerts. The model continuously adapts — daily patterns, gradual drift, and weekly cycles are learned automatically via per-day-of-week baselines.

What is the correlation engine?

When an anomaly is detected (compact tier and above), the correlation engine checks which other subscribed metrics also showed unusual behavior in the same time window (z-score > 2.0). The alert includes these correlated metrics with their deviation scores — helping you determine if an anomaly is isolated or part of a broader incident.

How do maintenance windows work?

Create recurring schedules (e.g. "Wednesdays 02:00-04:00 UTC") or one-time windows in the editor. Anomalies detected during these periods are recorded but suppressed — no alerts fire. Apply windows globally or to specific subscriptions.

Maintenance windows example

Does my data leave my AWS account?

No. Everything runs inside your VPC. Metric Streams → Firehose → S3 → Lambda → OpenSearch — all within your account. The only external call is a periodic Marketplace entitlement check.

How does cross-account work?

On advanced/enterprise tiers, provide remote account IDs during deployment. The stack creates a cross-account IAM role that remote accounts assume to PutRecord into your Firehose. Remote accounts deploy their own CloudWatch Metric Stream targeting this role. Data lands in S3 partitioned by source account ID — all metrics queryable from a single OpenSearch instance.

What happens when I add/remove subscriptions?

The editor immediately updates the CloudWatch Metric Stream's IncludeFilters to only stream namespaces with active subscriptions. Adding the first subscription in a namespace starts streaming; removing the last stops it. Zero subscriptions = stream paused entirely — no Firehose/S3 costs when idle.

Can I run multiple AI Monitor instances?

Yes. Each AI Monitor stack is fully independent — deploy multiple stacks with different names, each paired to a specific Log Processor instance. For example, deploy AIMonProd targeting LogProcProd and AIMonDev targeting LogProcDev. Each gets its own metric stream, Firehose, S3 buckets, DynamoDB, editor, ALB, and Cognito. No resource collisions, no shared state. Useful for separate environments (dev/staging/prod), different teams, or different AWS accounts with their own Log Processor deployments.

Can I send alerts to Slack, Teams, or PagerDuty instead of email?

Yes. The alert topic is a standard SNS topic — add your own subscriptions for any protocol SNS supports: HTTPS (webhooks), Lambda, SQS, SMS, etc. For Slack/Teams, use AWS Chatbot or subscribe an HTTPS endpoint to the topic with your webhook URL. For PagerDuty/OpsGenie, subscribe their SNS integration endpoint. You can have email and webhook subscribers active simultaneously.

Every alert includes SNS message attributes: severity (high/medium/low), score (0-10), namespace, metricName, subscriptionId, description, and accountId. Use SNS subscription filter policies to route by any combination — for example, only page on-call for {"severity": ["high"]} (score ≥ 7), route a specific subscription to a dedicated Slack channel, or escalate only production account anomalies.

Can I automate responses to anomalies?

Yes. Subscribe a Lambda function, SQS queue, or HTTPS webhook to the SNS alert topic. Use SNS filter policies on message attributes (severity, score, namespace, metricName) to trigger specific automations — for example, auto-scale an ECS service only when {"severity": ["high"], "namespace": ["AWS/ECS"]}, create a Jira ticket for all anomalies, or invoke a Step Functions workflow for multi-step remediation. Any tool that accepts SNS or webhooks works: Zapier, n8n, EventBridge, custom APIs.

Can I change tiers later?

Yes. Update the CloudFormation stack with a new tier parameter. Subscription limits, detection intervals, and feature gates update immediately. Baselines and anomaly history are preserved in DynamoDB.

How does AI Monitor learn daily and weekly patterns?

The detector maintains separate baselines per day of week (Monday through Sunday). Each day builds its own sliding window of normal values over time. After approximately one week, the system knows that "Monday morning traffic" is different from "Sunday night quiet" and scores accordingly. Until per-day data is sufficient, it falls back to an overall baseline. This eliminates the most common source of false positives — weekly traffic cycles.

Will I get false positives while the baseline is still training?

Possibly. The detector begins alerting after just 15 data points (~75 minutes), even though full training requires 288 points per day-of-week (~24 hours of data per weekday). During early training:

• Alerts are marked "⚠ Low confidence" in the email subject when training is below 50%
• The alert body shows training percentage (e.g. "Fri baseline: 35% trained")
• The anomaly detail in the editor displays training status

This is by design — a genuine spike from 200ms to 5000ms should alert immediately even with a thin baseline. To suppress all alerts during initial learning, set the baselineDays field on the subscription (e.g. 7). The detector will score but not alert until that period passes. After one full week, day-of-week baselines stabilize and false positives drop significantly.

Can I monitor all AWS services with one subscription?

Yes. Use regex wildcards in dimension values. For example, setting dimensions to {"FunctionName": ".*"} monitors every entity independently — each gets its own baseline. One subscription, unlimited resources.

Wildcard subscription example

Can I suppress email alerts but still record anomalies?

Yes. Each subscription has an "Email" toggle (on/off) and an "Email threshold" (0-10). Set Email to No for fully silent recording — anomalies are stored in DynamoDB and visible in the editor but no notifications are sent. Or set Email threshold to 5.0 to only receive emails for significant anomalies while low-severity ones are recorded silently.

What do the trend indicators mean?

The subscription list shows arrows indicating anomaly frequency trends: ▲ red (worsening — more anomalies this week than last), ▼ green (improving — fewer this week), ▶ grey (stable), ★ blue (new — no prior week data). A persistently worsening trend suggests a systemic issue beyond individual alert triage.

What AWS infrastructure costs should I expect?

Typical costs for a compact tier deployment with 20 active metrics: Metric Streams ~$3/mo, Firehose ~$1/mo, S3 ~$0.50/mo, Lambda (collector, detector, billing, discovery) ~$2/mo, DynamoDB ~$0.50/mo. Total infrastructure ~$7-12/mo on top of the software fee. OpenSearch is shared with Log Processor — no additional cluster cost. On advanced+ tiers, the Bedrock VPC endpoint adds ~$7/mo and AI suggestions cost ~$0.001 per query. Cross-account adds a Metric Stream per remote account (~$3/mo each).

How does cost anomaly detection work?

A daily billing processor queries AWS/Billing EstimatedCharges data from OpenSearch, computes the day-over-day spend delta per service, and writes derived DailySpend metrics back. The detector then scores these like any other metric using day-of-week baselines. This means it learns that "Tuesdays cost more than Sundays" and only alerts on genuine cost spikes — not normal weekly patterns. Subscribe to AIMonitor/Billing / DailySpend with {"ServiceName": ".*"} to monitor all services independently.

How does the discovery wizard work?

A scheduled Lambda calls CloudWatch ListMetrics every 6 hours and caches all available namespaces, metrics, and dimension values. The editor merges this with a comprehensive AWS metrics catalog (40+ services) to show you everything you could monitor. Active metrics are highlighted; inactive ones can be shown via a toggle. Click "+ Subscribe" to pre-fill the form with smart defaults including wildcard dimensions and suggested thresholds. Click the ↻ refresh icon to update the cache on demand (~30 seconds) when you've added new resources.

What's included in an anomaly report?

Each report is a self-contained HTML page covering a configurable lookback window (default 7 days). It includes: total anomalies with severity breakdown (high/medium/low), anomaly trend direction per subscription, top anomalous metrics ranked by count, day-of-week distribution, top dimensions (resources), per-account breakdown, cost analysis (daily spend vs 7-day average with % change), maintenance window details (type/schedule/duration/suppressed count), baseline training status, and full system health checks across all Lambda functions. View a sample report

Is my anomaly data backed up?

On advanced and enterprise tiers, DynamoDB point-in-time recovery (PITR) is automatically enabled. This provides continuous backups with restore capability to any second within the last 35 days. Baselines, anomaly history, and maintenance state are all protected. On lower tiers, baselines rebuild automatically from incoming data within 1-2 weeks if lost.

How do I enable AI features (discovery suggestions, anomaly explanations, and tuning)?

AI features require Amazon Bedrock model access. In the AWS Console, go to Bedrock → Model access → Modify model access → enable Anthropic Claude models and submit the use case form. Activation takes approximately 15 minutes. Once approved, AI suggestions in the discovery wizard and anomaly explanations in alerts will work automatically. AI features are available on advanced and enterprise tiers only.

What are the AI-powered suggestions?

On advanced and enterprise tiers, the discovery wizard shows a natural language prompt. Type something like "monitor my Lambda errors" or "alert me if S3 costs spike" and Bedrock (Claude) returns a fully-configured subscription suggestion based on your actual available metrics. Lower tiers get the same suggested prompts as clickable chips that pre-fill the form using hardcoded smart mappings — no AI call needed.

AI discovery wizard example

What is the AI anomaly explanation?

On advanced and enterprise tiers, each anomaly alert includes an AI-generated analysis section. Bedrock (Claude) receives the anomaly context — metric name, current value, baseline mean, score, and any correlated metrics — and produces a 2-3 sentence explanation: likely cause, what to investigate first, and whether it's probably a real issue or noise. Labeled "AI Analysis (experimental)" in the email. Costs ~$0.0003 per alert.

AI anomaly explanation example

What is the AI assisted tuning?

On advanced and enterprise tiers, in Recent Anomalies, click on the subscription link to view its details, click the "Tune" button and Bedrock (Claude) leverages recurring anomaly history and suggests changes which you can apply, costs ~$0.0003 per tune.

AI anomaly tuning example

Does cost monitoring work with multiple accounts?

Yes. If cross-account metric streams include AWS/Billing from linked accounts, the billing processor computes per-account per-service deltas. Alerts include the account ID so you know exactly which account is spending more. Note: AWS/Billing metrics only exist in the payer (management) account unless billing access is delegated to member accounts.

Documentation

Guides and reference materials

Getting Started

Prerequisites, deployment, post-setup, and first subscription walkthrough.

User Guide (PDF)

Comprehensive reference covering creating users, groups and supporting SSO.

Editor Guide (PDF)

Subscription editor UI reference: fields, preview, discovery, maintenance, reports.

Monitoring Guide (PDF)

Comprehensive reference covering monitoring features, and configuration, and operations.

Operational Scripts

Helper scripts: users, groups, support-bundle, cross-account setup (cmd + sh).

Support

Included with all subscriptions: documentation, bug fixes, security patches, and support.

GitHub Support Portal

Report bugs, request features, and track issue resolution.

View on GitHub

Professional services: consulting@perfware.cloud


Built and supported by engineers with 10+ years of hands-on AWS experience spanning enterprise architecture, security, performance engineering, automation, infrastructure, and software development.

Contact Us