perfware.cloud - Getting Started Guide - Log Processor 26h1
Appendix B: Subscription Configuration Reference Summary
Appendix C: OpenSearch Document Structure
Appendix D: Sample Athena Queries
Appendix E: Managing Athena Workgroup Access
Appendix F: Infrastructure Updates
Appendix G: OpenSearch Snapshots
Appendix H: Cross-Account Log Ingestion and Cross-Account S3 Access Log Ingestion (Enterprise)
Appendix I: CloudWatch Log Sources
Log Processor ingests your AWS CloudWatch log groups (see Appendix I), stores them in S3, and indexes them into OpenSearch and Athena for search and analytics. It includes:
Centralized, automated log ingestion via Kinesis Firehose (auto-discovers log groups)
Centralized targeting to Athena Datalake and/or OpenSearch
Chunked processing via Lambda and SQS for reliable delivery
Pattern detection and automated compliance reports
S3 access logging for compliance
Isolated/encrypted FedRAMP/HIPAA ready
Datalake and OpenSearch domain with configurable retention policies
OpenSearch Dashboards with browser access (ALB + Cognito authentication)
CloudWatch alarms for Lambda errors, DLQ depth, Firehose freshness, and OpenSearch health
Distributed locking to prevent duplicate processing
The product deploys as a CloudFormation stack:
VPC with isolated subnets and VPC endpoints (S3, DynamoDB, SQS, CloudWatch Logs, SNS, CloudWatch Monitoring, Lambda)
Kinesis Firehose delivery stream
S3 log bucket with KMS encryption and lifecycle rules
Lambda log processor with SQS queues and DynamoDB distributed locking
OpenSearch domain with ISM retention policies, index templates, snapshot repository, and pre-built Dashboards saved objects (index patterns, searches, visualizations, dashboards)
S3 Datalake with Glue catalog and Athena workgroup
ALB with Cognito authentication and Nginx reverse proxy for browser access
Subscription Editor for managing log groups and pattern rules
SNS alarm topic with CloudWatch alarms and a pre-built monitoring dashboard
EventBridge schedules for subscription management and compliance reporting
50+ built-in pattern detectors with per-pattern CloudWatch metrics
OpenSearch configuration is applied automatically once the domain becomes healthy.
An AWS account and user with sufficient permissions
Basic familiarity with the AWS Console (CloudFormation, S3, CloudWatch) and the ability to run AWS CLI commands from a terminal. No programming or infrastructure-as-code experience required - the stack deploys via a single CloudFormation template with guided parameters. Use the provided (.cmd/.sh) scripts to manage the domain, users, stack deletion, cross-account, and snapshot operations.
A custom domain and an ACM certificate for HTTPS access to Dashboards (see Appendix A).
Check your current VPC utilization. Because the default VPC limit is 5 per region, you may need to request a service quota increase.
Step 1: Subscribe and Launch the Main Stack
1. Find Log Processor in the AWS Marketplace console and select your tier.
2. Click Continue to Subscribe and accept the terms.
3. Click Set up your account; you will be redirected.
4. A Subscription Confirmed! page appears and provides the next steps.
5. After you select the launch option, CloudFormation opens with the template pre-loaded → fill in the parameters (below) → Submit.
Step 2: Configure the Main Stack
CloudFormation opens with the template pre-loaded. Fill in the parameters below; you can typically leave all other parameters at their defaults:
| Parameter | Required | Description |
| ConfirmStackName | Yes | Copy and paste your stack name here to confirm it meets the stack name criteria. |
| NotifyEmail | Yes | Comma-separated email address list for CloudWatch alarm notifications and Compliance reports (up to 5). You will receive an AWS Notification - Subscription Confirmation email. |
| DashboardsAllowedCidr* | Yes | Comma-separated IP ranges allowed to access Dashboards (up to 5) or 0.0.0.0/0 for open access. |
| DashboardsCertificateArn* | Strongly recommended | ARN of an ACM certificate for HTTPS. |
| DashboardsDomain* | Strongly recommended | Custom domain for Dashboards (e.g. dashboards.example.com). Provides a stable URL that persists across stack updates. Without this, you must use the ALB-generated DNS name, which changes if the stack is recreated. |
The following are sentinel parameters that control sizing. Each defaults to -1, which means use the tier default. Set a specific value to override it. Note that no range validation is applied when you override.
VolumeSizeValidation* - OpenSearch EBS volume per node in GB (10-3072). Tier defaults: basic=10, essential=50, advanced=100, enterprise=500.
DataNodeCountValidation* - OpenSearch data node count (1-80). Must be even when multi-AZ is enabled. Tier defaults: basic=1, essential=2, advanced=4, enterprise=6.
LambdaConcurrencyValidation - Lambda reserved concurrency (20-100). Controls max parallel log processing. Tier defaults: basic=20, essential=25, advanced=50, enterprise=100.
LambdaMemoryValidation - Lambda memory in MB (256-10240). Increase for large log files. Tier defaults: basic=256, essential=512, advanced=1024, enterprise=2048.
AppRetentionDaysValidation* - OpenSearch ISM retention for app indexes in days (1-3653). Controls when app log indexes are automatically deleted. Tier defaults: basic=30, essential=30, advanced=90, enterprise=365.
AuditRetentionDaysValidation* - OpenSearch ISM retention for audit indexes in days (1-3653). Controls when audit log indexes are automatically deleted. Tier defaults: basic=365, essential=365, advanced=1095, enterprise=2555.
DatalakeAppRetentionDays - Datalake S3 retention for app prefix in days. Use -1 for tier default.
DatalakeAuditRetentionDays - Datalake S3 retention for audit prefix in days. Use -1 for tier default.
Datalake tier defaults (app/audit): Basic 90/365, Essential 365/1095, Advanced 1095/2555, Enterprise 1095/2555.
* Essential tier and above.
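The sentinel behavior described above can be sketched as a small lookup. This is only an illustration of the -1 convention, not the template's actual logic: the function names `resolve` and `resolve_all` are hypothetical, and the numbers are copied from the tier defaults listed above.

```python
# Tier defaults copied from the parameter list in this guide.
TIER_DEFAULTS = {
    "VolumeSizeValidation": {"basic": 10, "essential": 50, "advanced": 100, "enterprise": 500},
    "DataNodeCountValidation": {"basic": 1, "essential": 2, "advanced": 4, "enterprise": 6},
    "LambdaConcurrencyValidation": {"basic": 20, "essential": 25, "advanced": 50, "enterprise": 100},
    "LambdaMemoryValidation": {"basic": 256, "essential": 512, "advanced": 1024, "enterprise": 2048},
    "AppRetentionDaysValidation": {"basic": 30, "essential": 30, "advanced": 90, "enterprise": 365},
    "AuditRetentionDaysValidation": {"basic": 365, "essential": 365, "advanced": 1095, "enterprise": 2555},
}

SENTINEL = -1  # "use the tier default"


def resolve(parameter, tier, value=SENTINEL):
    """Return the effective value: the tier default for -1, otherwise the override."""
    if value == SENTINEL:
        return TIER_DEFAULTS[parameter][tier]
    return value


def resolve_all(tier, overrides=None):
    """Resolve every sizing parameter for a tier, honoring explicit overrides."""
    overrides = overrides or {}
    return {
        name: resolve(name, tier, overrides.get(name, SENTINEL))
        for name in TIER_DEFAULTS
    }


if __name__ == "__main__":
    # A custom node count survives an upgrade; everything left at -1 follows the new tier.
    print(resolve_all("advanced", {"DataNodeCountValidation": 3}))
```

Under this model, a parameter left at -1 follows the tier on upgrade, while an explicit override stays fixed.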
Important: We strongly recommend providing both DashboardsCertificateArn and DashboardsDomain (see Appendix A).
Tier Selection: When upgrading tiers (e.g. essential → advanced), parameters left at -1 automatically pick up the new tier's defaults. Custom overrides are preserved across upgrades. See Capacity Planning to properly size your environment. You can start with the defaults and adjust later.
After a tier change, if you experience a 502 Bad Gateway when accessing Dashboards, clear your browser cookies for the dashboards domain or use an incognito window. This occurs because the existing Cognito session becomes stale during the OpenSearch blue/green deployment. Subsequent logins work normally.
Capacity: Override the -1 sentinel parameters to increase intake capacity. You can also override the AppRetentionDaysValidation and AuditRetentionDaysValidation defaults: shortening retention increases the daily intake capacity, while lengthening retention requires proportionally more storage. Higher log volume may require more resources (nodes, CPU, memory).
If alarms fire regarding OpenSearch capacity, override the relevant sentinel values and redeploy.
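The trade-off between retention and daily intake can be illustrated with rough arithmetic. The sketch below is a deliberately simplified model, not the product's sizing formula: it assumes all EBS storage is usable for log data and ignores replicas, index overhead, and reserved space. The function name and the `usable_fraction` parameter are hypothetical.

```python
def daily_capacity_gb(node_count, volume_gb, retention_days, usable_fraction=1.0):
    """Rough GB/day that fits in the cluster before the oldest indexes age out.

    usable_fraction is a hypothetical knob for replicas and overhead
    (e.g. 0.5 if every shard has one replica).
    """
    if retention_days < 1:
        raise ValueError("retention_days must be at least 1")
    total_gb = node_count * volume_gb * usable_fraction
    return total_gb / retention_days


if __name__ == "__main__":
    # Essential tier defaults from this guide: 2 nodes, 50 GB each, 30-day app retention.
    print(round(daily_capacity_gb(2, 50, 30), 2))
    # Halving retention doubles the daily intake the same storage can hold.
    print(round(daily_capacity_gb(2, 50, 15), 2))
```

Treat the result as an upper bound for planning conversations only; actual capacity depends on index settings not covered in this guide.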
Important: Tier changes must be done through CloudFormation stack updates. Do not manually modify OpenSearch domain settings, Lambda configurations, or other resources through the AWS Console or CLI to replicate capabilities of a higher tier. Manual changes will drift from the template and may be reverted or cause conflicts on the next stack update. Additionally, manually configuring resources to exceed your subscribed tier's features violates the AWS Marketplace Terms of Use and your subscription agreement. To upgrade or downgrade, subscribe to the appropriate tier (one tier change at a time) and update the stack with the new tier's template - all sizing parameters adjust automatically.
Basic Tier: If you start with the basic tier, you must delete it before switching to other tiers (see Teardown).
Check I acknowledge that AWS CloudFormation might create IAM resources and click Submit.
The stack takes approximately 15-30 minutes to create.
You may receive alarm emails during initial deployment.
These are expected and will resolve within 10-15 minutes after stack creation completes.
Step 3: Verify Logs Are Flowing
After 15-30 minutes:
Create User(s)
To access OpenSearch and the Subscription Editor you must create the admin user first (root-level user).
Use the provided users (.cmd/.sh) helper script to manage the Cognito User Pool referenced in the CloudFormation stack outputs:
Usage: users.cmd <stack-name> create admin@example.com
You will receive an email with a temporary password when the user is created.
Available commands (refer to the readme.txt for more information):
| Command | Description |
| users.cmd <stack-name> create <email> | Create a user with a temporary password |
| users.cmd <stack-name> list | List all users with email, status, and enabled flag |
| users.cmd <stack-name> delete <email> | Delete a user |
| users.cmd <stack-name> reset <email> | Reset password (user sets new one on next login) |
| users.cmd <stack-name> disable <email> | Disable a user (blocks login) |
| users.cmd <stack-name> enable <email> | Re-enable a disabled user |
Note: On the Advanced tier and above, additional users need group membership to access the editor. Create the 'admin' and 'viewer' groups, then assign users to the 'admin' group for full access or the 'viewer' group for read-only access.
Link: User Guide (PDF)
Access OpenSearch Dashboards
The stack provisions the following pre-built resources:
Index patterns: logs-*, logs-app-*, logs-audit-*
Saved searches: Recent Errors, By Log Group, Pattern Detections, SQL Injection Attempts, SSN Detections, Credential Detections
Visualizations: Events Over Time, Top Log Groups, Error Rate, Pattern Detections Over Time, Pattern Detections by Type
Dashboards: Log Overview, Logs Audit, Logs App, Pattern Detections
To access: open Dashboards → click Dashboards in the left menu → select a dashboard. To view all saved objects: Management → Saved Objects.
https://<dashboard-domain>/_dashboards
Link: Open Search Dashboards Guide (PDF)
Access Athena Dashboard
The stack provisions pre-built Athena queries in the workgroup.
To access: open the Athena console → select the workgroup (<lowercase-stackname>-workgroup) → Saved queries tab. Queries include: recent logs, errors, count by log group, count by service, pattern detections, pattern count by type, and SQL injection attempts per index type (app, audit). All use current_date functions so they work without editing.
Link: Athena Queries Guide (PDF)
Step 4: Configure Log Subscriptions
The stack provisions log subscriptions for its own resources. The subscriptions file is deployed to the S3 assets bucket (config/ prefix), and the Lambda checks it for updates on every invocation.
Use the editor, or on the basic tier upload the file to s3://<assets-bucket>/config/subscriptions.json.
Changes are picked up automatically on the next Lambda invocation - no redeployment needed.
A built-in web UI is available on the essential tier and above for managing subscriptions without editing JSON directly. The editor is protected by the same Cognito authentication as OpenSearch Dashboards.
https://<dashboards-domain>/editor
Features:
Add, remove, and reorder subscription entries
Configure streams with index type, targets, pattern mode/types, and metadata
Set CloudWatch Logs retention per subscription
Edit custom pattern rules with live regex validation
Version history with preview and one-click rollback (S3 versioning)
Diff preview before saving - shows additions and removals
Server-side validation catches errors before saving to S3
Sync Now pushes changes to the processor in real time
Regex patterns are supported for log group and stream names. For example, /company/app/sample.* subscribes to all log groups whose names start with /company/app/sample.
Metadata lets you uniquely identify the log entries. These are queryable in OpenSearch and Athena.
Link: Subscription Editor Guide (PDF)
Log Group Retention Policy (optional): Each subscription entry supports a retentionDays field to enforce CloudWatch Logs retention on matched local log groups. This ensures log groups have consistent retention without manual configuration in the console.
Omit or set to -1 to leave the log group's current retention unchanged
Set to a positive number (e.g. 30) to enforce that retention period
Automatically snaps to the nearest valid CloudWatch value (1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365 ... 3653)
Only applied when subscribe is "enabled"
First match wins - place specific log group entries before broader regex patterns
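The snapping rule above can be sketched as follows. This is an illustration, not the product's implementation. The full list of valid values comes from the CloudWatch Logs retention settings rather than from this guide. The tie-breaking rule (prefer the shorter period when two values are equally close) and the function name `snap_retention` are assumptions.

```python
# Retention periods (in days) accepted by CloudWatch Logs.
VALID_RETENTION_DAYS = (
    1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545,
    731, 1096, 1827, 2192, 2557, 2922, 3288, 3653,
)


def snap_retention(days):
    """Return the nearest valid retention, or None to leave retention unchanged.

    None and -1 both mean "do not touch the log group's current retention".
    Ties are broken toward the shorter period (an assumption for this sketch).
    """
    if days is None or days == -1:
        return None
    if days < 1:
        raise ValueError("retentionDays must be -1 or a positive number")
    return min(VALID_RETENTION_DAYS, key=lambda valid: (abs(valid - days), valid))


if __name__ == "__main__":
    for requested in (-1, 30, 100, 200, 4000):
        print(requested, "->", snap_retention(requested))
```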
Pattern Detection (features vary by tier): Each stream rule can include a patterns key to scan log messages for sensitive data before indexing:
"patterns": {"mode": "redact", "types": ["ssn", "credit_card", "email", "phone", "aws_key", "sql"]}
| Mode | Behavior |
| off | No scanning (default when patterns key is absent) |
| redact | Replace matched text with [PII:SSN], [PII:EMAIL], [SQL], * (content retained), etc. |
| filter | Drop the entire log event |
| tag | Add pattern_detected: true and pattern_types array to the event, keep message intact |
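The four modes in the table can be sketched in a few lines. This is a minimal illustration only: the two regexes are simplified stand-ins, not the product's built-in detectors, and the function name `apply_patterns` and the event shape (a dict with a `message` key) are assumptions. For simplicity, the sketch adds `pattern_detected` and `pattern_types` only in tag mode.

```python
import re

# Hypothetical, simplified detectors: (regex, redaction label).
DETECTORS = {
    "ssn": (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[PII:SSN]"),
    "email": (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "[PII:EMAIL]"),
}

MODES = {"off", "redact", "filter", "tag"}


def apply_patterns(event, patterns=None):
    """Return the processed event, or None when filter mode drops it."""
    patterns = patterns or {}
    mode = patterns.get("mode", "off")  # absent patterns key behaves like "off"
    if mode not in MODES:
        raise ValueError(f"unknown pattern mode: {mode}")
    if mode == "off":
        return event

    matched = [
        name for name in patterns.get("types", [])
        if name in DETECTORS and DETECTORS[name][0].search(event["message"])
    ]
    if not matched:
        return event
    if mode == "filter":
        return None  # drop the entire log event

    processed = dict(event)
    if mode == "redact":
        for name in matched:
            regex, label = DETECTORS[name]
            processed["message"] = regex.sub(label, processed["message"])
    else:  # tag: keep the message intact and annotate the event
        processed["pattern_detected"] = True
        processed["pattern_types"] = matched
    return processed


if __name__ == "__main__":
    sample = {"message": "user 123-45-6789 mailed a@b.com"}
    for mode in ("off", "redact", "filter", "tag"):
        print(mode, apply_patterns(sample, {"mode": mode, "types": ["ssn", "email"]}))
```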
Detected types:
Identity: ssn, ssn_nondash, passport_us, drivers_license, dob
Financial: credit_card (Luhn validated), iban
Contact: email, phone_us, phone_intl, ipv4
Credentials: aws_access_key (AKIA/ASIA), aws_secret_key
PHI: MRN,
SQL: sql_select, sql_insert, sql_update, sql_delete, sql_drop, sql_alter, sql_truncate, sql_exec, sql_union, sql_injection, sql_comment
APP:
Custom patterns: Define your own additional patterns at the top level of the subscription file (JSON patternRules array). Custom patterns are loaded from S3 on every invocation - no redeployment needed. Zero additional cost: detection is regex-based, with no external API calls.
Querying Pattern detections:
Each log event with a detection enabled is tagged with pattern_detected: true and pattern_types (array of matched pattern names). These fields are searchable per log entry in both OpenSearch and Athena. CloudWatch Metrics provide basic pattern metrics, or enhanced metrics where specified by tier.
Landing Page
The deployment creates a landing page at https://<dashboard-domain>, where <dashboard-domain> is the value you specified for the DashboardsDomain parameter.
The stack creates a CloudWatch monitoring dashboard.
<lowercase-stack>-monitoring
Link: Cloud Watch Dashboard Guide (PDF)
Includes an Alarm Status widget at the top showing all alarm states at a glance (OK/ALARM/INSUFFICIENT_DATA), followed by detailed widgets for Lambda, SQS, Firehose, OpenSearch, per-index metrics, Nginx, Lambda log queries, and Estimated Costs (account total and per-service USD charges for all services used by the stack).
Cost allocation tags: The following tags are automatically applied to all resources for cost tracking in AWS Cost Explorer:
cost:stack - your stack name (e.g. LogProcessor2)
cost:product - LogProcessor (constant across all deployments)
cost:profile - tier name (basic, essential, advanced, enterprise, or custom)
To activate: go to AWS Billing → Cost Allocation Tags → find and activate cost:stack, cost:product, cost:profile. After 24 hours, Cost Explorer can filter and group costs by these tags for per-stack breakdowns.
Note: The Estimated Costs widgets require Receive Billing Alerts to be enabled in the AWS Billing console (Settings →Billing Preferences). Billing metrics update approximately every 6 hours.
CloudWatch alarms notify via email (if configured with the NotifyEmail parameter). You can monitor:
Lambda errors and throttles - processing failures
DLQ depth - messages that failed processing after 3 retries
Firehose data freshness - delivery lag to S3
OpenSearch cluster health - RED/YELLOW status, storage, JVM pressure, CPU
Custom metrics are published to CloudWatch under the <stack>/processor namespace:
S3Events - S3 objects processed
SqsEnqueued - chunks sent to the queue (also per index type: app, audit)
SqsEvents - chunks processed (also per index type)
OsRetries - OpenSearch indexing retries (also per index type)
Warnings - Warning-level events (e.g. DLQ messages)
Errors - Error-level processing failures
PatternDetected - Aggregated events with at least one pattern match
PatternFiltered - Aggregated events dropped by filter mode
ActiveSubscriptions - Number of active CloudWatch log subscription filters (gauge, updated each subscription run)
FirehoseErrors - Number of objects in the S3 `errors/` prefix (gauge, checked every 15 min within a configurable time budget)
Note: The advanced tier and above include enhanced CloudWatch pattern metrics. Some queries and dashboards may be blank on tiers below advanced.
When a new version is available in Marketplace:
1. Go to AWS Marketplace →Manage subscriptions
2. Select the product and tier and click Update
3. Follow the prompts to update the CloudFormation stacks
4. Your configuration and data are preserved
1. Remove subscription filters first: using the subscription editor, set all entries to "subscribe": "disabled", then wait 15 minutes or click Sync Now.
2. Delete the stack: select the stack in CloudFormation and click Delete.
Important: The built-in VPC is isolated and self-contained for security, so you don't need to think about networking when deploying. However, when deleting the stack, ENIs and security groups may linger, and you may have to repeat the delete several times.
Redeploying after deletion: Some AWS resources (S3 bucket names, OpenSearch domains, Cognito domains) may take several minutes to fully release after stack deletion. On subsequent delete attempts, select retain for the resources that failed to delete, then use cleanup.py for final cleanup:
Usage: python cleanup.py <stack-name> <region> [-force|-confirm]
By default, the script shows what would be deleted (dry run). Review the output carefully before specifying -force, because your S3 data (logs, datalake, snapshots) is permanently deleted. To preserve data, copy it first using aws s3 sync.
Option 2 is the simplest approach and recommended for most users.
3. Retry the CloudFormation delete, retaining any resources that cleanup.py can address later.
4. If you are unsubscribing, go to AWS Marketplace → Manage subscriptions → Cancel subscription.
Your AWS infrastructure costs are billed directly by AWS at standard rates - you pay only for what you use. Your data never leaves your AWS account.
Retention periods, node counts, and disk sizes are all adjustable at deploy time - scale down to reduce costs or up to meet demand. If none of the standard tiers match your capacity requirements, this can be addressed during enterprise on-boarding or contact us for a custom configuration via professional services.
Estimated Default Pricing for Region: us-east-1 | Fixed monthly costs only (excludes per-GB data transfer and storage growth)
| Resource | Basic | Essential | Advanced | Enterprise | Enterprise (3 nodes)* |
| OpenSearch Data Nodes | - | 2×r6g.large = $244 | 4×r6g.xlarge = $980 | 6×r6g.2xlarge = $2,934 | 3×r6g.2xlarge = $1,467 |
| OpenSearch Dedicated Masters | - | - | 3×r6g.large = $366 | 3×r6g.large = $366 | 3×r6g.large = $366 |
| EBS Storage (gp3) | - | 2×50 GB = $8 | 4×100 GB = $32 | 6×500 GB = $240 | 3×500 GB = $120 |
| Dashboards (ALB + t3.nano) | - | $28 | $28 | $28 | $28 |
| VPC Interface Endpoints | 4×$7.30 = $29 | 5×$7.30 = $37 | 5×$7.30 = $37 | 5×$7.30 = $37 | 5×$7.30 = $37 |
| KMS Keys ($1/key/month) | $2 | $4 | $4 | $4 | $4 |
| Other (S3, SQS, DDB, Firehose, Lambda) | ~$1 | ~$2 | ~$3 | ~$5 | ~$5 |
| Cognito | - | free | free | free | free |
| AWS Infrastructure Total | ~$32 | ~$323 | ~$1,450 | ~$3,614 | ~$2,027 |
| Software Fee | $29/mo | $149/mo | $299/mo | $599/mo | $599/mo |
| Estimated Monthly Total | ~$61 | ~$472 | ~$1,749 | ~$4,213 | ~$2,626 |
* As an illustration, dropping enterprise to 3 data nodes is a 38% reduction ($4,213 → $2,626). The cluster still maintains multi-AZ with 3 nodes, and dedicated masters handle cluster coordination.
Notes:
Prices based on us-east-1 on-demand rates (May 2026). Other regions may vary ±10-20%.
VPC Interface Endpoints: $0.01/hr per AZ (2 AZs) + $0.01/GB processed. Basic has 4 endpoints (SQS, CW Logs, CW Monitoring, SNS). Essential+ adds Lambda endpoint.
Variable costs not included: S3 storage ($0.023/GB), Firehose ingestion ($0.029/GB), Lambda invocations, CloudWatch Logs ingestion ($0.50/GB), Athena queries ($5/TB scanned).
OpenSearch instance pricing: m6g.large=$0.188/hr, m6g.xlarge=$0.375/hr, m6g.2xlarge=$0.751/hr.
All tiers can be scaled down by overriding the node count and volume size parameters during deployment.
Basic tier has no OpenSearch; query logs via Athena SQL only (~$5/month variable).
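The totals in the pricing table can be re-derived from its line items. The short script below simply sums the rounded monthly figures shown in the table; the dictionary keys are illustrative names, and the numbers are the guide's estimates rather than live AWS prices.

```python
# Rounded monthly line items (USD) copied from the pricing table.
INFRASTRUCTURE = {
    "basic": {"vpc_endpoints": 29, "kms": 2, "other": 1},
    "essential": {"data_nodes": 244, "ebs": 8, "dashboards": 28,
                  "vpc_endpoints": 37, "kms": 4, "other": 2},
    "advanced": {"data_nodes": 980, "masters": 366, "ebs": 32, "dashboards": 28,
                 "vpc_endpoints": 37, "kms": 4, "other": 3},
    "enterprise": {"data_nodes": 2934, "masters": 366, "ebs": 240, "dashboards": 28,
                   "vpc_endpoints": 37, "kms": 4, "other": 5},
    "enterprise_3_nodes": {"data_nodes": 1467, "masters": 366, "ebs": 120,
                           "dashboards": 28, "vpc_endpoints": 37, "kms": 4, "other": 5},
}

SOFTWARE_FEE = {"basic": 29, "essential": 149, "advanced": 299,
                "enterprise": 599, "enterprise_3_nodes": 599}


def infrastructure_total(tier):
    """Sum the fixed AWS infrastructure line items for a tier."""
    return sum(INFRASTRUCTURE[tier].values())


def monthly_total(tier):
    """Infrastructure plus the Marketplace software fee."""
    return infrastructure_total(tier) + SOFTWARE_FEE[tier]


def reduction_percent(before, after):
    """Percentage saved when moving from `before` to `after`, rounded to a whole number."""
    return round(100 * (before - after) / before)


if __name__ == "__main__":
    for tier in INFRASTRUCTURE:
        print(f"{tier}: infrastructure ${infrastructure_total(tier):,}, total ${monthly_total(tier):,}")
    saved = reduction_percent(monthly_total("enterprise"), monthly_total("enterprise_3_nodes"))
    print(f"enterprise -> 3 data nodes: {saved}% reduction")
```

The same structure can be reused to estimate a custom configuration by substituting your own node and storage figures.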
No logs appearing in OpenSearch/Athena
The provisioned stack logs its own Log Processor resources so you should see logs.
Verify subscriptions.json has entries with "subscribe": "enabled".
Wait 15 minutes for the subscription manager to run
Check the generated CloudWatch dashboard:
~/dashboards/dashboard/<lowercase-stack>-monitoring for errors
Check the Lambda log group /aws/lambda/<stack>-log-processor for errors
Verify the subscribed log groups have active log streams producing logs
Cannot access Dashboards
Verify your credentials have been added to Cognito
Verify your IP is in the DashboardsAllowedCidr range
If using HTTPS, verify the ACM certificate is in the Issued state
If using MFA, ensure you have installed an authenticator application
Check that the Nginx instance is running: go to EC2 →Auto Scaling Groups → find the dashboards ASG
See Appendix A, Support
Creating an AWS Certificate
Use the provided setup-domain (.cmd/.sh) script with your custom domain hosted in AWS:
Step 1: Generate the cert (before deploy)
setup-domain.cmd cert-only <sub-domain> <host-zone-id> <region>
e.g. setup-domain.cmd cert-only dashboards.example.com Z013AAAAAB2QDFABCDL8S us-east-1
Step 2: Deploy stack with the domain and cert ARN from step 1
Follow the steps in Subscribe and Launch the Main Stack.
Step 3: After deploy, create the DNS alias to the ALB
setup-domain.cmd <stackname> <sub-domain> <host-zone-id> <region>
e.g. setup-domain.cmd LogProcessor dashboards.example.com Z013AAAAAB2QDFABCDL8S us-east-1
Creating and Importing a Self-Signed Certificate
If you do not have a domain, you can use a self-signed certificate for testing. However, browsers will show a security warning. For production use, a proper domain and an ACM certificate are recommended.
Step 1: Generate the certificate:
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout key.pem -out cert.pem -subj "/CN=localhost"
Step 2: Import into ACM:
aws acm import-certificate --certificate fileb://cert.pem --private-key fileb://key.pem --region <region>
Step 3: Copy the ARN from the output (e.g. arn:aws:acm:us-east-1:123456789012:certificate/abc-123...)
Step 4: Use in CloudFormation - paste the ARN into the DashboardsCertificateArn parameter when deploying the stack. Normally you leave DashboardsDomain blank. If you do specify a domain, you must add a host entry that points the domain to the ALB IP address so Cognito can redirect.
Caveats:
Browsers will show a security warning ("Your connection is not private") - users must click through to proceed
Self-signed certificates do not auto-renew - you must reimport before expiry
Cognito authentication still works with self-signed certs on ALB
Find the ALB domain in the AWS Console and use your browser to open the dashboard, e.g.
https://<lowercase stack>-dashboards-alb-<accountid>.<region>.elb.amazonaws.com/_dashboards
https://<lowercase stack>-dashboards-alb-<accountid>.<region>.elb.amazonaws.com/editor
| Field | Description |
| logGroupName | Log group name or regex pattern (e.g. /aws/lambda/.*). Patterns are evaluated in order; first match wins |
| subscribe | "enabled" to create a CloudWatch subscription filter, "disabled" to remove it, "unmanaged" to match events only (cross-account) |
| retentionDays | Optional. CloudWatch Logs retention in days. Omit or set to -1 to leave unchanged. Snaps to nearest valid value (1, 3, 5, 7, 14, 30, 60, 90 ... 3653). Only applied when subscribe is enabled. First match wins - specific entries take precedence over broader regex patterns |
| accountId | Comma-separated list of account IDs, or * (wildcard). Only allowed on the enterprise tier |
| streams[] | Array of stream routing rules. Each rule has: |
| .logStreamName | Regex pattern to match stream names. First match wins |
| .index | OpenSearch index type: app (default, 30-day retention) or audit (365-day retention) |
| .target[] | "opensearch", "datalake" |
| .patterns[] | mode and types |
| .metadata | Key/value pairs added to every OpenSearch document. Searchable in Dashboards and in the Athena datalake |
Note: If you provisioned the essential tier or above, use the subscription editor, because it validates your changes.
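The first-match-wins ordering for log groups and streams can be sketched as below. The entry fields follow the table above, but the sample values are invented for illustration and the helper names `first_match` and `route_stream` are hypothetical. The sketch also assumes a pattern must match the whole name (`re.fullmatch`), and this guide does not state how the product anchors its regexes.

```python
import re

# Illustrative entries: the specific log group is listed before the broad pattern.
ENTRIES = [
    {
        "logGroupName": "/aws/lambda/billing",
        "subscribe": "enabled",
        "retentionDays": 90,
        "streams": [{
            "logStreamName": ".*",
            "index": "audit",
            "target": ["opensearch", "datalake"],
            "patterns": {"mode": "redact", "types": ["ssn"]},
            "metadata": {"service": "billing", "team": "finance"},
        }],
    },
    {
        "logGroupName": "/aws/lambda/.*",
        "subscribe": "enabled",
        "streams": [{"logStreamName": ".*", "index": "app", "target": ["opensearch"]}],
    },
]


def first_match(entries, log_group):
    """Return the first entry whose logGroupName pattern matches, else None."""
    for entry in entries:
        if re.fullmatch(entry["logGroupName"], log_group):
            return entry
    return None


def route_stream(entry, log_stream):
    """Return the first stream rule whose logStreamName pattern matches, else None."""
    for rule in entry.get("streams", []):
        if re.fullmatch(rule["logStreamName"], log_stream):
            return rule
    return None


if __name__ == "__main__":
    for group in ("/aws/lambda/billing", "/aws/lambda/orders", "/ecs/web"):
        entry = first_match(ENTRIES, group)
        print(group, "->", entry["logGroupName"] if entry else None)
```

If the two entries were swapped, the broad /aws/lambda/.* pattern would capture the billing log group first, which is why specific entries belong at the top.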
Each log line is indexed as a separate document:
| Field | Source |
| @timestamp | Millisecond-precision ISO-8601 from the CloudWatch log event |
| accountId | Your AWS account ID |
| logGroup | CloudWatch log group name |
| logStream | CloudWatch log stream name |
| message | The log line text |
| service, team, etc. | Custom fields from metadata in your subscription configuration |
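As an illustration of the fields above, the sketch below builds one document from a CloudWatch log event (which carries an epoch-millisecond `timestamp` and a `message`). It is not the processor's code: the function name `to_document` is hypothetical, the exact timestamp suffix the product emits is not specified in this guide, and letting the built-in fields win over same-named metadata keys is an assumption.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_document(log_event, account_id, log_group, log_stream, metadata=None):
    """Build a per-log-line document with a millisecond-precision ISO-8601 timestamp."""
    # Integer millisecond arithmetic avoids float rounding of the timestamp.
    timestamp = EPOCH + timedelta(milliseconds=log_event["timestamp"])
    document = dict(metadata or {})  # custom fields such as service, team
    document.update({
        "@timestamp": timestamp.isoformat(timespec="milliseconds"),
        "accountId": account_id,
        "logGroup": log_group,
        "logStream": log_stream,
        "message": log_event["message"],
    })
    return document


if __name__ == "__main__":
    event = {"timestamp": 1700000000123, "message": "order 42 created"}
    print(to_document(
        event,
        account_id="123456789012",
        log_group="/aws/lambda/orders",
        log_stream="2023/11/14/[$LATEST]abc",
        metadata={"service": "orders", "team": "commerce"},
    ))
```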
The deployment provisions ready-to-use saved queries for the sample logs created by the main stack deployment.
The stack creates an Athena workgroup (<lowercase-stackname>-workgroup) with pre-configured query results location and saved queries. You can control who can query your log data by managing IAM permissions on the workgroup.
Grant a user or role access to the workgroup:
{
  "Effect": "Allow",
  "Action": [
    "athena:StartQueryExecution",
    "athena:GetQueryExecution",
    "athena:GetQueryResults",
    "athena:GetWorkGroup",
    "athena:ListNamedQueries",
    "athena:GetNamedQuery"
  ],
  "Resource": "arn:aws:athena:<region>:<account-id>:workgroup/<stack>-workgroup"
}
Users also need S3 and Glue permissions:
S3: s3:GetObject, s3:ListBucket on the datalake bucket, and s3:PutObject on the query-results/ prefix for writing results
Glue: glue:GetTable, glue:GetDatabase, glue:GetPartitions on the Glue catalog databases
KMS: The datalake uses CMK encryption, so users also need kms:Decrypt and kms:GenerateDataKey on the datalake CMK
Restrict access to specific databases:
To limit a user to only query app logs (not audit), scope the Glue permissions to the specific database:
"Resource": "arn:aws:glue:<region>:<account-id>:database/<lowercase-stack>_app"
Enforce workgroup usage:
The workgroup is created with enforceWorkGroupConfiguration: true, which means the query results location and encryption settings cannot be overridden by users. All queries run through this workgroup use the pre-configured datalake bucket for results.
Disable access:
To prevent a user from querying, remove the athena:StartQueryExecution permission on the workgroup resource. They will still be able to view saved queries but not execute them.
Note: For fine-grained column-level or row-level access control, consider enabling AWS Lake Formation on the Glue databases. This allows you to grant per-table or per-column permissions to specific IAM users or roles without managing S3 bucket policies directly.
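The workgroup statement and the database-scoped Glue permission described in this appendix can be assembled into one policy document. The sketch below is an illustrative starting point rather than a complete policy. The function name `build_policy` is hypothetical, the lowercase workgroup name follows the <lowercase-stackname>-workgroup convention mentioned earlier, and the S3 and KMS statements are omitted because their bucket and key ARNs are specific to your stack. Glue access checks commonly involve catalog and table resources as well, so expect to extend the Glue statement.

```python
import json

WORKGROUP_ACTIONS = [
    "athena:StartQueryExecution",
    "athena:GetQueryExecution",
    "athena:GetQueryResults",
    "athena:GetWorkGroup",
    "athena:ListNamedQueries",
    "athena:GetNamedQuery",
]

GLUE_ACTIONS = ["glue:GetTable", "glue:GetDatabase", "glue:GetPartitions"]


def build_policy(region, account_id, stack, database_suffix="app"):
    """Return an IAM policy dict for the workgroup plus one Glue database."""
    name = stack.lower()  # assumption: resource names use the lowercase stack name
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": WORKGROUP_ACTIONS,
                "Resource": f"arn:aws:athena:{region}:{account_id}:workgroup/{name}-workgroup",
            },
            {
                "Effect": "Allow",
                "Action": GLUE_ACTIONS,
                "Resource": f"arn:aws:glue:{region}:{account_id}:database/{name}_{database_suffix}",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_policy("us-east-1", "123456789012", "LogProcessor"), indent=2))
```

Removing "athena:StartQueryExecution" from WORKGROUP_ACTIONS yields the view-only variant described under Disable access.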
A refresh schedule updates the EC2 instance automatically.
| Tier | Frequency | Time |
| essential | Every 30 days | ~03:00 UTC (±60 min flex window) |
| advanced | Every 14 days | ~03:00 UTC (±60 min flex window) |
| enterprise | Every 7 days | ~03:00 UTC (±60 min flex window) |
The AWS Scheduler picks a random time within 03:00-04:00 UTC. Dashboards are inaccessible during the refresh. Lambda functions are patched automatically by AWS; major updates are handled via Marketplace updates.
Use the provided snapshot (.cmd/.sh) script to take manual snapshots for long-term archival beyond the AWS-managed automated snapshots (hourly, 14-day retention).
Usage: snapshot.cmd <stack-name> <command> [snapshot-name] [region]
Available commands:
| Command | Description |
| snapshot.cmd <stackname> take [snapshot-name] | Take a snapshot. Name defaults to daily-YYYYMMDD-HHMMSS |
| snapshot.cmd <stackname> list | List all snapshots |
| snapshot.cmd <stackname> status <snapshot-name> | Check snapshot status |
| snapshot.cmd <stackname> delete <snapshot-name> | Delete a snapshot |
Refer to the readme.txt that accompanies the helper script zip.
The Enterprise tier supports centralized log/S3 access ingestion from multiple AWS accounts. Refer to notes in the Subscription Editor Guide (PDF) and Athena Queries Guide (PDF) for guidance.
The main value proposition of Log Processor is that you no longer need to cobble together your own services: a single stack provides OpenSearch Dashboards, ISM policies, snapshot repositories, and datalake export.
Targeting CloudWatch Logs provides a central log repository for your own services and applications, and for AWS itself:
Compute:
EC2 - via AWS CloudWatch Agent
Lambda - automatic
ECS/Fargate - awslogs driver
Elastic Beanstalk - platform/app logs
App Runner - automatic
Batch - automatic via awslogs driver
Lightsail - container service logs
Networking:
VPC Flow Logs - network traffic metadata
Route 53 - DNS query logging
CloudFront - real-time logs via Kinesis (standard logs go to S3)
App Mesh / Envoy - proxy access logs
Global Accelerator - flow logs
Network Firewall - alert and flow logs
WAF - web ACL traffic logs
Verified Access - access logs
VPN - connection logs
API & Application:
API Gateway - access and execution logs
AppSync - request/response logs
Amplify - build logs
EventBridge Pipes - execution logs
Database & Storage:
RDS/Aurora - error, slow query, general, audit logs
DocumentDB - profiler and audit logs
Neptune - audit logs
ElastiCache - slow log, engine log
Redshift - audit logs (via S3 typically, but can route)
DynamoDB - via CloudTrail for API calls
S3 - via CloudTrail for data events
Security & Identity:
CloudTrail - API activity logs
Cognito - user pool advanced security logs
GuardDuty - findings (via EventBridge, but source logs feed from CloudWatch)
AWS Config - configuration change logs
IAM Identity Center - sign-in logs
Macie - discovery job logs
Integration & Messaging:
Step Functions - execution history logs
SNS - delivery status logs
SQS - via CloudTrail
MQ (ActiveMQ/RabbitMQ) - general and audit logs
MSK (Kafka) - broker logs
Kinesis Data Analytics - application logs
Developer Tools:
CodeBuild - build logs (automatic)
CodeDeploy - deployment logs
CodePipeline - via CloudTrail
CodeCatalyst - workflow logs
AI/ML:
SageMaker - training, endpoint, processing job logs
Bedrock - model invocation logs
Lex - conversation logs
Comprehend - analysis job logs
Rekognition - via CloudTrail
Transcribe - via CloudTrail
Kendra - data source sync logs
IoT:
IoT Core - message broker, device shadow, rules engine logs
IoT Greengrass - component logs
IoT SiteWise - gateway logs
Management & Governance:
Systems Manager - Run Command, Automation, Session Manager logs
CloudFormation - via CloudTrail
Service Catalog - via CloudTrail
Control Tower - guardrail logs
Media:
MediaLive - channel logs
MediaConnect - flow logs
MediaConvert - job logs
Other:
WorkSpaces - access logs
AppStream 2.0 - session logs
GameLift - server process logs
DataSync - task execution logs
Transfer Family - SFTP/FTP structured logs
Glue - ETL job logs (continuous logging)
EMR - step and application logs
Included support (all subscriptions):
Documentation (this guide and other PDF links contained within)
Bug fixes and security patches via AWS Marketplace product updates
GitHub/Email support for configuration questions
Helper scripts - download them, then refer to readme.txt and each script's parameter help
| File | Summary |
| readme.txt | More information regarding files contained in the zip. |
| users.(cmd/sh) | Manage Cognito users for OpenSearch Dashboards access. |
| groups.(cmd/sh) | Manage Cognito groups/users for Editor Dashboard access. |
| setup-domain.(cmd/sh) | Configure a custom domain for the Dashboards ALB. |
| run-subscriptions.(cmd/sh) | Trigger immediate subscription processing on the log processor Lambda. Equivalent to clicking "Sync Now" in the editor. |
| add-access-logging.(cmd/sh) | Enable S3 access logging on any bucket. |
| snapshot.(cmd/sh) | Manage OpenSearch manual snapshots (take, list, status, delete). |
| support-bundle.(cmd/sh) | Collect a diagnostic bundle for support analysis. |
| setup-replication.(cmd/sh) | Configure an external account's bucket to replicate S3 access logs to a stack's access log bucket. |
| setup-cross-account.(cmd/sh) | Manage cross-account log subscriptions. |
| get-destination.(cmd/sh) | Display the CloudWatch Logs destination ARN for a stack. |
| cleanup.py | Clean up orphaned resources after stack deletion. |
Contact: support@perfware.cloud
Issues & bugs: https://github.com/perfwaresupport/logprocessor-support/issues
Support/Response time: Refer to tier documentation
Professional services (additional):
Custom setup - Custom tiers, subscription configuration, data flow verification, capacity planning
Custom integration - Integration with other services and applications
For professional services inquiries, contact consulting@perfware.cloud