
Building a Cost Optimization Agent with AWS Bedrock and Cost Explorer

by Lucas Little
April 8, 2026

Managing AWS costs becomes increasingly complex as infrastructure grows. Organizations often struggle with cloud cost management, spending valuable engineering time manually analyzing Cost Explorer data, identifying optimization opportunities, and implementing changes. Even with dedicated cost management tools, the analysis and remediation process remains largely manual, requiring specialized expertise to interpret cost data and translate it into actionable steps.

As cloud adoption matures, cost optimization is shifting from a periodic review into a continuous, automated discipline. Organizations that operationalize FinOps — rather than treating it as a reporting exercise — see materially better cost outcomes.

This post demonstrates how to build an automated agent that analyzes AWS costs and generates actionable recommendations to reduce cloud spend. By combining AWS Bedrock’s analytical capabilities with Cost Explorer data, the system identifies cost outliers and provides specific optimization steps that go beyond basic visualizations to deliver meaningful insights.

What we’re building

A cost optimization agent that:

  • Runs on a schedule (weekly or monthly)

  • Fetches detailed cost data from AWS Cost Explorer

  • Analyzes spending patterns and identifies optimization opportunities

  • Generates a markdown report with specific recommendations

  • Sends a summary to Slack or email

  • Tracks recommendations and their potential savings

This solution goes beyond basic cost visualization by providing specific, actionable steps to optimize AWS spending—essentially turning data into decisions.

Real-World Applications

This solution addresses cost management challenges across different contexts:

Enterprise FinOps Teams: In larger organizations, the agent provides consistent, ongoing cost analysis that augments the FinOps team’s capabilities, ensuring no optimization opportunity goes unnoticed even as the infrastructure grows in complexity.

Startups and Small Teams: For organizations without dedicated cloud financial analysts, the agent provides expert-level cost optimization recommendations that would otherwise require specialized knowledge or expensive consultants.

Managed Service Providers: MSPs can deploy the agent across client environments, standardizing cost optimization practices while customizing thresholds and priorities for each client’s specific needs.

Development Environments: The agent can enforce stricter cost controls in non-production environments, identifying development and testing resources that can be safely downsized, scheduled, or terminated to reduce costs without affecting production workloads.

Multi-Cloud Strategies: While this implementation focuses on AWS, the architecture pattern can be extended to analyze costs across multiple cloud providers, giving organizations a unified view of optimization opportunities.

Architecture overview

The high-level architecture is event-driven and built from the following key components:

  • EventBridge: Triggers the agent on a schedule

  • Lambda: Initializes and coordinates the analysis process

  • Bedrock Agent: Orchestrates the data gathering and analysis

  • Action Groups: Custom Lambda functions for specific tasks

  • S3: Stores the generated reports

  • SNS: Sends notifications with the summary

  • DynamoDB: Tracks recommendations and their implementation status

This event-driven architecture ensures the cost optimization process runs automatically on schedule, eliminating the need for manual intervention and ensuring consistent analysis.

What you’ll need

  • AWS Account with Bedrock and Cost Explorer access
  • IAM role with appropriate permissions
  • S3 bucket for storing reports
  • SNS topic or email for notifications
  • Basic understanding of AWS services

Step 1: Designing Action Group Capabilities

Rather than thinking in terms of standalone scripts, it’s more effective to define capabilities that the agent can invoke.

Cost Data Retrieval

This capability is responsible for building a structured representation of cost and usage.

At a high level, it:

  • Determines the analysis window (week, month, quarter)
  • Retrieves time-series cost data
  • Segments spend by service and usage type
  • Identifies anomalies and unusual patterns
  • Surfaces reservation and Savings Plans opportunities

Conceptually:

def retrieve_cost_data(time_period):
    # Establish time window and granularity
    # Query Cost Explorer for cost and usage
    # Retrieve anomaly signals
    # Gather reservation and Savings Plans recommendations
    # Normalize into a structured dataset
    return cost_data
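As an illustration of the normalization step, here is a minimal sketch assuming the response shape returned by Cost Explorer's `get_cost_and_usage` API when grouped by the SERVICE dimension. The function name and sample data are hypothetical; a real call would go through boto3 with appropriate credentials.

```python
from collections import defaultdict

def normalize_cost_response(response):
    """Flatten a get_cost_and_usage response (grouped by SERVICE)
    into {service: total_cost} summed across all time periods."""
    totals = defaultdict(float)
    for period in response.get("ResultsByTime", []):
        for group in period.get("Groups", []):
            service = group["Keys"][0]
            amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
            totals[service] += amount
    return dict(totals)

# Sample shaped like a real Cost Explorer response:
sample = {
    "ResultsByTime": [
        {"Groups": [
            {"Keys": ["Amazon EC2"],
             "Metrics": {"UnblendedCost": {"Amount": "120.50", "Unit": "USD"}}},
            {"Keys": ["Amazon S3"],
             "Metrics": {"UnblendedCost": {"Amount": "14.25", "Unit": "USD"}}},
        ]},
        {"Groups": [
            {"Keys": ["Amazon EC2"],
             "Metrics": {"UnblendedCost": {"Amount": "130.00", "Unit": "USD"}}},
        ]},
    ]
}
print(normalize_cost_response(sample))
```

A flat per-service total like this is what the downstream analysis and report steps consume.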

In practice, the difficulty here is not in calling Cost Explorer APIs, but in:

  • Normalizing data across accounts and environments
  • Handling inconsistent or missing tagging strategies
  • Aligning cost views with how the business actually operates

This is often where “simple” cost analysis breaks down.

Resource Analysis

Once cost data is available, the system evaluates the infrastructure generating that spend.

This includes:

  • Identifying idle or underutilized compute resources
  • Detecting rightsizing opportunities
  • Evaluating storage usage patterns
  • Analyzing database and serverless utilization

Conceptually:

def analyze_resources(cost_data):
    # Evaluate utilization across compute, storage, and serverless
    # Identify inefficiencies and idle resources
    # Detect rightsizing opportunities
    # Correlate utilization with cost impact
    return optimization_opportunities

While identifying inefficiencies is relatively straightforward, the real challenge is:

  • Defining thresholds that are safe for production systems
  • Avoiding recommendations that introduce risk
  • Contextualizing usage patterns (e.g., spiky workloads vs true underutilization)
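To make the spiky-vs-idle distinction concrete, here is a minimal heuristic sketch. The thresholds and function name are illustrative assumptions, not production-safe defaults — real values need tuning per workload.

```python
import statistics

def classify_utilization(cpu_samples, idle_threshold=5.0, spike_threshold=60.0):
    """Classify a resource from its CPU utilization samples (percent).
    A workload with a low average but high peaks is 'spiky' and is NOT
    a safe downsizing candidate; only consistently low usage is 'idle'."""
    avg = statistics.mean(cpu_samples)
    peak = max(cpu_samples)
    if peak >= spike_threshold:
        return "spiky"   # bursty workload: leave it alone
    if avg < idle_threshold:
        return "idle"    # candidate for scheduling or termination
    return "steady"

print(classify_utilization([2, 3, 1, 2, 95, 2]))  # spiky
print(classify_utilization([1, 2, 1, 3, 2, 1]))   # idle
```

Naively filtering on average CPU alone would flag the first workload for downsizing and cause an outage at its next burst — which is exactly the risk the bullets above describe.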

Report Generation

The final capability translates analysis into something actionable.

Instead of producing raw data, the system:

  • Prioritizes opportunities by potential savings
  • Generates a structured, human-readable report
  • Provides clear, actionable recommendations
  • Distributes summaries to stakeholders
Conceptually:

def generate_report(cost_data, resource_analysis):
    # Combine cost and utilization insights
    # Rank opportunities by impact
    # Generate a structured report (e.g., markdown)
    # Send notifications and store results
    return report_location

This step is critical — without it, the system becomes just another dashboard.
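A minimal sketch of the ranking-and-rendering step might look like this. The opportunity schema (`title`, `monthly_savings`, `action`) is an illustrative assumption, not a fixed format.

```python
def render_report(opportunities):
    """Render optimization opportunities as a markdown report,
    ranked by estimated monthly savings (highest impact first)."""
    ranked = sorted(opportunities, key=lambda o: o["monthly_savings"], reverse=True)
    lines = ["# Cost Optimization Report", ""]
    for i, opp in enumerate(ranked, 1):
        lines.append(f"## {i}. {opp['title']} (est. ${opp['monthly_savings']:.2f}/mo)")
        lines.append(opp["action"])
        lines.append("")
    return "\n".join(lines)

report = render_report([
    {"title": "Unattached EBS volumes", "monthly_savings": 42.0,
     "action": "Snapshot and delete 6 unattached gp3 volumes."},
    {"title": "Idle dev RDS instance", "monthly_savings": 118.0,
     "action": "Stop db.r5.large outside business hours."},
])
print(report)
```

In the full system, this markdown would be written to S3 and a short summary of the top items published to SNS.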

Step 2: Set up DynamoDB for tracking recommendations

You’ll need DynamoDB tables to track reports and recommendations. In production, you’d define these in your infrastructure-as-code using Terraform or CloudFormation. The tables need:

  1. CostOptimizationRecommendations table:

     • Partition key: recommendation_id (String)
     • PAY_PER_REQUEST billing mode for cost efficiency

  2. CostOptimizationReports table:

     • Partition key: report_id (String)
     • PAY_PER_REQUEST billing mode for cost efficiency

The first table stores individual recommendations with their implementation status, while the second table tracks metadata about generated reports.
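As a sketch, a recommendation item for the first table might be shaped like this. Only the recommendation_id partition key is fixed by the table design above; the other attribute names are assumptions for illustration. The result would be passed to `table.put_item(Item=item)` via boto3's DynamoDB resource.

```python
import uuid
from datetime import datetime, timezone

def build_recommendation_item(service, description, est_monthly_savings):
    """Build a DynamoDB item for the CostOptimizationRecommendations
    table. Attribute names beyond recommendation_id are illustrative."""
    return {
        "recommendation_id": str(uuid.uuid4()),
        "service": service,
        "description": description,
        # Stored as a string to sidestep DynamoDB's Decimal handling
        "estimated_monthly_savings": str(est_monthly_savings),
        "status": "OPEN",  # later transitions: IMPLEMENTED / DISMISSED
        "created_at": datetime.now(timezone.utc).isoformat(),
    }

item = build_recommendation_item("EC2", "Downsize m5.2xlarge to m5.xlarge", 84.10)
print(item["status"])
```

The `status` attribute is what enables implementation tracking later on.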

Step 3: Agent Orchestration

The Bedrock Agent coordinates the workflow.

Instead of hardcoding execution order, the agent:

  1. Retrieves cost data
  2. Analyzes resources
  3. Generates a report

What matters most here is how the agent is instructed to:

  • Focus on high-impact opportunities
  • Produce practical, implementable recommendations
  • Communicate clearly and concisely

This is where generative AI adds value — not just in analysis, but in framing insights for action.

Step 4: Scheduled Execution

EventBridge enables the system to run automatically on a defined cadence.

This ensures:

  • Continuous monitoring of cost trends
  • Consistent identification of optimization opportunities
  • Reduced reliance on manual processes

Over time, this becomes part of the organization’s FinOps operating model.

Step 5: Set up the EventBridge rule

Create an EventBridge rule to run the analysis on a schedule. In a production environment, you’d define this in your infrastructure-as-code using Terraform or CloudFormation, setting parameters like: 

  • Rule name: “WeeklyCostOptimizationAnalysis”
  • Schedule expression: “cron(0 8 ? * MON *)” (runs every Monday at 8 AM UTC)
  • Target: Your main Lambda function

EventBridge ensures the cost analysis runs automatically at your chosen interval without manual intervention. 
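For readers more comfortable with the API than with IaC, the same rule maps onto EventBridge's `put_rule` and `put_targets` calls. The sketch below only builds the parameter dictionaries (so the logic is visible without AWS credentials); the Lambda ARN is a placeholder.

```python
def schedule_rule_params(function_arn,
                         rule_name="WeeklyCostOptimizationAnalysis",
                         cron="cron(0 8 ? * MON *)"):
    """Return kwargs for events.put_rule and events.put_targets.
    In production this belongs in Terraform/CloudFormation; the
    boto3 form just makes the parameters concrete."""
    put_rule = {"Name": rule_name, "ScheduleExpression": cron, "State": "ENABLED"}
    put_targets = {"Rule": rule_name,
                   "Targets": [{"Id": "cost-agent-lambda", "Arn": function_arn}]}
    return put_rule, put_targets

rule, targets = schedule_rule_params(
    "arn:aws:lambda:us-east-1:123456789012:function:cost-agent")
print(rule["ScheduleExpression"])
```

With boto3, these would be passed as `events.put_rule(**rule)` followed by `events.put_targets(**targets)`, plus a `lambda:AddPermission` grant so EventBridge can invoke the function.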

Step 6: Set up SNS for notifications

Set up an SNS topic for notifications by creating a topic and adding subscribers. In production, you’d define this in your infrastructure-as-code, configuring: 

  • Topic name: “CostOptimizationAlerts”
  • Protocol: Email, SMS, or webhook (based on your preferred notification channel)
  • Subscribers: Finance team, cloud administrators, or a Slack webhook

The notification system ensures key stakeholders are informed of optimization opportunities as they’re identified.

Analysis Capabilities

The agent can identify several types of cost optimization opportunities:

1. EC2 Instance Optimization

  • Idle Instances: Identifies running instances with CPU utilization consistently below 5%
  • Rightsizing Opportunities: Finds instances that could be downsized based on utilization patterns
  • Stopped Instances: Locates instances that have been stopped for extended periods
  • Instance Family Upgrades: Suggests moving to newer generation instance families
  • Reserved Instance Coverage: Identifies on-demand instances that should be covered by RIs

2. Storage Optimization

  • Unattached EBS Volumes: Finds volumes not attached to instances
  • Old Snapshots: Identifies EBS snapshots older than 6 months
  • Underutilized Volumes: Locates volumes with consistently low I/O patterns
  • Storage Class Transitions: Recommends moving infrequently accessed data to lower-cost storage tiers

3. Database Optimization

  • Overprovisioned RDS Instances: Identifies oversized database instances
  • Multi-AZ in Development: Flags multi-AZ deployments in non-production environments
  • Idle Databases: Finds database instances with minimal connection counts
  • Reserved Instance Opportunities: Suggests RIs for stable database workloads

4. Serverless Optimization

  • Overallocated Memory: Identifies Lambda functions with excessive memory allocation
  • Rarely Used Functions: Finds functions that are rarely invoked but consume resources
  • Long-Running Functions: Suggests optimizations for functions that consistently run long
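As an example of the memory-overallocation check, here is a simple rightsizing heuristic based on the "Max Memory Used" value Lambda reports in its CloudWatch logs. The 20% headroom and 64 MB rounding are illustrative choices, not AWS guidance.

```python
import math

def recommend_lambda_memory(max_used_mb, headroom=1.2, step=64, floor=128):
    """Suggest a Lambda memory setting from observed peak usage:
    add headroom, enforce the 128 MB floor, round up to a step."""
    target = max(floor, max_used_mb * headroom)
    return math.ceil(target / step) * step

print(recommend_lambda_memory(310))  # peak 310 MB -> 372 MB target -> 384
```

A function provisioned at 1024 MB that peaks at 310 MB would thus be flagged with a recommended setting of 384 MB — though since memory also scales Lambda CPU, duration should be re-measured after any downsizing.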

Cost considerations

This solution is cost-efficient: 

  • Lambda costs: Most usage will fall under the free tier
  • EventBridge: No additional cost for scheduled rules
  • Bedrock API: ~$0.015 per 1,000 tokens with Claude Sonnet
  • S3: Minimal storage costs for reports
  • DynamoDB: Pay-per-request pricing keeps costs very low
  • SNS: Practically free for email notifications

For a weekly execution schedule, the infrastructure costs typically remain under $5 per month for most organizations due to the minimal compute resources required.
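The back-of-the-envelope Bedrock math works out like this, assuming roughly 50K tokens (prompt plus completion) per run — an illustrative figure, since actual token counts depend on account size and report length:

```python
# Estimated monthly Bedrock cost at the ~$0.015 per 1K token rate above
tokens_per_run = 50_000   # assumption: prompt + completion per analysis
runs_per_month = 4        # weekly schedule
price_per_1k = 0.015
monthly_cost = tokens_per_run / 1000 * price_per_1k * runs_per_month
print(f"${monthly_cost:.2f}/month")  # $3.00/month
```

Even doubling the token estimate keeps the model cost well inside the under-$5 figure above.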

Extending the solution

Here are some ways to enhance your cost optimization agent:

Multi-account analysis: Extend to analyze costs across an AWS Organization. This offers a comprehensive view of spending across your entire cloud estate.

Implementation tracking: Track which recommendations were implemented and their actual savings. This helps quantify the ROI of the optimization agent.

Automated remediation: Add capability to automatically implement low-risk optimizations like removing unattached EBS volumes. The agent could implement changes automatically during off-hours.

Slack integration: Send reports directly to Slack channels, enabling team discussions around cost optimization opportunities and tagging responsible teams.

Tagging compliance: Check for resources without proper cost allocation tags, ensuring your organization maintains visibility into spend by department, team, or project.

Budget alerts integration: Combine cost optimization with proactive budget alerts, automatically triggering more aggressive analysis when a budget threshold is approaching.

Custom thresholds: Allow different teams or environments to set custom thresholds for what constitutes underutilization based on their specific workload patterns.

Conclusion

This solution leverages several AWS services to create an intelligent cost optimization system that analyzes cloud spending and provides specific recommendations for reducing costs.

Key advantages of this approach include:

  • Automation: Regular, scheduled analysis without manual intervention
  • Actionable insights: Specific recommendations rather than just data visualization
  • Comprehensive coverage: Analysis across multiple resource types (EC2, RDS, Lambda, etc.)
  • Prioritization: Recommendations sorted by potential impact
  • Integration: Works with existing AWS services and notification systems

The architecture combines the data collection capabilities of AWS Cost Explorer with the analytical power of Amazon Bedrock to generate insights similar to those from a cloud financial analyst. By implementing this solution, organizations can transform cost management from a periodic, manual exercise into an ongoing, automated process.

The system is particularly effective at identifying unused resources, rightsizing opportunities, and reservation or Savings Plans recommendations — areas that often yield significant savings when properly optimized.

Looking to make cloud cost optimization more proactive? Ippon helps organizations build practical AWS solutions that turn cost data into action, from FinOps accelerators and agentic AI workflows to broader cloud modernization initiatives. If you’re exploring how to reduce spend, improve visibility, or operationalize optimization across your AWS environment, connect with our team to start the conversation.



