Case Study: AI Agent Cuts Web Audit Costs 87%

Built an AI agent to perform qualitative analysis on 1,800 websites, achieving 87% cost savings versus manual review. It finished in under 12 hours instead of an estimated 60, demonstrating the efficiency of agentic workflows.

Case study TLDR: Created an AI Agent to perform qualitative analysis on 1,800 websites for a client. It performed to satisfaction with a conservative 87% cost savings.

The scope: Analyze 1,800 targeted websites based on twelve qualitative assessment points. Create a sliding scale sum, a binary true/false outcome, and a 2-3 sentence assessment summary for each site. Log the output and summarize the findings in Google Sheets.
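
To make the per-site deliverable concrete, here is a minimal Python sketch of what one output row could look like. The field names, the 0-5 scale per point, and the pass threshold are all assumptions for illustration; the scope above does not say how the sliding scale or the true/false cutoff were actually defined.

```python
from dataclasses import dataclass
from typing import List

NUM_POINTS = 12       # from the scope: twelve assessment points
MAX_PER_POINT = 5     # assumed scale per point (not stated in the post)
PASS_THRESHOLD = 36   # assumed cutoff for the true/false outcome


@dataclass
class SiteAssessment:
    url: str
    point_scores: List[int]  # one score per assessment point
    summary: str             # the 2-3 sentence assessment summary

    def __post_init__(self) -> None:
        # Reject malformed model output before it reaches the sheet.
        if len(self.point_scores) != NUM_POINTS:
            raise ValueError(f"expected {NUM_POINTS} scores")
        if any(not 0 <= s <= MAX_PER_POINT for s in self.point_scores):
            raise ValueError("score out of range")

    @property
    def total(self) -> int:
        """Sliding scale sum across all assessment points."""
        return sum(self.point_scores)

    @property
    def passed(self) -> bool:
        """Binary outcome derived from the sum (assumed rule)."""
        return self.total >= PASS_THRESHOLD

    def to_row(self) -> list:
        """Flatten to one spreadsheet row: url, total, outcome, summary."""
        return [self.url, self.total, self.passed, self.summary]
```

Validating the record up front is one way to catch a response with a missing or out-of-range score before it gets logged.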

A little perspective...

1,800 target websites to analyze

12 points of qualitative consideration and assessment

Conservatively, call it 2 minutes per site for a trained person to perform the assessment. It's probably more like 3-4 minutes for the assessment plus another 2-3 minutes for the summary, but let's stick with 2 minutes.

At that rate, it would take a person 60 hours to perform this task. Working 8 hours per day, that's 7.5 days to execute.

Let's say we're paying that person $30/hr to perform the task. That comes out to $1,800 to complete the assessment.
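
The manual estimate above is simple arithmetic, worked through here in a few lines of Python. All inputs are the figures from the list; only the variable names are mine.

```python
SITES = 1_800
MINUTES_PER_SITE = 2     # the conservative figure
HOURS_PER_DAY = 8
HOURLY_RATE_USD = 30

manual_hours = SITES * MINUTES_PER_SITE / 60
manual_days = manual_hours / HOURS_PER_DAY
manual_cost_usd = manual_hours * HOURLY_RATE_USD

print(manual_hours, manual_days, manual_cost_usd)  # 60.0 7.5 1800.0
```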

...

The agent was able to perform this task in a little under 12 hours.

The LLM API usage cost was ~$180.

My time to design the agent was roughly a wash when you consider the time it would take to train somebody to perform the assessment.

This resulted in a ~87% cost reduction.
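
As a quick sanity check on that number, here is a small Python sketch of the savings calculation. It uses two inputs from this post: the ~$180 total API cost, and the $0.12 per site figure mentioned in the findings. The post does not say exactly how the 87% was derived, so treat this as a back-of-the-envelope range rather than the original calculation.

```python
def cost_reduction(baseline_usd: float, actual_usd: float) -> float:
    """Fraction of the baseline cost that was saved."""
    return (baseline_usd - actual_usd) / baseline_usd


MANUAL_COST_USD = 1_800   # 60 hours at $30/hr
SITES = 1_800

reported_api_cost = 180            # the "~$180" total
per_site_api_cost = 0.12 * SITES   # $0.12 per site, about $216 in total

print(round(cost_reduction(MANUAL_COST_USD, reported_api_cost) * 100))  # 90
print(round(cost_reduction(MANUAL_COST_USD, per_site_api_cost) * 100))  # 88
```

Both inputs land at or just above the ~87% quoted, which is consistent with calling that figure conservative.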

Findings/Considerations

This type of qualitative analysis would've been impossible (or cost-prohibitive) to do programmatically because each site's structure and content are different. The context for each assessment point needs to be inferred, not calculated.

This was an exercise in deterministic programming vs agentic workflows.

When the agent started doing its thing, I could see the API usage skyrocket. I'm used to programmatically processing millions of requests for $0.00001 per request, so when I saw $0.12 per site, I started sweating. But then I realized this was still incredibly cost effective.

At the time of designing, the previous week's models could not adequately complete the task. It took an update to the models, plus trial and error, to find the right model for the job, balancing capabilities and cost (always choose the cheapest model that can accomplish the job). That's how fast this space is moving.
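
The "cheapest model that can accomplish the job" rule can be sketched in Python as follows. The model names, per-site costs, and the evaluation function are hypothetical placeholders; in practice the evaluation would be whatever trial run convinces you a model handles the assessment adequately.

```python
from typing import Callable, NamedTuple, Optional, Sequence


class ModelOption(NamedTuple):
    name: str                  # hypothetical model identifier
    cost_per_site_usd: float   # hypothetical measured cost


def cheapest_capable(
    models: Sequence[ModelOption],
    passes_eval: Callable[[ModelOption], bool],
) -> Optional[ModelOption]:
    """Try models from cheapest to most expensive; return the first that passes."""
    for model in sorted(models, key=lambda m: m.cost_per_site_usd):
        if passes_eval(model):
            return model
    return None  # no candidate was good enough


# Made-up candidates and a stub evaluation, for illustration only.
candidates = [
    ModelOption("model-large", 0.40),
    ModelOption("model-small", 0.02),
    ModelOption("model-medium", 0.12),
]
capable = {"model-medium", "model-large"}

choice = cheapest_capable(candidates, lambda m: m.name in capable)
print(choice.name)  # model-medium
```

Trying the cheapest option first means the expensive models only get evaluated when the cheaper ones fail.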

Happy to chat more about this with anybody who's interested.