Building 8 min read

We Built an AI Invoice Verifier. Here's Where It Hits a Wall.

We built an AI invoice verifier and watched a fake beat a real invoice. Here's why document analysis alone cannot stop invoice fraud, and the five layers of detection that most businesses never reach.

Australian small businesses lodged 1,909 scam reports to Scamwatch in 2024, with fake invoices and false billing accounting for $13.1 million in confirmed losses. Payment redirection fraud cost Australian businesses $224 million in 2022 alone. Both figures understate the real number; the ACCC consistently notes that most scams go unreported.

Then in late 2025, Commonwealth Bank disclosed up to $1 billion in suspected fraudulent home loans. AI-generated payslips, fake bank statements, fabricated identity documents, all convincing enough to pass through the mortgage process undetected. All four major Australian banks flagged similar issues.

We wanted to know whether current AI could catch this kind of fraud. Not in theory. In practice. So we built a seven-stage verification engine and tested it with two invoices.

One test was enough to see the problem. Not a bug in our engine, but a structural limitation in what document analysis can ever achieve.


What We Tested

The engine takes an invoice image, extracts every field into structured data, checks the arithmetic, analyses font consistency and formatting, looks for signs of pixel-level editing, and produces a confidence score.
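The arithmetic stage is the easiest one to picture. What follows is a minimal sketch, not our engine's code: the function name and field layout are hypothetical, and it assumes GST-exclusive line prices with every item taxable at the standard 10% Australian GST rate.

```python
from decimal import Decimal, ROUND_HALF_UP

GST_RATE = Decimal("0.10")  # standard Australian GST rate
CENT = Decimal("0.01")


def check_arithmetic(line_items, subtotal, gst, total):
    """Return a list of arithmetic problems; an empty list means the sums agree.

    line_items is a list of (quantity, unit_price) pairs given as strings.
    Assumption: unit prices exclude GST and every line is taxable.
    """
    problems = []
    expected_subtotal = sum(
        (Decimal(qty) * Decimal(price) for qty, price in line_items),
        Decimal("0"),
    ).quantize(CENT, rounding=ROUND_HALF_UP)
    expected_gst = (expected_subtotal * GST_RATE).quantize(CENT, rounding=ROUND_HALF_UP)
    expected_total = expected_subtotal + expected_gst

    if Decimal(subtotal) != expected_subtotal:
        problems.append(f"subtotal {subtotal} != {expected_subtotal}")
    if Decimal(gst) != expected_gst:
        problems.append(f"GST {gst} != {expected_gst}")
    if Decimal(total) != expected_total:
        problems.append(f"total {total} != {expected_total}")
    return problems


items = [("2", "150.00"), ("1", "120.00")]
print(check_arithmetic(items, "420.00", "42.00", "462.00"))  # no problems found
print(check_arithmetic(items, "420.00", "40.00", "460.00"))  # GST and total flagged
```

A check like this is cheap and worth automating, but it only tells you the invoice is internally consistent. A fabricated invoice with correct sums passes it cleanly.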

We tested it with two documents. A genuine invoice from a real Perth sole trader who operates through a family trust, uses a Gmail address, and has no website. And a fabricated invoice we created ourselves: fictional business name, made-up ABN, invented address, but with professional formatting, correct GST, and realistic pricing.

The fake scored higher than the real one.

The engine flagged the genuine invoice for its Gmail address, absence of a website, and family trust structure. It rewarded the fabricated invoice for consistent branding, a professional template, and clean formatting. Polish beat authenticity.

Two documents are not a benchmark. But they are a controlled demonstration of a limitation any document-only system will face: what we call the real-details problem. When a fraudster uses genuine, publicly available information, every check that only examines the document will pass.

We could have added more checks. ABN lookups against the Australian Business Register. Address verification. Business name matching. But we stopped, because we could already see where this was heading.


Why More Checks Do Not Solve the Problem

Every detail on an invoice can be real and the invoice can still be fraudulent.

The ABN can be copied from the public register. The business name, the address, the pricing. All publicly available, all trivial to assemble into something that looks legitimate. A fraudster who uses real details passes every check that only examines the document.

This is exactly what happened to CBA. The fraudulent documents used real business details and plausible figures. The documents were accurate. The transactions behind them were not.

Checking the document catches careless fraud: someone who invents an ABN, misspells a business name, or gets the GST calculation wrong. That is useful, but it is a filter for laziness, not for intent. Anyone with five minutes and access to the ABR website can build an invoice that passes every document-level check.
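To show how shallow the "invented ABN" filter is, here is a sketch of the published ABN checksum: subtract 1 from the first digit, apply fixed weights to the eleven digits, and check that the sum is divisible by 89. The function name is ours. This is a local format check only and an assumption about how one might pre-screen; it says nothing about whether the ABN is registered or belongs to the business named on the invoice.

```python
ABN_WEIGHTS = (10, 1, 3, 5, 7, 9, 11, 13, 15, 17, 19)


def abn_checksum_ok(abn: str) -> bool:
    """Return True if the string is 11 digits that satisfy the ABN checksum.

    Passing this check does NOT prove the ABN is registered, active,
    or connected to the party that issued the invoice.
    """
    cleaned = abn.replace(" ", "")
    if len(cleaned) != 11 or not (cleaned.isascii() and cleaned.isdigit()):
        return False
    digits = [int(c) for c in cleaned]
    digits[0] -= 1  # the algorithm subtracts 1 from the leading digit
    return sum(w * d for w, d in zip(ABN_WEIGHTS, digits)) % 89 == 0


print(abn_checksum_ok("51 824 753 556"))  # True: satisfies the checksum
print(abn_checksum_ok("51 824 753 557"))  # False: last digit altered
```

A randomly typed number will usually fail this. An ABN copied from the public register will pass it, and will pass a register lookup too, which is the whole point.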

The question AI cannot answer from a document alone is: did this transaction actually happen?


The Real Problem Is Accounting, Not AI

An invoice is a claim. It says: this service was provided, this amount is owed, this is where to send the money. But a claim is not evidence. The evidence is in the bank accounts.

Did $4,200 leave the debtor’s account on the date the invoice claims? Did it arrive in the creditor’s account? Do the amounts match? Do the dates align?

This is not a document analysis problem. It is an accounting problem. And solving it requires access to data that no document verifier has: the banking records of both parties.
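The questions above translate into a very small amount of logic once the data exists. The sketch below is illustrative only: `BankEntry`, the account identifiers, and the three-day window are all hypothetical, and it assumes you already hold transaction records for both parties, which is exactly the access a document verifier lacks.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from decimal import Decimal


@dataclass(frozen=True)
class BankEntry:
    account: str
    posted: date
    amount: Decimal  # negative = money out, positive = money in


def find_match(entries, account, amount, around, window_days=3):
    """Return the first entry on `account` with the exact amount, posted
    within `window_days` of `around`, or None if there is no such entry."""
    window = timedelta(days=window_days)
    for entry in entries:
        if (
            entry.account == account
            and entry.amount == amount
            and abs(entry.posted - around) <= window
        ):
            return entry
    return None


def transaction_confirmed(amount, invoice_date, debtor_acct, creditor_acct,
                          debtor_entries, creditor_entries):
    """True only if money left the debtor AND arrived at the creditor."""
    debit = find_match(debtor_entries, debtor_acct, -amount, invoice_date)
    credit = find_match(creditor_entries, creditor_acct, amount, invoice_date)
    return debit is not None and credit is not None


debtor_side = [BankEntry("debtor-001", date(2026, 2, 10), Decimal("-4200.00"))]
creditor_side = [BankEntry("creditor-001", date(2026, 2, 11), Decimal("4200.00"))]
print(transaction_confirmed(Decimal("4200.00"), date(2026, 2, 10),
                            "debtor-001", "creditor-001",
                            debtor_side, creditor_side))  # True
```

The code is trivial. Getting lawful, reliable access to both sets of entries is not.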

For years, businesses have verified invoices by examining the document: does the ABN look right, do the numbers add up, does the formatting look professional. That was reasonable when faking a convincing document required effort and skill. With AI-generated documents, it takes minutes. The document is now the easy part to fake. The transaction is still hard.

Any verification process that only examines the document is checking the side the fraudster controls completely.


What a Real Solution Requires

AI can solve this problem. But not by looking harder at documents. It needs access to banking data, and that makes it a fundamentally different kind of system.

This applies wherever documents are submitted by parties with a financial incentive to misrepresent: mortgage applications, construction progress claims, government grant acquittals, accounts payable shared services. The context changes. The real-details problem does not.

Layer 1: Document analysis. AI extracts and validates the invoice: formatting, arithmetic, signs of manipulation. This is production-ready today. It catches mistakes and lazy fraud. Every business processing invoices should automate this.

Layer 2: Registry verification. Cross-reference the ABN, business name, and address against government registers. This catches fraud where someone invented the details. It does not catch fraud where someone used real ones.

Layer 3: Transaction matching. Confirm that the payment left the debtor’s bank account and arrived in the creditor’s bank account. The amount matches. The date aligns. This is where sophisticated fraud is caught. The one thing a fraudster cannot fabricate without committing a separate, traceable crime is the actual movement of money. In practice, this layer typically lives with banks and platforms with regulated access to transaction data under frameworks like Open Banking and the Consumer Data Right, not with a standalone invoice AI.

Layer 4: Pattern detection. Flag anomalies across thousands of transactions. The same tradesperson appearing on fifty mortgage applications in one month. A sudden spike in invoices from a newly registered business. Amounts that cluster just below approval thresholds. This is where AI adds real value: not by examining individual documents, but by seeing patterns across volumes that no human reviewer could process.

Layer 5: Collusion detection. This is where it gets genuinely hard. Layers 1 through 4 assume the fraud is one-sided: a fabricator acting alone. But when parties collude (a mortgage broker and an applicant, a vendor and a purchasing officer), the transaction is real. Money does move. The invoice is legitimate. The fraud is in the relationship, not the paperwork.

Detecting collusion requires network analysis: mapping relationships between entities, identifying conflicts of interest, flagging transactions where the parties are connected in ways that should trigger review. This is a different discipline, closer to financial intelligence than document processing.
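As a rough illustration of what "mapping relationships between entities" can mean, the sketch below links parties to attributes they share (an address, a director, a bank account) and searches for a path between two parties on the same transaction. Everything here is hypothetical: the node labels, the `max_hops` limit, and the sample data. Real financial-intelligence tooling is far richer than a breadth-first search.

```python
from collections import defaultdict, deque


def build_graph(links):
    """Build an undirected graph from (entity, attribute) pairs.

    Entities and attributes are both nodes, so two entities that share
    an attribute end up two hops apart.
    """
    graph = defaultdict(set)
    for entity, attribute in links:
        graph[entity].add(attribute)
        graph[attribute].add(entity)
    return graph


def connection_path(graph, start, goal, max_hops=4):
    """Breadth-first search for the shortest path from start to goal.

    Returns the path as a list of nodes, or None if the two nodes are
    not connected within max_hops edges.
    """
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        if len(path) - 1 >= max_hops:
            continue  # do not extend paths that already use max_hops edges
        for neighbour in graph[path[-1]]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


links = [
    ("vendor:Acme Supplies", "address:1 Example St"),
    ("officer:Purchasing Officer A", "address:1 Example St"),
    ("vendor:Other Co", "address:9 Sample Rd"),
]
graph = build_graph(links)
print(connection_path(graph, "vendor:Acme Supplies", "officer:Purchasing Officer A"))
print(connection_path(graph, "vendor:Other Co", "officer:Purchasing Officer A"))  # None
```

A path is a reason to review, not proof of wrongdoing. Plenty of legitimate parties share an address or an accountant.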


The Five Layers of Fraud Detection

Each layer catches what the one before it misses:

| Layer | What it checks | What it catches | What it misses |
| --- | --- | --- | --- |
| Document analysis | Formatting, arithmetic, manipulation | Mistakes and lazy fakes | Real details used fraudulently |
| Registry verification | ABN, business name, address | Invented details | Real details on a fake invoice |
| Transaction matching | Money in, money out | Sophisticated single-party fraud | Collusion where money really moves |
| Pattern detection | Volume anomalies, timing, clustering | Organised fraud at scale | Coordinated low-volume collusion |
| Collusion detection | Relationships between parties | Connected parties, conflicts of interest | Novel fraud structures |

Most businesses operate at layer 1. Some reach layer 2. The CBA problem sits at layers 3 and 4. The hardest fraud, the kind that costs billions, requires all five.
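To make layer 4 concrete, two of the signals named earlier (amounts clustering just below an approval threshold, and one party recurring across many submissions) can be sketched in a few lines. The function names, the 5% band, and the sample figures are all assumptions for illustration. Any real cut-off would need tuning against your own history.

```python
from collections import Counter
from decimal import Decimal


def just_below_threshold_share(amounts, threshold, band=Decimal("0.05")):
    """Share of amounts that fall within `band` below the approval threshold.

    With the default band of 5% and a threshold of 10000, this counts
    amounts from 9500 up to, but not including, 10000.
    """
    if not amounts:
        return 0.0
    lower = threshold * (1 - band)
    hits = sum(1 for amount in amounts if lower <= amount < threshold)
    return hits / len(amounts)


def frequent_parties(submissions, min_count):
    """Return parties that appear on at least `min_count` submissions.

    submissions is a list of (party, submission_id) pairs.
    """
    counts = Counter(party for party, _ in submissions)
    return {party: n for party, n in counts.items() if n >= min_count}


amounts = [Decimal("9900"), Decimal("9950"), Decimal("4000"), Decimal("10500")]
print(just_below_threshold_share(amounts, Decimal("10000")))  # 0.5

submissions = [("Tradie A", "app-1"), ("Tradie A", "app-2"), ("Tradie B", "app-3")]
print(frequent_parties(submissions, min_count=2))  # {'Tradie A': 2}
```

Neither signal means anything on a single invoice. Both only become visible with volume, which is why this layer sits with whoever sees the whole flow.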


Why We Stopped at Layer 1

We tested whether AI document analysis could catch invoice fraud. One test showed us it could not. That was not because our engine was bad, but because of the real-details problem. A well-crafted fake using genuine, publicly available information will always beat document-level analysis.

We could have built layers 2 and 3. But layer 2 only catches fraud where someone was too lazy to use real details. And layer 3 requires access to both parties’ banking data, which makes it a fundamentally different system with different privacy, infrastructure, and regulatory requirements.

The honest conclusion: AI can catch invoice fraud, but only as part of a system that has access to banking data, transaction records, and pattern analysis across thousands of submissions. Document analysis alone, no matter how sophisticated, checks the one thing the fraudster controls completely.

For most mid-market businesses processing invoices, layers 1 and 2 are the highest-return starting point and can be implemented without changing banking relationships. Automate your document extraction: AI does that brilliantly. Verify ABNs against the register: it is free and takes seconds. These two layers catch mistakes, catch careless fraud, and free your team to focus on the judgement calls that actually require a human.

If you are processing high-value or high-volume documents from parties with a financial incentive to misrepresent, layers 1 and 2 are not enough. Check the document, then check the transaction. Our article "What Production AI Teaches You That Demos Never Will", on how we think about the gap between demo and production, maps these boundaries in more detail.

One without the other is no longer enough.


Perth AI Consulting builds AI systems and tells you honestly where AI works and where it does not. Start with a conversation.

Published 1 March 2026

Written with Claude, Perplexity, and Grok. Directed and edited by Perth AI Consulting.
