We Built an AI Invoice Verifier. Here's Where It Hits a Wall.
We built an AI invoice verifier and watched a fake beat a real invoice. Here's why document analysis alone cannot stop invoice fraud, and the five layers of detection that most businesses never reach.
Australian small businesses lodged 1,909 scam reports to Scamwatch in 2024, with fake invoices and false billing accounting for $13.1 million in confirmed losses. Payment redirection fraud cost Australian businesses $224 million in 2022 alone. Both figures understate the real number; the ACCC consistently notes that most scams go unreported.
Then in late 2025, Commonwealth Bank disclosed up to $1 billion in suspected fraudulent home loans. AI-generated payslips, fake bank statements, fabricated identity documents, all convincing enough to pass through the mortgage process undetected. All four major Australian banks flagged similar issues.
We wanted to know whether current AI could catch this kind of fraud. Not in theory. In practice. So we built a seven-stage verification engine and tested it with two invoices.
One test was enough to see the problem. It was not a bug in our engine but a structural limit on what document analysis can ever achieve.
What We Tested
The engine takes an invoice image, extracts every field into structured data, checks the arithmetic, analyses font consistency and formatting, looks for signs of pixel-level editing, and produces a confidence score.
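The arithmetic stage is the easiest one to make concrete. What follows is a minimal sketch, not our engine's code: it assumes the extraction step yields line items as (quantity, unit price excluding GST) pairs plus a stated GST amount and total, applies the standard 10% Australian GST rate, and allows a one-cent rounding tolerance. The function name and data shapes are illustrative assumptions.

```python
from decimal import Decimal, ROUND_HALF_UP

GST_RATE = Decimal("0.10")    # standard Australian GST rate
TOLERANCE = Decimal("0.01")   # assumed rounding tolerance of one cent


def check_arithmetic(line_items, stated_gst, stated_total):
    """Return a list of arithmetic problems found on an extracted invoice.

    line_items: list of (quantity, unit_price_ex_gst) pairs (hypothetical shape).
    """
    # Convert via str() so float inputs do not carry binary rounding noise.
    subtotal = sum(Decimal(str(q)) * Decimal(str(p)) for q, p in line_items)
    expected_gst = (subtotal * GST_RATE).quantize(Decimal("0.01"), ROUND_HALF_UP)
    expected_total = subtotal + expected_gst

    problems = []
    if abs(Decimal(str(stated_gst)) - expected_gst) > TOLERANCE:
        problems.append(f"GST is {stated_gst}, expected {expected_gst}")
    if abs(Decimal(str(stated_total)) - expected_total) > TOLERANCE:
        problems.append(f"total is {stated_total}, expected {expected_total}")
    return problems
```

A check like this only confirms that the numbers are internally consistent. An invoice whose author got the GST right returns an empty list whether or not the work was ever done.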
We tested it with two documents. A genuine invoice from a real Perth sole trader operating through a family trust with a Gmail address and no website. And a fabricated invoice we created ourselves: fictional business name, made-up ABN, invented address, but with professional formatting, correct GST, and realistic pricing.
The fake scored higher than the real one.
The engine flagged the genuine invoice for its Gmail address, absence of a website, and family trust structure. It rewarded the fabricated invoice for consistent branding, a professional template, and clean formatting. Polish beat authenticity.
Two documents are not a benchmark. But they are a controlled demonstration of a limitation any document-only system will face: what we call the real-details problem. When a fraudster uses genuine, publicly available information, every check that only examines the document will pass.
We could have added more checks. ABN lookups against the Australian Business Register. Address verification. Business name matching. But we stopped, because we could already see where this was heading.
Why More Checks Do Not Solve the Problem
Every detail on an invoice can be real and the invoice can still be fraudulent.
The ABN can be copied from the public register. The business name, the address, the pricing. All publicly available, all trivial to assemble into something that looks legitimate. A fraudster who uses real details passes every check that only examines the document.
This is the pattern the CBA disclosure points to. The fraudulent documents used real business details and plausible figures. The details on the page checked out. The transactions behind them did not.
Checking the document catches careless fraud: someone who invents an ABN, misspells a business name, or gets the GST calculation wrong. That is useful, but it is a filter for laziness, not for intent. Anyone with five minutes and access to the ABR website can build an invoice that passes every document-level check.
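The ABN check shows how thin this filter is. Every ABN carries a built-in checksum published by the Australian Business Register: subtract 1 from the first digit, multiply the eleven digits by fixed weights, and the sum must be divisible by 89. A minimal sketch of that check follows; the function name is our own, and the ABN in the test was chosen only because it satisfies the checksum.

```python
ABN_WEIGHTS = (10, 1, 3, 5, 7, 9, 11, 13, 15, 17, 19)


def abn_checksum_ok(abn: str) -> bool:
    """Check an ABN's built-in checksum. Says nothing about who is using it."""
    digits = [int(c) for c in abn if c.isdecimal()]  # ignore spaces
    if len(digits) != 11:
        return False
    digits[0] -= 1  # subtract 1 from the first digit
    total = sum(d * w for d, w in zip(digits, ABN_WEIGHTS))
    return total % 89 == 0  # valid when divisible by 89
```

An invented number such as "12 345 678 901" fails. Any real ABN copied from the public register passes, because the checksum tests the number, not the person who typed it onto the invoice.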
The question AI cannot answer from a document alone is: did this transaction actually happen?
The Real Problem Is Accounting, Not AI
An invoice is a claim. It says: this service was provided, this amount is owed, this is where to send the money. But a claim is not evidence. The evidence is in the bank accounts.
Did $4,200 leave the debtor’s account on the date the invoice claims? Did it arrive in the creditor’s account? Do the amounts match? Do the dates align?
This is not a document analysis problem. It is an accounting problem. And solving it requires access to data that no document verifier has: the banking records of both parties.
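In code, the questions above are simple. The hard part is having the data at all. The sketch below assumes a hypothetical feed of bank entries for both parties (the `BankEntry` shape, the account identifiers, and the three-day date window are all illustrative assumptions) and confirms a transaction only when a matching debit and a matching credit both exist.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from decimal import Decimal


@dataclass(frozen=True)
class BankEntry:
    account: str      # hypothetical account identifier
    posted: date
    amount: Decimal   # negative = money out, positive = money in


def find_match(entries, account, amount, around, window_days=3):
    """Return the first entry on `account` for `amount` within the date window."""
    window = timedelta(days=window_days)
    for e in entries:
        if e.account == account and e.amount == amount and abs(e.posted - around) <= window:
            return e
    return None


def transaction_confirmed(invoice_amount, invoice_date, debtor_acct, creditor_acct, entries):
    """True only if the money left the debtor AND arrived at the creditor."""
    debit = find_match(entries, debtor_acct, -invoice_amount, invoice_date)
    credit = find_match(entries, creditor_acct, invoice_amount, invoice_date)
    return debit is not None and credit is not None
```

Nothing in this function looks at the invoice image. It needs only the claimed amount, date, and parties, plus records the fraudster does not control.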
For years, businesses have verified invoices by examining the document: does the ABN look right, do the numbers add up, does the formatting look professional. That was reasonable when faking a convincing document required effort and skill. With AI-generated documents, it takes minutes. The document is now the easy part to fake. The transaction is still hard.
Any verification process that only examines the document is checking the side the fraudster controls completely.
What a Real Solution Requires
AI can solve this problem. But not by looking harder at documents. It needs access to banking data, and that makes it a fundamentally different kind of system.
This applies wherever documents are submitted by parties with a financial incentive to misrepresent: mortgage applications, construction progress claims, government grant acquittals, accounts payable shared services. The context changes. The real-details problem does not.
Layer 1: Document analysis. AI extracts and validates the invoice: formatting, arithmetic, signs of manipulation. This is production-ready today. It catches mistakes and lazy fraud. Every business processing invoices should automate this.
Layer 2: Registry verification. Cross-reference the ABN, business name, and address against government registers. This catches fraud where someone invented the details. It does not catch fraud where someone used real ones.
Layer 3: Transaction matching. Confirm that the payment left the debtor’s bank account and arrived in the creditor’s bank account. The amount matches. The date aligns. This is where sophisticated fraud is caught. The one thing a fraudster cannot fabricate without committing a separate, traceable crime is the actual movement of money. In practice, this layer typically lives with banks and platforms that have regulated access to transaction data under frameworks like Open Banking and the Consumer Data Right, not with a standalone invoice AI.
Layer 4: Pattern detection. Flag anomalies across thousands of transactions. The same tradesperson appearing on fifty mortgage applications in one month. A sudden spike in invoices from a newly registered business. Amounts that cluster just below approval thresholds. This is where AI adds real value: not by examining individual documents, but by seeing patterns across volumes that no human reviewer could process.
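One of these patterns, amounts clustering just below an approval threshold, can be sketched in a few lines. This is an illustrative rule, not a production detector: the 5% band, the minimum count, and the minimum share are assumed parameters that would need tuning against real data, and the input shape is hypothetical.

```python
def just_below_threshold(amounts_by_vendor, threshold, band=0.05, min_count=3, min_share=0.5):
    """Flag vendors whose invoices cluster in the band just under `threshold`.

    amounts_by_vendor: dict of vendor id -> list of invoice amounts (hypothetical shape).
    Returns a dict of flagged vendor id -> number of invoices inside the band.
    """
    lower = threshold * (1 - band)
    flagged = {}
    for vendor, amounts in amounts_by_vendor.items():
        near = [a for a in amounts if lower <= a < threshold]
        # Require both an absolute count and a share, so one-off coincidences are ignored.
        if len(near) >= min_count and len(near) / len(amounts) >= min_share:
            flagged[vendor] = len(near)
    return flagged
```

A rule like this only becomes useful at volume. A single invoice for $9,900 against a $10,000 approval limit means nothing; a vendor with most of its invoices in that band is worth a second look.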
Layer 5: Collusion detection. This is where it gets genuinely hard. Layers 1 through 4 assume the fraud is one-sided: a fabricator acting alone. But when parties collude (a mortgage broker and an applicant, or a vendor and a purchasing officer), the transaction is real. Money does move. The invoice is legitimate. The fraud is in the relationship, not the paperwork.
Detecting collusion requires network analysis: mapping relationships between entities, identifying conflicts of interest, flagging transactions where the parties are connected in ways that should trigger review. This is a different discipline, closer to financial intelligence than document processing.
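A toy version of that relationship mapping: link any two entities that share an attribute such as an address, phone number, or bank account, then ask whether the two parties to a transaction are connected within a few hops. The record shape, the entity names, and the two-hop limit are assumptions for illustration. Real financial-intelligence systems work with far richer and messier data.

```python
from collections import defaultdict, deque


def build_graph(records):
    """Link entities that share an attribute value (address, phone, bank account).

    records: iterable of (entity, attribute_value) pairs (hypothetical shape).
    """
    by_value = defaultdict(set)
    for entity, value in records:
        by_value[value].add(entity)

    graph = defaultdict(set)
    for entities in by_value.values():
        for a in entities:
            graph[a] |= entities - {a}
    return graph


def connected(graph, start, goal, max_hops=2):
    """Breadth-first search: are two parties linked within `max_hops` shared attributes?"""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, hops = queue.popleft()
        if node == goal:
            return True
        if hops < max_hops:
            for nxt in graph.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, hops + 1))
    return False
```

A vendor and a purchasing officer who share a residential address would be connected in one hop. That does not prove collusion, but it is the kind of link that should send a transaction to a human reviewer.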
The Five Layers of Fraud Detection
Each layer catches what the one before it misses:
| Layer | What it checks | What it catches | What it misses |
|---|---|---|---|
| Document analysis | Formatting, arithmetic, manipulation | Mistakes and lazy fakes | Real details used fraudulently |
| Registry verification | ABN, business name, address | Invented details | Real details on a fake invoice |
| Transaction matching | Money in, money out | Sophisticated single-party fraud | Collusion where money really moves |
| Pattern detection | Volume anomalies, timing, clustering | Organised fraud at scale | Coordinated low-volume collusion |
| Collusion detection | Relationships between parties | Connected parties, conflicts of interest | Novel fraud structures |
Most businesses operate at layer 1. Some reach layer 2. The CBA problem sits at layers 3 and 4. The hardest fraud, the kind that costs billions, requires all five.
Why We Stopped at Layer 1
We tested whether AI document analysis could catch invoice fraud. One test showed us it could not, and the reason was not a bad engine but the real-details problem. A well-crafted fake using genuine, publicly available information will always beat document-level analysis.
We could have built layers 2 and 3. But layer 2 only catches fraud where someone was too lazy to use real details. And layer 3 requires access to both parties’ banking data, which makes it a fundamentally different system with different privacy, infrastructure, and regulatory requirements.
The honest conclusion: AI can catch invoice fraud, but only as part of a system that has access to banking data, transaction records, and pattern analysis across thousands of submissions. Document analysis alone, no matter how sophisticated, checks the one thing the fraudster controls completely.
For most mid-market businesses processing invoices, layers 1 and 2 are the highest-return starting point and can be implemented without changing banking relationships. Automate your document extraction: AI does that brilliantly. Verify ABNs against the register: it is free and takes seconds. These two layers catch mistakes, catch careless fraud, and free your team to focus on the judgement calls that actually require a human.
If you are processing high-value or high-volume documents from parties with a financial incentive to misrepresent, layers 1 and 2 are not enough. Check the document, then check the transaction. Two related pieces, "What production AI teaches you that demos never will" and our thinking on the gap between demo and production, map these boundaries in more detail.
One without the other is no longer enough.
Perth AI Consulting builds AI systems and tells you honestly where AI works and where it does not. Start with a conversation.