Validation Agent – Semantic Rule Engine (MCP)

Overview

The ValidationAgent in FacturaScan 360 is responsible for applying a predefined set of semantic, fiscal, and temporal validation rules to structured invoice data provided in the InvoiceContext. It outputs a standardized ValidationContext compliant with the Model Context Protocol (MCP).

The agent ensures the invoice complies with legal, accounting, and operational constraints defined per jurisdiction or industry norm.


1. Input Schema

The agent receives a single InvoiceContext object (as defined in 06-mcp-schemas.md).

Minimal Required Fields

  • invoice_number, issue_date, due_date
  • subtotal, vat, total
  • supplier_name, supplier_tax_id
  • Optional: line_items, currency, payment_method
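A minimal sketch of a pre-check for the required fields listed above, assuming the InvoiceContext arrives as a plain dictionary. The helper name `missing_required_fields` is hypothetical, and treating `None` or an empty string as "missing" is an assumption of this example, not something 06-mcp-schemas.md is known to specify.

```python
from typing import Any, Mapping

# Field names are taken from the list above.
REQUIRED_FIELDS = (
    "invoice_number", "issue_date", "due_date",
    "subtotal", "vat", "total",
    "supplier_name", "supplier_tax_id",
)


def missing_required_fields(invoice_context: Mapping[str, Any]) -> list[str]:
    """Return the required field names that are absent or empty (hypothetical helper)."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = invoice_context.get(name)
        # Assumption: None and "" both count as missing; 0 is a legitimate value.
        if value is None or value == "":
            missing.append(name)
    return missing


example = {
    "invoice_number": "F-2024-001",
    "issue_date": "2024-01-10",
    "due_date": "2024-02-10",
    "subtotal": 100.0,
    "vat": 21.0,
    "total": 121.0,
    "supplier_name": "",
}
print(missing_required_fields(example))
```

A non-empty result could feed the structural rules in the core ruleset (for example V-04 for the supplier name).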

2. Output Schema – ValidationContext

{
  "invoice_id": "UUID",
  "agent": "ValidationAgent_v1.0",
  "validation_date": "timestamp",
  "rule_results": [
    {
      "rule_id": "V-01",
      "description": "VAT must be between 0 and 21%",
      "passed": false,
      "severity": "critical",
      "value_observed": 0.25,
      "suggestion": "Check supplier tax status or invoice type",
      "references": ["invoice_context.vat"]
    }
  ],
  "overall_status": "invalid",
  "notes": "Detected VAT anomaly",
  "version": "1.0"
}
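One way to build this structure in Python is with dataclasses, sketched below. The class names `RuleResult` and `ValidationContext` mirror the schema but are assumptions of this example; the field names and sample values come from the JSON above, while the concrete `invoice_id` and `validation_date` values are placeholders.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass
class RuleResult:
    rule_id: str
    description: str
    passed: bool
    severity: str
    value_observed: Any = None
    suggestion: str = ""
    references: list[str] = field(default_factory=list)


@dataclass
class ValidationContext:
    invoice_id: str
    validation_date: str
    rule_results: list[RuleResult]
    overall_status: str
    notes: str = ""
    agent: str = "ValidationAgent_v1.0"
    version: str = "1.0"


vat_result = RuleResult(
    rule_id="V-01",
    description="VAT must be between 0 and 21%",
    passed=False,
    severity="critical",
    value_observed=0.25,
    suggestion="Check supplier tax status or invoice type",
    references=["invoice_context.vat"],
)

context = ValidationContext(
    invoice_id="00000000-0000-0000-0000-000000000000",  # placeholder UUID
    validation_date="2024-01-10T12:00:00Z",             # placeholder timestamp
    rule_results=[vat_result],
    overall_status="invalid",
    notes="Detected VAT anomaly",
)

# asdict() recurses into the nested RuleResult objects, so the result is JSON-serializable.
payload = json.dumps(asdict(context), indent=2)
print(payload)
```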

3. Rule Engine Design

3.1. Rule Representation

Each rule is represented by a JSON-like definition:

{
  "rule_id": "V-03",
  "description": "Subtotal + VAT must equal total",
  "severity": "critical",
  "field_refs": ["subtotal", "vat", "total"],
  "condition": "(subtotal + vat) ≈ total",
  "tolerance": 0.01
}
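The `≈` in the condition can be read as "equal within `tolerance`". The sketch below shows one possible interpretation, using `Decimal` so that currency amounts are not distorted by binary floating point. The function name `sum_matches_total` is hypothetical, and treating the tolerance as inclusive (difference ≤ 0.01) is an assumption.

```python
from decimal import Decimal
from typing import Any, Mapping

RULE_V03 = {
    "rule_id": "V-03",
    "description": "Subtotal + VAT must equal total",
    "severity": "critical",
    "field_refs": ["subtotal", "vat", "total"],
    "tolerance": 0.01,
}


def to_decimal(value: Any) -> Decimal:
    # Going through str() avoids importing float representation error into Decimal.
    return Decimal(str(value))


def sum_matches_total(rule: Mapping[str, Any], invoice: Mapping[str, Any]) -> bool:
    """Return True when subtotal + vat is within the rule's tolerance of total."""
    subtotal, vat, total = (to_decimal(invoice[name]) for name in rule["field_refs"])
    # Assumption: the tolerance bound is inclusive.
    return abs(subtotal + vat - total) <= to_decimal(rule["tolerance"])


print(sum_matches_total(RULE_V03, {"subtotal": 500.00, "vat": 250.00, "total": 710.00}))
```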

3.2. Rule Categories

| Category | Description |
|---|---|
| Fiscal | Tax logic, subtotal consistency |
| Temporal | Date logic (e.g., due >= issue) |
| Structural | Required fields and uniqueness |
| Duplicate logic | Match against invoice history |
| Supplier checks | Field format, tax ID consistency |

4. Core Ruleset (v1.0)

| Rule ID | Description | Type | Severity |
|---|---|---|---|
| V-01 | VAT must be ≥ 0 and ≤ 21% | Fiscal | Critical |
| V-02 | Issue date must not be later than due date (due >= issue) | Temporal | Major |
| V-03 | Subtotal + VAT ≈ Total (within €0.01) | Fiscal | Critical |
| V-04 | Supplier name must be present | Structural | Critical |
| V-05 | Invoice number must be unique for the same supplier | Duplicate | Critical |
| V-06 | Invoice currency must be declared | Structural | Minor |
| V-07 | Payment method should be present (if required) | Structural | Info |
| V-08 | Missing tax ID triggers warning | Structural | Warning |
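As an illustration, V-01 and V-02 might be checked as follows. This is a sketch under stated assumptions: `vat` is taken to be an amount, so the rate is computed as `vat / subtotal`; dates are assumed to be ISO 8601 strings (`YYYY-MM-DD`); and V-02 uses the `due >= issue` reading from the category table. Both function names are hypothetical.

```python
from datetime import date
from typing import Optional, Tuple


def check_vat_rate(
    subtotal: float, vat: float, max_rate: float = 0.21
) -> Tuple[bool, Optional[float]]:
    """V-01: return (passed, observed_rate). Assumes vat is an amount, not a rate."""
    if subtotal <= 0:
        # Assumption: a non-positive subtotal makes the rate undefined, so the rule fails.
        return False, None
    # Rounding to 4 decimals keeps floating-point noise from pushing an exact 21% over the bound.
    rate = round(vat / subtotal, 4)
    return 0 <= rate <= max_rate, rate


def check_due_not_before_issue(issue_date: str, due_date: str) -> bool:
    """V-02: the due date must be on or after the issue date (ISO date strings assumed)."""
    return date.fromisoformat(issue_date) <= date.fromisoformat(due_date)


print(check_vat_rate(100.0, 25.0))
print(check_due_not_before_issue("2024-01-10", "2024-02-10"))
```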

5. Execution Pipeline

InvoiceContext
    ↓
Rule Loader (YAML or DB)
    ↓
ValidationEngine (loop over rules)
    ↓
Evaluation Engine
    ↓
Output → ValidationContext

Each rule is evaluated dynamically using field references and mathematical expressions. Failed rules are logged with:

  • Observed values
  • Severity
  • Suggested action (if applicable)
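The loop described above can be sketched as follows. To keep the example self-contained, each rule carries a Python callable (`check`) in place of a parsed expression string; this is a stand-in, not the engine's actual expression evaluator. The function name `run_rules`, the `check` key, and the choice to fail a rule when a referenced field is missing are all assumptions.

```python
from typing import Any, Mapping, Sequence

RULES = [
    {
        "rule_id": "V-02",
        "description": "Due date must not be earlier than issue date",
        "severity": "major",
        "field_refs": ["issue_date", "due_date"],
        # ISO date strings compare correctly as plain strings.
        "check": lambda issue_date, due_date: issue_date <= due_date,
    },
    {
        "rule_id": "V-03",
        "description": "Subtotal + VAT must equal total",
        "severity": "critical",
        "field_refs": ["subtotal", "vat", "total"],
        "check": lambda subtotal, vat, total: abs(subtotal + vat - total) <= 0.01,
        "suggestion": "Check for miscalculation or discount not applied",
    },
]


def run_rules(invoice: Mapping[str, Any], rules: Sequence[Mapping[str, Any]]) -> list[dict]:
    """Evaluate every rule against the invoice and return one result per rule."""
    results = []
    for rule in rules:
        values = {name: invoice.get(name) for name in rule["field_refs"]}
        if any(value is None for value in values.values()):
            passed = False  # assumption: a missing referenced field fails the rule
        else:
            passed = bool(rule["check"](**values))
        result = {
            "rule_id": rule["rule_id"],
            "description": rule["description"],
            "passed": passed,
            "severity": rule["severity"],
            "references": [f"invoice_context.{name}" for name in rule["field_refs"]],
        }
        if not passed:
            # Failed rules carry the observed values and, if defined, a suggested action.
            result["value_observed"] = values
            result["suggestion"] = rule.get("suggestion", "")
        results.append(result)
    return results


invoice = {"issue_date": "2024-01-10", "due_date": "2024-02-10",
           "subtotal": 500.00, "vat": 250.00, "total": 710.00}
rule_results = run_rules(invoice, RULES)
print([(r["rule_id"], r["passed"]) for r in rule_results])
```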

6. Agent Deployment

| Mode | Technology | Deployment |
|---|---|---|
| Cloud Lambda | Python + FastAPI | Event-driven via Step Functions |
| ECS service | Docker container | For batch or parallel inference |
| CI Testing | pytest with fixtures | Validates rules and outputs |

7. Example – Rule Violation

Input

{
  "vat": 250.00,
  "subtotal": 500.00,
  "total": 710.00
}

Output

{
  "rule_id": "V-03",
  "description": "Subtotal + VAT must equal total",
  "passed": false,
  "value_observed": "750.00 vs 710.00",
  "severity": "critical",
  "suggestion": "Check for miscalculation or discount not applied"
}

8. Rule Versioning and Lifecycle

  • Rules are versioned (e.g., validation_rules_v1.0.yaml)
  • The ValidationAgent references the exact rule version used
  • If a rule definition changes, the agent version must be incremented
  • Old validations remain reproducible and immutable
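One way to make the "exact rule version used" verifiable is to record, alongside each validation, the version parsed from the ruleset filename plus a hash of the file's content. The sketch below assumes the filename pattern shown above; the function name `ruleset_stamp`, the output keys, and the use of SHA-256 are assumptions of this example, not part of the documented design.

```python
import hashlib
import re

FILENAME_PATTERN = re.compile(r"validation_rules_v(\d+\.\d+)\.yaml")


def ruleset_stamp(filename: str, content: bytes) -> dict:
    """Return the version from the filename and a SHA-256 digest of the ruleset content."""
    match = FILENAME_PATTERN.fullmatch(filename)
    if match is None:
        raise ValueError(f"Unexpected ruleset filename: {filename!r}")
    return {
        "ruleset_version": match.group(1),
        # A changed digest under an unchanged version signals an unversioned edit.
        "ruleset_sha256": hashlib.sha256(content).hexdigest(),
    }


stamp = ruleset_stamp("validation_rules_v1.0.yaml", b"rules: []\n")
print(stamp["ruleset_version"])
```

Storing such a stamp with each ValidationContext would let an old validation be re-run later against the same ruleset bytes.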

9. Integration with Other Modules

  • Trigger: After InvoiceContext creation
  • Storage: validation_context field in DB (invoices table)
  • Alert System: Parses the output for rule results with severity critical or major
  • Chat Agent: Uses rules to explain anomalies
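The alert filter could look roughly like the sketch below, assuming the ValidationContext has been decoded into a dictionary. The function name `alert_worthy` is hypothetical. Two assumptions are made: only failed rules raise alerts, and severities are compared case-insensitively, since the ruleset table writes "Critical" while the JSON output writes "critical".

```python
ALERT_SEVERITIES = {"critical", "major"}


def alert_worthy(validation_context: dict) -> list[dict]:
    """Return the failed rule results whose severity is critical or major."""
    return [
        result
        for result in validation_context.get("rule_results", [])
        if not result.get("passed", False)
        and str(result.get("severity", "")).lower() in ALERT_SEVERITIES
    ]


sample_context = {
    "rule_results": [
        {"rule_id": "V-01", "passed": False, "severity": "critical"},
        {"rule_id": "V-02", "passed": True, "severity": "major"},
        {"rule_id": "V-06", "passed": False, "severity": "minor"},
        {"rule_id": "V-03", "passed": False, "severity": "Critical"},
    ]
}
print([r["rule_id"] for r in alert_worthy(sample_context)])
```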

10. Summary

The ValidationAgent ensures invoice data quality and correctness by applying deterministic, explainable rules to structured invoice content. It serves as the first MCP-compliant AI agent in the invoice processing pipeline, enabling downstream logic (alerts, analytics, dialogue) to operate on trustworthy data.

Next: see 10-analytics-agent.md for the construction of the AnalyticsContext used in business reporting.