Isolation Forest analysis for account-vendor pairing anomalies

This analytic test uses the Isolation Forest algorithm to identify infrequent patterns in general ledger (GL) transactions by analyzing the relationships between key attributes, such as chart-of-accounts codes and vendor names. It flags transactions that deviate from typical transactional patterns.

This analytic test can be used to:

  • Detect unusual combinations of accounts and vendors, or infrequent vendor transactions that may indicate anomalous activity or potential risks within the general ledger

  • Assess consistency in the structure of the income statement by identifying irregular patterns in GL transactions

  • Highlight potential issues related to reversals or contra entries specific to GL records

  • Provide auditors with enhanced context to evaluate the relevance and consistency of GL transaction patterns

Analysis fields

The following fields are required for this analysis:

  • Reference: Unique fields that are used to create a transaction ID, such as the Entry ID field for the general ledger dataset. These columns are used to identify the transactions that are part of the result. Reference fields are predefined in the test and cannot be modified.

Parameters

The following parameters must be set to run this analysis:

  • Vendor column: Select the column in the dataset that contains the vendor name.

  • Accounts column: Select the column that represents the identifier of the chart of accounts in the general ledger dataset.

  • Extra columns: Select whether to include extra columns in the results table. This does not affect the performance of the algorithm.

Test configurations

No specific test configurations are included in this test.

Technical specifications

To run the Isolation Forest analysis for account-vendor pairing anomalies analytic test:

  1. Select columns for the chart of accounts and vendor name in the general ledger.

  2. Select Run to run the test.

    The Isolation Forest model:

    • Uses a label encoder to convert text data into numbers: each distinct value is assigned a unique number in alphabetical order, and the encoded values are added as temporary columns.

    • Processes the encoded data to label each transaction as normal (1) or anomalous (-1).

    • Calculates an outlier score, where lower scores indicate more abnormal entries.

    • Combines the outlier scores with the original dataset, rescales the scores to a 1–10 scale (with 1 being the most abnormal), and presents the results in a table.
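
The steps above can be approximated with scikit-learn. The following is a minimal sketch, not the test's actual implementation: the function name, the column names, the default model settings, and the min-max formula used to reach the 1–10 scale are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import LabelEncoder


def score_account_vendor_pairs(df, account_col, vendor_col, random_state=0):
    """Hypothetical helper: label, score, and rescale account-vendor pairs."""
    result = df.copy()

    # Step 1: label-encode each text column into a temporary column.
    # LabelEncoder numbers the distinct values in sorted order.
    temp_cols = []
    for col in (account_col, vendor_col):
        temp = f"{col}__encoded"
        result[temp] = LabelEncoder().fit_transform(result[col].astype(str))
        temp_cols.append(temp)
    features = result[temp_cols]

    # Step 2: fit the model and label rows (1 = normal, -1 = anomaly).
    # Default hyperparameters are an assumption; the source does not state them.
    model = IsolationForest(random_state=random_state)
    model.fit(features)
    result["outlier_label"] = model.predict(features)

    # Step 3: raw outlier score; lower values indicate more abnormal rows.
    raw = model.decision_function(features)
    result["outlier_score"] = raw

    # Step 4: rescale to 1-10, with 1 the most abnormal.
    # Min-max scaling is an assumed formula, chosen only for illustration.
    span = raw.max() - raw.min()
    if span == 0:
        scaled = np.full(len(raw), 10.0)  # all rows identical: nothing stands out
    else:
        scaled = 1 + 9 * (raw - raw.min()) / span
    result["scaled_score"] = np.round(scaled, 2)

    # Drop the temporary encoded columns before returning the results table.
    return result.drop(columns=temp_cols)


# Usage with a small made-up ledger (all values are illustrative).
ledger = pd.DataFrame({
    "entry_id": range(1, 11),
    "account": ["6100"] * 5 + ["6200"] * 4 + ["1500"],
    "vendor": ["Acme"] * 5 + ["Globex"] * 4 + ["Zenith"],
})
scored = score_account_vendor_pairs(ledger, "account", "vendor")
print(scored.sort_values("scaled_score").head())
```

Sorting the output by the rescaled score in ascending order puts the most abnormal account-vendor pairings at the top of the table. Note that label encoding imposes an arbitrary numeric order on categories, so the scores in this sketch depend on how the values happen to sort.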