Quick start

This guide walks you through your first test case with Ploosh in 5 steps.

1. Install Ploosh

pip install ploosh

2. Set up the connection file

Create a file connections.yml with your database connections:

my_database:
  type: mysql
  hostname: my_server.database.windows.net
  database: mydatabasename
  username: myusername
  password: $var.db_password

Using $var.db_password instead of a hardcoded password keeps the secret out of the file. You supply the value on the command line at runtime (see step 4).
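
To make the placeholder idea concrete, here is a minimal sketch of how a `$var.<name>` reference could be filled in from values supplied at runtime. It is an illustration only, not Ploosh's actual implementation; the function name `resolve_placeholders` and the sample secret are hypothetical.

```python
import re

# Hypothetical helper: NOT Ploosh's implementation, only an illustration
# of how "$var.<name>" placeholders can be filled from runtime values.
PLACEHOLDER = re.compile(r"\$var\.([A-Za-z_][A-Za-z0-9_]*)")


def resolve_placeholders(text, variables):
    """Replace each $var.<name> in text with variables[name]."""

    def substitute(match):
        name = match.group(1)
        if name not in variables:
            # Fail loudly rather than connect with an unresolved placeholder.
            raise KeyError(f"no value supplied for $var.{name}")
        return variables[name]

    return PLACEHOLDER.sub(substitute, text)


connection_line = "password: $var.db_password"
resolved = resolve_placeholders(connection_line, {"db_password": "example-secret"})
```

The point of the design is that connections.yml can be committed to version control while the secret itself only exists at execution time.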

3. Create test cases

Create a folder test_cases/ and add a YAML file (e.g. tests.yml) with your test definitions:

Test aggregated data:
  options:
    sort:
      - gender
      - domain
  source:
    connection: my_database
    type: mysql
    query: |
      SELECT gender,
             RIGHT(email, LENGTH(email) - POSITION("@" IN email)) AS domain,
             COUNT(*) AS count
      FROM users
      GROUP BY gender, domain
  expected:
    type: csv
    path: ./data/expected_aggregation.csv

Test no invalid emails:
  source:
    connection: my_database
    type: mysql
    query: |
      SELECT *
      FROM users
      WHERE email NOT LIKE '%@%.%'
  expected:
    type: empty
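
To clarify what the two queries compute, the sketch below reproduces their logic in plain Python on a few made-up rows. The sample data and the function names are hypothetical and exist only for illustration; Ploosh itself runs the SQL against your database.

```python
import re
from collections import Counter

# Hypothetical sample rows standing in for the users table.
users = [
    {"gender": "F", "email": "ana@example.com"},
    {"gender": "F", "email": "bea@example.com"},
    {"gender": "M", "email": "carl@example.org"},
]


def email_domain(email):
    """Text after the first '@', mirroring the RIGHT/POSITION expression."""
    _, separator, rest = email.partition("@")
    # POSITION returns 0 when '@' is absent, so RIGHT then yields the whole string.
    return rest if separator else email


def aggregate(rows):
    """Count rows per (gender, domain), sorted like the test's sort option."""
    counts = Counter((row["gender"], email_domain(row["email"])) for row in rows)
    return [
        {"gender": gender, "domain": domain, "count": count}
        for (gender, domain), count in sorted(counts.items())
    ]


def invalid_emails(rows):
    """Rows whose email does not match the LIKE pattern '%@%.%'."""
    pattern = re.compile(r"@.*\.", re.DOTALL)  # an '@' followed later by a '.'
    return [row for row in rows if not pattern.search(row["email"])]


aggregated = aggregate(users)
invalid = invalid_emails(users)
```

The first test passes when the aggregated rows match the expected CSV; the second passes when the list of invalid rows is empty.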

4. Run tests

ploosh --connections "connections.yml" --cases "test_cases" --export "JSON" --p_db_password "my_secret_password"

During execution, the status of each test is displayed in real-time:

[Ploosh ASCII-art banner]
Initialization[...]
Start processing tests cases[...]
Test aggregated data [...] (1/2) - Started
Test aggregated data [...] (1/2) - Passed
Test no invalid emails [...] (2/2) - Started
Test no invalid emails [...] (2/2) - Passed
Summary[...]
Total: 2 | Passed: 2 | Failed: 0 | Error: 0

5. Review results

A test_results.json file is generated in the output/json/ folder:

[
  {
    "name": "Test aggregated data",
    "state": "passed",
    "source": {
      "start": "2024-02-05T17:08:36Z",
      "end": "2024-02-05T17:08:36Z",
      "duration": 0.003298
    },
    "expected": {
      "start": "2024-02-05T17:08:36Z",
      "end": "2024-02-05T17:08:36Z",
      "duration": 0.000061
    },
    "compare": {
      "start": "2024-02-05T17:08:36Z",
      "end": "2024-02-05T17:08:36Z",
      "duration": 0.000465
    }
  }
]
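
If you want to post-process the results, for example in a CI job, a short script can summarize the file. The sketch below relies only on the fields visible in the example above (`name`, `state`, and a `duration` per step); the function name `summarize_results` is hypothetical, and the inline sample stands in for the contents of test_results.json.

```python
import json


def summarize_results(results_json):
    """Count tests per state and total the recorded step durations per test."""
    results = json.loads(results_json)
    state_counts = {}
    durations = {}
    for result in results:
        state = result["state"]
        state_counts[state] = state_counts.get(state, 0) + 1
        # Sum whichever of the three steps are present for this test.
        durations[result["name"]] = sum(
            result[step]["duration"]
            for step in ("source", "expected", "compare")
            if step in result
        )
    return state_counts, durations


# Inline sample mirroring the structure shown above.
sample = """[{"name": "Test aggregated data", "state": "passed",
  "source": {"duration": 0.003298},
  "expected": {"duration": 0.000061},
  "compare": {"duration": 0.000465}}]"""

state_counts, durations = summarize_results(sample)
```

In a real run you would pass the text of output/json/test_results.json instead of the inline sample.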

Gap analysis Excel files

When a test fails, an Excel file (.xlsx) is automatically generated in output/json/test_results/ with a detailed gap analysis. The file contains a side-by-side comparison of the differing values:

Column               Description
{column}_source      Value from the source dataset
{column}_expected    Value from the expected dataset

Only rows and columns with differences are included, making it easy to pinpoint exactly where the data diverges.
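
The sketch below shows one way such a gap analysis could be computed, so you know what to expect in the file. It is not Ploosh's implementation. It assumes both datasets have the same columns and row order, and the underscore in the `<column>_source` / `<column>_expected` names is also an assumption. The function name `gap_analysis` and the sample rows are hypothetical.

```python
def gap_analysis(source_rows, expected_rows):
    """Keep only differing rows and columns, as source/expected value pairs."""
    columns = list(source_rows[0]) if source_rows else []
    pairs = list(zip(source_rows, expected_rows))

    # Columns whose values differ in at least one row.
    differing = [
        column
        for column in columns
        if any(source[column] != expected[column] for source, expected in pairs)
    ]

    report = []
    for source, expected in pairs:
        # Skip rows that agree on every differing column.
        if any(source[column] != expected[column] for column in differing):
            row = {}
            for column in differing:
                row[f"{column}_source"] = source[column]
                row[f"{column}_expected"] = expected[column]
            report.append(row)
    return report


# Hypothetical data: the second row's count differs from the expectation.
source_rows = [
    {"gender": "F", "domain": "example.com", "count": 2},
    {"gender": "M", "domain": "example.org", "count": 1},
]
expected_rows = [
    {"gender": "F", "domain": "example.com", "count": 2},
    {"gender": "M", "domain": "example.org", "count": 3},
]
gaps = gap_analysis(source_rows, expected_rows)
```

Here only the count column and the second row appear in the report, because everything else matches.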

Next steps