
Pairs Testing in RowGen v3 – Valid/Joined Pairs and…

by Jason Koivu

Database and solution architects depend on realistic test data to:

  • create new databases and prototype ETL jobs or applications
  • benchmark performance on new or existing platforms
  • stress-test systems
  • protect confidential information in existing systems when database work is outsourced or used for demonstrations

Using production data for testing risks exposing personally identifiable information (PII) or proprietary information, and that data may not reflect the types or volume of data that will be encountered in the future.

The data used for testing must not be real, but it must appear real. It must truly represent what would be found in production, and conform to the value and volume characteristics, along with the business rules, necessary to test any software or system accurately.

For more information on how IRI RowGen software enhances test data realism, refer to Making Test Data Realistic – Without Taking It from Production.

RowGen includes a number of functions and wizards for synthesizing realistic sets of test data. Through the Set File from Column wizard, RowGen can create set file extracts from production column values. The set files contain actual, but randomly selected, column pairs that exist in production. Each such pair is referred to as a joined pair, or valid pair.

These set files would normally contain innocuous information that does not have to be protected and that makes sense as a pair, like a city and state, or a state and zip code. Why is a joined pair important? Users testing a database or data warehouse system want to be able to tell whether data is being loaded and organized properly.

By loading known, valid pairs into the test data, the user can visually validate results quickly. The valid pairs provide a method of testing data loading, field entry parameters, and the subsequent presentation of the data. Consider this example from the RowGen ‘Set File from Column’ wizard:

[Screenshot: RowGen ‘Set File from Column’ wizard with valid pair data]
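To make the idea of a valid pair concrete, here is a minimal Python sketch of the concept. It is not RowGen's implementation or its job syntax. The `sample_valid_pairs` function, the sample rows, and the tab-delimited output layout are all assumptions made for illustration. The sketch collects the distinct city/state pairs that actually occur together in the source rows and randomly selects a few of them, so no invalid combination (such as Orlando paired with TX) can appear.

```python
import random


def sample_valid_pairs(rows, col_a, col_b, count, seed=None):
    """Return up to `count` distinct (col_a, col_b) pairs that occur together in `rows`."""
    rng = random.Random(seed)
    # Sort the distinct pairs so that a given seed always yields the same sample.
    distinct = sorted({(row[col_a], row[col_b]) for row in rows})
    return rng.sample(distinct, min(count, len(distinct)))


# Hypothetical stand-in for rows read from a production table.
production_rows = [
    {"city": "Melbourne", "state": "FL"},
    {"city": "Orlando", "state": "FL"},
    {"city": "Austin", "state": "TX"},
    {"city": "Dallas", "state": "TX"},
    {"city": "Orlando", "state": "FL"},  # duplicate rows collapse to one pair
]

pairs = sample_valid_pairs(production_rows, "city", "state", count=3, seed=42)

# Emit one pair per line; the tab delimiter is an assumed layout.
for city, state in pairs:
    print(f"{city}\t{state}")
```

Because every emitted pair was observed in the source, a tester who later sees a city beside the wrong state in a target table knows the load or join logic is at fault, not the test data.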

Another useful test data generation feature built into RowGen is the ability to create composite keys. A composite key is a key made up of two or more attributes that together uniquely identify a record.

RowGen can generate every combination of values from two or more attributes or fields, which yields all possible pairs and composite keys. Depending on the testing scenario, those composite keys can represent each unique record that could be needed for testing.
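The combination step can be pictured as a Cartesian product over the candidate values of each key column. The following Python sketch illustrates that idea only; the column names and values are hypothetical, and it does not reproduce how RowGen itself builds the combinations.

```python
from itertools import product

# Hypothetical value lists for the three columns of a composite key.
regions = ["EAST", "WEST"]
product_codes = ["A100", "B200", "C300"]
years = [2023, 2024]

# Each tuple combines one value per column. When no column repeats a value,
# every tuple is unique and can serve as a composite key.
composite_keys = list(product(regions, product_codes, years))

for key in composite_keys:
    print(key)
```

The number of keys is the product of the list lengths (here 2 x 3 x 2 = 12), which is why the full set grows quickly as columns or values are added.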

This process can be used to create a primer for all-pairs testing by generating the initial data up front, and then allowing the user to pare it down to carefully selected test vectors that properly test all combinations of scenarios for a system.

The reasoning behind all-pairs testing is basic: the simplest bugs in a program are generally triggered by a single input parameter. The next simplest category of bugs consists of those dependent on interactions between pairs of parameters, which can be easily caught with all-pairs testing. Many testing methods regard all-pairs testing of a system as a reasonable cost-benefit compromise between often computationally infeasible combinatorial testing methods, and less exhaustive methods that fail to exercise all possible pairs of parameters.
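As an illustration of paring a full set of combinations down to an all-pairs suite, the sketch below uses a simple greedy selection in Python. This is one common heuristic, not a method the article attributes to RowGen, and it is not guaranteed to find the smallest possible suite. The `all_pairs_suite` function and the parameter names are hypothetical.

```python
from itertools import combinations, product


def all_pairs_suite(parameters):
    """Greedily pick rows from the full product until every value pair is covered."""
    names = list(parameters)
    candidates = list(product(*(parameters[name] for name in names)))

    def pairs_of(row):
        # Tag each value with its column index so that equal values
        # in different columns are treated as different pairs.
        return {
            ((i, row[i]), (j, row[j]))
            for i, j in combinations(range(len(row)), 2)
        }

    uncovered = set().union(*(pairs_of(row) for row in candidates))
    suite = []
    while uncovered:
        # Choose the row that covers the most pairs not yet covered.
        best = max(candidates, key=lambda row: len(pairs_of(row) & uncovered))
        suite.append(dict(zip(names, best)))
        uncovered -= pairs_of(best)
    return suite


# Hypothetical test parameters.
parameters = {
    "os": ["linux", "windows"],
    "browser": ["firefox", "chrome"],
    "locale": ["en", "de", "fr"],
}

suite = all_pairs_suite(parameters)
for case in suite:
    print(case)
```

The full product of these parameters has 12 rows, while the greedy suite needs fewer because each selected row covers several value pairs at once. On larger parameter sets the savings are typically much greater.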

RowGen v3 is IRI’s latest offering in safe, intelligent, high-volume test data generation for relational databases, sequential files, and formatted report targets. RowGen runs from the IRI Workbench GUI (built on Eclipse™), on the command line, or from batch programs, to produce the quality and quantity of test data necessary to accurately reflect the scope, layouts, and relationships within production databases and data warehouses. For more information on RowGen, see http://www.iri.com/products/rowgen.

