
Test Data Management: Goal Setting & Team Building (Step…

  • by David Friedland

This article is part of a 4-step series introduced here. Navigation between articles is below.


Step 1: Goal Setting & Team Building

Someone needs test data to do something, like:

  • stress-testing the functions and performance of applications
  • prototyping database load/query and DW ETL/ELT operations
  • benchmarking prospective new hardware or software
  • outsourcing development or proofs of concept
  • demonstrating systems with real-looking, but not real, sample data

In all these cases, the data should be as realistic as possible, yet safe and de-personalized. Sometimes it’s enough to mask real data and test with that. Sometimes, however, production data cannot be used for testing, even if masked, because it does not exist yet or because access to the source data is restricted. It may also not be realistic or robust enough for the application’s use cases, or big enough to stress-test the future capacity of the solution (think healthcare.gov).

Subsetting and masking production data can also be more arduous than automatically creating data from scratch using existing metadata (like database DDL or a COBOL copybook). In that case, the goal would be to generate test data with the properties, but not the actual values, of data in production.
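As a rough illustration of generating data from metadata alone, the Python sketch below reads column names, types, and sizes from a small DDL statement and produces random rows that fit them. It is not how RowGen works; the `customers` table, the four supported types, and the helper names (`parse_columns`, `generate_value`, `generate_rows`) are all assumptions made for this example.

```python
import random
import re
import string
from decimal import Decimal

# Hypothetical DDL for illustration; not taken from any real production schema.
DDL = """
CREATE TABLE customers (
    id INTEGER,
    name VARCHAR(12),
    balance DECIMAL(8,2),
    signup_date DATE
);
"""

# Matches lines such as "name VARCHAR(12)," for the four types handled here.
COLUMN_RE = re.compile(
    r"^\s*(\w+)\s+(INTEGER|VARCHAR|DECIMAL|DATE)\s*(?:\(([\d,\s]+)\))?",
    re.IGNORECASE,
)


def parse_columns(ddl):
    """Return a list of (name, type, sizes) for each recognized column line."""
    columns = []
    for line in ddl.splitlines():
        match = COLUMN_RE.match(line)
        if match:
            name, sql_type, args = match.groups()
            sizes = [int(a) for a in args.split(",")] if args else []
            columns.append((name, sql_type.upper(), sizes))
    return columns


def generate_value(sql_type, sizes, rng):
    """Create one random value that respects the declared type and size."""
    if sql_type == "INTEGER":
        return rng.randint(1, 999_999)
    if sql_type == "VARCHAR":
        max_len = sizes[0] if sizes else 10  # assumed default length
        length = rng.randint(1, max_len)
        return "".join(rng.choice(string.ascii_letters) for _ in range(length))
    if sql_type == "DECIMAL":
        precision = sizes[0] if sizes else 10  # assumed default precision
        scale = sizes[1] if len(sizes) > 1 else 0
        units = rng.randint(0, 10 ** precision - 1)
        return Decimal(units).scaleb(-scale)  # shift the decimal point
    if sql_type == "DATE":
        # Day capped at 28 so every month is valid without calendar logic.
        return "{:04d}-{:02d}-{:02d}".format(
            rng.randint(2000, 2024), rng.randint(1, 12), rng.randint(1, 28)
        )
    raise ValueError("unsupported type: " + sql_type)


def generate_rows(ddl, count, seed=0):
    """Generate `count` rows as dicts, seeded so runs are repeatable."""
    rng = random.Random(seed)
    columns = parse_columns(ddl)
    return [
        {name: generate_value(sql_type, sizes, rng) for name, sql_type, sizes in columns}
        for _ in range(count)
    ]


if __name__ == "__main__":
    for row in generate_rows(DDL, 3):
        print(row)
```

The rows share the structure of the table (names, types, lengths, precision) without containing any production values. A realistic generator would also need to honor value distributions, uniqueness, and referential integrity, which this sketch ignores.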

Once you determine the need for the test data, and whether it is available (and needs masking with a tool like IRI FieldShield) or must be generated from scratch (with a tool like IRI RowGen), the project manager should identify who requires the test data and detail their particular technical requirements for it (see Step 2).

Are they the same people who have access to the production data to be masked, or to the metadata information needed to generate production-quality data? Can they work together? Identify who will:

  • obtain, mask, and/or create the test data (e.g., DBA, programmer)
  • deliver the test data sets to the stakeholders, and/or populate the target tables directly
  • validate that the test data’s quality and quantity are sufficient for the purpose (e.g., application developer, benchmark tester)
  • assess and verify its compliance with internal and governmental data privacy regulations (e.g., CISO, data governance or stewardship lead)
  • use the test data (e.g., developer, solution architect, DBA), and give feedback to the provider(s)
  • document and version-control the metadata for the project(s) (e.g., application developer)
  • store, relocate, and/or dispose of the test data sets after use, as needed (e.g., system administrator)

Some of these people or roles will overlap, or you might be the only person doing it all! Also, the tools used to mask or create the test data are not necessarily the ones used to manage it.

Because FieldShield and RowGen share the same metadata and Eclipse GUI, data browsing and acquisition, masking and/or generation, and test data asset management can all occur in the same environment. Other benefits of this environment include Java application development, project management and version control, and access to databases for browsing, population, and SQL testing. BIRT can also be used to visualize the test data in charts and graphs that show its distribution. You may not need all of that functionality or control, but it is worth considering, and it’s nice to have it all in one place.

Click here for the next article, Step 2: Test Data Needs Assessment, or here for the previous step.

Test Data Management: A Primer
Test Data Management: Test Data Needs Assessment (Step 2 of 4)

© 2025 Innovative Routines International (IRI), Inc., All Rights Reserved | Contact