Skip to content
IRI Logo
Solutions Products
  • Solutions
  • Products
  • Blog
  • BI
  • Big Data
  • DQ
  • ETL
  • IRI
    • IRI Business
    • IRI Workbench
  • Mask
  • MDM
    • Master Data Management
    • Metadata Management
  • Migrate
    • Data Migration
    • Sort Migration
  • Test Data
  • Transform
  • VLDB
  • VLOG

Data Quality Rules in IRI Workbench

  • by Claudia Irvine

IRI Workbench now contains a section of data cleansing, enrichment, and validation rules called Data Quality Rules for use in IRI CoSort data transformation and IRI Voracity ETL, reporting and data wrangling jobs. The selected rules automatically serialize many of the data quality functions, long supported in their common SortCL program, into task scripts or batch workflows ready for execution, modification, sharing, or expansion.

IRI Voracity Data Quality Functionality
Data quality functions in IRI CoSort and IRI Voracity

While some of these rules may be found in other categories, there is now a centralized page for them. The data quality rules are further divided into Field Rules and Section Rules, depending on how they are applied in SortCL scripts.

Field Rules
Alignment

This rule is used to align data. It is helpful for unifying the format of data, and in generating values for lookup (set) files where their column data need to be right- or left-aligned.

When used in the virtual record layout (inrec) or target (output) section of a job script, this rule aligns the field string without its leading or trailing fill characters. The remaining length of the target field is populated with spaces or a pad character.

For example, on this input source:

McKinley William       1897-1901 
Roosevelt Theodore     1901-1909 
Taft William H.        1909-1913 
Wilson Woodrow         1913-1921 
Harding Warren G.      1921-1923

the president field can be shifted to the right using right-align.

      McKinley William 1897-1901 
    Roosevelt Theodore 1901-1909 
       Taft William H. 1909-1913 
        Wilson Woodrow 1913-1921 
     Harding Warren G. 1921-1923

Compound Data Values

Compound data values allow you to specify a standard format for data that you can use for enrichment, lookup transformations, and synthesis (test data generation). This rule does two things:  It creates a set file containing the compound data and assigns that set file in the rule.

In the example below, a US-formatted telephone number is created in the wizard using parenthesis and a dash and populated with either custom-ranged or randomly-generated numbers:

String Quality Functions

This rule opens a page with a list of functions that can be applied to strings. It contains items such as changing case, substring, and trim.

For example, sub_string can change this input:

McKinley, William
Roosevelt, Theodore
Taft, William H.
Wilson, Woodrow
Harding, Warren G.

to this output:

McKin                     
Roose                     
Taft,                     
Wilso              
Hardi

Section Rules
Contains Value Test

This rule applies a test to a field to see if the value specified is included in the field. That result is used to filter rows specified in the section or in a conditional if-then-else statement that creates a new field with data specific to whether the test passed or failed. See the Lookup Value Test below for an example with an if-then-else-statement.

For example, this condition and filter created by the details in the screenshot below,

/CONDITION=(CONTAINS_TEST, TEST=(isholding(LAST_NAME, "Smith")))
/INCLUDE WHERE CONTAINS_TEST

sends only qualifying rows (where the last name is Smith) to the target.

Data Type Test

This rule can be used to validate and include or exclude data on the basis of its actual type. Multiple types of data type or format tests are available from this function’s drop-down menu. The result is used in the section as a filter or in an if-then-else statement.

For example, this condition and filter created by the details in the screenshot,

/CONDITION=(DATA_TYPE_TEST, TEST=(isempty(LAST_NAME)))
/OMIT WHERE DATA_TYPE_TEST

omits the record if the value of the last name field is empty.

Lookup Value Test

This rule checks to see if the value in the specified field is included in the specified set file or table. The result is used in the section as a filter or in an if-then-else statement.

For example, the details in this screenshot create a field created from the set file to test against, a condition to compare the value from the set file to make sure it matches that data (and doesn’t return null), and another field that uses that condition to return the desired value.

/FIELD=(LOOKUP_LAST_NAME, TYPE=ASCII, POSITION=9, SEPARATOR="\t", 
   SET="C:/IRI/cosort100/sets/names/names_last.set" [LAST_NAME] DEFAULT="FALSE")
/CONDITION=(LOOKUP_LAST_NAME_TEST, TEST=(NOT ispattern(LOOKUP_LAST_NAME, 
   "FALSE")))
/FIELD=(IF_FIELD, TYPE=ASCII, POSITION=10, SEPARATOR="\t", 
   IF LOOKUP_LAST_NAME_TEST THEN LAST_NAME ELSE "Missing")

The result is if the LAST_NAME is contained in the set file, it will return the last name, else it will return “Missing”.

Pattern Test

This rule compares the value in the specified field to a pattern to determine if it matches.  The result is used in the section as a filter or in an if-then-else statement.

For example, this condition and filter created by the details in the screenshot,

/CONDITION=(PATTERN_TEST, TEST=(ispattern(EMAIL, 
"\\b[\\w._%+-]+@[\\w.-]+\\.[A-Za-z]{2,4}\\b")))
/INCLUDE WHERE PATTERN_TEST

only includes records that have a properly formatted email address.

Value Range Test

Finally, this rule determines if the value in the specified field is in the range specified.  The result is used in the section as a filter or in an if-then-else statement.

For example, these two conditions and two filters created by the details in the screenshot,

/CONDITION=(LOW_END, TEST=(SALARY LT 1500))
/CONDITION=(HIGH_END, TEST=(SALARY GT 5000))
/INCLUDE WHERE LOW_END
/INCLUDE WHERE HIGH_END

test that the salary value is within the range of 1500 to 5000.

For more information on each rule, all the dialogs include help and tips upon hover to guide the user. If you need help implementing Voracity data mapping, data quality, data masking, or test data generation rules in IRI Workbench, contact or your IRI representative or email voracity@iri.com.

Feeding Datadog with Voracity Part 4: Security Analytics from DarkShield Logs
Introduction to IRI DarkShield
data quality data transformation data wrangling ETL IRI Voracity IRI Workbench SortCL

Related articles

IRI Data Class Map
Schema Data Class Search
Masking RDB Data in the…
Find & Mask File PII…
Importing Data Classes into the…
Data Class & Rule Library…
Connecting MariaDB and MySQL to…
Sharing IRI Data Management Jobs…
Running IRI Software in a…
The IRI Platform
IRI Test Data Generation

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Categories

  • Big Data 66
  • Business Intelligence (BI) 77
  • Data Masking/Protection 163
  • Data Quality (DQ) 41
  • Data Transformation 94
  • ETL 122
  • IRI 229
    • IRI Business 86
    • IRI Workbench 162
  • MDM 37
    • Master Data Management 12
    • Metadata Management 25
  • Migration 65
    • Data Migration 60
    • Sort Migration 6
  • Test Data 102
  • VLDB 78
  • VLOG 40

Tracking

© 2025 Innovative Routines International (IRI), Inc., All Rights Reserved | Contact