Skip to content
IRI Logo
Solutions Products
  • Solutions
  • Products
  • Blog
  • BI
  • Big Data
  • DQ
  • ETL
  • IRI
    • IRI Business
    • IRI Workbench
  • Mask
  • MDM
    • Master Data Management
    • Metadata Management
  • Migrate
    • Data Migration
    • Sort Migration
  • Test Data
  • Transform
  • VLDB
  • VLOG

Real-time Database Data Replication

  • by Devon Kozenieski

Abstract:  The previous article in this series of 4 detailed how to install, set up, and configure IRI Ripcurrent, an IRI-developed command-line Java application that utilizes the Debezium embedded engine and the streaming feature of the IRI (CoSort) SortCL program to react in real-time to database change events by replicating data to downstream target(s), optionally with transformation (e.g., PII masking) rules consistently applied based on classification of the data. 

One capability of Ripcurrent is real-time data replication. Ripcurrent can monitor and react to the insertion, deletion, or update of any rows in the source database by replicating that data to target silo(s). Real-time replication of data through Ripcurrent will be demonstrated in this article through example.

Demonstration Setup

For this example, I am using a MySQL database as the source database being monitored; Debezium supports the monitoring of several other different types of databases, including PostgreSQL, Oracle, SQL Server, and DB2. The MySQL database must first be set up properly for Debezium to be able to monitor it. 

See Debezium documentation for detailed instructions on how to set up a database for monitoring of changes. The process varies depending on the database vendor. 

In the case of MySQL, this involves enabling binary logging for MySQL replication. The binary logs record transaction updates that allow replication tools to propagate changes.

I have already installed the Ripcurrent feature for IRI Workbench and set up a configuration file as specified in the previous article. I have also set up tables in the target schema with the same structure as the tables I am monitoring in the source database.

Ripcurrent is suited for many different use cases where data is being added and updated to a database frequently (e.g. orders/transactions, sensor data from IoT devices like weather conditions, location data from transportation vehicles, patient health data). 

In this case, I am demonstrating an example of data replication with financial transactions. The source table is being updated with new transactions as they occur. Ripcurrent can replicate these new transactions to file and table targets as they happen in near real time. 

Of course, there is always some latency in copying data from source to targets, but this is usually measured in sub-seconds unless a large amount of data has been modified or added to the table at once.

The image below shows a sample of data in this table.

Generating a Properties File

To begin with, a properties file should be created to define configurations required for Ripcurrent. After installing Ripcurrent in IRI Workbench, the Ripcurrent Properties Generator wizard is available in the IRI Voracity (orca icon) menu to assist in this task.

On the first page of the wizard, I selected a MySQL database as the source to monitor:

On the second page, I selected the option to replicate changes to both files in a directory and database tables in a specified schema.  I chose a SQL Server database as my target and delimited flat files as a secondary target. I entered \t as the target separator to make the flat file output tab-separated.

In the case of direct data replication without any transformations, there is no need to select anything on the third page, which pertains to selecting IRI libraries to use with Ripcurrent that defines how to classify and transform data.

On the final page, I entered a name for the connector and server, and a path for database structure change events to be logged to. For the most part, the connector name and server name can be anything as long as they are unique. They may show in Ripcurrent logs, and also are used for identification.

There are a few restrictions on the format of the server name; the Ripcurrent documentation discusses Ripcurrent configuration properties in detail. The server ID is not required; I did not specify a value in this case.

After successful completion of the wizard, a properties file will be generated and opened in the editor:

Demonstrating Real-Time Data Replication with Ripcurrent

Ripcurrent can be started by running the ripcurrent.bat script in the bin folder of the Ripcurrent distribution, since I am using a Windows operating system. This script can be triggered through a terminal or through an IRI Workbench external tools run configuration (see below).

Ripcurrent runs in the background, monitoring events from Debezium until explicitly terminated. 

Snapshot of running Ripcurrent from the External Tools Configuration in IRI Workbench.
Logging output has been configured to display on the console.

Each event received from Debezium when a database table has data added or modified includes information about the new data values, the data types, and the names for each column.

The data received is used as the input to a dynamically generated SortCL job based on the information from the Debezium event. New jobs are created when there is an event from a table not seen before, or there is an event from a table seen before but there is no existing job that has the same layout of columns from the event and the same type of event (e.g. update, insert, delete). 

With Ripcurrent running, I edited values in the source table through the Data Source Explorer in IRI Workbench to trigger change events. 

The new data in the source table is mirrored to a corresponding table in the target SQL Server schema, as well as a file in a specified directory. Ripcurrent will replicate data to additional tables or files for each event that is received from a unique table in the source database.

First, I inserted one row (shown highlighted) into the source transactions table in MySQL.

This was the first event received from the table, so a new replication job was started.

As a result of the replication job, the data added to the source table was duplicated to a tab-delimited file in the target directory. 

Additionally, the data was replicated to the target table.

Next, I modified the account number of the record previously inserted into the source table. Note that updates and deletes of data from the source database can only be reflected in database targets, not file targets.

This change is also captured by Ripcurrent and reflected in the target table.

Finally, I added several new transaction records to the source table (3051-3053):

Ripcurrent captures these additional changes and duplicates them to the target table.

Conclusion

This article demonstrated one use case for Ripcurrent – the replication of changed database data directly into target files and/or tables. The next (third) article in this series will augment this demonstration with the use of data classification and masking rules to categorize and mask data on a consistent basis before it is replicated to your targets.

Getting Started with IRI Ripcurrent
Real-Time, Incremental Data Masking
data migration IRI Ripcurrent Real-time db data replication Ripcurrent

Related articles

Schema Data Class Search
Connecting MariaDB and MySQL to…
Running IRI Software in a…
The IRI Platform
IRI Test Data Generation
IRI Data Quality and Improvement
IRI Data Migration and Modernization
IRI Voracity and Test Design…
Creating Set Files in IRI…
All About IRI Set Files:…
Using Ripcurrent for DB Structure…

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Categories

  • Big Data 66
  • Business Intelligence (BI) 77
  • Data Masking/Protection 163
  • Data Quality (DQ) 41
  • Data Transformation 94
  • ETL 122
  • IRI 229
    • IRI Business 86
    • IRI Workbench 162
  • MDM 37
    • Master Data Management 12
    • Metadata Management 25
  • Migration 65
    • Data Migration 60
    • Sort Migration 6
  • Test Data 102
  • VLDB 78
  • VLOG 40

Tracking

© 2025 Innovative Routines International (IRI), Inc., All Rights Reserved | Contact