
Connecting Voracity to MapR

  • by Claudia Irvine

This article, along with counterpart articles for Cloudera and Hortonworks, describes the simple five-step process for connecting the IRI Voracity big data management platform to any MapR distribution through the VGrid Gateway.

After connecting, data can be moved between HDFS and other systems. That data can also be manipulated and masked in Hadoop via MR2, Spark, Spark Stream, Storm or Tez using jobs created in IRI Workbench, Voracity’s Eclipse IDE.

Step 1 – Collect Information from the HDFS Configuration File

  1. Use a terminal to search the HDFS configuration file.
  2. Make a note of the NameNode Web UI Port value reported by grep HTTPFS_HTTP_PORT /opt/mapr/hadoop/hadoop-2.7.0/etc/hadoop/httpfs-env.sh (in this case: 14000).
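
The same lookup can be scripted when several clusters need to be checked. The following is a minimal sketch, assuming the file assigns the variable on a line of the form `export HTTPFS_HTTP_PORT=14000`; the actual layout of httpfs-env.sh may differ between MapR releases. The function name `read_env_value` and the sample text are hypothetical and only illustrate the idea.

```python
import re
from typing import Optional


def read_env_value(text: str, name: str) -> Optional[str]:
    """Return the value assigned to `name` in shell-style env file text.

    Assumes assignments look like `export NAME=value` or `NAME=value`.
    Commented-out lines are ignored. Returns None if nothing matches.
    """
    pattern = re.compile(
        r"^\s*(?:export\s+)?" + re.escape(name) + r"\s*=\s*(.*?)\s*$"
    )
    found = None
    for line in text.splitlines():
        match = pattern.match(line)
        if match:
            # A later assignment overrides an earlier one, as in a shell.
            found = match.group(1).strip("\"'")
    return found


# Illustrative file content, not copied from a real cluster.
sample = """
# export HTTPFS_HTTP_PORT=9999
export HTTPFS_HTTP_PORT=14000
"""
print(read_env_value(sample, "HTTPFS_HTTP_PORT"))
```

Passing `OOZIE_HTTP_PORT` and the text of oozie-env.sh to the same function would cover the lookup in Step 3, under the same assumption about the assignment format.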

Step 2 – Collect Information from the MapR Configuration File

  1. Open the configuration file and use find to search for the appropriate keywords.

  2. Make a note of the NameNode Port (mapr.host) value (in this case: maprdemo, which is a host name here).

  3. Make a note of the ResourceManager Web Application HTTP Port (yarn.resourcemanager.webapp.address) value (in this case: 8088).

  4. Make a note of the MapReduce JobHistory Web Application HTTP Port (mapreduce.jobhistory.webapp.address) value (in this case: 19888).

  5. Make a note of the Resource Manager Address (yarn.resourcemanager.address) value (in this case: 8032).
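
For readers who prefer to extract these values with a script, here is a minimal sketch. It assumes the properties are stored in the standard Hadoop `<configuration>`/`<property>` XML layout and that address values take the `host:port` form; the article does not name the file, so check both assumptions against your cluster. The helper names `read_hadoop_properties` and `port_of` are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def read_hadoop_properties(xml_text: str) -> Dict[str, str]:
    """Map each <name> to its <value> in Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def port_of(address: str) -> str:
    """Return the part after the last colon, or the input if there is none."""
    return address.rsplit(":", 1)[-1]


# Illustrative content only; the host name and ports follow the article's example.
sample = """<configuration>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>maprdemo:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>maprdemo:8032</value>
  </property>
</configuration>"""

props = read_hadoop_properties(sample)
for key in ("yarn.resourcemanager.webapp.address", "yarn.resourcemanager.address"):
    print(key, "->", port_of(props[key]))
```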

Step 3 – Collect Information from the Oozie Configuration File

  1. Use a terminal to search the Oozie configuration file.

  2. Make a note of the Oozie HTTP Port value reported by grep OOZIE_HTTP_PORT /opt/mapr/oozie/oozie-4.3.0/conf/oozie-env.sh (in this case: 11000).

Step 4 – Enter Configuration Details in VGrid Dashboard

1. Log into the VGrid Gateway.

2. Click User > Add User and enter the user information.

3. Click the X in the success banner to refresh the screen.

4. Click Detail in the Action section of the new user.

5. Make a note of the generated API key shown. It will be needed for the VGrid Gateway setup on the Workbench preferences screen.

6. Click HadoopConfig and Add Hadoop Config.

  • Cluster = Cluster name
  • User = User name to act as in the Hadoop file system when working in IRI Workbench
  • Hdfs = NameNode Web UI Port
  • Namenode = NameNode Port
  • Proxy = ResourceManager Web Application HTTP Port
  • History = MapReduce JobHistory Web Application HTTP Port
  • Jobtracker = Resource Manager Address
  • Oozie = Oozie HTTP Port

7. Click the X in the success banner to refresh the screen.

8. Click HadoopConfig and click inactive to activate that configuration.

9. Multiple configurations can be associated with each user; however, only one can be active at any given time.
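
Before filling in the Add Hadoop Config form in item 6, it can help to gather the values from Steps 1 to 3 in one place and confirm none is blank. The sketch below is only a checklist; it does not call any VGrid API, and the class name `HadoopConfigEntry`, the cluster name, and the user name are hypothetical placeholders. The remaining values are the ones from this article's example.

```python
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class HadoopConfigEntry:
    """Values for the Add Hadoop Config form (hypothetical helper, not a VGrid API)."""
    cluster: str
    user: str
    hdfs: str        # NameNode Web UI Port (Step 1)
    namenode: str    # mapr.host value (Step 2)
    proxy: str       # ResourceManager Web Application HTTP Port (Step 2)
    history: str     # MapReduce JobHistory Web Application HTTP Port (Step 2)
    jobtracker: str  # Resource Manager Address (Step 2)
    oozie: str       # Oozie HTTP Port (Step 3)

    def missing_fields(self) -> List[str]:
        """Return the names of fields that are still empty."""
        return [name for name, value in asdict(self).items() if not value.strip()]


entry = HadoopConfigEntry(
    cluster="demo-cluster",  # placeholder
    user="mapr",             # placeholder
    hdfs="14000",
    namenode="maprdemo",
    proxy="8088",
    history="19888",
    jobtracker="8032",
    oozie="11000",
)
print(entry.missing_fields())
```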

Step 5 – Enter Configuration Details in IRI Workbench

1. Open IRI Workbench. On the IRI > VGrid Gateway preferences screen, enter the details of the connection.

2. Click Test Connection to confirm that the connection succeeds. On the same screen, you can narrow the included engines if only certain engines are used in the Hadoop environment, and select a default engine for Hadoop run configurations.

Once connected, you should be able to interact with HDFS and run compatible Voracity jobs seamlessly per this article. If you have any questions or need assistance, contact voracity@iri.com.


© 2025 Innovative Routines International (IRI), Inc., All Rights Reserved | Contact