Big Data Archives

Running Voracity Jobs in Hadoop

by Sharon Hewitt

Many of the same data manipulation, masking, and test data generation jobs you can run in IRI Voracity® with the default SortCL program can now also run seamlessly in Hadoop™.

Creating & Executing SQL Statements in IRI Workbench

by Susan Gegner

Among the many database-centric features in IRI Workbench is the ability to create, modify, and execute SQL statements manually or graphically. These “SQL scrap-booking” features are available through the free Data Tools Platform (DTP) plug-in for Eclipse, which also supports IRI job wizards for database:

profiling, searching, classification, E-R diagramming, and integrity checking; integration, including ETL, pivoting, slowly changing dimensions, and change data capture; column masking, including format-preserving encryption, redaction, and pseudonymization; subsetting, test data generation, and bulk loading; migration, replication, and offline reorgs

To use the cross-platform(!) …

IoT & IRI – Aggregation on the Edge

by Abhishek Deshpande

As a follow-on to my introduction to IoT and MQTT, this article describes how device data collected from standard climate sensors and sent through an MQTT (Message Queuing Telemetry Transport) server can be rapidly and inexpensively pre-aggregated by a single SortCL program (in IRI CoSort) on a tiny edge device or gateway.

An Introduction to IoT & MQTT

by Abhishek Deshpande

This article introduces the concept of the Internet of Things (IoT) and the popular, lightweight Message Queuing Telemetry Transport (MQTT) protocol for moving data from IoT devices into processing frameworks.

Comparing Big Data Integration Methods

by Dale Robson

“That’s all you need in life, a little place for your stuff. That’s all your house is, a place to keep your stuff. If you didn’t have so much stuff, you wouldn’t need a house. …

A Fresh Look at Data Preparation

by David Friedland

To analyze data successfully, you must first prepare it successfully. Poor-quality data creates poor results. Worse yet is data that takes too long to collect and clean because it is too big or too foreign.

Voracity and the Logical Data Warehouse (LDW)

by Jason Koivu

The traditional or enterprise data warehouse (EDW) has been at the center of data’s transformation to business intelligence (BI) for years. An EDW involves a centralized data repository (traditionally, a relational database) from which data marts and reports are built.

The Use of Data Lakes

by Jason Koivu

Has your organization considered using a data lake? This article explains what a data lake is, and how you can fish its murky depths for value in an architecture optimized for your needs.

5 Downsides to Megavendor Devotion

by Donna Davis

Very large legacy IT vendors, or what we’ll call megavendors, provide valuable hardware, software, and services to companies worldwide. Often, however, their technical approach, product roadmap, and price point will not be the best fit for your use case.

Indexing Splunk with Voracity Add-On

by Dan Klajn

Update Q3’2019: Subsequent to the development of the IRI Voracity Add-On for Splunk described below, there is now also a Splunkbase-registered IRI Voracity App for Splunk available for Seamless Data Preparation, Indexing, and Visualization…

After our first examples of external unstructured data preparation and PII data masking for Splunk generated interest in these capabilities, IRI wanted to develop a direct integration from the Splunk user interface (UI).

Linear Regression – A Predictive Tool in IRI Voracity…

by Dustin Ellsworth

Linear regression is a staple data analysis function in finance, economics, research, and many other disciplines that helps discover new data correlations. Users of the IRI Voracity platform can now simultaneously process big data from any number of sources and present customized trend lines to help business users make predictions.
