{"id":17919,"date":"2024-08-19T19:02:50","date_gmt":"2024-08-19T23:02:50","guid":{"rendered":"https:\/\/www.iri.com\/blog\/?p=17919"},"modified":"2024-08-22T10:18:50","modified_gmt":"2024-08-22T14:18:50","slug":"etl-part-2","status":"publish","type":"post","link":"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/","title":{"rendered":"Joining Flat-File &#038; RDB Data: Textual ETL (Part 2)"},"content":{"rendered":"<h5><b>Introduction<\/b><\/h5>\n<p><span style=\"font-weight: 400;\">This article demonstrates the <\/span><a href=\"https:\/\/www.iri.com\/products\/voracity\"><span style=\"font-weight: 400;\">Voracity<\/span><\/a><span style=\"font-weight: 400;\"> user&#8217;s ability to join values in a flat file to those in a relational database (RDB) table to provide meaningful information. It continues <\/span><a href=\"https:\/\/www.iri.com\/blog\/migration\/data-migration\/textual-etl\/\"><span style=\"font-weight: 400;\">this article<\/span><\/a><span style=\"font-weight: 400;\"> on preparing unstructured data for integration with structured data and the standard technologies (like an Oracle database or JSON file) that support them.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">As shown in that first article, the flat file was created by <\/span><a href=\"https:\/\/www.iri.com\/products\/cosort\/sortcl\"><span style=\"font-weight: 400;\">SortCL<\/span><\/a><span style=\"font-weight: 400;\"> data transposition scripts acting on a delimited <\/span><a href=\"https:\/\/www.iri.com\/products\/darkshield\"><span style=\"font-weight: 400;\">IRI DarkShield<\/span><\/a><span style=\"font-weight: 400;\"> log file. That DarkShield log (optionally) held PII values along with file-specific metadata from a search through multiple unstructured data sources. 
The SortCL jobs pivoted the log file into a new row-column structure organized by data class for easier use.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Neither the original log file nor the pivoted flat file with the PII provides discernibly meaningful information about any particular individual beyond the email addresses or names in them. This is because DarkShield searches log the PII found in unstructured sources chronologically, and unstructured sources, by definition, have no structure that associates the PII elements found in them.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Thus, for those interested in learning more about, or making use of, unstructured data like the PII in this case, an association to this data must be (re)established. To do that here, we pick one of the PII elements from the transposed log file \u2013 in the example below, the email address \u2013 and <\/span><a href=\"https:\/\/www.iri.com\/solutions\/data-transformation\/match-join\"><span style=\"font-weight: 400;\">join it<\/span><\/a><span style=\"font-weight: 400;\"> to a structured source (e.g., a database table) that may hold the same email address <\/span><i><span style=\"font-weight: 400;\">along with <\/span><\/i><span style=\"font-weight: 400;\">identifying, or otherwise meaningful, information about that same person.<\/span><\/p>\n<h5><b>Why Does This Matter?<\/b><\/h5>\n<p><span style=\"font-weight: 400;\">What is a use case for this kind of capability? Consider a nondescript email address that DarkShield found in a PDF complaint describing criminal activity, or an account number found in a chat log or call recording. How about clinical notes naming a patient and their symptoms with no link to care or trial resources? 
Unfortunately, such data is often just archived and untapped.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Instead, consider what can happen when those values are extracted and combined with traditional sources of data in sequential files, semi-structured EDI documents, or relational databases. In the cases above, a perpetrator could be identified, an account holder protected, and a patient treatment record made more holistic.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Until now, combining such data has been a largely manual task, and most such efforts have proved unfruitful. The example in this article therefore serves two purposes:<\/span><\/p>\n<p>1. <span style=\"font-weight: 400;\">To complete the textual ETL example in the first article by showing how the disassociated PII (email addresses) discovered by DarkShield in Voracity can be joined to structured data to create useful information about any email addresses for which the join finds a match; and\u00a0<\/span><\/p>\n<p>2. To demonstrate how to create and run a join operation in the IRI CoSort product or Voracity ETL platform using the SortCL job script and program common to both.<\/p>\n<h5><b>Textual Data Integration Example<\/b><\/h5>\n<p><span style=\"font-weight: 400;\">This sample use case uses real data in a structured database to identify the owners of matching email addresses discovered in a PII search job that DarkShield performed across unstructured data sources. 
In the <\/span><a href=\"https:\/\/www.iri.com\/blog\/migration\/data-migration\/textual-etl\/\"><span style=\"font-weight: 400;\">first article<\/span><\/a><span style=\"font-weight: 400;\">, we transposed the DarkShield log file into a disassociated but formatted report of discovered PII:<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-17924 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2024\/08\/Darkshield-log-file-into-disassociated-300x210.png\" alt=\"\" width=\"856\" height=\"599\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/Darkshield-log-file-into-disassociated-300x210.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/Darkshield-log-file-into-disassociated-768x538.png 768w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/Darkshield-log-file-into-disassociated.png 833w\" sizes=\"(max-width: 856px) 100vw, 856px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Looking across these restructured rows, there is no relationship between any of the elements. 
But in the relational database fact table below, there are (note matches of names with emails):<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-17925 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2024\/08\/relational-database-fact-table-300x114.png\" alt=\"\" width=\"711\" height=\"270\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/relational-database-fact-table-300x114.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/relational-database-fact-table-1024x389.png 1024w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/relational-database-fact-table-768x292.png 768w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/relational-database-fact-table.png 1190w\" sizes=\"(max-width: 711px) 100vw, 711px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">To identify the real owner of the DarkShield-discovered email address using the real database table above, perform a join on the Email field between the CSV file and the table. 
To build that job, run a fit-for-purpose <\/span><a href=\"https:\/\/www.youtube.com\/watch?v=vldvUG1tZp4\"><span style=\"font-weight: 400;\">Join job<\/span><\/a><span style=\"font-weight: 400;\"> wizard from the CoSort menu, or an <\/span><a href=\"https:\/\/www.iri.com\/blog\/data-protection\/creating-a-voracity-flow-from-the-palette-part-2-of-2\/\"><span style=\"font-weight: 400;\">ETL Join<\/span><\/a><span style=\"font-weight: 400;\"> in Voracity, in the <\/span><a href=\"https:\/\/www.iri.com\/products\/workbench\"><span style=\"font-weight: 400;\">IRI Workbench<\/span><\/a><span style=\"font-weight: 400;\"> GUI.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The join job wizard in Workbench starts with specifying the name and location of the job script:<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-17926 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2024\/08\/job-specification-file-300x194.png\" alt=\"\" width=\"510\" height=\"330\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/job-specification-file-300x194.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/job-specification-file.png 609w\" sizes=\"(max-width: 510px) 100vw, 510px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">In the Data Source screen that comes next, specify the CSV file (transposed after a DarkShield search per the prior article), and the existing database fact table. 
In this case, that table, called ACCOUNT, is in an Oracle schema called SCOTT.<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-17928 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2024\/08\/new-join-job_data-sources_-300x228.png\" alt=\"\" width=\"490\" height=\"373\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/new-join-job_data-sources_-300x228.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/new-join-job_data-sources_.png 605w\" sizes=\"(max-width: 490px) 100vw, 490px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">From the CSV file, pull in only the EMAIL field to join and then map to output. From the table, specify both the EMAIL field for the join and all other columns of interest associated with any email addresses that may match those in the file. More specifically, map out the values in the SSN, FIRST_NAME, LAST_NAME, DOB, NATIONALITY, OCCUPATION, PHONE, and ADDRESS columns.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Also required with each source is its \/FIELD metadata layout in SortCL Data Definition File (DDF) <\/span><a href=\"https:\/\/www.iri.com\/products\/cosort\/sortcl-metadata\"><span style=\"font-weight: 400;\">syntax<\/span><\/a><span style=\"font-weight: 400;\">. We can either add the \/FIELD definitions from an existing DDF, or auto-generate and incorporate them through the \u2018<\/span><a href=\"https:\/\/www.iri.com\/blog\/data-transformation2\/using-the-metadata-discovery-wizard\/\"><span style=\"font-weight: 400;\">Discover Metadata \u2026<\/span><\/a><span style=\"font-weight: 400;\">\u2019 option.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The next step is to choose and define the join key. In this case, the join key is the EMAIL field. 
The statement syntax will need to reflect an inner join (match) for the EMAIL field in both the file and the table:<\/span><\/p>\n<pre><b>\/JOIN<\/b> <b>NOT_SORTED<\/b><span style=\"font-weight: 400;\"> DarkShieldSearchLogTransposed <\/span><b>NOT_SORTED<\/b><span style=\"font-weight: 400;\"> ORACLE_SCOTT_ACCOUNT <\/span><b>WHERE<\/b><span style=\"font-weight: 400;\"> DARKSHIELDSEARCHLOGTRANSPOSED.EMAIL == ORACLE_SCOTT_ACCOUNT.EMAIL<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">This statement can be constructed graphically on the next page of the wizard by selecting EMAIL on both sides of the join:<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-17929 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2024\/08\/new-join-job_join-sources-257x300.png\" alt=\"\" width=\"501\" height=\"585\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/new-join-job_join-sources-257x300.png 257w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/new-join-job_join-sources.png 610w\" sizes=\"(max-width: 501px) 100vw, 501px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">The next and final page of the wizard is where the target file format and field layouts are defined, along with any further formatting and transformations desired. 
In this case, the joined rows will be kept in a Tab-Separated Values (TSV) file with no further functions, such as aggregation, masking, or data type conversion, performed on the output values.<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-17930\" src=\"\/blog\/wp-content\/uploads\/2024\/08\/data-targets-300x191.png\" alt=\"\" width=\"587\" height=\"373\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/data-targets-300x191.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/data-targets.png 732w\" sizes=\"(max-width: 587px) 100vw, 587px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Once we finish the wizard, a job script like this is built automatically from what was specified:<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-17931 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2024\/08\/job-script-300x162.png\" alt=\"\" width=\"736\" height=\"397\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/job-script-300x162.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/job-script-1024x552.png 1024w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/job-script-768x414.png 768w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/job-script.png 1088w\" sizes=\"(max-width: 736px) 100vw, 736px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">From the job script above, the transform mapping (ETL) diagram shown below is produced, in which the email values from the file are matched to the email values in the table. The table contains the other attributes of interest that will be written to output.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The output fields are then written in tab-delimited format. 
The CSV process type defined in the output phase of the job script will generate a header record in the output file.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">When the join is run, it produces this output report, which reveals the identities of the people in the table whose email addresses matched those that DarkShield <\/span><a href=\"https:\/\/www.iri.com\/blog\/migration\/data-migration\/textual-etl\/\"><span style=\"font-weight: 400;\">had found<\/span><\/a><span style=\"font-weight: 400;\"> in its search through multiple unstructured data sources:<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-17933 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2024\/08\/output-report-300x163.png\" alt=\"\" width=\"745\" height=\"405\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/output-report-300x163.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/output-report-1024x556.png 1024w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/output-report-768x417.png 768w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/output-report-1536x834.png 1536w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/output-report.png 1110w\" sizes=\"(max-width: 745px) 100vw, 745px\" \/><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-17934 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2024\/08\/output-report_list-300x44.png\" alt=\"\" width=\"778\" height=\"114\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/output-report_list-300x44.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/output-report_list.png 861w\" sizes=\"(max-width: 778px) 100vw, 778px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">If you have any questions or are interested in ETL, textual or otherwise,\u00a0<\/span><span style=\"font-weight: 400;\"><span style=\"box-sizing: border-box; margin: 0px; padding: 0px;\">please get in touch 
with\u00a0<a href=\"mailto:voracity@iri.com\" target=\"_blank\" rel=\"noopener\">voracity@iri.com<\/a><\/span>.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduction This article demonstrates the Voracity user&#8217;s ability to join values in a flat-file file to those in an RDB (Relational Database) table to provide meaningful information. It is the continuation of this article on preparing unstructured data for integration with structured data and the standard technologies (like an Oracle database or JSON file) that<\/p>\n<div><a class=\"btn-filled btn\" href=\"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/\" title=\"Joining Flat-File &#038; RDB Data: Textual ETL (Part 2)\">Read More<\/a><\/div>\n","protected":false},"author":53,"featured_media":17939,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"footnotes":""},"categories":[],"tags":[],"class_list":["post-17919","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v23.3 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Joining Flat-File &amp; RDB Data: Textual ETL (Part 2) - IRI<\/title>\n<meta name=\"description\" content=\"Learn to join data in a flat-file file (from a DarkShield unstructured PII search log) with an RDB table to create insight from textual ETL.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Joining 
Flat-File &amp; RDB Data: Textual ETL (Part 2) - IRI\" \/>\n<meta property=\"og:description\" content=\"Learn to join data in a flat-file file (from a DarkShield unstructured PII search log) with an RDB table to create insight from textual ETL.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/\" \/>\n<meta property=\"og:site_name\" content=\"IRI\" \/>\n<meta property=\"article:published_time\" content=\"2024-08-19T23:02:50+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-08-22T14:18:50+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/Featured-image-ETL-part-2.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1110\" \/>\n\t<meta property=\"og:image:height\" content=\"532\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Chaitali Mitra\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Chaitali Mitra\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"7 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/\"},\"author\":{\"name\":\"Chaitali Mitra\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/9bae14a309616863b027c2d56f532caf\"},\"headline\":\"Joining Flat-File &#038; RDB Data: Textual ETL (Part 2)\",\"datePublished\":\"2024-08-19T23:02:50+00:00\",\"dateModified\":\"2024-08-22T14:18:50+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/\"},\"wordCount\":1059,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/Featured-image-ETL-part-2.png\",\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/\",\"url\":\"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/\",\"name\":\"Joining Flat-File & RDB Data: Textual ETL (Part 2) - 
IRI\",\"isPartOf\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/Featured-image-ETL-part-2.png\",\"datePublished\":\"2024-08-19T23:02:50+00:00\",\"dateModified\":\"2024-08-22T14:18:50+00:00\",\"description\":\"Learn to join data in a flat-file file (from a DarkShield unstructured PII search log) with an RDB table to create insight from textual ETL.\",\"breadcrumb\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/#primaryimage\",\"url\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/Featured-image-ETL-part-2.png\",\"contentUrl\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/Featured-image-ETL-part-2.png\",\"width\":1110,\"height\":532},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/beta.iri.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Joining Flat-File &#038; RDB Data: Textual ETL (Part 2)\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#website\",\"url\":\"https:\/\/beta.iri.com\/blog\/\",\"name\":\"IRI\",\"description\":\"Total Data Management 
Blog\",\"publisher\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/beta.iri.com\/blog\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#organization\",\"name\":\"IRI\",\"url\":\"https:\/\/beta.iri.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png\",\"contentUrl\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png\",\"width\":750,\"height\":206,\"caption\":\"IRI\"},\"image\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/9bae14a309616863b027c2d56f532caf\",\"name\":\"Chaitali Mitra\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/95a11f3d0b709c00df3262bab0152f3a?s=96&d=blank&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/95a11f3d0b709c00df3262bab0152f3a?s=96&d=blank&r=g\",\"caption\":\"Chaitali Mitra\"},\"sameAs\":[\"http:\/\/www.iri.com\"],\"url\":\"https:\/\/beta.iri.com\/blog\/author\/chaitalim\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Joining Flat-File & RDB Data: Textual ETL (Part 2) - IRI","description":"Learn to join data in a flat-file file (from a DarkShield unstructured PII search log) with an RDB table to create insight from textual ETL.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/","og_locale":"en_US","og_type":"article","og_title":"Joining Flat-File & RDB Data: Textual ETL (Part 2) - IRI","og_description":"Learn to join data in a flat-file file (from a DarkShield unstructured PII search log) with an RDB table to create insight from textual ETL.","og_url":"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/","og_site_name":"IRI","article_published_time":"2024-08-19T23:02:50+00:00","article_modified_time":"2024-08-22T14:18:50+00:00","og_image":[{"width":1110,"height":532,"url":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/Featured-image-ETL-part-2.png","type":"image\/png"}],"author":"Chaitali Mitra","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Chaitali Mitra","Est. 
reading time":"7 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/#article","isPartOf":{"@id":"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/"},"author":{"name":"Chaitali Mitra","@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/9bae14a309616863b027c2d56f532caf"},"headline":"Joining Flat-File &#038; RDB Data: Textual ETL (Part 2)","datePublished":"2024-08-19T23:02:50+00:00","dateModified":"2024-08-22T14:18:50+00:00","mainEntityOfPage":{"@id":"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/"},"wordCount":1059,"commentCount":0,"publisher":{"@id":"https:\/\/beta.iri.com\/blog\/#organization"},"image":{"@id":"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/#primaryimage"},"thumbnailUrl":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/Featured-image-ETL-part-2.png","inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/","url":"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/","name":"Joining Flat-File & RDB Data: Textual ETL (Part 2) - IRI","isPartOf":{"@id":"https:\/\/beta.iri.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/#primaryimage"},"image":{"@id":"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/#primaryimage"},"thumbnailUrl":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/Featured-image-ETL-part-2.png","datePublished":"2024-08-19T23:02:50+00:00","dateModified":"2024-08-22T14:18:50+00:00","description":"Learn to join data in a flat-file file (from a DarkShield unstructured PII search log) with an RDB table to create insight from textual 
ETL.","breadcrumb":{"@id":"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/#primaryimage","url":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/Featured-image-ETL-part-2.png","contentUrl":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/Featured-image-ETL-part-2.png","width":1110,"height":532},{"@type":"BreadcrumbList","@id":"https:\/\/beta.iri.com\/blog\/data-transformation2\/etl-part-2\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/beta.iri.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Joining Flat-File &#038; RDB Data: Textual ETL (Part 2)"}]},{"@type":"WebSite","@id":"https:\/\/beta.iri.com\/blog\/#website","url":"https:\/\/beta.iri.com\/blog\/","name":"IRI","description":"Total Data Management Blog","publisher":{"@id":"https:\/\/beta.iri.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/beta.iri.com\/blog\/?s={search_term_string}"},"query-input":"required 
name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/beta.iri.com\/blog\/#organization","name":"IRI","url":"https:\/\/beta.iri.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png","contentUrl":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png","width":750,"height":206,"caption":"IRI"},"image":{"@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/9bae14a309616863b027c2d56f532caf","name":"Chaitali Mitra","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/95a11f3d0b709c00df3262bab0152f3a?s=96&d=blank&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/95a11f3d0b709c00df3262bab0152f3a?s=96&d=blank&r=g","caption":"Chaitali 
Mitra"},"sameAs":["http:\/\/www.iri.com"],"url":"https:\/\/beta.iri.com\/blog\/author\/chaitalim\/"}]}},"jetpack_featured_media_url":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/08\/Featured-image-ETL-part-2.png","_links":{"self":[{"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/posts\/17919"}],"collection":[{"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/users\/53"}],"replies":[{"embeddable":true,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/comments?post=17919"}],"version-history":[{"count":13,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/posts\/17919\/revisions"}],"predecessor-version":[{"id":17950,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/posts\/17919\/revisions\/17950"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/media\/17939"}],"wp:attachment":[{"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/media?parent=17919"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/categories?post=17919"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/tags?post=17919"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}