{"id":6011,"date":"2014-06-04T17:10:56","date_gmt":"2014-06-04T21:10:56","guid":{"rendered":"http:\/\/www.iri.com\/blog\/?p=6011"},"modified":"2024-01-05T15:15:34","modified_gmt":"2024-01-05T20:15:34","slug":"unstructured-data-data-restructuring-wizard","status":"publish","type":"post","link":"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/","title":{"rendered":"Using the Dark Data Discovery Wizard to Unlock Unstructured Data"},"content":{"rendered":"<p><span style=\"text-decoration: underline;\">Editor&#8217;s note &#8230; this article, first posted in <strong>2014<\/strong> for the wizard it describes, has been updated as follows:<\/span><\/p>\n<ol>\n<li><em>June 2015: This wizard was renamed from the Data Restructuring Wizard to the Dark Data Discovery Wizard, and was provided free in <a href=\"https:\/\/www.iri.com\/products\/workbench\">IRI Workbench<\/a> for users of <a href=\"https:\/\/www.iri.com\/solutions\/data-and-database-migration\/file-conversion\/free-nextform-lite-version\" target=\"_blank\" rel=\"noopener\">IRI NextForm Lite<\/a>.<\/em><\/li>\n<li><em>October 2018: This wizard is now also used with both the <a href=\"https:\/\/www.iri.com\/products\/cellshield\/cellshield-ee\">IRI CellShield Enterprise Edition (EE)<\/a> and <a href=\"https:\/\/www.iri.com\/products\/darkshield\">IRI DarkShield<\/a> products for searching, extracting, and masking PII in multiple LAN-connected sources at once, and is being enhanced with value lookup, machine-learned NLP models for NER, and fuzzy search criteria. Additional blog content on DarkShield uses will follow.<\/em><\/li>\n<li><em>April &amp; July 2019: Updated UI images and instructions, updated file formats for DarkShield v2 and v3. 
<a href=\"https:\/\/www.iri.com\/products\/voracity\">IRI Voracity<\/a>\u00a0data management platform users can also leverage this wizard for textual ETL applications.<\/em><\/li>\n<li><em>October 2020: This wizard was subsumed in the IRI DarkShield feature menu in IRI Workbench and renamed to the &#8220;New Dark Data Search\/Masking Job &#8230;&#8221; wizard, then described in <a href=\"https:\/\/www.iri.com\/blog\/data-protection\/searching-and-masking-with-darkshield\/\">this article<\/a>.<\/em><\/li>\n<li><em><strong>January 2024<\/strong>:<\/em> <em>The wizard has been updated again for DarkShield V5 to support the upgraded data classification infrastructure and further ergonomic improvements. See <a href=\"https:\/\/www.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/\">this article<\/a> now.<\/em><\/li>\n<\/ol>\n<p>The idea of dark data in unstructured sources and formats was introduced in\u00a0<a title=\"Finding Dark Data in Unstructured Sources with the IRI Data Restructuring Wizard Blog\" href=\"http:\/\/www.iri.com\/blog\/migration\/data-migration\/data-restructuring-wizard-unstructured-data\/\" target=\"_blank\" rel=\"noopener\">Finding Dark Data in Unstructured Sources<\/a>\u00a0(to introduce the IRI Data Restructuring Wizard). Recall that corporations and government agencies may have a lot of useful information trapped in these unstructured formats that can be\u00a0mashed up with other (usually structured) repositories and\u00a0mined for the benefit of operations, promotions, analytics, law enforcement, etc. However, some of these sources are difficult to parse, and the data they contain need\u00a0structure to be useful\u00a0in data integration and reporting contexts. 
This is where IRI&#8217;s\u00a0<em>Dark Data Discovery\u00a0<\/em><em>Wizard\u00a0<\/em>is\u00a0useful; it unlocks and organizes dark\u00a0data so it can start driving real\u00a0value to the business.<\/p>\n<p>The general idea is that, after parsing through the data in unstructured\u00a0sources,\u00a0you can output what you&#8217;re looking for into a structured text (flat) file, with its layout automatically defined in a data definition file (.DDF). The file and its metadata repository are easily used and re-used by IRI software to integrate, transform, migrate, mask, and report on that data, and\/or feed it to other applications.<\/p>\n<p>Note also that CoSort can query and join over flat files directly, or facilitate the creation and population of tables with\u00a0DBA-defined primary-foreign keys. In this way, dark data extracts can acquire the form and relationships (structure) that make them far more useful.<\/p>\n<p><strong>Using the Wizard<\/strong><\/p>\n<p>The IRI\u00a0<em>Dark Data Discovery<\/em>\u00a0wizard will\u00a0search every supported unstructured document type in every directory below the root network drive you specify. 
The search for your dark data is based on <em>Data Classes<\/em>, which can contain any combination of regular expression patterns, lookup set files, Named Entity Recognition (NER) models, path filters for semi-structured files, area bounding boxes, and detected or recognized faces.<\/p>\n<p>Here is a list of unstructured sources containing strings that the wizard can search, extract, and structure:<\/p>\n<ul>\n<li>Free-form text (.txt)<\/li>\n<li>Microsoft Word documents (.doc and .docx)<\/li>\n<li>Adobe Portable Document Format (.pdf)<\/li>\n<li>Extensible Markup Language (.xml)<\/li>\n<li>E-mail messages (.eml)<\/li>\n<li>Microsoft Excel spreadsheets (.xls and .xlsx)<\/li>\n<li>Microsoft PowerPoint presentations (.ppt and .pptx)<\/li>\n<li>Microsoft Exchange and Outlook (.ost and .pst)<\/li>\n<li>Rich Text Format (.rtf)<\/li>\n<li>Hypertext Markup Language files (.html)<\/li>\n<li>JavaScript Object Notation files (.json)<\/li>\n<li>MongoDB and Cassandra <a href=\"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/\">NoSQL DB collections<\/a><\/li>\n<li>Various image formats (.tiff, .jpeg, .png, .gif, .jp2, .jpx, .bmp)<\/li>\n<\/ul>\n<p>To open the wizard, open the <em>DarkShield Menu<\/em>\u00a0and select\u00a0<em>New Dark Data Discovery Job<\/em>.<\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2014\/06\/setup_page.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-12734 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2014\/06\/setup_page.png\" alt=\"\" width=\"646\" height=\"462\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/setup_page.png 646w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/setup_page-300x215.png 300w\" sizes=\"(max-width: 646px) 100vw, 646px\" \/><\/a><\/p>\n<p>From the setup page, specify the folder and file names for the structured output file and the data definition file (DDF) metadata for that file. 
The field names in the DDF will correspond to\u00a0the keywords and patterns you searched for, as well as the forensic attributes that you selected to be part of the output file.<\/p>\n<p><a href=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2014\/06\/data_sources_page.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-12988 size-full aligncenter\" src=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2014\/06\/data_sources_page.png\" alt=\"\" width=\"546\" height=\"450\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/data_sources_page.png 546w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/data_sources_page-300x247.png 300w\" sizes=\"(max-width: 546px) 100vw, 546px\" \/><\/a><\/p>\n<p>Select any combination of sources (file system directories and SMB shares are currently supported), along with the list of file types to search.<\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2014\/06\/metadata_selection.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-12737 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2014\/06\/metadata_selection.png\" alt=\"\" width=\"646\" height=\"462\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/metadata_selection.png 646w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/metadata_selection-300x215.png 300w\" sizes=\"(max-width: 646px) 100vw, 646px\" \/><\/a><\/p>\n<p>You can also\u00a0profile\u00a0several\u00a0different forensic aspects\u00a0of the dark data you&#8217;re discovering. The wizard can\u00a0identify and display the creation, modification, and access dates of the data source, as well as its full path,\u00a0owner,\u00a0linkage, and hidden attributes. 
Choose the delimiter character to separate the fields in the flat results file, such as a comma, or &#8220;|&#8221; as shown.<\/p>\n<p>There are a few ways to define the values to find:<\/p>\n<ol>\n<li>Enter a specific value.<\/li>\n<li>Use regular expressions to search for specific patterns.\u00a0If you are not familiar with <a title=\"Regular Expressions Wikipedia\" href=\"http:\/\/en.wikipedia.org\/wiki\/Regular_expression\" target=\"_blank\" rel=\"noopener\">regular expressions<\/a>, a lot of assistance is available on the internet, including\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Regular_expression\">here<\/a>\u00a0at Wikipedia. IRI also provides examples in the wizard&#8217;s easy-to-use context help.<\/li>\n<li>Provide an IRI Set file for a dictionary search. A dictionary search is similar to searching for a specific value, except that instead of using one value to search against, you use a file containing many values.<\/li>\n<li>Include an NER model trained to recognize named entities from the context of the sentences in which they appear.<\/li>\n<\/ol>\n<p>The last two options are provided through Data Classes, which can be created and viewed in the <em>IRI Preferences <\/em>within the Workbench.<\/p>\n<p><a href=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2014\/06\/data_class_prefs.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-12989 size-full aligncenter\" src=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2014\/06\/data_class_prefs.png\" alt=\"\" width=\"839\" height=\"555\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/data_class_prefs.png 839w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/data_class_prefs-300x198.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/data_class_prefs-768x508.png 768w\" sizes=\"(max-width: 839px) 100vw, 839px\" \/><\/a><\/p>\n<p>You can associate multiple Data Classes and patterns with a Data Rule by creating Search Matchers. 
Data Rules are applied only when you use IRI\u00a0<em>DarkShield&#8217;s<\/em> remediation capabilities to obfuscate PII found in unstructured files.<\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2014\/06\/search_matchers_page.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-12990 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2014\/06\/search_matchers_page.png\" alt=\"\" width=\"546\" height=\"450\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/search_matchers_page.png 546w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/search_matchers_page-300x247.png 300w\" sizes=\"(max-width: 546px) 100vw, 546px\" \/><\/a><\/p>\n<p>Once you have entered the required information in the wizard, click <em>Finish<\/em> to generate a .search file containing the configuration parameters that you have selected, and the DDF describing the layout of the flat file that will be generated by the search.<\/p>\n<p><a href=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2014\/06\/darkdata-ddf.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-10614\" src=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2014\/06\/darkdata-ddf-300x182.png\" alt=\"dark data data definition file\" width=\"600\" height=\"365\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/darkdata-ddf-300x182.png 300w, 
https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/darkdata-ddf.png 767w\" sizes=\"(max-width: 600px) 100vw, 600px\" \/><\/a><\/p>\n<p>To execute the search job, right-click the .search file and select <em>IRI &gt; Run Search Job.<\/em>\u00a0This will generate the flat file containing the delimited results and metadata information:<\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2014\/06\/results.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-12739 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2014\/06\/results.png\" alt=\"\" width=\"908\" height=\"555\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/results.png 908w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/results-300x183.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/results-768x469.png 768w\" sizes=\"(max-width: 908px) 100vw, 908px\" \/><\/a><\/p>\n<p>So, your now-structured data is stored in a file you can use (repeatedly) for any purpose. 
And within the same Eclipse\u00a0IDE, the <a href=\"http:\/\/www.iri.com\/products\/workbench\" target=\"_blank\" rel=\"noopener\">IRI Workbench<\/a>, you now have access to this data and its DDF for:<\/p>\n<ul>\n<li><a title=\"Data Integration Solution Page\" href=\"http:\/\/www.iri.com\/solutions\/data-integration\" target=\"_blank\" rel=\"noopener\">Data Integration<\/a> and <a title=\"Data Transformation Solution Page\" href=\"http:\/\/www.iri.com\/solutions\/data-transformation\" target=\"_blank\" rel=\"noopener\">Transformation<\/a><\/li>\n<li><a title=\"Data and Database Migration Solution Page\" href=\"http:\/\/www.iri.com\/solutions\/data-and-database-migration\" target=\"_blank\" rel=\"noopener\">Data Migration<\/a> and <a title=\"Replication Solution Page\" href=\"http:\/\/www.iri.com\/solutions\/data-and-database-migration\/replication\" target=\"_blank\" rel=\"noopener\">Replication<\/a><\/li>\n<li><a title=\"Data Masking Solution Page\" href=\"http:\/\/www.iri.com\/solutions\/data-masking\" target=\"_blank\" rel=\"noopener\">Data Masking<\/a> (Encryption, De-ID, etc.)<\/li>\n<li><a title=\"Database Acceleration Solution Page\" href=\"http:\/\/www.iri.com\/solutions\/database-acceleration\" target=\"_blank\" rel=\"noopener\">DB Load and Query Optimization<\/a><\/li>\n<li><a title=\"Reporting Business Intelligence Solution Page\" href=\"http:\/\/www.iri.com\/solutions\/business-intelligence\/embedded-bi\/overview\">Reporting<\/a> or <a title=\"BI Tool Acceleration Solution Page\" href=\"http:\/\/www.iri.com\/solutions\/business-intelligence\/bi-tool-acceleration\/overview\" target=\"_blank\" rel=\"noopener\">Hand-offs<\/a> to BI Tools<\/li>\n<li>Population of CRM, DB,\u00a0ETL, and External Apps<\/li>\n<\/ul>\n<p>See how to use the newly structured output file and its DDF in the next article,\u00a0<a title=\"Using CoSort on Restructured Data in the IRI Workbench Blog\" 
href=\"http:\/\/www.iri.com\/blog\/big-data-2\/output-restructuring-wizard-cosort\/\">Using CoSort on Restructured Data in the IRI Workbench<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Editor&#8217;s note &#8230; this article, first posted in 2014 for the wizard it describes, has been updated as follows: June 2015: This wizard was renamed from the Data Restructuring Wizard to the Dark Data Discovery Wizard, and was provided free in IRI Workbench for users of IRI NextForm Lite. October 2018: This wizard is now<\/p>\n<div><a class=\"btn-filled btn\" href=\"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/\" title=\"Using the Dark Data Discovery Wizard to Unlock Unstructured Data\">Read More<\/a><\/div>\n","protected":false},"author":7,"featured_media":10614,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"footnotes":""},"categories":[108,31,91,90],"tags":[688,44,610,417,689,71,690,692,686,1402,553,571,693,615,691,685,694,583,687,550],"class_list":["post-6011","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-big-data-2","category-data-migration","category-iri-workbench","category-migration","tag-adobe","tag-cosort","tag-dark-data","tag-data-restructuring","tag-e-mail-messages","tag-eclipse","tag-excel-spreadsheets","tag-exchange","tag-free-form-text","tag-images","tag-iri-nextform","tag-microsoft","tag-outlook","tag-pdf","tag-powerpoint","tag-restructured","tag-rich-text-format","tag-unstructured","tag-word","tag-xml"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v23.3 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Using the Dark Data Discovery Wizard to Unlock Unstructured Data - 
IRI<\/title>\n<meta name=\"description\" content=\"Learn how to extract and use the valuable data hidden in semi-structured and unstructured sources, starting with producing a structured set.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Using the Dark Data Discovery Wizard to Unlock Unstructured Data - IRI\" \/>\n<meta property=\"og:description\" content=\"Learn how to extract and use the valuable data hidden in semi-structured and unstructured sources, starting with producing a structured set.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/\" \/>\n<meta property=\"og:site_name\" content=\"IRI\" \/>\n<meta property=\"article:published_time\" content=\"2014-06-04T21:10:56+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-01-05T20:15:34+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/darkdata-ddf.png\" \/>\n\t<meta property=\"og:image:width\" content=\"767\" \/>\n\t<meta property=\"og:image:height\" content=\"466\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Sharon Hewitt, Adam Lewis and Wade Donahue\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Sharon Hewitt, Adam Lewis and Wade Donahue\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"5 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/\"},\"author\":{\"name\":\"Sharon Hewitt, Adam Lewis and Wade Donahue\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/18c4f34270c95345ba1274daad4ed795\"},\"headline\":\"Using the Dark Data Discovery Wizard to Unlock Unstructured Data\",\"datePublished\":\"2014-06-04T21:10:56+00:00\",\"dateModified\":\"2024-01-05T20:15:34+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/\"},\"wordCount\":1080,\"commentCount\":1,\"publisher\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/darkdata-ddf.png\",\"keywords\":[\"Adobe\",\"CoSort\",\"dark data\",\"data restructuring\",\"E-mail messages\",\"Eclipse\",\"Excel spreadsheets\",\"Exchange\",\"free-form text\",\"images\",\"IRI NextForm\",\"Microsoft\",\"Outlook\",\"pdf\",\"PowerPoint\",\"restructured\",\"Rich Text Format\",\"unstructured\",\"Word\",\"xml\"],\"articleSection\":[\"Big Data\",\"Data Migration\",\"IRI 
Workbench\",\"Migration\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/\",\"url\":\"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/\",\"name\":\"Using the Dark Data Discovery Wizard to Unlock Unstructured Data - IRI\",\"isPartOf\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/darkdata-ddf.png\",\"datePublished\":\"2014-06-04T21:10:56+00:00\",\"dateModified\":\"2024-01-05T20:15:34+00:00\",\"description\":\"Learn how to extract and use the valuable data hidden in semi-structured and unstructured sources, starting with producing a structured 
set.\",\"breadcrumb\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/#primaryimage\",\"url\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/darkdata-ddf.png\",\"contentUrl\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/darkdata-ddf.png\",\"width\":767,\"height\":466,\"caption\":\"dark data data definition file\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/beta.iri.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Using the Dark Data Discovery Wizard to Unlock Unstructured Data\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#website\",\"url\":\"https:\/\/beta.iri.com\/blog\/\",\"name\":\"IRI\",\"description\":\"Total Data Management Blog\",\"publisher\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/beta.iri.com\/blog\/?s={search_term_string}\"},\"query-input\":\"required 
name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#organization\",\"name\":\"IRI\",\"url\":\"https:\/\/beta.iri.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png\",\"contentUrl\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png\",\"width\":750,\"height\":206,\"caption\":\"IRI\"},\"image\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/logo\/image\/\"}},[{\"@type\":[\"Person\"],\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/18c4f34270c95345ba1274daad4ed795\",\"name\":\"Sharon Hewitt\",\"image\":{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/image\/\",\"inLanguage\":\"en_US\",\"url\":\"\",\"caption\":\"Sharon Hewitt\"}},{\"@type\":[\"Person\"],\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/18c4f34270c95345ba1274daad4ed795\",\"name\":\"Adam Lewis\",\"image\":{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/image\/\",\"inLanguage\":\"en_US\",\"url\":\"\",\"caption\":\"Adam Lewis\"}},{\"@type\":[\"Person\"],\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/18c4f34270c95345ba1274daad4ed795\",\"name\":\"Wade Donahue\",\"image\":{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/image\/\",\"inLanguage\":\"en_US\",\"url\":\"\",\"caption\":\"Wade Donahue\"}}]]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Using the Dark Data Discovery Wizard to Unlock Unstructured Data - IRI","description":"Learn how to extract and use the valuable data hidden in semi-structured and unstructured sources, starting with producing a structured set.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/","og_locale":"en_US","og_type":"article","og_title":"Using the Dark Data Discovery Wizard to Unlock Unstructured Data - IRI","og_description":"Learn how to extract and use the valuable data hidden in semi-structured and unstructured sources, starting with producing a structured set.","og_url":"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/","og_site_name":"IRI","article_published_time":"2014-06-04T21:10:56+00:00","article_modified_time":"2024-01-05T20:15:34+00:00","og_image":[{"width":767,"height":466,"url":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/darkdata-ddf.png","type":"image\/png"}],"author":"Sharon Hewitt, Adam Lewis and Wade Donahue","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Sharon Hewitt, Adam Lewis and Wade Donahue","Est. 
reading time":"5 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/#article","isPartOf":{"@id":"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/"},"author":{"name":"Sharon Hewitt, Adam Lewis and Wade Donahue","@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/18c4f34270c95345ba1274daad4ed795"},"headline":"Using the Dark Data Discovery Wizard to Unlock Unstructured Data","datePublished":"2014-06-04T21:10:56+00:00","dateModified":"2024-01-05T20:15:34+00:00","mainEntityOfPage":{"@id":"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/"},"wordCount":1080,"commentCount":1,"publisher":{"@id":"https:\/\/beta.iri.com\/blog\/#organization"},"image":{"@id":"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/#primaryimage"},"thumbnailUrl":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/darkdata-ddf.png","keywords":["Adobe","CoSort","dark data","data restructuring","E-mail messages","Eclipse","Excel spreadsheets","Exchange","free-form text","images","IRI NextForm","Microsoft","Outlook","pdf","PowerPoint","restructured","Rich Text Format","unstructured","Word","xml"],"articleSection":["Big Data","Data Migration","IRI Workbench","Migration"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/","url":"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/","name":"Using the Dark Data Discovery Wizard to Unlock Unstructured Data - 
IRI","isPartOf":{"@id":"https:\/\/beta.iri.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/#primaryimage"},"image":{"@id":"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/#primaryimage"},"thumbnailUrl":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/darkdata-ddf.png","datePublished":"2014-06-04T21:10:56+00:00","dateModified":"2024-01-05T20:15:34+00:00","description":"Learn how to extract and use the valuable data hidden in semi-structured and unstructured sources, starting with producing a structured set.","breadcrumb":{"@id":"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/#primaryimage","url":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/darkdata-ddf.png","contentUrl":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/darkdata-ddf.png","width":767,"height":466,"caption":"dark data data definition file"},{"@type":"BreadcrumbList","@id":"https:\/\/beta.iri.com\/blog\/migration\/data-migration\/unstructured-data-data-restructuring-wizard\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/beta.iri.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Using the Dark Data Discovery Wizard to Unlock Unstructured Data"}]},{"@type":"WebSite","@id":"https:\/\/beta.iri.com\/blog\/#website","url":"https:\/\/beta.iri.com\/blog\/","name":"IRI","description":"Total Data Management 
Blog","publisher":{"@id":"https:\/\/beta.iri.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/beta.iri.com\/blog\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/beta.iri.com\/blog\/#organization","name":"IRI","url":"https:\/\/beta.iri.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png","contentUrl":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png","width":750,"height":206,"caption":"IRI"},"image":{"@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/logo\/image\/"}},[{"@type":["Person"],"@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/18c4f34270c95345ba1274daad4ed795","name":"Sharon Hewitt","image":{"@type":"ImageObject","@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/image\/","inLanguage":"en_US","url":"","caption":"Sharon Hewitt"}},{"@type":["Person"],"@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/18c4f34270c95345ba1274daad4ed795","name":"Adam Lewis","image":{"@type":"ImageObject","@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/image\/","inLanguage":"en_US","url":"","caption":"Adam Lewis"}},{"@type":["Person"],"@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/18c4f34270c95345ba1274daad4ed795","name":"Wade Donahue","image":{"@type":"ImageObject","@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/image\/","inLanguage":"en_US","url":"","caption":"Wade 
Donahue"}}]]}},"jetpack_featured_media_url":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2014\/06\/darkdata-ddf.png","_links":{"self":[{"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/posts\/6011"}],"collection":[{"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/comments?post=6011"}],"version-history":[{"count":67,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/posts\/6011\/revisions"}],"predecessor-version":[{"id":17943,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/posts\/6011\/revisions\/17943"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/media\/10614"}],"wp:attachment":[{"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/media?parent=6011"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/categories?post=6011"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/tags?post=6011"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}