{"id":16800,"date":"2024-01-03T16:07:31","date_gmt":"2024-01-03T21:07:31","guid":{"rendered":"https:\/\/www.iri.com\/blog\/?p=16800"},"modified":"2024-01-05T15:18:36","modified_gmt":"2024-01-05T20:18:36","slug":"finding-and-masking-pii-in-files-with-the-darkshield-files-wizard","status":"publish","type":"post","link":"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/","title":{"rendered":"Find &#038; Mask File PII in the DarkShield GUI"},"content":{"rendered":"<p><a href=\"https:\/\/www.iri.com\/products\/darkshield\"><span style=\"font-weight: 400;\">IRI DarkShield<\/span><\/a><span style=\"font-weight: 400;\"> includes fit-for-purpose facilities in the graphical <\/span><a href=\"https:\/\/www.iri.com\/products\/workbench\/darkshield-gui\/file-masking\"><span style=\"font-weight: 400;\">IRI Workbench<\/span><\/a><span style=\"font-weight: 400;\"> IDE that build jobs to search (classify) and mask (remediate) PII and other sensitive data in \u201cdark data\u201d sources. 
Gartner defines dark data as data not normally used for analytics; i.e., what is usually collected and stored in semi-structured and unstructured sources.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The file formats in which this wizard can search, extract, and mask strings include:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Free-form text (.txt)<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Microsoft Word documents (.doc and .docx)<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Adobe Portable Document Format (.pdf)<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Extensible Markup Language (.xml)<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Microsoft Excel spreadsheets (.xls and .xlsx)<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Microsoft PowerPoint presentations (.ppt and .pptx)<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">JavaScript Object Notation files (.json)<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Various image formats (.bmp, .gif, .jpg, .png, and .tif)<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Parquet (.parquet)<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">DICOM (.dicom)<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">These sources can reside on on-premises networks, on cloud storage platforms, and within databases.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The diagram below summarizes DarkShield\u2019s architecture as part of the overarching Voracity platform, where the wizard this article 
explains is inside Workbench:<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-16827 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/DarkShield-V5-Architecture-244x300.png\" alt=\"\" width=\"470\" height=\"578\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/DarkShield-V5-Architecture-244x300.png 244w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/DarkShield-V5-Architecture.png 831w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/DarkShield-V5-Architecture-768x946.png 768w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/DarkShield-V5-Architecture-1247x1536.png 1247w\" sizes=\"(max-width: 470px) 100vw, 470px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Though not discussed in this article, DarkShield also includes wizards for finding and masking sensitive data in relational and NoSQL databases; i.e., the <\/span><a href=\"https:\/\/docs.google.com\/document\/d\/1aYCYm8iu73ztCrNM3ACCCBxscXKtyan4UyurNPkrdBUrkshield\/\"><i><span style=\"font-weight: 400;\">New NoSQL Search\/Masking Job \u2026<\/span><\/i><\/a><span style=\"font-weight: 400;\"> wizard for MongoDB, Cassandra and Elasticsearch, and the <\/span><a href=\"https:\/\/docs.google.com\/document\/d\/1Oaf2hMy2FEFDcT247ENZMARIn41rs7jxBsivB7GS6N0onal-databases\/\"><i><span style=\"font-weight: 400;\">New Relational DB Search\/Masking Job \u2026<\/span><\/i><\/a><span style=\"font-weight: 400;\"> wizard for JDBC-connected databases.\u00a0 <\/span><\/p>\n<h5><b>What the DarkShield Files Wizard Does<\/b><\/h5>\n<p><span style=\"font-weight: 400;\">The \u201c<\/span><i><span style=\"font-weight: 400;\">New File Search\/Masking Job \u2026<\/span><\/i><span style=\"font-weight: 400;\">\u201d wizard builds an XML DarkShield job configuration file with a .dsc extension. 
Each .dsc file contains the Search and Mask Contexts used in a DarkShield Job.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Search Contexts contain the instructions for accessing your File System source silo and searching it for PII. Mask Contexts contain the instructions for masking the PII found during the search, as well as the access instructions for your File System target silo (where masked data will be written).<\/span><\/p>\n<p><span style=\"font-weight: 400;\">File system silos supported through the Workbench include local or networked drives, as well as files in SharePoint Online, OneDrive, S3 buckets, Azure Blob Storage, and Google Cloud Storage.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The scanning and remediation of your dark data files are based on the search matchers and masking rules that you define in the <\/span><i><span style=\"font-weight: 400;\">IRI Data Class and Rule Library <\/span><\/i><span style=\"font-weight: 400;\">in your Workbench project folder. See <\/span><a href=\"https:\/\/www.iri.com\/blog\/data-protection\/iri-data-classification\/\"><span style=\"font-weight: 400;\">this article on data classification<\/span><\/a><span style=\"font-weight: 400;\">\u00a0for details.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Each Data Class contains one or more search methods used to identify PII. Previous iterations of the wizard only supported scanning and extracting sensitive values that matched Java RegEx patterns and Set File lookups. 
Today\u2019s wizard supports more search methods, along with masking operations that can run together with the search or separately.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">For more information on the various search methods available, read about <\/span><a href=\"https:\/\/www.iri.com\/blog\/data-protection\/data-matchers\/\"><span style=\"font-weight: 400;\">Data Matchers<\/span><\/a><span style=\"font-weight: 400;\"> and <\/span><a href=\"https:\/\/www.iri.com\/blog\/data-protection\/location-matchers\/\"><span style=\"font-weight: 400;\">Location Matchers<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<h5><b>Prerequisites<\/b><\/h5>\n<p><span style=\"font-weight: 400;\">Before launching the DarkShield Files wizard, ensure these preliminary steps are completed:<\/span><\/p>\n<p><span style=\"font-weight: 400;\">First, verify that the DarkShield API distribution directory has been specified in <\/span><i><span style=\"font-weight: 400;\">IRI Workbench <\/span><\/i><i><span style=\"font-weight: 400;\"> Preferences &gt; IRI &gt; DarkShield<\/span><\/i><span style=\"font-weight: 400;\">. 
From here you can configure DarkShield GUI and API preferences including the host, port, and directory where the DarkShield API resides.<\/span><\/p>\n<p><span style=\"font-weight: 400;\"> <img loading=\"lazy\" decoding=\"async\" class=\" wp-image-16828 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/preference-for-darkshield-300x235.png\" alt=\"\" width=\"574\" height=\"450\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/preference-for-darkshield-300x235.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/preference-for-darkshield-768x601.png 768w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/preference-for-darkshield.png 805w\" sizes=\"(max-width: 574px) 100vw, 574px\" \/><\/span><\/p>\n<p><span style=\"font-weight: 400;\">Second, all DarkShield Wizards require a project possessing an IRI Data Class and Rule Library. The IRI Library in turn should contain at least one Data Class and a Rule that can be assigned to that Data Class. 
To learn more about the IRI Data Class and Rule Library and creating Data Classes and Rules, read this <\/span><a href=\"https:\/\/www.iri.com\/blog\/data-protection\/iri-data-classification\/\"><span style=\"font-weight: 400;\">article<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-16829 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/iri-project-data-class-rule-libr.png\" alt=\"\" width=\"221\" height=\"149\" \/><\/p>\n<p style=\"text-align: center;\"><i><span style=\"font-weight: 400;\">IRI Project containing the Data Class &amp; Rule Library<\/span><\/i><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16830\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/IRI-Data-Class-Rules-Library-Form-Editor-300x87.png\" alt=\"\" width=\"776\" height=\"225\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/IRI-Data-Class-Rules-Library-Form-Editor-300x87.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/IRI-Data-Class-Rules-Library-Form-Editor-1024x296.png 1024w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/IRI-Data-Class-Rules-Library-Form-Editor-768x222.png 768w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/IRI-Data-Class-Rules-Library-Form-Editor.png 1053w\" sizes=\"(max-width: 776px) 100vw, 776px\" \/><\/p>\n<p style=\"text-align: center;\"><i><span style=\"font-weight: 400;\">IRI Data Class and Rules Library Form Editor contains some Data Classes and Rules<\/span><\/i><\/p>\n<p><span style=\"font-weight: 400;\">Third, verify that the Plankton server (DarkShield) is running. This can be done by opening the DarkShield API Status view in IRI Workbench. 
The DarkShield API Status view will display information about the DarkShield API, including whether it is currently running:<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16832\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/DarkShield-API-Status-view-panel-300x116.png\" alt=\"\" width=\"595\" height=\"230\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/DarkShield-API-Status-view-panel-300x116.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/DarkShield-API-Status-view-panel-768x296.png 768w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/DarkShield-API-Status-view-panel.png 775w\" sizes=\"(max-width: 595px) 100vw, 595px\" \/><\/p>\n<p style=\"text-align: center;\"><i><span style=\"font-weight: 400;\">DarkShield API Status view panel<\/span><\/i><\/p>\n<p><span style=\"font-weight: 400;\">Finally, ensure the data silos that you will be reading from and writing to can be accessed by an application (in this case DarkShield API). Depending on the file storage type (i.e. 
S3 Bucket, SharePoint Online, etc.), you must provide the connection details that the corresponding library needs to connect to the silo and retrieve files.<\/span><\/p>\n<h5><b>Using the Wizard<\/b><\/h5>\n<p><span style=\"font-weight: 400;\">In this article, I will demonstrate the use of the <\/span><i><span style=\"font-weight: 400;\">New File Search\/Masking Job<\/span><\/i><span style=\"font-weight: 400;\">\u2026 wizard to create a DarkShield Files Job.<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16834\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/New-File-Search-300x133.png\" alt=\"\" width=\"708\" height=\"314\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/New-File-Search-300x133.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/New-File-Search-768x339.png 768w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/New-File-Search.png 833w\" sizes=\"(max-width: 708px) 100vw, 708px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">To open the wizard, open the DarkShield menu dropdown and select the <\/span><i><span style=\"font-weight: 400;\">New File Search\/Masking Job<\/span><\/i><span style=\"font-weight: 400;\">\u2026 wizard. 
This brings up the first page where you can name the new job:<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16835\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/specify-job-name-300x246.png\" alt=\"\" width=\"471\" height=\"386\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/specify-job-name-300x246.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/specify-job-name.png 540w\" sizes=\"(max-width: 471px) 100vw, 471px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Here you will also specify the folder and file names for the output of the wizard.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Click <\/span><i><span style=\"font-weight: 400;\">Next<\/span><\/i><span style=\"font-weight: 400;\"> to move to the search report options page of the wizard.<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16836\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/search-report-option-300x245.png\" alt=\"\" width=\"467\" height=\"382\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/search-report-option-300x245.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/search-report-option.png 539w\" sizes=\"(max-width: 467px) 100vw, 467px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">This page lets you customize a flat-file search log by selecting metadata attributes of the files in which PII was discovered. These attributes will be displayed as columns in a flat text log file containing the values (and specified metadata) from the search operation. 
The default delimiter is a pipe (\u201c|\u201d), but you can change it.\u00a0<\/span><\/p>\n<p><i><span style=\"font-weight: 400;\">Note that the RESULT attribute contains the actual PII value found, so if you do not wish to persist PII in the search report, do not select RESULT.<\/span><\/i><\/p>\n<p><span style=\"font-weight: 400;\">You can also choose whether the wizard will generate a <\/span><i><span style=\"font-weight: 400;\">Data Definition Format (DDF) <\/span><\/i><span style=\"font-weight: 400;\">file, which is a metadata repository defining the layout of the flat file containing your search results. DDF syntax is recognized by, and used directly in, <\/span><a href=\"https:\/\/www.iri.com\/products\/cosort\/sortcl\"><span style=\"font-weight: 400;\">SortCL<\/span><\/a><span style=\"font-weight: 400;\"> data transformation and reporting jobs.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The \/FIELD names in the DDF file will correspond to the keywords and patterns you searched, as well as the forensic attributes that you selected in this dialog to be part of that output log\/report.\u00a0<\/span><\/p>\n<p><i><span style=\"font-weight: 400;\">Note that DarkShield search jobs also produce a second log (with no PII), a JSON file called annotations.json.<\/span><\/i><\/p>\n<p><span style=\"font-weight: 400;\">Click <\/span><i><span style=\"font-weight: 400;\">Next<\/span><\/i><span style=\"font-weight: 400;\"> when finished to move into the specifics of the data you are trying to find, and how it should be masked.<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16837\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/data-class-masking-rule-selection-291x300.png\" alt=\"\" width=\"496\" height=\"511\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/data-class-masking-rule-selection-291x300.png 291w, 
https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/data-class-masking-rule-selection.png 538w\" sizes=\"(max-width: 496px) 100vw, 496px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">In the <\/span><i><span style=\"font-weight: 400;\">Data Class and Masking Rule Selection<\/span><\/i><span style=\"font-weight: 400;\"> dialog, you will define the contents of your project\u2019s <\/span><i><span style=\"font-weight: 400;\">IRI Data Class and Rule Library<\/span><\/i><span style=\"font-weight: 400;\">. This library contains Data Classes and\/or Data Class Groups, and the data masking functions\/rules you assign to them.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">You can filter the Data Classes and Groups from the library that you intend to use by selecting or deselecting Data Classes in the <\/span><i><span style=\"font-weight: 400;\">Active<\/span><\/i><span style=\"font-weight: 400;\"> column. In this example, I am using all default Data Classes provided when creating an IRI Project.<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16838\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/active-column-290x300.png\" alt=\"\" width=\"476\" height=\"493\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/active-column-290x300.png 290w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/active-column.png 535w\" sizes=\"(max-width: 476px) 100vw, 476px\" \/><\/p>\n<p style=\"text-align: left;\"><span style=\"font-weight: 400;\">In the Masking Rules tab, we can see that two functions are available: a Format Preserving Encryption Rule and a Blur Date Rule. These rules dictate how PII found using Data Classes will be masked. 
It is also possible to add or remove Masking Rules from this tab.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Click <\/span><i><span style=\"font-weight: 400;\">Next<\/span><\/i><span style=\"font-weight: 400;\"> when finished to move onto the page that will allow you to assign these Masking Rules to specific Data Classes.<\/span><\/p>\n<p><span style=\"font-weight: 400;\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-16839 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/assign-masking-rules-292x300.png\" alt=\"\" width=\"467\" height=\"480\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/assign-masking-rules-292x300.png 292w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/assign-masking-rules.png 540w\" sizes=\"(max-width: 467px) 100vw, 467px\" \/><\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"font-weight: 400;\">On the Assign Masking Rules to Data Classes wizard page, each Data Class and Data Class Group must be assigned a Masking Rule indicating how the PII will be masked or transformed. 
If you do not wish to modify a particular PII data type, click <\/span><i><span style=\"font-weight: 400;\">Back <\/span><\/i><span style=\"font-weight: 400;\">and deselect the Active checkbox associated with that Data Class or Data Class Group; then, return to this page and finish assigning Masking Rules to Data Classes.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Once finished click <\/span><i><span style=\"font-weight: 400;\">Next<\/span><\/i><span style=\"font-weight: 400;\"> to begin specifying the location(s) of the files to be searched and masked.<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16840\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/data-sources-new-file-search-mask-300x247.png\" alt=\"\" width=\"487\" height=\"401\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/data-sources-new-file-search-mask-300x247.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/data-sources-new-file-search-mask.png 540w\" sizes=\"(max-width: 487px) 100vw, 487px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">From this page we can choose to add, edit, or remove data sources that will be searched through by DarkShield.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">If we click <\/span><i><span style=\"font-weight: 400;\">Add<\/span><\/i><span style=\"font-weight: 400;\">\u2026 a sub-wizard will appear.<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16841\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/file-store-file-formats-300x242.png\" alt=\"\" width=\"502\" height=\"405\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/file-store-file-formats-300x242.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/file-store-file-formats.png 550w\" sizes=\"(max-width: 502px) 100vw, 502px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">From this page, 
you can specify the file storage type and a connection registry.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">DarkShield currently supports the following file <\/span><i><span style=\"font-weight: 400;\">storage <\/span><\/i><span style=\"font-weight: 400;\">types:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Local File System<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">OneDrive<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">SharePoint Online<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Amazon S3<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Azure Blob Storage<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Google Cloud Storage<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">A Connection Registry is a reusable connection configuration for connecting a data silo. To create a new Connection Registry first select the desired file storage type, then click <\/span><i><span style=\"font-weight: 400;\">New.<\/span><\/i><\/p>\n<p><span style=\"font-weight: 400;\">The example below demonstrates accessing files in the local (PC\u2019s) file system, but DarkShield supports other (cloud) file sources (listed above) in Workbench. 
The <\/span><a href=\"https:\/\/www.iri.com\/blog\/data-protection\/darkshield-files-rpc-api\/\"><span style=\"font-weight: 400;\">DarkShield-Files API<\/span><\/a><span style=\"font-weight: 400;\"> can support files that reside in other storage silos, plus streaming sources, using custom code.<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16842\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/Local-File-System-Storage-Type-297x300.png\" alt=\"\" width=\"489\" height=\"494\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/Local-File-System-Storage-Type-297x300.png 297w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/Local-File-System-Storage-Type-70x70.png 70w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/Local-File-System-Storage-Type.png 509w\" sizes=\"(max-width: 489px) 100vw, 489px\" \/><\/p>\n<p style=\"text-align: center;\"><i><span style=\"font-weight: 400;\">Local File System Storage Type<\/span><\/i><\/p>\n<p><span style=\"font-weight: 400;\">In the example above, we can see a <\/span><i><span style=\"font-weight: 400;\">File Connection Registry <\/span><\/i><span style=\"font-weight: 400;\">requires a path to a directory. The directory in question may contain files or files with more directories. 
The <\/span><i><span style=\"font-weight: 400;\">Include <\/span><\/i><span style=\"font-weight: 400;\">and <\/span><i><span style=\"font-weight: 400;\">Exclude <\/span><\/i><span style=\"font-weight: 400;\">fields use regex patterns to either dictate what files to include or exclude based on the file name, respectively.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">As seen in the example above the Exclude field has a Regex pattern that will be used to exclude any files ending with <\/span><i><span style=\"font-weight: 400;\">error.log<\/span><\/i><span style=\"font-weight: 400;\"> from the DarkShield job<\/span><i><span style=\"font-weight: 400;\">. <\/span><\/i><span style=\"font-weight: 400;\">This can be useful when certain files should not be subjected to masking.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Recursive lookup should be checked for DarkShield to process files in directories nested inside other directories.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Lastly, you can scroll through the list of file types supported by DarkShield and choose which file types should be processed by checking on or off for each file type. Once your location and file types are selected, click <\/span><i><span style=\"font-weight: 400;\">Finish<\/span><\/i><span style=\"font-weight: 400;\"> to create the connection registry that was configured. 
Then on the previous page click <\/span><i><span style=\"font-weight: 400;\">Finish<\/span><\/i><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Afterward, the connection registry information will be displayed on the Data Sources page.<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16843\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/files-demo-300x246.png\" alt=\"\" width=\"519\" height=\"425\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/files-demo-300x246.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/files-demo.png 541w\" sizes=\"(max-width: 519px) 100vw, 519px\" \/><\/p>\n<p style=\"text-align: left;\"><span style=\"font-weight: 400;\">This source item reveals my root directory from which the searches will occur. It is also possible to add additional sources for the search here.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">When finished, click <\/span><i><span style=\"font-weight: 400;\">Next <\/span><\/i><span style=\"font-weight: 400;\">to open the Filter Selection dialog:<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16844\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/filter-selection-300x196.png\" alt=\"\" width=\"570\" height=\"372\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/filter-selection-300x196.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/filter-selection.png 678w\" sizes=\"(max-width: 570px) 100vw, 570px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">JSON, XML, CSV, and Excel files can have their search scope reduced by specifying one or more filters here. 
This can decrease the time it takes to finish a job and assist in preventing false positives.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Click<\/span><i><span style=\"font-weight: 400;\"> Add\u2026 <\/span><\/i><span style=\"font-weight: 400;\">to specify a new filter. For JSON, specify a JSON path. For XML, specify an XML path. For CSV, specify a column name regex pattern. For Excel files, there are multiple options for filtering the scope of a search to certain sheets, cell ranges, or columns (by header name).\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Clicking <\/span><i><span style=\"font-weight: 400;\">Remove<\/span><\/i><span style=\"font-weight: 400;\"> deletes the selected filter from the table. When ready click <\/span><i><span style=\"font-weight: 400;\">Next<\/span><\/i><span style=\"font-weight: 400;\"> and move onto the Data Targets page.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Click <\/span><i><span style=\"font-weight: 400;\">Add<\/span><\/i><span style=\"font-weight: 400;\"> to create or select a target location. Selecting a target is optional if you are only interested in performing search-only operations.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">You must specify a target if masking will be performed. 
The steps to add a data target are the same as the steps to create a data source, with the exception that file type selection is not requested (since it will be in the same format as the source).<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16845\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/files-demo-out-300x247.png\" alt=\"\" width=\"503\" height=\"414\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/files-demo-out-300x247.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/files-demo-out.png 542w\" sizes=\"(max-width: 503px) 100vw, 503px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">The <\/span><i><span style=\"font-weight: 400;\">Add \u2026 <\/span><\/i><span style=\"font-weight: 400;\">options allow additional target specifications.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Note that multiple sources and targets can be specified in the wizard. DarkShield will search and mask all files found in the source URIs and replicate the masked files across all of the data targets. The sources and targets can be any combination of file storage types.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">At this point you can click <\/span><i><span style=\"font-weight: 400;\">Finish<\/span><\/i><span style=\"font-weight: 400;\"> to produce a .dsc file or click <\/span><i><span style=\"font-weight: 400;\">Next<\/span><\/i><span style=\"font-weight: 400;\"> to move to the File Search\/Mask Configurations page.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The final page of the new File Search\/Masking Job wizard is the optional <\/span><b>File Search\/Mask Configurations <\/b><span style=\"font-weight: 400;\">page to further define job attributes applicable only to certain file types, like PDF documents, or image formats like DICOM; see <\/span><b>Optional Search\/Mask Configurations <\/b><span style=\"font-weight: 400;\">which follow. 
These attributes can be stored for reuse in a configuration registry.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Here (and only recommended for advanced users or specific requirements), you can define file configuration options for certain file types. However, DarkShield jobs will use reasonable defaults in the absence of any explicit configurations set. To specify these options, you can select from an existing Dark Data File Configuration option registry entry, or create a new one.<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16846\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/file-type-specific-config-300x246.png\" alt=\"\" width=\"515\" height=\"423\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/file-type-specific-config-300x246.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/file-type-specific-config.png 540w\" sizes=\"(max-width: 515px) 100vw, 515px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">If you opt to create a <\/span><b><i>New \u2026 <\/i><\/b><span style=\"font-weight: 400;\">entry, the <\/span><b>File Configuration Option Selection<\/b><span style=\"font-weight: 400;\"> page will appear. 
On this page, select the types of file configuration options to specify, and enter a name for the Dark Data File Configuration registry entry.<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16847\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/optional-file-search-mask-300x260.png\" alt=\"\" width=\"458\" height=\"397\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/optional-file-search-mask-300x260.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/optional-file-search-mask.png 511w\" sizes=\"(max-width: 458px) 100vw, 458px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">After clicking <\/span><b>Create<\/b><span style=\"font-weight: 400;\">, the wizard will open all pages relevant to the file specific configurations for the formats you select here. More details about what each file configuration option does can be found by scrolling to the bottom of this article or by visiting the DarkShield API docs (webpage reachable at <\/span><b><i>localhost:8959\/docs<\/i><\/b><span style=\"font-weight: 400;\"> by default while DarkShield server is running).<\/span><\/p>\n<p><span style=\"font-weight: 400;\">At this point, you are ready to click <\/span><i><span style=\"font-weight: 400;\">Finish<\/span><\/i><span style=\"font-weight: 400;\"> to produce the .dsc file that is used by the DarkShield API.<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16848\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/dsc-file.png\" alt=\"\" width=\"206\" height=\"169\" \/><\/p>\n<h5><b>DarkShield Job Editor<\/b><\/h5>\n<p><span style=\"font-weight: 400;\">Every .dsc file can be viewed from a DarkShield Job editor. 
This editor allows you to modify your DarkShield job parameters after you complete the steps of the DarkShield Files Wizard; e.g.,<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16849\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/new-job-dsc-300x120.png\" alt=\"\" width=\"676\" height=\"270\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/new-job-dsc-300x120.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/new-job-dsc-768x306.png 768w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/new-job-dsc.png 983w\" sizes=\"(max-width: 676px) 100vw, 676px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">You can add, edit, or remove sources and\/or targets from your .dsc file as desired by clicking <\/span><i><span style=\"font-weight: 400;\">Add, Edit, <\/span><\/i><span style=\"font-weight: 400;\">or<\/span><i><span style=\"font-weight: 400;\"> Remove<\/span><\/i><span style=\"font-weight: 400;\"> (see above). You can also modify your Data Class Rule Mappings (see below).<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16850\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/Data-Class-Rule-Mappings-300x124.png\" alt=\"\" width=\"711\" height=\"294\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/Data-Class-Rule-Mappings-300x124.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/Data-Class-Rule-Mappings-768x318.png 768w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/Data-Class-Rule-Mappings.png 990w\" sizes=\"(max-width: 711px) 100vw, 711px\" \/><\/p>\n<p><span style=\"font-weight: 400;\"><br \/>\n<\/span><span style=\"font-weight: 400;\">From the editor, you can also modify your Data Class Rule Mappings by clicking the Modify button. 
It is also possible to choose a different IRI Library and\/or rearrange the Masking Rules assigned to your Data Classes.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The editor also has another section that allows the adding, editing, or removal of file-specific path filters (JSON, XML, Excel, CSV\/TSV):<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16851\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/file-specific-path-filters-json-300x152.png\" alt=\"\" width=\"697\" height=\"353\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/file-specific-path-filters-json-300x152.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/file-specific-path-filters-json-768x388.png 768w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/file-specific-path-filters-json.png 916w\" sizes=\"(max-width: 697px) 100vw, 697px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Finally, the editor also provides a preview option that allows you to test your Data Class search matchers and Masking Rules using text input:<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16852\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/text-input-300x110.png\" alt=\"\" width=\"745\" height=\"273\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/text-input-300x110.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/text-input-768x282.png 768w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/text-input.png 1008w\" sizes=\"(max-width: 745px) 100vw, 745px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">By clicking <\/span><i><span style=\"font-weight: 400;\">Preview<\/span><\/i><span style=\"font-weight: 400;\">, you can see what PII was found and how it was transformed using the current Data Classes and Masking Rules.<\/span><\/p>\n<h5><b>Running Your Search and Masking 
Jobs<\/b><\/h5>\n<p><span style=\"font-weight: 400;\">You can use your DarkShield job configuration in three different ways; i.e., in a:<\/span><\/p>\n<ol>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Search Job to simply identify PII and log the results to a file. Be aware that the search results logged to file (.*annotations.json) by a search job may contain PII found in your data source(s). The DarkShield (Base) and File APIs save these JSON files in your workspace, while the DarkShield NoSQL and RDB APIs store search results in the directory specified in the DarkShield API configuration file;<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Masking Job that will use the search log to mask the discovered PII; or,<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\">Search <i>and <\/i><span style=\"font-weight: 400;\">Masking Job to search and mask PII in one job.\u00a0<\/span><\/li>\n<\/ol>\n<p><span style=\"font-weight: 400;\">In this demonstration, we will be running a DarkShield Search and Mask Job. 
To run a DarkShield Search and Mask Job, right-click the .dsc file and select<\/span><i><span style=\"font-weight: 400;\"> IRI &gt; Run Search and Masking Job<\/span><\/i><span style=\"font-weight: 400;\">.\u00a0<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16853\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/ds-search-masking-job-300x291.png\" alt=\"\" width=\"618\" height=\"599\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/ds-search-masking-job-300x291.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/ds-search-masking-job.png 674w\" sizes=\"(max-width: 618px) 100vw, 618px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">After a Search and Mask Job runs, the PII found in the searched data is masked, and the masked output is placed in the target location previously specified in the wizard.<\/span><\/p>\n<p>If you are running a Search job on Excel files, an Excel Interchange File (EIF, with an .eif extension) is produced in the same directory as the DarkShield job. This file can be imported into the IRI CellShield Enterprise Edition (EE) product for bulk spreadsheet masking operations directly within Excel. 
See pages 2-5 in <a href=\"https:\/\/www.iri.com\/ftp9\/pdf\/CellShield\/CellShield_EE_V2_Overview.pdf\">this booklet<\/a> for more information.<\/p>\n<p><span style=\"font-weight: 400;\">Below are examples of my source and target files, showing how the PII in them appears before and after a DarkShield search and masking operation:<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-16854 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/example-word-file-300x112.png\" alt=\"\" width=\"573\" height=\"214\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/example-word-file-300x112.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/example-word-file.png 613w\" sizes=\"(max-width: 573px) 100vw, 573px\" \/><\/p>\n<p style=\"text-align: center;\"><i><span style=\"font-weight: 400;\">Word Document Unprotected<\/span><\/i><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16855\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/word-doc-masked-300x109.png\" alt=\"\" width=\"586\" height=\"213\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/word-doc-masked-300x109.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/word-doc-masked.png 617w\" sizes=\"(max-width: 586px) 100vw, 586px\" \/><\/p>\n<p style=\"text-align: center;\"><i><span style=\"font-weight: 400;\">Word Document Masked<\/span><\/i><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16856\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/Xls-Document-Unprotected-300x166.png\" alt=\"\" width=\"661\" height=\"366\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/Xls-Document-Unprotected-300x166.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/Xls-Document-Unprotected.png 704w\" sizes=\"(max-width: 661px) 100vw, 661px\" \/><\/p>\n<p style=\"text-align: 
center;\"><i><span style=\"font-weight: 400;\">Xls Document Unprotected<\/span><\/i><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16857\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/Xls-Document-Masked-300x168.png\" alt=\"\" width=\"657\" height=\"368\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/Xls-Document-Masked-300x168.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/Xls-Document-Masked.png 703w\" sizes=\"(max-width: 657px) 100vw, 657px\" \/><\/p>\n<p style=\"text-align: center;\"><em>Xls Document Masked<\/em><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16859\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/JPEG-Document-Unprotected-300x240.png\" alt=\"\" width=\"527\" height=\"421\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/JPEG-Document-Unprotected-300x240.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/JPEG-Document-Unprotected-768x615.png 768w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/JPEG-Document-Unprotected.png 780w\" sizes=\"(max-width: 527px) 100vw, 527px\" \/><\/p>\n<p style=\"text-align: center;\"><i><span style=\"font-weight: 400;\">JPEG Document Unprotected<\/span><\/i><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16860\" src=\"\/blog\/wp-content\/uploads\/2024\/01\/JPEG-Document-Masked-300x240.png\" alt=\"\" width=\"510\" height=\"408\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/JPEG-Document-Masked-300x240.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/JPEG-Document-Masked-768x614.png 768w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/JPEG-Document-Masked.png 774w\" sizes=\"(max-width: 510px) 100vw, 510px\" \/><\/p>\n<p style=\"text-align: center;\"><i><span style=\"font-weight: 
400;\">JPEG Document Masked<\/span><\/i><\/p>\n<p><span style=\"font-weight: 400;\">If you would like help using this wizard to scan and\/or mask data in your files, please contact your <\/span><a href=\"https:\/\/www.iri.com\/partners\/resellers\"><span style=\"font-weight: 400;\">IRI representative<\/span><\/a><span style=\"font-weight: 400;\"> or email <\/span><a href=\"mailto:darkshield@iri.com\"><span style=\"font-weight: 400;\">darkshield@iri.com<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<h5><b>Format-Specific File Configuration Options<\/b><\/h5>\n<p><span style=\"font-weight: 400;\">For several file types, various search\/mask configuration options are available to DarkShield API users. Configuration options are not essential, however, and reasonable defaults are used in the absence of any configuration option definitions.<\/span><\/p>\n<h6><strong>PDF Configuration Options<\/strong><\/h6>\n<pre><span style=\"font-weight: 400;\">disableImageCaching<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:\u00a0<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">Set to true if you wish to disable image caching across a document. This may help prevent out-of-memory issues when processing many unique embedded images within a document. 
However, it may also slow down processing if the document contains a lot of identical images (for example, a logo, or a background).<\/span><\/p>\n<pre><span style=\"font-weight: 400;\">\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: boolean<\/span>\r\n<span style=\"font-weight: 400;\">maxMainMemoryBytes<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:\u00a0<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0The maximum amount of memory in bytes to use when loading pages in a pdf. By default, DarkShield loads everything in memory, which speeds up processing but can cause out-of-memory issues for memory-constrained environments. If set to greater than 0, DarkShield will use a combination of memory and temporary files to iterate over the pages. Setting it to 0 will mean that only temporary files will be used.<\/span><\/p>\n<pre><span style=\"font-weight: 400;\">\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: integer<\/span>\r\n <span style=\"font-weight: 400;\">minimum<\/span><span style=\"font-weight: 400;\">: 0\r\n<\/span>\r\n<span style=\"font-weight: 400;\">onEncodingError<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:\u00a0<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0Define the behavior of the PDF remediator when the replacement text cannot be encoded in either the original or default fonts. By default, the original text will be redacted with a black box. 
The following options can be specified:\u00a0<\/span><\/p>\n<ol>\n<li><span style=\"font-weight: 400;\">redact: Replace the original text with a black box.<\/span><\/li>\n<li><span style=\"font-weight: 400;\">failedResult: Create a failed result.<\/span><\/li>\n<\/ol>\n<pre><span style=\"font-weight: 400;\">\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: string<\/span>\r\n <span style=\"font-weight: 400;\">enum<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0- redact<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0- failedResult<\/span>\r\n\r\n<span style=\"font-weight: 400;\">onTextOverflow<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:\u00a0<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0Define the behavior of the PDF remediator when the replacement text is longer than the original. By default, the original text will be replaced with the full replacement. 
The following options can be specified:\u00a0\u00a0<\/span><\/p>\n<ol>\n<li><span style=\"font-weight: 400;\">redact: Replace the original text with a black box.<\/span><\/li>\n<li><span style=\"font-weight: 400;\">replace: Replace the original text with the full replacement.<\/span><\/li>\n<li><span style=\"font-weight: 400;\">truncate: Truncate the replacement text to match the original text size.<\/span><\/li>\n<li><span style=\"font-weight: 400;\">failedResult: Create a failed result.<\/span><\/li>\n<\/ol>\n<pre><span style=\"font-weight: 400;\">\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: string<\/span>\r\n <span style=\"font-weight: 400;\">enum<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0- redact<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0- replace<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0- truncate<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0- failedResult<\/span>\r\n<span style=\"font-weight: 400;\">\r\nprettyTextReplacement<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">Set to true to indicate that the PDF remediator should use prettyTextReplacement. This setting allows for more seamless replacement of text in PDFs. The remediator will attempt to shift the text that follows the replacement by the additional width produced when the replacement text is wider than the original text.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">By comparison, under the default behavior, replacement text that is wider than the original may overlap the text that follows it. Furthermore, the default remediation (masking) operation supports fewer font types for text replacement. 
That said, remediation operations are significantly slower with this configuration option than with the default remediation behavior. Thus, use the default remediation behavior if only black-box redaction is needed, or if the PDF has enough space between words that overlapping is unlikely to be an issue.<\/span><\/p>\n<pre><span style=\"font-weight: 400;\">\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: boolean<\/span>\r\n\r\n<span style=\"font-weight: 400;\">setReplacement<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">Specify the URL of a set file (a file with a list of entries, each on its own line) from which to select data for populating a PDF form field.<\/span><\/p>\n<pre><span style=\"font-weight: 400;\">\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: array<\/span>\r\n <span style=\"font-weight: 400;\">items<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: string<\/span>\r\n<span style=\"font-weight: 400;\">\r\nsetReplacementFields<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">Specify the name of a form field in a PDF into which to insert data from a set replacement.<\/span><\/p>\n<pre><span style=\"font-weight: 400;\">\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: array<\/span>\r\n <span style=\"font-weight: 400;\">items<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0<\/span><span style=\"font-weight: 
400;\">type<\/span><span style=\"font-weight: 400;\">: string<\/span>\r\n\r\n<span style=\"font-weight: 400;\">setReplacementColumns<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">Specify a zero-based index of a column to select from a set file. Set files should have columns separated by a tab. For each setReplacement entry, there should be a setReplacementColumns entry; otherwise, the default is the first column (index 0).<\/span><\/p>\n<pre><span style=\"font-weight: 400;\">\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: array<\/span>\r\n <span style=\"font-weight: 400;\">items<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: integer<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0<\/span><span style=\"font-weight: 400;\">minimum<\/span><span style=\"font-weight: 400;\">: 0<\/span>\r\n\r\n<span style=\"font-weight: 400;\">disableImageProcessing<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">Set to true to disable image processing across a document. 
Embedded images will not undergo Optical Character Recognition (OCR), which will increase the speed of the processing.<\/span><\/p>\n<pre><span style=\"font-weight: 400;\">\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: boolean<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">See <\/span><a href=\"https:\/\/github.com\/TeamIRI\/darkshield-api-demos\/blob\/master\/test-data-generation\/application-form-generation\/setup.py\"><span style=\"font-weight: 400;\">https:\/\/github.com\/TeamIRI\/darkshield-api-demos\/blob\/master\/test-data-generation\/application-form-generation\/setup.py<\/span><\/a><span style=\"font-weight: 400;\"> for an example of synthesizing data into form fields of a PDF.<\/span><\/p>\n<h6><strong>Image Configuration Options<\/strong><\/h6>\n<pre><span style=\"font-weight: 400;\">boundingBoxes<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0A list of strings representing the upper left (x1, y1) and lower right (x2, y2) corners of the bounding box. 
Each string can be either four pixel positions in the image, given as whole numbers separated by spaces, or four ratios of the position in the image, given as decimals between 0 and 1 separated by commas.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0Examples:<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0Format 1: 23 100 73 114<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0Format 2: 0.07666666666666666,0.5238095238095238,0.7,0.6084656084656085<\/span><\/p>\n<pre><span style=\"font-weight: 400;\">\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: array<\/span>\r\n <span style=\"font-weight: 400;\">items<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: string<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0<\/span><span style=\"font-weight: 400;\">pattern<\/span><span style=\"font-weight: 400;\">: ^[\\d,. ]+$\r\n<\/span>\r\n<span style=\"font-weight: 400;\">targetFont<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">\u00a0A string that represents the font of the target text that OCR will read. 
For credit card:<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0Format 1: creditCard<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0Format 2: OCR-A<\/span><\/p>\n<pre><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: string<\/span>\r\n<span style=\"font-weight: 400;\">language<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">The language parameter for the OCR engine to use to parse the image. If no language is specified, English is assumed. Multiple languages may be specified, separated by plus (&#8216;+&#8217;) characters. The engine uses 3-character ISO 639-2 language codes.<\/span><\/p>\n<pre><span style=\"font-weight: 400;\">\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: string<\/span>\r\n <span style=\"font-weight: 400;\">pattern<\/span><span style=\"font-weight: 400;\">: <\/span><span style=\"font-weight: 400;\">\"^[a-z]{3}(\\\\+[a-z]{3})*$\"<\/span>\r\n<span style=\"font-weight: 400;\">\r\ntessConfigVariables<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:\u00a0<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">Additional Tesseract configuration parameters that can be passed to the engine.<\/span><\/p>\n<pre><span style=\"font-weight: 400;\">\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: object<\/span>\r\n <span style=\"font-weight: 400;\">additionalProperties<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: string<\/span>\r\n<span style=\"font-weight: 400;\">tessDataPath<\/span><span style=\"font-weight: 
400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">The path to the tessdata folder containing Tesseract language models. If not specified, DarkShield will use the tessdata folder inside of the API&#8217;s install directory, or create one if it does not exist. DarkShield will attempt to download the models for the languages that were set for the File Search Context if they do not already exist. Note that the path MUST be resolvable in the server environment, not the client&#8217;s file system.<\/span><\/p>\n<pre><span style=\"font-weight: 400;\">\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: string<\/span>\r\n<span style=\"font-weight: 400;\">\r\nuseOCR<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0Whether to use OCR or not. The default value is true. 
This can be set to false if only using user-specified bounding boxes of known regions of images, which significantly improves performance.<\/span><\/p>\n<pre><span style=\"font-weight: 400;\">\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: boolean<\/span>\r\n<span style=\"font-weight: 400;\">oem<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0OCR Engine modes<\/span><\/p>\n<ol>\n<li><span style=\"font-weight: 400;\"> \u00a0 Legacy engine only.<\/span><\/li>\n<li><span style=\"font-weight: 400;\"> \u00a0 Neural nets LSTM engine only.<\/span><\/li>\n<li><span style=\"font-weight: 400;\"> \u00a0 Legacy + LSTM engines.<\/span><\/li>\n<li><span style=\"font-weight: 400;\"> \u00a0 Default, based on what is available.<\/span><\/li>\n<\/ol>\n<pre><span style=\"font-weight: 400;\">\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: integer<\/span>\r\n <span style=\"font-weight: 400;\">minimum<\/span><span style=\"font-weight: 400;\">: 1<\/span>\r\n <span style=\"font-weight: 400;\">maximum<\/span><span style=\"font-weight: 400;\">: 4<\/span>\r\n<span style=\"font-weight: 400;\">\r\npsm<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0Page segmentation modes<\/span><\/p>\n<ol>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Orientation and script detection (OSD) only.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\"> Automatic page segmentation with OSD.<\/span><\/li>\n<li aria-level=\"1\"><span 
style=\"font-weight: 400;\"> Automatic page segmentation, but no OSD, or OCR. (not implemented)<\/span><\/li>\n<li aria-level=\"1\"><span style=\"font-weight: 400;\"> Fully automatic page segmentation, but no OSD. (Default)<\/span><\/li>\n<li aria-level=\"1\"><span style=\"font-weight: 400;\"> Assume a single column of text of variable sizes.<\/span><\/li>\n<li aria-level=\"1\"><span style=\"font-weight: 400;\"> Assume a single uniform block of vertically aligned text.<\/span><\/li>\n<li aria-level=\"1\"><span style=\"font-weight: 400;\"> Assume a single uniform block of text.<\/span><\/li>\n<li aria-level=\"1\"><span style=\"font-weight: 400;\"> Treat the image as a single text line.<\/span><\/li>\n<li aria-level=\"1\"><span style=\"font-weight: 400;\"> Treat the image as a single word.<\/span><\/li>\n<li aria-level=\"1\"><span style=\"font-weight: 400;\"> Treat the image as a single word in a circle.<\/span><\/li>\n<li aria-level=\"1\"><span style=\"font-weight: 400;\"> Treat the image as a single character.<\/span><\/li>\n<li aria-level=\"1\"><span style=\"font-weight: 400;\"> Sparse text. Find as much text as possible in no particular order.<\/span><\/li>\n<li aria-level=\"1\"><span style=\"font-weight: 400;\"> Sparse text with OSD.<\/span><\/li>\n<li aria-level=\"1\"><span style=\"font-weight: 400;\"> Raw line. 
Treat the image as a single text line, bypassing hacks that are Tesseract-specific.<\/span><\/li>\n<\/ol>\n<pre><span style=\"font-weight: 400;\">\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: integer<\/span>\r\n <span style=\"font-weight: 400;\">minimum<\/span><span style=\"font-weight: 400;\">: 1<\/span>\r\n <span style=\"font-weight: 400;\">maximum<\/span><span style=\"font-weight: 400;\">: 14<\/span>\r\n<span style=\"font-weight: 400;\">maskingMethod<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0The type of masking to apply to the image:<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a01 (default): Black Boxes.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a02: Replacement of text.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Black boxes completely redact text with a black rectangle. Replacement of text applies the masking rule associated with the search matcher that found the text, and inserts a generated image of the masked text into that section of the image. Specify replacement with: &#8220;replacement&#8221;. 
Black boxing is the default masking method if replacement is not specified.<\/span><\/p>\n<pre><span style=\"font-weight: 400;\">\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: string<\/span>\r\n<span style=\"font-weight: 400;\">\r\ncopyBackground<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">Whether or not to try to copy the color of the background when inserting replaced text into an image with the &#8216;replacement&#8217; maskingMethod set. The average RGB values will be calculated for the subregion of the image delimited by the bounding box. This can decrease performance. The default is false, which will put the text onto a white background.<\/span><\/p>\n<pre><span style=\"font-weight: 400;\">\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: boolean<\/span>\r\n\r\n<span style=\"font-weight: 400;\">setReplacement<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">Specify a set file (a file with a list of entries with each entry on its own line) URL to select data from for use in generating an image with the text of the entry and pasting over a bounding box region.<\/span><\/p>\n<pre><span style=\"font-weight: 400;\">\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: array<\/span>\r\n <span style=\"font-weight: 400;\">items<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: string<\/span>\r\n\r\n<span style=\"font-weight: 400;\">setReplacementColumns<\/span><span style=\"font-weight: 
400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">items<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: integer<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0<\/span><span style=\"font-weight: 400;\">minimum<\/span><span style=\"font-weight: 400;\">: 0<\/span>\r\n<span style=\"font-weight: 400;\">customFonts<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">Specify the name of a custom font to use when replacing text in an image.<\/span><\/p>\n<pre><span style=\"font-weight: 400;\">\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: array<\/span>\r\n <span style=\"font-weight: 400;\">items<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: string<\/span>\r\n<span style=\"font-weight: 400;\">customFontFiles<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">Specify the path of a file to load a custom font from.<\/span><\/p>\n<pre><span style=\"font-weight: 400;\">\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: array<\/span>\r\n <span style=\"font-weight: 400;\">items<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: string<\/span><\/pre>\n<p><span 
style=\"font-weight: 400;\">See <\/span><a href=\"https:\/\/github.com\/TeamIRI\/darkshield-api-demos\/blob\/master\/test-data-generation\/credit-card-generation\/setup.py\"><span style=\"font-weight: 400;\">https:\/\/github.com\/TeamIRI\/darkshield-api-demos\/blob\/master\/test-data-generation\/credit-card-generation\/setup.py<\/span><\/a><span style=\"font-weight: 400;\"> and <\/span><a href=\"https:\/\/github.com\/TeamIRI\/darkshield-api-demos\/blob\/master\/test-data-generation\/check-generation\/setup.py\"><span style=\"font-weight: 400;\">https:\/\/github.com\/TeamIRI\/darkshield-api-demos\/blob\/master\/test-data-generation\/check-generation\/setup.py<\/span><\/a><span style=\"font-weight: 400;\"> for examples of synthesizing text into images.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">See <\/span><a href=\"https:\/\/github.com\/TeamIRI\/darkshield-api-demos\/blob\/master\/pdf-image\/image-text-replacement\/setup.py\"><span style=\"font-weight: 400;\">https:\/\/github.com\/TeamIRI\/darkshield-api-demos\/blob\/master\/pdf-image\/image-text-replacement\/setup.py<\/span><\/a><span style=\"font-weight: 400;\"> for an example of replacing text in images, rather than redacting it with a black box.<\/span><\/p>\n<h6><strong>JSON Configuration Options<\/strong><\/h6>\n<pre><span style=\"font-weight: 400;\">json<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:\u00a0<\/span>\r\n<span style=\"font-weight: 400;\">The configuration for reading and writing JSON documents during masking.<\/span>\r\n <span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: object<\/span>\r\n <span style=\"font-weight: 400;\">properties<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\">prettyPrint<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0<\/span><span 
style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\">Set to true if the JSON document should be written out in a human-readable \r\nformat with proper indentation.<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: boolean<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">See <\/span><a href=\"https:\/\/github.com\/TeamIRI\/darkshield-api-demos\/blob\/master\/json-xml\/setup.py\"><span style=\"font-weight: 400;\">https:\/\/github.com\/TeamIRI\/darkshield-api-demos\/blob\/master\/json-xml\/setup.py<\/span><\/a><span style=\"font-weight: 400;\"> for an example setting the <\/span><i><span style=\"font-weight: 400;\">prettyPrint<\/span><\/i> <span style=\"font-weight: 400;\">option to true.<\/span><\/p>\n<h6><strong>Fixed-Width Configuration Options<\/strong><\/h6>\n<pre><span style=\"font-weight: 400;\">fixed-width<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:\u00a0<\/span>\r\n<span style=\"font-weight: 400;\">    The configuration for reading and writing fixed width documents \r\n    during masking.<\/span>\r\n <span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: object<\/span>\r\n <span style=\"font-weight: 400;\">properties<\/span><span style=\"font-weight: 400;\">:<\/span><\/pre>\n<pre><span style=\"font-weight: 400;\">columnWidths<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0<\/span><span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Sets the length of fixed width document columns. 
The order in which \r\n       the lengths are listed is the order in which they will be evaluated. \r\n       Configuration for columnWidths cannot be left empty and cannot have \r\n       values less than one.<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: array<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0<\/span><span style=\"font-weight: 400;\">minItems<\/span><span style=\"font-weight: 400;\">: 1<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0<\/span><span style=\"font-weight: 400;\">items<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: integer<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span style=\"font-weight: 400;\">minimum<\/span><span style=\"font-weight: 400;\">: 1<\/span><\/pre>\n<p><span style=\"font-weight: 400;\">See <\/span><a href=\"https:\/\/github.com\/TeamIRI\/darkshield-api-demos\/blob\/master\/fixed-width\/setup.py\"><span style=\"font-weight: 400;\">https:\/\/github.com\/TeamIRI\/darkshield-api-demos\/blob\/master\/fixed-width\/setup.py<\/span><\/a><span style=\"font-weight: 400;\"> for an example of setting columnWidths to the widths of columns in a fixed-width file.<\/span><\/p>\n<h6><strong>Plain Text Configuration Options<\/strong><\/h6>\n<pre><span style=\"font-weight: 400;\">text<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\">       The configuration for reading plaintext (text\/plain) documents. 
\r\n       If the bufferLimit and delimiter values are not set, the entire \r\n       document will be read into memory.<\/span>\r\n <span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: object<\/span>\r\n <span style=\"font-weight: 400;\">properties<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n\r\n<span style=\"font-weight: 400;\">bufferLimit<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0<\/span><span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\">       The maximum size of a text block which will be searched as one unit. \r\n       If no delimiter is set, then text blocks are delineated by the newline \r\n       ('\\n') character.<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: integer<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0<\/span><span style=\"font-weight: 400;\">minimum<\/span><span style=\"font-weight: 400;\">: 1<\/span>\r\n<span style=\"font-weight: 400;\">\r\ndelimiter<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0<\/span><span style=\"font-weight: 400;\">minLength<\/span><span style=\"font-weight: 400;\">: 1<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: string<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0<\/span><span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\">        The string delimiter for delineating text blocks along with \r\n        the buffer limit. 
If no buffer limit is set, then text blocks \r\n        of up to 4096 characters will be used.<\/span><\/pre>\n<h6><strong>CSV Configuration Options<\/strong><\/h6>\n<pre><span style=\"font-weight: 400;\">comment<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\">       Sets the comment character used to skip comment lines in \r\n       CSV documents. The default is '#'. Set to '\\0' in order to process \r\n       comment lines as single-valued records.<\/span>\r\n <span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: string<\/span>\r\n <span style=\"font-weight: 400;\">maxLength<\/span><span style=\"font-weight: 400;\">: 1<\/span>\r\n <span style=\"font-weight: 400;\">minLength<\/span><span style=\"font-weight: 400;\">: 1<\/span>\r\n\r\n<span style=\"font-weight: 400;\">delimiter<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\">        Sets the delimiter to use to parse out the values in a CSV record. \r\n        This option will override the delimiter detection if it is present \r\n        in the config.<\/span>\r\n <span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: string<\/span>\r\n <span style=\"font-weight: 400;\">minLength<\/span><span style=\"font-weight: 400;\">: 1<\/span>\r\n\r\n<span style=\"font-weight: 400;\">delimiterDetection<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\">        Sets the characters that will be used during the automatic \r\n        delimiter detection. The order in which the characters are listed \r\n        is the order in which they will be evaluated. 
\r\n        This option is overridden if a delimiter character is specified, \r\n        in which case no detection occurs. \r\n        The default delimiters are ',', '|', ';', and '\\t'.<\/span>\r\n <span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: array<\/span>\r\n <span style=\"font-weight: 400;\">minItems<\/span><span style=\"font-weight: 400;\">: 1<\/span>\r\n <span style=\"font-weight: 400;\">items<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: string<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0<\/span><span style=\"font-weight: 400;\">maxLength<\/span><span style=\"font-weight: 400;\">: 1<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0<\/span><span style=\"font-weight: 400;\">minLength<\/span><span style=\"font-weight: 400;\">: 1<\/span>\r\n\r\n<span style=\"font-weight: 400;\">lineSeparator<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\">      \u00a0\u00a0Sets the line separator to use to parse records in CSV documents. \r\n        By default, the parser will attempt to detect the standard OS line \r\n        separators or a null terminator ('\\0').<\/span>\r\n <span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: string<\/span>\r\n <span style=\"font-weight: 400;\">minLength<\/span><span style=\"font-weight: 400;\">: 1<\/span>\r\n<span style=\"font-weight: 400;\">maxCharsPerColumn<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\">        Sets the maximum number of characters that can be read as part of \r\n        a value in a record. 
A parsing error will occur if no delimiter or \r\n        line separator is detected within that character limit. \r\n        The default is 4096 characters.<\/span>\r\n <span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: integer<\/span>\r\n <span style=\"font-weight: 400;\">minimum<\/span><span style=\"font-weight: 400;\">: 1<\/span>\r\n<span style=\"font-weight: 400;\">maxColumns<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\">        Sets the maximum number of columns that will be parsed per record. \r\n        A parsing error will occur if a line separator is not encountered \r\n        before the maximum number of columns is parsed. \r\n        The default is 512 columns.<\/span>\r\n <span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: integer<\/span>\r\n <span style=\"font-weight: 400;\">minimum<\/span><span style=\"font-weight: 400;\">: 1<\/span>\r\n\r\n<span style=\"font-weight: 400;\">quote<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\">        Sets the quote character to indicate the start and end of a value \r\n        inside a CSV record. 
The default is '\"'.<\/span>\r\n <span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: string<\/span>\r\n <span style=\"font-weight: 400;\">maxLength<\/span><span style=\"font-weight: 400;\">: 1<\/span>\r\n <span style=\"font-weight: 400;\">minLength<\/span><span style=\"font-weight: 400;\">: 1<\/span>\r\n\r\n<span style=\"font-weight: 400;\">quoteEscape<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:\u00a0<\/span>\r\n<span style=\"font-weight: 400;\">        Sets the escape character for the quote character in CSV documents. \r\n        The default is '\"'.<\/span>\r\n <span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: string<\/span>\r\n <span style=\"font-weight: 400;\">maxLength<\/span><span style=\"font-weight: 400;\">: 1<\/span>\r\n <span style=\"font-weight: 400;\">minLength<\/span><span style=\"font-weight: 400;\">: 1<\/span><\/pre>\n<h6><strong>DICOM Configuration Options<\/strong><\/h6>\n<pre><span style=\"font-weight: 400;\">dicom<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n <span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:\u00a0<\/span>\r\n<span style=\"font-weight: 400;\">          The configuration for reading and writing DICOM documents \r\n          during masking.<\/span>\r\n <span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: object<\/span>\r\n\r\n<span style=\"font-weight: 400;\">properties<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0<\/span><span style=\"font-weight: 400;\">blackBoxes<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0<\/span><span style=\"font-weight: 400;\">description<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\">         Specify a list of 
black boxes to apply to the pixel data \r\n         in the DICOM file.<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0<\/span><span style=\"font-weight: 400;\">type<\/span><span style=\"font-weight: 400;\">: array<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0<\/span><span style=\"font-weight: 400;\">items<\/span><span style=\"font-weight: 400;\">:<\/span>\r\n<span style=\"font-weight: 400;\"> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span style=\"font-weight: 400;\">$ref<\/span><span style=\"font-weight: 400;\">: <\/span><span style=\"font-weight: 400;\">'#\/components\/schemas\/BlackBox'<\/span><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>IRI DarkShield includes fit-for-purpose facilities in the graphical IRI Workbench IDE that build jobs to search (classify) and mask (remediate) PII and other sensitive data in \u201cdark data\u201d sources. Gartner defines this as data not normally used for analytics; i.e., what is usually collected and stored in semi-structured and unstructured sources. 
The file formats containing<\/p>\n<div><a class=\"btn-filled btn\" href=\"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/\" title=\"Find &#038; Mask File PII in the DarkShield GUI\">Read More<\/a><\/div>\n","protected":false},"author":152,"featured_media":16868,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"footnotes":""},"categories":[8,91,29],"tags":[1721,1720,1718,1719],"class_list":["post-16800","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-protection","category-iri-workbench","category-test-data","tag-darkshield-files-wizard","tag-files-wizard","tag-finding-pii","tag-masking-pii"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v23.3 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Find &amp; Mask File PII in the DarkShield GUI - IRI<\/title>\n<meta name=\"description\" content=\"IRI DarkShield includes fit-for-purpose facilities in the graphical IRI Workbench IDE that build jobs to search (classify) and mask (remediate) PII and other sensitive data in \u201cdark data\u201d sources. 
Gartner defines this as data not normally used for analytics; i.e., what is usually collected and stored in semi-structured and unstructured sources.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Find &amp; Mask File PII in the DarkShield GUI - IRI\" \/>\n<meta property=\"og:description\" content=\"IRI DarkShield includes fit-for-purpose facilities in the graphical IRI Workbench IDE that build jobs to search (classify) and mask (remediate) PII and other sensitive data in \u201cdark data\u201d sources. Gartner defines this as data not normally used for analytics; i.e., what is usually collected and stored in semi-structured and unstructured sources.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/\" \/>\n<meta property=\"og:site_name\" content=\"IRI\" \/>\n<meta property=\"article:published_time\" content=\"2024-01-03T21:07:31+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-01-05T20:18:36+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/Featured-Image-Files-Wizard.png\" \/>\n\t<meta property=\"og:image:width\" content=\"768\" \/>\n\t<meta property=\"og:image:height\" content=\"368\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Adam Lewis\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Adam Lewis\" \/>\n\t<meta name=\"twitter:label2\" 
content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"26 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/\"},\"author\":{\"name\":\"Adam Lewis\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/37c0e5beab094bd61cc521902df2876e\"},\"headline\":\"Find &#038; Mask File PII in the DarkShield GUI\",\"datePublished\":\"2024-01-03T21:07:31+00:00\",\"dateModified\":\"2024-01-05T20:18:36+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/\"},\"wordCount\":3899,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/Featured-Image-Files-Wizard.png\",\"keywords\":[\"DarkShield Files Wizard\",\"files wizard\",\"finding pii\",\"masking pii\"],\"articleSection\":[\"Data Masking\/Protection\",\"IRI Workbench\",\"Test 
Data\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/\",\"url\":\"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/\",\"name\":\"Find & Mask File PII in the DarkShield GUI - IRI\",\"isPartOf\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/Featured-Image-Files-Wizard.png\",\"datePublished\":\"2024-01-03T21:07:31+00:00\",\"dateModified\":\"2024-01-05T20:18:36+00:00\",\"description\":\"IRI DarkShield includes fit-for-purpose facilities in the graphical IRI Workbench IDE that build jobs to search (classify) and mask (remediate) PII and other sensitive data in \u201cdark data\u201d sources. 
Gartner defines this as data not normally used for analytics; i.e., what is usually collected and stored in semi-structured and unstructured sources.\",\"breadcrumb\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/#primaryimage\",\"url\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/Featured-Image-Files-Wizard.png\",\"contentUrl\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/Featured-Image-Files-Wizard.png\",\"width\":768,\"height\":368},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/beta.iri.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Find &#038; Mask File PII in the DarkShield GUI\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#website\",\"url\":\"https:\/\/beta.iri.com\/blog\/\",\"name\":\"IRI\",\"description\":\"Total Data Management Blog\",\"publisher\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/beta.iri.com\/blog\/?s={search_term_string}\"},\"query-input\":\"required 
name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#organization\",\"name\":\"IRI\",\"url\":\"https:\/\/beta.iri.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png\",\"contentUrl\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png\",\"width\":750,\"height\":206,\"caption\":\"IRI\"},\"image\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/37c0e5beab094bd61cc521902df2876e\",\"name\":\"Adam Lewis\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/087667d0c75d33bb6fab6e734bd89333?s=96&d=blank&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/087667d0c75d33bb6fab6e734bd89333?s=96&d=blank&r=g\",\"caption\":\"Adam Lewis\"},\"url\":\"https:\/\/beta.iri.com\/blog\/author\/adaml\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Find & Mask File PII in the DarkShield GUI - IRI","description":"IRI DarkShield includes fit-for-purpose facilities in the graphical IRI Workbench IDE that build jobs to search (classify) and mask (remediate) PII and other sensitive data in \u201cdark data\u201d sources. 
Gartner defines this as data not normally used for analytics; i.e., what is usually collected and stored in semi-structured and unstructured sources.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/","og_locale":"en_US","og_type":"article","og_title":"Find & Mask File PII in the DarkShield GUI - IRI","og_description":"IRI DarkShield includes fit-for-purpose facilities in the graphical IRI Workbench IDE that build jobs to search (classify) and mask (remediate) PII and other sensitive data in \u201cdark data\u201d sources. Gartner defines this as data not normally used for analytics; i.e., what is usually collected and stored in semi-structured and unstructured sources.","og_url":"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/","og_site_name":"IRI","article_published_time":"2024-01-03T21:07:31+00:00","article_modified_time":"2024-01-05T20:18:36+00:00","og_image":[{"width":768,"height":368,"url":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/Featured-Image-Files-Wizard.png","type":"image\/png"}],"author":"Adam Lewis","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Adam Lewis","Est. 
reading time":"26 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/#article","isPartOf":{"@id":"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/"},"author":{"name":"Adam Lewis","@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/37c0e5beab094bd61cc521902df2876e"},"headline":"Find &#038; Mask File PII in the DarkShield GUI","datePublished":"2024-01-03T21:07:31+00:00","dateModified":"2024-01-05T20:18:36+00:00","mainEntityOfPage":{"@id":"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/"},"wordCount":3899,"commentCount":0,"publisher":{"@id":"https:\/\/beta.iri.com\/blog\/#organization"},"image":{"@id":"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/#primaryimage"},"thumbnailUrl":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/Featured-Image-Files-Wizard.png","keywords":["DarkShield Files Wizard","files wizard","finding pii","masking pii"],"articleSection":["Data Masking\/Protection","IRI Workbench","Test Data"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/","url":"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/","name":"Find & Mask File PII in the DarkShield GUI - 
IRI","isPartOf":{"@id":"https:\/\/beta.iri.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/#primaryimage"},"image":{"@id":"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/#primaryimage"},"thumbnailUrl":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/Featured-Image-Files-Wizard.png","datePublished":"2024-01-03T21:07:31+00:00","dateModified":"2024-01-05T20:18:36+00:00","description":"IRI DarkShield includes fit-for-purpose facilities in the graphical IRI Workbench IDE that build jobs to search (classify) and mask (remediate) PII and other sensitive data in \u201cdark data\u201d sources. Gartner defines this as data not normally used for analytics; i.e., what is usually collected and stored in semi-structured and unstructured sources.","breadcrumb":{"@id":"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/#primaryimage","url":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/Featured-Image-Files-Wizard.png","contentUrl":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/Featured-Image-Files-Wizard.png","width":768,"height":368},{"@type":"BreadcrumbList","@id":"https:\/\/beta.iri.com\/blog\/data-protection\/finding-and-masking-pii-in-files-with-the-darkshield-files-wizard\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/beta.iri.com\/blog\/"},{"@type":"ListItem","position":2,"na
me":"Find &#038; Mask File PII in the DarkShield GUI"}]},{"@type":"WebSite","@id":"https:\/\/beta.iri.com\/blog\/#website","url":"https:\/\/beta.iri.com\/blog\/","name":"IRI","description":"Total Data Management Blog","publisher":{"@id":"https:\/\/beta.iri.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/beta.iri.com\/blog\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/beta.iri.com\/blog\/#organization","name":"IRI","url":"https:\/\/beta.iri.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png","contentUrl":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png","width":750,"height":206,"caption":"IRI"},"image":{"@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/37c0e5beab094bd61cc521902df2876e","name":"Adam Lewis","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/087667d0c75d33bb6fab6e734bd89333?s=96&d=blank&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/087667d0c75d33bb6fab6e734bd89333?s=96&d=blank&r=g","caption":"Adam 
Lewis"},"url":"https:\/\/beta.iri.com\/blog\/author\/adaml\/"}]}},"jetpack_featured_media_url":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2024\/01\/Featured-Image-Files-Wizard.png","_links":{"self":[{"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/posts\/16800"}],"collection":[{"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/users\/152"}],"replies":[{"embeddable":true,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/comments?post=16800"}],"version-history":[{"count":26,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/posts\/16800\/revisions"}],"predecessor-version":[{"id":17860,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/posts\/16800\/revisions\/17860"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/media\/16868"}],"wp:attachment":[{"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/media?parent=16800"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/categories?post=16800"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/tags?post=16800"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}