{"id":15269,"date":"2021-11-30T16:14:20","date_gmt":"2021-11-30T21:14:20","guid":{"rendered":"http:\/\/www.iri.com\/blog\/?p=15269"},"modified":"2021-11-30T16:14:20","modified_gmt":"2021-11-30T21:14:20","slug":"find-mask-pii-in-bigtable-cosmos-and-dynamo","status":"publish","type":"post","link":"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/","title":{"rendered":"Find &#038; Mask PII in BigTable, Cosmos and Dynamo NoSQL DBs"},"content":{"rendered":"<p><span style=\"font-weight: 400;\">Abstract: <\/span><i><span style=\"font-weight: 400;\">This article covers the use of the <\/span><\/i><a href=\"https:\/\/www.iri.com\/blog\/data-protection\/darkshield-rpc-api\/\"><i><span style=\"font-weight: 400;\">IRI DarkShield API<\/span><\/i><\/a><i><span style=\"font-weight: 400;\"> for automatically locating and de-identifying PII or other sensitive data in the three major cloud provider NoSQL databases &#8212; Google BigTable, MS CosmosDB in Azure, and Amazon DynamoDB. Prior articles in this blog cover how DarkShield wizards in IRI Workbench find and mask data in other popular NoSQL DBs, including <\/span><\/i><a href=\"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/\"><i><span style=\"font-weight: 400;\">Cassandra, Elasticsearch and MongoDB<\/span><\/i><\/a><span style=\"font-weight: 400;\">.<span id='easy-footnote-1-15269' class='easy-footnote-margin-adjust'><\/span><span class='easy-footnote'><a href='https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/#easy-footnote-bottom-1-15269' title='\u00a0Note that the same DarkShield base API described herein can also be used on those three as well, and IRI is now also working to support Couchbase, Redis, and Solr. 
The &lt;a href=&quot;https:\/\/www.iri.com\/blog\/data-protection\/darkshield-files-rpc-api\/&quot;&gt;DarkShield API for files&lt;\/a&gt; finds and masks data in RDB C\/BLOB columns, unstructured text and log files, semi-structured EDI files like HL7, JSON, X12 and XML, MS and PDF documents and many image formats.'><sup>1<\/sup><\/a><\/span><\/span><i><span style=\"font-weight: 400;\">\u00a0A subsequent article covers CouchDB, Redis and Solr.<\/span><\/i><\/p>\n<h6><b>What is NoSQL?<\/b><\/h6>\n<p><span style=\"font-weight: 400;\">NoSQL typically stands for \u201cnot only SQL\u201d, although some say it stands for \u201cnon SQL\u201d. NoSQL was introduced as an alternative to relational databases, which at the time were the dominant force in the industry.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Because NoSQL databases are non-tabular, they store data differently than SQL databases do. There are several types of NoSQL databases, distinguished by their data model. 
These data models include documents, key-value pairs, wide-column, and graphs.<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-15280 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/11\/sql-nosql-database-diagram.png\" alt=\"\" width=\"651\" height=\"348\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/sql-nosql-database-diagram.png 687w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/sql-nosql-database-diagram-300x160.png 300w\" sizes=\"(max-width: 651px) 100vw, 651px\" \/><\/p>\n<h6><b>The Strength of NoSQL Databases<\/b><\/h6>\n<p><span style=\"font-weight: 400;\">According to CloudGuru.com, relational databases have \u201c<\/span><i><span style=\"font-weight: 400;\">inflexible schemas and notoriously difficult horizontal scaling [which means] they don\u2019t always fit well in a highly scalable and geographically distributed infrastructure stack<\/span><\/i><span style=\"font-weight: 400;\">\u201d<span id='easy-footnote-2-15269' class='easy-footnote-margin-adjust'><\/span><span class='easy-footnote'><a href='https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/#easy-footnote-bottom-2-15269' title='Vanbuskirk, Mike Nov, et al. \u201cNoSQL Databases Comparison: Cosmos DB VS DynamoDB VS Cloud Datastore and Bigtable.\u201d &lt;i&gt;A Cloud Guru&lt;\/i&gt;, 25 June 2021, &lt;a href=&quot;https:\/\/acloudguru.com\/blog\/engineering\/comparing-cloud-nosql-databases-dynamodb-vs-cosmos-db-vs-cloud-datastore-and-bigtable&quot;&gt;acloudguru.com\/blog\/engineering\/comparing-cloud-nosql-databases-dynamodb-vs-cosmos-db-vs-cloud-datastore-and-bigtable&lt;\/a&gt;'><sup>2<\/sup><\/a><\/span><\/span><span style=\"font-weight: 400;\">.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In comparison, the flexibility of the NoSQL document-model makes it easier to change data. 
NoSQL databases are also easier to scale horizontally, and usually the cloud providers handle the operational overhead of managing infrastructure.<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-15278 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/11\/nosql-database-graphic.png\" alt=\"\" width=\"650\" height=\"372\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/nosql-database-graphic.png 760w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/nosql-database-graphic-300x172.png 300w\" sizes=\"(max-width: 650px) 100vw, 650px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Decision makers generally weigh a few factors when choosing NoSQL over a relational database. According to MongoDB, the drivers are: \u201c<\/span><i><span style=\"font-weight: 400;\">fast-paced Agile development, storage of structured and semi-structured data, huge volumes of data, requirements for scale-out architecture, modern application paradigms like microservices and real-time streaming<\/span><\/i><span style=\"font-weight: 400;\">\u201d<span id='easy-footnote-3-15269' class='easy-footnote-margin-adjust'><\/span><span class='easy-footnote'><a href='https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/#easy-footnote-bottom-3-15269' title='&lt;\/span&gt;&lt;span style=&quot;font-weight: 400;&quot;&gt;What Is Nosql? 
NoSQL Databases Explained.\u201d &lt;\/span&gt;&lt;i&gt;&lt;span style=&quot;font-weight: 400;&quot;&gt;MongoDB&lt;\/span&gt;&lt;\/i&gt;&lt;span style=&quot;font-weight: 400;&quot;&gt;, &lt;\/span&gt;&lt;a href=&quot;http:\/\/www.mongodb.com\/nosql-explained&quot;&gt;&lt;span style=&quot;font-weight: 400;&quot;&gt;www.mongodb.com\/nosql-explained&lt;\/span&gt;&lt;\/a&gt;&lt;span style=&quot;font-weight: 400;&quot;&gt;.'><sup>3<\/sup><\/a><\/span><\/span><span style=\"font-weight: 400;\">.<\/span><\/p>\n<h6><b>NoSQL DB Security Concerns<\/b><\/h6>\n<p><span style=\"font-weight: 400;\">As with traditional relational (SQL) databases, NoSQL DBs have similar security issues, but also some unique risks. According to the International Journal of Digital Society, NoSQL vulnerabilities include: \u201c<\/span><i><span style=\"font-weight: 400;\">insufficient or ineffective input validation, errors in the application level permissions handling, weak authentication, insecure communication, illegal access to unencrypted data, etc. are some of the vulnerabilities applicable for NoSQL<\/span><\/i><span style=\"font-weight: 400;\">\u201d<\/span><span style=\"font-weight: 400;\"><span id='easy-footnote-4-15269' class='easy-footnote-margin-adjust'><\/span><span class='easy-footnote'><a href='https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/#easy-footnote-bottom-4-15269' title='Shahriar, Hossain, and Hisham M Haddad. \u201cSecurity Vulnerabilities of NoSQL and SQL Databases for MOOC Applications.\u201d International Journal of Digital Society, Mar. 2017.'><sup>4<\/sup><\/a><\/span><\/span><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Like SQL injections, NoSQL injections are also possible when input validation is not handled properly. 
Because NoSQL databases do not have a common query language, queries are written in the programming language (PHP, JavaScript, Python, etc) of the application connected to the database. This means NoSQL injections can result in commands being executed not only in the database, but also in the application itself.\u00a0<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-15279 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/11\/nosql-security-graphic-1024x412.png\" alt=\"\" width=\"651\" height=\"262\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/nosql-security-graphic-1024x412.png 1024w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/nosql-security-graphic-300x121.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/nosql-security-graphic-768x309.png 768w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/nosql-security-graphic.png 1026w\" sizes=\"(max-width: 651px) 100vw, 651px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">There is a long list of endpoint security practices for NoSQL DBs. But even with them, would-be assailants still manage to punch holes in those defenses. 
Companies must thus evolve to harden the security profile of these collections with another level of protection.\u00a0\u00a0<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-15282 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/11\/data-leak-hard-drive-1024x664.jpg\" alt=\"\" width=\"632\" height=\"410\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/data-leak-hard-drive-1024x664.jpg 1024w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/data-leak-hard-drive-300x194.jpg 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/data-leak-hard-drive-768x498.jpg 768w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/data-leak-hard-drive.jpg 1110w\" sizes=\"(max-width: 632px) 100vw, 632px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">This is where <\/span><a href=\"https:\/\/www.iri.com\/products\/darkshield\"><span style=\"font-weight: 400;\">IRI DarkShield<\/span><\/a><span style=\"font-weight: 400;\"> comes in. As a data-centric, or \u201cstartpoint security\u201d solution, DarkShield masking provides another important layer of data protection atop the end-point measures deployed by cloud database service providers.<\/span><\/p>\n<h6><b>About IRI DarkShield<\/b><\/h6>\n<p><span style=\"font-weight: 400;\">IRI DarkShield is a data masking tool for finding and de-identifying sensitive data in semi-structured and unstructured files and databases. 
DarkShield is one of three core data masking products in the <\/span><a href=\"https:\/\/www.iri.com\/products\/iri-data-protector\"><span style=\"font-weight: 400;\">IRI Data Protector Suite<\/span><\/a><span style=\"font-weight: 400;\"> which leverage graphical data classification, searching, and masking job design models in the <\/span><a href=\"https:\/\/www.iri.com\/products\/workbench\"><span style=\"font-weight: 400;\">IRI Workbench<\/span><\/a><span style=\"font-weight: 400;\"> IDE, built on Eclipse.<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-15283 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/11\/glasses-in-front-of-code-1024x484.png\" alt=\"\" width=\"650\" height=\"307\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/glasses-in-front-of-code-1024x484.png 1024w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/glasses-in-front-of-code-300x142.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/glasses-in-front-of-code-768x363.png 768w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/glasses-in-front-of-code.png 1200w\" sizes=\"(max-width: 650px) 100vw, 650px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">As of DarkShield Version 4, however, two powerful Remote Procedure Call (RPC) Application Programming Interface (API) versions are also provided: the \u201cBase\u201d DarkShield API and the DarkShield-Files API. 
The DarkShield APIs extend the use of DarkShield functionality outside of Workbench and leverage a plugin on top of an IRI Web Services platform named Plankton.<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-14216 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/01\/iri-web-services-architecture-cropped.png\" alt=\"\" width=\"401\" height=\"411\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/01\/iri-web-services-architecture-cropped.png 438w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/01\/iri-web-services-architecture-cropped-293x300.png 293w\" sizes=\"(max-width: 401px) 100vw, 401px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">To find and protect sensitive data in a wide range of sources, the DarkShield APIs use user-specified search matchers and masking rules that reflect business rules. For more information on creating search matchers and masking rules, please refer to this <\/span><a href=\"https:\/\/www.iri.com\/blog\/data-protection\/darkshield-rpc-api\/\"><span style=\"font-weight: 400;\">article<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The \u201cBase\u201d DarkShield API is used to search and mask unstructured text outside the context of files. The DarkShield-Files API, by contrast, searches and masks PII in files.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">With the DarkShield-Files API, semi-structured and unstructured data such as plain text files, CSV\/TSV, Word documents, Excel, PDF, JSON, XML, Parquet, and JPEG and PNG images can be searched and masked.<\/span><\/p>\n<h6><b>AWS DynamoDB, Azure CosmosDB, Google BigTable and the DarkShield API<\/b><\/h6>\n<p><span style=\"font-weight: 400;\">The leading cloud services for NoSQL databases are Amazon\u2019s DynamoDB on AWS, Microsoft\u2019s Azure CosmosDB, and Google\u2019s Cloud BigTable. 
This article focuses on these three well-known services and on how the DarkShield-Files API can be used to search and mask PII inside their cloud-hosted NoSQL databases.<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-15004 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/09\/cloud-services-collage-e1632854430719.png\" alt=\"\" width=\"550\" height=\"307\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/09\/cloud-services-collage-e1632854430719.png 661w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/09\/cloud-services-collage-e1632854430719-300x167.png 300w\" sizes=\"(max-width: 550px) 100vw, 550px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">If you are unfamiliar with connecting to and querying NoSQL databases programmatically, not to worry. AWS, Azure, and Google Cloud all publish extensive documentation on how to access database content using software development kits (SDKs) for various programming languages.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The DarkShield-Files API demos currently uploaded to GitHub are written in Python; as such, those projects use client libraries for Python. However, other calling languages, like Java, can be used.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">These calling programs, or \u201cglue code\u201d to the API, are where these procedures are defined. 
The DarkShield-Files API demos are linked below:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><a href=\"https:\/\/github.com\/TeamIRI\/darkshield-api-demos\/tree\/master\/cosmosdb\"><span style=\"font-weight: 400;\">Azure CosmosDB<\/span><\/a><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><a href=\"https:\/\/github.com\/TeamIRI\/darkshield-api-demos\/tree\/master\/dynamodb\"><span style=\"font-weight: 400;\">AWS DynamoDB<\/span><\/a><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><a href=\"https:\/\/github.com\/TeamIRI\/darkshield-api-demos\/tree\/master\/bigtable\"><span style=\"font-weight: 400;\">Google Cloud BigTable<\/span><\/a><\/li>\n<\/ul>\n<p><b>DarkShield Search and Mask Contexts<\/b><\/p>\n<p><span style=\"font-weight: 400;\">Each of the IRI <\/span><i><span style=\"font-weight: 400;\">darkshield-files-api<\/span><\/i><span style=\"font-weight: 400;\"> demos on GitHub includes a setup file. The setup file defines the search context, mask context, file search context, and file mask context that the DarkShield-Files API needs. 
Without these contexts defined, the DarkShield-Files API will not search or mask.<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-15284 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-darkshield-api-search-context.png\" alt=\"\" width=\"405\" height=\"442\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-darkshield-api-search-context.png 405w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-darkshield-api-search-context-275x300.png 275w\" sizes=\"(max-width: 405px) 100vw, 405px\" \/><i><span style=\"font-weight: 400;\">DarkShield API Search Context<\/span><\/i><\/p>\n<p><span style=\"font-weight: 400;\">A search context uses matchers to designate the PII that will be annotated in the files it reads. Several types of search matchers are available. The DarkShield-Files API supports search matchers based on regular expressions, named entity recognition (NER) models, and lookups of predefined values stored in SET files.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The image above displays an <\/span><i><span style=\"font-weight: 400;\">EmailMatcher<\/span><\/i><span style=\"font-weight: 400;\"> that uses a regular expression pattern to find text containing an \u201c@\u201d and a website suffix, an <\/span><i><span style=\"font-weight: 400;\">SsnMatcher<\/span><\/i><span style=\"font-weight: 400;\"> that uses a regular expression pattern to find text that follows the SSN format, and a <\/span><i><span style=\"font-weight: 400;\">NameMatcher<\/span><\/i><span style=\"font-weight: 400;\"> that uses a Named Entity Recognition (NER) model to identify names.<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-15285 aligncenter\" 
src=\"\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-file-search-context.png\" alt=\"\" width=\"257\" height=\"388\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-file-search-context.png 257w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-file-search-context-199x300.png 199w\" sizes=\"(max-width: 257px) 100vw, 257px\" \/><i><span style=\"font-weight: 400;\">File Search Context<\/span><\/i><\/p>\n<p><span style=\"font-weight: 400;\">For specific file formats, the DarkShield-Files API provides users with additional filtering and matching options. In this example, path matchers are provided for json and xml files.<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-15286 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-mask-context.png\" alt=\"\" width=\"416\" height=\"680\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-mask-context.png 416w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-mask-context-184x300.png 184w\" sizes=\"(max-width: 416px) 100vw, 416px\" \/><i><span style=\"font-weight: 400;\">Mask Context<\/span><\/i><i><span style=\"font-weight: 400;\"><br \/>\n<\/span><\/i><i><span style=\"font-weight: 400;\">Note: In older versions of the DarkShield-Files API, the configuration for rules and rulesMatcher requires the \u201ctype: cosort\u201d and \u201ctype:name\u201d in their respective configurations.<\/span><\/i><\/p>\n<p><span style=\"font-weight: 400;\">For the API to know what to do with PII that has been discovered during search operations, a mask context must be defined. The first part of a mask context contains a list of rules that we want to apply. 
Each rule has an expression that specifies which masking function will be used.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">These expressions are also documented in the IRI FieldShield manual and IRI Workbench, and because the functions are compatible, enterprise data integrity can be preserved post-masking regardless of source. The list of <\/span><a href=\"https:\/\/www.iri.com\/solutions\/data-masking\/static-data-masking\"><span style=\"font-weight: 400;\">possible masking rules<\/span><\/a><span style=\"font-weight: 400;\"> includes:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Assignment Expressions<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Blur Functions<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Deletion Functions<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Encoding Functions<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Encryption Functions (AES, 3DES, FPE, GPG)\u00a0<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Hashing Functions<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Pseudonym Replacement<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Redaction Functions<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">String Manipulation Functions<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">The mask context shown above defines three rules called <\/span><i><span style=\"font-weight: 400;\">HashRule<\/span><\/i><span style=\"font-weight: 400;\">, <\/span><i><span style=\"font-weight: 400;\">RedactSsnRule<\/span><\/i><span style=\"font-weight: 400;\">, and <\/span><i><span 
style=\"font-weight: 400;\">FpeRule<\/span><\/i><span style=\"font-weight: 400;\">. Respectively, the rules were assigned a hashing function, a redaction function that replaces characters with \u2018*\u2019, and format-preserving encryption. The DarkShield API uses the same <\/span><a href=\"https:\/\/www.iri.com\/solutions\/data-masking\/static-data-masking\"><span style=\"font-weight: 400;\">masking functions<\/span><\/a><span style=\"font-weight: 400;\"> as <\/span><a href=\"https:\/\/www.iri.com\/products\/fieldshield\"><span style=\"font-weight: 400;\">IRI FieldShield<\/span><\/a><span style=\"font-weight: 400;\"> (which masks structured data in SortCL-compatible job scripts).<\/span><\/p>\n<p><span style=\"font-weight: 400;\">After the masking rules come the rule matchers, which pair each search matcher with a masking rule.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Last is the file mask context. For specific file formats, the DarkShield-Files API provides users with additional configuration options. In this example, the configuration for JSON files enables pretty printing.<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-15287 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-file-mask-context.png\" alt=\"\" width=\"247\" height=\"286\" \/><i><span style=\"font-weight: 400;\">File Mask Context<\/span><\/i><\/p>\n<h6><b>Authentication Credentials of NoSQL Demos<\/b><\/h6>\n<p><span style=\"font-weight: 400;\">Accessing BigTable, CosmosDB, or DynamoDB programmatically requires the user\u2019s login credentials in some form for authentication. 
There are various ways to store and access these credentials securely, but for the sake of simplicity the three NoSQL demos either use credential files or environment variables.<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-15288 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-cosmosdb-dynamodb.png\" alt=\"\" width=\"426\" height=\"143\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-cosmosdb-dynamodb.png 426w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-cosmosdb-dynamodb-300x101.png 300w\" sizes=\"(max-width: 426px) 100vw, 426px\" \/><i><span style=\"font-weight: 400;\">CosmosDB credentials.json | DynamoDB .aws\/credentials file<\/span><\/i><\/p>\n<p><span style=\"font-weight: 400;\">Google BigTable allows you to generate a private key for your credential and download the newly generated key in a file.<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-15289 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-bigtable-demo-1024x555.png\" alt=\"\" width=\"649\" height=\"352\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-bigtable-demo-1024x555.png 1024w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-bigtable-demo-300x163.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-bigtable-demo-768x416.png 768w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-bigtable-demo.png 1600w\" sizes=\"(max-width: 649px) 100vw, 649px\" \/><i><span style=\"font-weight: 400;\">Google BigTable demo uses an environment variable GOOGLE_APPLICATION_CREDENTIALS to designate a path to the private key contained in the file 
downloaded from the Google Cloud Platform console.<\/span><\/i><\/p>\n<h5><b>Taking a Closer Look at the DarkShield API Interface to BigTable<\/b><\/h5>\n<h6><b>The Main Program<\/b><\/h6>\n<p><span style=\"font-weight: 400;\">To show how the main program can be implemented, the screenshots below reproduce the Google BigTable <\/span><i><span style=\"font-weight: 400;\">main.py.<\/span><\/i><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-15291\" src=\"\/blog\/wp-content\/uploads\/2021\/11\/bigtable-main-py-1.png\" alt=\"\" width=\"587\" height=\"576\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/bigtable-main-py-1.png 587w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/bigtable-main-py-1-300x294.png 300w\" sizes=\"(max-width: 587px) 100vw, 587px\" \/><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-15292\" src=\"\/blog\/wp-content\/uploads\/2021\/11\/bigtable-main-py-2.png\" alt=\"\" width=\"795\" height=\"874\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/bigtable-main-py-2.png 795w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/bigtable-main-py-2-273x300.png 273w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/bigtable-main-py-2-768x844.png 768w\" sizes=\"(max-width: 795px) 100vw, 795px\" \/><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-15290\" src=\"\/blog\/wp-content\/uploads\/2021\/11\/bigtable-main-py-3.png\" alt=\"\" width=\"566\" height=\"336\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/bigtable-main-py-3.png 566w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/bigtable-main-py-3-300x178.png 300w\" sizes=\"(max-width: 566px) 100vw, 566px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">All of the previously linked demos use a main program that facilitates the DarkShield-Files API call. 
The main program contains the glue code that performs the following actions:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Authenticates to the data source (NoSQL DB)<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Accesses and queries the database<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Makes POST requests to the DarkShield-Files API with the content of the DB<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Writes the resulting output from the DarkShield-Files API back to the database<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">In the BigTable demo, the resulting output is written back into the database. Alternatively, the code could be altered to write the masked results to files or to a separate test database. The DarkShield-Files API is a flexible tool that is only limited by the glue code that manipulates it.<\/span><\/p>\n<h6><b>Executing the Program<\/b><\/h6>\n<p><span style=\"font-weight: 400;\">To execute, run python main.py \u201cproject_id\u201d \u201cinstance_id\u201d from your terminal. 
For those wondering, project_id is your Cloud Platform project ID and instance_id is the ID of the Cloud Bigtable instance you wish to connect to.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Below is an example of what the execution may look like:<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-15293 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/11\/bigtable-main-py-execution.png\" alt=\"\" width=\"979\" height=\"78\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/bigtable-main-py-execution.png 979w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/bigtable-main-py-execution-300x24.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/bigtable-main-py-execution-768x61.png 768w\" sizes=\"(max-width: 979px) 100vw, 979px\" \/><\/p>\n<h5><b>Results of Searching and Masking of PII via the DarkShield API<\/b><\/h5>\n<h6><b>Google BigTable<\/b><\/h6>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-15303 alignleft\" src=\"\/blog\/wp-content\/uploads\/2021\/11\/google-bigtable-logo-1024x912.png\" alt=\"\" width=\"150\" height=\"134\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/google-bigtable-logo-1024x912.png 1024w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/google-bigtable-logo-300x267.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/google-bigtable-logo-768x684.png 768w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/google-bigtable-logo.png 1025w\" sizes=\"(max-width: 150px) 100vw, 150px\" \/><\/p>\n<p>Below is a demonstration of the results of search and masking operations performed on Google Cloud BigTable using the BigTable <a href=\"https:\/\/github.com\/TeamIRI\/darkshield-api-demos\/tree\/master\/bigtable\">demo<\/a> on GitHub:<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-15295 
aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-bigtable-demo-project.png\" alt=\"\" width=\"651\" height=\"198\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-bigtable-demo-project.png 733w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-bigtable-demo-project-300x91.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-bigtable-demo-project-730x223.png 730w\" sizes=\"(max-width: 651px) 100vw, 651px\" \/><i><span style=\"font-weight: 400;\">BigTable Demo Project<\/span><\/i><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-15294 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-original-and-masked-bigtable-1024x409.jpg\" alt=\"\" width=\"651\" height=\"260\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-original-and-masked-bigtable-1024x409.jpg 1024w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-original-and-masked-bigtable-300x120.jpg 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-original-and-masked-bigtable-768x306.jpg 768w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-original-and-masked-bigtable.jpg 1110w\" sizes=\"(max-width: 651px) 100vw, 651px\" \/><i><span style=\"font-weight: 400;\">Original data and masked results after execution of the IRI DarkShield BigTable demo<\/span><\/i><\/p>\n<h6><b>Azure CosmosDB<\/b><\/h6>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-15305 alignleft\" src=\"\/blog\/wp-content\/uploads\/2021\/11\/azure-cosmosdb-logo.png\" alt=\"\" width=\"150\" height=\"114\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/azure-cosmosdb-logo.png 402w, 
https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/azure-cosmosdb-logo-300x228.png 300w\" sizes=\"(max-width: 150px) 100vw, 150px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Below is a demonstration of the results of search and masking operations performed on CosmosDB:<\/span><\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-15296 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-cosmosdb-data-source-explorer.png\" alt=\"\" width=\"510\" height=\"348\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-cosmosdb-data-source-explorer.png 591w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-cosmosdb-data-source-explorer-300x205.png 300w\" sizes=\"(max-width: 510px) 100vw, 510px\" \/><i><span style=\"font-weight: 400;\">CosmosDB data source explorer<\/span><\/i><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-15298 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-cosmositem1.png\" alt=\"\" width=\"701\" height=\"222\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-cosmositem1.png 837w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-cosmositem1-300x95.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-cosmositem1-768x243.png 768w\" sizes=\"(max-width: 701px) 100vw, 701px\" \/><i><span style=\"font-weight: 400;\">Vulnerable PII in a CosmosDB collection.<\/span><\/i><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-15297 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-cosmosmask1.png\" alt=\"\" width=\"699\" height=\"206\" 
srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-cosmosmask1.png 841w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-cosmosmask1-300x88.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-cosmosmask1-768x226.png 768w\" sizes=\"(max-width: 699px) 100vw, 699px\" \/><i><span style=\"font-weight: 400;\">CosmosDB collection item after masking<\/span><\/i><\/p>\n<h6><b>Amazon DynamoDB<\/b><\/h6>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-15304 alignleft\" src=\"\/blog\/wp-content\/uploads\/2021\/11\/amazon-dynamodb-logo.png\" alt=\"\" width=\"233\" height=\"89\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/amazon-dynamodb-logo.png 655w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/amazon-dynamodb-logo-300x115.png 300w\" sizes=\"(max-width: 233px) 100vw, 233px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Below is a demonstration of the results of search and masking operations performed on DynamoDB:<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-15299 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-aws-sql-workbench-1024x590.png\" alt=\"\" width=\"649\" height=\"374\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-aws-sql-workbench-1024x590.png 1024w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-aws-sql-workbench-300x173.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-aws-sql-workbench-768x443.png 768w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-aws-sql-workbench.png 1192w\" sizes=\"(max-width: 649px) 100vw, 649px\" \/><\/p>\n<p><i><span style=\"font-weight: 400;\">AWS NoSQL Workbench provides UI to 
DynamoDB<\/span><\/i><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-15300 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-dynaorig.png\" alt=\"\" width=\"602\" height=\"656\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-dynaorig.png 696w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-dynaorig-275x300.png 275w\" sizes=\"(max-width: 602px) 100vw, 602px\" \/><i><span style=\"font-weight: 400;\">Unmasked PII in DynamoDB Collections<\/span><\/i><\/p>\n<p><span style=\"font-weight: 400;\">Masked results exported to CSV format, part 1:<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-15302 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-dynaresult.png\" alt=\"\" width=\"631\" height=\"83\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-dynaresult.png 631w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-dynaresult-300x39.png 300w\" sizes=\"(max-width: 631px) 100vw, 631px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Masked results exported to CSV format, part 2:<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-15301 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-dynaresult2.png\" alt=\"\" width=\"920\" height=\"82\" srcset=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-dynaresult2.png 920w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-dynaresult2-300x27.png 300w, https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-dynaresult2-768x68.png 768w\" sizes=\"(max-width: 920px) 100vw, 920px\" \/><\/p>\n<p><b>Conclusion<\/b><\/p>\n<p><span 
style=\"font-weight: 400;\">Finding and masking PII through the DarkShield-Files API is an \u201copen\u201d solution not constrained by the data source or silo. As with RDBs, files, documents and images, DarkShield\u2019s API delivers flexible, codable solutions to detect and protect sensitive structured, semi-structured and unstructured data in almost any NoSQL database, whether it runs on-premises or in the cloud.\u00a0<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Abstract: This article covers the use of the IRI DarkShield API for automatically locating and de-identifying PII or other sensitive data in the three major cloud provider NoSQL databases &#8212; Google BigTable, MS CosmosDB in Azure, and Amazon DynamoDB. Prior articles in this blog cover how DarkShield wizards in IRI Workbench find and mask data<\/p>\n<div><a class=\"btn-filled btn\" href=\"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/\" title=\"Find &#038; Mask PII in BigTable, Cosmos and Dynamo NoSQL DBs\">Read More<\/a><\/div>\n","protected":false},"author":152,"featured_media":15277,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"footnotes":""},"categories":[8],"tags":[1570,1568,1566,1567,1556,1494,1496,1569,1565,1388,369,536,149,1306,1490],"class_list":["post-15269","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-protection","tag-amazon-dynamodb","tag-azure-cosmosdb","tag-bigtable","tag-cosmos","tag-cosmosdb","tag-darkshield-api","tag-darkshield-rpc-api","tag-dynamodb","tag-google-bigtable","tag-iri-darkshield","tag-nosql","tag-nosql-database","tag-pii","tag-pii-masking","tag-search-matcher"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin 
v23.3 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Find &amp; Mask PII in BigTable, Cosmos and Dynamo NoSQL DBs - IRI<\/title>\n<meta name=\"description\" content=\"This article covers the use of the IRI DarkShield API for automatically locating and de-identifying PII or other sensitive data in the three major cloud\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Find &amp; Mask PII in BigTable, Cosmos and Dynamo NoSQL DBs - IRI\" \/>\n<meta property=\"og:description\" content=\"This article covers the use of the IRI DarkShield API for automatically locating and de-identifying PII or other sensitive data in the three major cloud\" \/>\n<meta property=\"og:url\" content=\"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/\" \/>\n<meta property=\"og:site_name\" content=\"IRI\" \/>\n<meta property=\"article:published_time\" content=\"2021-11-30T21:14:20+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-thumbnail.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1110\" \/>\n\t<meta property=\"og:image:height\" content=\"624\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Adam Lewis\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Adam Lewis\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"10 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/\"},\"author\":{\"name\":\"Adam Lewis\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/37c0e5beab094bd61cc521902df2876e\"},\"headline\":\"Find &#038; Mask PII in BigTable, Cosmos and Dynamo NoSQL DBs\",\"datePublished\":\"2021-11-30T21:14:20+00:00\",\"dateModified\":\"2021-11-30T21:14:20+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/\"},\"wordCount\":1962,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-thumbnail.jpg\",\"keywords\":[\"Amazon DynamoDB\",\"Azure CosmosDB\",\"BigTable\",\"Cosmos\",\"CosmosDB\",\"Darkshield API\",\"DarkShield RPC API\",\"DynamoDB\",\"Google BigTable\",\"IRI DarkShield\",\"NoSQL\",\"NoSQL database\",\"PII\",\"pii masking\",\"search matcher\"],\"articleSection\":[\"Data 
Masking\/Protection\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/\",\"url\":\"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/\",\"name\":\"Find & Mask PII in BigTable, Cosmos and Dynamo NoSQL DBs - IRI\",\"isPartOf\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-thumbnail.jpg\",\"datePublished\":\"2021-11-30T21:14:20+00:00\",\"dateModified\":\"2021-11-30T21:14:20+00:00\",\"description\":\"This article covers the use of the IRI DarkShield API for automatically locating and de-identifying PII or other sensitive data in the three major 
cloud\",\"breadcrumb\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/#primaryimage\",\"url\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-thumbnail.jpg\",\"contentUrl\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-thumbnail.jpg\",\"width\":1110,\"height\":624},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/beta.iri.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Find &#038; Mask PII in BigTable, Cosmos and Dynamo NoSQL DBs\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#website\",\"url\":\"https:\/\/beta.iri.com\/blog\/\",\"name\":\"IRI\",\"description\":\"Total Data Management Blog\",\"publisher\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/beta.iri.com\/blog\/?s={search_term_string}\"},\"query-input\":\"required 
name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#organization\",\"name\":\"IRI\",\"url\":\"https:\/\/beta.iri.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png\",\"contentUrl\":\"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png\",\"width\":750,\"height\":206,\"caption\":\"IRI\"},\"image\":{\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/37c0e5beab094bd61cc521902df2876e\",\"name\":\"Adam Lewis\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/087667d0c75d33bb6fab6e734bd89333?s=96&d=blank&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/087667d0c75d33bb6fab6e734bd89333?s=96&d=blank&r=g\",\"caption\":\"Adam Lewis\"},\"url\":\"https:\/\/beta.iri.com\/blog\/author\/adaml\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Find & Mask PII in BigTable, Cosmos and Dynamo NoSQL DBs - IRI","description":"This article covers the use of the IRI DarkShield API for automatically locating and de-identifying PII or other sensitive data in the three major cloud","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/","og_locale":"en_US","og_type":"article","og_title":"Find & Mask PII in BigTable, Cosmos and Dynamo NoSQL DBs - IRI","og_description":"This article covers the use of the IRI DarkShield API for automatically locating and de-identifying PII or other sensitive data in the three major cloud","og_url":"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/","og_site_name":"IRI","article_published_time":"2021-11-30T21:14:20+00:00","og_image":[{"width":1110,"height":624,"url":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-thumbnail.jpg","type":"image\/jpeg"}],"author":"Adam Lewis","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Adam Lewis","Est. 
reading time":"10 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/#article","isPartOf":{"@id":"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/"},"author":{"name":"Adam Lewis","@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/37c0e5beab094bd61cc521902df2876e"},"headline":"Find &#038; Mask PII in BigTable, Cosmos and Dynamo NoSQL DBs","datePublished":"2021-11-30T21:14:20+00:00","dateModified":"2021-11-30T21:14:20+00:00","mainEntityOfPage":{"@id":"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/"},"wordCount":1962,"commentCount":0,"publisher":{"@id":"https:\/\/beta.iri.com\/blog\/#organization"},"image":{"@id":"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/#primaryimage"},"thumbnailUrl":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-thumbnail.jpg","keywords":["Amazon DynamoDB","Azure CosmosDB","BigTable","Cosmos","CosmosDB","Darkshield API","DarkShield RPC API","DynamoDB","Google BigTable","IRI DarkShield","NoSQL","NoSQL database","PII","pii masking","search matcher"],"articleSection":["Data Masking\/Protection"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/","url":"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/","name":"Find & Mask PII in BigTable, Cosmos and Dynamo NoSQL DBs - 
IRI","isPartOf":{"@id":"https:\/\/beta.iri.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/#primaryimage"},"image":{"@id":"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/#primaryimage"},"thumbnailUrl":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-thumbnail.jpg","datePublished":"2021-11-30T21:14:20+00:00","dateModified":"2021-11-30T21:14:20+00:00","description":"This article covers the use of the IRI DarkShield API for automatically locating and de-identifying PII or other sensitive data in the three major cloud","breadcrumb":{"@id":"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/#primaryimage","url":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-thumbnail.jpg","contentUrl":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-thumbnail.jpg","width":1110,"height":624},{"@type":"BreadcrumbList","@id":"https:\/\/beta.iri.com\/blog\/data-protection\/find-mask-pii-in-bigtable-cosmos-and-dynamo\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/beta.iri.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Find &#038; Mask PII in BigTable, Cosmos and Dynamo NoSQL DBs"}]},{"@type":"WebSite","@id":"https:\/\/beta.iri.com\/blog\/#website","url":"https:\/\/beta.iri.com\/blog\/","name":"IRI","description":"Total Data Management 
Blog","publisher":{"@id":"https:\/\/beta.iri.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/beta.iri.com\/blog\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/beta.iri.com\/blog\/#organization","name":"IRI","url":"https:\/\/beta.iri.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png","contentUrl":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png","width":750,"height":206,"caption":"IRI"},"image":{"@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/37c0e5beab094bd61cc521902df2876e","name":"Adam Lewis","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/beta.iri.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/087667d0c75d33bb6fab6e734bd89333?s=96&d=blank&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/087667d0c75d33bb6fab6e734bd89333?s=96&d=blank&r=g","caption":"Adam 
Lewis"},"url":"https:\/\/beta.iri.com\/blog\/author\/adaml\/"}]}},"jetpack_featured_media_url":"https:\/\/beta.iri.com\/blog\/wp-content\/uploads\/2021\/11\/masking-pii-in-cloud-nosql-thumbnail.jpg","_links":{"self":[{"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/posts\/15269"}],"collection":[{"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/users\/152"}],"replies":[{"embeddable":true,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/comments?post=15269"}],"version-history":[{"count":9,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/posts\/15269\/revisions"}],"predecessor-version":[{"id":15271,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/posts\/15269\/revisions\/15271"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/media\/15277"}],"wp:attachment":[{"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/media?parent=15269"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/categories?post=15269"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/beta.iri.com\/blog\/wp-json\/wp\/v2\/tags?post=15269"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}