A criminal bouncing off a computer

Defining Startpoint Data Security

  • by David Friedland

IRI has discussed startpoint security in further detail with the Outlook Series in a segment about data masking.

This article defines what we’d like to call “startpoint security,” mostly by comparison to endpoint security. Searches for the term didn’t yield anything on point, so at least for me, this is a case of first impression and an attempt to coin a term. Whether you find it bold, bogus, or something in between, please share your feedback via the comment button below.

We hear about endpoint security all the time. It refers to protection technologies for sensitive information at points along a network. Endpoint security covers mobile devices, laptops, and desktop PCs, as well as the servers and networking devices they connect through. It may also refer to storage devices like thumb and hard drives, and to more granular points within them, such as folders, files, and entire databases that can be encrypted.

But what about securing data at its starting points, i.e., as it gets created by users and through applications that feed columns in databases (or values in files)? This is the only place where truly sensitive data is created before it is stored, queried, processed, and moved along to endpoints.


The data masking industry is built around the concept of atomically protecting personally identifiable information (PII) directly in the data source. Securing PII directly at these startpoints, instead of (or at least in addition to) securing its endpoints with different techniques, has several benefits, including:

  1. Efficiency – it’s much quicker (and less resource intensive) to encrypt or apply other de-identification functions to discrete values than to everything else around them
  2. Usability – by masking only what’s sensitive, the data around it is still accessible
  3. Breach nullification – any misappropriated data is already de-identified
  4. Accountability – data lineage and audit logs pointing to specific element protections are a better way to verify compliance with privacy laws applicable to specific PII (identifiers).
  5. Security – multiple data masking techniques are harder to reverse than a single endpoint protection technique. For example, suppose the encryption algorithm used to secure a network or hard drive is also used to mask just one field, while other fields are protected with other functions. If that algorithm is compromised, think about the difference in exposure: one field versus everything on the network or drive.
  6. Testing – masked production data can also be used for prototyping and benchmarking
  7. Independence – data secured at its atomic source can move safely between databases, applications, and platforms (be they on-premises or in the cloud).
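
As a rough illustration of items 1, 2, and 5 above, the sketch below masks only the sensitive fields of a record and applies a different function to each, leaving the surrounding data readable. The field names, the `mask_record` helper, the demo key, and the choice of functions (a keyed hash and a partial redaction) are all assumptions made for this example; they do not describe how any particular masking product works.

```python
import hashlib
import hmac

# Hypothetical key for the demo; a real one would come from a key manager.
SECRET_KEY = b"demo-key-not-for-production"


def pseudonymize(value: str) -> str:
    """Replace a value with a truncated keyed SHA-256 digest (consistent, one-way)."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def redact_all_but_last4(value: str) -> str:
    """Keep the last four characters and replace the rest with asterisks."""
    return "*" * max(len(value) - 4, 0) + value[-4:]


# Hypothetical mapping of sensitive field names to masking functions.
FIELD_RULES = {
    "ssn": redact_all_but_last4,
    "email": pseudonymize,
}


def mask_record(record: dict) -> dict:
    """Return a copy of the record with only the sensitive fields masked."""
    return {
        name: FIELD_RULES[name](value) if name in FIELD_RULES else value
        for name, value in record.items()
    }


record = {
    "customer_id": 1001,
    "ssn": "123-45-6789",
    "email": "pat@example.com",
    "state": "FL",
}
masked = mask_record(record)
print(masked)
```

Because each sensitive field has its own rule, compromising one function would expose only the field it protects, and the non-sensitive fields (`customer_id` and `state` here) stay usable for reporting or testing.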

For those in the data governance industry, it may seem as if I’m just creating a new buzzword for data masking called startpoint security. I am not, however, equating them. Under my definition, startpoint security would also take into consideration the following:

  1. Data Discovery – the ability to find PII via pattern, fuzzy-logic, and other searches
  2. Data Classification – grouping discovered data into logical categories for global masking
  3. Data Lineage – tracking PII value and/or location changes through time for surety, etc.
  4. Data Latency – whether masking functions get applied to data at rest or in transit
  5. Metadata Lineage – recording and analyzing the changes to layouts and job definitions
  6. Authorization – managing who can mask, and/or access (restore), the data
  7. Risk Scoring – determining the statistical likelihood of re-identification (think HIPAA)
  8. Audit Logs – being able to query who masked what, and who saw what, when, and where.
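
To make the first two items more concrete, here is a minimal sketch of pattern-based discovery feeding classification: each column is assigned a data class when most of its values match that class's pattern. The class names, the two regular expressions, the `classify_columns` helper, and the 80% threshold are assumptions for illustration only; real discovery would also need the fuzzy and other search methods mentioned above.

```python
import re

# Hypothetical data classes and patterns; real rule sets would be much broader.
DATA_CLASSES = {
    "US_SSN": re.compile(r"\d{3}-\d{2}-\d{4}"),
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
}


def classify_columns(rows, threshold=0.8):
    """Map each column to a data class when enough of its values match."""
    if not rows:
        return {}
    result = {}
    for column in rows[0]:
        values = [str(row[column]) for row in rows]
        for class_name, pattern in DATA_CLASSES.items():
            hits = sum(1 for value in values if pattern.fullmatch(value))
            if hits / len(values) >= threshold:
                result[column] = class_name
                break  # first matching class wins in this sketch
    return result


rows = [
    {"id": 1, "contact": "pat@example.com", "tax_id": "123-45-6789"},
    {"id": 2, "contact": "lee@example.org", "tax_id": "987-65-4321"},
]
print(classify_columns(rows))
```

Once columns carry a class label like this, a single masking rule per class can be applied everywhere that class appears, which is the idea behind "global masking" in item 2.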

order schema er diagram

Many of these additional considerations are not exclusive to startpoint security, but I think we can agree that classification, lineage, and latency are more relevant in the data-centric realm than they are to endpoint security.

What’s your opinion? Please leave me a comment below and we’ll start the discussion there.

© 2025 Innovative Routines International (IRI), Inc., All Rights Reserved | Contact