How to Mask Data in Web Logs

  • by Chaitali Mitra

This article is the third in a three-part series on CLF and ELF web log data. We first introduced the CLF and ELF web log formats, then introduced IRI solutions for processing web log data, and here we conclude by masking private data in web log files. Note that this article specifically addresses masking structured web log files with IRI FieldShield; other web, application, and device log formats that are semi-structured or unstructured can be scanned for PII and masked using IRI DarkShield. The IRI Voracity data management platform includes both tools, plus the ability to cleanse, manipulate, remap, and report from various log and message formats.

Web log files are created by, and stored on, web servers to record each visitor's clickstream trail. Some of the information in these logs is sensitive or personally identifiable.

As we know from articles in the data masking sections of IRI’s blog and website, there are multiple ways to shield personally identifiable information (PII) or otherwise sensitive data in structured sources. String masking, for example, covers over (or redacts) original values using other characters. Encryption, on the other hand, produces ciphertext that de-identifies the original value, but allows its restoration (decryption).

IRI FieldShield software protects PII in databases and many other data sources — including web logs — with multiple field-level security functions. FieldShield can mask, encrypt, delete, or randomize IP addresses, along with other items subject to data protection and privacy laws. FieldShield also supports pseudonymization, hashing, redaction, and sub-string manipulation of data in structured file formats.
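To make three of these function categories concrete, here is a minimal Python sketch of redaction, keyed hashing, and pseudonymization applied to an IP address. It is only an illustration of the general techniques, not FieldShield's implementation; the function names, the key, and the "visitor-NNNN" stand-in format are all hypothetical.

```python
import hashlib
import hmac

def redact(value: str, mask_char: str = "*") -> str:
    """Replace every ASCII digit with mask_char, keeping the dots so the shape survives."""
    return "".join(mask_char if ch in "0123456789" else ch for ch in value)

def keyed_hash(value: str, key: bytes) -> str:
    """One-way HMAC-SHA256 digest; the same value and key always give the same token."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()

def pseudonymize(value: str, table: dict) -> str:
    """Swap the value for a consistent stand-in, assigning a new one on first sight."""
    if value not in table:
        table[value] = f"visitor-{len(table) + 1:04d}"
    return table[value]

ip = "96.47.227.21"
key = b"demo-key-not-for-production"  # hypothetical key, for illustration only
lookup = {}                           # hypothetical pseudonym lookup table

print(redact(ip))                 # **.**.***.**
print(keyed_hash(ip, key)[:16])   # first 16 hex characters of the digest
print(pseudonymize(ip, lookup))   # visitor-0001
```

Redaction and hashing cannot be reversed, while a pseudonym can be traced back only by whoever holds the lookup table, which is one reason to choose among them per field.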

Consider the sample Extended Log Format (ELF) file below. Each record contains the visit date, time, client (visitor) IP address, server IP address, request method, port, status code, number of bytes transferred, and the URL of the page opened:

2014-05-24,12:55:15,32.09.130.15,96.48.225.22,GET,80,200,10801,"http://www.iri.com/products/fieldshield/why-is-fieldshield-better"
2014-05-24,20:55:15,96.47.227.21,96.46.220.42,GET,80,200,10801,"http://www.iri.com/solutions/data-masking/encryption/format-preserving-encryption"
2014-05-24,22:18:01,12.41.114.23,96.45.225.98,GET,80,200,10801,"http://www.iri.com/solutions/data-masking/de-identification/overview"
2014-05-24,13:15:06,96.46.230.79,96.47.126.99,GET,80,200,10801,"http://www.iri.com/products/workbench/fieldshield-gui/apply-rules"
2014-05-24,23:15:06,96.45.226.19,95.47.214.50,GET,80,200,10801,"http://www.iri.com/blog/data-protection/data-risk-fieldshield-mitigation/"
2014-05-25,23:15:22,11.11.111.11,95.47.214.50,GET,80,200,10801,"http://www.iri.com/blog/test-data/rowgen-v3-automates-database-test-data-generation/"
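For readers who want to inspect such records outside of IRI tools, the following Python sketch parses two of the sample lines with the standard csv module. The field names are borrowed from the FieldShield job script later in this article; the `parse_log` helper is a hypothetical name used only for this example.

```python
import csv
import io

# Field names taken from the job script in this article.
FIELDS = ["DATE", "TIME", "C_IP", "S_IP", "CSMETHOD",
          "S_PORT", "STATUS", "BYTES", "CS_URI_STEM"]

# Two records copied from the sample file above.
SAMPLE = (
    '2014-05-24,12:55:15,32.09.130.15,96.48.225.22,GET,80,200,10801,'
    '"http://www.iri.com/products/fieldshield/why-is-fieldshield-better"\n'
    '2014-05-25,23:15:22,11.11.111.11,95.47.214.50,GET,80,200,10801,'
    '"http://www.iri.com/blog/test-data/rowgen-v3-automates-database-test-data-generation/"\n'
)

def parse_log(text: str) -> list:
    """Split comma-separated records into dicts; csv strips the double-quote frame."""
    reader = csv.reader(io.StringIO(text))
    return [dict(zip(FIELDS, row)) for row in reader if row]

records = parse_log(SAMPLE)
for rec in records:
    print(rec["C_IP"], rec["CS_URI_STEM"])
```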

Use the Encryption and Decryption dialog in the IRI Workbench GUI for FieldShield to apply field-level encryption. Below is an example of encrypting each visitor’s IP address with a format-preserving AES-256 function:

[Screenshot: the Encryption and Decryption dialog in IRI Workbench]

Similar dialogs exist for string masking, pseudonymization, randomization, hashing, de-ID, encoding, etc.
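To show what "format-preserving" means for an IP address field, here is a toy Python sketch that replaces each digit with another digit under a secret key and leaves the dots in place, so the result can be reversed with the same key. This is a simple keyed digit shift, not AES and not the algorithm FieldShield uses, and it is not secure enough for real data; the function names and key are hypothetical. Because digits are transformed one at a time, encrypted octets can exceed 255, as some do in the masked output shown later in this article (for example 05.07.569.95).

```python
import hashlib
import hmac

DIGITS = "0123456789"

def _keystream(key: bytes, n: int) -> list:
    """Derive n pseudo-random digits (0-9) from HMAC-SHA256 run in counter mode."""
    digits = []
    counter = 0
    while len(digits) < n:
        block = hmac.new(key, counter.to_bytes(4, "big"), hashlib.sha256).digest()
        # Skip bytes >= 250 so that b % 10 is uniformly distributed.
        digits.extend(b % 10 for b in block if b < 250)
        counter += 1
    return digits[:n]

def _shift_digits(value: str, key: bytes, sign: int) -> str:
    """Add (sign=+1) or subtract (sign=-1) a keystream digit from each digit, mod 10."""
    count = sum(ch in DIGITS for ch in value)
    stream = iter(_keystream(key, count))
    out = []
    for ch in value:
        if ch in DIGITS:
            out.append(str((int(ch) + sign * next(stream)) % 10))
        else:
            out.append(ch)  # separators such as "." pass through unchanged
    return "".join(out)

def encrypt_digits(value: str, key: bytes) -> str:
    return _shift_digits(value, key, +1)

def decrypt_digits(value: str, key: bytes) -> str:
    return _shift_digits(value, key, -1)

key = b"demo-key-not-for-production"  # hypothetical key
cipher = encrypt_digits("96.47.227.21", key)
print(cipher)                       # same length and dot positions as the input
print(decrypt_digits(cipher, key))  # 96.47.227.21
```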

The portable FieldShield job script, created automatically in the GUI (or by hand, if you prefer), reflects both field encryption and redaction. It also omits any record whose client IP address is 11.11.111.11:

/INFILE=rawlog.elf
   /PROCESS=ELF
      /FIELD=(DATE, POSITION=1, TYPE=ASCII, SEPARATOR=" ")
      /FIELD=(TIME, POSITION=2, TYPE=ASCII, SEPARATOR=" ")
      /FIELD=(C_IP, POSITION=3, SEPARATOR=" ", TYPE=IP_ADDRESS)
      /FIELD=(S_IP, POSITION=4, SEPARATOR=" ", TYPE=IP_ADDRESS)
      /FIELD=(CSMETHOD, POSITION=5, TYPE=ASCII, SEPARATOR=" ")
      /FIELD=(S_PORT, POSITION=6, SEPARATOR=" ")
      /FIELD=(STATUS, POSITION=7, SEPARATOR=" ")
      /FIELD=(BYTES, POSITION=8, SEPARATOR=" ")
      /FIELD=(CS_URI_STEM, POSITION=9, SEPARATOR=" ", TYPE=ASCII, FRAME='"')
   /OMIT WHERE C_IP EQ "11.11.111.11"

/OUTFILE=maskedlog.elf
   /PROCESS=ELF
   /HEADREC="DATE        TIME    MASKED IP    CS_URI_STEM\n\n"
      /FIELD=(DATE, POSITION=1, SIZE=12, TYPE=ASCII)
      /FIELD=(TIME, POSITION=15, SIZE=10, TYPE=ASCII)
      /FIELD=(ENC_AES256_C_IP=enc_fp_aes256_alphanum(C_IP), POSITION=30, SIZE=12, TYPE=IP_ADDRESS)
      /FIELD=(replace_chars(CS_URI_STEM, "*", 7, 8, "#", 30, 8), POSITION=45, SIZE=55, TYPE=ASCII)

After running the script, we get the desired ELF-style output, but in fixed positions and masked to help comply with privacy regulations:

DATE        TIME    MASKED IP    CS_URI_STEM
2014-05-24 12:55:15 32.09.130.15 http:/********.com/products/f########ld/why-is-fieldshi
2014-05-24 13:15:06 05.07.569.95 http:/********.com/products/w########/fieldshield-gui/a
2014-05-24 20:55:15 98.68.117.52 http:/********.com/solutions/########king/encryption/fo
2014-05-24 22:18:01 69.67.212.32 http:/********.com/solutions/########king/de-identifica
2014-05-24 23:15:06 42.01.555.73 http:/********.com/blog/data-########on/data-risk-field
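The redaction offsets in the job script can be checked against the output above with a small Python emulation. It assumes, based only on the sample output, that the arguments to replace_chars are (mask character, 1-based start position, length) triples and that the result is truncated to the 55-character output field; the Python function below is a hypothetical stand-in, not IRI code.

```python
def replace_chars(value: str, *rules) -> str:
    """Overwrite character ranges given as (mask_char, start, length) triples.

    start is 1-based, matching how the offsets line up with the sample output.
    """
    if len(rules) % 3 != 0:
        raise ValueError("rules must be (mask_char, start, length) triples")
    chars = list(value)
    for i in range(0, len(rules), 3):
        mask_char, start, length = rules[i:i + 3]
        begin = start - 1
        end = min(begin + length, len(chars))  # never write past the end of the value
        for pos in range(begin, end):
            chars[pos] = mask_char
    return "".join(chars)

url = "http://www.iri.com/products/fieldshield/why-is-fieldshield-better"
masked = replace_chars(url, "*", 7, 8, "#", 30, 8)

# Truncate to the 55-character output field defined by SIZE=55.
print(masked[:55])  # http:/********.com/products/f########ld/why-is-fieldshi
```

Under these assumptions, the first rule hides the host name (characters 7 through 14) and the second hides eight characters of the path starting at position 30, which is why the "#" run lands in a different word on each line.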

See additional formatting, filtering, transformation, and calculation functions in the previous article in this series, CLF and ELF Web Log Data Processing. Contact fieldshield@iri.com for assistance.

© 2025 Innovative Routines International (IRI), Inc., All Rights Reserved | Contact