CLF and ELF Web Log Formats

  • by Chaitali Mitra

This article is the first in a 3-part series on CLF and ELF web log data, and introduces the two file formats. The next article covers IRI solutions for processing web log data, and the last demonstrates web log data masking to protect visitor identities and destinations.

The NCSA Common Log Format (CLF) is a standardized text file format used by web servers when generating server log files. Because the format is standardized, analytic programs can readily make use of the information the log files contain, though other proprietary log formats exist.

CLF logs are in a fixed (non-customizable) ASCII format, and record basic information about user requests. For example, a CLF record might contain:

125.0.0.1 user-identifier sjones [10/Oct/2011:13:55:36 -0700] "GET /examp_alt.png HTTP/1.0" 200 10801

where:

  • A “-” in a field indicates missing data
  • 125.0.0.1 is the IP address of the client (remote host) which made the request to the server
  • user-identifier is the RFC 1413 identity of the client
  • sjones is the userid of the person requesting the document
  • [10/Oct/2011:13:55:36 -0700] is the date, time, and time zone at which the server received the request
  • “GET /examp_alt.png HTTP/1.0” is the request line from the client: GET is the method, /examp_alt.png is the resource requested, and HTTP/1.0 is the protocol version
  • 200 is the HTTP status code returned to the client
  • 10801 is the size of the object returned to the client, measured in bytes
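
The field layout above can be sketched as a small Python parser. This is a minimal illustration rather than a definitive implementation: the names CLF_PATTERN and parse_clf are hypothetical, the regular expression assumes well-formed records with no embedded double quotes in the request line, and the timestamp parsing assumes English month abbreviations.

```python
import re
from datetime import datetime

# One named group per CLF field, in the order they appear in a record.
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_clf(line):
    """Parse one CLF record into a dict, or return None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    rec = match.groupdict()
    # A "-" in a field indicates missing data.
    for key in ("ident", "user", "size"):
        if rec[key] == "-":
            rec[key] = None
    # Example timestamp: 10/Oct/2011:13:55:36 -0700
    rec["time"] = datetime.strptime(rec["time"], "%d/%b/%Y:%H:%M:%S %z")
    rec["status"] = int(rec["status"])
    if rec["size"] is not None:
        rec["size"] = int(rec["size"])
    # Split the request line into method, resource, and protocol.
    parts = rec.pop("request").split()
    rec["method"], rec["resource"], rec["protocol"] = (parts + [None] * 3)[:3]
    return rec

sample = ('125.0.0.1 user-identifier sjones [10/Oct/2011:13:55:36 -0700] '
          '"GET /examp_alt.png HTTP/1.0" 200 10801')
record = parse_clf(sample)
print(record["host"], record["method"], record["resource"], record["status"])
```

Because the timestamp is parsed together with its offset, it can be converted to UTC for comparison with logs that record time in UTC.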

The W3C Extended Log Format (ELF) is a customizable ASCII format, with a variety of different fields, that web servers use when generating log files. ELF files provide more information and flexibility than CLF files.

With ELF, you can include fields important to you, while limiting log file sizes by omitting unwanted fields. In addition, note that fields are separated by spaces, and that time is recorded as UTC (Greenwich Mean Time). For example, an ELF record might contain:

2010-05-02 15:42:15 - 40.89.255.10  34.14.255.10 80 GET /default.htm 200 - HTTP/1.0 Mozilla/4.0  (compatible: MSIE+5.5+Windows+2000+Server)

In this case, the format is:

date, time, c-ip, cs-username (-), s-ip, s-port, method, cs-uri-stem, status, csUserAgent

Each line can contain either a directive or an entry. Entries consist of a sequence of fields relating to a single HTTP transaction. Fields are separated by spaces. A “-” in a field indicates missing data.

Directives record information about the logging process itself. Lines beginning with the # character are directives.

These directives are defined as follows:

  • Version – rendition of the extended log file format used
  • Fields – list of the fields recorded in each entry of the log
  • Software – program that generated the log
  • Start-Date – date/time when the log began
  • End-Date – date/time when the log was finished
  • Date – date/time when the entry was added
  • Remark – specific comment information (data recorded in this field should be ignored by analysis tools)
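
To illustrate how directives and entries fit together, here is a minimal Python sketch that reads the Fields directive and uses it to label each entry. The function name parse_elf and the sample log lines are hypothetical, and the sketch assumes that no field value contains whitespace (quoted strings with embedded spaces are not handled).

```python
def parse_elf(lines):
    """Return (directives, entries) parsed from the lines of an ELF log."""
    directives = {}
    fields = []
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Directive line, e.g. "#Fields: date time c-ip"
            name, _, value = line[1:].partition(":")
            name, value = name.strip(), value.strip()
            directives[name] = value
            if name == "Fields":
                fields = value.split()
            continue
        values = line.split()
        if len(values) != len(fields):
            continue  # skip entries that do not match the declared fields
        # A "-" in a field indicates missing data.
        entries.append(
            {f: (None if v == "-" else v) for f, v in zip(fields, values)}
        )
    return directives, entries

# Hypothetical sample log, made up for this sketch.
sample_log = [
    "#Version: 1.0",
    "#Date: 2010-05-02 15:42:15",
    "#Fields: date time c-ip cs-username s-ip cs-method cs-uri-stem sc-status",
    "2010-05-02 15:42:15 40.89.255.10 - 34.14.255.10 GET /default.htm 200",
]
directives, entries = parse_elf(sample_log)
print(directives["Fields"])
print(entries[0]["c-ip"], entries[0]["cs-uri-stem"], entries[0]["sc-status"])
```

Driving the parser from the Fields directive, rather than hard-coding a column order, reflects the customizable nature of ELF: the same code works whichever fields a given server is configured to log.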

See the next article, on CLF and ELF web log data processing, which introduces IRI solutions for transforming, migrating, protecting, reporting from, and prototyping huge web logs.
