Day 2: Separating the Records
Today we will accomplish Step 1 of the process: Identify a
rule to separate records.
I’ve noticed that text-based log files display their records in one of two formats. The first is one record per line. The other is a record spread across multiple lines, and those records may or may not have the same properties.
I searched my hard drive for an example of a log file with one line for
each record. Log Example 1 (below) not
only has this, but also some header data that we need to filter out. I also added the last two lines just to
create multiple potential properties.
Log Example 1: One record per line
________________________________________________________________________________
Log started Thursday 2013-05-30 at 16:06:40
Using st_log.ini: 1
Log level: 63
Forced flush: 0
Compressed: 0
Encrypted: 0
IniPath: C:/Program Files (x86)/Dell Backup and Recovery/ST_LOG.INI
Build Date&Time: Mar 14 2012, 14:48:33
FileVersion: 1.0.0.5
ProductVersion: 1.0.0.5
ModulePath: C:\Program Files (x86)\Dell Backup and Recovery\InstallHelper.exe
________________________________________________________________________________
[2013-05-30 16:06:40] Info: InstallHelper Start
[2013-05-30 16:06:40] Info: ST_Drive::IsInternalDrive(int nDisk=0) RETURNED 1
[2013-05-30 16:06:40] Info: IsTablePC return FALSE
[2013-05-30 16:06:40] Info: InstallHelper End
[2013-05-30 16:10:40] Warning: InstallHelper Experienced a failure.
[2013-05-30 16:10:40] Critical: Help files are not available.
This file will be saved as Data.txt.
The cmdlet that we will be creating is called Import-MyLog1. We need to create a
function that will not only determine which lines of this text file contain
data, but also filter out the header.
We need to create some type of rule that every record has in common. If you look at each line that
contains a record, there is a pattern for a date-time stamp. Using a regular expression, we can filter for
just these lines in the log file. An argument could be put forth that it would be easier to just remove the
header information manually. You need to decide whether you want to invest the time to do this in code once,
or invest the time every time before you execute this cmdlet.
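Before building any functions, the rule can be tried out directly at the prompt. The following is only a minimal sketch: the $Sample and $Pattern variable names are made up for illustration, and the sample lines are copied from Log Example 1.

```powershell
# Sample lines copied from Log Example 1: two header lines and two records.
$Sample = @(
    'Log level: 63'
    'Build Date&Time: Mar 14 2012, 14:48:33'
    '[2013-05-30 16:06:40] Info: InstallHelper Start'
    '[2013-05-30 16:10:40] Critical: Help files are not available.'
)

# The date-time pattern that serves as our rule for a record.
$Pattern = '\d*-\d*-\d* \d*:\d*:\d*'

# When the left side of -match is an array, only the matching elements are returned.
$Matched = $Sample -match $Pattern
$Matched
```

Only the two lines that start with a bracketed date-time stamp should come back. The header lines are dropped because they do not contain the dash and colon sequence that the pattern requires.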
Below is the start of our code. We are working with two functions. The first function is Get-Log.
Function Get-Log
{
    # Reads the log file into memory. Will display an error if the
    # file cannot be found.
    Try
    {
        Get-Content -Path "\\Server1\Data.txt" -ErrorAction Stop |
            Write-Output
    }
    Catch
    {
        Write-Error "The data file is not present"
        BREAK
    }
} # End: Function Get-Log
The purpose of this function is simply to read the contents
of the log file and return them to the calling statement in the main code. It will also post an error and exit the cmdlet
if the log file cannot be found.
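If you expect to read more than one log, the path can be passed in rather than hard-coded. The variation below is a sketch only and is not the function used in the rest of this series: the -Path parameter is an assumption added for illustration, and Throw is swapped in for Write-Error plus BREAK so that the caller receives a terminating error it can catch.

```powershell
Function Get-Log
{
    Param (
        # Hypothetical parameter. The original function hard-codes this path.
        [String]$Path = '\\Server1\Data.txt'
    )

    Try
    {
        # -ErrorAction Stop turns a missing file into a terminating error
        # so that the Catch block runs.
        Get-Content -Path $Path -ErrorAction Stop
    }
    Catch
    {
        # Raise a terminating error that names the file and the cause.
        Throw "The data file '$Path' could not be read: $($_.Exception.Message)"
    }
} # End: Function Get-Log
```

Calling Get-Log -Path C:\Logs\Data.txt (an example path) would then read that file instead of the default.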
Our second function, Get-Record,
is more interesting.
Function Get-Record
{
    Param ($Log)

    # Determines what is a record. This eliminates any header information.
    # For this to work properly, you need to utilize some type of rule
    # to determine what is a new record. In this case, a pattern
    # is being utilized that we will send to a regular expression.
    # The pattern is:
    #
    # Example - 2013-05-30 16:06:40
    #
    # See About_Regular_Expressions for more information on how to use
    # Regular Expressions in the built in PowerShell help files.

    # Array to hold all of the objects.
    $Object = @()

    # Cycle through each line of the log and look for a line that
    # matches the pattern that we are using to denote a new record.
    ForEach ($L in $Log)
    {
        If ($L -match "\d*-\d*-\d* \d*:\d*:\d*")
        {
            $Obj = New-Object -TypeName PSObject
            $Splat = @{NotePropertyName = "Record";
                       NotePropertyValue = "$($L)"}
            $Obj | Add-Member @Splat
            $Object += $Obj
        }
    } # End: ForEach ($L in $Log)

    # Send the records to the calling statement.
    Write-Output $Object
} # End: Function Get-Record
The purpose of this code is to return only objects that
contain lines from the log that are valid records. If your rule works correctly, the returned
objects will not contain any header information, comments, or anything
else. The $Log parameter is simply the entire log file. We could have read the log file in this
function instead of having an extra function.
I opted not to do this for two reasons.
1. Functions should be as single purpose as possible.
2. Cleaner code with better error handling.
We all code differently so I will not say there is a right
or wrong way. These are my reasons for
the way I do things.
The ForEach loop
will cycle through the log one line at a time. Here is where we apply our rule, in the form of a
regular expression, to determine what a record is and what it is not. Our regular expression is asking if the
string has this pattern:
number “-” number “-” number “ ” number “:” number “:” number
When this pattern is
found, a new object is created with just one property called Record.
This property contains the entire line from the log. When the loop is finished, the objects are
returned to the calling statement.
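One thing to keep in mind is that \d* means "zero or more digits", so the pattern used here will also accept a string with no digits in it at all. The sketch below compares it with a stricter variant. The $Loose and $Strict names are made up for illustration, and the stricter pattern assumes that every record begins with a bracketed time stamp exactly as in Log Example 1. It is offered as an option, not as the pattern used in the rest of this series.

```powershell
# The pattern from Get-Record: each \d* may match nothing at all.
$Loose  = '\d*-\d*-\d* \d*:\d*:\d*'

# A stricter variant (an assumption, based on the layout of Log Example 1):
# anchored to the start of the line, with a fixed number of digits in each part.
$Strict = '^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\]'

$Record = '[2013-05-30 16:06:40] Info: InstallHelper Start'

'-- ::'  -match $Loose     # True  : matches even though there are no digits
'-- ::'  -match $Strict    # False
$Record  -match $Loose     # True
$Record  -match $Strict    # True
```

Whether the extra strictness is worth it depends on how messy your log is. For the sample log in this post, both patterns select the same six lines.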
Here is what we have so far:
Function Import-MyLog1
{
    # -----------------------------------------------------------------------
    Function Get-Log
    {
        # Reads the log file into memory. Will display an error if the
        # file cannot be found.
        Try
        {
            Get-Content -Path "\\Server1\Data.txt" `
                -ErrorAction Stop |
                Write-Output
        }
        Catch
        {
            Write-Error "The data file is not present"
            BREAK
        }
    } # End: Function Get-Log
    # -----------------------------------------------------------------------
    Function Get-Record
    {
        Param ($Log)

        # Determines what is a record. This eliminates any header information.
        # For this to work properly, you need to utilize some type of rule
        # to determine what is a new record. In this case, a pattern
        # is being utilized that we will send to a regular expression.
        # The pattern is:
        #
        # Example - 2013-05-30 16:06:40
        #
        # See About_Regular_Expressions for more information on how to use
        # Regular Expressions in the built in PowerShell help files.

        # Array to hold all of the objects.
        $Object = @()

        # Cycle through each line of the log and look for a line that
        # matches the pattern that we are using to denote a new record.
        ForEach ($L in $Log)
        {
            If ($L -match "\d*-\d*-\d* \d*:\d*:\d*")
            {
                $Obj = New-Object -TypeName PSObject
                $Splat = @{NotePropertyName = "Record";
                           NotePropertyValue = "$($L)"}
                $Obj | Add-Member @Splat
                $Object += $Obj
            }
        } # End: ForEach ($L in $Log)

        # Send the records to the calling statement.
        Write-Output $Object
    } # End: Function Get-Record
    # -----------------------------------------------------------------------

    # Load the log into memory
    $Log = Get-Log

    # Extract the records
    $Records = Get-Record -Log $Log

    $Records # Added only to see the current progress.
} # End: Function Import-MyLog1
Here is the output from calling this cmdlet: (Word wrapping may distort the output)
Record
------
[2013-05-30 16:06:40] Info: InstallHelper Start
[2013-05-30 16:06:40] Info: ST_Drive::IsInternalDrive(int nDisk=0) RETURNED 1
[2013-05-30 16:06:40] Info: IsTablePC return FALSE
[2013-05-30 16:06:40] Info: InstallHelper End
[2013-05-30 16:10:40] Warning: InstallHelper Experienced a failure.
[2013-05-30 16:10:40] Critical: Help files are not available.
As you can see, we have eliminated the header information,
along with any line of text that is not a record. Take the time to verify that there are not
any exceptions to your rule to define a record.
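One way to look for exceptions is to list everything the rule rejected and read through it. The following is a minimal sketch: the $Log contents here are a few sample lines standing in for the real file, and the $Pattern and $Rejected names are made up for illustration.

```powershell
# Stand-in for the log contents. In practice this would come from Get-Log.
$Log = @(
    'Log level: 63'
    '[2013-05-30 16:06:40] Info: InstallHelper Start'
    'FileVersion: 1.0.0.5'
)

# The same rule that Get-Record applies.
$Pattern = '\d*-\d*-\d* \d*:\d*:\d*'

# -notmatch on an array returns the elements that do NOT match the rule.
$Rejected = $Log -notmatch $Pattern
$Rejected
```

If anything in the rejected list looks like a real record, the rule needs to be adjusted before moving on.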
Tomorrow we will take it a step further and extract the
properties from the log file.