Advanced Windows PowerShell Scripting Video Training

Tuesday, March 11, 2014

Removing Identical Content from Multiple Text Logs

This week I am delivering an advanced PowerShell course in Norfolk, VA. This is my second week with this group of inspiring PowerShell Rock Stars, so I decided to push them a little earlier than usual: I had them help me develop a script for an individual who posted a question on PowerShell.com (http://powershell.com/cs/forums/t/16403.aspx?PageIndex=1).

His task was to examine three different text-based logs. If a line appeared in all three logs, that line was to be removed from each of them. Sounds simple. The catch was that his logs were hundreds of thousands of lines long. This is a perfect example of how to apply PowerShell to a real-world situation. We did this as a brainstorming session. For the sake of time, we did not convert this into a cmdlet or a parameterized script. We have to save some of the fun for the guy we were helping. Here are our results:

# Cast three variables of type [System.Collections.ArrayList]
# to hold the contents of each log.  Use the Get-Content cmdlet
# to populate the ArrayLists.

[System.Collections.ArrayList]$Log1 = Get-Content -Path "Log1.txt"
[System.Collections.ArrayList]$Log2 = Get-Content -Path "Log2.txt"
[System.Collections.ArrayList]$Log3 = Get-Content -Path "Log3.txt"

# Create a new ArrayList called $List to hold all strings that
# are present in every log.

$List = New-Object System.Collections.ArrayList

# Utilizing Log1 as our control, perform a comparison operation to
# Log2 and Log3.  If they both report $TRUE (a match was found) then
# add that string to $List.
ForEach ($L in $Log1)
{

    If (($L -in $Log2) -and ($L -in $Log3))
    {
        [void]$List.Add($L)   # [void] discards the index that Add() returns
    }


}

# Cycle through $List and utilize the "Remove" method
# of the ArrayList to remove that string from each log.
ForEach ($Item in $List)
{
    $Log1.Remove($Item)
    $Log2.Remove($Item)
    $Log3.Remove($Item)

}

# Commit the three filtered logs to new files.
$Log1 | Out-File -FilePath "Log1Filtered.txt"
$Log2 | Out-File -FilePath "Log2Filtered.txt"
$Log3 | Out-File -FilePath "Log3Filtered.txt"

In lines 5-7, we used an ArrayList. The advantage of this data type (http://msdn.microsoft.com/en-us/library/system.collections.arraylist(v=vs.110).aspx) is that it gives us the Remove() method. When we call Remove() with a value, it deletes the first element equal to that value and shifts the remaining elements down, so the collection shrinks without leaving an empty slot.
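To make the difference concrete, here is a small sketch using made-up sample strings (they are not taken from the real logs). It shows that a plain PowerShell array is fixed-size, while an ArrayList shrinks when Remove() is called, and that Remove() only takes out the first match.

```powershell
# Sample values are hypothetical and exist only to illustrate the behavior.
$Fixed = @('alpha', 'beta', 'alpha')
$Fixed.IsFixedSize            # True: a plain array cannot shrink in place

[System.Collections.ArrayList]$Flexible = @('alpha', 'beta', 'alpha')
$Flexible.Remove('alpha')     # removes only the FIRST 'alpha'
$Flexible.Count               # 2
$Flexible -join ','           # beta,alpha
```

One consequence worth keeping in mind: if a log contains the same line more than once, a single Remove() call leaves the later copies in place.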

Line 12 creates the ArrayList that will hold the content of any cell from $Log1, $Log2, and $Log3 that is present in all three logs.  We will use this information to remove content from those logs later.

Lines 17-26 are where we search for content that is present in all three logs. We are using $Log1 as our control for the ForEach loop.

Line 20 uses two comparison operations joined by the -and logical operator. Each comparison uses the -in operator, which is available in PowerShell 3.0 and later. Traditionally, we would have used a nested ForEach loop to search each of the other two logs. Here we let -in do that work for us: it returns $True if the collection on the right contains the string on the left. If both $Log2 and $Log3 contain the string, we add that string to $List.
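A quick way to get a feel for -in is to try it at the console. The sketch below uses hypothetical sample lines rather than the real logs. It shows that -in matches whole elements only and ignores case by default, while -cin is the case-sensitive form.

```powershell
# Hypothetical sample lines, not taken from the real logs.
$Log2 = 'Service started', 'Disk check OK', 'ERROR: timeout'

'Disk check OK' -in $Log2     # True
'disk check ok' -in $Log2     # True: -in ignores case by default
'disk check ok' -cin $Log2    # False: -cin is the case-sensitive variant
'Disk check'    -in $Log2     # False: the whole element must match
```

The case behavior matters for our script. ArrayList.Remove() compares strings case-sensitively, so a line that matches only when case is ignored would be added to $List but then not removed from the other logs. If that could happen in your data, -cin on line 20 keeps the two steps consistent.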

Lines 30-36 cycle through $List and use the Remove() method of the ArrayList object to remove each string from all three arrays.

Lines 39-41 write new, filtered versions of all three logs with the lines common to all three removed.
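Because the original logs were hundreds of thousands of lines long, it is worth noting that -in scans the whole collection on every pass, so the search loop does a lot of repeated work. The sketch below is an alternative we did not build in class: it uses a .NET HashSet[string] to find the common lines. The small sample arrays are hypothetical stand-ins for the Get-Content calls, and the variable names $Common and $Log1Filtered are mine. Unlike the script above, this version compares case-sensitively and drops every copy of a common line, not just the first.

```powershell
# Sketch only. Sample arrays stand in for Get-Content -Path "Log1.txt" and so on.
$Log1 = 'a', 'shared', 'b', 'shared'
$Log2 = 'shared', 'c'
$Log3 = 'd', 'shared', 'e'

# Start with the distinct lines of Log1.
$Common = New-Object 'System.Collections.Generic.HashSet[string]'
ForEach ($Line in $Log1) { [void]$Common.Add($Line) }

# Keep only the lines that also appear in Log2 and in Log3.
$Common.IntersectWith([string[]]$Log2)
$Common.IntersectWith([string[]]$Log3)

# Keep every line that is NOT common to all three logs.
$Log1Filtered = @($Log1 | Where-Object { -not $Common.Contains($_) })
$Log2Filtered = @($Log2 | Where-Object { -not $Common.Contains($_) })
$Log3Filtered = @($Log3 | Where-Object { -not $Common.Contains($_) })

$Log1Filtered -join ','   # a,b
$Log2Filtered -join ','   # c
$Log3Filtered -join ','   # d,e
```

From here, each filtered array can be piped to Out-File just as in lines 39-41.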
