Removing Identical Content from Multiple Text Logs

This week I am delivering an advanced PowerShell course in Norfolk, VA.  This is my second week with this group of inspiring PowerShell Rock Stars, so I decided to push them a little earlier than usual: I had them help me develop a script for an individual who posted a question on PowerShell.com (http://powershell.com/cs/forums/t/16403.aspx?PageIndex=1).

His task was to examine 3 different text-based logs.  If a line appeared in all 3 logs, that line was to be removed from each log.  Sounds simple.  The problem was that his logs were hundreds of thousands of lines long.  This is a perfect example of how to apply PowerShell to a real-world situation.  We did this as a brainstorming session.  For the sake of time, we did not convert this into a cmdlet or a parameterized script.  We have to save some of the fun for the guy we were helping.  Here are our results:

 

 1  # Cast three variables of type [System.Collections.ArrayList]
 2  # to hold the contents of each log.  Use the Get-Content cmdlet
 3  # to populate the ArrayLists.
 4
 5  [System.Collections.ArrayList]$Log1 = Get-Content -Path "Log1.txt"
 6  [System.Collections.ArrayList]$Log2 = Get-Content -Path "Log2.txt"
 7  [System.Collections.ArrayList]$Log3 = Get-Content -Path "Log3.txt"
 8
 9  # Create a new ArrayList called $List to hold all strings that
10  # are present in all three logs.
11
12  $List = New-Object System.Collections.ArrayList
13
14  # Utilizing Log1 as our control, perform a comparison operation to
15  # Log2 and Log3.  If they both report $TRUE (a match was found) then
16  # add that string to $List.
17  ForEach ($L in $Log1)
18  {
19
20      If (($L -in $Log2) -and ($L -in $Log3))
21      {
22          [void]$List.Add($L)   # [void] hides the index that Add() returns
23      }
24
25
26  }
27
28  # Cycle through $List and utilize the "Remove" method of the
29  # ArrayList to delete the first matching entry from each log.
30  ForEach ($Item in $List)
31  {
32      $Log1.Remove($Item)
33      $Log2.Remove($Item)
34      $Log3.Remove($Item)
35
36  }
37
38  # Commit the three filtered logs to new files.
39  $Log1 | Out-File -FilePath "Log1Filtered.txt"
40  $Log2 | Out-File -FilePath "Log2Filtered.txt"
41  $Log3 | Out-File -FilePath "Log3Filtered.txt"

In lines 5-7, we used an ArrayList.  The advantage of this data type (http://msdn.microsoft.com/en-us/library/system.collections.arraylist(v=vs.110).aspx) is that we have the Remove() method.  When we execute the Remove() method on data in the array, it deletes the first matching element and automatically resizes the list, something a fixed-size PowerShell array cannot do.
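To make the difference concrete, here is a minimal sketch comparing a plain array with an ArrayList.  The sample strings and the variable names $Fixed and $Lines are made up for illustration and are not part of the class script.

```powershell
# A plain PowerShell array is fixed-size, so it cannot drop an element in place.
$Fixed = @('alpha', 'beta', 'gamma')
$IsFixed = $Fixed.IsFixedSize      # True for a regular array

# Casting to ArrayList gives a resizable collection with a Remove() method.
[System.Collections.ArrayList]$Lines = @('alpha', 'beta', 'gamma')
$Lines.Remove('beta')              # deletes the first element equal to 'beta'

$Remaining = $Lines -join ','      # alpha,gamma
$Count     = $Lines.Count          # 2
```

After the call to Remove(), the list simply holds two items; there is no empty slot left behind to clean up.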

Line 12 creates the ArrayList that will hold the content of any cell from $Log1, $Log2, and $Log3 that is present in all three logs.  We will use this information to remove content from those logs later.

Lines 17-26 are where we search for content that is present in all three logs.  We are using $Log1 as our control for the ForEach loop.

Line 20 utilizes 2 comparison operations joined by the -and logical operator.  Each of these comparison operations uses the -in operator, which is available in PowerShell 3.0 and later.  Traditionally, we would have used a ForEach loop to search each of the other two logs.  In this case, we are using the functionality of the -in comparison operator provided to us by PowerShell.  It returns $True if the other log contains the string that we are searching for.  If both $Log2 and $Log3 contain the string, then we add that string to $List.
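As a rough illustration of what -in saves us, the sketch below asks the same question three ways.  The data and the names $Needle, $Log, and $Entry are hypothetical stand-ins for one line and one log.

```powershell
# Hypothetical sample data standing in for one line and one log.
$Needle = 'disk full'
$Log    = @('boot ok', 'disk full', 'shutdown')

# Manual search: walk the collection until a match turns up.
$FoundByLoop = $false
ForEach ($Entry in $Log)
{
    If ($Entry -eq $Needle)
    {
        $FoundByLoop = $true
        Break
    }
}

# The same question asked with -in.
$FoundByIn = $Needle -in $Log

# -contains asks it again with the operands swapped.
$FoundByContains = $Log -contains $Needle
```

All three variables end up $True here.  Keep in mind that -in, like -eq, compares strings without regard to case by default; -cin is the case-sensitive form.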

Lines 30-36 cycle through $List and utilize the Remove method of the ArrayList object to remove each string from all three arrays.
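One detail worth knowing about Remove() is that each call deletes only the first matching element, and a call that finds no match leaves the list alone.  A minimal sketch with made-up data (the variable names are for illustration only):

```powershell
# Hypothetical log in which the same line appears twice.
[System.Collections.ArrayList]$Log = @('retry', 'ok', 'retry')

$Log.Remove('retry')           # removes only the first 'retry'
$AfterOne = $Log -join ','     # ok,retry

$Log.Remove('missing')         # no match: list unchanged, no error raised
$Count = $Log.Count            # 2
```

So if a shared line shows up more often in $Log2 or $Log3 than it does in $Log1, some copies of it would likely survive the loop, since $List only holds one entry per occurrence in $Log1.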

Lines 39-41 write new filtered versions of all three logs that no longer contain the strings common to all three.
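Because -in scans the whole collection on every lookup, the comparison loop may get slow on logs with hundreds of thousands of lines.  The sketch below is one possible variant, not what we built in class: it swaps in a .NET HashSet[string] for the lookups.  The sample data and the names $Common and $Log1Filtered through $Log3Filtered are assumptions for illustration.  Note two behavior differences: a HashSet compares strings case-sensitively by default, and this version drops every copy of a shared line rather than only the first.

```powershell
# Sketch only: sample data is hypothetical; use Get-Content for real logs.
$Log1 = @('a', 'shared', 'b')
$Log2 = @('shared', 'c', 'shared')
$Log3 = @('d', 'shared')

# Build a set of the lines that occur in all three logs.
$Common = New-Object 'System.Collections.Generic.HashSet[string]'
ForEach ($L in $Log1) { [void]$Common.Add($L) }
$Common.IntersectWith([string[]]$Log2)
$Common.IntersectWith([string[]]$Log3)

# Keep every line that is not in the common set.
$Log1Filtered = @($Log1 | Where-Object { -not $Common.Contains($_) })
$Log2Filtered = @($Log2 | Where-Object { -not $Common.Contains($_) })
$Log3Filtered = @($Log3 | Where-Object { -not $Common.Contains($_) })
```

Each filtered collection can then be piped to Out-File exactly as in lines 39-41.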
