Skip to main content

Data Deduplication Demo

Data Deduplication can save valuable amounts of hard drive space.  In today’s cost conscious environments, saving hard drive space can translate into budgets that can be utilized elsewhere.  The question that often pops up is “How much space will data dedup save me?”  Unfortunately, there is no way to make a accurate prediction.  Data dedup works best with static data.  That is because there is no reason to dedup data that changes often. 
The PowerShell code below will generate a few thousand text files that will share a lot of common bit patterns.  This will help to demonstrate some space savings with dedup.
$String
$NewLineIndex = 0

For ($X=0;$X -lt 10000;$X++)
{

    $C1 = [Char]((Get-Random(35)) + 65)
    $C2 = [Char]((Get-Random(35)) + 65)
    $C3 = [Char]((Get-Random(35)) + 65)
    $C4 = [Char]((Get-Random(35)) + 65)

    $String += "$($C1)$($C2)$($C3)$($C4) "

    If ($NewLineIndex -gt 15)
    {
        $Strgin += "`n"
        $NewLineIndex = 0
    }

    $Name = "E:\PS\Files$($X).txt"
    $NewLineIndex++
    $String | Out-File -LiteralPath $Name

}

Executing this code will generate a lot of files, but this will also help make more of a visual impact with this demonstration.  Once this code has executed, we need to install the data deduplcation
Install-WIndowsFeature FS-Data-Deduplication
Once this feature is installed, you need to enable this feature on the volume that is holding your data.  In this case, it is the E: drive.
Enable-DedupVolume –Volume E:
Data deduplication will only work when scheduled and on data that is at least 5 days old.   To make this demo work, we need to set the age requirment to 0 days.
Set-DedupVolume –Volume E: –MinimumFileAgeDays 0
Now we can start a deduplication.  To see our space savings:
Get-DedupStatus –Volume E:
image
Now we can start the deduplication.
image
You can check the status of the deduplication job:

image
And now to see what we got back:
image
This may not seem like a lot of savings for 10,000 files, but then again these were small files for the most part.  You results will vary.  In the end, this could be used as a tactic to free up space on hard drives that are critically short on space.  Also take a look at the File Server Resource Manager for more tools to help identify data that may be able to be moved to offline storage.

Comments

Popular posts from this blog

How to force a DNS zone to replicate

For many implementations of DNS in a Windows environment, DNS is configured as being Active Directory integrated.  In other words, the DNS zone information is actually stored as a partition in the active directory database.  When Active Directory replicates, the zone data transfers.  For standard DNS deployments, the data is stored in a file.  You have to configure zone transfers manually in the DNS console.   The question in class was how to initiate replication manually.  Once you have properly configured a Primary and secondary DNS server and configured the Primary server to allow zone transfers, you can manually initiate a zone transfer.   Below you can see our test environment.  The image is of to RDP sessions to two different servers.  The DNS console on the left is the primary.  You can see and entry for Test2 that is not in the secondary database.  The servers are named NYC-DC2 (Primary DNS) and NYC-DC1 (Secondary DNS).  The DNS zone is named test.contoso.com . On the se

Export Your Performance Monitor Data to Excel

Updated: 2016MAY04 To clarify when this functionality is available, you can only save the view when you are viewing a Data Collection Set.  The "live" data cannot be saved in this way. Performance Monitor in Windows Server give us the ability to see when our servers are having some issues.  Analyzing that data into something meaningful can be a problem.  You can export your data to Excel so you can better see what your performance data represents.  First collect your data. Right click the graph and select Save Data As . Change the Save as type to Text file (comma delimited)(*.csv) . Give the file a name and save it where you want to store it. Now open that file on a client with Excel installed on it.  By using excel, you will be able to present the data in a more meaningful format.

Determine which Domain Controller a client is connected to with PowerShell

When a Windows client comes online, it must find a domain controller to bind to.  Either through a static configuration or DHCP, the client will request a list of all Domain Controllers in the domain from a DNS server.  Once the list is received, the client will randomly go through the list to find a DC that will respond.  Once the client has authenticated itself with the DC, the DC will transmit the site information to the client.  The site information will contain the site name, the subnet(s) associated with that site, and any domain controllers in that site.  The client will then take a look at it’s own IP address to determine which site it is in.  From the list of DCs in the same site, it will attempt to bind to one of those DCs to receive it’s Group Policies.   You can use PowerShell and WMI to locate the domain controller that a client is connected to.   Get-WMIObject Win32_NTDomain   Look for the DomainControllerName property.