Skip to main content

Data Deduplication Demo

Data Deduplication can save valuable amounts of hard drive space.  In today’s cost conscious environments, saving hard drive space can translate into budgets that can be utilized elsewhere.  The question that often pops up is “How much space will data dedup save me?”  Unfortunately, there is no way to make a accurate prediction.  Data dedup works best with static data.  That is because there is no reason to dedup data that changes often. 
The PowerShell code below will generate a few thousand text files that will share a lot of common bit patterns.  This will help to demonstrate some space savings with dedup.
$String
$NewLineIndex = 0

For ($X=0;$X -lt 10000;$X++)
{

    $C1 = [Char]((Get-Random(35)) + 65)
    $C2 = [Char]((Get-Random(35)) + 65)
    $C3 = [Char]((Get-Random(35)) + 65)
    $C4 = [Char]((Get-Random(35)) + 65)

    $String += "$($C1)$($C2)$($C3)$($C4) "

    If ($NewLineIndex -gt 15)
    {
        $Strgin += "`n"
        $NewLineIndex = 0
    }

    $Name = "E:\PS\Files$($X).txt"
    $NewLineIndex++
    $String | Out-File -LiteralPath $Name

}

Executing this code will generate a lot of files, but this will also help make more of a visual impact with this demonstration.  Once this code has executed, we need to install the data deduplcation
Install-WIndowsFeature FS-Data-Deduplication
Once this feature is installed, you need to enable this feature on the volume that is holding your data.  In this case, it is the E: drive.
Enable-DedupVolume –Volume E:
Data deduplication will only work when scheduled and on data that is at least 5 days old.   To make this demo work, we need to set the age requirment to 0 days.
Set-DedupVolume –Volume E: –MinimumFileAgeDays 0
Now we can start a deduplication.  To see our space savings:
Get-DedupStatus –Volume E:
image
Now we can start the deduplication.
image
You can check the status of the deduplication job:

image
And now to see what we got back:
image
This may not seem like a lot of savings for 10,000 files, but then again these were small files for the most part.  You results will vary.  In the end, this could be used as a tactic to free up space on hard drives that are critically short on space.  Also take a look at the File Server Resource Manager for more tools to help identify data that may be able to be moved to offline storage.

Comments

Popular posts from this blog

How to run GPResult on a remote client with PowerShell

In the past, to run the GPResult command, you would need to either physically visit this client, have the user do it, or use and RDP connection.  In all cases, this will disrupt the user.  First, you need PowerShell remoting enabled on the target machine.  You can do this via Group Policy . Open PowerShell and type this command. Invoke-Command –ScriptBlock {GPResult /r} –ComputerName <ComputerName> Replace <ComputerName> with the name of the target.  Remember, the target needs to be online and accessible to you.

How to force a DNS zone to replicate

For many implementations of DNS in a Windows environment, DNS is configured as being Active Directory integrated.  In other words, the DNS zone information is actually stored as a partition in the active directory database.  When Active Directory replicates, the zone data transfers.  For standard DNS deployments, the data is stored in a file.  You have to configure zone transfers manually in the DNS console.   The question in class was how to initiate replication manually.  Once you have properly configured a Primary and secondary DNS server and configured the Primary server to allow zone transfers, you can manually initiate a zone transfer.   Below you can see our test environment.  The image is of to RDP sessions to two different servers.  The DNS console on the left is the primary.  You can see and entry for Test2 that is not in the secondary database.  The servers are named NYC-DC2 (Primary DNS) and NYC-DC1 (Secondary DNS).  The DNS zone is named test.contoso.com . On the se

Disable SMB signing

It never fails.  Once ever couple of months I have a delegate in my class that has to keep a Windows NT4 box running.  There is nothing wrong with that.  Many applications build on Windows NT4 are solid.  Why upgrade and incur cost when no upgrade is really required?  That is generally the reason why Windows NT4 is being used.  Another reason is the vender went out of business, but the application that is required is really good and paid for. Two things to take note of.  If these Windows NT4 clients are going to be authenticating on a Windows Sever 2008 DC, then you may have a problem.  For WinNT 4.0 SP2 and earlier, SMB signing was not supported.  For WinNT4.0 SP3 and earlier, secure channel was not supported. SMB signing helps to prevent Man-in-the-middle attacks.  To open GPMC, click Start , click Run , type gpmc.msc , and then click OK . In the console tree, right-click Default Domain Controllers Policy in Domains\ Current Domain Name \Group Policy objects\Default Domain Co