Turn on Data Deduplication

Over many years as a network administrator, I constantly struggled with the data storage needs of my users. We had to fund not only ever larger amounts of storage, but also the backup and recovery operations needed to protect that data and meet the needs of the organization.

One of the big problems was duplicate information being stored by multiple users. Duplicate data adds to your cost in several ways:

· Increased storage cost due to capacity depletion from duplicated data.

· Increased number of backup media, and the cost associated with storage, transportation, and replacement of the media.

· Increased recovery times.

· Purchasing and deployment of new disaster recovery hardware so backup and recovery operations can stay within established time frames.

Data Deduplication can help reduce each of the costs listed above. It removes duplicated blocks of data and replaces them with references to a single copy stored on the volume. It works well for data stores that do not change frequently, and it will not work on boot partitions or partitions containing the operating system. You can reduce the storage capacity consumed by file shares, software deployment shares, and virtual hard disk libraries. Data Deduplication is only supported on the NTFS file system and not on the new Resilient File System (ReFS).
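To make the idea of "references to a single copy" more concrete, here is a minimal Python sketch of block-level deduplication. It is only an illustration and not how the Windows feature is implemented: the ChunkStore class, the tiny fixed 4-byte chunk size, and the file names are all assumptions made up for this example.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; deliberately tiny so the demo is easy to follow (assumption)


class ChunkStore:
    """Toy store that keeps one copy of each distinct chunk (hypothetical helper)."""

    def __init__(self, chunk_size=CHUNK_SIZE):
        self.chunk_size = chunk_size
        self.chunks = {}  # chunk hash -> chunk bytes (the single stored copy)
        self.files = {}   # file name -> ordered list of chunk hashes (references)

    def write(self, name, data):
        """Split data into chunks and store only chunks not seen before."""
        refs = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # keep the first copy only
            refs.append(digest)
        self.files[name] = refs

    def read(self, name):
        """Rebuild a file by following its chunk references in order."""
        return b"".join(self.chunks[digest] for digest in self.files[name])

    def stored_bytes(self):
        """Bytes actually kept once duplicate chunks are removed."""
        return sum(len(chunk) for chunk in self.chunks.values())


store = ChunkStore()
store.write("report_v1.txt", b"AAAABBBBCCCC")
store.write("report_copy.txt", b"AAAABBBBCCCC")  # exact duplicate
store.write("report_v2.txt", b"AAAABBBBDDDD")    # differs only in the last chunk

logical_bytes = sum(len(store.read(name)) for name in store.files)
print(logical_bytes, store.stored_bytes())
```

In this toy run, three files totalling 36 bytes are backed by only four distinct chunks, and each file still reads back unchanged. The same effect is why shares holding many near-identical files tend to benefit the most.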

Here is how you turn it on.

First we need some duplicate files. (OK, not really, but I wanted to have some files on the drive.)

In this example, a couple of files are stored in different locations but are duplicates of each other. They also reside on an NTFS-formatted volume that is neither the boot volume nor the OS volume.
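If you want a rough idea of how many whole-file duplicates a folder holds before you turn the feature on, a short script can group files by content hash. This is a sketch under my own assumptions: the function names are made up for illustration, and it only finds files that are identical from start to finish, whereas Data Deduplication works on blocks within files and can therefore save more than this suggests.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, block_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in blocks to limit memory use."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            hasher.update(block)
    return hasher.hexdigest()


def find_duplicate_files(root):
    """Group files under root by content; return only groups with two or more files."""
    by_digest = defaultdict(list)
    for folder, _subfolders, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            by_digest[file_digest(path)].append(path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Calling find_duplicate_files on the root of the data volume (for example "E:\\" for the drive used in this walkthrough) returns a list of groups, where each group is a list of paths that share the same content. Files that cannot be opened will raise an error in this simple version.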

Open Server Manager and click Manage –> Add Roles and Features.

Click Next three times.

Expand File and Storage Services –> File and iSCSI Services.

Check Data Deduplication and click Next.

Click Next and then Install.

Click Close.

You can monitor the installation in Server Manager.

No restart is necessary.

In Server Manager, click File and Storage Services.

Click Volumes.

Right-click the E: drive and select Configure Data Deduplication. (Note: it may take a few minutes before you can select Configure Data Deduplication.)

Check Enable data deduplication.

Click Set Deduplication Schedule.

Check Enable throughput optimization. This sets the time window during which data deduplication runs at normal priority, so that more processor capacity can be dedicated to the deduplication process during those hours.

Click OK twice.

Data Deduplication is now set up.
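If you would rather script these steps than click through the wizard, the sketch below shows one possible approach. It assumes the Install-WindowsFeature, Enable-DedupVolume, and Get-DedupVolume PowerShell cmdlets are available on the server; none of them appear in the walkthrough above. The Python wrapper functions are hypothetical helpers of my own, and the sketch does not configure the throughput optimization schedule.

```python
import subprocess


def dedup_commands(volume="E:"):
    """Build the PowerShell commands that roughly mirror the GUI steps above."""
    return [
        # Add the Data Deduplication role service (assumed feature name).
        "Install-WindowsFeature -Name FS-Data-Deduplication",
        # Turn deduplication on for the chosen data volume.
        f"Enable-DedupVolume -Volume {volume}",
        # Show the volume's deduplication settings to confirm it is enabled.
        f"Get-DedupVolume -Volume {volume}",
    ]


def run_powershell(command):
    """Run one PowerShell command and return its output.

    Only meaningful on a Windows server, from an elevated session.
    """
    result = subprocess.run(
        ["powershell.exe", "-NoProfile", "-Command", command],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def enable_dedup(volume="E:"):
    """Run every command in order, stopping at the first failure."""
    return [run_powershell(command) for command in dedup_commands(volume)]
```

Nothing runs until you call enable_dedup yourself, so you can print the result of dedup_commands first and review the commands before letting the script touch a server.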
