17 Jul

Better data management: to archive or not to archive?

Over the years I’ve had many conversations with clients about using archive technologies to offset storage costs. The idea that data is accessed less and less as it ages is what fueled the entire concept of hierarchical storage management (HSM).

As storage solutions progressed, providing HSM within a storage array, or what would become known as tiered storage, became the mainstay of storage management: new data gets written and read the most, so put it on the top tier of storage and let it move down as it ages. Archiving, however, is really a different concept. An archive, by definition, is “any extensive record or collection of data.” So in many organizations, what we think of as a traditional backup routine, daily/monthly/yearly backups, can serve as an archive if you have a way to catalog that data.
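To make the tiering idea concrete, here is a tiny sketch of an age-based placement policy. The tier names and age thresholds are hypothetical, purely to illustrate how data “moves down” as it ages:

    from datetime import datetime, timedelta

    def choose_tier(last_accessed, now=None):
        # Pick a storage tier based purely on how long ago the data was last touched.
        now = now or datetime.now()
        age = now - last_accessed
        if age < timedelta(days=30):
            return "tier 1 (fast flash)"
        if age < timedelta(days=365):
            return "tier 2 (capacity disk)"
        return "tier 3 (archive/object storage)"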

In the past decade, however, as email and file service data has exploded, more and more organizations have been trying to figure out how to better manage long-term retrieval of old data without deleting it from its primary location. Many companies we have worked with have, over the years, implemented technologies that in some cases replace the original piece of data with a “stub” file that points to the new location of that data in a proprietary compressed format, or in other cases use proxy technologies to redirect entire shares or mounts.
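As a rough illustration of the stubbing approach (the stub format and paths here are hypothetical, not any particular vendor’s implementation), the archiver moves the real data elsewhere and leaves a small pointer behind in its place:

    import json, os, shutil

    def archive_and_stub(original_path, archive_dir):
        # Move the real file into the archive location...
        archived_path = os.path.join(archive_dir, os.path.basename(original_path))
        shutil.move(original_path, archived_path)
        # ...and leave a tiny stub behind that records where the data now lives.
        stub = {"archived_to": archived_path, "size_bytes": os.path.getsize(archived_path)}
        with open(original_path + ".stub", "w") as f:
            json.dump(stub, f)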

So here is where it gets complicated: the almighty cloud. How does this affect data management? We all hear about how great it is and how much sense using SaaS makes. CIOs everywhere are pushing their IT staff to look at how they can save money with hosted email and file service technologies. What these CIOs have not taken into account is the migration process back out of the existing archiver and what that means on a number of fronts: the end user experience will be vastly different, re-hydrating the data can be a very cumbersome process, and before you know it the TCO on the hosted email solution has skyrocketed.

We are in the midst of a number of projects where organizations are moving internal Microsoft Exchange email to Office 365. They were aware that we needed to deal with the existing archived data, but what they didn’t take into account was the sheer amount of time that would be added to the migration process. The speed at which email data can be moved is largely dictated by the MAPI protocol (you know we all love MAPI), and with 5 concurrent VMs pulling data, the current estimated time to complete the migration is nearly 400 days.
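To see why the number gets that large, here is a back-of-the-envelope estimate. All figures are hypothetical, chosen only to show how per-stream MAPI throughput and concurrency drive the timeline, not the actual project numbers:

    def migration_days(total_gb, gb_per_hour_per_stream, concurrent_streams, hours_per_day=24):
        # Calendar days to move total_gb at the given per-stream rate and concurrency.
        aggregate_gb_per_hour = gb_per_hour_per_stream * concurrent_streams
        return total_gb / (aggregate_gb_per_hour * hours_per_day)

    # Example: ~50 TB of re-hydrated archive data at ~1 GB/hour per MAPI stream across 5 VMs.
    print(round(migration_days(50_000, 1, 5)))  # roughly 417 days

Doubling the VM count or the per-stream rate halves the estimate, but in practice the MAPI throttling, not the hardware, tends to be the ceiling.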

When Microsoft built its impressive ROI analysis for this company, it failed to build that into the TCO. The point in all of this is that as we see more and more companies trying to leverage the public cloud and all the benefits it provides, it is very important they consider what existing technologies are in use and how those may affect the migration. When SaaS companies talk about their migration strategies, they are always talking about the best case, not what may happen in the real world. Lastly, it’s important to think about how your existing archiving strategy may impact your overall disaster recovery and data protection strategy when you do move to the cloud. Access to old data may still be a requirement, and moving all of it to the cloud may not always be the most cost-effective approach.

Next week I will write more on this topic and discuss which new cloud-based archive technologies may be a good alternative for managing data growth.

Dave Kluger – Principal Technology Architect
