The Emperor Has Too Many Files
By Tony Asaro on Apr 27, 2010 | In Data Management
The world of file content and NAS storage is disjointed and fraught with inefficiencies, yet at the same time it holds more potential than it ever has before. To understand it better we need to unravel the mess (that takes more than one article, but this is a good place to start).
File Sprawl
The first big problem with file content is, quite candidly, users. People create, copy, convert, forward, edit, scan and download files every day. It is the Wild West, with few controls or restrictions. I remember one customer discovering they had 125 copies of a scanned Chinese menu on their Tier One storage system.
When you consider hundreds, thousands or tens of thousands of individuals creating all of this content, you can do the math and see how easily file sprawl becomes a real and pervasive problem. A growing number of customers have hundreds of terabytes, and even petabytes, of file storage. In many cases IT professionals have no idea how much file content they have, what that content is worth, how much it is costing them, where it is stored or how it is protected.
Not only are we creating a ton of files, but many of them contain images, video and audio. We are therefore creating lots of big files, and they consume expensive and hard-to-manage IT infrastructure.
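To make the Chinese menu story concrete, here is a minimal sketch of how one might look for byte-identical copies under a directory tree. It is an illustration rather than a product: the function names (find_duplicates, wasted_bytes) are hypothetical, it only catches exact copies (not rescans or converted versions of the same document), and hashing every candidate file would be slow on a real multi-terabyte share.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Map (size, digest) -> sorted paths, keeping only content with 2+ copies."""
    # First pass: group by size, which is cheap and rules out most files.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                by_size[os.path.getsize(path)].append(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it

    # Second pass: hash only the files that share a size with another file.
    by_digest = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue
        for path in paths:
            try:
                by_digest[(size, file_digest(path))].append(path)
            except OSError:
                continue
    return {key: sorted(p) for key, p in by_digest.items() if len(p) > 1}


def wasted_bytes(duplicates):
    """Bytes that would be freed if only one copy of each file were kept."""
    return sum(size * (len(paths) - 1) for (size, _digest), paths in duplicates.items())
```

Calling find_duplicates on a share's mount point and passing the result to wasted_bytes gives a rough first number for how much capacity exact copies are consuming.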
NASty
Which brings me to the next big problem with file content: how we store it. Much of file content is stored on NAS storage systems, and although there is great value in these solutions, they also create problems for IT professionals. For one, only a few vendors provide NAS solutions, so customers have a limited number of options to choose from. Clearly, having more viable solutions in the market would foster more competition, cost effectiveness and innovation.
I’ve been talking to big NAS shops and one of their biggest challenges is NAS migration. Customers with hundreds of terabytes or petabytes of NAS file content feel they are essentially tethered to specific NAS devices because the complexity of moving that data is perceived as insurmountable, or at least far more trouble than it’s worth. One customer said he felt that he was being perpetually held for ransom by his NAS storage.
Unstructured = Useless
We often refer to files as unstructured data. Because unstructured content by its very nature lacks structure, it is hard for IT professionals to make any use of file data. However, we don’t dare delete the majority of it because we might need it someday, and for most companies and organizations the cost of risk is less than the cost of capital.
Interestingly, industry studies have found that 60-80% of unstructured content is never used again once ninety days have passed since its creation, which really makes unstructured content synonymous with “useless” content. It costs so much to store and protect file content, but then we never actually use it. Is that because the content really has no sustainable value, or because we just don’t have the tools to easily and effectively make use of it?
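If you want to see where your own file shares fall relative to that 60-80% figure, a rough sketch like the one below can help. It measures something slightly different from the studies: the share of files and bytes not read or modified in the last 90 days, based on file timestamps. The function name stale_report is hypothetical, and access times are only an approximation, since many systems mount file systems with options that limit atime updates.

```python
import os
import stat
import time

SECONDS_PER_DAY = 86400


def stale_report(root, days=90, now=None):
    """Count files and bytes under root not accessed or modified in `days` days."""
    now = time.time() if now is None else now
    cutoff = now - days * SECONDS_PER_DAY
    total_files = total_bytes = stale_files = stale_bytes = 0

    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path, follow_symlinks=False)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks, sockets, devices
            # Use the later of access and modification time as "last used".
            last_used = max(st.st_atime, st.st_mtime)
            total_files += 1
            total_bytes += st.st_size
            if last_used < cutoff:
                stale_files += 1
                stale_bytes += st.st_size

    return {
        "total_files": total_files,
        "total_bytes": total_bytes,
        "stale_files": stale_files,
        "stale_bytes": stale_bytes,
        "stale_byte_ratio": stale_bytes / total_bytes if total_bytes else 0.0,
    }
```

The stale_byte_ratio field is the number to compare against the studies, keeping in mind that it tells you what has gone untouched, not whether the content still has value.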
The Backup and Recovery Conundrum
I believe the biggest challenge in a petabyte world is backup. Consider the new landscape, with hundreds of terabytes and petabytes of file content stored on multiple storage systems. Now ask, how do you protect all of this file content? And then think about how much that protection is going to cost you in money, time and resources. Legacy methods and the status quo are insufficient to meet today’s requirements. This means either a new method of file protection is required or you take the risk of not being able to recover data. The latter choice is a hard one to make, especially when you consider that the consequences could permanently damage your business and result in executives being held personally accountable. I consider this to be one of the biggest issues in the data center for the decade.
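A quick back-of-the-envelope calculation shows why the time cost alone is daunting. The sketch below assumes a full backup that streams every byte at a constant sustained rate; the 1 GB/s figure is an assumption chosen purely for illustration, not a measurement from any particular system, and the function name is hypothetical.

```python
PB = 10**15  # bytes in a (decimal) petabyte
GB = 10**9   # bytes in a (decimal) gigabyte


def full_backup_hours(data_bytes, throughput_bytes_per_sec):
    """Hours needed to stream data_bytes at a constant sustained throughput."""
    return data_bytes / throughput_bytes_per_sec / 3600


# Assumed for illustration: 1 PB of file content, 1 GB/s sustained end to end.
hours = full_backup_hours(1 * PB, 1 * GB)
print(f"{hours:.0f} hours, about {hours / 24:.1f} days")
# prints "278 hours, about 11.6 days"
```

Even under that generous assumption, a single full pass over one petabyte takes well over a week, which does not fit a nightly or weekend backup window.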
For the longest time we’ve been able to continue doing business as usual and solve the problem by throwing more IT infrastructure and people at it. We are now at an inflection point where we can no longer be complacent about the status quo. Managing massive file stores is one of the “big” problems in the data center for the decade, and IT professionals need to sound the alarm and make this a real priority.

