Information, Meet Lifecycle
As the amount of data continues to grow rapidly and as compliance needs get more complex, an intelligent, lifecycle-based storage strategy makes perfect sense.
By Jasmine Desai
Data has undoubtedly become one of the most valuable resources for enterprises. From security and compliance perspectives, it has become increasingly important to manage this data, as businesses face compliance issues in the wake of legislation such as HIPAA and the Sarbanes-Oxley Act. Data and storage go hand in hand, and as the nature of data changes, storage has to keep up.
Storage hurdles
The challenge many organizations are facing today is how to tame and analyze the data they have, as it is growing at an astounding rate. Observes S. Sridhar, Director, Enterprise Solutions Business, Dell India, “With increase in data volume and stringent government regulations, storage costs are mounting. This is forcing companies to rethink their existing approach to data storage.”
High-end storage comes at a high premium, and using top-shelf storage for all data assets is no longer cost-efficient. This is driving organizations to look at different approaches to data storage, such as intelligence-based automated tiering, where data management policies drive where the data is stored. Some organizations are also looking at cloud-based storage services that provide storage on demand.
Organizations tend to forget a very important aspect of data: its lifecycle. Documents have a lifecycle and may not be required after three, five or seven years, at which point they can be archived on cheaper storage or destroyed, reducing storage by as much as 40-60%. As per Venkat Iyer, Director – BIM India Lead, Capgemini, “Modern organizations face several data storage issues. The most prominent one is migration of documents and data from one system to another. It must be made easier and simpler.” Thus, documents should have a predetermined lifecycle, but this is difficult to manage with existing software. Moreover, managing very large mass stores requires sophisticated storage management software that supports migration, duplication, remote copy, capacity management, document lifecycle management, and so on.
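To make the idea concrete, here is a minimal Python sketch of how such a retention policy might be expressed in code. The document classes and year thresholds below are illustrative assumptions, not drawn from any particular regulation or product:

```python
from datetime import date

# Hypothetical retention rules; classes and thresholds are illustrative.
RETENTION_RULES = {
    "financial": {"archive_after": 3, "destroy_after": 7},
    "hr":        {"archive_after": 5, "destroy_after": 7},
    "general":   {"archive_after": 3, "destroy_after": 5},
}

def lifecycle_action(doc_class, created, today=None):
    """Return 'keep', 'archive', or 'destroy' for a document."""
    today = today or date.today()
    rule = RETENTION_RULES.get(doc_class, RETENTION_RULES["general"])
    age_years = (today - created).days / 365.25
    if age_years >= rule["destroy_after"]:
        return "destroy"   # past retention; reclaim the storage entirely
    if age_years >= rule["archive_after"]:
        return "archive"   # move to cheaper storage (tape, optical, cloud)
    return "keep"          # still active; stays on primary storage

# A six-year-old general document is past its five-year retention window.
print(lifecycle_action("general", date(2007, 6, 1), today=date(2013, 6, 1)))  # destroy
```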
As per Ashok Saxena, Head, Engineering Center, Kronos India, “The primary challenges involve operational costs, performance, and reliability. Customers also want more choice in vendor hardware. Organizations also face a challenge when it comes to extending the life of existing storage assets, consolidating multiple different storage arrays, and integrating them into common solutions that effectively apply ILM (Information Lifecycle Management) principles.” Apart from these, providing better service delivery to internal customers and line-of-business applications as demand grows is also a challenge.
According to Aman Munglani, Research Director, Gartner, “Capacity growth is something that organizations continue to grapple with. In the Indian market, there has been 60-70% capacity growth on a Y-o-Y basis. Also, how do you tackle the issue of complexity in your infrastructure? Utilization rates for infrastructure in India are still fairly low, so how do you increase that? ILM is not a technology-oriented approach.”
The right storage architecture
Getting the right storage architecture to manage the complexity of data would be ideal, but there is no single right fit for all. To begin with, it is essential to evaluate the need for storage and the kind of storage best suited to the organization. Says Sridhar of Dell, “Firstly, one must take into account the amount of data, the value behind the data, the cost, the ROI, etc., before implementing any kind of data storage architecture. Every customer is different and their needs differ according to the business the organization is in.” Thus, a storage strategy must embrace both the technology considerations and the business requirements of the company.
As per Iyer of Capgemini, “We recommend Hierarchical Storage Management (HSM) for information lifecycle management. HSM is a data storage technique which automatically moves data between high-cost and low-cost storage media. HSM systems exist because high-speed storage devices, such as hard disk drive arrays, are more expensive (per byte stored) than slower devices, such as optical discs and magnetic tape drives.”
While it would be ideal to have all data available on high-speed devices all the time, this is prohibitively expensive for many organizations. Instead, HSM systems store the bulk of the enterprise’s data on slower devices, and then copy data to faster disk drives when needed. In effect, HSM turns the fast disk drives into caches for the slower mass storage devices. The HSM system monitors the way data is used and makes best guesses as to which data can safely be moved to slower devices and which data should stay on the fast devices.
In a typical HSM scenario, data files that are frequently used are stored on disk drives but are eventually migrated to tape if they are not used for a certain period of time, typically a few months. If a user reuses a file which is on tape, it is automatically moved back to disk storage. The advantage is that the total amount of stored data can be much larger than the capacity of the available disk storage; since only rarely used files sit on tape, most users will not notice any slowdown. HSM is sometimes referred to as tiered storage.
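A toy model of this migrate-and-recall behavior might look like the following Python sketch. The “disk” and “tape” tier names and the three-month threshold are assumptions for illustration; real HSM products implement this transparently inside the filesystem:

```python
import time

MIGRATE_AFTER = 90 * 24 * 3600   # ~3 months of inactivity, in seconds

class HsmFile:
    """A file that lives on fast 'disk' until it goes cold, then on 'tape'."""
    def __init__(self, name):
        self.name = name
        self.tier = "disk"             # new files start on the fast tier
        self.last_access = time.time()

    def read(self):
        if self.tier == "tape":
            self.tier = "disk"         # transparent recall back to disk
        self.last_access = time.time()
        return f"<contents of {self.name}>"

def migrate_cold_files(files, now=None):
    """Demote files untouched for MIGRATE_AFTER seconds from disk to tape."""
    now = now or time.time()
    for f in files:
        if f.tier == "disk" and now - f.last_access > MIGRATE_AFTER:
            f.tier = "tape"            # disk capacity is reclaimed
```

The key property, as the article notes, is that recall happens on access: the user simply reads the file and the system stages it back to disk behind the scenes.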
Recently, the development of Serial ATA (SATA) disks has created a significant market for three-stage HSM: files are migrated from high-performance Fibre Channel Storage Area Network (SAN) devices to somewhat slower but much cheaper SATA disk arrays of several terabytes or more, and then eventually from the SATA disks to tape. The newest developments in HSM involve hard disk drives and flash memory. In practice, HSM is typically performed by dedicated software, such as IBM Tivoli Storage Manager, Oracle’s SAM-QFS, SGI Data Migration Facility (DMF), Quantum StorNext, or EMC DiskXtender.
Mentions Saxena of Kronos, “This is one of the biggest challenges for any organization, especially when we are looking at globalization. Given the variables for ILM, there is a focus on using tiered storage solutions. This is a networked storage method where data is stored on various types of media based on performance, availability and recovery requirements.” For example, data intended for restoration in the event of data loss or corruption could be stored locally for fast recovery, while data retained for regulatory purposes could be archived to lower-cost disks.
In HSM, one can use Tier 1-3 storage with a combination of Fibre Channel and near-line drives to optimize cost, as the sketch below illustrates. Such a tiered architecture can also help in complying with various security frameworks. To make such a system operationally manageable, tools are available that integrate the different infrastructure components and give enterprises the ability to synchronize various locations over the network. A tiered architecture like this goes beyond cost savings and dramatically reduces backup and storage time.
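As a rough illustration, the mapping from requirements to tiers that Saxena describes could be expressed as follows. The tier numbering and the criteria are assumptions made for the example, not any vendor’s actual classification:

```python
TIERS = {
    1: "Fibre Channel SAN - fast restore, high IOPS",
    2: "Near-line SATA arrays - bulk capacity, cheaper per TB",
    3: "Tape archive - lowest cost, regulatory retention",
}

def assign_tier(needs_fast_recovery, regulatory_only):
    """Map recovery and compliance requirements to a storage tier."""
    if needs_fast_recovery:
        return 1   # e.g. data needed to restore operations after a loss
    if regulatory_only:
        return 3   # rarely read; kept only to satisfy retention rules
    return 2       # everything else lands on mid-tier disk

# Regulatory-only content goes straight to the cheapest tier.
print(TIERS[assign_tier(needs_fast_recovery=False, regulatory_only=True)])
```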
A Forrester research paper, Top 10 Storage Predictions for I&O Professionals, notes that for years many firms have simply chosen the “highest common denominator” as their single tier of storage — in other words, if some data needed top-tier block storage, then in many cases this was the only flavor deployed. As data volumes have grown over the years, the penalty for such a simple environment has grown, since much of the data does not really need top-tier storage. Additionally, the requirements of specific workloads vary significantly, so a single flavor is often not well suited to large portions of the data being stored.
Getting around ILM
According to Saxena of Kronos, “A successful ILM program hinges on development and adoption of a framework to manage information. To verify the various relevant components of the framework one needs: timely identification and management of risk, regulatory and compliance requirements; identification of opportunities for growth; and identification of issue/problem areas.” Thus, by integrating best practices and industry standards, one can create a program that meets all functional and strategic needs.
According to the research firm Ovum, the selection of technologies to manage content for retention is no longer a straightforward choice, since the deployment methodology also now has an important influence. Cloud computing is gaining traction, with Ovum research showing that 33% of companies surveyed are currently using software as a service (SaaS) for at least one application, with the total rising to 50% for companies intending to have SaaS in six months’ time. The number of companies with a private cloud is currently running at 27%, with 64% intending to deploy a private cloud within the next six months. Most vendors provide cloud solutions, and are pushing this form of deployment. While some companies are reluctant to trust sensitive content such as company and customer records to the cloud, with the appropriate security safeguards in place and the right service provider it can be as secure as an on-premises solution.
Munglani of Gartner says, “Organizations that are looking at ILM have the maturity to know that it is not tiered storage. There is a need for education, by and large, on what ILM is. Since 2004-05, ILM has seen a revision of its fortunes, so a lot of vendors also know that ILM is the first strategy to deal with capacity.” There is certainly no mad rush of organizations adopting ILM as of now; those with massive volumes of complex data are the ones looking at it.
Niraj Kapasi, IT Auditor and Chair of ISACA’s India Task Force, says, “There is a growing raft of regulations and legislation dictating the retention of certain types of content and data. There are also new types of content that must be retained, such as elements of social media. This is adding to the complexities faced by organizations in deciding what needs to be retained and what can safely be deleted.”
Such complexity creates manageability problems, extending to backups and disaster recovery provision. By moving much of the content out of live systems into archives, organizations are simplifying content management by reducing the volumes that need to be managed on a daily basis, while ensuring that the content can still be accessed when required. The long-term preservation of information remains an issue for many organizations, in the private as well as the public sector.
The basic tenets of the ILM framework, such as automated tiering, thin provisioning, de-duplication and compression, are already mature technologies in the Indian market. The ILM tools market will definitely see growth over the next couple of years, with increasing regulation a key contributor. The only brake on that growth could be a lack of awareness, or the inability of organizations wanting to adopt ILM to clearly define their strategic storage framework.
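Of these tenets, de-duplication is perhaps the easiest to illustrate. The Python sketch below shows the core idea of content-addressed chunk storage, where identical chunks are kept only once; the fixed 4 KB chunk size is a simplifying assumption, as production systems typically use variable-size chunking:

```python
import hashlib

CHUNK = 4096          # fixed 4 KB chunks; a simplifying assumption
store = {}            # digest -> chunk bytes (each unique chunk kept once)

def write_dedup(data):
    """Split data into chunks, store each unique chunk once, return refs."""
    refs = []
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)   # only new chunks consume space
        refs.append(digest)
    return refs

def read_dedup(refs):
    """Reassemble the original data from its chunk references."""
    return b"".join(store[d] for d in refs)

data = b"A" * 8192 + b"B" * 4096          # two identical 'A' chunks
refs = write_dedup(data)
assert read_dedup(refs) == data
print(f"{len(refs)} chunk refs, but only {len(store)} unique chunks stored")
# -> 3 chunk refs, but only 2 unique chunks stored
```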
What lies ahead
With the needs and demands of a growing industry, globalization, and geographical distribution, awareness of and subsequent demand for ILM will grow. As per Saxena of Kronos, “Legislation regulates how organizations need to deal with particular types of data. Therefore, implementation of ILM solutions is going to grow and become inevitable through 2014.”
Concurs Iyer of Capgemini, “There will be huge demand for ILM solutions in 2013-14 due to the expansive benefits it brings along, including reduction of paper. We see many requirements for records management from government and private sector organizations.”