Thanks to hardware commoditization and the technology’s socio-economic impact, High Performance Computing is seeing growing interest in the Indian subcontinent. By Harshal Kallyanpur
Traditionally, HPC systems have been towers of servers with hundreds of processors and large amounts of memory and storage. Numerous such systems would be stacked and interconnected to form a supercomputer crunching huge amounts of data every second. Most of these systems relied on parallel processing software and applications that tied all of the individual machines together into a single supercomputer.
However, such large systems took up a lot of space and had huge power and cooling implications. Moreover, given the complex nature of the hardware and the corresponding costs, only specific industry verticals, institutes and organizations could afford to deploy them.
Early supercomputers and HPC systems had proprietary architectures and required a significant amount of code optimization or tuning to exploit the full feature set of those architectures. Software developers had to re-instrument their applications and insert compiler directives, each specific to a different vendor, which was a painstaking and tedious effort.
Evolution of HPC
The late 1990s saw Commercial Off-The-Shelf (COTS) hardware gaining popularity among organizations for building supercomputers, as it offered a cheaper alternative to the traditional, expensive, energy-guzzling machines. The performance deficit could be addressed by adding more processors to the system.
“COTS hardware offered compelling price-performance benefits of at least ten times when compared to RISC systems and over a hundred times that of vector-based systems. The vector-based systems continue to be available although their applicability is limited to a niche set of customers and specialized applications today,” said Ramanan.
Realizing the benefits that this architecture had to offer, software developers too moved away from vector systems and began offering applications based on parallel processing.
“The advent of MPI-based parallel processing interfaces and domain decomposition-based schemes helped accelerate this effort. Gradually, the software environment helped drive innovation in massively parallel algorithms and applications. The vector functionality of the earlier supercomputers started finding its way into COTS clusters over time,” he added.
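As a rough illustration of the MPI-based domain decomposition described above, consider the following C sketch (a minimal example, not taken from any vendor's code). The problem domain, here the numerical integration used to estimate pi, is split across the ranks of a cluster; each rank computes its own slice and MPI_Reduce combines the partial results on rank 0.

/* Minimal MPI domain-decomposition sketch: estimate pi by splitting the
 * integration interval across ranks and combining partial sums.
 * Build: mpicc pi.c -o pi    Run: mpirun -np 4 ./pi */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    const long n = 10000000;            /* total integration steps */
    int rank, size;
    double h, local_sum = 0.0, pi = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    h = 1.0 / (double)n;
    /* Domain decomposition: rank r handles steps r, r+size, r+2*size, ... */
    for (long i = rank; i < n; i += size) {
        double x = h * ((double)i + 0.5);
        local_sum += 4.0 / (1.0 + x * x);
    }
    local_sum *= h;

    /* Combine the partial results from all ranks onto rank 0 */
    MPI_Reduce(&local_sum, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("pi is approximately %.12f\n", pi);

    MPI_Finalize();
    return 0;
}

The same pattern, splitting a grid, mesh or dataset into sub-domains that individual nodes work on independently, underlies most of the massively parallel applications that moved from vector machines to COTS clusters.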
Commoditization of hardware
The use of x86-based systems for supercomputing accelerated the adoption of cluster computing. Today, HPC is seeing increased adoption both among traditional users of the technology and in other industry verticals that had always felt the need for such systems but could not afford to deploy them.
Tripathy explained that, in the early 2000s, HPC clusters were measured in terms of the number of nodes, and processing speed was not the primary measure of performance. In that era, Wipro launched its Supernova offering in the HPCC space. Based on the principles of Open Computing, its promise was affordable supercomputing for anywhere from eight to several thousand nodes.
The first cluster designed, developed and deployed by Wipro was a 36-node cluster with a maximum throughput of 1 Tflop; today a single node can deliver similar performance. HPCC deployments have since come a long way in India. Adoption has grown and even leading technology institutes now offer courses on HPC to their students.
The early adoption of commodity hardware for supercomputing was also aided by interconnect technologies such as InfiniBand, which allowed many processor-based systems to be connected to one another over a high-performance, low-latency fabric.
Today, cheaper connectivity options such as Fibre Channel over Ethernet (FCoE) give HPC users ways to bring down the overall hardware cost. However, given the level of performance and the low latency that InfiniBand offers, it continues to be the fabric of choice for interconnects in HPC environments.
“Infiniband continues to be the best-in-class choice for HPC environments, owing to the low latency performance that it offers. We believe that it will continue to grow as a preferred mode of interconnect for the next few years,” said Vikram.
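The latency sensitivity behind this preference is easy to see with a simple ping-pong microbenchmark. The C sketch below (an illustration under the same MPI assumptions as the earlier example, not a vendor benchmark) bounces a one-byte message between two ranks and reports the average one-way latency, the figure on which fabrics such as InfiniBand compete.

/* Ping-pong latency sketch: ranks 0 and 1 bounce a small message back and
 * forth; half the average round-trip time approximates the interconnect's
 * one-way latency. Run with: mpirun -np 2 ./pingpong */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    const int iters = 10000;
    char byte = 0;
    int rank;
    double t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();
    if (rank == 0)
        printf("average one-way latency: %.2f microseconds\n",
               (t1 - t0) * 1e6 / (2.0 * iters));

    MPI_Finalize();
    return 0;
}

Run across two nodes, such a test typically reports latencies of a few microseconds on an InfiniBand fabric and noticeably higher figures on commodity Ethernet, which is why latency-bound HPC codes continue to favor the former.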
Natarajan was also of the view that the cost of deploying HPC had come down further, with users going beyond traditional HPC applications and adopting open source HPC software on a large scale.
Surging adoption
Traditionally, HPC adoption was limited to government, scientific research organizations, weather departments and organizations working in fields such as aerospace, materials science, defense, oil and gas, oceanography, seismology, physics and computational fluid dynamics. Over the years, however, the need for and interest in HPC has grown, fueled by the fact that hardware costs have gone down while performance has gone up, leading new industry verticals to look at HPC.
Natural disaster management institutes in India today use HPC for tornado, avalanche, tsunami and hurricane prediction. HPC has found adoption in bioinformatics, where gene sequencing is used to detect diseases and study their long-term implications for society, and it has also found application in drug discovery.
HPC today is also used for the design and development of industrial, aeronautical, automotive and defense technologies.
Misra described applications of HPC in agriculture, where it is used to determine the quality of seeds and produce. It is also used to detect nitrogen levels in soil in order to design fertilizers that are eco-friendly yet productive. Long-standing problems such as climate change and pollution levels can be studied, and predictive analysis performed, to determine their effects on farming.
“HPC can help simulate climate changes for the next hundred years or study the impact of global warming for the next thirty. It can also help in flood monitoring. The technology has taken a socioeconomic perspective rather than just scientific research as it is used to study the impact of various factors on society, people and environment and helps arrive at solutions to today and tomorrow’s issues,” said Misra.
C-DAC, a pioneer of supercomputing in India with its PARAM supercomputer, provides HPC services to other industry verticals too. Through its CHReME portal, the institute has provided access to its HPC systems to the National Centre for Medium Range Weather Forecasting for weather modeling. Similarly, the North Eastern Hill University in Shillong, the National Botanical Research Institute (NBRI) and the National Institute of Oceanography have been using C-DAC's supercomputing capabilities for their HPC requirements.
In the late 2000s, Wipro launched Wipro Supernova, its supercomputing practice for the Indian market. Through this practice, the company has brought its expertise and experience to organizations that wanted the power of HPC clusters but lacked the resources to build and run them. Wipro's Supernova installed base in India ranges from small four-node clusters to the supercomputer at the Vikram Sarabhai Space Centre (VSSC).
According to Vikram of HP, 10 of the 21 largest HPC sites in India employ HP hardware, Tata CRL's supercomputer being one of them.
IBM has found good success in the life sciences field with a company working in genomics. It helped this company create a 13-cluster HPC environment that analyzes DNA samples collected by the client from various hospitals, identifying possible DNA combinations in order to study diseases and find cures. IBM also has customers that use HPC to deliver and manage Web services.
HPC has found adoption in manufacturing, education, financial services, and media and entertainment. Most HPC vendors believe that media and entertainment will be a key adopter of these solutions. In the last few years, several global motion picture and animation production houses have set up shop in India. Movies made in the country today feature high-definition visual effects, and a lot of post-production work on films is now being done out of India. The animation industry has also seen good growth in the country, another factor that will fuel the adoption of HPC in India.
Natarajan said, “The media and entertainment industry would typically buy many low-end servers to address their rendering requirements. Today, they are adopting cluster computing for their computational requirements.”
In education
Talking about the growth of HPC in the education sector, Natarajan said, “A few years ago, HPC meant that people had to go with large expensive clusters. With the entry barrier to HPC removed, engineering colleges are building small clusters and they have made HPC a part of their curriculum.”
Mirroring this opinion, Tripathy of Wipro said, “Educational institutes starting from the IITs to engineering colleges are investing in smaller supercomputing capabilities for their students. The sole aim is to orient young minds to leverage these technologies and build capabilities in this area and create a pool of design and research brains in the country.” Based on its Supernova practice, Wipro has launched the Supernova Lab for educational institutes.
C-DAC, on the other hand, has introduced learning programs for students from government and private engineering colleges. It teaches the fundamentals, including parallel programming environments, and gives students hands-on experience with a host of software such as debuggers, compilers and third-party tools that run on HPC hardware. While two-thirds of the course addresses the basics of HPC, it also covers system administration concepts and software development, along with practicals in an HPC environment.
“The course is designed to provide the necessary skill sets to professionals who would go on to either write applications for HPC environments or run and maintain these environments. It is aimed at identifying and creating an industry standard for students to be HPC ready and to help them apply for HPC and allied jobs,” said Misra.
Misra added that C-DAC had helped create four-, eight- and 16-node cluster environments for chemical, mechanical and electronics engineering colleges. The institution is also organizing national symposia to increase HPC awareness in each state, inviting principals and heads of departments of colleges to participate in the sessions.
Virtualization, Cloud & Big Data
Most vendors believe that virtualization is an inhibitor for HPC. Conceptually, virtualization is used to increase utilization, whereas HPC is concerned more with performance than with utilization. In a virtualized world, a virtual machine is the lowest unit of compute; in an HPC environment, it is a cluster node. Meeting additional compute requirements in HPC terms therefore means adding another node, whereas in a virtualized environment it means firing up another virtual machine or allocating more resources to an existing one on the same physical server.
Organizations could look at an HPC cloud for their requirements, with an HPC environment hosted remotely in cases where they cannot afford their own or face space constraints. However, as most HPC environments deal with sensitive data, organizations will have apprehensions about securing it. They will also be worried about latency in cases where instructions flow over the wire between remote locations.
The emerging trend of Big Data could well turn out to be a major growth driver for HPC. Web services and social media today generate large amounts of data that are becoming difficult to mine and analyze. The same holds true for traditional HPC-intensive activities such as oil exploration, weather simulation and molecular modeling, which also generate enormous volumes of data. The parallel processing and parallel database capabilities that HPC offers will prove essential for solving many Big Data problems.
The way forward
Easy-to-procure systems mean that many organizations today can look at creating HPC environments. Today's multi-core server hardware can take a compute cluster up to petaflops of performance. However, most HPC experts felt that the focus would be on the applications deployed on these clusters.
Vikram of HP and Natarajan of IBM were both of the view that, with the commoditization of hardware, anyone could buy servers and claim to have created a compute cluster. It is how well the cluster nodes integrate with one another, and the HPC application stack running on the cluster, that define it as an HPC environment.
Misra had a similar opinion, but said that the challenge was around data rather than compute. He added, “Today, there is a huge amount of data generated, which needs to be mined and analyzed. The focus therefore should be on how to get this huge amount of data, process it and store it. There is a need for a very good parallel file system.”
“Today there is talk of exaflops of processing that will require exabytes of storage. So much data can only be fetched and stored efficiently with a parallel file system. While there are solutions available in the market, they still need to reach a certain level of maturity in handling such huge amounts of data,” concluded Misra.
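The kind of parallel I/O Misra refers to is what parallel file systems, and interfaces such as MPI-IO that sit on top of them, are built for. The C sketch below (illustrative only; the file name output.dat is a placeholder) has every rank write its own contiguous slice of a single shared file in one collective call, an access pattern a parallel file system can stripe across many storage servers at once.

/* Parallel I/O sketch using MPI-IO: each rank writes its own slice of a
 * shared file with a collective call; a parallel file system can serve
 * all of those writes concurrently. */
#include <mpi.h>
#include <stdio.h>

#define LOCAL_COUNT 1024                /* doubles written by each rank */

int main(int argc, char **argv)
{
    int rank;
    double buf[LOCAL_COUNT];
    MPI_File fh;
    MPI_Offset offset;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (int i = 0; i < LOCAL_COUNT; i++)   /* fill this rank's slice */
        buf[i] = rank * LOCAL_COUNT + i;

    /* Each rank's slice starts at a different byte offset in the shared file */
    offset = (MPI_Offset)rank * LOCAL_COUNT * sizeof(double);

    MPI_File_open(MPI_COMM_WORLD, "output.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_File_write_at_all(fh, offset, buf, LOCAL_COUNT, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    if (rank == 0)
        printf("wrote shared file output.dat\n");

    MPI_Finalize();
    return 0;
}

Whether such collective writes scale to the data volumes Misra describes depends on the maturity of the underlying parallel file system, which is precisely the gap he points to.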
Vendors also feel that with compute density increasing, the focus is shifting to providing systems that not only deliver high performance but do so in an energy-efficient manner.
harshal.kallyanpur@expressindia.com