Tuesday, August 17, 2010

Closing the Server-Storage Virtualization Gap

Now that hypervisors are mature, management tools, monitoring capabilities and standards are evolving. Resources in the cloud are more dynamic, multi-tenant and large-scale. However, full data center virtualization and the cloud cannot be complete without virtualizing the storage layer. Storage virtualization demands an entirely new software-based approach.

Server virtualization technologies have advanced rapidly, with VMware (NYSE: VMW) and Citrix (Nasdaq: CTXS) (Xen) initially leading the way. They are now being joined by Red Hat (NYSE: RHT), which is making significant strategic investments.

Unfortunately, the storage side of the equation has lagged behind. Several trends, such as the explosion of unstructured data and the emergence of cloud computing, have shone a spotlight on the gap and woken many to the realization that it is holding the industry back from achieving a fully virtualized data center. The Linux kernel is proving to be a better hypervisor than even a microkernel-based VMware implementation, having borrowed powerful ideas from microkernel design early in its development.

This article will discuss the current state of GNU/Linux virtualization and provide best practices, focused on storage, aimed at closing the server-storage virtualization gap.

Read more from LinuxInsider: Closing the Server-Storage Virtualization Gap

Tuesday, July 13, 2010

Simple "diff-dir" script to compare two directories

A simple script to compare two volumes/directories efficiently: it compares entries by relative path and size rather than reading file contents. Save this script as "diff-dir" and run "chmod +x diff-dir". You can then execute it as "./diff-dir DIR1 DIR2".

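#! /bin/bash
## Author: Anand Babu Periasamy ab[at]gnu.org.in
## License: GNU GPL v3 or later
## usage: diff-dir DIR1 DIR2

if [ $# -ne 2 ]; then
    echo "usage: diff-dir DIR1 DIR2" >&2
    exit 1
fi

DIR1=$1
DIR2=$2
# mktemp replaces the deprecated tempfile. The directory names are kept out
# of the temp file names because they may contain slashes.
DIR1_OUT=$(mktemp) || exit 1
DIR2_OUT=$(mktemp) || exit 1
# Remove the temporary listings on exit, even if the script is interrupted.
trap 'rm -f "$DIR1_OUT" "$DIR2_OUT"' EXIT

# List every entry as "relative-path size" (requires GNU find for -printf).
find "$DIR1" -printf "%P %s\n" | sort > "$DIR1_OUT"
find "$DIR2" -printf "%P %s\n" | sort > "$DIR2_OUT"

# DIR2's listing is the left-hand side: "<" lines belong to DIR2, ">" lines to DIR1.
diff "$DIR2_OUT" "$DIR1_OUT"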
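
The listing-and-diff approach used by the script can be tried on throwaway data before pointing it at real volumes. The following is only a minimal sketch, assuming bash and GNU find; the helper name list_tree, the WORK directory and the file names are made up for this illustration and are not part of the original script. It also restricts the listing to regular files, which the script above does not do.

```shell
#!/bin/bash
# Hypothetical demo: build two small trees that differ in one file size
# and one missing file, then compare their "path size" listings.
set -u

WORK=$(mktemp -d) || exit 1
trap 'rm -rf "$WORK"' EXIT

mkdir -p "$WORK/a/sub" "$WORK/b/sub"
printf 'hello\n'       > "$WORK/a/same.txt"
printf 'hello\n'       > "$WORK/b/same.txt"
printf 'short\n'       > "$WORK/a/sub/changed.txt"
printf 'much longer\n' > "$WORK/b/sub/changed.txt"
printf 'only in a\n'   > "$WORK/a/extra.txt"

# Same listing format as the script: relative path, then size in bytes.
# -type f is an assumption of this demo; it limits the listing to regular files.
list_tree() {
    find "$1" -type f -printf "%P %s\n" | sort
}

# Process substitution avoids temporary files; this needs bash, not plain sh.
# As in the script, the second directory is the left-hand side of the diff.
RESULT=$(diff <(list_tree "$WORK/b") <(list_tree "$WORK/a"))
echo "$RESULT"
```

In the output, lines starting with ">" come from directory "a" and lines starting with "<" from directory "b". The file "same.txt" does not appear because its path and size match in both trees, even though its content was never read.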

Saturday, May 15, 2010

The Future of Cloud Storage is NAS

In the initial wave of server virtualization, the focus was primarily on consolidation and efficiency. The first applications to move to virtual machines (VMs) were either in development environments or lightweight workloads, and were mostly static and not very I/O intensive, such as DHCP, DNS and Active Directory. Servers were primarily using block-based Storage Area Networks (SAN) for network disk storage, and that initially did not change.

As the use of server virtualization increased, it began to strain the supporting SAN infrastructure. Every VM required a dedicated Logical Unit Number (LUN) to be provisioned, and as the number of VMs exploded, the growing number of individual LUNs created management and scalability issues. VMware developed a file system (VMFS) specifically to address the issues with SAN storage for virtual machines, enabling the creation of larger (but still limited) LUNs with the ability to store data from multiple VMs. This is an early example of storage virtualization and highlights the challenge that continues today of storage playing catch-up with virtualization advances on the server side.

As server virtualization technology has matured, we have now reached a state where enterprise applications are running in VMs in an extremely dynamic manner. Along with continued management complexity, addressing I/O bottlenecks and containing the cost of storage to accompany virtual servers have emerged as the top issues. Network Attached Storage (NAS) and iSCSI are now viable solutions for VM storage. iSCSI offers ease of use and cost advantages over traditional Fibre Channel (FC) SAN, but suffers from many of the same issues. NAS offers similar cost and simplicity advantages, but its inherent scalability and sharing capabilities position NAS to be the storage platform of choice for the cloud. Data center managers are now moving full speed to virtualize everything and adopt the cloud model, which makes it even more important to explore how NAS can efficiently enable cloud storage.

Limitations of SAN
Scalability Limitations: SAN is block storage; however, a scalable file system is needed to manage the data within the SAN. With VMware, VMFS allows virtual disks to be stored as files. Maximum volume size is 64TB with virtual disks up to 2TB each, although I/O will most likely be a bottleneck before these limits are reached. When it comes to application-generated data, the situation is worse. Without a global namespace, SAN has no easy solution to manage the hundreds of terabytes, or even petabytes, of application data created by cloud-based applications. Cloud computing that relies on SAN will stall for lack of a scalable, high-performance storage architecture.

Read the complete article...