
SUSE Linux Enterprise 10 is a robust and capable platform equipped to satisfy a myriad of computing roles on the desktop and in the data center. Of its many components, none is more important, more misunderstood or more overlooked than the file system. It plays an integral role in the successful completion of most, if not all, OS processes. This article outlines a select group of file systems, disk formats and storage protocols as they pertain to SUSE Linux Enterprise Server.

We'll also cover the storage capabilities of SUSE Linux Enterprise Server as they pertain to iSCSI. These are integral to Linux as a whole and specifically to SUSE Linux Enterprise Server. If you want more technical information on this topic than we can cover here, check out the vast array of Novell documentation, Wikis, etc. After reading this article you'll better understand the main industry-supported file systems and mass storage as it relates to iSCSI.

> iSCSI
Modern computing has witnessed a proliferation of data that has spawned three core storage philosophies:

  • DAS (direct-attached storage)
  • NAS (network-attached storage)
  • SAN (storage area network)

Each differs in how data is accessed, and its location relative to the network and server. For our purposes, I'll focus on NAS and its relationship with iSCSI.

iSCSI is an end-to-end protocol for transporting storage I/O block data over an IP network, per SNIA (Storage Networking Industry Association) specifications. Its development has closely tracked the evolution of IP infrastructure, particularly the arrival of multi-gigabit Ethernet speeds. For Novell, it became a reality with its introduction in NetWare 6.5.

The association of NAS to iSCSI is logical. First, they are both TCP/IP based. Second, iSCSI is a more natural addendum to NAS than FC (Fibre Channel) is. The primary reason for this is the price per port, which leads to a list of other key advantages. iSCSI:

  • builds on stable and familiar standards.
  • provides a high degree of interoperability by reducing disparate networks and cabling, and by using regular Ethernet switches instead of special Fibre Channel switches.
  • scales to 10 gigabits. (This is comparable to OC-192 SONET (Synchronous Optical Network) rates in Metropolitan Area Networks (MANs) and Wide Area Networks (WANs).)

SUSE Linux Enterprise Server 10 augments these industry advantages via:

  • graphical iSCSI management tools that facilitate easy configuration of both iSCSI initiators on clients and iSCSI targets on servers, and
  • support for iSCSI as both initiator and target.
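On the initiator side, the same setup the graphical tools drive can be sketched from the command line with the open-iscsi tools. The target address and IQN below are purely illustrative; substitute the values for your own target server:

```shell
# Discover targets offered by a (hypothetical) target server at 192.168.1.50
iscsiadm -m discovery -t sendtargets -p 192.168.1.50

# Log in to one discovered target (IQN shown is an example only)
iscsiadm -m node -T iqn.2006-04.com.example:storage.disk1 -p 192.168.1.50 --login

# The target now appears as a local block device
fdisk -l
```

Once logged in, the iSCSI LUN is partitioned, formatted and mounted exactly like directly attached disk.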

Now let's look at security. Because it runs over TCP/IP, iSCSI supports CHAP (Challenge Handshake Authentication Protocol) authentication. Likewise, because it sits above the transport layer, its traffic can be encrypted using IPsec. Unlike its nearest cousin, Fibre Channel, whose primary sources of security are LUN masking and zoning, iSCSI is a flexible, secure and cost-effective storage solution.
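A minimal sketch of CHAP settings in the open-iscsi initiator configuration (the file path is typical but may differ by distribution; the credentials are placeholders):

```
# /etc/iscsi/iscsid.conf (excerpt)
node.session.auth.authmethod = CHAP
node.session.auth.username = initiatoruser
node.session.auth.password = initiatorsecret
```

The target must be configured with matching credentials before the initiator can log in.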

> File System
The most integral component of Linux, or any other enterprise operating system, is the file system itself. One of the most compelling reasons for adopting Linux is its comprehensive list of supported file systems. The number of choices can be daunting, as each possesses certain attributes that enhance its suitability for particular functions. There are 33 documented file systems. Here, we'll focus on only five of them (see Figure 1):

  • FAT
  • Ext2/Ext3
  • ReiserFS
  • XFS
  • OCFS2

I'll outline the attributes of these file systems and explain how to select the appropriate file system. Keep in mind that Novell doesn't endorse one over another. That would be like Ford endorsing Sunoco gasoline over Exxon. Simply, if you get the result you want, you've chosen an appropriate file system.

Lastly, and notably for all the Novell-philes out there: one important file system has been left out, Novell Storage Services (NSS). This is intentional because it is a proprietary file system found in the NetWare kernel and, via Open Enterprise Server with SUSE Linux Enterprise Server 9, in the Linux kernel. Support for Novell Storage Services will be available in the next release of Open Enterprise Server.

> Terminology
The first term you need to understand is metadata, or "data that describes data." For a file, the attributes specifically addressed are ACLs (access control lists), storage location, file size, and date and time stamps. These reference the file without the actual data itself. An easy way to think of it: maps, census reports and weather forecasts describe a city, and change dynamically, without being the city itself.

The second term is inode. An inode contains information about a file such as its size, number of links, and times of creation, modification and access. Think of it as metadata on a more granular level.

The last term used for file system differentiation is journaling. Modern data centers are designed for 99.999 percent availability. With data being measured in terabytes and petabytes, it is essential to engineer the system with as much reliability, redundancy, scalability and ability to 'go back' as possible. Enter the journal. It is an on-disk structure containing a log where the file system stores what it is about to change in the system's metadata.

Table 1: A Comparison of File Systems

File System | Maximum filename length | Allowable characters in directory entries | Maximum pathname length | Maximum file size | Maximum volume size
ext2        | 255 bytes | Any byte except NUL | No limit defined | 16 GiB to 2 TiB | 2 TiB to 32 TiB
ext3        | 255 bytes | Any byte except NUL | No limit defined | 16 GiB to 2 TiB | 2 TiB to 32 TiB
FAT32       | 255 bytes | Any Unicode except NUL | No limit defined | 4 GiB | 512 MiB to 2 TiB
ReiserFS    | 4,032 bytes / 255 characters | Any byte except NUL | No limit defined | 8 TiB | 16 TiB
OCFS2       | 255 bytes | Any byte except NUL | No limit defined | 4 PiB | 4 PiB

* en.wikipedia.org/wiki/Comparison_of_file_systems

Megabytes and Mebibytes: Quantities of Bytes

SI prefixes                                  Binary prefixes (IEC 60027-2)
Name (Symbol)  | Popular usage | Standard SI | Name (Symbol)  | Value
kilobyte (kB)  | 2^10 | 10^3  | kibibyte (KiB) | 2^10
megabyte (MB)  | 2^20 | 10^6  | mebibyte (MiB) | 2^20
gigabyte (GB)  | 2^30 | 10^9  | gibibyte (GiB) | 2^30
terabyte (TB)  | 2^40 | 10^12 | tebibyte (TiB) | 2^40
petabyte (PB)  | 2^50 | 10^15 | pebibyte (PiB) | 2^50
exabyte (EB)   | 2^60 | 10^18 | exbibyte (EiB) | 2^60
zettabyte (ZB) | 2^70 | 10^21 | zebibyte (ZiB) | 2^70
yottabyte (YB) | 2^80 | 10^24 | yobibyte (YiB) | 2^80

A mebibyte (a contraction of mega binary byte) is a unit of information or computer storage, abbreviated MiB.
1 MiB = 2^20 bytes = 1,048,576 bytes = 1,024 kibibytes
The mebibyte is closely related to the megabyte (MB), which can either be a synonym for mebibyte or refer to 10^6 bytes = 1,000,000 bytes, depending on context (see binary prefix for background). The two numbers are relatively close together, but mistaking the two has nevertheless occasionally led to problems.
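The gap between the two units is easy to compute with shell arithmetic:

```shell
# mebibyte (binary prefix) vs. megabyte (SI prefix)
mib=$((1024 * 1024))   # 2^20 bytes
mb=$((1000 * 1000))    # 10^6 bytes
echo "1 MiB = ${mib} bytes"
echo "1 MB  = ${mb} bytes"
echo "difference = $((mib - mb)) bytes"
```

The difference is roughly 4.9 percent per megabyte, which is why disk capacities advertised in SI units always look smaller once an operating system reports them in binary units.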

Likewise, in the case of failure, or any situation that compromises data integrity, only the journal is replayed rather than checking the entire file system record by record. The major benefit is speed of recovery, which clearly depends on factors such as file system size and the extent of the failure.

The first and most rudimentary file system is FAT (File Allocation Table). It is the least feature-rich of all the file systems, which makes it one of the most compatible. Unlike many robust enterprise-class file systems, FAT lacks features such as journaling and access controls. But because it's so ubiquitous, it's often found on removable media such as memory cards and floppy disks.

Specific to Linux, the most widely used file system is Ext3, which represents an evolutionary step from Ext2. Steve Best of Linuxmag.com notes that until recently Ext2 was the de facto file system for Linux because it's robust and predictable; that's why many refer to it as 'rock solid'. Ext3 builds directly on Ext2 and so benefited greatly from its predecessor's maturity; however, with the landscape of data system requirements changing, Ext2 has now largely been replaced by Ext3. At first glance, the two exhibit nearly identical characteristics in a multitude of categories. (See Table 1.) The primary difference is their journaling capabilities.

Ext2 has none. That's not to say it's completely devoid of failure survivability. But when data inconsistency occurs, the entire file system must be analyzed, as opposed to just the modified bits of metadata. Hence, Ext2 is not the best solution for an environment that demands high availability and fast recovery.
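Because Ext3 is just Ext2 plus a journal, an existing Ext2 volume can be converted in place. A sketch, assuming a hypothetical unmounted partition at /dev/sdb1:

```shell
# Add an ext3-compatible journal to an existing ext2 file system
# (/dev/sdb1 is an example device name -- use your own)
umount /dev/sdb1                    # safest to convert while unmounted
tune2fs -j /dev/sdb1                # create the journal
e2fsck -f /dev/sdb1                 # verify the file system is clean
mount -t ext3 /dev/sdb1 /mnt/data   # remount, now with journaling
```

The data and on-disk layout are untouched; only the journal is added, which is why the two file systems look so similar in Table 1.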

Ext3 is considered the default file system for Linux as a whole, as well as for SUSE Linux Enterprise 10. Its behavior is predictable and its performance is snappy. But if large partition sizes and high file counts are destined for this file system, you should consider an alternative.

The ReiserFS file system has proven itself a viable alternative to the Ext2/Ext3 file systems. It excels in utilization, performance and survivability. Survivability is increased by metadata-only journaling, which speeds up recovery of the file system after a failure; unlike in Ext3, this is a nonconfigurable attribute in ReiserFS. In utilization, it supports online resizing through a volume manager. Because of its architecture, the ReiserFS file system is well suited to handling small files (< 4 KB). Hence, mail delivery systems and similar applications migrated from other file systems have reported performance increases of as much as 10 to 15 percent.
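The online resizing mentioned above can be sketched with LVM; the volume group and logical volume names here are hypothetical:

```shell
# Grow a ReiserFS file system on a (hypothetical) LVM logical volume
lvextend -L +2G /dev/vg0/data    # grow the underlying logical volume by 2 GB
resize_reiserfs /dev/vg0/data    # grow the file system to fill the new space
```

Growing can be done while the volume is mounted, which is what makes ReiserFS attractive for servers that can't afford downtime for a resize.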

One of the most capable file systems to date is XFS. In 1994, SGI began developing this robust 64-bit journaling file system. Its mission was to create a file system capable of complementing the storage requirements of the modern data center. XFS's first defining characteristic is its cross-platform affinity with the UNIX operating system, which makes it a likely linchpin in any UNIX-to-Linux migration. In addition to providing a highly capable migration platform, its core strengths are scalability and performance.

When talking scalability, XFS can scale to an 8 exabyte (8 billion GB) partition. It achieves this by logically grouping contiguous disk space into allocation groups, which behave like independent file systems in that each manages its own inodes and free disk space. This method of handling disk space provides the pillars of XFS's scalability. Lastly, the architecture is complemented by modern multiprocessor big iron.
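The allocation groups are chosen at creation time, and the file system can be grown online later. A sketch, assuming a hypothetical /dev/sdb1 partition and /mnt/data mount point:

```shell
# Create an XFS file system with 16 allocation groups
# (more AGs allow more parallel allocation on multiprocessor systems)
mkfs.xfs -d agcount=16 /dev/sdb1
mount /dev/sdb1 /mnt/data

# After enlarging the underlying device, grow the file system online;
# xfs_growfs operates on the mount point, not the device
xfs_growfs /mnt/data
```

Note that XFS can be grown but not shrunk, so it pays to size the initial partition conservatively.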

Alphabet Soup: FC/NAS/SAN/DAS/SCSI

Mass storage is a term that describes the multiple storage methodologies common in enterprises of any size. Commonly referred to as DAS, NAS and SAN, each is also commonly accompanied by the respective technologies of FC and iSCSI.

Enough acronyms. Let's talk about what they really mean.

The most common way to attach storage to a server is by directly connecting it. This is Direct Attached Storage, or DAS. Other computers can't directly access this storage. The architecture is not highly fault tolerant because there is a single point of failure: the computer hosting the storage.

You can hook up storage to a computer in a DAS environment in several ways. For example, you can connect a USB external hard drive to a laptop to create a DAS environment. With an enterprise-class storage array, the most common way to connect the storage is via Small Computer System Interface, or SCSI. In both environments, you create shares to allow disparate systems to access it.

NAS, or Network Attached Storage, is a methodology where a storage appliance is directly attached to the computer network. This removes the association of storage with a specific server or transport protocol. Whereas DAS communication is determined by the attached server, NAS almost eliminates this drawback. Protocol support is somewhat determined by the appliance vendor; however, common protocols in use are CIFS and NFS.

As for reliability, a NAS device typically has a higher level of fault tolerance. This is achieved via multiple controllers, multiple connections to the computer network, as well as a disk architecture that supports a Redundant Array of Independent (or Inexpensive) Disks, or RAID. Lastly, reliability is augmented by removing the ancillary computing responsibility such as printing and DNS/DHCP, or Domain Name System (or Service or Server)/Dynamic Host Configuration Protocol, from this appliance thus allowing it to serve just one purpose: serving files.

The final storage philosophy is the Storage Area Network (SAN). As online storage grows, it gets harder to do a complete backup of data within an acceptable time frame while limiting the backup's impact on end user productivity. You see part of this impact by the time it takes packets to cross the network on their way to a tape or disk-based backup device. To minimize this impact, data was relocated to a disparate network designed solely to handle backup related traffic. With the backup issue solved, administrators soon realized this type of architecture could also be leveraged to improve data/storage availability, increase scalability, as well as improve performance.

An associated technology of the storage area network is Fibre Channel, or FC. The Fibre Channel Industry Association, or FCIA, designed the Fibre Channel Arbitrated Loop (FC-AL) for mass storage devices and other peripheral devices that require high bandwidth, using optical fiber to connect devices. This interconnection allows for concurrent communication using various protocols, namely SCSI and IP, or Internet Protocol.

The final file system we'll talk about is the brainchild of Oracle Corporation. The primary objective for development of the Oracle Cluster File System (OCFS) was to provide a supportive chassis for the Oracle Real Application Cluster Database. OCFS Release 2 (OCFS2) is the only clustered file system included in the v2.6.16 Linux kernel so the database platform and the file system are tightly integrated. Novell takes it one step further in SUSE Linux Enterprise Server by providing in-the-box OCFS2 enhancements.

The Oracle Cluster File System relates closely to the Ext3 file system. For example, it uses the same Linux JBD (journaling block device) layer.

The differences become apparent when you look beyond the abbreviations. OCFS2 has increased the number of supported nodes in a cluster to 255 without a negative impact on database performance. And it has improved journaling and metadata performance, which help both recovery and everyday operation.
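Formatting a shared OCFS2 volume reflects that node count directly, since the number of node slots is set at creation time. A sketch, with a hypothetical shared disk, cluster name and mount point:

```shell
# Create an OCFS2 file system with 8 node slots on a shared disk
# (/dev/sdb1, the label and the cluster name are examples)
mkfs.ocfs2 -N 8 -L ocfs2data /dev/sdb1

# Bring the O2CB cluster stack online, then mount on each node
/etc/init.d/o2cb online mycluster
mount -t ocfs2 /dev/sdb1 /mnt/oracle
```

Each node in the cluster mounts the same device; the node slots reserve per-node journal space so that any node's journal can be replayed by a survivor after a failure.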

Lastly, this file system retains the GNU public license. Thus, it's free for public consumption. This choice of licensing, coupled with the file system's incorporation into the v2.6.16 Linux kernel illustrates Oracle's commitment to the open source community.

The landscape of storage as it relates to the market is vast. Everyone has their own story to tell and enhancement to sell. That said, the foundations of file systems have and will remain constant even as connection methods are refined.

To recap, we looked at the difference between iSCSI and file systems. As an administrator, you should know the type of data that will reside on your partitions before you create them. Likewise, you now have a good starting point as it relates to file structures. Check out the File System Primer at wiki.novell.com/index.php/File_System_Primer for a more in-depth review of file systems and their attributes. Armed with the differences between file systems and the snapshot of iSCSI we covered here, your selection process should no longer be a time-consuming, intimidating one.

Disk Technology

What's the business with SCSI, SAS, ATA and FC drives? Picking the right file system for your data was only part of it. Pairing the right drive technology (or disk) to the right task and expectation of performance is just as critical. Here's a start.

The most widely used drive technology to date is SCSI (Small Computer System Interface). Since the ratification of the standard in 1986, throughput has evolved from an unimpressive 5 MB/s to more than 320 MB/s. The most common drives shipped are either Ultra3 SCSI (160 MB/s) or Ultra-320 SCSI (320 MB/s), as anything below Ultra3 would be too slow. The ubiquity of these drives has made them a staple in both low- and high-end markets because of their low price, performance index and reliability.

The second generation of SCSI is Serial-attached SCSI (SAS). The emerging technology builds on the rock-solid foundation of its predecessor. As its name implies, the first notable difference is how the drive is attached to the host computer. Traditional SCSI uses a ribbon cable that sends multiple data and control signals simultaneously across different wires; SAS instead uses a thin point-to-point serial link.

Because they had multiple paths, parallel interfaces were historically faster than their serial counterparts. But recent enhancements in serial technology have substantially increased the supported rates. In terms of performance, SAS drives support transfer rates up to 3 Gbit/s, and future enhancements are expected to scale the technology considerably further.

IDE, most commonly referred to as ATA (Advanced Technology Attachment), is broken into two subsections, the second being SATA (Serial ATA). The two are most easily discerned by their connection cables. The newer SATA drives attach to the motherboard via a thin seven-conductor data cable, while traditional ATA uses a 40-pin connector on a ribbon cable limited to an 18-inch (46 cm) length. Notably, SATA drives have a substantially higher performance index: depending on the generation of the drive, data rates are between 1.5 Gbit/s and 6.0 Gbit/s. On average, SATA drives are less expensive than their SCSI counterparts, and they will become more common as the performance difference lessens.
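SATA and SAS rates are line rates in gigabits per second, and both interfaces use 8b/10b line coding, so ten bits travel on the wire for each byte of data. That makes the conversion to usable throughput simple:

```shell
# Convert SATA/SAS line rates (Mbit/s) to data throughput (MB/s)
# under 8b/10b encoding: 10 wire bits per data byte
for mbit in 1500 3000 6000; do
  echo "${mbit} Mbit/s line rate ~= $((mbit / 10)) MB/s of data"
done
```

So a 3.0 Gbit/s interface tops out around 300 MB/s of payload, before protocol overhead and the mechanical limits of the drive itself are considered.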

Fibre Channel (FC) drives represent the newest and arguably the most robust drive technology of the four. They typically fulfill roles in SAN environments where high transfer rates and reliability are essential. The technology is often out of reach for smaller enterprises because of the cost. Although storage is viewed as a commodity, these drives are no bargain; the supporting hardware and sheer number of drives needed are the common deterrents.

In terms of reliability, no other drive technology comes close. Typically, drive MTBF (Mean Time Between Failures) is measured in millions of hours assuming an 8-hour duty cycle; FC drives are rated in millions of hours against a 24-hour duty cycle. That measurement criterion alone speaks volumes in the category of reliability.
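The difference between those duty-cycle assumptions is bigger than it first sounds, as a quick calculation of powered-on hours per year shows:

```shell
# Annual powered-on hours implied by each duty-cycle assumption
echo "8-hour duty cycle:  $((8 * 365)) hours/year"
echo "24-hour duty cycle: $((24 * 365)) hours/year"
```

An FC drive rated against a 24-hour duty cycle is being credited for three times the annual operating hours of a drive rated against an 8-hour cycle, so equal MTBF figures do not mean equal reliability.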

Lastly, there is no silver bullet when choosing a drive technology. Just keep these points in mind when you're shopping for more storage:

  • performance
  • reliability
  • scalability
  • price

When you're ready to buy, if you've strategically considered these points, you'll be on the right track for choosing the right storage for you.



© 2014 Novell