Storage network design integrating NAS and SAN

Publisher: yunhui | Last updated: 2011-09-21 | Source: 电子产品世界 (Electronic Products World)

Introduction

The development of IT technology has gone through three waves. The first wave was centered on processing technology, with the processor as the core driving force; it gave rise to the computer industry and drove the rapid popularization and application of computers. The second wave was centered on transmission technology, with the network as the core driving force. Together, these two waves greatly accelerated the digitization of information, turning more and more human information activities into digital form and producing an explosive growth of digital information, which in turn triggered the third wave of IT development: the storage technology wave.


The core of the storage technology wave is network-based storage. At present there are two main types of network storage systems: Network Attached Storage (NAS) and the Storage Area Network (SAN). According to the definition of the Storage Networking Industry Association (SNIA), NAS is a storage device that attaches directly to the network and provides file-level services to users, while a SAN is a network in which data is transmitted directly between servers and storage systems over an interconnection protocol such as Fibre Channel. A NAS device carries its own simplified real-time operating system; it integrates hardware and software to provide file services and offers good sharing, openness, and scalability. Storage devices in a SAN are connected by a dedicated network based on the Fibre Channel protocol; because this Fibre Channel storage network is separate from the LAN, performance is very high. In a SAN, capacity expansion, data migration, local data backup, and remote disaster-recovery backup are all relatively convenient, and the entire SAN is managed as a unified storage pool. Thanks to this excellent performance, SAN has become an important technology for enterprise storage.


In actual applications, however, NAS and SAN also have many defects and are increasingly unable to keep up with the rapid development of IT technology and the explosive growth of digital information. NAS devices have the following defects: (1) data transmission is slow, because NAS provides only file-level rather than block-level transfers; (2) performance drops during data backup, because backup consumes most of the NAS device's network bandwidth and other I/O suffers; (3) only a single NAS device can be managed at a time, and it is difficult to centrally manage multiple NAS devices on the same LAN. SAN also has defects: (1) device interoperability is poor, and equipment from different manufacturers rarely works together; (2) building a SAN is expensive, so currently only large enterprises build their own; (3) management and maintenance costs are high, and enterprises must train dedicated administrators; (4) in heterogeneous environments a SAN can share only storage space, not files.


In view of the advantages and disadvantages of NAS and SAN, a variety of new network storage technologies have emerged, such as the NAS gateway (NAS head), IP-based SAN technology, and object storage technology. A NAS gateway connects a SAN to an IP network so that IP network users can access the storage devices in the SAN directly through the gateway. The NAS gateway therefore has the following advantages: it interconnects NAS and SAN within the same LAN, breaking through the limitations of FC topology and allowing FC devices to be used on an IP network; it reduces the cost of accessing Fibre Channel devices; and it opens up SAN storage space that would otherwise be under-utilized. IP-based SAN interconnection technologies mainly include FCIP (IP tunneling), iFCP, iSCSI, InfiniBand, and mFCP, with iSCSI as the representative. The principle of iSCSI is to map the SCSI protocol onto TCP/IP: the host's SCSI commands are encapsulated in TCP/IP packets, transmitted over the IP network, and restored to the original SCSI commands at the destination node, so that SCSI commands are carried directly and transparently over IP and a remote SCSI disk can be accessed as conveniently as a local hard disk. Object storage is built around storage objects, which combine the advantages of files and blocks: like data blocks, they can be accessed directly on the storage device; like files, they can be shared across different operating system platforms through an object interface. Although the NAS gateway integrates NAS and SAN over IP, it is not a true integration, because it cannot combine NAS devices and SAN devices into a unified storage pool for users, who can access the storage devices only through file I/O. Although object storage combines the advantages of NAS and SAN, it requires a special object storage interface and modifications to existing file systems, which hinders its wider adoption.


This paper proposes and implements a unified storage network (USN) that integrates iSCSI, NAS, and SAN under the IP protocol. In the USN, NAS devices, iSCSI devices, and SAN devices coexist. Users can access the iSCSI devices and SAN storage devices in block I/O mode, or access the NAS storage devices and SAN storage devices in file I/O mode; the entire USN forms a unified storage pool. In addition, the USN serves data to clients simultaneously through a server channel and an attached-network high-speed channel, which relieves the server bottleneck and raises the system's I/O speed. The USN combines the advantages of NAS (low cost, openness, file sharing) with those of SAN (high performance, high scalability), and it has clear advantages over the NAS gateway (NAS head), IP-based SAN, and object storage approaches.


USN Overall Structure

The hardware structure of the USN system is shown in Figure 1. The USN consists of NAS devices, iSCSI devices, and SAN devices, together with metadata servers and application servers. Users can access the NAS devices in the USN and the storage devices in the SAN through NAS heads in file I/O mode, or access the iSCSI devices in the USN and the storage devices in the SAN in block I/O mode. The USN also provides users with a server channel and an attached-network high-speed channel: metadata and small data requests are completed through the server channel, while large data requests are completed through the attached-network high-speed channel. This greatly improves the I/O speed of the whole system and relieves the server bottleneck. The entire USN is built on IP technology, is compatible with existing storage systems, and makes adding and removing storage devices very convenient, so the performance and scalability of the whole system are very good. The USN truly unifies NAS and SAN: the same storage network contains both NAS devices and a SAN structure. It unifies file I/O and block I/O: users can access the devices in the USN in file I/O mode (with the file as the unit) or in block I/O mode (with the block as the unit). And it unifies the file protocols and the block protocol on top of TCP/IP: users can access the USN through NFS (Unix users) and CIFS (Windows users), as well as through SCSI (iSCSI users).

Figure 2 shows the software structure of the USN. GMPFS is a global multi-protocol file system that resides on each application server in the USN system. It allows Windows users to access the USN with the CIFS protocol, UNIX users with the NFS protocol, and block-protocol users with the iSCSI protocol. GMPFS extends the metadata used by the current storage system and uses a heuristic method to collect application information, providing users with a unified, convenient, and fast storage access interface and a reasonable data placement scheme. ASA is the autonomous storage agent module. It automatically discovers the types of storage devices and the available resources in the mass storage system and manages and optimizes them in an effective, unified, and autonomous way. According to each application and its specific needs, ASA assigns the storage device type and the performance, reliability, and availability levels appropriate to that application, and selects suitable data channels for its I/O requests, so that the application obtains the best storage resource allocation and the whole system achieves its best performance.

System Design

The USN is a complex system involving many technologies. This paper mainly discusses the design and implementation of its core components: GMPFS, ASA, and the iSCSI subsystem. GMPFS can reside on multiple operating system platforms (UNIX, Windows, Linux), supports access by users of various protocols (NFS, CIFS, iSCSI), and provides data access services to the network storage system for users and applications. ASA integrates multiple storage technologies, each with its own strengths and weaknesses, into a unified mass storage system and leverages the advantages of each, so that the storage system delivers the best service for a given application and effectively meets diverse application requirements. iSCSI truly realizes the unification of block I/O and file I/O on IP networks, and the unification of the file protocols and the block protocol on top of IP.


Design of Global Multi-Protocol File System
GMPFS retains the flexibility and high performance of a distributed file system while overcoming its inability to support different I/O protocols: it can serve NFS, CIFS, and iSCSI users at the same time. GMPFS provides the file access methods and file directory structure, and it also defines a specific storage mode for each storage volume. Each storage mode comprises a file-system metadata structure, an operation interface (file type and data-block type), a function set (formatting, retrieval, etc.), optimization methods (caching and prefetching policies, etc.), and the data structures and methods for allocating and reclaiming storage space. For file volumes, the storage mode contains the operation functions and file directory structure that implement POSIX semantics; for partition volumes, the storage mode must target a specific partition type, such as NTFS or ext3. All storage modes must be registered with the ASA system on the metadata server so that ASA can select channels for user I/O requests, as sketched below.
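To make the registration idea concrete, the following is a minimal sketch of what a storage-mode descriptor and its registration with ASA might look like. All names here (StorageMode, AsaRegistry, register_mode) are illustrative assumptions for this article, not the actual GMPFS/ASA code; the fields simply mirror the storage-mode contents listed above.

```python
# Hypothetical sketch: a storage-mode descriptor registered on the metadata server.
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class StorageMode:
    name: str                        # e.g. "file-volume-posix" or "partition-ntfs"
    volume_type: str                 # "file" or "partition"
    metadata_layout: str             # identifies the file-system metadata structure
    operations: Dict[str, Callable]  # operation interface: format, retrieve, ...
    cache_policy: str                # optimization method: caching / prefetching strategy
    allocator: str                   # space allocation and reclamation method

class AsaRegistry:
    """Registry kept on the metadata server so ASA can route user I/O requests."""
    def __init__(self):
        self._modes: Dict[str, StorageMode] = {}

    def register_mode(self, mode: StorageMode) -> None:
        self._modes[mode.name] = mode

    def mode_for_volume(self, volume_type: str) -> StorageMode:
        # ASA consults the registered modes when choosing a channel for a request.
        return next(m for m in self._modes.values() if m.volume_type == volume_type)

registry = AsaRegistry()
registry.register_mode(StorageMode(
    name="file-volume-posix", volume_type="file",
    metadata_layout="posix-dir-tree",
    operations={"format": lambda dev: None, "retrieve": lambda key: None},
    cache_policy="prefetch-sequential", allocator="extent-bitmap"))
```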

The structure of GMPFS is shown in Figure 3. The protocol conversion interface supports the NFS and CIFS protocols through a combination of an extended NFS module and a Samba module, and supports the iSCSI protocol through an extended iSCSI target driver. The heuristic data management interface uses heuristic methods to learn the user's requirements for stored data, such as performance, utilization, and security. The GMPFS data organization logic interface provides a logical view of data organization, aimed precisely at the weakness of the traditional file-system directory structure in managing massive amounts of data: with additional metadata, it offers various views of the files according to user needs through query and retrieval, for example classification by the creating user and creation time. The extended file and volume operation interfaces, data organization and allocation management, metadata organization structure, and I/O director mainly ensure compatibility with traditional file-system semantics and realize program-level data access, so applications can use data in the USN system without modification. The interfaces to ASA and to the storage resources on the metadata server make full use of the storage resources controlled by ASA, organize data reasonably, and meet users' and applications' many-sided and personalized storage requirements; for example, by providing both the server channel and the attached-network high-speed channel, user I/O performance is improved and the server bottleneck is relieved.


iSCSI System Design
The iSCSI protocol defines the mapping of SCSI onto TCP/IP: the host's SCSI commands are encapsulated into IP packets, transmitted over the IP network, and restored to the original SCSI commands at the destination node, so that SCSI commands are carried directly and transparently over IP. iSCSI combines two mainstream protocols, the established storage protocol SCSI and the dominant network protocol TCP/IP, to achieve a seamless integration of storage and network. From the application's perspective, iSCSI on the one hand transmits SCSI commands to remote storage devices, realizing command-level interaction so that a remote SCSI device can be accessed as conveniently, and nearly as fast, as a local one; on the other hand, it can be used to rework traditional NAS and SAN technology and thereby integrate NAS and SAN. The iSCSI subsystem is one of the core parts of the USN system; its design realizes the IP-based block access mechanism, illustrated by the sketch below.
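The following sketch illustrates only the encapsulation idea on the initiator side: a SCSI command descriptor block (CDB) is wrapped in a small binary header and carried over a TCP connection. The header layout is invented for clarity and is not the real iSCSI PDU format defined in RFC 3720; the address 192.0.2.10 is a placeholder (3260 is the standard iSCSI port).

```python
# Simplified, illustrative encapsulation of a SCSI command over TCP (not real iSCSI framing).
import socket
import struct

def send_scsi_command(sock: socket.socket, lun: int, cdb: bytes, data: bytes = b"") -> None:
    # Toy header: opcode (1 byte), LUN (1 byte), CDB length (1 byte), data length (4 bytes).
    header = struct.pack("!BBBI", 0x01, lun, len(cdb), len(data))
    sock.sendall(header + cdb + data)

def read_10_cdb(lba: int, blocks: int) -> bytes:
    # Standard SCSI READ(10) CDB: opcode 0x28, 4-byte logical block address,
    # 2-byte transfer length (10 bytes in total).
    return struct.pack("!BBIBHB", 0x28, 0, lba, 0, blocks, 0)

if __name__ == "__main__":
    with socket.create_connection(("192.0.2.10", 3260)) as s:
        # Ask the remote "disk" for 8 blocks starting at LBA 0.
        send_scsi_command(s, lun=0, cdb=read_10_cdb(lba=0, blocks=8))
```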


At present, iSCSI can be implemented in three ways: pure software, an intelligent iSCSI network card, or an iSCSI HBA card. Since we are building a prototype of the USN, we use the pure-software approach; an iSCSI HBA card is the goal of the subsequent productization stage. The overall design of the iSCSI subsystem is shown in Figure 4 (the management module is omitted). The server (target) runs the Linux operating system, and the client (initiator) runs Windows 2000. The SCSI miniport driver creates a virtual SCSI disk in the system, and the filter driver intercepts the SCSI commands that the system sends to this SCSI disk and forwards them through the kernel network interface to the server for processing.
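To complement the initiator-side sketch above, the following is a minimal sketch of the pure-software target side: a TCP server unpacks the same simplified header, executes READ(10)/WRITE(10) against an ordinary backing file standing in for the virtual SCSI disk, and returns the data. The framing, the single-connection loop, and the assumption that every CDB is a 10-byte READ/WRITE are all simplifications, not the actual target implementation.

```python
# Illustrative software target loop over a backing file (assumed to already exist).
import socket
import struct

BLOCK = 512
HEADER = struct.Struct("!BBBI")   # toy header: opcode, LUN, CDB length, data length

def recv_exact(conn: socket.socket, n: int) -> bytes:
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed")
        buf += chunk
    return buf

def serve(backing_path: str, port: int = 3260) -> None:
    with open(backing_path, "r+b") as disk, socket.create_server(("", port)) as srv:
        conn, _ = srv.accept()
        with conn:
            while True:
                _, _, cdb_len, data_len = HEADER.unpack(recv_exact(conn, HEADER.size))
                cdb = recv_exact(conn, cdb_len)
                payload = recv_exact(conn, data_len) if data_len else b""
                # Assume a 10-byte READ(10)/WRITE(10) CDB, as in the initiator sketch.
                scsi_op, _, lba, _, blocks, _ = struct.unpack("!BBIBHB", cdb)
                disk.seek(lba * BLOCK)
                if scsi_op == 0x28:        # READ(10): return the requested blocks
                    conn.sendall(disk.read(blocks * BLOCK))
                elif scsi_op == 0x2A:      # WRITE(10): persist the payload
                    disk.write(payload)
                    disk.flush()
```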

Design of the Autonomous Storage Agent System
One side of the autonomous storage agent (ASA) faces the mass storage system. Current storage systems include DAS (direct-attached storage), NAS, SAN, and iSCSI. ASA automatically discovers the types of storage devices and the available resources in the mass storage system and manages and optimizes them in an effective, unified, and autonomous way. According to each application and its specific needs, it assigns the storage device type and the performance, reliability, and availability levels appropriate to that application, so that the application obtains the best storage resource allocation.


The other side of ASA faces the application (GMPFS). ASA extends the metadata used by the current storage system and uses a heuristic method to collect application information, providing users with a unified, convenient, and fast storage access interface and a reasonable data placement scheme. Based on the attributes of the data involved in a user I/O request, it chooses the channel over which the client interacts with the storage device: metadata requests (directory and volume information, etc.) and small data I/O requests use the server channel, while large data I/O requests use the attached-network high-speed channel. The boundary between large and small requests is adjusted autonomously by ASA according to the overall I/O volume of the system. The ASA system structure is shown in Figure 5.
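A minimal sketch of this channel-selection policy is given below. The names, the 64KB starting threshold, and the load-based adaptation rule are illustrative assumptions; only the routing rule itself (metadata and small requests to the server channel, large requests to the attached-network high-speed channel, with a threshold that ASA adjusts) comes from the description above.

```python
# Hypothetical sketch of ASA's channel selection between the two data paths.
SERVER_CHANNEL = "server"
HIGH_SPEED_CHANNEL = "attached-network"

class ChannelSelector:
    def __init__(self, threshold_bytes: int = 64 * 1024):
        self.threshold = threshold_bytes      # assumed initial large/small boundary

    def select(self, is_metadata: bool, request_bytes: int) -> str:
        if is_metadata or request_bytes < self.threshold:
            return SERVER_CHANNEL             # directory/volume info and small I/O
        return HIGH_SPEED_CHANNEL             # bulk data bypasses the server

    def adjust(self, server_load: float) -> None:
        # Assumed adaptation rule: when the server channel is busy, lower the
        # threshold so more traffic is diverted to the high-speed channel.
        if server_load > 0.8:
            self.threshold = max(self.threshold // 2, 4 * 1024)
        elif server_load < 0.3:
            self.threshold = min(self.threshold * 2, 1024 * 1024)

selector = ChannelSelector()
print(selector.select(is_metadata=False, request_bytes=256 * 1024))  # attached-network
```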


Client and USN Interaction Process

The USN system serves three types of users: Windows file I/O users (CIFS protocol), Unix file I/O users (NFS protocol), and iSCSI block I/O users (iSCSI protocol). The interaction process between the clients and the USN system is shown in Figure 6.

The data read/write process of a block I/O client is as follows (see Figure 6): (1) a block I/O command (SCSI command) issued by an application on Client 1 passes through the iSCSI device driver layer and the TCP/IP protocol stack, is encapsulated into IP packets, and is transmitted over the IP network; (2) when the encapsulated SCSI command reaches the USN server, it is decapsulated back into the original SCSI command, which the USN server uses to issue block I/O read/write requests to the iSCSI storage device; (3) the requested data blocks are encapsulated into PDUs by the iSCSI layer and TCP/IP protocol stack in the iSCSI device; these PDUs can be returned to the client over two channels, either forwarded through the server or transmitted directly to the client over the attached-network high-speed channel; (4) after the PDUs arrive back at Client 1 over the IP network, Client 1 decapsulates them and its file system assembles the data into a file.


When the USN system provides file I/O service, the data read/write process is as follows (see Figure 6): (1) Client 2 (file I/O) sends a file read/write request to the USN server, just as with a traditional NAS; (2) after receiving the request, the USN server handles it in one of two ways: either it forwards the I/O request to the corresponding NAS device or NAS head, which returns the requested data to the USN server for relay to the client; or, instead of forwarding the request, it returns the IP address of the NAS device or NAS head to the client, which then interacts with that NAS device or NAS head directly. A sketch of these two modes follows.
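The sketch below illustrates the relay-versus-redirect choice. The function and field names are hypothetical, and the size threshold used to pick the redirect path is an assumption consistent with ASA's policy of sending large requests over the attached-network high-speed channel; the article itself does not fix the exact criterion.

```python
# Illustrative handling of a file I/O request on the USN server: relay or redirect.
from dataclasses import dataclass

@dataclass
class FileRequest:
    path: str
    size_bytes: int

@dataclass
class Response:
    mode: str            # "relay" or "redirect"
    data: bytes = b""
    nas_address: str = ""

REDIRECT_THRESHOLD = 1 * 1024 * 1024   # assumed cut-off for direct client access

def handle_file_request(req: FileRequest, nas_ip: str, fetch_from_nas) -> Response:
    if req.size_bytes >= REDIRECT_THRESHOLD:
        # Hand the client the NAS head's address; the data then flows over the
        # attached-network high-speed channel without passing through the server.
        return Response(mode="redirect", nas_address=nas_ip)
    # Small request: the server fetches the data and relays it, as a traditional NAS would.
    return Response(mode="relay", data=fetch_from_nas(req.path, req.size_bytes))
```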


The NAS head here mainly fronts SAN devices that use the FC protocol; it connects directly to the TCP/IP network and supports NFS/CIFS access, and it can also load the iSCSI target driver to support iSCSI users. For both block I/O and file I/O requests, data can be exchanged between the client and the storage device over the attached-network high-speed channel.


Experimental Evaluation

From the client side we evaluated the function and performance of each storage subsystem and of the whole USN, and compared the results. The unified storage network was tested in two respects: functional testing and performance testing. The functional tests include: (1) building 100M and 1000M Ethernet environments and connecting iSCSI storage devices to servers; after the iSCSI software package is installed in the server operating system, users can obtain the storage space provided by the iSCSI storage devices over the network and operate it like a local hard disk.


This test item covers the installation, configuration, management, and use of iSCSI disks on the server side; (2) iSCSI storage devices serve as the storage devices behind NAS heads and form a NAS storage system with them; this test item covers the installation, configuration, management, and use of iSCSI disks within NAS; (3) iSCSI disks, local disks, and FC-RAID disks are combined into RAID arrays of various redundancy levels; this test item covers the installation, configuration, management, and use of the various storage disks within RAID; (4) multiple NAS devices, iSCSI devices, and NAS heads connected to FC-RAID are combined, through multiple GMPFS instances and ASA, into a USN mass storage system; this test item covers the installation, configuration, and use of the GMPFS and ASA systems in a system that integrates NAS, iSCSI, and SAN.


Performance testing covers the data transmission performance of the NAS storage devices, iSCSI storage devices, FC-RAID, local hard disks, and the mass USN system composed of them, under different workloads in 100M and 1000M network environments, including the number of I/Os per unit time, the average response time of an I/O, the data transmission rate, and the CPU utilization. The main idea of the test is to drive frequent I/O against the various storage devices and transmission channels for different network application environments, and to measure and compute, over a fixed interval, performance parameters such as I/O rate, data transmission rate, response time, and CPU utilization, from which the performance evaluations are obtained.
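The measurement idea can be sketched as follows: issue repeated reads of a fixed request size against a device or file, then derive IOps, throughput (MBps), and average response time from the timings. This is only a stand-in for the Iometer workloads used in the actual tests; the device path, request count, and 64MB seek window are illustrative assumptions, and reading a raw device typically requires root privileges.

```python
# Illustrative micro-benchmark: per-request latency, IOps, and MBps for fixed-size reads.
import os
import time

def measure(path: str, request_bytes: int, total_requests: int = 1000):
    latencies = []
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter()
        for i in range(total_requests):
            # Spread reads over an assumed 64MB region of the target.
            os.lseek(fd, (i * request_bytes) % (64 * 1024 * 1024), os.SEEK_SET)
            t0 = time.perf_counter()
            os.read(fd, request_bytes)
            latencies.append(time.perf_counter() - t0)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    iops = total_requests / elapsed
    mbps = iops * request_bytes / (1024 * 1024)
    avg_ms = 1000 * sum(latencies) / len(latencies)
    return iops, mbps, avg_ms

if __name__ == "__main__":
    print(measure("/dev/sdb", 4096))   # e.g. 4KB reads against an iSCSI disk (hypothetical path)
```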

Test Environment
iSCSI storage device: P4 2.0GHz CPU, 256MB DRAM, IBM DPSS318350 18GB hard disk, Red Hat Linux 9.0 operating system.
Linux server: Pentium 4 2.66GHz (FC2 PGA) CPU, 256MB DRAM, 80GB UltraATA/100 7,200rpm hard disk, Red Hat Linux 9.0 operating system.
Windows server: Xeon 3.06GHz CPU, 512MB DRAM, Smart Array 6i (onboard) storage controller, QLogic QLA2300 PCI FC adapter, IBM 36.4GB (32P0726) 10Krpm hard disk, Microsoft Windows 2003 operating system.
FC-RAID: NexStor 4000S, 600MHz CPU, 512MB SDRAM, 10 × ST314680FC hard disks.
Ordinary NAS storage device: P4 2.66GHz CPU, 512MB DDR, Maxtor 160GB hard disk, Red Hat Linux 9.0 operating system.


Network connections: the iSCSI devices and the ordinary NAS device use 100M Ethernet cards (Realtek RTL8139); the Windows server and the Linux server each use a 1000M Ethernet card (HP NC7782 Gigabit Server Adapter).


Functional Testing
According to the test procedure, the functional tests cover three aspects: (1) platform unification, that is, multiple storage nodes can be accessed through a single directory tree under Windows, similar to the function of PVFS under Linux; (2) protocol unification, that is, FC-RAID, the iSCSI target, and ordinary NAS devices can all be managed through Windows "Computer Management" and the initiator (iSCSI client), and various redundancy levels can be achieved using the dynamic disk mechanism; (3) device unification, that is, the iSCSI target cooperates with the initiator so that the target becomes a storage device within the NAS system.


Performance Testing

Test Content

The tests were conducted with the third-party tool Iometer, an I/O performance test program developed by Intel; its test parameters are comprehensive and reflect the I/O performance of a server well. To characterize the performance of the USN storage system, the following items were tested under identical conditions for comparison: (1) read/write performance of the USN server's local hard disk; (2) read/write performance of the FC-RAID disk in a 100M Ethernet environment; (3) read/write performance of the remote iSCSI disk in a 100M Ethernet environment; (4) read/write performance of RAID arrays of various levels built from FC-RAID disks and remote iSCSI disks in a 100M Ethernet environment; (5) read/write performance of the remote iSCSI disk in a 1000M Ethernet environment; (6) read/write performance of the USN system in a 100M Ethernet environment.


Comparison of Experimental Results

The data transmission rate performance comparison of local IDE hard disk, 100M iSCSI hard disk, 1000M iSCSI hard disk, FC-RAID, RAID0 composed of FC-RAID and iSCSI, and USN system is shown in Figure 7.

The IO/s performance comparison of local IDE hard disk, 100M iSCSI hard disk, 1000M iSCSI hard disk, FC-RAID, RAID0 composed of FC-RAID and iSCSI, and USN is shown in Figure 8.


The average response time performance comparison of local IDE hard disk, 100M iSCSI hard disk, 1000M iSCSI hard disk, FC-RAID, RAID0 composed of FC-RAID and iSCSI, and USN is shown in Figure 9.

The CPU usage comparison of local IDE hard disk, 100M iSCSI hard disk, 1000M iSCSI hard disk, FC-RAID, RAID0 composed of FC-RAID and iSCSI, and USN is shown in Figure 10.

Analysis of Experimental Results

The impact of the size of the requested file or data block on storage system performance can be seen from the trend of each individual curve in Figures 7, 8, and 9. When the requested file or data block is large, reading or writing it on the target disk or system takes longer, and the network transmission time grows accordingly; therefore the average response time for small requests is lower than for large requests, and the IOps for small requests is higher than for large requests. When the request is large, however, the fixed per-request overhead is amortized over more data, and sustained sequential transfer costs less per byte than many small transfers; therefore the MBps for small requests is lower than for large requests.
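These trends follow directly from the definitions of the three metrics. With S the request size, T_overhead the fixed per-request cost (command processing, seek, protocol encapsulation), and B the effective bandwidth of the disk-plus-network path, a simple model (our approximation, not a measured result) is: average response time ≈ T_overhead + S/B, IOps ≈ 1 / (T_overhead + S/B), and MBps = IOps × S. As S grows, IOps falls and response time rises, while throughput MBps = IOps × S climbs toward B, because the fixed overhead is spread over more data per request.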


The performance curves of the server-side iSCSI disks follow these rules across the different request sizes in both the 100M and the Gigabit Ethernet environments, as do those of the local IDE hard disk, FC-RAID, and the USN system.


Performance Analysis

As Figures 7, 8, and 9 show, for I/O requests between 1KB and 128KB the USN system responds much faster than the local IDE hard disk, FC-RAID, the 100M remote iSCSI disk, and the 1000M iSCSI disk. For I/O requests larger than 128KB, the USN system responds slightly more slowly than FC-RAID but still much faster than the other storage subsystems, reaching a maximum of 45MB/s. The reason is that, besides GMPFS (supporting users of multiple access protocols) and ASA (providing the server channel and the attached-network high-speed channel) loaded on the USN server side, we also loaded the intelligent prefetching, disk caching (DCD), load balancing, and zero-copy modules previously developed in our laboratory, so both large and small I/O requests receive excellent response performance. FC-RAID, on the other hand, responds slowly to small I/O requests because of its own overheads such as data verification, and faster to larger I/O requests.


For the iSCSI disk storage subsystem of the USN, the results show that when the requested data block is small, the performance difference between the 100M and 1000M network environments is not obvious. As the requested block or file grows, the gap in IOps and MBps between the two widens. For 1024KB requests, changing only the data link and physical layers of the network path, from the 100M environment to the 1000M environment, greatly improves the disk data transmission rate: the latter is about three times the former.


As Figure 10 shows, the 100M iSCSI storage subsystem has the highest CPU utilization, because serving a user's I/O requests requires the server to continuously encapsulate and decapsulate iSCSI protocol data units. The local IDE hard disk has the lowest CPU utilization, and the USN server's CPU utilization is the next lowest, because in the USN system small I/O requests are handled directly by the server while large I/O requests are handled by the storage devices themselves over the attached-network high-speed channel.

Conclusion and Outlook

The unified storage network system we have proposed, designed, and implemented uses IP interconnection equipment, which is much cheaper than Fibre Channel, and far more resources and experience are available for developing the management software and for operating and maintaining the system. In addition, Gigabit Ethernet is developing faster than Fibre Channel: 10Gbps Ethernet switches have already been launched and are selling well, and their performance prospects are much better than those of Fibre Channel switches. All of this lays a solid foundation for the commercialization of the unified storage network.


At present we have realized the unified storage network prototype in theory, in architecture, and in practice. We are now developing and improving iSCSI devices that support multiple users, multiple functions, and multiple platforms, and designing and implementing a new secure, highly available file system, so that once the unified storage network system is commercialized it can truly provide enterprises, especially small and medium-sized ones, with a mass storage system offering better openness, performance, scalability, and cost-effectiveness.
