Posted by: Eric Siegel
I'm getting a number of inquiries about using WAN optimization devices on the high-bandwidth (e.g., 155 Mbps and up) paths between large-scale storage systems. There are some interesting points about these situations:
- Transmission rate is so high that many WAN optimization devices either bypass deduplication and compression entirely, or they restrict the deduplication dictionary to a database that can be kept in memory instead of on the optimization device's disk. However, even if the resulting reduction in required bandwidth is only 3x or 4x instead of, say, 10x, that's still a massive savings when you consider the monthly cost of high-bandwidth links. Ask the vendor what the proposed product does on high-bandwidth paths.
- On high-bandwidth, high-latency WAN paths, the flow-control and error-control characteristics of the optimization device's transport protocol are more important to overall performance than they are on lower-bandwidth paths. Even if there are no lost packets, normal TCP with too-small windows will have difficulty keeping the path full of data, because a sender can have only one window of data in flight per round trip. Bursts of errors -- or even a few individual errors -- can make the situation much worse. If you can't alter the TCP stack in the end-user devices (for example, in the storage systems) to optimize TCP flow, then the WAN optimization devices can make a notable improvement without any compression or deduplication at all. The WAN optimization device can use forward error correction (FEC) to conceal individual packet errors, or it can use a special protocol for WAN transport. FEC is good for errors that are evenly distributed; special transport protocols (such as the SCPS-TP variant of TCP) and advanced versions of TCP (such as Windows Vista's Compound TCP) handle both evenly spaced errors and error bursts.
- Out-of-order packets can decrease throughput. The upper-level protocols, such as TCP and iSCSI, will re-order packets to ensure that even if a data flow is divided among a number of parallel paths, the final flow will be seamlessly reassembled. However, packets arriving out of order can greatly slow down this process and require special buffering inside the protocol stacks. A WAN optimization solution that does some efficient packet re-ordering of its own can save time and processing in the end-user stacks. Higher-bandwidth flows are more likely than lower-bandwidth flows to include parallel sub-flows (someone may have inserted a set of parallel links somewhere to increase capacity, or the storage device may be using multiple TCP flows in parallel), so higher-bandwidth flows are also more likely to benefit from packet re-ordering in optimization systems.
- Data deduplication and compression facilities in large-scale storage systems are almost never as effective at detecting and removing redundancy as are optimization devices. For example, they may detect only duplicate data blocks within a file, not among separate files, and they may work in large, fixed-length (e.g., 4 Kbyte) blocks instead of being able to detect duplicate strings of any length. Both of these restrictions greatly limit the deduplication that can be done. In the first case, multiple files (such as different versions of a virtual machine OS) that are almost completely identical will not gain any improvement from their similarity; each file will be deduplicated separately. In the second case, inserting a single word into a long file will make the entire file appear to be different from that point onwards. The single inserted word will push a word off the end of its containing file segment and into the next file segment, which will then lose a word into the file segment after that, etc. With fixed-length blocks, all those "different" file segments will need to be transmitted across the network. The WAN optimization devices can be useful in squeezing additional savings out of these situations, and they can also improve the performance of the underlying protocols and can deduplicate the application-level headers inserted by the storage systems' backup protocols.
- If a file is already compressed, such as with the LZW or deflate/zip algorithms, then, depending on the file structures, it may be a good idea to decompress it before looking for duplicates. A single word change in the middle of a compressed file can alter almost all of the file. Files that are end-to-end duplicates before compression will still be duplicates after compression, but files that are different by only one word before compression may look almost completely different after compression! Some WAN optimization devices can decompress already-compressed files; others can't.
- Support for higher-level protocols such as CIFS, MAPI, and others may be less important for WAN optimization devices supporting only back-end storage systems than it is for WAN optimization devices supporting individual remote users.
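To put a number on the too-small-window point above, here is a rough back-of-the-envelope sketch in Python. The 80 ms round-trip time and the 65,535-byte window (the classic TCP maximum without window scaling) are my own illustrative assumptions, not measurements from any particular link. With those figures a single loss-free flow tops out at roughly 6.5 Mbps on a 155 Mbps path, and it would need about 1.55 Mbytes in flight to fill it.

```python
def max_tcp_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Loss-free ceiling for one TCP flow: at most one window per round trip."""
    return window_bytes * 8 / rtt_seconds


def window_needed_bytes(link_bps: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to keep the path full."""
    return link_bps * rtt_seconds / 8


LINK_BPS = 155_000_000  # the 155 Mbps example rate from the post
RTT_SECONDS = 0.080     # assumed round-trip time (illustrative)
WINDOW_BYTES = 65_535   # assumed window: TCP maximum without window scaling

ceiling_bps = max_tcp_throughput_bps(WINDOW_BYTES, RTT_SECONDS)
needed_bytes = window_needed_bytes(LINK_BPS, RTT_SECONDS)

print(f"Throughput ceiling: {ceiling_bps / 1e6:.2f} Mbps")
print(f"Window needed to fill the link: {needed_bytes / 1e6:.2f} Mbytes")
print(f"Link utilization at that ceiling: {ceiling_bps / LINK_BPS:.1%}")
```

Plug in your own round-trip time and window size; the gap between the ceiling and the link rate is the part an optimization device's transport protocol can win back before any compression happens.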
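The remark above that FEC suits evenly distributed errors better than bursts can be seen with the simplest possible scheme: one XOR parity packet per group of data packets. This is only a sketch of that one scheme, chosen by me for illustration; real devices may use stronger codes, and the function names and the four-packet group are hypothetical. One lost packet per group can be rebuilt, but two losses in the same group (a burst) cannot.

```python
from functools import reduce
from operator import xor


def xor_parity(packets):
    """Byte-wise XOR of equal-length packets, used as a single parity packet."""
    return bytes(reduce(xor, column) for column in zip(*packets))


def recover(received, parity):
    """Rebuild a group where lost packets are marked None.

    Returns the repaired list, or None if more than one packet is missing,
    because a single parity packet can only repair a single loss.
    """
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        return None
    repaired = list(received)
    if missing:
        present = [p for p in received if p is not None]
        # XOR of the parity and every surviving packet equals the lost packet.
        repaired[missing[0]] = xor_parity(present + [parity])
    return repaired


# Hypothetical group of four equal-length packets.
packets = [b"AAAA1111", b"BBBB2222", b"CCCC3333", b"DDDD4444"]
parity = xor_parity(packets)

single_loss = [packets[0], packets[1], None, packets[3]]  # isolated error
burst_loss = [packets[0], None, None, packets[3]]         # two in a row

print("single loss repaired:", recover(single_loss, parity) == packets)
print("burst repaired:", recover(burst_loss, parity) is not None)
```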
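The fixed-length-block problem described above (one inserted word shifting every later block) is easy to reproduce. The sketch below compares 4 Kbyte fixed blocks with a simple content-defined chunker that cuts wherever a checksum of the last 16 bytes matches a bit pattern. The chunker is my own minimal illustration, not any vendor's algorithm: the window size, the mask, and the use of `zlib.crc32` recomputed at every position (instead of a true rolling hash) are all assumptions made for brevity.

```python
import hashlib
import random
import zlib


def fixed_chunks(data, size=4096):
    """Split into fixed-length blocks, as a simple storage-side deduplicator might."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def content_defined_chunks(data, window=16, mask=0x3FF):
    """Cut after any position where the checksum of the last `window` bytes
    has its low 10 bits clear, so boundaries depend only on nearby content."""
    chunks, start = [], 0
    for i in range(window, len(data) + 1):
        if zlib.crc32(data[i - window:i]) & mask == 0:
            chunks.append(data[start:i])
            start = i
    if start < len(data):
        chunks.append(data[start:])  # trailing remainder
    return chunks


def new_chunk_count(old, new, chunker):
    """How many chunks of `new` were never seen in `old` (and must be sent)."""
    seen = {hashlib.sha256(c).digest() for c in chunker(old)}
    return sum(hashlib.sha256(c).digest() not in seen for c in chunker(new))


rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(256 * 1024))  # stand-in file
edited = original[:1000] + b"word " + original[1000:]            # one inserted word

new_fixed = new_chunk_count(original, edited, fixed_chunks)
new_cdc = new_chunk_count(original, edited, content_defined_chunks)

print(f"fixed blocks to resend: {new_fixed} of {len(fixed_chunks(edited))}")
print(f"content-defined chunks to resend: {new_cdc} "
      f"of {len(content_defined_chunks(edited))}")
```

With fixed blocks, every block from the insertion point onward is new. With content-defined boundaries, only the chunks touching the inserted bytes change, and the rest line up again.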
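The point above about already-compressed files can be checked with Python's standard `zlib` module (DEFLATE). This is a rough experiment under my own assumptions: a made-up word list as stand-in file content, one word inserted at the midpoint, and a crude "dedup" measure that asks what fraction of 16-byte pieces of the edited version appear anywhere in the original. How much of the compressed output changes will depend on the compressor, its settings, and where the edit falls, so treat the printed numbers as illustrative only.

```python
import random
import zlib


def fraction_found(candidate: bytes, reference: bytes, piece: int = 16) -> float:
    """Fraction of `piece`-byte slices of `candidate` that occur anywhere in `reference`."""
    pieces = [candidate[i:i + piece] for i in range(0, len(candidate), piece)]
    return sum(p in reference for p in pieces) / len(pieces)


rng = random.Random(1)
# Hypothetical vocabulary standing in for real file content.
vocab = ["storage", "backup", "replica", "snapshot", "volume",
         "block", "array", "remote", "mirror", "archive"]
text = " ".join(rng.choice(vocab) for _ in range(20_000)).encode()

mid = len(text) // 2
edited = text[:mid] + b" inserted " + text[mid:]  # one-word change

packed_text = zlib.compress(text)
packed_edited = zlib.compress(edited)

raw_match = fraction_found(edited, text)
packed_match = fraction_found(packed_edited, packed_text)

print(f"uncompressed pieces already seen: {raw_match:.1%}")
print(f"compressed pieces already seen:   {packed_match:.1%}")
```

Identical inputs still compress to identical outputs, so whole-file duplicates survive compression; it is the nearly identical files that lose their visible similarity.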
Well, that's a brief set of thoughts about the special situation of WAN optimization and large-scale storage systems. If you have any additional comments or observations, write a note here!
