2012-11-16

The impact of jumbo frames on the RAC interconnect

A very good article by Steven Lee:
 http://www.dbaleet.org/about_rac-interconnectjumbo-frame (use Google Translate):

Jumbo frames, also known as 9000-byte frames, are a widely used term in the computer networking and storage networking industries. The word "jumbo" means "oversized". They are called jumbo frames because the largest traditional standard Ethernet frame is 1518 bytes (1500 bytes of data, a 14-byte header, and a 4-byte CRC checksum); a jumbo frame is almost six times as large.
Why use jumbo frames? What advantage do they have over standard frames, and what do they have to do with the Oracle RAC private network?
Compared with the traditional standard frame, the main advantage of a jumbo frame is that it reduces the CPU overhead of splitting and reassembling packets and improves the efficiency of data transmission. The following example illustrates some shortcomings of standard Ethernet frames when transferring an Oracle data block:
The default Oracle database block size is 8192 bytes, so with standard Ethernet frames a data block has to be split into at least six frames for transmission. The block must be split on the sending side before it is sent, and the NIC cannot do this on its own; it relies mainly on the CPU, which means that in practice the splitting is a software algorithm. Splitting therefore inevitably adds CPU overhead, and because the CPU is running many threads, it also brings more frequent context switches. On the receiving side, the NIC has to wait until all six frames have arrived; if one transmission fails and has to be retransmitted, the receiver can do nothing but wait until all six frames have been transferred. This obviously increases transmission latency, especially on lower-bandwidth networks. Finally, the receiving side reassembles the frames into a complete data block, and this, like the splitting, consumes additional CPU resources.
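The "six frames" figure can be checked with a little arithmetic. The sketch below is only an illustration under stated assumptions: the block is sent as a single UDP datagram over IPv4 (8-byte UDP header, 20-byte IP header without options) and is split by IP fragmentation. The function name `frames_needed` and the header constants are hypothetical, and real interconnect traffic adds protocol headers of its own.

```python
import math

IP_HEADER = 20   # assumption: IPv4 header without options
UDP_HEADER = 8   # assumption: the block travels as one UDP datagram


def frames_needed(block_size: int, mtu: int) -> int:
    """Estimate how many Ethernet frames carry one block of block_size bytes."""
    datagram = block_size + UDP_HEADER   # bytes the IP layer has to carry
    max_payload = mtu - IP_HEADER        # IP payload that fits in one frame
    if datagram <= max_payload:
        return 1                         # fits in one frame, no fragmentation
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = max_payload // 8 * 8
    return 1 + math.ceil((datagram - max_payload) / per_fragment)


for mtu in (1500, 9000):
    print(f"MTU {mtu}: {frames_needed(8192, mtu)} frame(s) per 8192-byte block")
```

Under these assumptions an 8192-byte block needs six frames at an MTU of 1500 and a single frame at an MTU of 9000.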
The 1500-byte Ethernet standard has historical reasons: early networks were generally half-duplex, so a large frame travelling in one direction could monopolize all the bandwidth for a long time. With smaller frames, the splitting and reassembly take a certain amount of time, which gives other nodes a chance to communicate. This is analogous to why computer keyboards use the QWERTY layout rather than an ABCDE layout.
Today almost all networks are full-duplex, so sending and receiving do not block each other, and the 1500-byte frame has become something of a limitation. But Ethernet has been the network standard for so long that making frames larger than 1500 bytes compatible is far harder than one might expect; in fact, full compatibility is almost impossible. Fortunately, other techniques can compensate: if splitting and reassembly consume CPU, add CPU resources; if network bandwidth is too small, add bandwidth. These measures are important for improving transmission efficiency. So why are frames larger than 1500 bytes needed at all?
The reason is simple and has already been described: to reduce CPU overhead and increase transmission efficiency. With a 9000-byte jumbo frame, a single standard 8192-byte data block can be sent in one frame. There is no tedious splitting and reassembly, and so none of the CPU overhead that comes with it. There is also no need to wait for several frames to all arrive successfully before reassembling them, so the transfer itself is more efficient. However, if the transmission of a jumbo frame fails, the whole jumbo frame has to be retransmitted, so a failure costs much more than it does with standard frames. Jumbo frames are therefore best suited to stable, reliable networks, such as the RAC private network.
Oracle's best practices recommend jumbo frames to improve the transmission efficiency of the private network, and tests have shown that jumbo frames can improve transmission efficiency by about 10% (see the attachment at the end). However, note the following limitations of jumbo frames:
1. Jumbo frames sound tempting, but there is no IEEE standard for them, so not all vendors support them. (Mainstream operating system and switch vendors do. Oracle VM Server 2.2 does not support jumbo frames; if you need them, you have to request a separate one-off patch, see the MOS document "Oracle VM: Jumbo Frame on Oracle VM", Doc ID 1166925.1. Older switches may also lack jumbo frame support.)
2. Jumbo frames affect many components, not just the sending and receiving NICs. Besides setting the NIC MTU to 9000, the MTU of every switch that carries private network traffic must also be set to 9000; otherwise the nodes may be unable to communicate with each other.
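Because every NIC and switch port on the private network has to agree on the MTU, it can help to compare the configured values in one place. The following is a minimal sketch, not an Oracle tool: `read_mtu` assumes a Linux host, where the MTU of an interface is exposed in `/sys/class/net/<interface>/mtu`, and the names `find_mtu_mismatches`, `node1:eth1`, and `switch:port12` are hypothetical. Switch values would have to be collected separately from the switch itself.

```python
from pathlib import Path


def read_mtu(interface: str, sysfs_root: str = "/sys/class/net") -> int:
    """Read the configured MTU of a Linux network interface from sysfs."""
    return int((Path(sysfs_root) / interface / "mtu").read_text().strip())


def find_mtu_mismatches(mtus: dict[str, int], expected: int = 9000) -> dict[str, int]:
    """Return the entries whose MTU differs from the expected value."""
    return {name: mtu for name, mtu in mtus.items() if mtu != expected}


# Hypothetical inventory: values gathered from each host and switch port.
observed = {
    "node1:eth1": 9000,
    "node2:eth1": 9000,
    "switch:port12": 1500,   # a switch port that was overlooked
}
print(find_mtu_mismatches(observed))
```

On a cluster node, `read_mtu("eth1")` would return the local value for a private interface named `eth1`; any entry reported by `find_mtu_mismatches` is a component that still has to be reconfigured before jumbo frames can pass end to end.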
In fact, the network storage industry began recommending jumbo frames as a best practice very early on: for NAS, iSCSI SAN, and similar equipment, jumbo frames can significantly improve transmission efficiency and reduce CPU overhead. Search Google for "jumbo frame best practice" and you will find many white papers from NetApp, EMC, Dell, and Cisco. Oracle was initially cautious about using jumbo frames as the MTU of the private network, because too many intermediate components are involved, including changes to the switch MTU, and many customers ended up with a RAC that would not start because they overlooked a component or misconfigured one. Oracle recommends jumbo frames for the private network mainly because stress tests showed that they really do reduce global cache wait events. For RAC's share-everything architecture, the weak point is without doubt private network communication, because the private network largely determines whether the cluster can scale close to linearly.
Finally, some personal views:
Many customers in China (for example, in the telecom industry) run the private networks of several RAC clusters through the same core switches (some even put the public network and the private network on the same switch, which is alarming). If you intend to use jumbo frames, the MTU of the core switches has to be adjusted in addition to the MTU on the hosts. Once the core switch MTU is changed, every node connected to that switch is affected, not only the one system that uses jumbo frames. Such a change affects the environment as a whole, so be careful. If only one system has a large number of gc waits, it is better to tune at other levels, such as the application, than to try jumbo frames.
For new systems that are about to go live, plan for jumbo frames when procuring equipment, to improve transmission efficiency and to follow RAC best practice.
Attachment: The following are some results from a comparison test of standard frames and jumbo frames by Rene Kundersma (of the Oracle Exadata Maximum Availability Architecture team); see the original article for reference.
From his tests, Rene concluded the following:
1. You will hardly notice the benefits of using Jumbo Frames on a system with no stress.
2. You will notice the benefits of using Jumbo Frames on a stressed system, and such a system will then use less CPU and will have less network overhead.
