
01. 06. 2010 Implementing the ZFS incremental snapshot backup

The naive send/recv approach has terrible performance

I tried a send/receive via ssh and via pure netcat. Both showed bad throughput, netcat even a few factors worse than ssh (ssh was in the range of a few MB/s). After a short googling on the problem I found this thread:
"I observed something similar between two home systems. Investigation showed that the issue is that the zfs receive does its reads in giant bursts every 4 seconds or so. Trouble is that within a tiny fraction of the gap between reads, the network buffers fill up and flow control off the zfs send, so much of the time the data stream buffers are full and flow controlled off."

The thread recommended using a buffering mechanism on the receiver side (mbuffer). Here is another web site that shows how to use mbuffer with ZFS: http://tm.fidosoft.org/index.php/2009/05/zfs-send-and-zfs-receive-with-mbuffer-examples-on-x4500/

Establishing a baseline throughput by having the receiver dump the stream to /dev/null

Testing send bandwidth over netcat by dumping to null on the receiving side.

Sending side (t3fs06):
zfs send shome@auto2010-05-31_23:35 | nc t3fs05 9000

Receiving side (t3fs05):
nc -l -p 9000 > /dev/null
t3fs05-netcat-to-null.png
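As a cross-check that does not need the monitoring graphs, the receive rate could also be read off directly on the receiving side, e.g. by piping through pv (just a sketch; pv was not part of the test above and may not be installed on these nodes):

Receiving side (t3fs05):
nc -l -p 9000 | pv > /dev/null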

Testing transfers to /dev/null through the mbuffer tool (graph taken on the sending side to show the CPU load; the bandwidth matches the receiving side in this period).

Receiving side (t3fs05):
mbuffer -I t3fs05:9000 -s128k -m1G -P10 > /dev/null

Sending side (t3fs06):
zfs send shome@auto2010-05-31_23:35 | mbuffer -s128k -m1G -O t3fs05:9000
t3fs06-mbuffer-to-null.png

In both cases the transfer starts out with low performance for about 10 minutes. Then the throughput grows to essentially the maximum bandwidth of the 1 Gb link, but there is always some fine structure, and sometimes periods of lower throughput follow. The mbuffer tool seems to bring some benefit. I can try to play some more with the -P flag (sending only starts at a certain fill level of the send buffer) and a few other settings, as sketched below.
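A variation I have not tried yet would be to let mbuffer start draining only once the buffer is largely full, e.g. with -P80 (sketch only; apart from the changed -P value this is the same sending command as above):

Sending side (t3fs06):
zfs send shome@auto2010-05-31_23:35 | mbuffer -s128k -m1G -P80 -O t3fs05:9000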

mbuffer + send/recv

Testing the bandwidth with the stream being unpacked by a zfs recv on t3fs05:

Receiving side (t3fs05):
mbuffer -I t3fs06:9000 -s128k -m1G -P10 | zfs recv shome2/shomebup

Sending side (t3fs06):
zfs send shome@auto2010-05-31_23:35 | mbuffer -s128k -m1G -O t3fs05:9000

t3fs06-mbuffer-to-zfsrcv.png t3fs05-mbuffer-to-zfsrecv.png

The bandwidth looks ok.
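To check on the receiving side that the stream really ended up as a snapshot of the backup filesystem (standard zfs commands, dataset names as used above):

zfs list -r -t snapshot shome2/shomebup
zfs get used,referenced shome2/shomebup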

Started a production transfer of our 3 TB system at 15:40h. For this I use at jobs on both nodes to execute the following:

Receiving side (t3fs05):
mbuffer -q -l /tmp/mbuffer.log -I t3fs06:9000 -s128k -m1G -P10 | zfs recv shome2/shomebup

Sending side (t3fs06):
zfs send shome@auto2010-05-31_23:35 | mbuffer -q -l /tmp/mbuffer.log -s128k -m1G -O t3fs05:9000
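Once this initial full transfer is done, the later backups should only need to send the increment between the last transferred snapshot and a newer one. A sketch of such a follow-up transfer (the newer snapshot name auto2010-06-07_23:35 is only an invented example, and zfs recv may need the -F flag if the backup filesystem was mounted and touched in the meantime):

Receiving side (t3fs05):
mbuffer -q -l /tmp/mbuffer.log -I t3fs06:9000 -s128k -m1G -P10 | zfs recv -F shome2/shomebup

Sending side (t3fs06):
zfs send -i auto2010-05-31_23:35 shome@auto2010-06-07_23:35 | mbuffer -q -l /tmp/mbuffer.log -s128k -m1G -O t3fs05:9000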

-- DerekFeichtinger - 2010-06-01



Topic attachments:
t3fs05-mbuffer-to-zfsrecv.png (41.2 K, 2010-06-01 13:43, DerekFeichtinger)
t3fs05-netcat-to-null.png (19.6 K, 2010-06-01 11:43, DerekFeichtinger)
t3fs06-mbuffer-to-null.png (40.0 K, 2010-06-01 12:46, DerekFeichtinger)
t3fs06-mbuffer-to-zfsrcv.png (43.0 K, 2010-06-01 13:24, DerekFeichtinger)