
01. 06. 2010 Implementing the ZFS incremental snapshot backup
The naive send/recv approach has terrible performance
I tried a send/receive via ssh and via netcat. Both showed bad throughput, netcat even a few times worse than ssh (ssh was only in the range of a few MB/s).
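For reference, the naive pipelines looked roughly like the following. This is only a sketch; the exact invocations were not recorded in this log, and the snapshot and target dataset names are the ones used for the transfers described below.
# via ssh, started on the sending node t3fs06:
zfs send shome@auto2010-05-31_23:35 | ssh t3fs05 zfs recv shome2/shomebup
# via netcat: start the receiver on t3fs05 first, then the sender on t3fs06:
nc -l -p 9000 | zfs recv shome2/shomebup
zfs send shome@auto2010-05-31_23:35 | nc t3fs05 9000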
I found this link after a bit of googling on the problem:
I observed something similar between two home systems. Investigation showed that the issue is that the zfs receive does its reads in giant bursts every 4 seconds or so. Trouble is that within a tiny fraction of the gap between reads, the network buffers fill up and flow control off the zfs send, so much of the time the data stream buffers are full and flow controlled off.
The thread recommended using a buffering mechanism on the receiver side (mbuffer). Here is another web site that shows how to use mbuffer with ZFS:
http://tm.fidosoft.org/index.php/2009/05/zfs-send-and-zfs-receive-with-mbuffer-examples-on-x4500/
Establishing a baseline throughput by having the receiver dump the stream to /dev/null
Testing the send bandwidth over netcat by dumping to /dev/null on the receiving side:
zfs send shome@auto2010-05-31_23:35 | nc t3fs05 9000      # sender, run on t3fs06
nc -l -p 9000 > /dev/null                                 # receiver, run on t3fs05 (start this first)
Testing transfers to /dev/null through the mbuffer tool (graph taken on the sending side to show the CPU load; the bandwidth matches the receiving side in this period):
mbuffer -I t3fs05:9000 -s128k -m1G -P10 > /dev/null                          # receiver, run on t3fs05 (start this first)
zfs send shome@auto2010-05-31_23:35 | mbuffer -s128k -m1G -O t3fs05:9000     # sender, run on t3fs06
In both cases the sending starts out with low performance for about 10 minutes. Then the throughput grows to essentially the maximum bandwidth of the 1 Gb link, but there is always some fine structure, and sometimes periods of lower throughput follow. The mbuffer tool seems to bring some benefit. I can try to play some more with the -P flag (output starts only once the buffer is filled to a certain level) and a few other settings.
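One variation to try, for example, would be raising the start threshold on the receiving mbuffer so that it only begins writing its output once the buffer is 80% full. The value 80 is just an illustration and has not been tested here:
mbuffer -I t3fs06:9000 -s128k -m1G -P80 | zfs recv shome2/shomebup           # receiver, run on t3fs05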
Doing full snapshot transfers with mbuffer + zfs send/recv
Testing the bandwidth while having the stream unpacked by a zfs recv on t3fs05:
mbuffer -I t3fs06:9000 -s128k -m1G -P10 | zfs recv shome2/shomebup           # receiver, run on t3fs05 (start this first)
zfs send shome@auto2010-05-31_23:35 | mbuffer -s128k -m1G -O t3fs05:9000     # sender, run on t3fs06
The bandwidth looks ok.
Started a production transfer of our 3 TB system at 15:40h. For this I use at jobs on both nodes to execute:
mbuffer -q -l /tmp/mbuffer.log -I t3fs06:9000 -s128k -m1G -P10 | zfs recv shome2/shomebup
zfs send shome@auto2010-05-31_23:35 | mbuffer -q -l /tmp/mbuffer.log -s128k -m1G -O t3fs05:9000
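The at submission itself looked roughly like this. This is a sketch; the exact submission lines and the start time of the receiver job were not recorded here, but the receiver has to be listening before the sender starts:
echo 'mbuffer -q -l /tmp/mbuffer.log -I t3fs06:9000 -s128k -m1G -P10 | zfs recv shome2/shomebup' | at 15:35    # on t3fs05 (receiver)
echo 'zfs send shome@auto2010-05-31_23:35 | mbuffer -q -l /tmp/mbuffer.log -s128k -m1G -O t3fs05:9000' | at 15:40    # on t3fs06 (sender)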
The transfer finished ok. It took about 8 hours for the 2 TB that are used on our 3 TB shome system (so about 80 MB/s on average).
After the transfer a check on the target machine (t3fs05) gives:
root@t3fs05 $ zfs list
NAME                                     USED  AVAIL  REFER  MOUNTPOINT
data1                                    202K  8.44T  48.8K  /data1
rpool                                   10.0G   447G    34K  /rpool
rpool/ROOT                              5.00G   447G    21K  legacy
rpool/ROOT/s10x_u8wos_08a               5.00G   447G  5.00G  /
rpool/dump                              1.00G   447G  1.00G  -
rpool/export                              44K   447G    23K  /export
rpool/export/home                         21K   447G    21K  /export/home
rpool/swap                                 4G   451G  1.84M  -
shome2                                  2.03T  4.17T  51.2K  /shome2
shome2/shomebup                         2.03T  4.17T  2.03T  /shome2/shomebup
shome2/shomebup@auto2010-05-31_23:35    1.63M      -  2.03T  -
To ensure that the filesystem is not changed, so that the next incremental snapshot can be applied to it correctly, it is necessary to set the filesystem to read-only! Even just listing it with ls will introduce changes relative to the snapshot, and one may then have to restore the original state using zfs rollback snapshotname.
zfs set readonly=on shome2/shomebup
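If the target filesystem has already been modified, a rollback to the transferred snapshot would look like this (shown with the snapshot name from the listing above):
zfs rollback shome2/shomebup@auto2010-05-31_23:35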
Incremental snapshot transfer
In order to have more time to work on this, I made an additional snapshot shome@marker2010-06-02_10:02 on t3fs06 that will not be deleted every other night like the cron-job generated ones. I then tried an incremental transfer of this snapshot.
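The marker snapshot was created on t3fs06 with a standard snapshot command (a sketch; the exact invocation is not recorded in this log):
zfs snapshot shome@marker2010-06-02_10:02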
Started a run interactively at 10:09 that finished about 10 min later:
mbuffer -I t3fs06:9000 -s128k -m1G -P10 | zfs recv shome2/shomebup           # receiver, run on t3fs05
in @ 113 MB/s, out @ 225 MB/s, 38.3 GB total, buffer 1% full
summary: 38.3 GByte in 10 min 12.3 sec - average of 64.1 MB/s, 240x empty
zfs send -i shome@auto2010-05-31_23:35 shome@marker2010-06-02_10:02 | mbuffer -s128k -m1G -O t3fs05:9000     # sender, run on t3fs06
in @ 0.0 kB/s, out @ 113 MB/s, 38.3 GB total, buffer 0% full
summary: 38.3 GByte in 10 min 12.2 sec - average of 64.1 MB/s
This leaves the following on the target t3fs05:
root@t3fs05 $ zfs list
NAME                                     USED  AVAIL  REFER  MOUNTPOINT
data1                                    202K  8.44T  48.8K  /data1
rpool                                   10.0G   447G    34K  /rpool
rpool/ROOT                              5.00G   447G    21K  legacy
rpool/ROOT/s10x_u8wos_08a               5.00G   447G  5.00G  /
rpool/dump                              1.00G   447G  1.00G  -
rpool/export                              44K   447G    23K  /export
rpool/export/home                         21K   447G    21K  /export/home
rpool/swap                                 4G   451G  1.84M  -
shome2                                  2.07T  4.14T  51.2K  /shome2
shome2/shomebup                         2.07T  4.14T  2.04T  /shome2/shomebup
shome2/shomebup@auto2010-05-31_23:35    29.8G      -  2.03T  -
shome2/shomebup@marker2010-06-02_10:02      0      -  2.04T  -
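Going forward, each further incremental transfer would follow the same pattern, always sending the difference from the newest snapshot that exists on both sides. A sketch with a purely hypothetical future snapshot name:
mbuffer -q -l /tmp/mbuffer.log -I t3fs06:9000 -s128k -m1G -P10 | zfs recv shome2/shomebup    # on t3fs05 (receiver, start first)
zfs send -i shome@marker2010-06-02_10:02 shome@marker2010-06-09_10:00 | mbuffer -q -l /tmp/mbuffer.log -s128k -m1G -O t3fs05:9000    # on t3fs06 (second snapshot name is hypothetical)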
--
DerekFeichtinger - 2010-06-01
