I’ve been able to get file transfers running at the full speed of the network using the methods below. On AWS the fastest I’ve seen is 400 MB/s from one hs1.8xlarge to another.
Base command
This is the command I generally use when rsyncing:
rsync -avP source destination
Easy speed-ups
If data security isn’t super important, use the arcfour cipher to keep transfers from being CPU-bound. It is weaker than the default cipher, but less CPU-intensive:
rsync -avP -e "ssh -c arcfour" source dest
For transferring large files that no one else is accessing, use --inplace:
rsync -avP --inplace source dest
Combine the above:
rsync -avP --inplace -e "ssh -c arcfour" source dest
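Newer OpenSSH releases have dropped arcfour, so the command above may fail with an unknown-cipher error. Here is a minimal sketch of one way to fall back gracefully: it takes the list printed by ssh -Q cipher and returns the first cipher from your preference list that is actually supported. The function name pick_cipher and the preference order in the usage comment are my own assumptions, not part of rsync or ssh.

```shell
#!/bin/sh
# pick_cipher PREFERRED...: read the supported ciphers from stdin
# (one per line, the format `ssh -Q cipher` prints) and print the
# first preferred cipher that is in that list.
# The function name is illustrative, not a standard tool.
pick_cipher() {
    supported=$(cat)
    for want in "$@"; do
        # -x matches the whole line, -F treats the name as a fixed string
        if printf '%s\n' "$supported" | grep -qxF -- "$want"; then
            printf '%s\n' "$want"
            return 0
        fi
    done
    return 1   # none of the preferred ciphers is available
}

# Usage (not run here; the preference order is only an example):
#   cipher=$(ssh -Q cipher | pick_cipher arcfour aes128-gcm@openssh.com aes128-ctr) &&
#       rsync -avP --inplace -e "ssh -c $cipher" source dest
```

This only checks the client side; the server has to accept the same cipher for the connection to succeed.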
OpenSSH HPN
OpenSSH HPN is ssh upgraded for high-performance networks. It is widely used and secure. Among other things, it allows files to be transferred without encryption. This is useful for non-sensitive data, or for transfers within a secured local network such as a LAN or VPN. With the “none” cipher, passwords are still transferred securely, but the actual data is not. There is also a new multi-threaded cipher, MT-AES-CTR, that can take advantage of multiple cores to do the encryption; I have not tested it. Links: hpn-ssh, hpn-ssh-faq
To see if HPN is already installed, type ssh -V and look for “HPN” in the version name
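If you want to script that check, here is a small sketch. Note that ssh -V prints to stderr, so it has to be redirected before it can be captured. The function name has_hpn is made up for this example, and matching both “HPN” and “hpn” is an assumption about how builds spell it in the version string.

```shell
#!/bin/sh
# has_hpn VERSION_STRING: succeed if the version string mentions HPN.
# Hypothetical helper; matches upper- and lower-case spellings.
has_hpn() {
    case "$1" in
        *HPN*|*hpn*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Usage (not run here): ssh -V writes to stderr, hence the 2>&1.
#   if has_hpn "$(ssh -V 2>&1)"; then
#       echo "HPN patches present"
#   else
#       echo "stock OpenSSH"
#   fi
```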
To install in Ubuntu:
sudo add-apt-repository ppa:w-rouesnel/openssh-hpn
sudo apt-get update -y
sudo apt-get install openssh-server
Open /etc/ssh/sshd_config and add:
HPNDisabled no
TcpRcvBufPoll yes
HPNBufferSize 8192
NoneEnabled yes
Restart ssh:
sudo service ssh restart
This should speed up all rsync/ssh/scp connections. Additionally, you can now use the none cipher like this:
rsync -avP --inplace -e 'ssh -oNoneSwitch=yes -oNoneEnabled=yes' source dest
Using iperf
iperf can tell you the maximum transfer speed your network is capable of.
on node1: iperf -s
on node2: iperf -c node1.ip
------------------------------------------------------------
Client connecting to 10.159.75.17, TCP port 5001
TCP window size: 96.7 KByte (default)
------------------------------------------------------------
[  3] local 10.141.191.34 port 49718 connected with 10.159.75.17 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  1.81 GBytes  1.56 Gbits/sec
Using the none cipher should get you to about 95% of that rate, assuming disk speed doesn’t become the bottleneck.
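rsync reports progress in bytes per second while iperf reports bits per second, so it helps to convert before comparing. A rough sketch, assuming decimal units (1 Gbit = 10^9 bits, 1 MB = 10^6 bytes); the function name expected_rate is my own:

```shell
#!/bin/sh
# expected_rate GBITS [FRACTION]: convert an iperf bandwidth figure in
# Gbits/sec to MB/s and scale it by FRACTION (default 0.95).
# Hypothetical helper; assumes decimal units throughout.
expected_rate() {
    awk -v g="$1" -v f="${2:-0.95}" \
        'BEGIN { printf "%.0f\n", g * 1000 / 8 * f }'
}

# Usage: feed it the bandwidth iperf printed.
#   expected_rate 1.56
```

For the 1.56 Gbits/sec run above this works out to roughly 185 MB/s as the target for a well-tuned transfer.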
Parallelizing it all from the command line
This will upload every top-level entry in /dir, running up to 10 rsync processes in parallel.
ls /dir | xargs -P 10 -I {} \
    rsync -avP -e "ssh -oNoneEnabled=yes -oNoneSwitch=yes" /dir/{} dest/
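Parsing ls output breaks on names containing newlines or other odd characters. Below is a sketch of the same idea using find -print0 and xargs -0, which pass names NUL-separated. The function name parallel_copy and its argument order are assumptions of mine; the copy command is passed in as trailing arguments so you can try it with a harmless local command first.

```shell
#!/bin/sh
# parallel_copy SRC DEST JOBS CMD [ARGS...]
# Run "CMD ARGS... <entry> DEST/" once per top-level entry of SRC,
# with at most JOBS running at a time. Hypothetical helper.
parallel_copy() {
    src=$1
    dest=$2
    jobs=$3
    shift 3
    # -mindepth/-maxdepth 1 lists only the direct children of SRC;
    # -print0 / -0 keep names with spaces or newlines intact.
    find "$src" -mindepth 1 -maxdepth 1 -print0 |
        xargs -0 -P "$jobs" -I {} "$@" {} "$dest/"
}

# Usage (not run here):
#   parallel_copy /dir dest 10 \
#       rsync -avP -e "ssh -oNoneEnabled=yes -oNoneSwitch=yes"
```

With 10 jobs interleaving their output, -P progress lines become hard to read, so you may prefer plain -a when running in parallel.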
It appears that ppa:w-rouesnel/openssh-hpn no longer exists. I really want a way to implement SSH-HPN but it looks like my only option has gone offline.
What can I do to add this patch to OpenSSH on my Ubuntu machine?
I’m not sure about the Ubuntu PPA, but you should be able to download what you need from https://www.psc.edu/index.php/hpn-ssh