Hamidreza Anvari and Paul Lu

The Impact of Large-Data Transfers in Shared Wide-Area Networks: An Empirical Study


Computational science sometimes requires large data files to be transferred over high bandwidth-delay-product (BDP) wide-area networks (WANs). Experimental data (e.g., LHC, SKA), analytics logs, and filesystem backups are regularly transferred between research centres and between private and public clouds. Fortunately, a variety of tools (e.g., GridFTP, UDT, PDS) have been developed to transfer bulk data across WANs with high performance.

However, large-data transfer tools can adversely affect other network applications on shared networks. Many of the tools explicitly ignore TCP fairness to achieve high performance, and users have experienced high latency and low bandwidth while a large-data transfer is underway. Yet there have been few empirical studies that quantify the impact of these tools. As an extension of our previous work using synthetic background traffic, we perform an empirical analysis of how bulk-data transfer tools perform when competing with a non-synthetic, application-based workload (e.g., Network File System). Conversely, we characterize the effect of bulk-data transfers on that workload and show that, for example, NFS performance can drop from 29 Mb/s to less than 10 Mb/s (for a single stream) when competing with bulk-data transfers on a shared network.