Fast data transfer/sync using Globus
About Globus
Globus is a secure, reliable research data management service. It acts as a bridge that moves data directly between a remote endpoint and a local endpoint, rather than as cloud storage.
With Globus, subscribers can move, share, publish & discover data via a single
interface – whether your files live on a supercomputer, lab cluster, tape archive,
public cloud or your laptop, you can manage this data from anywhere, using your
existing identities, via just a web browser. --- from Globus
For me, it is a better alternative to rsync: better because it is faster and easier to set up. As an aside, Dropbox appears to have given up on Red Hat and CentOS users. I tried several workarounds on our cluster (OS: CentOS 7.5) and none of them worked.
New User
1. Create your Globus ID on Globus
If you have organization accounts, link them. LBNL IT encourages employees to use Globus for large data transfers and has set up the Globus Web App so that users can log in with their LBNL LDAP username and password (no OTP authorization is needed).
2. Globus Connect Personal
Most new users will want to transfer data between two PCs. Before using Globus to move data over the internet, set up two endpoints: one for the remote PC and one for the local PC. After you log in to Globus, the interface helps you manage your endpoints, e.g., Create New Personal Endpoint. Just follow the instructions step by step. Once the endpoints are set up on both the remote and the local computer, you can start your transfers.
3. Transfer your data
In the previous step you downloaded the app; now set its preferences on your computer. I suggest checking the Access panel, which defines the directories on your computer (your local endpoint) that Globus is allowed to reach. Then click the app and choose the ‘Web: Transfer Files’ item. It opens your Globus account in your browser.
In the File Manager, choose the two-panel mode, one panel for each endpoint. Set the Collection and path in each panel, then start the transfer. After Globus submits the task, just wait for your data to arrive in the directory you set.
Globus automated transfer
globus-sdk provides a convenient Pythonic interface to the Globus API. The LBNL IT group has written its Globus automated transfer code on top of the Globus SDK package. This makes it easier for Unix users to do everything from the command line.
1. globus-sdk
globus-sdk requires Python 2.7+ or 3.4+, while the Globus automated transfer code built on the Globus SDK package requires Python 3.4+. So I suggest using Python 3.4+ and installing globus-sdk with
pip install globus-sdk
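Before going further, it can help to confirm that the interpreter and the package are really in place. The sketch below uses only the standard library: it checks that Python is 3.4 or newer and that the module globus_sdk (the import name of the globus-sdk package) can be found. The function names python_ok and sdk_installed are made up for this illustration.

```python
import importlib.util
import sys


def python_ok(min_version=(3, 4)):
    """Return True if the running interpreter is at least min_version."""
    return sys.version_info[:2] >= min_version


def sdk_installed(module_name="globus_sdk"):
    """Return True if module_name can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("globus_sdk importable:", sdk_installed())
```

If the second line prints False, the package most likely went into a different Python installation than the one running the script.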
For Anaconda users on bearbin, globus-sdk can be installed as
module load anaconda3/4.3.1
conda create -n globus_env python=3.6
# activate the new environment so the package installs into it
source activate globus_env
conda install globus-sdk
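With the package installed, a scripted transfer could look roughly like the sketch below. It is not the LBNL IT code, only a minimal illustration under a few assumptions: a recent globus-sdk release (3.x or later, where TransferData accepts source_endpoint and destination_endpoint as keywords; older releases expected the transfer client as the first argument), a Transfer access token obtained separately through a Globus login flow, and two endpoint IDs copied from the web app. The helper names build_items and submit_transfer are hypothetical.

```python
def build_items(pairs):
    """Turn (source, destination) path pairs into item dicts.

    A source path ending in "/" is treated as a directory and marked
    for recursive copy; anything else is treated as a single file.
    """
    items = []
    for src, dst in pairs:
        items.append(
            {
                "source_path": src,
                "destination_path": dst,
                "recursive": src.endswith("/"),
            }
        )
    return items


def submit_transfer(access_token, source_id, dest_id, pairs, label="sdk sketch"):
    """Submit one transfer task and return its task ID.

    access_token, source_id and dest_id are placeholders the caller
    must supply; nothing here obtains them.
    """
    # Imported here so build_items can be used without the SDK installed.
    import globus_sdk

    tc = globus_sdk.TransferClient(
        authorizer=globus_sdk.AccessTokenAuthorizer(access_token)
    )
    tdata = globus_sdk.TransferData(
        source_endpoint=source_id,
        destination_endpoint=dest_id,
        label=label,
        sync_level="checksum",  # skip files whose checksums already match
    )
    for item in build_items(pairs):
        tdata.add_item(
            item["source_path"],
            item["destination_path"],
            recursive=item["recursive"],
        )
    return tc.submit_transfer(tdata)["task_id"]
```

Calling submit_transfer(token, SRC_ID, DST_ID, [("/data/run1/", "/backup/run1/")]) would queue one recursive directory copy, the scripted counterpart of picking two collections and paths in the File Manager. The token, IDs and paths here are placeholders.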
2. Globus automated transfer code
To set up Globus automated transfer, you can also follow two instructions:
GDrive access via Globus
Just follow the LBL IT support guide, GDrive Access via Globus.
Author F. Cheng
LastMod 2019-05-17