rclone is pretty cool
I haven’t used rclone much in the past. A few clients have used it for a few specific things and I never really got it.
If you’re not familiar, it is a command-line utility that provides a richer experience for working with files in cloud buckets. It gives you cloud equivalents of rsync, cp, mv, mount, ls, ncdu, tree, rm, and cat, with --dry-run protection available before you change anything.
Apparently their tagline is “The Swiss army knife of cloud storage.”
I can confirm this knife is sharp!
Copying lots of files between two buckets
Today I was faced with the task of moving 80 GB (336k files) from a “dev” DigitalOcean Spaces bucket (an S3-compatible bucket) to the “prod” bucket for an upcoming client launch.
I was not really looking forward to this task. I figured I would probably use some Python (with boto3), or maybe the Transmit GUI app on my Mac, to pull the 80 GB down to my laptop and then upload it again.
Then I happened to remember rclone, so I checked to see if it could do this for me easily, and it could!
Installation and Configuration
Here is all that was required. I exec’ed into one of the Python/Django containers already running in a DigitalOcean managed Kubernetes cluster and did the following:
First, you’ll need to have rclone installed. I also installed vim, as I knew I was going to need to edit rclone.conf. Then I created /root/.config/rclone/rclone.conf with the following:
[do]
type = s3
provider = DigitalOcean
env_auth = false
access_key_id = <your_access_key>
secret_access_key = <your_secret_key>
endpoint = sfo3.digitaloceanspaces.com
acl = public-read
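Before copying anything, it can be worth confirming that rclone actually picks up the new remote. This is a small sketch, assuming the config above has been saved and the keys are valid; the REMOTE variable is only there for illustration.

```shell
# Name of the remote; it must match the [do] section in rclone.conf.
REMOTE="do"

if command -v rclone >/dev/null 2>&1; then
    # Show every remote rclone knows about; the output should include "do:".
    rclone listremotes
    # List the buckets (Spaces) visible to these credentials.
    rclone lsd "${REMOTE}:" \
        || echo "could not list ${REMOTE}: (check the keys and endpoint)" >&2
else
    echo "rclone is not installed" >&2
fi
```

If the second command prints your bucket names, the credentials and endpoint are working.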
Copying all the files between the buckets
And that’s it! Now I can sync the entire file tree in my “dev” bucket to my “prod” one with a single, relatively easy-to-understand command:
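A minimal sketch of that kind of command, assuming hypothetical bucket names example-dev-bucket and example-prod-bucket (substitute your own). It starts with --dry-run so nothing is changed until the output has been reviewed.

```shell
# Hypothetical bucket names; replace them with your own.
SRC="do:example-dev-bucket"
DST="do:example-prod-bucket"

if command -v rclone >/dev/null 2>&1; then
    # Preview only: --dry-run reports what would be copied or deleted
    # without changing anything in the destination.
    rclone sync "$SRC" "$DST" --dry-run \
        || echo "dry run failed; check the remote and bucket names" >&2
else
    echo "rclone is not installed" >&2
fi
```

Once the preview looks right, running the same rclone sync line without --dry-run performs the real transfer. Adding --progress shows live transfer statistics, which is handy with hundreds of thousands of files.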
Let me break this down for you. We want to sync files, so we’re asking rclone to make the destination match the source, including modification times. Keep in mind that sync also deletes files in the destination that do not exist in the source.
The do: prefix is important to rclone’s arguments. This is the “name” we’ve given our setup in the config file above. In my use case I wasn’t transferring between cloud providers or even regions, so I could reuse the same “config”.
However, if you DO need to work across several different cloud providers, or different regions within the same provider, all you need to do is define another named section in the config for each one and use the names you set as the prefixes in rclone’s commands.
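As a rough sketch of what that could look like, the snippet below appends a second remote to the config file. The remote name do-nyc, the nyc3 endpoint, and the bucket names in the final comment are assumptions made up for illustration, not something from my setup.

```shell
# Config path: rclone honours RCLONE_CONFIG; otherwise use the usual Linux location.
CONF="${RCLONE_CONFIG:-$HOME/.config/rclone/rclone.conf}"
mkdir -p "$(dirname "$CONF")"

# Append the second remote only if it is not already defined.
if ! grep -q '^\[do-nyc\]' "$CONF" 2>/dev/null; then
cat >> "$CONF" <<'EOF'

[do-nyc]
type = s3
provider = DigitalOcean
env_auth = false
access_key_id = <your_access_key>
secret_access_key = <your_secret_key>
endpoint = nyc3.digitaloceanspaces.com
EOF
fi

# Each side of the command then uses its own remote name as the prefix, e.g.:
#   rclone sync do:example-dev-bucket do-nyc:example-prod-bucket --dry-run
```

The only thing that changes on the command line is the prefix on each path; rclone looks up the matching section to decide which endpoint and keys to use.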
I’ll definitely be using rclone more in the near future. Hope you found this example helpful!
Frank Wiles
Founder of REVSYS and former President of the Django Software Foundation. Expert in building, scaling and maintaining complex web applications.