@amaarora I like the idea behind Accelerate, and I 100% agree with the approach. Sylvain mentioned it to me a while back when I said I was working with TPUs. I had started working on a device wrapper of my own; I considered using Accelerate instead, but there were a few things I preferred about my approach.

My device wrapper is called DeviceEnv (kudos to Sylvain and HF for having a much sexier name...). It similarly combines the device id, distributed initialization, and wrapping of device transfer / DDP etc. in a common interface. I split the optimizer aspect out into a separate abstraction called Updater that deals with backward / loss scaling / grad modification (clipping) / step.

I have DeviceEnvXla and DeviceEnvCuda working so far. DeviceEnvXla works with XLA on TPU, GPU, or CPU; I've been running training locally on 2x GPU w/ AMP using XLA, and on 8x TPU. I'm about to add a DeviceEnvDeepSpeed (it requires a few mods to how I initialize the model/optimizer (updater)). It's on this branch: https://github.com/rwightman/pytorch-image-models/tree/bits_and_tpu ... I was going to msg you and Tanishq this week once I push another commit to squash a few obvious bugs. I still have a number of things to improve.
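For readers following along, the DeviceEnv/Updater split described above might look roughly like the sketch below. Only the names `DeviceEnv`, `DeviceEnvXla`, `DeviceEnvCuda`, and `Updater` come from the comment; every method name and the toy `DeviceEnvCpu` are illustrative guesses, not the actual `bits_and_tpu` API.

```python
from abc import ABC, abstractmethod


class DeviceEnv(ABC):
    """Common interface over device id, distributed init, and model wrapping.

    Concrete envs in the branch would be DeviceEnvXla / DeviceEnvCuda;
    method names here are hypothetical.
    """

    @abstractmethod
    def initialize(self):
        """Set up the device and, if needed, the distributed process group."""

    @abstractmethod
    def wrap_model(self, model):
        """Move the model to the device and wrap for DDP/XLA as appropriate."""


class Updater(ABC):
    """Owns the optimizer side: backward / loss scaling / grad clipping / step."""

    @abstractmethod
    def apply(self, loss):
        """Run backward (scaled under AMP), clip grads, and step the optimizer."""


class DeviceEnvCpu(DeviceEnv):
    """Trivial stand-in showing the shape of a concrete env."""

    def initialize(self):
        self.device = "cpu"  # a real env would also init dist/XLA here

    def wrap_model(self, model):
        return model  # nothing to move or wrap on CPU
```

The appeal of the split is that the training loop only talks to `DeviceEnv` and `Updater`, so swapping CUDA for XLA (or adding DeepSpeed) doesn't touch the loop itself.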
Hey @rwightman! Here's a discussion topic: should we switch to Hugging Face Accelerate for DDP? As someone who's spent quite a bit of time in timm, I believe it's not a straightforward switch, but it could simplify the code. It would get rid of a number of operations performed under the `if args.distributed:` condition inside `train.py`, and would also remove the need for a `DistributedSampler` in the train and eval dataloaders (& more?). Or do you think this isn't needed at this stage?