
Distributed external memory support for GPU, Lazy XGBoost GPU application to Dask RAPIDS partitions #5851

Closed
@declan-hernon

Description

Hi guys,

I have a couple of feature requests in the same ballpark:

  • Distributed external memory support for GPU
  • Lazy XGBoost GPU application to Dask RAPIDS partitions

In a nutshell, I am currently working with a very large dataset, and despite all my efforts to minimise memory use (i.e., ensuring each feature uses no more memory than it needs by specifying data types, and using gradient-based sampling), it still cannot fit into memory across multiple GPUs.
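For concreteness, here is a rough sketch of the measures I mean; the file path and column names are placeholders, not my real data:

```python
import cudf
import xgboost as xgb

# Give each feature the narrowest dtype it actually needs, so no column
# occupies more memory than necessary.
dtypes = {"feature_a": "int8", "feature_b": "float32", "label": "int8"}
df = cudf.read_csv("train.csv", dtype=dtypes)

dtrain = xgb.DMatrix(df.drop(columns=["label"]), label=df["label"])

params = {
    "tree_method": "gpu_hist",
    # Gradient-based sampling lets gpu_hist train on a subsample selected
    # by gradient magnitude, shrinking the working set held on the GPU.
    "sampling_method": "gradient_based",
    "subsample": 0.2,
}
booster = xgb.train(params, dtrain, num_boost_round=100)
```

Even with all of that, the data is still too large for the device memory I have access to.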

I am on a cloud provider, so of course I could distribute this data across many machines; however, by my reckoning, fitting all of the data in memory would require thousands of GPUs. At that scale, this usually means raising many internal requests for approval to increase cloud quota limits. On top of this, I don't need to train models particularly fast - typically they will be retrained at most weekly. So why GPUs? Because I think they are fundamentally cheaper than CPU: a single V100 took only half an hour longer than 72 Xeon chips, despite costing half as much.

Therefore, I think a nice feature would be the ability to load data in libsvm format across many GPUs in a cluster (i.e., distributed external memory for GPU), or alternatively, a way to lazily apply XGBoost to Dask partitions rather than forcing Dask to persist everything in memory. A sketch of the current workflow follows below.
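For context, this is roughly the multi-GPU Dask workflow available today, as I understand it (the file glob is a placeholder). Constructing the `DaskDMatrix` is what ends up materialising every partition in GPU memory across the cluster, which is the step I would like to be lazy or streamed:

```python
import dask_cudf
import xgboost as xgb
from dask.distributed import Client
from dask_cuda import LocalCUDACluster

cluster = LocalCUDACluster()  # one Dask worker per local GPU
client = Client(cluster)

ddf = dask_cudf.read_csv("train-*.csv")
X = ddf.drop(columns=["label"])
y = ddf["label"]

# Training pulls every partition referenced here into GPU memory, so the
# full dataset must fit in the cluster's aggregate device memory.
dtrain = xgb.dask.DaskDMatrix(client, X, y)

output = xgb.dask.train(
    client,
    {"tree_method": "gpu_hist"},
    dtrain,
    num_boost_round=100,
)
booster = output["booster"]
```

If XGBoost could instead consume partitions one at a time (as external memory mode does on a single CPU node today), the aggregate-memory requirement would go away.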
