
Distributed external memory support for GPU, Lazy XGBoost GPU application to Dask RAPIDS partitions #5851

Closed
@declan-hernon

Description

Hi guys,

I have a couple of feature requests in the same ballpark:

  • Distributed external memory support for GPU
  • Lazy XGBoost GPU application to Dask RAPIDS partitions

In a nutshell, I am currently working with a very large dataset, and despite all my efforts to minimise memory use (i.e., ensuring each feature uses no more memory than it needs by specifying data types, and using gradient-based sampling), it still cannot fit into memory across multiple GPUs.
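For concreteness, here is a rough sketch of the measures I mean; the file path and column names are placeholders, not my real data:

```python
import cudf
import xgboost as xgb

# Give each feature the narrowest dtype it actually needs, so no column
# occupies more memory than necessary.
dtypes = {"feature_a": "int8", "feature_b": "float32", "label": "int8"}
df = cudf.read_csv("train.csv", dtype=dtypes)

dtrain = xgb.DMatrix(df.drop(columns=["label"]), label=df["label"])

params = {
    "tree_method": "gpu_hist",
    # Gradient-based sampling lets gpu_hist train on a subsample selected
    # by gradient magnitude, shrinking the working set held on the GPU.
    "sampling_method": "gradient_based",
    "subsample": 0.2,
}
booster = xgb.train(params, dtrain, num_boost_round=100)
```

Even with all of that, the data is still too large for the device memory I have access to.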

I am on a cloud provider, so of course I could distribute this data across many machines; however, by my reckoning, fitting all of the data in memory would require thousands of GPUs. At that scale, this usually means raising many internal requests for approval to increase cloud quota limits. On top of this, I don't need to train models particularly fast - typically they will be retrained at most weekly. So why GPUs? Because I think they are fundamentally cheaper than CPU: a single V100 took only half an hour longer than 72 Xeon chips, despite costing half as much.

Therefore, I think a nice feature would be the ability to load data in libsvm format across many GPUs in a cluster (i.e., distributed external memory for GPU), or alternatively, a way to lazily apply XGBoost to Dask partitions rather than forcing Dask to persist everything in memory. A sketch of the current workflow follows below.
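For context, this is roughly the multi-GPU Dask workflow available today, as I understand it (the file glob is a placeholder). Constructing the `DaskDMatrix` is what ends up materialising every partition in GPU memory across the cluster, which is the step I would like to be lazy or streamed:

```python
import dask_cudf
import xgboost as xgb
from dask.distributed import Client
from dask_cuda import LocalCUDACluster

cluster = LocalCUDACluster()  # one Dask worker per local GPU
client = Client(cluster)

ddf = dask_cudf.read_csv("train-*.csv")
X = ddf.drop(columns=["label"])
y = ddf["label"]

# Training pulls every partition referenced here into GPU memory, so the
# full dataset must fit in the cluster's aggregate device memory.
dtrain = xgb.dask.DaskDMatrix(client, X, y)

output = xgb.dask.train(
    client,
    {"tree_method": "gpu_hist"},
    dtrain,
    num_boost_round=100,
)
booster = output["booster"]
```

If XGBoost could instead consume partitions one at a time (as external memory mode does on a single CPU node today), the aggregate-memory requirement would go away.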
