Description
Operations like `splitobs`, `shuffleobs`, and many more return `ObsView`s that one has to call `getobs` on in order to materialize.
I think this is unexpected for users coming from scikit-learn and mildly annoying in most scenarios.
As a default, operations on materialized objects should return materialized objects (e.g. arrays and dataframes).
Users will be able to opt in to the lazy behavior by wrapping data in an `ObsView`. Operations on an `ObsView` will produce other `ObsView`s, which only need to be materialized at the end of the pipeline.
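
To make the difference concrete, here is a rough sketch. The first half uses calls that exist in MLUtils today (`splitobs`, `shuffleobs`, `getobs`). The second half is commented out because it only illustrates the behavior proposed here; nothing in it is implemented, and the exact spelling of the opt-in is an assumption.

```julia
using MLUtils

X = rand(4, 10)   # 4 features, 10 observations (observations along the last dimension)

# Current behavior: each step returns a lazy view over X,
# so an explicit getobs call is needed to get plain arrays.
train, test = splitobs(shuffleobs(X), at = 0.7)
Xtrain = getobs(train)
Xtest = getobs(test)

# Proposed behavior (hypothetical, not implemented):
#
# Materialized in, materialized out:
#   Xtrain, Xtest = splitobs(shuffleobs(X), at = 0.7)
#
# Opt-in laziness by wrapping the data first, materializing once at the end:
#   train, test = splitobs(shuffleobs(ObsView(X)), at = 0.7)
#   Xtrain = getobs(train)
```

With the proposed default, the common case needs no `getobs` call, while pipelines that care about avoiding intermediate copies can still stay lazy until the last step.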