Skip to content
This repository was archived by the owner on May 23, 2022. It is now read-only.

Quick patch to export ObsDim again #42

Closed

Conversation

darsnack
Copy link
Member

@darsnack darsnack commented Aug 9, 2020

Fixes #41. Can I get some guidance on why StatsBase interfaces, like nobs, were removed? It is needed to MLDataPattern.jl.

@juliohm
Copy link
Member

juliohm commented Aug 14, 2020

We were trying to simplify the codebase and to get rid of this explicit dimension types in favour of keyword arguments like dims=1. Back in the days of the language keyword arguments were a bit slow as far as I understand, and that motivated this complex types for dimensions (which are just integers). Ideally we will replace all the dependencies downstream to use simple dims arguments with integers, but I am not sure how hard it is to do that in MLDataPattern.jl

Do you feel that this needs to be re-added to LearnBase.jl? Would it make sense to update MLDataPattern.jl instead to avoid these types?

@darsnack
Copy link
Member Author

Sure I can submit a PR to MLDataPattern.jl instead. What's the preferred replacement for StatsBase.nobs?

@juliohm
Copy link
Member

juliohm commented Aug 14, 2020

To be honest with you, I can't answer this question properly at the moment. I would need to investigate MLDataPattern.jl in more detail in order to provide a good answer. Maybe @oxinabox can add some comments? As far as I know he wrote most of the observation dimension work. I will try to devote some time over the weeks to learn what is available in MLDataPattern.jl meanwhile.

@darsnack you've been recently working on a lot ML-related packages right? Are they being developed in an organization already? I am sorry I've been quite busy with work and with the GeoStats.jl stack lately, can't find the time to catch up the ongoing initiatives you guys are developing.

@darsnack
Copy link
Member Author

They will probably be moved to FluxML when they are ready. The tracker is probably the closest thing to a single source. In a couple of weeks I'll try to get things organized more under a single banner. Currently FluxModels.jl and FastAI.jl are the current community efforts.

I'll try to file issues on stuff like nobs as they come up when I refactor MLDataPattern.jl.

@oxinabox
Copy link
Member

Maybe @oxinabox can add some comments? As far as I know he wrote most of the observation dimension work

Not really, @Evizero wrote basically all of that stuff.

@Evizero
Copy link
Member

Evizero commented Aug 15, 2020

Right, so the ObsDim types were used for dispatch. Basically we wanted to support the two schools of thought when it comes to which dimension enumarates the observations. Doing this as a type allowed this to have no runtime overhead.

I used to do a lot of trickery to both support function signatures using just positional arguments that were type stable, as well as convenient keyword arguments that were not. In a sense the methods that used positional arguments were the backend of the convenient methods with the simple keyword arguments. This allowed people who really needed to preserve type stability to use the stable ones in their code, while it also allowed for a simple interface for working interactively.

One could argue that I went a bit too far, even for the time when performances characteristics of julia were more susceptible to these factors, which is fair. Lets just say designing/writing performant julia code used to be somewhat of an obsession of mine.

StatsBase.nobs was simply imported to not introduce our own function name for that concept. thats the whole reason. I don't think we really use the StatsBase implementation anywhere except maybe for dataframe support in MLDataPatterns

@racinmat
Copy link

So currently if we want to use ObsDim in MLDataPattern and other libraries, explicit import in the library e.g. like in this PR JuliaML/MLLabelUtils.jl#40 should suffice?

Or is there plan to deprecate ObsDim and thus it should not be used?

@darsnack
Copy link
Member Author

I have a version of LearnBase.jl locally that fully removes ObsDim in favor of a dim keyword argument. I mainly haven't pushed it because

  1. The right thing would be to update LearnBase.jl and other JuliaML packages somewhat concurrently. (To avoid the same situation as ObsDim not being exported).
  2. My motivation is mainly changes to MLDataPattern.jl, and I haven't done the due diligence of making sure the refactored interface works.

But maybe I should just make the PR so we can discuss what the future interface should look like.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ObsDim is not exported and out of sync with MLLabelUtils?
5 participants