Refactoring of codebase #35
Dear all,
In this issue I would like to discuss a refactoring of LearnBase.jl to accommodate more general problems under transfer learning settings. Before I can do this, I would like to get your feedback on a few minor changes. These changes should facilitate a holistic view of the interface, and should help shape the workflow that developers are expected to follow (see #28).
Below are a few suggestions for improvement that I would like us to consider.
Suggestions for improvement
1. Split the main `LearnBase.jl` file into smaller source files, each covering a more specific concept. For example, I'd like to review the `Cost` interface in a separate file called `costs.jl`. Similarly, we could move the data orientation interface to a separate file `orientation.jl` and include these two files in `LearnBase.jl`.
2. Can we get rid of all exports in the module? I understand that this module is intended for use by developers who would write `import LearnBase; const LB = LearnBase` in their code. Exporting all the names in `LearnBase.jl` can lead to problems downstream. For example, LossFunctions.jl was not exporting the abstract `SupervisedLoss` type, so users of `LossFunctions.jl` also needed to import `LearnBase.jl` just to get access to the name. My suggestion here is to define the interface without exports, and then let each package in JuliaML export the relevant concepts.
3. The interface for learning models is currently spread over various different Julia ecosystems. In most cases, there are two functions that developers need to implement (e.g. `fit/predict`, `model/update`, `fit/transform`). I would like to do a literature review on the existing approaches, and generalize this to transfer learning settings. This generalization shouldn't force users to subtype their models from some `Model` type. A traits-based interface is ideal for developers who want to plug in their models after the fact, and for developers interested in fitting entire pipelines (e.g. AutoMLPipeline.jl).
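To make the traits idea in (3) a bit more concrete, here is a minimal sketch of what I have in mind. None of these names exist in LearnBase.jl today: `LearningStyle`, `Learnable`, `NotLearnable`, `learningstyle`, `MeanModel` and `FittedMean` are all hypothetical and only illustrate the dispatch pattern, not a proposed final API.

```julia
# Hypothetical trait types (not part of LearnBase.jl).
abstract type LearningStyle end
struct Learnable <: LearningStyle end
struct NotLearnable <: LearningStyle end

# By default, arbitrary objects are not treated as learning models.
learningstyle(::Type) = NotLearnable()
learningstyle(x) = learningstyle(typeof(x))

# Generic entry point: dispatch on the trait, not on a supertype.
fit(model, x, y) = fit(learningstyle(model), model, x, y)
fit(::NotLearnable, model, x, y) =
    throw(ArgumentError("$(typeof(model)) does not implement the learning interface"))

# A third-party model that does not subtype any `Model` type.
struct MeanModel end
struct FittedMean
    value::Float64
end

# Opting in after the fact only requires declaring the trait
# and implementing the two functions.
learningstyle(::Type{MeanModel}) = Learnable()
fit(::Learnable, ::MeanModel, x, y) = FittedMean(sum(y) / length(y))
predict(m::FittedMean, x) = fill(m.value, length(x))

fitted = fit(MeanModel(), [1, 2, 3], [2.0, 4.0, 6.0])
predict(fitted, [10, 20])
```

The point of the sketch is that `MeanModel` opts in by adding one `learningstyle` method, so models defined in other packages could join the interface without changing their type hierarchy.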
I would like to start addressing (1) and (2) in the coming weeks. To address (3), I need more time to investigate and brainstorm a more general interface.
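For (1) and (2), a rough sketch of the layout could look like the following. The module names `LearnBaseSketch` and `LossesSketch` are stand-ins I made up so the snippet runs on its own, the type hierarchy is simplified, and `L2Loss` and the `value` method are purely illustrative. In the real package the definitions would sit in `costs.jl` and `orientation.jl` and be pulled in with `include`.

```julia
# Stand-in for LearnBase: defines the interface and exports nothing.
module LearnBaseSketch
    # In the proposed layout these definitions would live in separate files:
    #   include("costs.jl")
    #   include("orientation.jl")
    abstract type Cost end
    abstract type SupervisedLoss <: Cost end   # simplified hierarchy (assumption)
    function value end                         # generic function with no methods yet
end

# Stand-in for a downstream package such as LossFunctions.jl.
module LossesSketch
    import ..LearnBaseSketch
    const LB = LearnBaseSketch

    # Bring the abstract type into this namespace so it can be re-exported.
    import ..LearnBaseSketch: SupervisedLoss
    export SupervisedLoss, L2Loss

    struct L2Loss <: SupervisedLoss end

    # Extend the interface function through the qualified name.
    LB.value(::L2Loss, target, output) = abs2(output - target)
end

# End users only load the downstream package and still see `SupervisedLoss`.
using .LossesSketch
L2Loss() isa SupervisedLoss
```

With this arrangement the base module stays export-free, and each downstream package decides which names its users should see.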