Description
Having `levels` automatically unwrap `CategoricalValue`s is a bit inconsistent with the fallback defined in DataAPI, which is equivalent to `sort!(unique(x))`. This is mostly historical, as `levels` was defined in CategoricalArrays first. But it can make some implementations more complex when the goal is to return a `CategoricalArray` (JuliaData/DataFrames.jl#3031 (comment)).
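
To make the inconsistency concrete, here is a minimal Base-only sketch. `fallback_levels` and `Wrapped` are hypothetical names introduced for illustration: the first mirrors the `sort!(unique(x))` behavior mentioned above, the second is a toy stand-in for a wrapped value and not the package's `CategoricalValue`. The point is that a generic fallback of this shape returns elements of the same type it was given, without unwrapping anything.

```julia
# Hypothetical helper mirroring the generic fallback described above:
# collect the distinct values, then sort them in place.
fallback_levels(x) = sort!(unique(x))

# Toy stand-in for a wrapped value; this is NOT CategoricalValue.
struct Wrapped
    value::String
end

# Ordering is delegated to the wrapped value so that sort! works.
Base.isless(a::Wrapped, b::Wrapped) = isless(a.value, b.value)

plain = ["b", "a", "b", "c"]
wrapped = Wrapped.(plain)

plain_levels = fallback_levels(plain)      # element type stays String
wrapped_levels = fallback_levels(wrapped)  # element type stays Wrapped: nothing is unwrapped
```

A `levels` method that unwraps therefore behaves differently from what generic code written against the fallback would expect.
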
So it could make sense to make `levels` return a `CategoricalArray` in the next breaking release. This should be mostly transparent for users as long as they don't apply functions like string operations to levels. This could end up being inconvenient in some cases though. It would also force allocating a new array each time `levels` is called (as we probably don't want to store a `CategoricalArray` inside the `CategoricalPool` for that).
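
The two costs mentioned above (string operations no longer applying directly, and an allocation on every call) can be sketched with a toy model. Everything here is an assumption made for illustration: `ToyPool`, `ToyValue`, `unwrap_toy`, `raw_levels` and `wrapped_levels` are hypothetical names, and a plain `Vector` of wrapped values stands in for a `CategoricalArray`. It is not how CategoricalArrays is implemented.

```julia
# Toy model only: these types are hypothetical stand-ins, not the package's own.
struct ToyPool
    levels::Vector{String}   # raw level values, stored once
end

struct ToyValue
    pool::ToyPool
    ref::Int                 # index into pool.levels
end

# Recover the raw value behind a wrapped one.
unwrap_toy(v::ToyValue) = v.pool.levels[v.ref]

# Unwrapping style: hand back the stored vector, so no new array is created.
raw_levels(p::ToyPool) = p.levels

# Wrapping style: build the wrapped values again on every call.
wrapped_levels(p::ToyPool) = [ToyValue(p, i) for i in eachindex(p.levels)]

pool = ToyPool(["a", "b", "c"])

# Identity checks show whether repeated calls share one array.
same_raw = raw_levels(pool) === raw_levels(pool)
same_wrapped = wrapped_levels(pool) === wrapped_levels(pool)

# String functions have no method for the wrapper, so unwrap first.
upper = uppercase.(unwrap_toy.(wrapped_levels(pool)))
```

In this sketch the unwrapping style returns the same stored array on every call, while the wrapping style produces a fresh array each time, and callers that want string operations have to add an explicit unwrapping step.
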
Cc: @bkamins