Update to use NullableArrays and CategoricalArrays #145
| import DataStructures: OrderedDict | ||
| import Base: eltype, show, convert, isascii, | ||
| import Base: eltype, show, convert, isascii, isnull, |
You shouldn't need import for this; using should be enough, AFAICT.
It is because RCall also has an isnull method.
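To illustrate the point about import, here is a minimal sketch of a module that adds its own method to a Base function under the unqualified name. The module name ImportDemo and the type Box are hypothetical, and isempty is used as a stand-in because isnull is no longer part of Base in current Julia versions. This is not RCall's actual code.

```julia
module ImportDemo

# `import` makes it possible to add methods using the unqualified name.
import Base: isempty

# Hypothetical wrapper type, for illustration only.
struct Box
    items::Vector{Int}
end

# Adds a method to Base.isempty. Without the import above, this definition
# would not add a method to Base.isempty.
isempty(b::Box) = isempty(b.items)

end # module

empty_box = ImportDemo.Box(Int[])
full_box = ImportDemo.Box([1, 2])
```

An alternative that needs no import is to qualify the definition, as in Base.isempty(b::Box) = ..., which some projects prefer because it makes the extension explicit at the definition site.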
test/dataframe.jl
Outdated
| @test rcopy(Nullable, RObject(1)).value == 1 | ||
| @test rcopy(Nullable, RObject("abc")).value == "abc" | ||
| @test rcopy(RObject(Nullable(1))) == 1 | ||
| @test rcopy(Nullable, RObject(Nullable(1, true))).isnull |
You should really avoid accessing the isnull field directly, as it's likely to go away soon (JuliaLang/julia#18510). Use the accessor instead. (This also applies in other places.)
test/dataframe.jl
Outdated
| @test rcopy(DataArray,"c(NA,'NA')").na == @data([NA,"NA"]).na | ||
| @test_throws ErrorException rcopy(DataArray,"as.factor(c('a','a','c'))") | ||
| @test rcopy(PooledDataArray,"as.factor(c('a','a','c'))").pool == ["a","c"] | ||
| @test rcopy(NullableArray,"c(NA,TRUE)").isnull == NullableArray([true,true], [true,false]).isnull |
Same here for NullableArray. You can use isequal to test for equality of both contents and missingness, which is even better.
Is there any way to compare the following without accessing .values?
isequal(NullableArray([1,2,3]), collect(1:3)) # true
isequal(NullableArray([1,2,3]).values, collect(1:3)) # false
Yes, you need to do NullableArray(1:3).
isequal(NullableArray(1:3), 1:3) and isequal(NullableArray(1:3), collect(1:3)) both return false.
I meant isequal(rcopy(NullableArray,"c(NA,TRUE)"), NullableArray([true,true], [true,false])) and so on (i.e. convert both sides to a NullableArray).
src/convert-data.jl
Outdated
| function sexp{T<:Compat.String,N,R<:Integer}(v::$typ{T,N,R}) | ||
| rv = protect(sexp(v.refs)) | ||
| try | ||
| for (i,isna) = enumerate(v.refs .== 0) |
v.refs .== 0 will allocate a vector. Better to do the == 0 comparison on each element inside the loop.
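The suggestion above can be sketched on a plain integer vector. The function name na_positions is hypothetical, and the convention that a reference code of 0 marks a missing entry is taken from the diff. The syntax assumes a current Julia version, not the one the PR targeted.

```julia
# Collect the indices whose reference code is 0 (treated as NA), comparing
# each element inside the loop instead of materializing `refs .== 0` first.
function na_positions(refs::AbstractVector{<:Integer})
    positions = Int[]
    for (i, ref) in enumerate(refs)
        if ref == 0
            push!(positions, i)
        end
    end
    return positions
end

refs = UInt32[1, 0, 2, 0]
positions = na_positions(refs)
```

The per-element comparison avoids creating the temporary Boolean array that the broadcasted form would build before the loop even starts.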
src/convert-data.jl
Outdated
| rv[i] = naeltype(eltype(rv)) | ||
| end | ||
| end | ||
| setattrib!(rv, Const.LevelsSymbol, sexp(v.pool.index)) |
Please use CategoricalArrays.index(v), ordered(v) and levels(v). Private fields should really not be accessed directly as they could change.
CategoricalArrays.index does not work?
julia> v = CategoricalArray(repeat(["a", "b"], inner = 5));
julia> CategoricalArrays.levels(v)
2-element Array{String,1}:
"a"
"b"
julia> CategoricalArrays.index(v)
ERROR: MethodError: no method matching index(::CategoricalArrays.CategoricalArray{String,1,UInt32})
Closest candidates are:
index(::CategoricalArrays.CategoricalPool{T,R<:Integer,V}) at /Users/Randy/.julia/v0.5/CategoricalArrays/src/pool.jl:161
Ah, indeed, at first I didn't intend it to be used outside of the package. For now the best approach is CategoricalArrays.index(v.pool). I'll try to see if I can improve the API in the next releases.
TODO