Conversation
Codecov Report
@@ Coverage Diff @@
## master #194 +/- ##
==========================================
+ Coverage 89.11% 89.15% +0.04%
==========================================
Files 6 6
Lines 450 452 +2
==========================================
+ Hits 401 403 +2
Misses 49 49
Continue to review full report at Codecov.
|
julia> @btime clamp.(x, 0.1N0f8, 0.9N0f8) setup=(x=collect(rand(N0f8, 512, 512)));
460.800 μs (2 allocations: 256.14 KiB) # before (Julia v1.4.2)
11.801 μs (2 allocations: 256.14 KiB) # after (Julia v1.4.2)It seems to be faster on Julia v1.6.0-DEV for large arrays. |
|
At least on Julia 1.5, doesn't this give the same codegen? julia> @code_typed clamp(0.5N0f8, 0N0f8, 1N0f8)
CodeInfo(
1 ─ %1 = Base.getfield(hi, :i)::UInt8
│ %2 = Base.getfield(x, :i)::UInt8
│ %3 = Base.ult_int(%1, %2)::Bool
│ %4 = Base.getfield(x, :i)::UInt8
│ %5 = Base.getfield(lo, :i)::UInt8
│ %6 = Base.ult_int(%4, %5)::Bool
│ %7 = Base.Math.ifelse(%6, lo, x)::Normed{UInt8,8}
│ %8 = Base.Math.ifelse(%3, hi, %7)::Normed{UInt8,8}
└── return %8
) => Normed{UInt8,8}
julia> newclamp(x::X, lo::X, hi::X) where {X <: FixedPoint} = X(clamp(x.i, lo.i, hi.i), 0)
newclamp (generic function with 1 method)
julia> @code_typed newclamp(0.5N0f8, 0N0f8, 1N0f8)
CodeInfo(
1 ─ %1 = Base.getfield(x, :i)::UInt8
│ %2 = Base.getfield(lo, :i)::UInt8
│ %3 = Base.getfield(hi, :i)::UInt8
│ %4 = Base.ult_int(%3, %1)::Bool
│ %5 = Base.ult_int(%1, %2)::Bool
│ %6 = Base.Math.ifelse(%5, %2, %1)::UInt8
│ %7 = Base.Math.ifelse(%4, %3, %6)::UInt8
│ %8 = %new(Normed{UInt8,8}, %7)::Normed{UInt8,8}
└── return %8
) => Normed{UInt8,8} |
|
I believed that too (#179 (comment)), and I don't know the cause, but the actual results are different (in v1.0.5, v1.4.2, v1.5.0-rc1 and v1.6.0-DEV). 😕 julia> @btime clamp(x,y,z) setup=(x=rand(N0f8); y=rand(N0f8); z=rand(N0f8)); # non-vectorized version is OK
1.099 ns (0 allocations: 0 bytes) # before (Julia v1.5.0-rc1)
1.099 ns (0 allocations: 0 bytes) # after (Julia v1.5.0-rc1)
julia> @btime clamp.(x, 0.1N0f8, 0.9N0f8) setup=(x=collect(rand(N0f8, 512, 512)));
160.399 μs (2 allocations: 256.14 KiB) # before (Julia v1.5.0-rc1)
15.300 μs (2 allocations: 256.14 KiB) # after (Julia v1.5.0-rc1) |
|
That is pretty crazy. I don't understand either, but this is fine. Since it is surprising, perhaps add a code-comment linking to this PR? Then fine to merge. |
|
I think it's a common thing on Julia's auto-vectorization. 😅 Edit: |
This adds a specialized method for
clampwith all arguments in the sameFixedPointtype. This also supports the newclamp(x, T)method (cf. JuliaLang/julia#34426).This partially solves #179, but this is not helpful in cases where promotions occur.
cc: @johnnychen94