Commit cb1b7b5

Bugfix: we accidentally did signed comparisons instead of unsigned
1 parent 6cb4a8e commit cb1b7b5
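
The commit title names the bug but this page does not show the fix itself. As a minimal sketch of the general problem, assuming the AVX2 path relies on signed integer compares such as `VPCMPGTB` (AVX2 has no unsigned integer greater-than), the scalar Go below shows how a signed compare misorders unsigned values and how flipping the sign bit of both operands restores unsigned ordering. The names `signedGreater` and `unsignedGreater` are hypothetical and not taken from the repository.

```go
package main

import "fmt"

// signedGreater mimics a signed byte compare: both operands are
// reinterpreted as int8 before comparing, so values >= 128 become negative.
func signedGreater(a, b uint8) bool {
	return int8(a) > int8(b)
}

// unsignedGreater recovers unsigned ordering from a signed compare by
// flipping the sign bit of both operands first. This is one common
// technique, not necessarily the one used in this commit.
func unsignedGreater(a, b uint8) bool {
	const signBit = 0x80
	return int8(a^signBit) > int8(b^signBit)
}

func main() {
	a, b := uint8(200), uint8(100)
	fmt.Println(a > b)                 // true
	fmt.Println(signedGreater(a, b))   // false: 200 reinterprets as -56
	fmt.Println(unsignedGreater(a, b)) // true
}
```

The same idea extends to wider lanes by flipping the top bit of each 16-, 32- or 64-bit element.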

4 files changed: +346 -36 lines changed

README.md

Lines changed: 41 additions & 33 deletions
@@ -19,46 +19,54 @@ func goVectorEquals[T uint8 | uint16 | uint32 | uint64 | float32 | float64](dst
 }
 ```
 
-Similar to VectorEquals we also implement VectorGreaterThan, VectorLessThan, VectorGreaterEquals and VectorLesserEquals.
+Similar to VectorEquals we also implement VectorNotEquals, VectorGreaterThan, VectorLessThan, VectorGreaterEquals and VectorLesserEquals.
 
 ## Benchmark
 
 ```
 goos: linux
 goarch: amd64
 pkg: github.com/hexon/vectorcmp
-cpu: 13th Gen Intel(R) Core(TM) i9-13900
+cpu: 14th Gen Intel(R) Core(TM) i9-14900
 │ purego │ avx2 │
 │ sec/op │ sec/op vs base │
-VectorEquals8-32 191765.5n ± 2% 912.5n ± 2% -99.52% (p=0.000 n=10)
-VectorGreaterThan8-32 189871.0n ± 1% 899.4n ± 1% -99.53% (p=0.000 n=10)
-VectorLessThan8-32 189939.5n ± 2% 903.3n ± 1% -99.52% (p=0.000 n=10)
-VectorGreaterEquals8-32 200345.0n ± 1% 969.2n ± 3% -99.52% (p=0.000 n=10)
-VectorLesserEquals8-32 198189.5n ± 2% 973.8n ± 3% -99.51% (p=0.000 n=10)
-VectorEquals16-32 187.738µ ± 2% 1.892µ ± 7% -98.99% (p=0.000 n=10)
-VectorGreaterThan16-32 191.861µ ± 1% 1.890µ ± 6% -99.01% (p=0.000 n=10)
-VectorLessThan16-32 189.357µ ± 2% 1.894µ ± 4% -99.00% (p=0.000 n=10)
-VectorGreaterEquals16-32 201.101µ ± 2% 2.003µ ± 5% -99.00% (p=0.000 n=10)
-VectorLesserEquals16-32 197.853µ ± 1% 1.991µ ± 5% -98.99% (p=0.000 n=10)
-VectorEquals32-32 192.640µ ± 1% 3.685µ ± 5% -98.09% (p=0.000 n=10)
-VectorEqualsFloat32-32 209.408µ ± 1% 3.156µ ± 1% -98.49% (p=0.000 n=10)
-VectorGreaterThan32-32 192.983µ ± 1% 3.717µ ± 2% -98.07% (p=0.000 n=10)
-VectorGreaterThanFloat32-32 210.209µ ± 2% 3.156µ ± 1% -98.50% (p=0.000 n=10)
-VectorLessThan32-32 188.972µ ± 1% 3.729µ ± 4% -98.03% (p=0.000 n=10)
-VectorLessThanFloat32-32 211.101µ ± 3% 3.137µ ± 2% -98.51% (p=0.000 n=10)
-VectorGreaterEquals32-32 199.521µ ± 1% 3.859µ ± 4% -98.07% (p=0.000 n=10)
-VectorGreaterEqualsFloat32-32 217.290µ ± 1% 3.179µ ± 1% -98.54% (p=0.000 n=10)
-VectorLesserEquals32-32 200.432µ ± 1% 3.907µ ± 3% -98.05% (p=0.000 n=10)
-VectorLesserEqualsFloat32-32 215.699µ ± 2% 3.167µ ± 2% -98.53% (p=0.000 n=10)
-VectorEquals64-32 194.138µ ± 2% 6.628µ ± 4% -96.59% (p=0.000 n=10)
-VectorEqualsFloat64-32 228.983µ ± 1% 6.341µ ± 1% -97.23% (p=0.000 n=10)
-VectorGreaterThan64-32 191.010µ ± 2% 7.453µ ± 3% -96.10% (p=0.000 n=10)
-VectorGreaterThanFloat64-32 205.197µ ± 1% 6.357µ ± 1% -96.90% (p=0.000 n=10)
-VectorLessThan64-32 192.766µ ± 3% 7.443µ ± 3% -96.14% (p=0.000 n=10)
-VectorLessThanFloat64-32 212.995µ ± 2% 6.323µ ± 1% -97.03% (p=0.000 n=10)
-VectorGreaterEquals64-32 203.198µ ± 2% 7.407µ ± 3% -96.35% (p=0.000 n=10)
-VectorGreaterEqualsFloat64-32 220.453µ ± 2% 6.391µ ± 2% -97.10% (p=0.000 n=10)
-VectorLesserEquals64-32 201.495µ ± 2% 7.541µ ± 3% -96.26% (p=0.000 n=10)
-VectorLesserEqualsFloat64-32 215.303µ ± 3% 6.388µ ± 2% -97.03% (p=0.000 n=10)
-geomean 201.1µ 3.160µ -98.43%
+VectorEquals8-32 145594.5n ± 2% 628.0n ± 2% -99.57% (p=0.000 n=10)
+VectorNotEquals8-32 141155.0n ± 1% 679.7n ± 4% -99.52% (p=0.000 n=10)
+VectorGreaterThan8-32 146456.5n ± 2% 856.9n ± 2% -99.41% (p=0.000 n=10)
+VectorLessThan8-32 144820.5n ± 1% 858.8n ± 2% -99.41% (p=0.000 n=10)
+VectorGreaterEquals8-32 139554.0n ± 2% 928.1n ± 1% -99.33% (p=0.000 n=10)
+VectorLesserEquals8-32 141419.5n ± 1% 917.0n ± 3% -99.35% (p=0.000 n=10)
+VectorEquals16-32 151.078µ ± 1% 1.325µ ± 2% -99.12% (p=0.000 n=10)
+VectorNotEquals16-32 139.751µ ± 3% 1.359µ ± 4% -99.03% (p=0.000 n=10)
+VectorGreaterThan16-32 150.466µ ± 2% 1.682µ ± 1% -98.88% (p=0.000 n=10)
+VectorLessThan16-32 151.316µ ± 1% 1.700µ ± 2% -98.88% (p=0.000 n=10)
+VectorGreaterEquals16-32 139.931µ ± 2% 1.728µ ± 2% -98.77% (p=0.000 n=10)
+VectorLesserEquals16-32 138.835µ ± 1% 1.760µ ± 3% -98.73% (p=0.000 n=10)
+VectorEquals32-32 146.441µ ± 2% 2.626µ ± 1% -98.21% (p=0.000 n=10)
+VectorEqualsFloat32-32 153.134µ ± 1% 2.215µ ± 1% -98.55% (p=0.000 n=10)
+VectorNotEquals32-32 140.962µ ± 2% 2.663µ ± 2% -98.11% (p=0.000 n=10)
+VectorNotEqualsFloat32-32 160.325µ ± 2% 2.195µ ± 1% -98.63% (p=0.000 n=10)
+VectorGreaterThan32-32 144.302µ ± 3% 3.333µ ± 2% -97.69% (p=0.000 n=10)
+VectorGreaterThanFloat32-32 146.653µ ± 3% 2.209µ ± 2% -98.49% (p=0.000 n=10)
+VectorLessThan32-32 143.874µ ± 2% 3.348µ ± 3% -97.67% (p=0.000 n=10)
+VectorLessThanFloat32-32 147.214µ ± 1% 2.221µ ± 1% -98.49% (p=0.000 n=10)
+VectorGreaterEquals32-32 139.414µ ± 2% 3.506µ ± 0% -97.49% (p=0.000 n=10)
+VectorGreaterEqualsFloat32-32 151.919µ ± 1% 2.226µ ± 1% -98.54% (p=0.000 n=10)
+VectorLesserEquals32-32 142.315µ ± 3% 3.447µ ± 3% -97.58% (p=0.000 n=10)
+VectorLesserEqualsFloat32-32 152.132µ ± 2% 2.188µ ± 1% -98.56% (p=0.000 n=10)
+VectorEquals64-32 146.577µ ± 1% 4.468µ ± 3% -96.95% (p=0.000 n=10)
+VectorEqualsFloat64-32 152.734µ ± 2% 4.426µ ± 2% -97.10% (p=0.000 n=10)
+VectorNotEquals64-32 141.699µ ± 2% 4.506µ ± 5% -96.82% (p=0.000 n=10)
+VectorNotEqualsFloat64-32 164.610µ ± 2% 4.443µ ± 1% -97.30% (p=0.000 n=10)
+VectorGreaterThan64-32 146.472µ ± 2% 6.420µ ± 2% -95.62% (p=0.000 n=10)
+VectorGreaterThanFloat64-32 149.198µ ± 2% 4.423µ ± 1% -97.04% (p=0.000 n=10)
+VectorLessThan64-32 145.543µ ± 2% 6.190µ ± 3% -95.75% (p=0.000 n=10)
+VectorLessThanFloat64-32 146.221µ ± 2% 4.373µ ± 1% -97.01% (p=0.000 n=10)
+VectorGreaterEquals64-32 139.420µ ± 2% 6.651µ ± 2% -95.23% (p=0.000 n=10)
+VectorGreaterEqualsFloat64-32 151.540µ ± 2% 4.426µ ± 1% -97.08% (p=0.000 n=10)
+VectorLesserEquals64-32 139.964µ ± 1% 6.508µ ± 3% -95.35% (p=0.000 n=10)
+VectorLesserEqualsFloat64-32 151.295µ ± 1% 4.482µ ± 2% -97.04% (p=0.000 n=10)
+VectorIsNaNFloat32-32 23.555µ ± 1% 2.228µ ± 1% -90.54% (p=0.000 n=10)
+VectorIsNaNFloat64-32 23.687µ ± 1% 4.410µ ± 1% -81.38% (p=0.000 n=10)
+geomean 133.0µ 2.492µ -98.13%
 ```
