deeplearning/dl-op-matmul-transpose-b-benchmark.json

2025-10-13 07:43:23 UTC

Name                                                   Time (ms)   CPU (ms)   Iterations
DL_OPS_MATMUL_TRANSPOSE_B/scalar_O0/iterations:5       1.1e+03     1.09e+03   5
DL_OPS_MATMUL_TRANSPOSE_B/scalar_O3/iterations:5       296         296        5
DL_OPS_MATMUL_TRANSPOSE_B/scalar_O3_omp/iterations:5   36.3        24.1       5
DL_OPS_MATMUL_TRANSPOSE_B/vec/iterations:5             95.4        95.3       5

Console output
2025-09-07T12:46:24+00:00
Running ./dl-op-matmul-transpose-b-benchmark
Run on (24 X 5100 MHz CPU s)
CPU Caches:
  L1 Data 48 KiB (x12)
  L1 Instruction 32 KiB (x12)
  L2 Unified 1280 KiB (x12)
  L3 Unified 30720 KiB (x1)
Load Average: 4.61, 3.68, 5.14
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
-----------------------------------------------------------------------------------------------
Benchmark                                                     Time             CPU   Iterations
-----------------------------------------------------------------------------------------------
DL_OPS_MATMUL_TRANSPOSE_B/scalar_O0/iterations:5           1096 ms         1094 ms            5
DL_OPS_MATMUL_TRANSPOSE_B/scalar_O3/iterations:5            296 ms          296 ms            5
DL_OPS_MATMUL_TRANSPOSE_B/scalar_O3_omp/iterations:5       36.3 ms         24.1 ms            5
DL_OPS_MATMUL_TRANSPOSE_B/vec/iterations:5                 95.4 ms         95.3 ms            5
---------- Verification ----------
scalar_O3 PASS
scalar_O3_omp PASS
vec PASS
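
The wall-clock times above are easier to compare as speedups over the scalar_O0 baseline. The following is a minimal sketch and not part of the benchmark itself: the dictionary `real_time_ms` and the function `speedups` are hypothetical names, and the values are copied by hand from the Time column of the console output.

```python
# Wall-clock times in milliseconds, copied from the "Time" column of the
# console output above (hypothetical dictionary, not produced by the benchmark).
real_time_ms = {
    "scalar_O0": 1096.0,
    "scalar_O3": 296.0,
    "scalar_O3_omp": 36.3,
    "vec": 95.4,
}


def speedups(times, baseline="scalar_O0"):
    """Return each variant's speedup relative to the baseline variant."""
    base = times[baseline]
    return {name: base / t for name, t in times.items()}


# Print one line per variant, for example "scalar_O3  3.70x".
for name, factor in speedups(real_time_ms).items():
    print(f"{name:<15} {factor:6.2f}x")
```

Because CPU scaling was enabled during the run (see the warning in the log), these ratios are best read as rough indications rather than precise figures.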