Skip to content

tchiwam/ptrbench

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

58 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

A Simple bench mark to try find the limits of cache versus flops speed.

benchfloat:

For example many very fast vector processors can handle a stream of 2 ops (1 muladd) per main memory fetch
While intel processors can do up to 20 ops per main memory fetch.

benchcast:

Find the best way for you to convert int8 and int16 into floating point. On some processors and compiler combination, the casting can be very expensive

This can help you to divide some functions in stages of no more than N ops per memory fetch and keep all in the inner cache While another thread is fetching and preparing the stream to be processed.



About

Benchmark to find the number of ops per memory fetch

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors