Combined hash and fill AES loop #166

Merged
tevador merged 3 commits into tevador:master from SChernykh:pr-apple
Dec 1, 2019

Conversation

@SChernykh
Collaborator

Adds more parallelism to the AES loop so modern CPUs can take advantage of it. Also, scratchpad data moves between the L1 and L3 caches only once, which saves time and energy per hash.
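The general idea of combining a "hash" pass and a "fill" pass over the same buffer can be illustrated with a toy example. This is only a sketch of loop fusion under stated assumptions: `mix`, `two_pass`, `fused`, and `Result` are hypothetical names, the multiply-xor mixer is an arbitrary stand-in for the AES rounds, and none of this is the code from the PR.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical result type: digest of the old buffer plus the refilled buffer.
struct Result {
    std::uint64_t digest;
    std::vector<std::uint64_t> buffer;
};

// Arbitrary toy mixer standing in for an AES round (assumption, not RandomX code).
inline std::uint64_t mix(std::uint64_t state, std::uint64_t v) {
    return (state ^ v) * 0x9E3779B97F4A7C15ull;
}

// Two separate passes: every element is touched twice.
Result two_pass(std::vector<std::uint64_t> buf, std::uint64_t seed) {
    std::uint64_t digest = 0;
    for (std::uint64_t v : buf) {
        digest = mix(digest, v);      // pass 1: hash the old contents
    }
    std::uint64_t gen = seed;
    for (std::uint64_t& v : buf) {
        gen = mix(gen, 1);            // pass 2: refill with generated data
        v = gen;
    }
    return {digest, std::move(buf)};
}

// Fused loop: each element is read and overwritten in a single visit.
// The digest chain and the generator chain are independent of each other,
// so a CPU is free to overlap them.
Result fused(std::vector<std::uint64_t> buf, std::uint64_t seed) {
    std::uint64_t digest = 0;
    std::uint64_t gen = seed;
    for (std::uint64_t& v : buf) {
        digest = mix(digest, v);      // consume the old value
        gen = mix(gen, 1);
        v = gen;                      // overwrite it during the same visit
    }
    return {digest, std::move(buf)};
}
```

Both functions return the same digest and the same refilled buffer; the fused version simply walks the memory once instead of twice.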

}

-void randomx_calculate_hash_first(randomx_vm* machine, uint64_t (&tempHash)[8], const void* input, size_t inputSize) {
+void randomx_calculate_hash_first(randomx_vm* machine, uint64_t *tempHash, const void* input, size_t inputSize) {
Collaborator

For typechecking purposes you could still typedef this param to uint64_t foo[8] and use the type in the parameter lists.

Owner

Yeah, uint64_t tempHash[8] could be more readable in the API and it decays into a pointer anyways.

Collaborator

And that would have prevented that sizeof() bug too.
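As a small illustration of why the parameter type matters for `sizeof`, the sketch below compares an array-reference parameter with a plain pointer. The helper names `size_via_reference` and `size_via_pointer` are hypothetical and not part of the RandomX API. Note that a parameter written as `uint64_t tempHash[8]` decays to a pointer as well, so only the reference form keeps the array size visible to `sizeof`.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: the parameter is a reference to an array of 8 elements,
// so sizeof reports the size of the whole array.
std::size_t size_via_reference(std::uint64_t (&tempHash)[8]) {
    return sizeof(tempHash);   // 8 * sizeof(std::uint64_t)
}

// Hypothetical helper: the parameter is a plain pointer,
// so sizeof reports only the size of the pointer itself.
std::size_t size_via_pointer(std::uint64_t* tempHash) {
    return sizeof(tempHash);   // sizeof(std::uint64_t*)
}
```

With a pointer parameter, any code that needs the buffer length has to spell it out explicitly, for example as `8 * sizeof(uint64_t)`.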

Using the "first" and "next" hash functions in randomx.cpp, my miner (which is based on this repo, not xmrig) receives "Low difficulty share" from the pool.

Collaborator Author

You have to be careful when using these new functions: check how benchmark.cpp handles them and where it updates the nonce.

My miner works fine with this workflow:

	uint64_t tempHash[8];
	while (nonce < noncesCount) {
		nonce = atomicNonce.fetch_add(1);
		store32(noncePtr, nonce);
		randomx_calculate_hash_first(vm, tempHash, blockTemplate, sizeof(blockTemplate));
		randomx_calculate_hash_next(vm, tempHash, blockTemplate, sizeof(blockTemplate), &hash);
		result.xorWith(hash);
	}

But when the randomx_calculate_hash_first call is moved out of the while loop, the miner gets the wrong result:

	uint64_t tempHash[8];

	store32(noncePtr, nonce);
	randomx_calculate_hash_first(vm, tempHash, blockTemplate, sizeof(blockTemplate));

	while (nonce < noncesCount) {
		nonce = atomicNonce.fetch_add(1);
		store32(noncePtr, nonce);
		randomx_calculate_hash_next(vm, tempHash, blockTemplate, sizeof(blockTemplate), &hash);
		result.xorWith(hash);
	}
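
One possible explanation, offered here as an assumption rather than a confirmed diagnosis, is that in the pipelined form each "next" call returns the hash of the previously submitted input while starting work on the new one. If that is the case, a hash has to be paired with the nonce from the previous iteration, not with the nonce just stored. The toy model below illustrates that bookkeeping; `ToyPipeline`, `toy_hash`, and `scan` are hypothetical names and the mixing function is arbitrary, so this is not the RandomX implementation.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Toy stand-in for a pipelined hasher (assumed semantics, not RandomX code).
struct ToyPipeline {
    std::uint32_t pending = 0;  // input whose hash has not been returned yet

    static std::uint64_t toy_hash(std::uint32_t nonce) {
        return nonce * 2654435761ull;  // arbitrary mixing, for illustration only
    }

    // Begins the first calculation; produces no output.
    void first(std::uint32_t nonce) { pending = nonce; }

    // Returns the hash of the PREVIOUS input, then queues the new one.
    std::uint64_t next(std::uint32_t nonce) {
        std::uint64_t out = toy_hash(pending);
        pending = nonce;
        return out;
    }
};

// Scans `count` nonces starting at `start`, pairing every hash with the
// nonce it actually belongs to.
std::vector<std::pair<std::uint32_t, std::uint64_t>>
scan(std::uint32_t start, std::uint32_t count) {
    std::vector<std::pair<std::uint32_t, std::uint64_t>> results;
    ToyPipeline pipeline;
    std::uint32_t prev = start;
    pipeline.first(prev);
    for (std::uint32_t i = 1; i <= count; ++i) {
        std::uint32_t cur = start + i;
        std::uint64_t h = pipeline.next(cur);  // h belongs to prev, not cur
        results.emplace_back(prev, h);
        prev = cur;
    }
    return results;
}
```

In a miner built this way, the nonce submitted with a share would be the remembered previous nonce, which is the kind of detail worth checking against how benchmark.cpp updates the nonce.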

@tevador tevador merged commit 219c02e into tevador:master Dec 1, 2019
SmajeNz0 pushed a commit to SmajeNz0/RandomARQ that referenced this pull request May 10, 2020
4 participants