.. | ||
internal/compress | ||
error.go | ||
hasher.go | ||
hasher224.go | ||
hasher256.go | ||
LICENSE | ||
README.md |
blake256
Overview
Package blake256
implements the BLAKE-256 and BLAKE-224 cryptographic hash
functions (SHA-3 candidate) in pure Go
along with highly optimized SSE2, SSE4.1, and AVX acceleration.
It provides an API that enables zero allocations and the ability to save and restore the intermediate state (also often called the midstate). The design philosophy has a strong on emphasis correctness, readability, and efficiency while also aiming to provide an ergonomic API.
In addition to the zero allocation API, it also implements the standard library
interfaces hash.Hash
, encoding.BinaryMarshaler
, and
encoding.BinaryUnmarshaler
for callers that are not as concerned about
avoiding allocations. No dependencies beyond the standard library are required.
A full suite of tests with 100% branch coverage and benchmarks are provided to help ensure proper functionality and analyze performance characteristics.
The core assembly code to take advantage of the amd64
SIMD vector extensions
is generated with Go via avo.
Show me the benchmarks already!
Hashing Data
The simplest way to hash data that is already serialized into bytes is via the
global Sum224
(BLAKE-224) or Sum256
(BLAKE-256) functions. This is
demonstrated for BLAKE-256 via the "Basic Usage" example linked in the
Examples section.
However, since hashing typically involves writing various pieces of information
that aren't already serialized, this package provides NewHasher224
(BLAKE-224)
and NewHasher256
(BLAKE-256) (and their respective variants NewHasher224Salt
and NewHasher256Salt
that accept salt).
These methods return rolling hasher instances that support writing an arbitrary
amount of data along with several convenience methods for writing various data
types in either big endian or little endian. For example, WriteString
adds a
string encoded as its UTF-8 byte sequence to the rolling hash and
WriteUint64BE
adds an unsigned 64-bit integer encoded as an 8-byte big-endian
byte sequence to the rolling hash.
The hash is then obtained via the Sum224
(BLAKE-224) or Sum256
(BLAKE-256)
method on the respective hasher instance.
See the "Rolling Hasher Usage" example linked in the Examples section to see rolling hashing in action.
Saving and Resuming Intermediate States
Many applications involve hashing data that always starts with the same sequence
of bytes (aka a shared prefix). Whenever that prefix is larger than the block
size (BlockSize
), or it is otherwise costly to generate and serialize, it is
typically more efficient to save the intermediate state (midstate) after writing
the shared prefix so that all future hashes can resume from that midstate and
thereby avoid redoing work.
To that end, the aforementioned rolling hasher instances support being copied to save and restore the current midstate within the same process. This is demonstrated via the "Same Process Save and Restore" example linked in the Examples section.
Alternatively, when a simple copy of the instance is not possible, such as when
the midstate is needed among multiple processes, perhaps on entirely different
hardware, it can be serialized via SaveState
and restored via
UnmarshalBinary
. Note that there is necessarily additional overhead involved
with serializing and deserializing the intermediate state, so callers should be
sure to compare that overhead with rehashing the shared data to see which
approach yields better results for their particular application.
Hashing With Salt
This implementation also provides NewHasher224Salt
(BLAKE-224) and
NewHasher256Salt
(BLAKE-256) which accept a 16-byte salt input as described by
the specification. Hashing with distinct salts effectively provides an
efficient method to hash with different functions while using the same
underlying algorithm. The salted variants behave exactly the same as the normal
unsalted variants described throughout the documentation.
Benchmarks
The following benchmarks are from a Ryzen 7 5800X3D processor on Linux and are
the result of feeding benchstat
10 iterations of each. Benchmarks for both
BLAKE-224 and BLAKE-256 are provided. They are essentialy identical (within the
margin of error) as expected since the only notable difference as it pertains to
performance is that the final output is 4 bytes shorter.
BLAKE-256 Hashing Benchmarks
The following results demonstrate the performance of hashing various amounts of
data for both small and larger inputs with the Sum256
method.
Operation | Pure Go | SSE2 | SSE4.1 | AVX |
---|---|---|---|---|
Sum256 (32b) |
168MB/s ± 1% | 188MB/s ± 1% | 232MB/s ± 0% | 234MB/s ± 1% |
Sum256 (64b) |
187MB/s ± 0% | 208MB/s ± 0% | 270MB/s ± 1% | 271MB/s ± 1% |
Sum256 (1KiB) |
378MB/s ± 1% | 421MB/s ± 1% | 536MB/s ± 1% | 539MB/s ± 1% |
Sum256 (8KiB) |
405MB/s ± 1% | 448MB/s ± 0% | 573MB/s ± 0% | 573MB/s ± 0% |
Sum256 (16KiB) |
402MB/s ± 1% | 449MB/s ± 0% | 575MB/s ± 0% | 575MB/s ± 0% |
Operation | Pure Go | SSE2 | SSE4.1 | AVX | Allocs / Op |
---|---|---|---|---|---|
Sum256 (32b) |
190ns ± 1% | 170ns ± 1% | 138ns ± 0% | 137ns ± 1% | 0 |
Sum256 (64b) |
342ns ± 0% | 308ns ± 0% | 237ns ± 1% | 236ns ± 1% | 0 |
Sum256 (1KiB) |
2.71µs ± 1% | 2.43µs ± 1% | 1.91µs ± 1% | 1.90µs ± 1% | 0 |
Sum256 (8KiB) |
20.2µs ± 1% | 18.3µs ± 0% | 14.3µs ± 0% | 14.3µs ± 0% | 0 |
Sum256 (16KiB) |
40.8µs ± 1% | 36.5µs ± 0% | 28.5µs ± 0% | 28.5µs ± 0% | 0 |
BLAKE-224 Hashing Benchmarks
The following results demonstrate the performance of hashing various amounts of
data for both small and larger inputs with the Sum224
method.
Operation | Pure Go | SSE2 | SSE4.1 | AVX |
---|---|---|---|---|
Sum224 (32b) |
171MB/s ± 1% | 188MB/s ± 1% | 232MB/s ± 1% | 234MB/s ± 1% |
Sum224 (64b) |
187MB/s ± 2% | 209MB/s ± 1% | 269MB/s ± 1% | 271MB/s ± 1% |
Sum224 (1KiB) |
378MB/s ± 1% | 423MB/s ± 1% | 539MB/s ± 1% | 536MB/s ± 1% |
Sum224 (8KiB) |
404MB/s ± 1% | 447MB/s ± 1% | 577MB/s ± 1% | 577MB/s ± 0% |
Sum224 (16KiB) |
401MB/s ± 1% | 453MB/s ± 0% | 577MB/s ± 0% | 577MB/s ± 0% |
Operation | Pure Go | SSE2 | SSE4.1 | AVX | Allocs / Op |
---|---|---|---|---|---|
Sum224 (32b) |
187ns ± 1% | 170ns ± 1% | 138ns ± 1% | 137ns ± 1% | 0 |
Sum224 (64b) |
342ns ± 2% | 306ns ± 1% | 238ns ± 1% | 236ns ± 1% | 0 |
Sum224 (1KiB) |
2.71µs ± 1% | 2.42µs ± 1% | 1.90µs ± 1% | 1.91µs ± 1% | 0 |
Sum224 (8KiB) |
20.3µs ± 1% | 18.3µs ± 1% | 14.2µs ± 1% | 14.2µs ± 0% | 0 |
Sum224 (16KiB) |
40.9µs ± 1% | 36.2µs ± 0% | 28.4µs ± 0% | 28.4µs ± 0% | 0 |
State Serialization Benchmarks
The following results demonstrate the performance of serializing the
intermediate state for both BLAKE-224 and BLAKE-256 using the zero-alloc
SaveState
method versus the standard library encoding.MarshalBinary
interface.
Metric | MarshalBinary |
SaveState |
Delta |
---|---|---|---|
Time / Op | 40.6ns ± 1% | 16.0ns ± 0% | -60.60% (p=0.000 n=10+10) |
Allocs / Op | 1 | 0 | -100.00% (p=0.000 n=10+10) |
Disabling Assembler Optimizations
The purego
build tag may be used to disable all assembly code.
Additionally, when built normally without the purego
build tag, the assembly
optimizations for each of the supported vector extensions can individually be
disabled at runtime by setting the following environment variables to 1
.
BLAKE256_DISABLE_AVX=1
: Disable Advanced Vector Extensions (AVX) optimizationsBLAKE256_DISABLE_SSE41=1
: Disable Streaming SIMD Extensions 4.1 (SSE4.1) optimizationsBLAKE256_DISABLE_SSE2=1
: Disable Streaming SIMD Extensions 2 (SSE2) optimizations
The package will automatically use the fastest available extensions that are not disabled.
Examples
- Basic Usage
Demonstrates the simplest method of hashing an existing serialized data buffer with BLAKE-256. - Rolling Hasher Usage
Demonstrates creating a rolling BLAKE-256 hasher, writing various data types to it, computing the hash, writing more data, and finally computing the cumulative hash. - Same Process Save and Restore
Demonstrates creating a rolling BLAKE-256 hasher, writing some data to it, making a copy of the intermediate state, restoring the intermediate state in multiple goroutines, writing more data to each of those restored copies, and computing the final hashes.
Installation and Updating
This package is part of the github.com/decred/dcrd/crypto/blake256
module.
Use the standard go tooling for working with modules to incorporate it.
License
Package blake256 is licensed under the copyfree ISC License.