A mnemonic is an encoding that converts a randomly generated bit-stream into a sequence of words, which is much easier for humans to remember.

Here’s an example of a randomly generated mnemonic sequence (abbreviated as MS) (‼️Never use this on mainnets‼️):

churn hire action decline avocado banner host kangaroo chat coffee thumb muscle brass ladder urban reform window jar poet letter artwork fiscal pony present

A private key generated by an algorithm through random entropy sampling is a bit-stream that looks like 01010101...101010. When generating a private key, users can specify its length, typically 128, 256, or even 512 bits.

Under the same conditions, the longer the private key, the harder it is to crack by brute force, because each additional bit doubles the number of candidates an attacker has to try. Currently, mainstream wallets use a default private key length of 256 bits.

It’s almost impossible for humans to memorize a 256-bit stream for long periods, even with specific encoding formats like Base58.

Therefore, mnemonics encode the bit-stream as a word sequence that is easier to memorize.

Below, I will introduce the mnemonic generation process.

When generating private keys, the key generation algorithm requires some source of randomness from the external world as input to ensure the randomness of the private key. The parameter that describes the strength of this randomness is called entropy (abbreviated as ENT). The larger the entropy, the stronger the randomness. In this article, entropy is the user-specified private key length.

Assuming the user specifies entropy as ENT, the key generation algorithm will generate a random bit-stream of length ENT, called the private key (SecretKey, abbreviated as SK). The checksum (abbreviated as CS) of the private key is defined as the first ENT/32 bits of the SHA256 hash of the private key.
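The two steps above can be sketched in a few lines of Python. This is only a minimal illustration under my own assumptions: the function names `generate_entropy` and `checksum_bits` are made up for this article, the randomness comes from Python's standard `secrets` module, and ENT is assumed to be a multiple of 32 so that ENT/32 is a whole number.

```python
import hashlib
import secrets


def generate_entropy(ent_bits: int) -> bytes:
    """Return ent_bits of cryptographically secure randomness (the SK)."""
    if ent_bits % 32 != 0:
        raise ValueError("ENT must be a multiple of 32 so that ENT/32 is whole")
    return secrets.token_bytes(ent_bits // 8)


def checksum_bits(sk: bytes) -> str:
    """Return the first ENT/32 bits of SHA256(sk) as a string of '0'/'1'."""
    ent_bits = len(sk) * 8
    cs_len = ent_bits // 32
    digest = hashlib.sha256(sk).digest()
    digest_bits = "".join(f"{byte:08b}" for byte in digest)
    return digest_bits[:cs_len]


sk = generate_entropy(256)   # random every run, never print it for a real wallet
cs = checksum_bits(sk)
print(len(sk) * 8, len(cs))  # 256 8
```

The checksum depends only on the private key, so anyone holding the key can recompute it and detect a mistyped or corrupted mnemonic.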

Concatenating the private key and the checksum yields a longer bit-stream of ENT + CS bits, where CS here denotes the checksum length (ENT / 32 bits).

A word list is prepared, containing 2048 (2^11) common words, with each word assigned an 11-bit index (Index: 00000000000 - 11111111111).

The bit-stream is divided into chunks of 11 bits each, and for each chunk’s index value, the corresponding word is found from the common word list.

The words are joined in order, separated by spaces, to form the mnemonic sequence.
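To make the chunking concrete, here is a rough sketch of the whole encoding path, from private key to words. The names `mnemonic_indices` and `to_mnemonic` are hypothetical, and the word list below is a placeholder (`word0000` to `word2047`) that I made up so the example runs on its own. A real wallet would load its actual 2048-word list instead.

```python
import hashlib
from typing import List, Sequence


def mnemonic_indices(sk: bytes) -> List[int]:
    """Append the checksum to sk and split the result into 11-bit indices."""
    ent_bits = len(sk) * 8
    cs_len = ent_bits // 32
    sk_bits = "".join(f"{b:08b}" for b in sk)
    digest_bits = "".join(f"{b:08b}" for b in hashlib.sha256(sk).digest())
    stream = sk_bits + digest_bits[:cs_len]          # ENT + CS bits
    return [int(stream[i:i + 11], 2) for i in range(0, len(stream), 11)]


def to_mnemonic(indices: Sequence[int], wordlist: Sequence[str]) -> str:
    """Look up each index in the word list and join the words with spaces."""
    if len(wordlist) != 2048:
        raise ValueError("the word list must contain exactly 2048 words")
    return " ".join(wordlist[i] for i in indices)


# Placeholder word list, for illustration only (not a real wallet word list).
placeholder_words = [f"word{i:04d}" for i in range(2048)]

indices = mnemonic_indices(bytes(16))   # 128-bit all-zero key, demo only
print(indices)                          # eleven 0s followed by 3
print(to_mnemonic(indices, placeholder_words))
```

The all-zero key is used only because its result is easy to check by hand: the first eleven chunks are all zeros, and the last chunk is seven zero bits followed by the 4-bit checksum.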

At this point, we have obtained the mnemonic that is easy for humans to remember.

Finally, here’s a summary of the relevant formulas:

CS = ENT / 32          (checksum length, in bits)
MS = (ENT + CS) / 11   (mnemonic length, in words)

|  ENT  | CS | ENT+CS |  MS  |
+-------+----+--------+------+
|  128  |  4 |   132  |  12  |
|  160  |  5 |   165  |  15  |
|  192  |  6 |   198  |  18  |
|  224  |  7 |   231  |  21  |
|  256  |  8 |   264  |  24  |
|  512  | 16 |   528  |  48  |
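
The table can be reproduced with a short script. Since ENT + ENT/32 = 33 × ENT/32, the total is always divisible by 11 whenever ENT is a multiple of 32, so the word count is simply 3 × ENT/32. The function name `mnemonic_sizes` is my own, chosen for this sketch.

```python
from typing import Tuple


def mnemonic_sizes(ent: int) -> Tuple[int, int, int]:
    """Return (CS, ENT + CS, MS) for a given entropy length in bits."""
    if ent % 32 != 0:
        raise ValueError("ENT must be a multiple of 32")
    cs = ent // 32
    total = ent + cs
    return cs, total, total // 11


for ent in (128, 160, 192, 224, 256, 512):
    cs, total, ms = mnemonic_sizes(ent)
    print(f"ENT={ent:3d}  CS={cs:2d}  ENT+CS={total:3d}  MS={ms:2d}")
```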

Note: In practice, the key behind the mnemonic is not used directly as the account/address private key; it is called the Master Key. A derivation rule then derives a series of child private keys from it, and those serve as the actual on-chain IDs.
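
As one illustration of the first step of such a rule, here is a sketch that assumes the wallet follows the BIP-39 standard (this article does not name a standard, so treat that as my assumption). Under BIP-39, the mnemonic sentence is stretched into a 64-byte seed with PBKDF2-HMAC-SHA512, using the string "mnemonic" plus an optional passphrase as the salt and 2048 iterations. The function name `mnemonic_to_seed` is hypothetical, and deriving the child keys from the seed is a separate step not shown here.

```python
import hashlib
import unicodedata


def mnemonic_to_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Stretch a mnemonic sentence into a 64-byte seed (BIP-39 style, assumed)."""
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


# Demo sentence only: never reuse a published mnemonic on a mainnet.
demo = "churn hire action decline avocado banner host kangaroo chat coffee thumb muscle"
seed = mnemonic_to_seed(demo)
print(len(seed))  # 64
```

Note that this function hashes whatever string it is given. It does not check the words against the word list or verify the checksum, which a real wallet would do first.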

Here is the word list printed on a standard A4 sheet, a single page that contains every wealth secret on every blockchain. Yet you still cannot crack any of them - this is the power of cryptography.

Some useful resources: