HD Wallets – BIPs and Terminology


The advent of HD wallets has made key management a far easier task for cryptocurrency users. These “Hierarchical Deterministic” wallets can generate a practically unlimited number of private keys and addresses from a single seed, eliminating the need for the periodic backups required by the old-style, nondeterministic wallets.

Despite the ease of backup, there are a few more moving parts inside an HD system than there are with traditional keypairs, so understanding how this works is a little more complex. Let’s get a high level overview of how HD wallets work and the associated terminology.

An HD Wallet Glossary

BIP, Boop, Bop – What’s With All the Proposals?

First, let’s discuss the BIPs involved with the most common HD wallet technologies. BIP stands for “Bitcoin Improvement Proposal” – this is a standard for proposing new additions to the Bitcoin protocol and software that is community driven.

BIP32 – Titled “Hierarchical Deterministic Wallets”, this BIP defines the basic specification for a protocol that generates addresses deterministically from a single seed, rather than randomly. In this BIP, Pieter Wuille describes algorithms for taking a cryptographic seed and generating a tree of keys and addresses that can be reproduced again from that same seed.

BIP39 – Titled “Mnemonic code for generating deterministic keys”, this BIP describes a scheme for encoding the cryptographic seed as English words, making backups far easier and safer for end users. This is where those mnemonic seed phrases you may be used to seeing originated! So instead of having to back up a string of hexadecimal or base58 data, we can write down 12-24 words to back up an entire wallet!

BIP44 – Titled “Multi-Account Hierarchy for Deterministic Wallets”, this BIP builds on top of the work done in BIP32 to make multi-currency, multi-account wallets possible. BIP44 makes implementing multi-currency wallets like Coinomi far easier, as it makes the handling of various cryptocurrencies standard, rather than up to the wallet developer to figure out.

Extended Keys (xprv, xpub) and Root vs. Other Keys

One of my YouTube viewers asked the question: “What’s the difference between a BIP32 root key and BIP32 extended private key when they both use xprv as the prefix?”

Well, let’s remember that HD wallets generate keypairs in a deterministic, or predictable manner. It turns out that this structure takes the form of a tree of keys – meaning there is a tree root and many branches generated from that root.

The Root Key is generated directly from the wallet seed. This is the key that is generated after taking the user’s mnemonic seed phrase, converting it to a binary seed (BIP39 uses the PBKDF2 key-stretching function for this step), and running that seed through a keyed hashing algorithm called HMAC-SHA512. The root key is the very top of the tree of keys, and all other keys are derived from it.

The Extended Private Key is one of the many branches on the tree of keypairs generated from the root. Its relationship to the root is that of child to parent – the root is the ultimate origin of all the child keys in the tree.

Now what does the “extended” mean? As part of the BIP32 specification, it turns out there are actually two pieces of data used to generate the actual Bitcoin private key and address. There are 256 bits of information that serve as the private key, and 256 bits of information called the chain code. Without the chain code, it’s impossible to find any “siblings” on the tree, so the keypairs in the wallet appear to be random when they’re actually not. This enhances security and privacy.

Together, the key and the chain code form a 512 bit piece of data called the extended key. This format applies to both private and public keys (xprv and xpub).
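To make the seed-to-root-key step concrete, here is a minimal sketch using only Python's standard library. It assumes the parameters published in BIP39 (PBKDF2 with HMAC-SHA512, 2048 rounds, a salt beginning with "mnemonic") and BIP32 (HMAC-SHA512 keyed with the string "Bitcoin seed"). The function names are made up for illustration, the mnemonic checksum is not validated, and the rare invalid-key case that BIP32 describes is skipped, so treat it as a learning aid rather than wallet code.

```python
import hashlib
import hmac
import unicodedata
from typing import Tuple


def mnemonic_to_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """BIP39 step: stretch the mnemonic sentence into a 64-byte binary seed."""
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048)


def master_extended_key(seed: bytes) -> Tuple[bytes, bytes]:
    """BIP32 step: split one HMAC-SHA512 output into (private key, chain code)."""
    digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
    # Left 256 bits: the root private key. Right 256 bits: the chain code.
    # (This sketch skips the BIP32 check for an out-of-range key.)
    return digest[:32], digest[32:]


if __name__ == "__main__":
    # A widely published test phrase; never store real funds behind it.
    phrase = " ".join(["abandon"] * 11 + ["about"])
    key, chain_code = master_extended_key(mnemonic_to_seed(phrase))
    print("private key:", key.hex())
    print("chain code: ", chain_code.hex())
```

The two 32-byte halves returned here are exactly the "key plus chain code" pair that makes a key extended.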

BIPs, Extended Keys, and HD Goodness

HD wallets are an interesting and complex topic, and their existence makes securing and backing up wallets much easier for users. This concept involves several incremental improvement proposals (BIPs) and the introduction of some new cryptography concepts – not just private and public keys, but chain codes and the combined format of extended keys.

It becomes a bit easier to understand this concept at a high level when understanding the terminology behind it, so I hope this glossary helps clarify some of the mystery.

Comparing Major Mining Algorithms


As it stands, all of the top cryptocurrencies (Bitcoin Cash, Ethereum, Litecoin, and Bitcoin) use proof-of-work mining to secure their networks. With proof-of-work, special nodes on the network called miners use their computing power to try and solve a mathematical problem. This problem is designed so that a miner has to do a bunch of guessing to get the answer, but anyone else can verify that answer very quickly.

The general idea across proof-of-work variations is the same: miners have to guess a bunch to get an answer, essentially proving that they’ve done some amount of work. And the collective amount of computing power on the network makes pulling off fraud impossibly hard. However, there are some different variations of these proof-of-work algorithms used in different cryptocurrencies. Let’s take a high-level look at these variations and how they achieve the same goal in different ways.

Mining Algorithm Variations

Bitcoin Cash/Bitcoin Mining with SHA-256

The original mining algorithm used in a cryptocurrency is the fairly straightforward SHA-256 used by Bitcoin. This mining algorithm solves a simple problem: given some block data, add a random number called a “nonce”, and run that through the SHA-256 hashing algorithm (Bitcoin actually applies it twice in a row). This one-way cryptographic hash outputs a very large number (256 bits if you’re curious), and that number has to be less than a difficulty target number for the problem to be “solved” with that nonce. With a simple toy algorithm (8 bits) – a solution might look something like this:

0 0 1 0 0 0 0 0 - Difficulty target value

1 0 0 1 1 1 1 1 - Guess #1 is not valid - greater than target
0 0 0 0 1 0 1 0 - Guess N is a valid hash - less than target

This algorithm is a clean and simple one. Guess a number, hash the data, and hope the resulting block hash is less than the difficulty target.
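The guess-hash-compare loop can be sketched in a few lines of Python. This is a simplification under stated assumptions: real Bitcoin mining hashes an 80-byte block header twice with SHA-256, while this sketch hashes arbitrary bytes once, and the `mine` function and the way the nonce is appended are illustrative choices only.

```python
import hashlib


def mine(block_data: bytes, target: int, max_nonce: int = 2**32):
    """Try nonces in order until the hash, read as a number, is below target."""
    for nonce in range(max_nonce):
        candidate = block_data + nonce.to_bytes(4, "little")
        digest = hashlib.sha256(candidate).digest()
        value = int.from_bytes(digest, "big")
        if value < target:
            return nonce, value
    return None  # No valid nonce in the search space


if __name__ == "__main__":
    # A target of 2**244 means the top 12 bits of the hash must be zero,
    # so roughly 1 in 4096 guesses succeeds.
    easy_target = 1 << 244
    result = mine(b"toy block", easy_target)
    if result is not None:
        nonce, value = result
        print("nonce:", nonce)
        print("hash: ", format(value, "064x"))
```

Lowering the target makes valid hashes rarer, which is how the network raises the difficulty. Checking someone else's answer takes a single hash.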

However, a disadvantage of this algorithm is in the equipment needed to contribute to mining on the network. SHA-256 mining is a hard computing problem, but that’s all it’s limited by. As the Bitcoin network has adjusted the difficulty target over time, profitable mining has become limited to specialized computing devices called ASICs – Application Specific Integrated Circuits. It’s not profitable or feasible for a single user like you or me to mine Bitcoin on a single device like a PC anymore – it’s the world of specialized companies and mining pools. This can be considered a problem of centralization, as fewer everyday users can participate in this part of securing the network.

Litecoin Mining with Scrypt

One of the first major forks of the Bitcoin codebase resulted in the popular currency Litecoin, which made changes to the mining algorithm in an attempt to solve this problem of a high barrier of entry for mining. Litecoin uses an alternative hashing algorithm called Scrypt in place of SHA-256. Scrypt is actually considered a key-derivation function rather than a pure hashing function, although the end goal is roughly the same: a one-way function that takes some data and outputs some bits that are the same every time for a particular input.

The difference with a key-derivation function or specialized hashing algorithm like this is that they’re designed to be more computationally difficult than algorithms like SHA-256. Scrypt is memory hard, meaning that the algorithm is more limited by the available memory in the system than by the computing power.
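For a feel of how Scrypt's memory cost is tuned, here is a small sketch using `hashlib.scrypt` from Python's standard library (available when Python is built against OpenSSL 1.1 or newer). The parameters N=1024, r=1, p=1 with the header doubling as the salt are the ones commonly cited for Litecoin; the header bytes below are placeholders, not a real block header.

```python
import hashlib

# Commonly cited Litecoin proof-of-work parameters (an assumption of this sketch)
SCRYPT_N = 1024  # CPU/memory cost
SCRYPT_R = 1     # block size
SCRYPT_P = 1     # parallelization


def scrypt_pow_hash(header: bytes) -> bytes:
    """Return a 32-byte Scrypt digest, using the header as its own salt."""
    return hashlib.scrypt(
        header, salt=header, n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32
    )


if __name__ == "__main__":
    placeholder_header = bytes(80)  # 80 zero bytes stand in for a block header
    print(scrypt_pow_hash(placeholder_header).hex())
    # Scrypt's main working buffer takes about 128 * N * r bytes
    print("approx. memory per hash:", 128 * SCRYPT_N * SCRYPT_R, "bytes")
```

Raising N increases the memory needed per guess, which is the lever that makes the function memory hard.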

For key derivation, this is great because it’s hard to do brute force attacks on a database of keys – in other words, it’s hard to guess what the original password was. For our mining algorithm, it’s great because ASICs don’t really give miners an advantage. This makes mining easier for folks that only have access to devices like GPUs, and prevents some of the mining centralization and barrier to entry that’s seen with SHA-256 mining.

Ethereum Mining with Ethash

Ethereum mining follows a similar model to Litecoin – it was designed to prevent mining centralization. However, Ethereum goes further than simply using a memory-hard key derivation function or something of that nature. Ethereum uses its own memory-hard algorithm for mining called Ethash, custom designed by its creators.

Ethash is based on an algorithm called Dagger-Hashimoto used to make mining a memory-hard problem. Every N blocks or so, a large dataset is generated using the block data as a “seed”. The Dagger part of the algorithm was designed by Ethereum’s creator Vitalik Buterin to make mining memory hard, but make verifying the answer relatively easy for non-mining nodes on the network. The Hashimoto part was designed by Thaddeus Dryja to make a memory-hard hashing problem. Combining these concepts into Ethash makes a mining algorithm that’s less prone to requiring specialized hardware over time than SHA-256 mining.

Mining Variations – Same Idea, Different Requirements

The overall problem in proof-of-work mining is the same across currencies and algorithm variations – a mining node must expend resources to guess a bunch and find an answer to the problem. However, there are a variety of ways in which the problem those miners solve can be constructed.

Some variations like SHA-256 are simple, but prone to centralization and specialized hardware requirements over time. Others like Litecoin and Ethereum take a different approach, desiring to make mining a more equitable process across the network at the expense of some complexity.

Regardless of the approach, proof-of-work mining allows lots of individuals to come together and create a peer-to-peer network of money, without the need to trust any one central “clearing house” to process transactions. Proof-of-work mining makes pulling off fraud a difficult or impossible endeavor, so that these currencies remain globally decentralized and secure.

Bitcoin is Not (Just) for Rich People


At a recent event at Duquesne University, a student asked me to explain what Bitcoin is. And as an addendum to her question, she stated that her impression is that Bitcoin is only for rich people…

As an educator in the space and a strong believer in the power of cryptocurrencies to change the fabric of finance, I can’t stand that Bitcoin and its culture have devolved into memes about hodling (holding) and buying Lamborghinis when the price goes up. Let’s discuss why I think the true value of this technology goes so far beyond price speculation.

Getting Past “Number Go Up”

Is it possibly good for the community and adoption when the price of cryptocurrencies goes up, relative to the United States Dollar? Probably. After all, a medium of exchange must have some value for it to be worth accepting for goods and services. However, the value must be fundamentally derived from the currency’s utility! Bitcoin Cash, Litecoin, Ethereum, et al. are first and foremost cryptocurrencies. Bitcoin should be, but its loudest proponents in the last several years have argued its only major use is as a store of value, so I’ll leave that up to you to decide.

But why is this distinction so important to me, and to many others working in the space? I so strongly believe in Bitcoin as peer to peer cash because its unique properties enable everyone in society to have more sovereignty over their finances.

The cryptocurrencies I study and teach about are decentralized. No central banks or corporate institutions control these forms of money, meaning there is no central point of failure. It’s peer to peer money – just the sender and receiver, no corporation in the middle deciding which transactions are valid and allowed to proceed. This leads to the property of censorship resistance, meaning that no one can arbitrarily stop you from transacting with another party. And of course, these currencies are truly global and borderless – no silly geopolitics here. Send money to anyone, anywhere.

Peer to Peer Cash for Everyone

These properties are pretty interesting for those of us in the modern world, yes. We can all benefit from increased security and economic freedom, undoubtedly. However, our judgement on the true value of these properties is often clouded by our wealth and access to privileged banking.

Imagine for a second that you’re a migrant worker, thousands of miles from your family. Your hard work is your family’s lifeline, but Western Union will steal 30% in fees to send your money back home. If you’re allowed to send money back to where home is. But with Bitcoin Cash or Litecoin, you’ll pay less than a penny in fees and your funds will arrive in a near instant. That is powerful.

Imagine you’re a dissident or journalist whose bank accounts are frozen in an attempt to shut you up. You can’t use your debit card anymore, but you can use Bitcoin Cash, Ethereum, and other currencies anywhere, any time. It costs you less in fees than banking does, and donations to your cause can never be censored.

Imagine you live in a part of the world where the nearest bank branch is 100 or more miles away, but you have access to a cell phone with a data connection. This happens in many parts of the world. The student I spoke with is a native of Kenya, where M-Pesa is a completely digital version of cash in common use. Cryptocurrencies can easily become a totally free medium of exchange in those places, allowing the unbanked to hold a secure bank account in their pocket.

We, the rich and privileged, can benefit from crypto adoption. But we cannot allow the properties that make these currencies so powerful to be eroded, for those that truly need them.

Common Address Encoding Formats


When sending money to someone else using Bitcoin, Bitcoin Cash, Ethereum, or another cryptocurrency, you send funds to the other user’s address. This unique identifier for the other user’s wallet may look like a “random” string of letters and numbers, like this: 13GuDW2Km8TR6iCYP8E5QGhNky2ne7T17r. (Note: this is just a random address; don’t use it!). But there’s actually quite a bit more going on behind the scenes when it comes to address encoding.

Different cryptocurrencies use different schemes to turn raw address data into what you would actually copy and paste, type, or scan into a wallet for sending. In some cases, one cryptocurrency may support multiple encoding formats. Let’s take a look at some of the most common formats used by major blockchains.

Common Encoding Formats

The OG Encoding format: base58check

The first major encoding format to appear is used most commonly in Bitcoin and Litecoin, the first major derivative of Bitcoin. This encoding format is known as base58check, and addresses look like this:


Base58check encoded addresses are generally derived using the same process (at least in the case of Bitcoin & Litecoin). First, the raw address is derived from the public key using a two-step cryptographic hash – first SHA-256, and then RIPEMD160. This gives us a 160 bit (20 byte) “pay to public key hash” address.

To encode the address, a version byte is added to the front of the raw hash. For Bitcoin, this version byte (in hexadecimal format) is 0x00, and Litecoin uses 0x30. Next, a checksum is generated, in order to help with error detection. The address including the version byte is hashed using SHA-256 twice. The first 32 bits (4 bytes) of the resulting hash are added to the end of the versioned address data.

Finally, the raw data is converted to base58. Base58 is a number system, just like what we’re used to with base 10. But instead of digits 0-9, base58 uses this alphabet: 123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz . It’s overall quite similar to the popular base64 encoding, but omits certain characters that are difficult to distinguish when writing or reading. There’s no 0, uppercase O, lowercase l, uppercase I, or non alphanumeric characters.
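The version byte, checksum, and base58 steps can be sketched in Python as follows. The function name is made up for this illustration, and the 20-byte hash used in the example is all zeros rather than a hash of a real public key.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check_encode(version: int, payload: bytes) -> str:
    """Prepend a version byte, append a 4-byte checksum, convert to base58."""
    data = bytes([version]) + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum

    # Treat the bytes as one big number and repeatedly divide by 58
    number = int.from_bytes(raw, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = BASE58_ALPHABET[remainder] + encoded

    # Each leading zero byte is written as the first alphabet character, "1"
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


if __name__ == "__main__":
    placeholder_hash = bytes(20)  # stand-in for a real public key hash
    print("Bitcoin :", base58check_encode(0x00, placeholder_hash))
    print("Litecoin:", base58check_encode(0x30, placeholder_hash))
```

The leading-zero rule is why Bitcoin addresses with version byte 0x00 begin with a 1, while the 0x30 version byte pushes Litecoin addresses to begin with an L.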

Base32-based: CashAddr and bech32

Over the last few years, base32 based encoding schemes have become more popular to deal with some of the issues base58check addresses are known to have. Base58check addresses are shorter, but the mixing of uppercase and lowercase letters can make critical address data ambiguous and hard to read and write. Base32-based schemes solve this problem by only using non-ambiguous lowercase letters and numbers.
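As a rough illustration of what base32 means here, the sketch below regroups bytes into 5-bit chunks and maps each chunk to one of 32 characters. The character set is the one shared by the bech32 and CashAddr specifications; the function name is hypothetical, and the output has no prefix and no checksum, so it is not a valid address on any network.

```python
BASE32_CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"


def to_base32_chars(data: bytes) -> str:
    """Regroup 8-bit bytes into 5-bit values and map each to a character."""
    acc = 0   # bit accumulator
    bits = 0  # number of pending bits in the accumulator
    out = []
    for byte in data:
        acc = (acc << 8) | byte
        bits += 8
        while bits >= 5:
            bits -= 5
            out.append(BASE32_CHARSET[(acc >> bits) & 0x1F])
    if bits:
        # Pad the final partial group with zero bits on the right
        out.append(BASE32_CHARSET[(acc << (5 - bits)) & 0x1F])
    return "".join(out)


if __name__ == "__main__":
    # A version byte plus a 20-byte hash is 21 bytes (168 bits) of payload
    payload = bytes(21)
    text = to_base32_chars(payload)
    print(text, len(text))
```

Every character carries 5 bits instead of base58's slightly higher density, which is why these addresses come out longer.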

The first major currency to commonly use a base32-based system is Bitcoin Cash. BCH uses a special system called CashAddr for distinguishing BTC and BCH addresses. CashAddr, like base58check, prepends a version byte to the public key hash. Version bytes for CashAddr can vary, as defined by the specification. The checksum uses a special algorithm to generate an error-detecting code called a BCH (Bose–Chaudhuri–Hocquenghem) code. The BCH code is the last 40 bits (5 bytes) of the final address. The address often includes a prefix to indicate the blockchain it’s used on as well, such as bitcoincash. A CashAddr address looks like this:


Bitcoin (BTC) has introduced a similar system for segwit addresses only, called bech32. All of these addresses, when encoded, start with bc for mainnet addresses, and also end with a BCH code for error detection.


The final encoding type we’ll discuss is the one used most commonly in Ethereum: hexadecimal. Hexadecimal encoding (often referred to as “hex”) is used very frequently in computer science outside of cryptocurrency. The hex number system is base16 and uses numbers and a few letters for its alphabet: 0123456789abcdef.

Ethereum address derivation and encoding is quite simple compared to other common cryptocurrencies. There’s no version byte, and no checksum (although a checksumming system has been introduced more recently, it’s not covered here). The last 20 bytes of the public key hash are simply encoded as hex, and 0x is added to the front. This is a common indicator for hex format outside of cryptocurrency. An Ethereum address looks like this:


The advantage to base16 is simplicity. There are only 16 characters, all very easy to distinguish from one another! That makes it a more hardy format for reading and writing. However, it ends up creating rather lengthy addresses – for every 1 byte of data, you end up with two characters in hex.
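The two-characters-per-byte cost is easy to see in a short Python sketch. The 20 bytes below are placeholder values, not a real address, and the function name is invented for the example.

```python
def to_eth_style_hex(raw_address: bytes) -> str:
    """Hex-encode raw address bytes and add the 0x prefix."""
    return "0x" + raw_address.hex()


if __name__ == "__main__":
    placeholder = bytes(range(20))  # 20 placeholder bytes, not a real address
    encoded = to_eth_style_hex(placeholder)
    print(encoded)
    # 20 bytes * 2 hex characters per byte + 2 prefix characters
    print(len(encoded), "characters")
```

Twenty bytes become 40 hex characters plus the prefix, noticeably longer than the same data in base58.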

Encoding – Because People Aren’t Computers

These encoding systems all exist for one simple reason – us meatbags aren’t particularly adept at reading and writing raw binary data. So instead of long chains of binary data, we send each other encoded addresses. Each of these common formats has its pros and cons when it comes to length, ease of use, and error detection. But in general, all of them are designed to make it easier for us to transact with these currencies without having to deal with raw binary data – thank goodness!

Hex Encoding, Version Prefixes, and Keccak (uBitAddr Code Companion Update)


In a previous tutorial, I shared the first published version of my uBitAddr project. This software and hardware project is a totally open source, open hardware, DIY offline address generator. It allows the user to generate a private key and the associated address for long-term storage, in something like a paper or metal wallet.

Since developing the original version, I’ve added several new features to the project and faced some interesting challenges along the way. Let’s take a look at some of the new features and complications I had to tackle.

Expanding Currency Support to my “Big 4”

Why add new currencies?

Personally, I am not a maximalist when it comes to Bitcoin or any other cryptocurrency. There are several I find interesting to study and use, and for my own personal preference those currencies are Bitcoin Cash, Ethereum, Litecoin, and Bitcoin.

The original version supported standard Bitcoin addresses (base58check encoded, no segwit support) and by extension supported Bitcoin Cash. However, BCH has long moved to a different address format, and Litecoin and Ethereum require an address derivation scheme that’s a bit different. I decided to take this project to its full potential (for my interests, at least) and add module and API support for all those currencies and formats.

Generating an Ethereum keypair totally offline!

Adding BCH cashaddr support

Thankfully, adding Bitcoin Cash support was fairly straightforward. The Trezor crypto libraries I use for cryptographic primitives/encoding already support cashaddr. I created a separate “address_from_pubkey” function in the module __init__.c code that derives the address from the pubkey and uses Trezor’s cashaddr encoding:

// Add the version specifier
unsigned char raw_address_nocheck[RAW_ADDRESS_NOCHECK_LENGTH];
raw_address_nocheck[0] = CASHADDR_P2PKH_BITS | CASHADDR_RIPEMD160_BITS;
memcpy(raw_address_nocheck + 1, round_2, RIPEMD160_DIGEST_LENGTH);

// Cashaddr  encode
cash_addr_encode((char*) address, "bitcoincash", raw_address_nocheck, RAW_ADDRESS_NOCHECK_LENGTH);

There’s a slightly different version specifier that has to be used with cashaddr, and that’s really it. Trezor’s cash_addr_encode takes care of the special cashaddr checksum, so there was no need to compute that manually like I do in the base58check code.

For the API, I simply added an optional flag in the constructor to allow cashaddr encoding instead of the legacy base58check:

uba = uBitAddr(output=uBitAddr.OUTPUT_DISPLAY, bch=True)

Litecoin Support

Litecoin support was, again, fairly straightforward. The address derivation process is largely the same, but LTC uses different version specifiers for the WIF encoded key and the base58check address. Instead of using unsigned char BTC_ADDR_PREFIX = 0x0; for the address, LTC uses unsigned char LTC_ADDR_PREFIX = 0x30;. Likewise, the private key prefix is slightly different: unsigned char LTC_WIF_PREFIX = 0xB0; vs. Bitcoin’s unsigned char BTC_WIF_PREFIX = 0x80;. Otherwise, the derivation process is the same.

On the API side, at this point I added the ability to specify the desired currency as an optional argument when constructing the uBitAddr object:

uba = uBitAddr(output=uBitAddr.OUTPUT_DISPLAY, currency=uBitAddr.LTC)

Ethereum Support

Adding support for Ethereum was the most challenging. First, Ethereum requires a different hashing algorithm for deriving the address from the public key. It uses the Keccak version of SHA3, which outputs a 256 bit hash. However, Keccak is different than the final version of SHA3 accepted as the NIST standard. Fortunately, once again, Trezor has us covered. Their hardware wallet supports Ethereum and therefore has Keccak primitives in the crypto code. Another difference is that the address derivation scheme removes the 04 byte from the front of the pubkey (the byte that indicates the key is uncompressed) before the single round of Keccak hashing:

	// First, hash the public key without the 04 uncompressed pubkey indicator byte at the front
	unsigned char round_1[SHA3_256_DIGEST_LENGTH];

	keccak_256(pubkey + 1, ETH_PUBKEY_LENGTH, round_1);

	// The raw address is the last 20 bytes of the 32 byte Keccak hash,
	// so skip the first 12 bytes
	unsigned char raw_address[RAW_ETH_ADDRESS_LENGTH];
	memcpy(raw_address, round_1 + 12, RAW_ETH_ADDRESS_LENGTH);

Next, ETH addresses use hex encoding rather than base58 or cashaddr. This would seem to be the simplest form of encoding, but once again the world of microcontroller programming threw me for a loop! Typically, one can format output data as hex in C using sprintf with the "%02x" format specifier. However, I was not able to compile the CircuitPython module with that function call included.

So, I searched StackOverflow for solutions. Most of the sample code didn’t make immediate sense, but a user mentioned “using bit masking and shifting” and I was able to work out the solution on the whiteboard. I added a function that does hex encoding by masking off 4 bits at a time (a “nibble”), and for the leftmost nibble, using a 4 bit right shift. This gives a number that can be used as an index on the set of hex characters, hence giving us the two characters we need for encoding a single byte as hex.

// Convert a byte to hex format and write directly to the buffer
// This is a substitute for sprintf on a microcontroller platform
// The buffer must have room for two characters; no terminator is written
void byte_to_hex(unsigned char byte, unsigned char* buffer) {
	char hex_chars[16] = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f' };
	unsigned char left_mask = 0xF0;
	unsigned char right_mask = 0xF;

	// First, calculate the character for the first nibble (4 bits)
	// Mask off the last 4 bits, then right shift 4 bits
	// This will be the index used to get the right character from hex_chars
	unsigned char left_index = byte & left_mask;
	left_index = left_index >> 4;
	*buffer = hex_chars[left_index];

	// Then, mask off the first 4 bits to get the index for the second nibble
	unsigned char right_index = byte & right_mask;
	*(buffer + 1) = hex_chars[right_index];
}

Whiteboarding up a homebrew hex encoding function
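The masking-and-shifting idea is easy to check on a desktop with a few lines of Python. This is only a sketch for verifying the logic against Python's built-in hex formatting; it is not part of the uBitAddr module.

```python
HEX_CHARS = "0123456789abcdef"


def byte_to_hex(byte: int) -> str:
    """Encode one byte (0-255) as two hex characters using nibble masks."""
    left_index = (byte & 0xF0) >> 4  # top nibble, shifted down to 0-15
    right_index = byte & 0x0F        # bottom nibble is already 0-15
    return HEX_CHARS[left_index] + HEX_CHARS[right_index]


if __name__ == "__main__":
    # Compare every possible byte against the built-in formatter
    mismatches = [b for b in range(256) if byte_to_hex(b) != format(b, "02x")]
    print("mismatches:", mismatches)
```

Running the comparison over all 256 byte values is a cheap way to gain confidence in a hand-rolled encoder before flashing it to a board.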

Homebrew Offline Address Generation, Achieved!

The end result of this work is, in my opinion, a pretty sweet little project. A Reddit user called this project “cypherpunk” and I have to agree – there’s nothing more nerdy and paranoid than writing your own code for address generation. This is a very fun, challenging, and constantly evolving project.

Some final thoughts are, of course, on security. This code is experimental, so play with it and use it at your own risk. I am a security-conscious engineer but I am NOT a cryptography expert. As well, there are pitfalls to dealing with raw keypairs compared to mnemonic seeds. As long as they’re generated with sufficient entropy they are secure, but there are pitfalls when it comes to writing down and storing the keys as well as sweeping the funds properly for future spending. Always do your research and ask questions before dealing with serious money!

Beginning Bitcoin – Sending Funds From Your Wallet


Getting started with Bitcoin isn’t as intimidating as it would seem. In another tutorial, we discussed how to get started with a mobile wallet, using Bitcoin.com’s app or the Coinomi wallet. We learned how to install the wallet, safely back up our funds using the seed phrase, and get an address for loading up funds.

But the real use of Bitcoin isn’t simply funding a wallet and watching the price bounce up and down. Cryptocurrencies are currencies after all, so they can be spent! Sending Bitcoin is fairly straightforward, but it’s a bit of a different mechanism than using a credit card online. Let’s learn how to send funds to someone else using the Bitcoin.com mobile wallet, and learn about what makes these transactions different than traditional online payments.

Creating Cryptocurrency Transactions

Sending funds with the Bitcoin.com wallet

Let’s donate a dollar to our friends at Eat BCH using the Bitcoin.com wallet. To do this, first open up the wallet with some funds in it, and click on the Send button. You’ll see this screen with options to enter a Bitcoin Cash address manually (or copy/paste), or to scan a QR code. I find that scanning a QR code is often the easiest way to get an address, and Eat BCH has one on the website.

It’s always a good idea to double check the address that actually ends up in the transaction preview. Make sure you didn’t accidentally copy/paste an incorrect address, or scan the wrong QR code. It happens, and there’s also malware out there that has fudged copy/pasted addresses.

Just take a second to make sure you have the destination you actually want. There are no do-overs with cryptocurrency transactions – once they are sent there is no reversal!

Once everything is verified, slide the Bitcoin logo at the bottom to send. Your transaction will be created, and is on its way.

Bitcoin vs. Traditional Online Payments

It’s simple to create Bitcoin, Ethereum, and other cryptocurrency transactions with a nice wallet user interface. But these transactions work a bit differently than traditional monetary transfers, and it’s good to understand how they work at a high level.

The major difference between a cryptocurrency transaction and a traditional credit card/debit card payment is the mechanism. Traditional credit card payments are pull transactions. You give a merchant your credit card information, which is private information you are entrusting to them. They use that number to request funds be pulled from your account, via the Visa/Mastercard/etc. network.

Bitcoin transactions are push transactions. Instead of you giving away private information, the merchant gives you their public address, and you push funds to it in a transaction. You’re in charge of the sending, and no private information is ever exchanged.

The true beauty of this system is the increased security you get with this model! You can publish a Bitcoin transaction on a billboard and everything is safe. Try putting your credit card number in a public space and see how long it takes to have major problems!

Bitcoin Transactions – Simple and Secure!

All you need to send money is a wallet with a network connection, some cryptocurrency of your choice, and a recipient address. With a modern mobile wallet, this process is as simple as scanning a QR code and hitting send. Thanks to Bitcoin’s security model, you don’t have to trust an intermediary like a payment company to process the transaction – it’s entirely peer-to-peer. And even more novel and important, there’s no need for anyone to reveal private numbers.

If you haven’t already, try sending your first transaction! Grab some Bitcoin or Bitcoin Cash, or maybe some Litecoin. Use a few dollars to introduce a friend to cryptocurrency, buy something fun, or donate to a good cause. Adoption is important, and it’s easy!

Learn Hashing, Binary, and Proof-of-Work with MicroProver (Code Companion #2)

Note: This article focuses on the development of MicroProver. See my slides for the full BTC2019 talk


Proof-of-work is a Bitcoin and blockchain topic of vital importance, as it allows transactions to occur without trusting an intermediary. However, understanding this concept also requires some computer science background. One needs to know about hashing algorithms, binary numbers, and a bit of probability to “get” proof-of-work.

I wanted to do a better job of explaining the concept of proof-of-work to individuals without a computer science background – so I came up with the idea of visualizing hashing and binary numbers with a cool little microcontroller I received at PyCon 2019. This code companion will dive into the development of MicroProver, and how I turned this project into a session at the 2019 Blockchain Training Conference!

MicroProver (running on the Adafruit Circuit Playground Express) displays a final “block hash”

How MicroProver Works

Toy Hashing Algorithm

The first thing needed to make simulated proof-of-work operate is a hashing algorithm. Bitcoin uses the cryptographically secure SHA-256 for its mining operations, among other things. However, these cryptographic algorithms are not readily available on microcontroller platforms such as the Circuit Playground. It took extensive effort to get cryptographic primitives working for my offline address generation project, and required a more powerful line of processor than the CPX’s M0.

However, for the purposes of this project, a cryptographically secure hash algorithm is not needed! This project is designed for visualization and learning, and has no security requirements. So in order to create the 8 bit hashes I wanted, I used the most basic form of hash function one might use for creating a simple hash table: the modulus operation. All of the code in this article can be found in src/core/MicroProver.py

    # Settings for "cryptography"
    self.HASH_MOD = 256


    # Return a really simple 8 bit hash
    # This is for educational purposes, so we don't need a
    # cryptographically secure hash, we just need one that works
    def hash_8bit(self, data):
        hash8 = data % self.HASH_MOD

        return hash8

For an 8-bit hash, we use the 8-bit number space, which contains 256 possible values (0-255). The algorithm takes the data modulo 256, giving us a reasonable hash for our purposes.
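To see the modulus in action off the board, here is a minimal standalone sketch in plain Python. It assumes the data is an integer and drops the class wrapper used in MicroProver; the sample inputs are arbitrary values chosen for illustration.

```python
HASH_MOD = 256  # size of the 8-bit number space


def hash_8bit(data):
    """Toy hash: reduce an integer to the range 0-255."""
    return data % HASH_MOD


print(hash_8bit(20))   # 20 (values below 256 map to themselves)
print(hash_8bit(300))  # 44
print(hash_8bit(556))  # 44 (collides with 300, since 556 - 300 = 256)
```

Collisions like the one above are expected with a hash this small, and they are harmless here because the project has no security requirements.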

Binary Representation

The other important concept needed to address proof-of-work is understanding binary numbers. In daily life, most of us are used to base 10, and working in other bases is a rather foreign concept. To make this easier to understand in the context of proof-of-work, I decided to use the built-in CPX LED lights, with red representing 0 and green representing 1.

The code takes the data (the hash output) and converts it from a raw number to an array of binary values, either True/False (used by the LED function) or "1"/"0" for string representation. The data is turned into binary using a technique called bitmasking, where a single bit of a byte is isolated using the bitwise AND operator (&).

    # Convert a byte of data (8 bit hash, etc.) into
    # an array of bits represented by True (1) and False (0)
    # (with default mode bool)
    # Specify optional mode "bit" for 0/1 representation
    def byte_to_bitarr(self, byte, mode="bool"):

        # Define some bitmasks for each spot in the byte
        # We'll create the array using bitmasking
        masks = [ 0x80, 0x40, 0x20, 0x10, 0x08, 0x04, 0x02, 0x01 ]
        byte = int(byte)

        bitarr = []
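        for i in range(0, 8):
            if mode == "bit":
                # String form: "1" if the masked bit is set, else "0"
                masked = byte & masks[i]
                bit = "1" if masked > 0 else "0"
            else:
                # Boolean form: True if the masked bit is set
                bit = bool(byte & masks[i])

            bitarr.append(bit)

        return bitarr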
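
The same bit extraction can also be written with shifts instead of a table of masks. The sketch below is plain Python meant for experimenting on a desktop, not code from MicroProver; the function name byte_to_bits is hypothetical, and the format() call is standard Python that may not behave identically on every microcontroller build.

```python
def byte_to_bits(byte):
    """Return the 8 bits of a byte, most significant first, as 0/1 ints."""
    # Shift the wanted bit down to position 0, then mask off everything else
    return [(byte >> shift) & 1 for shift in range(7, -1, -1)]


bits = byte_to_bits(20)
print(bits)                           # [0, 0, 0, 1, 0, 1, 0, 0]
print("".join(str(b) for b in bits))  # 00010100
print(format(20, "08b"))              # 00010100 (built-in cross-check)
```

Both approaches isolate one bit at a time with &; the mask table simply precomputes the values that the shifts produce on the fly.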

The LED display function then takes that array and lights up individual LEDs on the board to represent each bit:

    # This function displays an LED representation of a byte
    # It lights up 8 LEDs on the Playground Express board
    # Green represents a 1 bit
    # Red represents a 0 bit
    def display_byte_led(self, byte):

        bitarr = self.byte_to_bitarr(byte)
        for i in range(0, 8):
            if bitarr[i]:
                color = self.GREEN
            else:
                color = self.RED

            # Load pixels 1 - 8 (skipping 0 and 9) so they
            # fill evenly on each side of the board
            cpx.pixels[i + 1] = color

Tying It Together for Proof-of-Work

Using these nifty little hash concepts, binary operators, and LEDs, it becomes a bit easier to visualize the binary number comparisons needed by proof-of-work. Proof-of-work requires comparing a hash output to a difficulty target, and both can be thought of as binary numbers. For this toy visualization, we use small 8-bit numbers.

For example, say we have a difficulty target of 00100000. This value (32 in decimal) has two leading zeroes when represented as an 8-bit unsigned integer. Therefore, the final hash output must have at least three leading zeroes to be less than the target. For example, 00010100 (20 in decimal) is a valid block hash. Because the probability of finding a valid “block hash” decreases as the difficulty target decreases numerically, we “prove work” by making a lot of guesses to get a solution.
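
The guessing process can be sketched as a short loop. This is an illustrative version in plain Python, not the MicroProver source: it assumes each attempt is a random 32-bit integer run through the toy modulus hash, and the names prove_work and rng are hypothetical.

```python
import random

HASH_MOD = 256


def hash_8bit(data):
    """Toy 8-bit hash from earlier in the article."""
    return data % HASH_MOD


def prove_work(target, rng):
    """Guess random inputs until the toy hash falls below the target.

    Returns the winning hash and the number of attempts it took.
    """
    attempts = 0
    while True:
        attempts += 1
        guess = rng.getrandbits(32)    # assumed source of attempts
        block_hash = hash_8bit(guess)
        if block_hash < target:        # the proof-of-work comparison
            return block_hash, attempts


rng = random.Random(2019)              # seeded so runs are repeatable
block_hash, attempts = prove_work(0b00100000, rng)
print(format(block_hash, "08b"), "found after", attempts, "attempts")
```

Whatever hash it finds will start with at least three zeroes, because any 8-bit value below 00100000 must.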

In MicroProver, the user gets to program the difficulty level from 1-7, with the level being the number of leading zeroes in the target. This way, it is easy for the user to visualize the algorithm as attempts and the final solution are displayed on the board. They can look for the red “leading zeroes” as the algorithm works, and get a better feel for how difficulty affects the computing power/time needed to prove work.
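
To put rough numbers on how difficulty affects the work required, the sketch below maps each level to a target and computes the chance that a single guess succeeds. It rests on two assumptions that are not taken from the MicroProver source: that a level's target is the given number of leading zeroes followed by a single 1 bit (so level 2 is 00100000), and that hash outputs are uniformly spread over 0-255. The function name target_for_level is hypothetical.

```python
def target_for_level(level):
    """Assumed mapping: `level` leading zeroes followed by a single 1 bit."""
    if not 1 <= level <= 7:
        raise ValueError("level must be between 1 and 7")
    return 1 << (7 - level)


for level in range(1, 8):
    target = target_for_level(level)
    p_success = target / 256           # share of 8-bit hashes below target
    expected_attempts = 1 / p_success  # mean of a geometric distribution
    print(level, format(target, "08b"), p_success, expected_attempts)
```

Under these assumptions, each extra level halves the odds of a valid guess and doubles the expected number of attempts, from 4 at level 1 up to 256 at level 7.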

Extra Visual Help – Data Visualization Script

As an addition to the board itself, I created a data visualization script in src/dataviz/graph_pow.py. This script takes an optional log generated by the CPX simulation and graphs targets vs. attempts, breaking down how difficulty affects the probability of finding a solution:

Proof of Work difficulty target vs attempts to find a solution. It’s probability!

Proof-of-Work, Made Accessible

When I got the Circuit Playground, I knew I had to use it to create an interesting chaintuts project! After getting oriented with programming basic utilities on the board, I decided its features would be great for an educational assistant.

Fortunately, I have had the opportunity to take this project further and teach at the 2019 Blockchain Training Conference in Denver, CO! For this session, I’ll be breaking down proof-of-work and using MicroProver for interactive simulations and data visualization.

By using LEDs to visualize hashing, binary numbers, and a bit of probability, we can make this critical blockchain security topic more accessible to those without a computer science background. And the more folks who understand decentralization and trustless software, the more we can drive adoption of these technologies.