Bitcoin Cryptography – Hashing Algorithms

Overview

At the core of cryptocurrencies lies the science of cryptography. These mathematically secured and provable algorithms allow currencies like Bitcoin to be built in a way that’s peer-to-peer instead of based on corporate or governmental trust.

One of the key classes of cryptographic algorithms used in cryptocurrencies is hashing algorithms – powerful one-way functions with a broad set of interesting applications. Let’s learn about some of the important properties of hash functions and how they are used in Bitcoin.

All About Hashes

Properties of Hash Functions

Hash functions have a few key properties that make them incredibly useful and secure. Hash functions are one-way, they are preimage resistant (for crypto-secure versions), and are deterministic.

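The first property is that hash functions are one-way – you cannot go backwards from a hash output (called the digest or the hash) back to the original input. These functions are sometimes loosely described as “trapdoor functions”, because once an input falls into the trapdoor of the hash function, there is no coming back to the original message! (Strictly speaking, cryptographers reserve the term “trapdoor function” for one-way functions that can be inverted with a secret key; a hash function has no such secret.)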

For example, let’s look at the SHA-256 hash of the message “Hello there!”:

>>> from hashlib import sha256
>>> sha256(b"Hello there!").hexdigest()
'89b8b8e486421463d7e0f5caf60fb9cb35ce169b76e657ab21fc4d1d6b093603'

The message “Hello there!” is run through the function to create the digest “89b8b8e…”. There is no algorithm for going backwards – so if all you have is the hash output “89b8b8e…” and you want to know the original input, the only way to find it is to guess inputs until you find a matching digest! For example, this is how cracking passwords works. Attackers compromise a database of hashes, not plaintext passwords. They have dictionaries of common passwords that users use, and run those guesses through the hashing algorithm until a matching hash is found in the database.
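The guessing attack described above can be sketched in a few lines. This is a toy illustration only: the word list, the "leaked" hash, and the `crack` helper are all made up for the example, and real password stores should use salted, deliberately slow hashes rather than bare SHA-256.

```python
from hashlib import sha256

def crack(target_digest, guesses):
    """Return the first guess whose SHA-256 hex digest matches, else None."""
    for guess in guesses:
        if sha256(guess.encode()).hexdigest() == target_digest:
            return guess
    return None

# Hypothetical "leaked" hash and a tiny made-up dictionary of common passwords
leaked = sha256(b"letmein").hexdigest()
wordlist = ["123456", "password", "qwerty", "letmein", "dragon"]

print(crack(leaked, wordlist))
```

Notice that the attacker never reverses the hash. They only run the function forwards on each guess and compare digests.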

This leads us to the second useful property of cryptographically secure hashing algorithms known as preimage resistance. These algorithms are designed such that there is not a predictable pattern from inputs to outputs that might make it easier to guess. If you have the hash output “89b8b8e…” from above, there’s nothing at all in that hash that points you in the direction of what the original input might be. An interesting part of good algorithms that relates to preimage resistance is the avalanche effect, where changing the input even slightly results in a drastically different output:

>>> from hashlib import sha256
>>> sha256(b"Hello there!").hexdigest()
'89b8b8e486421463d7e0f5caf60fb9cb35ce169b76e657ab21fc4d1d6b093603'
>>> sha256(b"Hello there").hexdigest()
'4e47826698bb4630fb4451010062fadbf85d61427cbdfaed7ad0f23f239bed89'

Notice that simply removing one character from the input (the ‘!’) results in a hash that’s nowhere near the first hash.
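To put a number on "nowhere near", one option is to count how many of the 256 output bits differ between the two digests. The `differing_bits` helper below is a name invented for this sketch; for a well-behaved hash you would expect roughly half of the bits to flip.

```python
from hashlib import sha256

def differing_bits(a, b):
    """Count the bit positions where the SHA-256 digests of a and b differ."""
    da = int.from_bytes(sha256(a).digest(), "big")
    db = int.from_bytes(sha256(b).digest(), "big")
    return bin(da ^ db).count("1")

diff = differing_bits(b"Hello there!", b"Hello there")
print(f"{diff} of 256 bits differ")
```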

The third useful property of these functions is that they are deterministic. We know that they are one way – but it also turns out that the output is always the same for a particular input. From our example, feeding “Hello there!” into a properly-implemented SHA-256 function will always give the output value “89b8b8e…”. This is critical for the use of hash functions, as it means someone with a hash and an input can verify the correctness of a message. For example, a software developer can give you an expected hash for their program along with the binary. You can then run the binary through the hash algorithm to verify the program has not been tampered with on its way to you!
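A minimal sketch of that verification step might look like the following. The `matches_expected` helper and the sample bytes are assumptions for illustration; in practice you would read the downloaded file from disk and paste in the hash the developer published.

```python
import hmac
from hashlib import sha256

def matches_expected(data, expected_hex):
    """Hash the data and compare it against the published hex digest."""
    actual = sha256(data).hexdigest()
    return hmac.compare_digest(actual, expected_hex.lower())

# Stand-ins for a downloaded binary and the developer's published hash
binary = b"pretend this is the program you downloaded"
published = sha256(binary).hexdigest()

print(matches_expected(binary, published))         # untouched download
print(matches_expected(binary + b"!", published))  # tampered download
```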

Hashing in Bitcoin

There are two primary hash functions used in Bitcoin: SHA-256 and RIPEMD160. SHA-2, the family that includes SHA-256, was developed by the US National Security Agency (NSA) and first published in 2001. RIPEMD, on the other hand, was developed by a group of researchers in the EU and released in 1992, with an update in 1996. It’s interesting to note that Satoshi chose algorithms created in two very different ways, so whether one distrusts the NSA or distrusts openly developed algorithms, there are layers of security built in since both algorithms are used in the system.

Now what about how these are used? The first application is in Proof-of-Work Mining. POW is used to validate transactions and issue new coins on the peer-to-peer network without the need for trust. Roughly every 10 minutes, miners construct a potential block of transactions. They run the block header information and a value called the nonce through the SHA-256 algorithm. If the hash output treated as a 256-bit number is less than a difficulty target, the network accepts the Proof-of-Work solution as valid. It takes millions of nonce guesses to get to a hash output that meets the difficulty, but once a solution is found anyone on the network can verify it’s correct in one step thanks to the deterministic nature of hashes.

Ex (illustrative values):
Difficulty target is 0123abc...
"block header info" + 0 -> SHA-256 -> d3a2bb9...
"block header info" + 1 -> SHA-256 -> 110dbb9...
"block header info" + 2 -> SHA-256 -> 00a4173...

00a4173... < 0123abc..., so this meets the POW difficulty requirement
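The nonce search can be sketched as a short loop. This is a toy, not Bitcoin's real header format: the header bytes, the `mine` helper, and the very easy `TARGET` are assumptions chosen so the loop finishes in a moment, and the byte order used to turn the hash into a number is simplified. One real detail is kept: Bitcoin applies SHA-256 twice.

```python
from hashlib import sha256

def double_sha256(data):
    """SHA-256 applied twice, as Bitcoin does for block hashes."""
    return sha256(sha256(data).digest()).digest()

def mine(header, target, max_nonce=2**32):
    """Try nonces in order until the hash, read as a number, is below target."""
    for nonce in range(max_nonce):
        candidate = header + nonce.to_bytes(4, "little")
        value = int.from_bytes(double_sha256(candidate), "big")
        if value < target:
            return nonce, value
    return None

TARGET = 2**244  # toy target: roughly 1 in 4096 hashes will qualify
result = mine(b"block header info", TARGET)
nonce, value = result
print(f"nonce={nonce} hash={value:064x}")
```

Finding the nonce takes thousands of hash attempts even with this easy target, while checking it takes a single call to `double_sha256`. That asymmetry is the whole point of proof-of-work.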

The second application of hashes in Bitcoin is in address derivation. Bitcoin uses Elliptic Curve Cryptography as part of the address derivation scheme (which will be covered in a later tutorial :), but hashes are also a critical part of the algorithm. Deriving a Bitcoin address looks like this:

Ex (not real outputs):
Private key: 0123456789abc....
->
ECDSA
->
Public key: af325bc23d...
->
SHA-256 
->
RIPEMD160
->
base58 encoding
->
1aDF578G...

ECDSA provides one layer of security, but the hashing algorithms provide another! When a user sends funds to another user’s address, they are not paying directly to the public key of the receiver. They are actually paying to a “Pay to Public Key Hash” address. The public key is hashed twice, once with SHA-256 and again with RIPEMD160, then encoded. What this does is mask the receiver’s public key until the funds are later spent. So if a user avoids address reuse, other parties can never know the public key of an address containing any funds. If there’s a vulnerability in the way the wallet handles ECDSA or some future problem is found, this provides an extra layer of security over the owner’s funds.
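The hashing and encoding half of that pipeline can be sketched with the standard library. Assumptions to note: the function names are invented for this example, the 20-byte `example_hash` is just a sample value, the "base58 encoding" step is written as Base58Check (a version byte of 0x00 plus a 4-byte double SHA-256 checksum), and `hashlib.new("ripemd160")` only works if the local OpenSSL build still ships RIPEMD160.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def hash160(pubkey_bytes):
    """SHA-256 followed by RIPEMD160 (needs ripemd160 support in OpenSSL)."""
    return hashlib.new("ripemd160", hashlib.sha256(pubkey_bytes).digest()).digest()

def base58check(version, payload):
    """Prefix a version byte, append a 4-byte checksum, and Base58-encode."""
    data = version + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum
    num = int.from_bytes(raw, "big")
    encoded = ""
    while num > 0:
        num, rem = divmod(num, 58)
        encoded = B58_ALPHABET[rem] + encoded
    # Each leading zero byte is written as a literal '1'
    pad = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * pad + encoded

def p2pkh_address(pubkey_hash):
    """Encode a 20-byte public key hash as a legacy (version 0x00) address."""
    return base58check(b"\x00", pubkey_hash)

# Sample 20-byte public key hash, used here instead of hashing a real key
example_hash = bytes.fromhex("010966776006953d5567439e5e39f86a0d273bee")
address = p2pkh_address(example_hash)
print(address)
```

With a real public key you would call `p2pkh_address(hash160(pubkey_bytes))`. The checksum is why a mistyped address is almost always rejected by wallets instead of silently sending funds into the void.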

Hashing – A Critical Part of Bitcoin

Looking at the fascinating properties of hashing algorithms, it’s clear why they’re so useful in a peer-to-peer application like Bitcoin. Thanks to the one-way, preimage resistant, deterministic nature of these algorithms, we can build systems where other users can verify things like mined transactions securely without the need for trust. As well, hashes provide layers of security over fund ownership on the blockchain, thanks to P2PKH addresses. We Bitcoin fans trust the provability of math more than the fallibility of corporations and governments – who needs trust when you have cryptography!

Recovering BCH (Sent to BTC Address)

Overview

Many of the viewers of my tutorial on what happens when you send BCH to a BTC address have asked for more specific help on how to recover funds in this scenario. Fortunately, not all is lost if this happens – it just depends on the context. For non-custodial wallets (where the user controls the private key), it’s fairly straightforward to recover the lost BCH and send it back to a wallet the user would like to use. It is important to note, however, that this only works with these non-custodial wallets. If a user sends funds to a custodial wallet (like CashApp or an exchange), they’ll have to get in touch with that exchange’s customer service for help.

BCH Fund Recovery

What Actually Happens…

If a user sends BCH to a “BTC” address, the funds never really leave the Bitcoin Cash blockchain. The confusion happens because BCH and BTC legacy addresses are backward compatible – so when the user creates the BCH transaction with the BTC wallet address as the receiver, the transaction is totally valid and is sent.

But, the BTC and BCH addresses also share the same private key. Therefore, if the user has access to the BTC wallet’s recovery phrase (or the private key directly), they can import/sweep that key into a BCH wallet to get the funds back.

An Example Scenario, With Recovery Steps

For this tutorial, I “accidentally” sent some Bitcoin Cash to a Bitcoin address provided by a blockchain.info wallet. The funds do not show up in my BTC balance, because I actually did this on the BCH chain.

Looking at a block explorer, we can see the mistaken address for this transaction. In legacy format (BCH/BTC backward compatible), we see the address is `1Ka4YZ19kq87yXUAPXMt9KZLd2eap1pT4Y`

Now what we need to do is to get the associated private key for this address. In this case, blockchain.info provides a mnemonic seed phrase used to generate all the wallet’s private keys and associated addresses. Therefore, if we get this seed phrase we can extract the specific key for this address and import it into another wallet we control. The seed phrase for our test wallet here is:

water pulse panel anchor impulse brown effort cake open drastic bright aerobic 

It’s important that you don’t reuse this seed phrase for any of your own wallets – anyone who’s seen this tutorial could steal your money!

Now what we can do is import this seed phrase into a mnemonic tool. I’m a big fan of Ian Coleman’s Bip39 Mnemonic Code tool. An important note – if you’re doing this with real funds, download the webpage and run the tool when your PC is offline. Ian’s tool is trusted in the community, but it’s best practice to never put private information like seed phrases into online tools – your funds could be stolen by a nefarious website.

When the seed is put in the tool, the derived addresses and keys will be shown in a table.

Search for the address you mistakenly sent the funds to. If you’ve used your wallet a lot, you may have to generate more child addresses. Also, if you don’t see the address here, don’t panic – play around with some derivation settings. Different wallets use different paths and that sort of thing, but in this case our wallet used the defaults.

Now you need to copy the Wallet Import Format private key shown next to the address – this will allow you to unlock the funds. What we’ll do here is import or sweep the address into a BCH wallet we control to recover the funds. Note the distinction – an import keeps the funds in the same address. If you add this to a wallet with an existing backup phrase, the imported address will not be protected by that backup phrase. Sweeping is better – this will create a new transaction that sends the recovered funds to a new address controlled by the wallet’s phrase.

Your funds are back! For this sample recovery, I imported the funds into an Electron Cash wallet on desktop.

Fund Recovery – It’s All About Keys

The most critical step in recovering lost funds is understanding that whoever controls the private keys owns the lost funds. If you’ve sent your funds off to an exchange or app like CashApp, you’ll need help from them because you don’t control the keys. But if you accidentally sent funds to an address provided by your own BTC wallet, you can follow these steps to get your BCH back into your normal BCH wallet. It’s a matter of getting the key for the mistaken address into a BCH wallet that recognizes the funds on the blockchain.

I hope this helps! It can be scary to lose funds in the cryptocurrency space, where things are a lot more final than in the world of banking. Ultimately though, cryptocurrency gives us more control over our own money and that’s a great thing.

Oh and as a little easter egg for my viewers – I left the BCH for this tutorial in the sample address. If you want it, it’s yours 🙂 First recovered, first served.

Why You Can’t Just Brute-Force a Bitcoin Private Key

Overview

Unfortunately, sometimes Bitcoin private keys are lost. Users destroy wallets, throw away hard drives, or simply never back up keys in the first place. A seemingly obvious solution to the problem of a lost key might be to try and “guess” all the possible keys until you find the one that unlocks your addresses. One can just spin up their gaming laptop, or maybe some Azure VMs to get this done, right?

Well, no. In fact, there’s no practical chance at all that you could ever brute-force a Bitcoin private key. The scale of the problem is far larger than we as humans can even appreciate. Let’s dive into the numbers and show why cracking your lost paper wallet simply isn’t going to happen.

Bits and Brute-Force: Understanding Key Cracking

A “Bit” on Binary Numbers

In order to understand the scale of the numbers involved in Bitcoin private keys, one must first understand a little bit about the binary number system used by computers. We’re used to a base 10 system, where each “place” in a number can have digits 0-9. The number 100, for example, has a 0 in the one’s place, a 0 in the 10’s place, and a 1 in the 100’s place. The number 100 when represented as a power of 10 is 10^2, or ten squared.

When it comes to binary numbers, similar principles apply. But for each “place”, we can only have 0 or 1. On or off. The number 18 in binary, for example, is 10010. There’s a 0 in the 1’s place, 1 in the 2’s place, 0 in the 4’s place, 0 in the 8’s place, and finally, 1 in the 16’s place. Each place is a power of 2! 10000 in binary is 16, or 2^4 – two to the fourth power.
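If you want to check place values yourself, Python's built-in `bin` and `int` make it easy. The `place_values` helper is just a name chosen for this sketch.

```python
def place_values(n):
    """Return (place, digit) pairs for n in binary, smallest place first."""
    bits = bin(n)[2:]  # strip the '0b' prefix
    return [(2**i, int(d)) for i, d in enumerate(reversed(bits))]

print(bin(18))            # '0b10010'
print(place_values(18))   # [(1, 0), (2, 1), (4, 0), (8, 0), (16, 1)]
print(int("10000", 2))    # 16
```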

The Scale of the Problem

Now this concept helps us understand a bit more about the scale of a particular keyspace. Let’s say, for example, we have an 8-bit private key for a cryptosystem. 8 bits, and each bit can be either 1 or 0. This means there’s a total of 2^8 possible combinations of 1’s and 0’s, giving us 2^8 = 256 possible private keys. If we have a 16-bit private key, we have 2^16 = 65536 possible private keys. You can apply this formula to any size keyspace you’d like.

You may have noticed something from our small sample size here. We only added 8 bits to the keysize going from 8 to 16 bits. However, we went from 256 possible keys to over 65 thousand! That’s a lot more possible keys, and therefore a lot more possible guesses we’d have to make to brute force the keyspace. It turns out that adding bits to a keysize makes it exponentially harder to brute-force that particular keyspace, and that’s why even a seemingly small keysize can make it practically impossible to brute-force those keys.

So, what about Bitcoin? Bitcoin uses 256-bit private keys. So given our little bit of math here, we can calculate the number of possible combinations:

2^256 = 115792089237316195423570985008687907853269984665640564039457584007913129639936 or 1.157921 x 10^77
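Python handles integers of any size, so these keyspace figures are easy to reproduce. The `keyspace` helper is a name made up for this sketch.

```python
def keyspace(bits):
    """Number of distinct keys that fit in the given number of bits."""
    return 2**bits

for bits in (8, 16, 32, 64, 128, 256):
    print(f"{bits:>3}-bit keyspace: {keyspace(bits):.6e} possible keys")

# Adding 8 bits multiplies the keyspace by 2^8 = 256
print(keyspace(16) // keyspace(8))
```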

Now that is an unfathomably large number of possible keys. For scale, the number of keys available in the space is on par with the number of atoms thought to be in the observable universe. That’s a lot of keys to guess.

A More Hands-On Look at the Scale of the Problem

Now one can stare at that large number of possible keys in awe, and still not quite understand how impossible it is to brute force Bitcoin keys. Computers are pretty good at dealing with large numbers, right? Yes, but still not good enough to deal with this many guesses.

I’ve created a program called PkTime written in C. This program will show us how long it takes to brute-force various keysizes on the machine it’s run on, or it can be given the number of ops (guesses) per second a machine is capable of. And of course, it’s available under a free and open source license!

PkTime will calculate the actual time to brute-force Bitcoin keys when run on the machine without a given ops/second value. It uses Trezor’s libraries to generate an actual Bitcoin Keypair, so it more accurately reflects the time to check one keypair in a brute-force attack.

This is the output of PkTime on my personal machine, an Asus laptop with an intel i5 processor. The program only uses a single core (no multithreading), and you’ll see why it makes no difference anyway:

josh@Josh-Asus:~/pktime/bin$ ./pktime
Calculating some real and estimated brute force times...please wait
Average optime for one keypair check: 0.00000959
Real time to bruteforce 8 bit keysize: 0.004566 seconds with 256 iterations
Real time to bruteforce 12 bit keysize: 0.035437 seconds with 4096 iterations
Real time to bruteforce 16 bit keysize: 0.457779 seconds with 65536 iterations
Real time to bruteforce 18 bit keysize: 1.902443 seconds with 262144 iterations
Real time to bruteforce 20 bit keysize: 7.585260 seconds with 1048576 iterations
Est. time to bruteforce 24 bit keysize: 2.68233088 minutes with 16777216 iterations
Est. time to bruteforce 28 bit keysize: 42.91729408 minutes with 268435456 iterations
Est. time to bruteforce 32 bit keysize: 11.44461175 hours with 4294967296 iterations
Est. time to bruteforce 36 bit keysize: 7.62974117 days with 68719476736 iterations
Est. time to bruteforce 40 bit keysize: 122.07585872 days with 1099511627776 iterations
Est. time to bruteforce 44 bit keysize: 5.34760777 years with 1.759219E+13 iterations
Est. time to bruteforce 48 bit keysize: 85.56172438 years with 2.814750E+14 iterations
Est. time to bruteforce 52 bit keysize: 1368.98759015 years with 4.503600E+15 iterations
Est. time to bruteforce 64 bit keysize: 5.607373E+06 years with 1.844674E+19 iterations
Est. time to bruteforce 128 bit keysize: 1.034378E+26 years with 3.402824E+38 iterations
Est. time to bruteforce 192 bit keysize: 1.908090E+45 years with 6.277102E+57 iterations
Est. time to bruteforce 256 bit keysize: 3.519805E+64 years with 1.157921E+77 iterations

It’s amazing to see just how much longer it takes to brute-force a keyspace by adding just a few bits. The jump from 28 to 36 bits takes the problem from minutes to days, and the jump from 40 to just 44 bits makes the difference between days and years. This is exponential scaling.

A look at where we are for a 256-bit Bitcoin key…over 3 x 10^64 years. Trillions upon trillions upon trillions of years. Even a 50-some bit key is impossible to exhaust in any meaningful amount of time on a modern laptop.

Now you’re probably thinking – that’s a laptop though…what about the world’s best supercomputers, governments, etc. No need to fear for your keys; exponential scaling still protects them!

Let’s re-run our estimation program and give an ops/second for the world’s fastest supercomputer. top500.org lists IBM’s Summit as the current leader, with a theoretical maximum of 200795000000000000 flops, or floating-point operations per second. Let’s generously assume we have a supercomputer with this speed, and that it only takes 1 op to check a key (it takes quite a few more).

josh@Josh-Asus:~/pktime/bin$ ./pktime 200795000000000000
Average optime for one keypair check: 0.00000000
Est. time to bruteforce 24 bit keysize: 0.00000000 seconds with 16777216 iterations
Est. time to bruteforce 28 bit keysize: 0.00000000 seconds with 268435456 iterations
Est. time to bruteforce 32 bit keysize: 0.00000002 seconds with 4294967296 iterations
Est. time to bruteforce 36 bit keysize: 0.00000034 seconds with 68719476736 iterations
Est. time to bruteforce 40 bit keysize: 0.00000548 seconds with 1099511627776 iterations
Est. time to bruteforce 44 bit keysize: 0.00008761 seconds with 17592186044416 iterations
Est. time to bruteforce 48 bit keysize: 0.00140180 seconds with 281474976710656 iterations
Est. time to bruteforce 52 bit keysize: 0.02242884 seconds with 4503599627370496 iterations
Est. time to bruteforce 64 bit keysize: 1.53114238 minutes with 18446744073709551616 iterations
Est. time to bruteforce 128 bit keysize: 5.370103E+13 years with 3.402824E+38 iterations
Est. time to bruteforce 192 bit keysize: 9.906091E+32 years with 6.277102E+57 iterations
Est. time to bruteforce 256 bit keysize: 1.827351E+52 years with 1.157921E+77 iterations

Our supercomputer here is orders of magnitude faster than a laptop – in fact, it renders 64-bit keys completely insecure. But look how, once again, exponential scaling keeps 256-bit keys very, very safe. It would still take over 1 x 10^52 years to exhaust all the possible Bitcoin keys. Given that the universe is roughly 13 x 10^9 (13 billion) years old, I think it is safe to assume that brute-forcing the keys is an impossible task.
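The estimate itself is simple arithmetic: divide the number of keys by the guesses per second. Here is a rough Python version of that calculation. It is not PkTime's actual code; the function name and the 365.25-day year are assumptions, and like the run above it generously counts one operation per key check.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60
AGE_OF_UNIVERSE_YEARS = 13e9  # rough figure used in the text

def brute_force_seconds(bits, ops_per_second):
    """Seconds needed to try every key of the given size at the given rate."""
    return 2**bits / ops_per_second

SUMMIT_OPS = 200_795_000_000_000_000  # theoretical peak quoted above

minutes_64 = brute_force_seconds(64, SUMMIT_OPS) / 60
years_256 = brute_force_seconds(256, SUMMIT_OPS) / SECONDS_PER_YEAR

print(f"64-bit keyspace:  {minutes_64:.2f} minutes")
print(f"256-bit keyspace: {years_256:.3e} years")
print(f"That is {years_256 / AGE_OF_UNIVERSE_YEARS:.1e} times the age of the universe")
```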

One last note – even if we could build a theoretical computer that could guess much faster, there are energy limits to computation. The amount of energy it would take to do all these operations far exceeds what humanity could harvest.

Brute-Forcing Bitcoin – Not Gonna Happen!

Given the enormity of the 256-bit keyspace, the speed of the world’s most amazing computers, and the limits of computation itself – you can clearly assume your Bitcoin keys are safe. As with all things security, there are other attack vectors that can get your coins stolen. But, there’s no way someone is spinning up a cloud computer to try and brute-force keys. The probabilities are far too low of finding a key, and the cost is simply too high. If you wanna steal those Bitcoin keys…just crack open a cold one and wait until the end of the universe. Even then, you might run out of time!

Fetching Live Balances on a Microcontroller with WatchAddr (Code Companion #3)

Overview

One of the biggest perks of offline wallets (like paper wallets and hardware wallets) is the ability to store private keys away from prying eyes. In previous code companions, I talked about the challenges of building uBitAddr, a custom offline keypair generator that works with BCH, BTC, LTC, and ETH.

However, you probably want to keep tabs on your sweet, sweet offline savings. Many wallets have a “watch address” feature that lets you track the balances of an address without having the private key on hand for spending. In this project, I built a custom miniature “watch address” utility that runs on a wifi-enabled microcontroller!

Watching with Wifi

Fetching Data on an ESP8266 with MicroPython

The first part of my work involved connecting a wifi-enabled microcontroller to an API that would allow me to fetch live balance data. I wanted to be able to fetch data on addresses for my “big 4”, BCH, BTC, LTC, and ETH. So I chose to connect to the bitcoin.com API for BCH, and BlockCypher for the other 3. Both provide an easy-to-use API that does not require an API key – an added layer of complexity that isn’t necessary for fetching totally public blockchain data.

Connecting to a network on this platform is surprisingly straightforward, and that makes building internet-connected projects a treat:

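    # Connect to the wifi access point configured in auth.py
    # (relies on `import network`, `import time`, and `import auth` at the top of the module)
    def connect_wifi(self):

        conn = network.WLAN(network.STA_IF)
        conn.active(True)

        conn.connect(auth.SSID, auth.PASS)

        # connect() does not block, so poll for up to ~10 seconds
        for _ in range(100):
            if conn.isconnected():
                break
            time.sleep_ms(100)

        return conn.isconnected()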

This code initializes a WLAN connection using the network module built into MicroPython and connects using the pre-configured network name (SSID) and password. This data is stored in a separate file called auth.py – anyone cloning this project will simply copy the sample and add their own configuration.

Building a Better API

Connecting directly to the Bitcoin.com/BlockCypher APIs worked pretty well for just balance data, but I ran into two fairly serious issues fast.

First, connecting via HTTP isn’t secure or private – a man-in-the-middle attack could steal the user’s addresses. This isn’t a security issue per se, as blockchain addresses are public data and can’t be used to steal funds. However, it does compromise the privacy of the user who might not want others on the network to discover their addresses and therefore wallet balances!

Second, it turns out that the buffer for TLS (HTTPS) connections on the ESP8266 is very small, around 5KB. Fetching balance and price data in JSON format (with a bunch of stuff I didn’t need) overflowed the buffer every time! The Cryptonator price API doesn’t support HTTP at all, so I needed to kill two birds with one stone and find a fix for this.

My solution was to essentially create my own API on top of the balance and price APIs I wanted to use. This “proxy” fetches the balance and price data and digests it down to only the very minimum data I need, the address balance and current USD value. I return this data in a comma-delimited string – no fuss and a very small data size.

		# Fetch the data in multiple threads to reduce IO latency
		data = {}
		t_bal = threading.Thread(target=self.fetch_bal, args=(address, currency.upper(), data))
		t_price = threading.Thread(target=self.fetch_price, args=(address, currency.upper(), data))

		t_bal.start()
		t_price.start()
		t_bal.join()
		t_price.join()

		bal = data["bal"]
		price = data["price"]
		usd = bal * price

		# Return the data in the form of a comma-separated list
		response = "{0:.8f},{1:.2f}".format(bal,usd)
		return response

There are two functions in api/watchaddr.py not shown here that fetch the desired data from the API endpoints, parse, and return. This main code calls those functions and formats the data in the very basic string format before returning the response.
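On the microcontroller side, a response in this format needs almost no parsing. The sketch below shows one way the client could unpack it; `parse_response` and the sample values are hypothetical and not taken from the WatchAddr source.

```python
def parse_response(response):
    """Split the proxy's 'balance,usd' string into two floats."""
    bal_text, usd_text = response.strip().split(",")
    return float(bal_text), float(usd_text)

# Build a sample response with the same format string the proxy uses
sample = "{0:.8f},{1:.2f}".format(0.12345678, 42.5)
bal, usd = parse_response(sample)

print(sample)    # 0.12345678,42.50
print(bal, usd)
```

Compared with a full JSON payload, a string like this is a few dozen bytes, which sits comfortably inside the small TLS buffer.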

There’s also a fun optimization I added here: I’m fetching the data in two separate threads. Python doesn’t have “true” multithreading due to the Global Interpreter Lock, but it still works great for IO bound operations like waiting for a network response!
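Here is a self-contained toy version of that two-thread pattern. The two fetch functions are stand-ins that just sleep to imitate network latency, and the address, balance, and price are made-up values; only the threading structure mirrors the snippet above.

```python
import threading
import time

def fetch_bal(address, currency, data):
    time.sleep(0.2)          # stand-in for waiting on the balance API
    data["bal"] = 1.5        # made-up balance

def fetch_price(address, currency, data):
    time.sleep(0.2)          # stand-in for waiting on the price API
    data["price"] = 100.0    # made-up USD price

def fetch_both(address, currency):
    """Run both fetches in parallel threads and format the combined result."""
    data = {}
    threads = [
        threading.Thread(target=fn, args=(address, currency, data))
        for fn in (fetch_bal, fetch_price)
    ]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.time() - start
    usd = data["bal"] * data["price"]
    return "{0:.8f},{1:.2f}".format(data["bal"], usd), elapsed

response, elapsed = fetch_both("1ExampleAddress", "BTC")
print(response)
print(f"took {elapsed:.2f}s")
```

Because both threads spend their time waiting rather than computing, the total should land near the duration of the slower fetch instead of the sum of the two.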

Screen Time!

Fetching the data and building a proxy is fun and all, but the ultimate goal is to run this off of a battery and not just off a USB cable connected to my laptop. So, I needed some way to display the information that wasn’t writing to my laptop’s console. In comes the delightfully small and powerful Adafruit OLED display.

This screen is more robust than the character screen I’ve used in past projects – it can display pixels! Fortunately it does come with a simple API for writing text which is all we need here. This screen is wired up to communicate over the I2C serial protocol, which I find nice and simple to use. I also have an SPI version of the screen I would like to add support for as well, and it would be easy to wire up a character LCD.

    # Initialize the OLED screen for display
    def init_oled(self):

        i2c = machine.I2C(-1, machine.Pin(5), machine.Pin(4))
        self.oled = ssd1306.SSD1306_I2C(128, 32, i2c)
        self.oled_line = 0

    # Define a flexible display function
    # This can simply print to serial or output to a peripheral
    def output_data(self, data):
        if self.output == self.OUTPUT_DISPLAY:
            self.oled.text(data, 0, self.oled_line)
            self.oled.show()
            # Rows sit at y = 0, 10, 20; wrap after the third row
            if self.oled_line == 20:
                self.oled_line = 0
            else:
                self.oled_line = self.oled_line + 10
        else:
            print(data)

These functions initialize the OLED screen and allow output of up to 3 rows of text (all that will fit on the screen).

Showing current crypto balances
Showing current USD value

Offline Addresses, Online Watching

This was another very fun project to build, and I’m proud of it! I love tinkering, and working with microcontrollers has been an exciting new avenue for learning and building interesting projects.

This utility is a nice companion for uBitAddr, so someone making offline wallets can keep tabs on their secure savings in a small, lightweight package. And as always, I think looking at this code can help folks understand more about how cryptocurrencies and software work.

How to Read the Bitcoin Whitepaper

Overview

On October 31st, 2008, Satoshi Nakamoto graced the world with his vision for a peer-to-peer electronic cash system called Bitcoin. The cryptocurrencies we use today started with this abstract and the ideas contained within. I highly recommend anyone interested in Bitcoin or other open blockchain projects read the original whitepaper, but it can be a bit technical and terse for the completely uninitiated.

Let’s walk through the whitepaper together, and get an idea of what each section discusses and why it’s important for creating the Bitcoin system. The whitepaper can be read here: https://www.bitcoincash.org/bitcoin.pdf and from many other sources.

Understanding The Whitepaper

Abstract and Section 1

The abstract and first section of the whitepaper lay the groundwork for why Bitcoin is important and how it solves several problems with the existing system used for online transactions. Whitepapers such as this are often oriented around solving a problem, and the Bitcoin whitepaper discusses the particular problem of creating a trustless system of internet money, in contrast to traditional systems that require intermediary processors and mints.

Section 2 – Digital Signatures

Section 2 describes what we now know as the system of private/public key pairs (addresses) and how they are used for transactions. In Bitcoin, the user holds their own private keys that prove they own their coins. When the user wants to spend these coins, they create a new transaction that sends coins to another address. That transaction contains a mathematical/cryptographic proof that they are the rightful owner of those funds, called a digital signature. The user can prove to the rest of the users on the network that they control a private key without revealing that key at all! Instead, they sign a message and provide a public key, allowing other users to verify they own some money without revealing the secret.

This is a powerful and important feature in Bitcoin – transactions are created and chained together without the need for secret information to be revealed. However, this doesn’t prevent the problem of double spending, where a user could simply do a digital signature twice to re-spend coins.

Sections 3, 4 & 11 – Proof of Work Consensus

In order to prevent double spending and ensure consensus across the peer-to-peer network of users, Bitcoin uses an interesting system called proof-of-work. This system prevents fraud by requiring that users expend real-world resources in the form of electricity and computing power to solve a mathematical problem. Sections 3 & 4 describe the proof-of-work timestamp system used to create a consensus on what transactions are valid and when they occurred.

This mathematical problem requires taking a batch of transactions (batched every ten minutes, roughly) and solving a very difficult guessing game, essentially. The answer to this problem, based on data in the block, can only be found by making a huge number of guesses. But once the answer is found, all the other nodes on the network can verify the answer is correct in an instant! This is based on the properties of the hashing algorithm used in the problem.

Now what does that mean for Bitcoin? It means that as a bunch of honest nodes on the network contribute to security through proof-of-work, attackers don’t have the resources to outcompete the honest chain. For an attacker to put a fraudulent transaction in a block, they would have to compute faster than the rest of the entire world acting honestly, and with a large network effect this becomes impossible to do at scale!

As well, part of the problem data for a new block is based on older block data (hence, blockchain as the blocks are cryptographically linked). So, if an attacker wanted to create a fraudulent transaction that is three blocks back in the history, they would have to outcompute the rest of the world, doing thirty minutes of work in under ten. Section 11 shows that the probability of a successful attack becomes negligible after only a few blocks of history.
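Section 11 of the whitepaper includes a short C routine for this calculation. Below is a Python port of it, offered as a sketch: `q` is the attacker's share of the total hash power and `z` is the number of blocks the attacker is behind. The function name is adapted for Python.

```python
import math

def attacker_success_probability(q, z):
    """Probability that an attacker with hash share q catches up from z blocks behind."""
    p = 1.0 - q                 # honest share of hash power
    lam = z * (q / p)           # expected attacker progress (Poisson mean)
    total = 1.0
    for k in range(z + 1):
        poisson = math.exp(-lam)
        for i in range(1, k + 1):
            poisson *= lam / i  # builds lam^k * e^-lam / k!
        total -= poisson * (1 - (q / p) ** (z - k))
    return total

for z in range(0, 7):
    print(f"q=0.1 z={z} P={attacker_success_probability(0.1, z):.7f}")
```

With a 10% attacker the probability falls off quickly as `z` grows, which is why waiting for a handful of confirmations is considered enough for most payments.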

Sections 5 & 6 – Networking and Incentives

Now why act honestly in the first place? Sections 5 & 6 describe how the network interacts and rewards legitimate users.

Section 5 describes how transactions are flooded out across the peer to peer network for batch processing. Any node that wants to mine to help secure the chain can do so using their CPU power. When a solution is found, it is shared with other nodes that can verify it is correct, and the race to solve the next problem begins. Nodes running the Bitcoin software will always follow the longest chain of solved problems or the longest chain of “proof-of-work.”

Section 6 describes one of the most important aspects of proof-of-work in Bitcoin, and that’s the system of economic incentives that rewards miners for acting honestly. The miner that solves the problem for a block gets two rewards – first, they get newly minted Bitcoin. They also receive all of the transaction fees in that block. The system is therefore designed so that it’s easier and more profitable to mine legitimately than to try and attack the network.

Sections 7, 8, & 9 – Dealing with Block Data

Storing all of the data in blocks takes up a lot of space as time goes on. The Bitcoin blockchain is several hundred gigabytes in size as of the time of this writing. However, Satoshi found a way to keep validation secure without storing every bit of data and discusses it in Section 7. Transaction data is cryptographically “summarized” in a Merkle Tree, so that the amount of data needed for secure validation is reduced through a process called pruning.
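
To make the “summarizing” concrete, here is a minimal sketch of computing a Merkle root. It assumes Bitcoin-style double SHA-256 and the convention of pairing an odd hash out with itself; the transaction bytes are placeholders, and real Bitcoin hashes serialized transactions with specific byte ordering that this toy ignores.

```python
from hashlib import sha256

def double_sha256(data: bytes) -> bytes:
    return sha256(sha256(data).digest()).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    """Hash the leaves, then hash pairs of hashes until one root remains."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [double_sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])   # odd count: pair the last hash with itself
        level = [double_sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

transactions = [b"tx-a", b"tx-b", b"tx-c"]   # placeholder bytes, not real transactions
root = merkle_root(transactions)
print(root.hex())
```

However many transactions go in, the root is always 32 bytes, and changing any single transaction changes it.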

As well, thanks to Merkle trees, it is not necessary for every user on the network to store the whole blockchain and validate data. Section 8 describes a process known as Simplified Payment Verification that allows Bitcoin clients to create and receive payments securely without needing the entire history stored in the blockchain. This is especially critical for applications such as mobile wallets.
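
The idea behind Simplified Payment Verification can be sketched as a Merkle proof: a light client holds only the Merkle root from a block header and is handed the few sibling hashes on the path from its transaction up to that root. The helper names and the (hash, side) branch format below are my own assumptions for illustration, not the wire format real clients use.

```python
from hashlib import sha256

def double_sha256(data: bytes) -> bytes:
    return sha256(sha256(data).digest()).digest()

def merkle_proof(leaf_hashes: list[bytes], index: int):
    """Return (root, branch); each branch item is (sibling_hash, sibling_is_on_the_right)."""
    level = list(leaf_hashes)
    branch = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])        # odd count: duplicate the last hash
        sibling = index ^ 1                # the other member of this node's pair
        branch.append((level[sibling], sibling > index))
        level = [double_sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        index //= 2
    return level[0], branch

def verify_proof(leaf_hash: bytes, branch, root: bytes) -> bool:
    """Rebuild the path to the root using only the leaf and its sibling hashes."""
    current = leaf_hash
    for sibling, sibling_on_right in branch:
        pair = current + sibling if sibling_on_right else sibling + current
        current = double_sha256(pair)
    return current == root

leaves = [double_sha256(tx) for tx in (b"tx-a", b"tx-b", b"tx-c", b"tx-d", b"tx-e")]
root, branch = merkle_proof(leaves, 2)
print(len(branch), verify_proof(leaves[2], branch, root))
```

The proof grows with the logarithm of the number of transactions, which is why a phone can check a payment without downloading whole blocks.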

Section 9 describes the model Bitcoin uses to store data about who owns what. This is the UTXO blockchain model, where each wallet owns these Unspent Transaction Outputs that behave like dollar bills. A user can take multiple UTXOs and combine them to create a transaction, and send the “change” back to their own wallet.
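
Here is a small sketch of the “dollar bills and change” idea. The UTXO class, the largest-first selection rule, and the amounts are all assumptions made for illustration; real wallets use more careful coin selection.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class UTXO:
    txid: str    # hypothetical id of the transaction that created this output
    index: int   # position of the output within that transaction
    value: int   # amount in satoshis

def build_payment(utxos: list[UTXO], amount: int, fee: int):
    """Pick outputs (largest first) until they cover amount + fee; return (inputs, change)."""
    selected, total = [], 0
    for utxo in sorted(utxos, key=lambda u: u.value, reverse=True):
        selected.append(utxo)
        total += utxo.value
        if total >= amount + fee:
            return selected, total - amount - fee
    raise ValueError("insufficient funds")

wallet = [UTXO("aa", 0, 50_000), UTXO("bb", 1, 30_000), UTXO("cc", 0, 5_000)]
inputs, change = build_payment(wallet, amount=60_000, fee=1_000)
print([u.txid for u in inputs], change)
```

Two “bills” worth 80,000 satoshis are spent in full here, and the 19,000 left over after the payment and fee goes back to the sender as a new output.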

Section 10 – A Privacy Primer

Everything in Bitcoin happens on a completely public blockchain. It is pseudonymous, but not anonymous! In order to prevent blockchain analysis from linking a bunch of transactions back to an individual user, Satoshi recommends in Section 10 that users create new addresses for each payment and not reuse old addresses. This makes linking transactions together much harder for observers.

Understanding Bitcoin, Right from the Source!

The Bitcoin whitepaper is a great place to get started in understanding Bitcoin. In only 8 pages of information, this work describes how to create a system of payments that doesn’t rely on any central, trusted authorities like our traditional monetary system does.

The paper itself is brief and to the point, which may make it tricky to understand at first. But with this companion, some re-reading, and exploring other works out there, it becomes easier to understand the magic of the ideas described. Happy Birthday to Bitcoin, and happy reading all!

HD Wallets – BIPs and Terminology

Overview

The advent of HD wallets has made key management a far easier task for cryptocurrency users. These “Hierarchical Deterministic” wallets can generate a practically unlimited number of private keys and addresses from a single seed, eliminating the need for the periodic backups required by the old-style, nondeterministic wallets.

Despite the ease of backup, there are a few more moving parts inside an HD system than there are with traditional keypairs, so understanding how this works is a little more complex. Let’s get a high level overview of how HD wallets work and the associated terminology.

An HD Wallet Glossary

BIP, Boop, Bop – What’s With All the Proposals?

First, let’s discuss the BIPs involved with the most common HD wallet technologies. BIP stands for “Bitcoin Improvement Proposal” – this is a standard for proposing new additions to the Bitcoin protocol and software that is community driven.

BIP32 – Titled “Hierarchical Deterministic Wallets”, this BIP defines the basic specification for a protocol that generates addresses deterministically from a single seed, rather than randomly. In this BIP, Pieter Wuille describes algorithms for taking a cryptographic seed and generating a tree of keys and addresses that can be reproduced again from that same seed.

BIP39 – Titled “Mnemonic code for generating deterministic keys”, this BIP describes a scheme for encoding the cryptographic seed as English words, making backups far easier and safer for end users. This is where those mnemonic seed phrases you may be used to seeing originated! So instead of having to back up a string of hexadecimal or base58 data, we can write down 12-24 words to back up an entire wallet!
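
Going from the written-down words back to seed bytes is a key-stretching step that Python's standard library can do. The sketch below follows the BIP39 parameters as I understand them (PBKDF2 with HMAC-SHA512, 2048 rounds, and a salt of “mnemonic” plus an optional passphrase); it does not check the phrase against the word list or its checksum, which a real wallet must do.

```python
import hashlib
import unicodedata

def mnemonic_to_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Stretch a mnemonic phrase into a 64-byte seed."""
    words = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", words, salt, 2048, dklen=64)

# A widely published test phrase; never use it to hold real funds.
phrase = "abandon " * 11 + "about"
seed = mnemonic_to_seed(phrase)
print(len(seed), seed.hex()[:16])
```

The same words always give the same 64 bytes, and adding a passphrase gives a completely different seed.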

BIP44 – Titled “Multi-Account Hierarchy for Deterministic Wallets”, this BIP builds on top of the work done in BIP32 to make multi-currency, multi-account wallets possible. BIP44 makes implementing multi-currency wallets such as Coinomi far easier, as it makes the handling of various cryptocurrencies standard, rather than up to the wallet developer to figure out.

Extended Keys (xprv, xpub) and Root vs. Other Keys

One of my YouTube viewers asked the question: “What’s the difference between a BIP32 root key and BIP32 extended private key when they both use xprv as the prefix?”

Well, let’s remember that HD wallets generate keypairs in a deterministic, or predictable manner. It turns out that this structure takes the form of a tree of keys – meaning there is a tree root and many branches generated from that root.

The Root Key is generated directly from the wallet seed. The user’s mnemonic seed phrase is first converted into a binary seed, and that seed is then run through a keyed hashing algorithm called HMAC-SHA512 to produce the root key. The root key is the very top of the tree of addresses, and other keys are derived from it.
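
That last hop, from seed bytes to root key, can be sketched in a few lines. BIP32 keys the HMAC with the string “Bitcoin seed” and splits the 64-byte result in half; the function name and the example seed bytes are mine, and the validity checks the specification asks for (the key must be non-zero and below the curve order) are left out.

```python
import hashlib
import hmac

def master_key_from_seed(seed: bytes) -> tuple[bytes, bytes]:
    """Return (master private key, master chain code), 32 bytes each."""
    digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
    return digest[:32], digest[32:]

seed = bytes.fromhex("000102030405060708090a0b0c0d0e0f")   # illustrative seed bytes only
private_key, chain_code = master_key_from_seed(seed)
print(private_key.hex())
print(chain_code.hex())
```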

The Extended Private Key is one of the many branches on the tree of keypairs generated from the root. Its relationship to the root is that of child to parent – the root is the ultimate origin of all the child keys in the tree.

Now what does the “extended” mean? As part of the BIP32 specification, it turns out there are actually two pieces of data used to generate the actual Bitcoin private key and address. There are 256 bits of information that serve as the private key, and 256 bits of information called the chain code. The chain code makes it impossible to find any “siblings” on the tree without it, so that the keypairs in the wallet appear to be random when they’re actually not. This enhances security and privacy.

Together, the key and the chain code form a 512-bit piece of data called the extended key. This format applies to both private and public keys (xprv and xpub).
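
To see the chain code at work, here is a minimal sketch of BIP32's “hardened” private child derivation, the one variant that needs no elliptic-curve math. The parent key and chain code are placeholder values, the function name is my own, and the specification's edge-case checks (an out-of-range or zero result) are omitted.

```python
import hashlib
import hmac

# Order of the secp256k1 group that Bitcoin private keys live in.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
HARDENED = 0x80000000   # hardened child indexes start at 2**31

def derive_hardened_child(parent_key: bytes, parent_chain_code: bytes, index: int):
    """Return (child private key, child chain code) for a hardened index."""
    if index < HARDENED:
        raise ValueError("this sketch only handles hardened indexes (>= 2**31)")
    data = b"\x00" + parent_key + index.to_bytes(4, "big")
    digest = hmac.new(parent_chain_code, data, hashlib.sha512).digest()
    left, child_chain_code = digest[:32], digest[32:]
    child_int = (int.from_bytes(left, "big") + int.from_bytes(parent_key, "big")) % N
    return child_int.to_bytes(32, "big"), child_chain_code

# Placeholder parent values, for illustration only.
parent_key = hashlib.sha256(b"placeholder parent key").digest()
parent_chain_code = hashlib.sha256(b"placeholder chain code").digest()

child_a = derive_hardened_child(parent_key, parent_chain_code, HARDENED + 0)
child_b = derive_hardened_child(parent_key, parent_chain_code, HARDENED + 1)
print(child_a[0].hex())
print(child_b[0].hex())
```

The chain code is the HMAC key here, so someone holding only a private key, without its chain code, cannot compute that key's children.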

BIPs, Extended Keys, and HD Goodness

HD wallets are an interesting and complex topic, and their existence makes securing and backing up wallets much easier for users. This concept involves several incremental improvement proposals (BIPs) and the introduction of some new cryptography concepts – not just private and public keys, but chain codes and the combined format of extended keys.

It becomes a bit easier to understand this concept at a high level when understanding the terminology behind it, so I hope this glossary helps clarify some of the mystery.

Comparing Major Mining Algorithms

Overview

As it stands, all of the top cryptocurrencies (Bitcoin Cash, Ethereum, Litecoin, and Bitcoin) use proof-of-work mining to secure their networks. With proof-of-work, special nodes on the network called miners use their computing power to try and solve a mathematical problem. This problem is designed so that a miner has to do a bunch of guessing to get the answer, but anyone else can verify that answer very quickly.

The general idea across proof-of-work variations is the same: miners have to guess a bunch to get an answer, essentially proving that they’ve done some amount of work. And the collective amount of computing power on the network makes pulling off fraud impossibly hard. However, there are some different variations of these proof-of-work algorithms used in different cryptocurrencies. Let’s take a high-level look at these variations and how they achieve the same goal in different ways.

Mining Algorithm Variations

Bitcoin Cash/Bitcoin Mining with SHA-256

The original mining algorithm used in a cryptocurrency is the fairly straightforward SHA-256 used by Bitcoin. This mining algorithm solves a simple problem: given some block data, add a number called a “nonce”, and run that through the SHA-256 hashing algorithm. This one-way cryptographic hash outputs a very large number (256 bits if you’re curious), and that number has to be less than a difficulty target number for the problem to be “solved” with that nonce. With a simple toy algorithm (8 bits), a solution might look something like this:

0 0 1 0 0 0 0 0 - Difficulty target value

1 0 0 1 1 1 1 1 - Guess #1 is not valid - greater than target
...
...
...
...
0 0 0 0 1 0 1 0 - Guess N is a valid hash - less than target

This algorithm is a clean and simple one. Guess a number, hash the data, and hope the resulting block hash is less than the difficulty target.
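
The guess-and-check loop is easy to try at toy scale. The sketch below hashes placeholder block data plus a nonce with SHA-256 applied twice (as Bitcoin does) and compares the result with a deliberately easy target; the byte layout and function names are simplifications of my own, not Bitcoin's real block header format.

```python
from hashlib import sha256

def block_hash(block_data: bytes, nonce: int) -> int:
    """Double SHA-256 of the data plus nonce, read as one big integer."""
    payload = block_data + nonce.to_bytes(4, "little")
    digest = sha256(sha256(payload).digest()).digest()
    return int.from_bytes(digest, "big")

def mine(block_data: bytes, target: int, max_nonce: int = 2**32) -> int:
    """Try nonces in order until the hash falls below the target."""
    for nonce in range(max_nonce):
        if block_hash(block_data, nonce) < target:
            return nonce
    raise RuntimeError("no valid nonce found in range")

def verify(block_data: bytes, nonce: int, target: int) -> bool:
    """Checking a claimed solution takes a single hash."""
    return block_hash(block_data, nonce) < target

target = 1 << 240   # hash must start with 16 zero bits: roughly 65,000 guesses on average
nonce = mine(b"toy block data", target)
print(nonce, verify(b"toy block data", nonce, target))
```

Finding the nonce takes tens of thousands of hashes even at this easy setting, while verifying it takes one. Lowering the target makes the search harder without making verification any slower.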

However, a disadvantage of this algorithm is in the equipment needed to contribute to mining on the network. SHA-256 mining is a hard computing problem, but that’s all it’s limited by. As the Bitcoin network has adjusted the difficulty target over time, profitable mining has become limited to specialized computing devices called ASICs – Application Specific Integrated Circuits. It’s not profitable or feasible for a single user like you or me to mine Bitcoin on a single device like a PC anymore – it’s the world of specialized companies and mining pools. This can be considered a problem of centralization, as fewer everyday users can participate in this part of securing the network.

Litecoin Mining with Scrypt

One of the first major forks of the Bitcoin codebase resulted in the popular currency Litecoin, which made changes to the mining algorithm in an attempt to solve this problem of a high barrier to entry for mining. Litecoin uses an alternative hashing algorithm called Scrypt in place of SHA-256. Scrypt is actually considered a key-derivation function rather than a pure hashing function, although the end goal is roughly the same: a one-way function that takes some data and outputs some bits that are the same every time for a particular input.

The difference with a key-derivation function or specialized hashing algorithm like this is that it’s designed to be more computationally difficult than algorithms like SHA-256. Scrypt is memory hard, meaning that the algorithm is more limited by the available memory in the system than by the computing power.

For key derivation, this is great because it’s hard to do brute force attacks on a database of keys – in other words, it’s hard to guess what the original password was. For our mining algorithm, it’s great because the memory requirement is meant to blunt the advantage that ASICs give miners. This makes mining easier for folks that only have access to devices like GPUs, and prevents some of the mining centralization and barrier to entry that’s seen with SHA-256 mining.
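
Python's hashlib exposes Scrypt directly, so the memory-hard step can be sketched as below. The parameters (N=1024, r=1, p=1, 32-byte output, header used as its own salt) are the ones commonly cited for Litecoin and should be treated as an assumption here; the header bytes are placeholders, and hashlib.scrypt requires a Python build linked against a recent OpenSSL.

```python
import hashlib

SCRYPT_N, SCRYPT_R, SCRYPT_P = 1024, 1, 1

def litecoin_style_pow_hash(header: bytes) -> bytes:
    """Scrypt the header, using the header itself as the salt."""
    return hashlib.scrypt(header, salt=header,
                          n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32)

# Scrypt's main working buffer is about 128 * N * r bytes.
memory_bytes = 128 * SCRYPT_N * SCRYPT_R

header = b"toy block header" + (0).to_bytes(4, "little")   # placeholder data plus a nonce
print(memory_bytes, litecoin_style_pow_hash(header).hex())
```

Each guess needs its own scratch buffer of roughly 128 KiB with these settings, which is the cost that a plain SHA-256 guess does not have.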

Ethereum Mining with Ethash

Ethereum mining follows a similar model to Litecoin – it was designed to prevent mining centralization. However, Ethereum goes further than simply using a memory-hard key derivation function or something of that nature. Ethereum uses its own memory-hard algorithm for mining called Ethash, custom designed by its creators.

Ethash is based on an algorithm called Dagger-Hashimoto used to make mining a memory-hard problem. Every N blocks or so, a large dataset is generated using the block data as a “seed”. The Dagger part of the algorithm was designed by Ethereum’s creator Vitalik Buterin to make mining memory hard, but make verifying the answer relatively easy for non-mining nodes on the network. The Hashimoto part was designed by Thaddeus Dryja to make a memory-hard hashing problem. Combining these concepts into Ethash makes a mining algorithm that’s less prone to requiring specialized hardware over time than SHA-256 mining.
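
The code below is not Ethash. It is only a toy sketch of the general shape described above, under my own simplifying assumptions: a dataset is generated from a seed, and every guess has to read pseudo-random entries from that dataset, so a miner wants the whole thing in memory. All names and sizes are made up for illustration.

```python
from hashlib import sha256

def build_dataset(seed: bytes, items: int) -> list[bytes]:
    """Generate a reproducible dataset by repeatedly hashing the seed."""
    dataset = []
    current = sha256(seed).digest()
    for _ in range(items):
        dataset.append(current)
        current = sha256(current).digest()
    return dataset

def toy_memory_hard_hash(header: bytes, nonce: int,
                         dataset: list[bytes], rounds: int = 16) -> bytes:
    """Mix the header and nonce with dataset entries chosen by the running hash."""
    mix = sha256(header + nonce.to_bytes(8, "little")).digest()
    for _ in range(rounds):
        index = int.from_bytes(mix[:4], "little") % len(dataset)
        mix = sha256(mix + dataset[index]).digest()
    return mix

dataset = build_dataset(b"toy epoch seed", items=4096)
print(toy_memory_hard_hash(b"toy header", 0, dataset).hex())
```

Because the next lookup depends on the previous hash, the reads cannot be predicted ahead of time, so memory access rather than raw hashing speed becomes the bottleneck.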

Mining Variations – Same Idea, Different Requirements

The overall problem in proof-of-work mining is the same across currencies and algorithm variations – a mining node must expend resources to guess a bunch and find an answer to the problem. However, there are a variety of ways in which the problem those miners solve can be constructed.

Some variations like SHA-256 are simple, but prone to centralization and specialized hardware requirements over time. Others like Litecoin and Ethereum take a different approach, desiring to make mining a more equitable process across the network at the expense of some complexity.

Regardless of the approach, proof-of-work mining allows lots of individuals to come together and create a peer-to-peer network of money, without the need to trust any one central “clearing house” to process transactions. Proof-of-work mining makes pulling off fraud a difficult or impossible endeavor, so that these currencies remain globally decentralized and secure.

Bitcoin is Not (Just) for Rich People

Overview

At a recent event at Duquesne University, a student asked me to explain what Bitcoin is. And as an addendum to her question, she stated that her impression is that Bitcoin is only for rich people…

As an educator in the space and a strong believer in the power of cryptocurrencies to change the fabric of finance, I can’t stand that Bitcoin and its culture has devolved into memes about hodling (holding) and buying Lamborghinis when the price goes up. Let’s discuss why I think the true value of this technology goes so far beyond price speculation.

Getting Past “Number Go Up”

Is it possibly good for the community and adoption when the price of cryptocurrencies goes up, relative to the United States Dollar? Probably. After all, a medium of exchange must have some value for it to be worth accepting for goods and services. However, the value must be fundamentally derived from the currency’s utility! Bitcoin Cash, Litecoin, Ethereum, et al. are first and foremost cryptocurrencies. Bitcoin should be, but its loudest proponents in the last several years have argued its only major use is as a store of value, so I’ll leave that up to you to decide.

But why is this distinction so important to me, and to many others working in the space? I so strongly believe in Bitcoin as peer to peer cash because its unique properties enable everyone in society to have more sovereignty over their finances.

The cryptocurrencies I study and teach about are decentralized. No central banks or corporate institutions control these forms of money, meaning there is no central point of failure. It’s peer-to-peer money – just the sender and receiver, no corporation in the middle deciding which transactions are valid and allowed to proceed. This leads to the property of censorship resistance, meaning that no one can arbitrarily stop you from transacting with another party. And of course, these currencies are truly global and borderless – no silly geopolitics here. Send money to anyone, anywhere.

Peer to Peer Cash for Everyone

These properties are pretty interesting for those of us in the modern world, yes. We can all benefit from increased security and economic freedom, undoubtedly. However, our judgement on the true value of these properties is often clouded by our wealth and access to privileged banking.

Imagine for a second that you’re a migrant worker, thousands of miles from your family. Your hard work is your family’s lifeline, but Western Union will steal 30% in fees to send your money back home. If you’re allowed to send money back to where home is. But with Bitcoin Cash or Litecoin, you’ll pay less than a penny in fees and your funds will arrive in a near instant. That is powerful.

Imagine you’re a dissident or journalist whose bank accounts are frozen in an attempt to shut you up. You can’t use your debit card anymore, but you can use Bitcoin Cash, Ethereum, and other currencies anywhere, any time. It costs you less in fees than banking does, and donations to your cause can never be censored.

Imagine you live in a part of the world where the nearest bank branch is 100 or more miles away, but you have access to a cell phone with a data connection. This happens in many parts of the world. The student I spoke with is a native of Kenya, where M-Pesa is a completely digital version of cash in common use. Cryptocurrencies can easily become a totally free medium of exchange in those places, allowing the unbanked to hold a secure bank account in their pocket.

We, the rich and privileged, can benefit from crypto adoption. But we cannot allow the properties that make these currencies so powerful to be eroded for those that truly need them.