What’s in Your Wallet? Understanding Private Key Control

Overview

Just like your cash, cards, and ID, your cryptocurrency assets live in something called a “wallet”. Almost all forms of digital money implement this concept in some form, and understanding wallets is critical to safely storing and using your favorite digital currency.

Much like your physical wallet, your Bitcoin, Monero, or Ethereum wallet gives you direct access to the funds inside. A crypto-wallet isn’t like a credit card – if a stranger gets a hold of it, you can’t cancel it. Much like cash, the money stolen would be theirs!

But how does this work? How can a digital asset act like cash when all other forms of digital monetary transactions (credit cards, bank transfers, PayPal) can be “cancelled” if stolen? We must first understand a bit about what a “private key” is, and why who controls it is so important to the security of your cryptocurrency funds.

A word on private keys

Without getting too far into the technical details, let’s discuss a bit about what a “private key” is and why it is so important. Remember how I said that your crypto-wallet is like digital cash, and your wallet “stores” your Bitcoin or other currency? Well, that’s not quite how that works…

In reality, the amount of Bitcoin that you own is stored on a worldwide, completely public ledger/database called the “blockchain”. This ledger stores a public record of all of the Bitcoin transfers ever conducted, so anyone can see exactly which address holds what. Sound scary and insecure? How can you control your digital cash if everyone has access to this open blockchain?

This is where private keys come in. Bitcoin (and other cryptocurrencies) use a form of cryptography called “elliptic curve cryptography” to generate the Bitcoin “addresses” people can use to send you money. The address is completely public; you can give it to anyone and they can send you funds. However, behind this address is a special “private key” used to access those funds on the blockchain. A public key is computed from this randomly generated private key using elliptic curve cryptography, and your address is in turn derived from that public key.

The cryptography used in address generation makes it so that you can’t figure out the private key by going backwards from the public key, or Bitcoin address. However, the private key lets you prove that you own the address (by digitally signing your transactions) without ever revealing the key itself, thanks to the magic of elliptic curve cryptography. It is critical that the private key is always kept secret, because anyone with the private key can access the Bitcoin at the associated address.
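To make that one-way relationship a bit more concrete, here is a minimal, unoptimized sketch of how a public key can be computed from a private key on secp256k1, the curve Bitcoin uses. The function names (point_add, scalar_mult, public_key) are my own for illustration. This is for learning only: it is not constant-time, and real wallets use audited libraries and then hash and encode the public key to form the address, which is not shown here.

```python
import secrets

# secp256k1 parameters: the curve is y^2 = x^3 + 7 over the integers mod P
P = 2**256 - 2**32 - 977
# N is the number of points in the group generated by G
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a point plus its mirror image is infinity
    if a == b:
        slope = 3 * x1 * x1 * pow((2 * y1) % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, point):
    """Double-and-add: compute k * point."""
    result = None
    addend = point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def public_key(private_key):
    """The public key is the generator point G added to itself private_key times."""
    return scalar_mult(private_key, G)


# A private key is just a random number between 1 and N - 1
private_key = secrets.randbelow(N - 1) + 1
pub = public_key(private_key)
print(hex(pub[0]))
print(hex(pub[1]))
```

Going forwards (private key to public key) takes a few hundred point additions. Going backwards means solving the elliptic curve discrete logarithm problem, for which no practical method is known at this key size.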

Levels of Private Key control in Wallets

Now that we understand the basics of private keys and their importance, we can talk a bit about how different wallets keep these keys safe from the prying eyes of crypto-thieves. All wallets must work in some way that keeps the private keys, well, private so that control of the funds lies with their rightful owner. There are three general approaches to private key control in wallets: a full control model, a hybrid model, and a custodial model.

Full control wallets

Full control wallets offer the obvious – complete and total control of the private keys. With a full control model, the private keys are generated and stored on the user’s device, be it a desktop computer, mobile phone, or even a hardware wallet like the Trezor. With this model, the private keys never leave the user’s device in any shape or form.

The advantage to this model should be fairly obvious – it is by far the most secure model. There is no trust involved with a third party; the funds are completely controlled by you. Users should still exercise care around other security parameters (ensuring a virus-free machine, for example), but generally these wallets offer the most hardened approach to keeping private keys safe.

The disadvantage here is the lack of convenience and ease of use. These wallets require the most technical savvy of these three models, although most “power users” will have no problem understanding and securing these wallets. It is extremely important that the users of these wallets understand how to back up their private keys. If the device is fried or lost and there is no accessible backup, all funds will be lost! Fortunately, BIP39 mnemonic backups make this task easier than it was with the first few Bitcoin wallets.

Hybrid wallets

Some major players in the crypto space have created an interesting hybrid model for private key storage. Web wallets like those at blockchain.info or btc.com implement this model. With hybrid wallets, private keys are generated and then encrypted on the user’s machine (usually in the web browser) before being stored on the company’s server. With these wallets, private keys are only known and accessible to the user, while the company keeps an encrypted backup safe on their servers.

With this model, security is still pretty strong. Because strong encryption is done on the user’s machine, no one with access to the company’s servers has access to the actual private keys without the decryption passphrase, which lies safely with the user. This model requires that the user trust that the company’s code (which is preferably open source) is soundly implemented and doesn’t contain secret backdoors. However, if the encryption is done right, no one but the user can actually access the keys. This model is slightly less secure than a full control model.

Although there is a small amount of security tradeoff here, this model comes with increased convenience to the user. Most web wallets have a more traditional username and password login interface, so the user only needs to create and remember a secure passphrase to access their funds, with the site taking care of backups for them. This may be easier for a beginner crypto-enthusiast, and any good site will still offer mnemonic backups and private key exports for the savvy user.

Custodial wallets

The final model we’ll discuss here is the custodial wallet. Many exchanges like Coinbase offer custodial wallets. Like a hybrid wallet, all you need to do is create a username and password and log into a website to access funds. However, the critical difference is that with a custodial wallet, the user doesn’t know their private keys at all! With custodial wallets, the website takes care of generating and storing all the private keys without revealing them to the user. No backups to manage, and no need to understand how to do much more than log in to a website to use this kind of wallet.

The security pitfalls of this model are pretty serious, in my opinion. With these wallets, the user has no way to back up their private keys. What’s more, the user must completely trust the company or individual implementing this kind of wallet. These companies must have significant security measures in place to avoid attacks on their servers, and they must be trusted to limit the access rogue employees could have to users’ money.

In fact, these types of wallets completely break a fundamental security principle of Bitcoin – the user controls their keys, therefore the user controls their money. Custodial wallets far more closely resemble the centralized model of traditional banks.

Don’t worry though – these wallets aren’t all scary! The cost in security comes with a large benefit – ease of use! These wallets have a much smaller learning curve for complete beginners. Sign up for a wallet account just like you would for a forum, email address, or social media account. For someone with little understanding of the world of cryptocurrencies, this type of wallet offers a gentle introduction.

My only advice would be that given the security pitfalls of this model, only use custodial wallets to store small amounts or for buying and selling. Most custodial wallets live in currency exchanges, so use them to buy your crypto of choice and send the funds to a more secure wallet.

Know Your Keys

The most important takeaway from this discussion of private key control models is that it is important for a wallet user to know where their private keys are. Again, in Bitcoin and other cryptos, control over your private keys is control of your money. Anyone with access to the keys has access to your money and can spend it freely, so either keep them to yourself or make sure the holder is a wallet maker you trust.

By understanding these different models, users have more control over how they choose to secure their wallets and keep their funds safe. Understanding the pros and cons of full control, hybrid, and custodial wallets allows a cryptocurrency user to choose the best wallet for their needs. Ultimately, an understanding of these models allows for better security and comfort with digital money, because a person knows who truly owns their keys and is responsible for keeping them safe.

Playing With Blocks: The Basics of Blockchain Databases (Part 2 – Blockchain for Techies)

Overview

In our non-technical overview of blockchain, we discussed what a blockchain database is – a distributed, cryptographically secured, and immutable data structure used in applications like digital currencies. We discussed how blocks are cryptographically linked together to ensure that old records can not be changed, and why these types of databases are useful for applications where the immutability of data is key.

But how does this work technically? Let’s take a deeper look at how blockchains are secured by the application of hashing and proof-of-work.

It’s all in the block’s head

The block header

The key to understanding blockchain lies in a data structure included in every block of every blockchain. This data structure is called the block header, and it contains several critical bits of information needed to secure the chain as it grows.

Let’s look at the Bitcoin block header as an example. Each Bitcoin block contains an 80 byte header with the following information:

  • Version – the software version of the Bitcoin protocol
  • Timestamp – expressed in seconds since the Unix Epoch
  • Merkle Root – for our purposes here, we’ll say this is a “fingerprint” of all the transactions in this block
  • Difficulty target – A compact 32 bit encoding of the 256 bit target used in calculating proof of work
  • Nonce – The value added to the block header to demonstrate proof of work
  • Previous block hash – The double SHA-256 hash of the previous block header

All of this data is important and serves a purpose in the Bitcoin protocol. However, the previous block hash deserves special attention because it is particularly important when discussing how the blockchain is secured!
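As a rough illustration of how those six fields add up to 80 bytes, here is a small sketch that packs a header and hashes it. The helper names (serialize_header, block_hash) are mine, and the field values are those of Bitcoin’s genesis block as I understand them. Note that the on-the-wire field order (version, previous block hash, Merkle root, time, target, nonce) differs from the order of the list above, and that hashes are conventionally displayed byte-reversed.

```python
import hashlib
import struct


def serialize_header(version, prev_hash_hex, merkle_root_hex, timestamp, bits, nonce):
    """Pack the six header fields into an 80 byte, little-endian structure."""
    return (
        struct.pack("<I", version)                # 4 bytes
        + bytes.fromhex(prev_hash_hex)[::-1]      # 32 bytes, stored byte-reversed
        + bytes.fromhex(merkle_root_hex)[::-1]    # 32 bytes, stored byte-reversed
        + struct.pack("<III", timestamp, bits, nonce)  # 3 x 4 bytes
    )


def block_hash(header):
    """Hash the header twice with SHA-256 and show it in the usual reversed hex form."""
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    return digest[::-1].hex()


# Field values of the Bitcoin genesis block (block 0)
genesis = serialize_header(
    version=1,
    prev_hash_hex="00" * 32,  # there is no previous block
    merkle_root_hex="4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b",
    timestamp=1231006505,
    bits=0x1D00FFFF,          # compact encoding of the target
    nonce=2083236893,
)

print(len(genesis))
print(block_hash(genesis))
```

The run of leading zeros in the printed hash is the visible result of the proof of work discussed in the sections that follow.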

A quick review of hashes

Before we discuss the critical role that the previous block hash section of the block header plays in securing the blockchain, let’s step back and recall what a hash function does. When “hashing” data, a special algorithm called a “hash function” takes the data and outputs a unique “fingerprint” of the input data. These functions (at least if they are implemented properly) have two very important properties.

First, a particular chunk of data always produces the same hash (or “fingerprint”) every time it is run through the function. If you run “Hello” through the SHA-256 hashing algorithm, the result will always be 185f8db32271fe25f561a6fc938b2e264306ec304eda518007d1764826381969.

Second, good hash functions avoid collisions, where different data results in the same hash output. With a proven algorithm such as SHA-256, it is practically impossible to find two different inputs that produce the same hash. Even if a single bit of input changes, the hash is radically different. “Hello” will produce a very different output than “hello”, even though only a single bit of the input is changed.
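Both properties are easy to try for yourself. Here is a short sketch using Python’s standard hashlib module; the helper names (sha256_hex, differing_bits) are just mine for this example.

```python
import hashlib


def sha256_hex(text):
    """Return the SHA-256 fingerprint of a string as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_bits(hex_a, hex_b):
    """Count how many of the output bits differ between two hashes."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


upper = sha256_hex("Hello")
lower = sha256_hex("hello")

print(upper)
print(lower)
print(sha256_hex("Hello") == upper)  # same input, same fingerprint
print(differing_bits(upper, lower), "of 256 output bits differ")
```

For a good hash function you would expect roughly half of the 256 output bits to flip, even though “H” and “h” differ by a single input bit.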

Hashes inside hashes inside hashes

Understanding these properties of hashes, we can better understand the interesting approach that blockchains take to securing the integrity of data in previous blocks when combined with proof of work.

Each time a block is generated, proof of work is generated in the form of the nonce included in the block header. This is computationally intensive and essentially proves that a miner spent a good bit of CPU power to find the answer to this cryptographic puzzle.

Now, when the nonce is included in the block header with the other data including the previous block hash, we can hash the entire block header to generate the unique fingerprint. This “block hash” is unique, and changing any data whatsoever in the block header would create a radically different block hash.

Okay, so each block has an associated hash. What’s the big deal? How does that help secure the blockchain? The magic of blockchain lies in that critical piece of data known as the previous block hash. Recall that changing any bit of data in the input radically changes the output of a hash function. So what would happen if we tried to change a transaction 2 blocks back in our blockchain?

If a node tries to broadcast a fake blockchain to the network with a fake transaction 2 blocks back, the block hashes for each subsequent block would be radically different.

Let’s look at an example. Let’s say we have a really simple blockchain with some transaction data like so (Note – these hashes are made up for demonstration purposes):

Block 2:
Time - 3000
Difficulty - 1000000000000000000000000000000000000000000000000000000000
Nonce - 5345245
Prev block hash - 33a0b89fcce723e9f41f5d756ab1c20584afbe6dfa9ea18838ff3caf0915b5f5
Transaction: Bob pays Alice 6 units

Block 1:
Time - 2000
Difficulty - 1000000000000000000000000000000000000000000000000000000000
Nonce - 2356343
Prev block hash - f4ebb8b56f590188f5824276af552cd51a48ba774e3ad1350c2800b116d8f6f5
Transaction: Alice pays Bob 5 units

Block 0:
Time - 1000
Difficulty - 1000000000000000000000000000000000000000000000000000000000
Nonce - 1232341234
Prev block hash - 0000000000000000000000000000000000000000000000000000000000000000
Transaction: Alice pays Bob 1 unit

Now let’s say Bob gets greedy and tries to say that Alice paid him 10 units in the first transaction:

Block 2:
Time - 3000
Difficulty - 1000000000000000000000000000000000000000000000000000000000
Nonce - 9987983
Prev block hash - 9ae343e333cbb96427eb333bb8c443359e3cf926c9de9845ceb583577b945afb
Transaction: Bob pays Alice 6 units

Block 1:
Time - 2000
Difficulty - 1000000000000000000000000000000000000000000000000000000000
Nonce - 390970
Prev block hash - 3f82f4cfe059b5a69a0fd5b4d34774af5ecdc672d988320d5fd186998969a645
Transaction: Alice pays Bob 5 units

Block 0:
Time - 1000
Difficulty - 1000000000000000000000000000000000000000000000000000000000
Nonce - 235235
Prev block hash - 0000000000000000000000000000000000000000000000000000000000000000
Transaction: Alice pays Bob 10 units

Notice how different the hashes are for blocks 1 and 2 in Bob’s fake blockchain. If these hashes are to be considered valid by a node in this currency’s network, then each block must also demonstrate proof of work. Since the data has changed in a block, a new nonce must be found to show that work was done.

Here is the most critical part – since the hash for block 0 has changed and is included in block 1, proof of work has to be re-done for block 1. And since block 1’s hash is included in block 2, proof of work has to be re-done for block 2. In other words, to fake a transaction 2 blocks back, Bob has to re-do proof of work for 3 whole blocks! In the meantime, legitimate nodes only have to find a solution for the current block. It is clearly impractical, if not impossible, to “fake” a blockchain more than one or two blocks old, because proof of work has to be redone for many blocks in the time the legitimate network only has to prove one.

Faking a blockchain takes too much work!

Due to the interesting combination of hashing and proof-of-work algorithms, it is incredibly difficult if not impossible to fake history in a blockchain database. Because each block contains the hash of the previous block, changing history several blocks back means that every block from that point forward must be forged. While legitimate nodes only have to prove work for one block in that span of time, an attacker would have to redo the work for many. Unless a malicious party has some amount of computing power the rest of us don’t know about, it’s nearly impossible to do so.

For the extra curious, Satoshi covers the math behind this concept extensively in section 11 of the Bitcoin whitepaper. While I don’t claim to understand this math very well myself, the paper does a good job of explaining its conclusion that forging blockchain history is a fool’s errand.
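Bob’s problem can be reproduced with a toy chain. The sketch below is a heavy simplification of my own, not Bitcoin’s actual format: each “header” is just a text string, the hash is a single SHA-256, and the difficulty (TARGET) is set very low so that it runs in well under a second.

```python
import hashlib

TARGET = 2**256 >> 12  # toy difficulty: about 4,096 guesses per block on average
GENESIS_PREV = "0" * 64


def header_hash(prev_hash, data, nonce):
    """Hash a simplified header made of the previous hash, the data, and the nonce."""
    payload = f"{prev_hash}|{data}|{nonce}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def mine(prev_hash, data):
    """Try nonces until the header hash, read as a number, falls below TARGET."""
    nonce = 0
    while int(header_hash(prev_hash, data, nonce), 16) >= TARGET:
        nonce += 1
    return nonce


def build_chain(transactions):
    chain, prev_hash = [], GENESIS_PREV
    for data in transactions:
        nonce = mine(prev_hash, data)
        chain.append({"prev_hash": prev_hash, "data": data, "nonce": nonce})
        prev_hash = header_hash(prev_hash, data, nonce)
    return chain


def is_valid(chain):
    """Check that every block links to its parent and carries valid proof of work."""
    prev_hash = GENESIS_PREV
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        current = header_hash(block["prev_hash"], block["data"], block["nonce"])
        if int(current, 16) >= TARGET:
            return False
        prev_hash = current
    return True


honest = build_chain([
    "Alice pays Bob 1 unit",
    "Alice pays Bob 5 units",
    "Bob pays Alice 6 units",
])
print(is_valid(honest))

# Bob edits block 0 but keeps the old nonces and links
forged = [dict(block) for block in honest]
forged[0]["data"] = "Alice pays Bob 10 units"
print(is_valid(forged))
```

The only way for Bob to make the forged chain pass is_valid is to call build_chain again on his altered transactions, which means mining all three blocks from scratch.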

Playing With Blocks: The Basics of Blockchain Databases (Part 1 – Blockchain for Everyone)

Overview

Blockchain is the latest and greatest buzzword in the information technology world. From open source, decentralized cryptocurrencies like Bitcoin to traditional financial institutions, it seems as though everyone is dying to create and release their own blockchain based applications. But what is blockchain? Why is it such a popular concept, and what is it actually good for? Let’s discuss.

What and why: Blockchain simplified

What is blockchain?

So you’ve heard that blockchain is going to revolutionize everything, but what is it exactly? Let’s cut through the hype and discuss the technical foundations of Blockchain.

A blockchain is a distributed, cryptographically secured database that focuses on making historical data immutable.

In a traditional database, information is often stored on one or a few machines, controlled by a central authority. Access is controlled by this authority (think IT administrator) and the data is kept secure by granting credentials to modify that data to a select few trusted parties. By contrast, a blockchain database is governed by what is called distributed consensus, using mechanisms such as proof-of-work. For more information on proof-of-work, you can read my series of articles on it. The important thing to note is that (in general) no one central person or authority decides what data is “verified” in a blockchain; a community of network nodes and software does.

If anyone can modify the data in a blockchain rather than a trusted party, then how is this consensus on what is correct achieved? Again, the secret lies in the science of cryptography. Through a mechanism like proof-of-work, a cryptographic puzzle is solved by software with some incentive to do so. In Bitcoin, the node that solves this puzzle is granted new currency. The real magic, however, is the fact that any other node in the network can verify that this answer is correct in a split second, so anyone can independently verify that a block meets the cryptographic standards set by that blockchain’s protocol.

You may be wondering how the cryptography in each block keeps the overall blockchain secure. This is the question of immutability, or how easy it is to modify the history stored in the blockchain. Blockchains solve this by cryptographically “linking” each block to the previous block, thereby making each individual block a critical part of the history stored by that “chain”. Each block has a header full of useful metadata about that block – a timestamp, a “summary” of the included data or transactions, a difficulty target and nonce for mining (part of proof-of-work), and the hash of the previous block’s header. Each block header is run through a one-way, cryptographically secure function called a “hash function” that creates a unique digital fingerprint for the data.

Immutability is achieved when combining the proof-of-work consensus mechanism with this system of chaining each block together. In order to create each block, the cryptographic puzzle solved by the proof-of-work algorithm allows a unique block header hash to be generated. It is computationally difficult to get this value, but very easy to verify it is correct. Now, it’s not that hard to re-solve that hard problem in a matter of minutes…it would be easy to create a fake block at the top of the chain. But what about 10 blocks back? Well, since each block contains a hash of the previous block header that is generated by solving this hard problem, you would have to now fake history for ten whole blocks! It is exponentially more difficult to do so the further back in the chain that you go. Unless you can truly do the work required to fake history in a blockchain, any independent network node could easily see that the rest of your history on forward is invalid. The immense difficulty of “faking” history in a blockchain gives it the most important property it has, its immutability.

Cool, so why is it useful then?

By far the most important aspect of blockchain, in my opinion, is its ability to decentralize applications. With a traditional database, a central authority has to be trusted, which can be a disadvantage in applications that are controversial or have high incentives for fraud. For example, previous attempts at digital money like DigiCash had central services for issuing currency and validating transactions. A central service like this is a single point of failure: DigiCash itself went bankrupt, and other centralized digital currencies were shut down by governments that didn’t like independent currencies very much.

With blockchain, it is possible to have things like completely peer-to-peer money as with Bitcoin, Litecoin, and countless others because no central government or individual has to be trusted! The network is secured by math (cryptography) rather than trust thanks to the blockchain. You don’t have to trust anyone to not defraud you of your money, because the math cannot lie about who owns what.

The other critical function of blockchains beyond decentralization is the preservation of history. Because blockchains are immutable, they can be useful for keeping things like medical records, property transactions, court histories, and more secure from the kind of malicious tampering a traditional database is vulnerable to. This does rely on some degree of decentralization, but even within a single company a blockchain is far harder to tamper with than a traditional database.

Cool, now I want a blockchain!

Blockchains are a fascinating and novel way to handle problems with traditional databases in certain applications. Thanks to the decentralized and cryptographically secure nature of these databases, it’s possible to create peer-to-peer applications that don’t require trusting a third party – a key problem to solve for concepts like digital money. As well, their immutability makes them useful even beyond the first few money-centric applications that existed – they may be coming to a real-estate authority, doctor’s office, or justice system near you!

BIP39 Mnemonics Made Easy (Part 2 – The Tech of Bits to Backups)

Overview

In the last article, we discussed a high level overview of BIP39 mnemonics and their value as a simplified backup tool. Mnemonics make it much easier to take a single seed, back it up, and ensure access to an entire wallet of private keys, addresses, and transactions. But how do we go from a random set of bits to a list of words? Let’s discuss the technical side of BIP39.

Bits to Backups – The Steps for Generating a Mnemonic

First, Chaos

In order to generate a good seed, a fair amount of entropy or “randomness” is desirable. Good random number generators are hard to get, but modern operating systems like Linux do a pretty good job of sourcing entropy from the user and hard drives, and something like /dev/urandom on a daily driver machine should be sufficiently secure for generating the entropy we need.

Now how many random bits do we need? The BIP39 standard specifies 128-256 bits of entropy (in multiples of 32 bits) to be used for generating the seed. This will correspond to 12-24 words later on when we “map” the entropy to the words.
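The arithmetic behind the 12-24 word range is simple enough to sketch. It relies on two details covered in the sections that follow: a checksum of one bit per 32 bits of entropy, and 11 bits per word. The function name mnemonic_sizes is my own.

```python
def mnemonic_sizes(entropy_bits):
    """Return (checksum bits, word count) for a given BIP39 entropy size."""
    if entropy_bits % 32 != 0 or not 128 <= entropy_bits <= 256:
        raise ValueError("entropy must be 128-256 bits, in multiples of 32")
    checksum_bits = entropy_bits // 32
    words = (entropy_bits + checksum_bits) // 11
    return checksum_bits, words


for entropy_bits in range(128, 257, 32):
    checksum_bits, words = mnemonic_sizes(entropy_bits)
    print(f"{entropy_bits} entropy bits + {checksum_bits} checksum bits -> {words} words")
```

So 128 bits of entropy gives the common 12 word phrase, and 256 bits gives a 24 word phrase.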

First, a warning: DO NOT USE any of the examples in this article to generate a wallet – your funds will be stolen!

With that out of the way, let’s look at an example. First, let’s generate 128 bits of entropy using os.urandom() in Python. Represented as binary, our entropy looks like this:

10111110011001010101110111001111010100011111011010110001110101111011110111000101101001100011110100010100011101000011011011100000
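A string like the one above can be produced along these lines. This is a minimal sketch, and generate_entropy_bits is just my name for the helper; each run will of course print different bits.

```python
import os


def generate_entropy_bits(num_bits=128):
    """Read random bytes from the OS and render them as a string of 0s and 1s."""
    if num_bits % 8 != 0:
        raise ValueError("num_bits must be a multiple of 8")
    raw = os.urandom(num_bits // 8)
    return "".join(f"{byte:08b}" for byte in raw)


bits = generate_entropy_bits(128)
print(bits)
print(len(bits))
```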

Next, a checksum

In order to better secure the seed, we’ll add a checksum to the end of the entropy. This makes it easier for wallet software to validate a backup seed.

To get the checksum, we’ll first take the SHA-256 hash of our entropy. Then, we take the first N/32 bits of the hash, where N is the length of the entropy in bits, and append them to the entropy.

In our case, 128/32 bits gives us a 4 bit checksum size. In our example, the 4 bit checksum will be 0101. We’ll append that to the entropy to give us a 132 bit value:

101111100110010101011101110011110101000111110110101100011101011110111101110001011010011000111101000101000111010000110110111000000101
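Here is a small sketch of the checksum step, assuming the entropy is held as raw bytes. The helper names (to_bits, checksum_bits, with_checksum) are mine. To avoid encouraging anyone to reuse the example entropy, the demo input is 16 zero bytes, a widely published test input that must never hold real funds.

```python
import hashlib


def to_bits(data):
    """Render bytes as a string of 0s and 1s."""
    return "".join(f"{byte:08b}" for byte in data)


def checksum_bits(entropy):
    """Return the first N/32 bits of SHA-256(entropy), where N is the entropy length in bits."""
    checksum_length = (len(entropy) * 8) // 32
    return to_bits(hashlib.sha256(entropy).digest())[:checksum_length]


def with_checksum(entropy):
    """Append the checksum bits to the entropy bits."""
    return to_bits(entropy) + checksum_bits(entropy)


example = bytes(16)  # 128 zero bits: public test input, never use it for a wallet
print(checksum_bits(example))       # 0011 for this input
print(len(with_checksum(example)))  # 132
```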

Dividing and our Dictionary

The final step of the process involves dividing our checksummed bits into “chunks” and mapping those chunks to the mnemonic words from the dictionary. The BIP39 standard specifies that the chunks will always be 11 bits long. So, we divide our 132 bit checksummed entropy into 12 chunks of 11 bits each (the first three are shown here):

  1. 10111110011
  2. 00101010111
  3. 01110011110

Now, each of these 11 bit chunks can be interpreted as an unsigned 11 bit integer value ranging from 0-2047. This “maps” to a word from the dictionary of 2048 words directly! These are standardized and listed in alphabetical order. So, we can take the 11 bit chunk as an index in the dictionary to extract the words we need:

  1. 10111110011 = 1523 -> salmon
  2. 00101010111 = 343 -> cliff
  3. 01110011110 = 926 -> inherit
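The chunking and index lookup can be sketched like this. The function names are mine, and the word list itself is left out: words_from_indices assumes you pass in the 2048 standard words as a Python list, in their published order.

```python
def chunk_indices(bits, chunk_size=11):
    """Split a bit string into 11 bit chunks and read each one as an integer."""
    if len(bits) % chunk_size != 0:
        raise ValueError("bit string length must be a multiple of the chunk size")
    return [int(bits[i:i + chunk_size], 2) for i in range(0, len(bits), chunk_size)]


def words_from_indices(indices, wordlist):
    """Look up each index in a word list (assumed to be the 2048 BIP39 words, in order)."""
    return [wordlist[index] for index in indices]


# The first three chunks from the example above, joined back together
first_three = chunk_indices("10111110011" "00101010111" "01110011110")
print(first_three)  # [1523, 343, 926]
```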

The overall mnemonic we generate in this example turns out to be:

  1. salmon
  2. cliff
  3. inherit
  4. physical
  5. help
  6. type
  7. warfare
  8. regular
  9. dial
  10. photo
  11. asset
  12. scheme

Mnemonics – from Entropy to Dictionary Entries

The process of generating a mnemonic seed is both ingenious and straightforward. One can easily create a secure wallet seed of 12-24 words by generating some entropy, checksumming the data, and mapping to a standard dictionary.

I’ve written a project called MnemonicGen that generates mnemonics using these steps. Take a look at this project to see these steps implemented in Python. This code should be considered academic/experimental – use it to create wallets at your own risk. Other proven implementations such as Ian Coleman’s BIP39 tool are also available to study.

Happy generating!

BIP39 Mnemonics Made Easy (Part 1 – Backups, Simplified!)

Overview

A critical component of cryptocurrency security is the ability of users to easily and efficiently back up the private keys that control access to their funds on the blockchain. Without one’s private keys, any funds in a user’s addresses are irrecoverably lost.

However, the nature of early wallets made backing up one’s private keys a regularly-scheduled necessity, unwieldy and annoying for most users. BIP39 and associated Bitcoin Improvement Proposals have thankfully simplified private key backup by introducing HD wallets and mnemonics.

Newer wallets explained

HD? So like, High Definition? No, Hierarchical Deterministic!

In the early days of Bitcoin and other cryptocurrencies, address generation was done non-deterministically. For each new address needed, a private key would be randomly generated and stored in the wallet’s backup file. Most wallets would pre-generate some addresses in the initial wallet file, but every new private key/address introduced into the wallet meant a new backup would be needed.

For privacy reasons, it is recommended to use a new address for each transaction. And for every new address generated since the last backup, a user would need to create a new backup to avoid losing recent funds in the event of a wallet restoration. Even for “power users”, backups became an unwieldy and annoying task.

Enter BIPs (Bitcoin Improvement Proposals) 32 and 44. In summary, these proposals introduce HD (“Hierarchical Deterministic”) wallets. These wallets only require one seed to be randomly generated. From that seed, all the private keys and addresses a wallet needs can be derived in a tree structure, all associated with the initial seed. Since the private keys and addresses can be regenerated from the seed, one only has to back up the seed to recover all of their private keys, addresses, and transactions for a wallet. Much better!
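To give a feel for what “deterministic” means here, below is a simplified sketch based on my reading of BIP32: a master key and chain code are derived from the seed with HMAC-SHA512, and “hardened” child keys are then derived from the parent. The function names are mine, the rare invalid-key edge cases the standard describes are skipped, and non-hardened derivation and the BIP44 path layout are not shown, so treat this as an illustration rather than a wallet implementation.

```python
import hashlib
import hmac

# Order of the secp256k1 curve group; private keys are numbers below this value
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
HARDENED_OFFSET = 0x80000000


def master_key(seed):
    """Derive the master private key and chain code from a seed."""
    digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
    return int.from_bytes(digest[:32], "big"), digest[32:]


def hardened_child(parent_key, parent_chain_code, index):
    """Derive hardened child number `index` from a parent private key."""
    data = (
        b"\x00"
        + parent_key.to_bytes(32, "big")
        + (index + HARDENED_OFFSET).to_bytes(4, "big")
    )
    digest = hmac.new(parent_chain_code, data, hashlib.sha512).digest()
    child_key = (int.from_bytes(digest[:32], "big") + parent_key) % N
    return child_key, digest[32:]


# A short, publicly known example seed: never use it for a real wallet
seed = bytes.fromhex("000102030405060708090a0b0c0d0e0f")
key, chain_code = master_key(seed)

for index in range(3):
    child_key, _ = hardened_child(key, chain_code, index)
    print(index, hex(child_key))
```

Running this twice prints the same three child keys, which is the whole point: back up the seed once and every key in the tree can be regenerated from it.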

Introducing Mnemonics – Simplifying Seed Backups

The ability to generate an entire wallet from one seed drastically simplified wallet backups, and therefore has improved the ease by which users can keep their funds safe. However, a seed is still just a random binary value. Represented in hex or Base64 encoding, it is still fairly easy to misread/miswrite a character and accidentally create a useless wallet backup.

To truly simplify the task of backing up a wallet seed, some developers in the Bitcoin space proposed a system that allows the translation of the binary seed value into English words that can be more easily transcribed or even memorized to secure access to one’s funds. This proposal, given the designation BIP39, was written by Marek Palatinus, Pavol Rusnak, Aaron Voisine, and Sean Bowe.

What does a mnemonic look like?

Mnemonics don’t use just any set of words. These words are carefully chosen to avoid ambiguity and make transcription easy, so that a user doesn’t accidentally create an incorrect backup.

There are a total of 2048 words in the dictionary, and a wallet mnemonic contains 12-24 words. The last word contains a checksum validating the other words in the list, making it easy for wallets to validate a backup.

Here is a sample Bitcoin or Bitcoin Cash BIP39 mnemonic:

  1. army
  2. van
  3. defense
  4. carry
  5. jealous
  6. true
  7. garbage
  8. claim
  9. echo
  10. media
  11. make
  12. crunch

WARNING, DO NOT use this seed for a wallet. A seed must remain private, and your funds will be stolen! This mnemonic is excerpted from Andreas Antonopoulos’ Mastering Bitcoin to further discourage its use – millions of people have access to this wallet.

Now how do you get one of these fancy mnemonics for a wallet? Most modern wallet software will generate this for you when you create a wallet. Then, all you need to do is write down the phrase and store it in a secure location to back up access to your funds if your wallet device is lost or stolen.

Alternatively, a mnemonic can be generated by a separate tool and imported as a backup into the wallet software. I’ve written a generator called MnemonicGen that produces standard phrases that can be imported into any modern HD wallet that supports BIP39. Keep in mind that this particular project is meant to be academic/experimental and may not be sufficiently secure for your needs. But other mnemonic generators like Ian Coleman’s are widely used and well-vetted.

Mnemonics – Backups Made Better

With BIP39 mnemonics, Bitcoin newbies and power users alike can easily create and back up secure wallets without the need to keep a schedule or deal with unwieldy binary data encoding. This modern standard is implemented in widely-used and accessible wallets like the Bitcoin.com wallet, Electron Cash, Blockchain.info, and more. I would personally advise upgrading your cryptocurrency experience by using an HD wallet, simplifying your security practices and keeping your funds safe!

In the next article, we’ll discuss the technical workings of BIP39, showing how we can go from a random seed to a set of words in a few fairly straightforward steps.

Proof of Work, Explained (Part 2 – A Hash Bash for Techies)

Overview

In the last article, we looked at the overall idea of proof of work and its applications. That article covered the origins of this concept, how it works at a high level, and some of its applications. Now, let’s take a look at the technical inner workings of these algorithms.

In a nutshell, proof of work involves the use of hash functions. These one-way functions form the basis for a difficult-to-solve but easily verifiable computational puzzle, which serves as a way to prove that one did some amount of desired computing work.

Proof of Work, the Technical Perspective

Hash Functions

First, we need to understand a bit about hash functions and why they form the basis for proof of work. A hash function is a one-way function that takes some input of any size and outputs a consistently sized set of bits. The two most important characteristics of hash functions are that they:

  • Are one-way – you cannot take an output and find the input without brute force guessing
  • Are deterministic and collision-resistant – the same input always gives the same output, and two different inputs practically never share one (if the hash function is a good one!)

These two properties are critical for proof of work. First, the one-way nature of these functions means that brute force is required to find some desired output. Second, because a given input always produces the same output, we can easily verify the solution once we have one.
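To make these properties a bit more concrete, here is a minimal sketch using Python's standard hashlib module. The helper name sha256_hex is my own for illustration, and SHA-256 is just one example of a hash function. It shows that the output size is fixed and that hashing the same input twice gives the same result.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


short_digest = sha256_hex("Hello")
long_digest = sha256_hex("Hello" * 10_000)

# Fixed-size output: 256 bits (64 hex characters) regardless of input size.
print(len(short_digest), len(long_digest))

# Deterministic: hashing the same input again gives the same digest.
print(short_digest == sha256_hex("Hello"))

# Changing a single character gives a completely different-looking digest.
print(sha256_hex("Hello0"))
print(sha256_hex("Hello1"))
```

Nothing in the digest hints at the input that produced it, which is why the only practical way to hit a desired output is to keep guessing inputs.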

An Overview of the Algorithm

Hashing and Binary and Difficulty Targets, Oh My!

Proof of work builds on top of the properties of hash functions by realizing that as a stream of bits, hash outputs actually represent binary numbers. For example, an 8-bit hash 00001000 represents the decimal number “8”. Now remember that hash outputs can only be matched to a particular input by using brute force to guess.

Using these properties, proof of work takes a pretty ingenious approach to making a user do some amount of predetermined work – it makes them look for a hash that, interpreted as a number, is less than some target value!

This is where the idea of difficulty comes in. Let’s say you want the user to find some input where the hash value, when representing an 8-bit number, has two zeros in the front (00101010, for example). Now imagine you want the user to find an input that gives a hash with four zeros in front (00001011). Which one takes more guesses to compute? It turns out that the smaller the “difficulty target” value, the more guesses (and more computing time) it takes to find an input that gives the desired hash output. This is the fundamental basis for proof of work: it can be statistically predicted that a given difficulty target will take roughly a certain number of guesses (and therefore computing time) to meet. So the smaller the difficulty target, the harder the puzzle.
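The "statistically predicted" part can be sketched with a little arithmetic. The snippet below assumes an idealized hash whose outputs behave like uniformly random numbers, so the chance that a single guess lands below the target is target / 2^bits and the average number of guesses is one over that. The function names are hypothetical and only for illustration.

```python
from fractions import Fraction


def success_probability(target: int, hash_bits: int) -> Fraction:
    """Chance that one hash, modeled as a uniform hash_bits-bit number, is below target."""
    return Fraction(target, 2 ** hash_bits)


def expected_guesses(target: int, hash_bits: int) -> Fraction:
    """Average number of guesses until the first success (mean of a geometric distribution)."""
    return 1 / success_probability(target, hash_bits)


# 8-bit toy hashes from the text above.
two_zeros = 0b01000000   # every hash with two leading zero bits is below 64
four_zeros = 0b00010000  # every hash with four leading zero bits is below 16

print(expected_guesses(two_zeros, 8))
print(expected_guesses(four_zeros, 8))

# A 256-bit hash with a target of 2^240.
print(expected_guesses(2 ** 240, 256))
```

Under that assumption, the two-zero puzzle averages 4 guesses, the four-zero puzzle averages 16, and a target of 2^240 over 256-bit hashes averages 65,536. Each extra leading zero bit doubles the expected work.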

The nonce value

Now since a given input always hashes to the same output, how can we prevent the worker from just using a precomputed dictionary of hashes to find an input that meets the difficulty? Here is what makes proof of work truly proof – we always create a hash input unique to the problem we’re trying to solve, using an applicable message and a random guess we call a nonce.

See, for our proof of work to truly require the desired amount of work, we always start with a unique message for the problem. In the case of Bitcoin, our message is what is called the “block header” – a chunk of data containing information about the transactions included in the current block. In an anti-spam application, this message would be something like a forum post or the contents of an email message. Since this message is unique, a dictionary cannot be used to guess a hash that meets the difficulty target.

The worker then has to take the message plus a random number guess called the nonce, and run that combined string through the hashing function. If the hash output isn’t less than the target number, then the worker increments that random number guess concatenated to the message and tries again – and again and again until the right output is found.
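The guessing loop described above might look something like the following. This is a rough sketch rather than how any particular system does it. It assumes the nonce is appended to the message as decimal text, that guessing starts at 0, and that SHA-256 is the hash. The names pow_hash and find_nonce and the max_nonce safety limit are my own additions.

```python
import hashlib
from typing import Optional


def pow_hash(message: str, nonce: int) -> int:
    """Hash message + decimal nonce with SHA-256 and read the digest as an integer."""
    digest = hashlib.sha256(f"{message}{nonce}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def find_nonce(message: str, target: int, max_nonce: int = 10_000_000) -> Optional[int]:
    """Try nonces 0, 1, 2, ... and return the first whose hash is below target."""
    for nonce in range(max_nonce):
        if pow_hash(message, nonce) < target:
            return nonce
    return None  # gave up: no nonce below max_nonce met the target


# An easier target than the example later in the article, so the demo runs quickly.
demo_target = 2 ** 244
nonce = find_nonce("Hello", demo_target)
print("nonce found:", nonce)
```

Lowering demo_target makes the loop run longer on average, which is exactly the difficulty knob described earlier.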

Verifying the nonce

Once a nonce is found, another node or server can very easily verify that the solution (the nonce) is correct. All the verifying party has to do is take the original message plus the nonce value found by the worker and run that through the hash function. Since a hash input will always give the same output, it only takes one step to verify that the worker’s nonce is in fact a correct solution to the proof of work problem.
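By contrast, verification is a single hash and a single comparison. Here is a minimal sketch, again assuming the message and decimal nonce are simply concatenated and hashed with SHA-256. The function name verify is hypothetical.

```python
import hashlib


def verify(message: str, nonce: int, target: int) -> bool:
    """Recompute the hash of message + nonce once and compare it to the target."""
    digest = hashlib.sha256(f"{message}{nonce}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big") < target


# Sanity checks at the two extremes: every 256-bit hash is below 2^256,
# and no hash is below 0.
print(verify("Hello", 0, 2 ** 256))
print(verify("Hello", 0, 0))
```

The verifier never has to repeat the search. It does the same amount of work whether the worker needed ten guesses or ten million.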

A Practical Example

Let’s take a look at an example anti-spam proof of work problem. Let’s say a user wants to contact a site owner with the message “Hello”, and the site owner wants the user to do some proof of work before sending that email. The site owner specifies a difficulty target of 2^240. This difficulty target can be any 256-bit number in this case (SHA-256 produces 256-bit outputs), but a power of two is easy to work with when building an application. This system uses the compute-intensive SHA-256 hashing algorithm for its proof of work. Here’s what the steps would look like:

Worker (Client)

  1. Uses “Hello” as the message
  2. Starts guessing a nonce with 0 – the hash input is the string “Hello0”
  3. The SHA-256 output of “Hello0” (in hexadecimal format) is 80878c5b013ba72c0d2b7e8f65868649cbdb1e7e7a8c8a07537d6b3619e4e32f
  4. Clearly, this output is greater than the difficulty target of 2^240, which has three leading 0’s in hexadecimal: 0001000000000000000000000000000000000000000000000000000000000000
  5. Increment nonce to 1, and try again. This continues until an appropriate nonce is found
  6. The client finally finds a nonce that works with the value 9172. The SHA-256 hash of “Hello9172” is 00001f2e9f8f74117b4178eb04b368c807f906ae2a07bece562266cbc9adff3c, which is less than the difficulty target of 0001000000000000000000000000000000000000000000000000000000000000 (2^240)
  7. Since the client has a nonce guess that meets the difficulty target for this unique message, it now has proof that it did all that computing work!

Verifying Party (Server)

  1. Take the message for this problem, “Hello”, plus the client’s found nonce, “9172” and pass “Hello9172” through the SHA-256 hash function
  2. Since hash functions always produce the same output for a given input, we get the same output the client found: 00001f2e9f8f74117b4178eb04b368c807f906ae2a07bece562266cbc9adff3c.
  3. Since the above output is indeed less than the difficulty target 2^240, the server has now verified that the client did the desired amount of computing work to find the nonce. The message can now be sent.
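The comparisons in the steps above can be double-checked by treating the hexadecimal strings as numbers. This small sketch takes the two digests quoted in the example at face value and copies them in as literals rather than recomputing them. The constant and function names are my own.

```python
TARGET = 2 ** 240

# Write the target as 64 hex digits (256 bits), padded with leading zeros.
target_hex = format(TARGET, "064x")
print(target_hex)

# Digests as quoted in the example above (copied from the text, not recomputed here).
FIRST_TRY = "80878c5b013ba72c0d2b7e8f65868649cbdb1e7e7a8c8a07537d6b3619e4e32f"
WINNER = "00001f2e9f8f74117b4178eb04b368c807f906ae2a07bece562266cbc9adff3c"


def meets_target(digest_hex: str, target: int) -> bool:
    """Interpret a hex digest as an integer and compare it to the target."""
    return int(digest_hex, 16) < target


print(meets_target(FIRST_TRY, TARGET))  # digest starts with 8, far above the target
print(meets_target(WINNER, TARGET))     # four leading zero digits, below the target
```

Because 2^240 is a 1 followed by 60 hex zeros, any digest that starts with at least four zero hex digits is below it, which is why the check can also be done by eye.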

Proof of Work – Hashing for a Cause

These algorithms put the properties of hashing algorithms to new and innovative uses, particularly in the incredible space of cryptocurrencies. Proof of work takes the deterministic and irreversible properties of hash functions and uses them to create difficult to solve, easy to verify computing problems. This simple but interesting bit of math and computer science powers new approaches to interesting challenges. Proof of work can be used to help prevent spam in a new and unique way – by making large-volume spam uneconomical for its propagators. Arguably at its most revolutionary, proof of work powers the transaction verification and currency issuance components of cryptocurrencies like Bitcoin and Litecoin, allowing for an entirely new form of money free from centralized institutions.

Proof of Work, Explained (Part 1 – POW for Non-Techies)

Overview

Personally, I’m fascinated by both the technical and financial implications of cryptocurrencies like Bitcoin, Bitcoin Cash, and Litecoin (to name a few). The way these currencies work is a complex topic, with lots of moving parts to discuss. One of the core components of cryptocurrencies like Bitcoin is the mechanism by which an entirely decentralized system of money can securely verify transactions as well as issue new currency, all while preventing fraud and issuing at a predictable rate.

Most of these currencies solve this problem using a concept called “proof of work” by which nodes solve a computationally difficult but easily verifiable mathematical problem. This concept goes beyond cryptocurrencies as well, and actually originated as an anti-spam measure.

Proof of Work – The 10,000 foot view

What is Proof of Work?

Proof of work, fundamentally, is the solving of a computationally intensive mathematical problem. This problem has two very important properties – the solution to the problem is both:

  • Difficult (computationally intensive) to find
  • Easy to verify once found

The idea is this: for an application like cryptocurrency or anti-spam, a “node” or computer is challenged to find a solution to this puzzle. The solution can only be found by brute-force guessing. However, once the solution is found, all the other nodes on a network or a server can verify the solution in one step. Since the answer can only be found by brute-force computation but can easily be verified as correct, the solution to the problem serves as proof that a certain amount of computing work was done – hence the term “proof of work”.

Why is it Useful?

First, let’s look at the original application of proof of work: anti-spam. The original idea was implemented in a system called HashCash, invented by Adam Back. Back’s system works like so: Before performing an action like posting to a forum or sending an email, the user of a site is made to do a small proof of work problem. This problem only takes half a second or so of computing to solve, and of course is almost instantaneous for the system to verify. For a legitimate user of a forum or email system, the half second of computing is no obstacle to completing his or her task. However, for a spammer trying to send hundreds of thousands of spam messages, the task suddenly becomes very uneconomical since it would tie up their computer for minutes or even hours at a time!

Now how does this system apply to cryptocurrencies like Bitcoin? In this system, transaction verification and currency issuance is totally decentralized – no third party is trusted to create new value tokens or verify that transactions are legitimate. This of course presents a massive fraud-prevention challenge – how can the network ensure that malicious parties don’t create “counterfeit” currency or send through transactions that aren’t valid?

Proof of work helps to solve this problem. On the Bitcoin network, new transactions are broadcast to computers running what is called “mining” software and accumulated into “blocks” of transactions that will be validated at one time. Every time a new block is waiting to be verified, all the nodes on the network running this software essentially “race” to solve a proof of work problem first. The Bitcoin network adjusts the difficulty of this problem so that about once every ten minutes, one miner wins the race and finds a solution to this problem. Once one node finds the answer, it tells all the other nodes on the network that it’s found an answer, and the other nodes can instantly verify that the answer is correct.

The node that finds proof of work for this block is rewarded with brand new Bitcoin (issued at a predictable rate) as well as all the transaction fees in that block. This computationally expensive proof of work problem creates an excellent system of economic incentives – the reward of new Bitcoin drives miners to verify that transactions are correct, and also makes fraud more expensive than legitimate mining. If a miner were to try and cheat, all the other nodes running the legitimate software would instantly reject the new block since it doesn’t meet the rules of the network, and all of the time and computing power of the malicious node would thus be wasted.

Proof of Work – Powering Cryptocurrency and Thwarting Spammers

The idea of proof of work has incredible value for multiple applications. This system allows computing to be used as a precious resource in a purely digital economy; a way to both secure monetary transactions and prevent the waste of resources like time and storage space. In an anti-spam setting, proof of work allows the operators of a curated space to reduce the impact of spam on their systems, reducing wasted time, clutter, and storage space. In a cryptocurrency application, proof of work allows the secure verification of transactions and the issuance of new currency without the need for a trusted third party, the often fatal flaw in fiat systems.

The applications of this technology are incredibly interesting. In the case of cryptocurrencies, I would say its application is part of a system that is revolutionary. Now, as a software engineer, I find the actual technical workings of proof of work to be even more interesting than the surface description. In the next article, I’ll walk through how these algorithms work from a more technical perspective.

Bitcoin as “Digital Gold” is Bad for Crypto Adoption

“Digital Gold” vs. “Digital Cash”

The core Bitcoin network has a scaling problem, and has had this problem for a while now. As more and more transactions try to fit in Bitcoin’s 1MB blocks (once every 10 minutes), network fees have skyrocketed to $5-10, and that’s even a little low if you want your transaction confirmed within an hour or so.

One response to this problem, especially from supporters of the Bitcoin Core roadmap, is to ignore the problem to some extent. As Bitcoin has seen more attention over the last few years, the price has risen dramatically. As a result, many are now claiming that Bitcoin is not meant to be a “means of exchange” or a form of “digital cash” for day to day transactions. Rather, their viewpoint is that Bitcoin should be seen as a “store of value” or “digital gold”.

To be clear about my biases in this space – I think cryptocurrencies are at their most interesting and valuable as a means of exchange; a way to do truly global, peer-to-peer, decentralized cash. I do not care at all for the risky speculative investing that goes on in the crypto space; I believe these currencies should be used or held in small amounts, allowing one to learn more about this fascinating technology and spread adoption.

With that said, I don’t have a problem with a digital currency being used as a long term store of value like a “digital gold”. There is plenty of room in the crypto space for currencies that solve different problems in different ways. I do, however, think there is a big problem with Bitcoin being the currency of choice for that use case.

The Bitcoin Brand, and the problem with Bitcoin as “Digital Gold”

Let’s be honest, fellow crypto nerds. How many people in your daily life have actually heard of Bitcoin? And how many of them actually understand it, at least at a high level? How many people actually own some and use it? If you’ve got the same variety of people in your life as I do, the percentage isn’t that high.

Now how many of that small subset of Bitcoin-aware people in your life know about Bitcoin Cash? Ethereum? Litecoin, Vertcoin, Monero, Dash? It’s an even smaller percentage, surely. Even if they’ve heard of them, do they understand how these alternatives to Bitcoin solve different problems? We’re down to a sliver of people that understand and adopt these different currencies beyond Bitcoin.

Herein lies the crux of the problem:

Bitcoin is the de facto cryptocurrency. It is the face of digital money, the storefront, the brand, however you want to refer to it.

Whether anyone likes it or not, Bitcoin is what most people hear about first when they hear about cryptocurrencies. And with a now large chunk of the Bitcoin community marketing this as “digital gold”, we have the potential to miss out on opportunities for widespread adoption in the coming years.

As more and more businesses become interested in adopting cryptocurrencies as a form of payment, they’re going to want to start with Bitcoin. It is the biggest after all, and the original value proposition we still see on bitcoin.org is “fast peer-to-peer transactions” and “low processing fees”. The reality is far from that, however. When the coffee shop owner realizes his or her customer will have to pay $10 to buy a $3 coffee, they’re not going to find Bitcoin usable for their business. How many of the already small percentage of crypto-curious entrepreneurs are going to take the time to understand Bitcoin Cash, Litecoin, or Dash? Many will probably say “screw this” and return to business as usual with fiat.

Satoshi’s “Peer-to-peer electronic cash” – How Does Bitcoin Continue That Vision?

The problem is not a cryptocurrency as “digital gold”, the problem is Bitcoin as “digital gold”. The original promise of Bitcoin was indeed a global, peer-to-peer, low fee alternative to centralized payment processors like Visa, Mastercard, or PayPal. But as the Bitcoin community shifts away from that vision, they take the adoption of that vision with them.

With a functional implementation of the Lightning Network at least two years away, and the lack of press for already-scaled alternatives like Bitcoin Cash, Litecoin, etc., it concerns me that adoption of cryptocurrencies will stall. I don’t at all fear that they will go away or stop the ceaseless flow of innovation that we’ve seen since the advent of Satoshi’s brilliant whitepaper. But I do worry that Bitcoin’s current scaling problems and the community’s attitude toward them will lead to several years of stalled adoption in the mainstream. I do hope, however, the problem gets fixed soon and that I am wrong.