Don’t Just Hodl, Spedn! – Cool Ways to Use Your Cryptocurrency

Overview

The “hodl” meme in the cryptocurrency world has gotten out of hand. With dreams of lambos abounding, it seems everyone is just sitting around with fat crypto wallets waiting for the next big jump in price.

Now, there is absolutely nothing wrong with savings. I wouldn’t tell you to spend all your fiat either, and it’s always great to have money set aside for the future. But when it comes to cryptocurrencies, we’re doing the community a disservice by focusing too much on the price.

Bitcoin and all of its descendants are meant to be digital cash! We’re in the era of the most fascinating way to exchange value ever created, so let’s not sit on our crypto-assets like they’re a couple of boring old gold bars.

Why Spend Cryptocurrencies?

So why should you spend some of your digital assets? I always go back to the unique properties of cryptocurrencies that make them so interesting in the first place. Currencies like Bitcoin Cash, Litecoin, Ethereum, and (to some extent) the original Bitcoin Core chain are:

  • Secure
    • Crypto transactions are push transactions, so you never have to reveal personal information to a merchant like you do with a credit card.
  • Global and decentralized
    • These networks run worldwide without borders. Purchase goods, donate, and share with anyone anywhere without asking anyone’s permission.
  • Low barrier
    • There are no KYC requirements, no paperwork, no approvals – download the wallet software and you now have a bank in your hands.
  • Low fee
    • With the exception of the Bitcoin Core chain, you can send anyone any amount of money for a penny or less. And your fee isn’t going to a middleman, it’s going to support the network!

The Fun Part – How to Spend Cryptocurrencies

Purchasing Goods and Services

There are tons of merchants that will accept the most popular cryptocurrencies, especially online. I’ve bought several interesting items with various digital currencies – A JavaScript reference book for my shelf, T-shirts that share my love of Bitcoin, and even special apparel for Brazilian Jiu Jitsu.

Check out websites like Accept Bitcoin Cash or SpendBitcoins for ideas on where you can trade digital money for real-world goods.

Donate to your Favorite Organizations

Cryptocurrencies are great for donations as they make it so easy to send money quickly. Just snap a picture of a QR code address and your charitable contribution is on its way.

I’m a big fan of free and open source software, so I’ve sent tips to other developers and software projects I find useful.

Share it with Friends

Once again, the barrier to entry for digital currencies is low. Do you have crypto-curious friends? Have them download your favorite mobile wallet and send them a dollar or two. It’s simple and you may make a new crypto enthusiast for life!

After my recent lecture at Saint Vincent College, I was able to send a dollar apiece to several students by just instructing them to download a BCH wallet and snapping a picture of their QR code addresses. Crypto sharing is caring.

Spedning is Fun!

Sure, it’s easy to acquire cryptocurrencies and forget about them, stashing funds for a rainy day or big price spike. But the beauty of Bitcoin and its peers is the ability to exchange money in a way we’ve never done before. Adoption will be key for the future of digital money, and it’s quite easy and fun to participate in the economy.

Hodl some, spedn some. The crypto community thanks you.

Understanding Address Balances for UTXO Blockchains

Overview

When you open your Bitcoin, Bitcoin Cash, or Litecoin wallet, you’ll see a balance just like you do when you open your bank app. At the end of the day, you just want to know how much currency you own, right?

You may be curious, however, how your total balance is calculated in the world of cryptocurrencies. With your local bank, a centralized authority (the bank itself) keeps track of the state of your account as one unit. The bank tracks deposits and withdrawals, and keeps a running tally of your available balance for you.

The Bitcoin blockchain, however, does things a little differently. This blockchain (like the BCH and LTC blockchains, to name a few others) uses a concept called the UTXO to deal with available balances. If that sounds completely foreign, don’t fret. It turns out UTXO-based chains function quite like the physical cash in your wallet!

UTXOs explained

What is a UTXO?

An unspent transaction output, commonly referred to as a UTXO, is a chunk of cryptocurrency that is owned by a user’s wallet and available for the user to spend. More specifically, a UTXO is owned by a particular address in the user’s wallet, and therefore the associated private key.

A raw UTXO looks something like this when pulled from a block explorer API:


{
"txid": "2e2a921b819c261822dfa0931523a54b0c8900182c20d4be25ff333982a8f76a",
"amount": 0.10401187,
"confirmations": 306
}

This UTXO is pulled from the bitcoin.com REST API, with some bits of data removed for simplification. If you want to try querying this yourself, you can open this API call in your browser.

Deciphering UTXO data

Now let’s look a little closer at this UTXO. The first data field that we see is the txid, which is a long string of data that looks meaningless. This data is the hash of the transaction that created this UTXO. In other words, this particular transaction sent money to this address.

The second item is fairly self explanatory: this is the Bitcoin amount sent to the address in this UTXO.

Finally, the number of confirmations indicates how many times a new block has been added on top of the block containing this transaction. The more confirmations, the more “sure” we can be that this transaction is a permanent part of blockchain history and owned by the address.

How do UTXOs function?

UTXOs function in a way that is remarkably similar to physical cash. Think of a UTXO like a five dollar bill in your wallet.

A UTXO is a bill available for you to spend in a future Bitcoin transaction. Let’s say your grandma sent you $5 in a card for Christmas. You now have $5 in your wallet ready to use when you go to the store.

Much like dollar bills, UTXOs must be spent entirely in a new transaction. If you go to the store to buy a bag of chips and a drink for $2.50, you cannot tear the $5 bill in half and give it to the cashier, can you? You give the person the entire bill, and get $2.50 back in change.

Bitcoin UTXOs function in the exact same way in a transaction. If you have a UTXO your address owns for 0.1 Bitcoin and you want to send your friend 0.05 Bitcoin, your wallet will create a transaction that sends their address 0.05 BTC in a new UTXO, and sends 0.05 back to your address in change!
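Here’s a minimal Python sketch of that “pay and get change” idea. The function name `spend_utxo` and the optional `fee` parameter are my own inventions for illustration. The example above ignores fees, but a real transaction also leaves a small miner fee, which comes out of the change. Amounts are in satoshis (1 BTC = 100,000,000 satoshis) so we can stick to whole numbers.

```python
def spend_utxo(utxo_value, amount_to_send, fee=0):
    """Consume one UTXO entirely and return (payment_output, change_output).

    All values are integers in satoshis. This is a toy model of the idea,
    not how any particular wallet builds transactions.
    """
    if amount_to_send + fee > utxo_value:
        raise ValueError("UTXO is too small for this payment")
    # Whatever is not sent to the recipient (or paid as a fee) comes back
    # to the sender as a brand new "change" UTXO.
    change = utxo_value - amount_to_send - fee
    return amount_to_send, change


# A 0.1 BTC UTXO used to pay a friend 0.05 BTC, ignoring fees as in the text.
payment, change = spend_utxo(10_000_000, 5_000_000)
print(payment, change)
```

Just like the $5 bill, the original UTXO is gone after the transaction; two new ones (the payment and the change) take its place.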

UTXOs and Your Wallet Balance

Now that we understand how UTXOs work, understanding how your wallet tracks your balance is pretty straightforward! Your wallet contains a bunch of private keys and a bunch of addresses derived from those keys. Each address can have a bunch of UTXOs associated with that address, and your wallet balance is the sum total of all those UTXOs. It’s that simple. Just like you may have some 1’s, 5’s, and 20’s in your physical wallet, your Bitcoin wallet can have a bunch of UTXOs in any denomination of Bitcoin.
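If you’d like to see that sum in code, here’s a rough Python sketch. The JSON shape follows the simplified UTXO shown earlier; the second entry (its txid placeholder and amount) is made up for illustration, and `wallet_balance` is a hypothetical helper, not part of any real wallet or API.

```python
import json
from decimal import Decimal

# Hypothetical list of UTXOs owned by one wallet. The first entry reuses the
# sample UTXO from earlier; the second is invented for this example.
RAW_UTXOS = """
[
  {"txid": "2e2a921b819c261822dfa0931523a54b0c8900182c20d4be25ff333982a8f76a",
   "amount": 0.10401187, "confirmations": 306},
  {"txid": "<hypothetical second txid>", "amount": 0.05, "confirmations": 12}
]
"""


def wallet_balance(raw_json):
    """Return the wallet balance as the sum of all UTXO amounts."""
    # Parse amounts as Decimal so we don't pick up floating point rounding.
    utxos = json.loads(raw_json, parse_float=Decimal)
    return sum((utxo["amount"] for utxo in utxos), Decimal("0"))


print(wallet_balance(RAW_UTXOS))
```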

When you go to send Bitcoin to another user, your wallet bundles up as many UTXOs as it needs to create a transaction in that amount and uses them as “inputs” for that transaction. Unlike physical cash, however, your wallet can turn your $5 and $10 UTXOs into a fresh $15 bill.
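To make the “bundling” step concrete, here’s a small Python sketch of one possible way to pick inputs. I’m assuming a simple largest-first strategy and ignoring fees; real wallets use more sophisticated selection, and `select_inputs` is just a name I chose for this example.

```python
def select_inputs(utxo_values, target):
    """Pick UTXOs (largest first) until they cover the target amount.

    Returns (selected_values, change). Values are integers, e.g. satoshis.
    Largest-first is an assumption made for this sketch only.
    """
    selected, total = [], 0
    for value in sorted(utxo_values, reverse=True):
        if total >= target:
            break
        selected.append(value)
        total += value
    if total < target:
        raise ValueError("insufficient funds")
    # Anything above the target goes back to the sender as change.
    return selected, total - target


# A wallet holding "bills" worth 5, 10, and 1 wants to pay 12.
print(select_inputs([5, 10, 1], 12))
```

In this run the wallet spends the 10 and the 5 as inputs, and the extra 3 comes back as a new change UTXO.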

What’s in your (Bitcoin) wallet? UTXOs of course!

Again, UTXOs are the dollar bills of the Bitcoin world. Blockchains based on this model include popular digital currencies such as Bitcoin Core, Bitcoin Cash, and Litecoin. Other popular currencies such as Ethereum use an account based model that functions more like a traditional bank account, tracking inputs, outputs, and balances as state changes over time. The good news is, understanding the slightly more complex UTXO model is fairly trivial with a good analogy, and this model functions like the cash we use every day.

If you have a Bitcoin Cash address, you can try viewing your UTXO “dollar bills” yourself using a project I created for this purpose. This project features an API that digests raw blockchain data and outputs an easy-to-understand format so you can learn these concepts. On top of the API, there’s a nice and simple React frontend that formats the data in a table. The code is available on GitHub, and if you visit https://jmcintyre.net/sites/myaddrbal_client/ you can try it for yourself! Here’s an example with one of my BCH addresses used above:

Happy crypto learning!

(Bitcoin) Script Kiddies – Understanding Basic Transaction Scripts

Overview

In the Bitcoin world, money is not just digital – money is programmable! When transactions between users on the network are created and broadcast, miners and nodes independently verify that these transactions are valid. But this verification is not just checking some basic data points – it involves the execution of special scripts specified in the transaction parameters.

Script Basics

The Bitcoin Scripting Language

Before we can understand how basic Bitcoin scripts operate, we need to know a little bit about the scripting language itself. Unlike common scripting languages such as Python and Bash, the Bitcoin scripting language is quite limited and fairly simple in its execution. “Script” is stack based, meaning data is stored on an execution stack and script operators “push” and “pop” data from this stack. As well, Script is not Turing complete. There are no functions for looping or jumping around in the order of script execution. Operations are completely linear from the beginning of execution to the end. This keeps scripts secure, as it is not possible to tie up machines executing the scripts with an infinite loop.

Some operators are general, but most are specific to the cryptography of Bitcoin. Operators such as OP_ADD, OP_SUB, OP_DUP are pretty self explanatory – they make it possible to add, subtract, or duplicate data on the stack. Operators such as OP_HASH160 are more specific to the way Bitcoin operates – this operator takes the top item on the stack, hashes it using SHA-256 and then RIPEMD160, and finally pushes the result back on to the stack.
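To get a feel for stack-based execution, here’s a tiny Python sketch that handles only integer pushes plus OP_ADD, OP_SUB, and OP_DUP. It is a toy written for this article, not a real Script interpreter: `run_script` and the string-token format are my own assumptions, and real Script works on byte strings with many more rules.

```python
def run_script(tokens):
    """Evaluate a tiny subset of Script and return the final stack."""
    stack = []
    for token in tokens:
        if token == "OP_DUP":
            # Push a copy of the top item.
            stack.append(stack[-1])
        elif token == "OP_ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif token == "OP_SUB":
            # The item pushed second (b) is subtracted from the first (a).
            b, a = stack.pop(), stack.pop()
            stack.append(a - b)
        else:
            # Anything else is treated as a number to push on the stack.
            stack.append(int(token))
    return stack


print(run_script(["2", "3", "OP_ADD", "OP_DUP"]))
```

Notice there is no way to loop here: the interpreter walks the tokens once, from left to right, and stops.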

Common Bitcoin Script Mechanics

Pay to Public Key Hash (P2PKH) Script Basics

Finally, we need to understand a bit about how scripts are formed by discussing the basics of transactions. When a user “receives” Bitcoin in a transaction, they don’t just have a “bank account” balance on the blockchain. Rather, the blockchain stores what are called unspent transaction outputs associated with a user’s address. These outputs specify a locking condition that must be satisfied in a script when the user tries to spend that output in a future transaction. When the user creates a new transaction with that UTXO, they specify an unlocking script that satisfies that locking script.

The most common form of script on the Bitcoin network is called Pay to Public Key Hash. With this type of script, the locking script requires that the user provide their public key and a digital signature formed with transaction data and their private key. This public key and digital signature will “satisfy” the locking script. When this unlocking script is combined with the UTXO locking script and executed, the final result on the Script stack should be true, meaning that the user can spend the Bitcoin.

P2PKH Formation

For a P2PKH script, the locking script specified by an unspent transaction output looks like this:

OP_DUP OP_HASH160 <Public Key Hash> OP_EQUALVERIFY OP_CHECKSIG

The unlocking script provides the user’s signature and public key, in that order:

<Signature> <Public Key>

In order to verify that the user owns the Bitcoin they wish to spend, a node verifying this transaction will append the locking script to the unlocking script and then execute it:

<Signature> <Public Key> OP_DUP OP_HASH160 <Public Key Hash> OP_EQUALVERIFY OP_CHECKSIG

Script Execution

Now let’s walk through how this P2PKH script executes.

<Signature> <Public Key> OP_DUP OP_HASH160 <Public Key Hash> OP_EQUALVERIFY OP_CHECKSIG

First, the signature and public key specified by the unlocking script are pushed on to the stack:

STACK: <Signature> <Public Key>

Next, OP_DUP pushes a copy of the top item on to the stack:

STACK: <Signature> <Public Key> <Public Key>

OP_HASH160 will pop the top stack item and hash it using SHA-256, and then RIPEMD160. Once the hashing operations are complete, the result is pushed on to the stack:

STACK: <Signature> <Public Key> <Public Key Hash>

The user’s public key hash (a data item) specified by the locking script is pushed on to the stack:

STACK: <Signature> <Public Key> <Public Key Hash> <Public Key Hash>

OP_EQUALVERIFY now pops the top two stack items and checks that they are equal. If they are equal, execution continues. If the comparison fails, script execution exits with a failure.

STACK: <Signature> <Public Key>

OP_CHECKSIG now verifies that the signature is valid against the public key specified. An elliptic curve digital signature is created using a private key and a specific message, and any user with that message and public key can verify that the signature is valid without knowing the private key! Note that the message is not a part of the script, but is garnered from the overall transaction data. If the signature is valid, OP_CHECKSIG pushes true on to the stack.

STACK: true

Any Bitcoin script that ends with just true on the stack indicates a valid transaction. The user that created this transaction to spend some currency is in fact the rightful owner of the unspent output they want to use.
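If you like to read code, here’s the same walkthrough as a Python sketch. Big caveat: the cryptography here is fake. `toy_hash160` stands in for OP_HASH160 (it applies SHA-256 twice and keeps 20 bytes, since RIPEMD160 isn’t available in every Python build), and `toy_sign` is a placeholder that is NOT a real elliptic curve signature. Only the stack mechanics are meant to match the steps above.

```python
import hashlib


def toy_hash160(data):
    # Stand-in for OP_HASH160 (really SHA-256 followed by RIPEMD160).
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()[:20]


def toy_sign(pubkey, message):
    # NOT a real signature scheme: anyone could compute this. It exists only
    # so the OP_CHECKSIG step below has something to compare against.
    return hashlib.sha256(pubkey + message).digest()


def run_p2pkh(signature, pubkey, pubkey_hash, message):
    """Replay the P2PKH steps on a Python list used as the stack."""
    stack = [signature, pubkey]               # unlocking script pushes
    stack.append(stack[-1])                   # OP_DUP
    stack.append(toy_hash160(stack.pop()))    # OP_HASH160
    stack.append(pubkey_hash)                 # <Public Key Hash> from the lock
    if stack.pop() != stack.pop():            # OP_EQUALVERIFY
        return False                          # hashes differ: script fails
    key, sig = stack.pop(), stack.pop()       # OP_CHECKSIG
    stack.append(sig == toy_sign(key, message))
    return stack == [True]


alice_key = b"alice-public-key (made up)"
tx_data = b"transaction data (made up)"
locking_hash = toy_hash160(alice_key)
print(run_p2pkh(toy_sign(alice_key, tx_data), alice_key, locking_hash, tx_data))
```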

Bitcoin Scripts – Simple But Powerful

For someone with programming experience and some computer science background, Bitcoin scripts are generally straightforward to understand since the language is limited and Turing incomplete. Understanding P2PKH scripts requires just a working knowledge of stack data structures and commonly used cryptographic algorithms, but no higher level programming constructs. Now you know what goes on when you send your friend some money from your Bitcoin wallet!

However, the beauty of programmable money is the power to create transactions beyond the normal flow of “Alice sends Bob some cash”. Script opens up the possibility of things like multi-signature transactions, time locked spending, and more!

EZ-Pay – Full Node vs. SPV Wallets

Overview

When discussing digital currencies, the question is often asked “where is the ‘money’ actually stored?” In the world of fiat currency (US dollars, Euros, etc.), cash stored in your physical wallet is the money. You give a $20 bill to a cashier, and they now have $20. With cryptocurrencies like Bitcoin, the actual currency is stored on a completely public, open ledger called the blockchain. The blockchain stores a complete record of every transfer between individuals in the history of Bitcoin’s existence, so a Bitcoin wallet can easily verify that you own some amount of currency and can send it to another person.

However, there’s a bit of a problem with this. The Bitcoin blockchain contains a record of every single transaction ever recorded, now over ten years of history. The blockchain is HUGE in terms of storage space – nearly 200 gigabytes these days. What if we don’t have that kind of space on our computer? What if we want to have a digital currency wallet on our phone or another capacity-limited device? Fortunately, there’s a type of wallet called an SPV wallet that fixes this problem. Let’s discuss the difference between full node and SPV technology.

Full Node vs. SPV

The Full (Node) Experience

The fullest wallet experience comes from using what is called a full node. The strategy used by a full node wallet is very simple: the entire blockchain containing all transactions is downloaded to the machine running the wallet software.

Because the full blockchain is available to the wallet, verifying ownership of the user’s funds is simple. The wallet software looks at the blockchain ledger and traces the ownership of the currency back to the very beginning of Bitcoin. The blockchain is well secured by cryptography and proof of work, making it near impossible to forge any of these transfers. So, by storing the full blockchain, the wallet software can verify that all the previous transfers of Bitcoin leading up to the transfer to the current owner are valid and considered indisputable history. If your wallet can independently verify the blockchain’s transactions, it knows for sure that your Bitcoin is truly yours.

Wallets on a diet – Simplified Payment Verification or SPV

As we discussed, however, it can be problematic to download and store the entire blockchain for a wallet in many cases. What if a user with a small laptop, a mobile phone, or other limited-capacity device wants to participate on the Bitcoin network? A user may have limited storage capacity, or may also have limited bandwidth for downloading the very large blockchain. But if we can’t download the blockchain, how can we independently verify that the Bitcoin in our wallet is actually ours?

Satoshi Nakamoto, the inventor of Bitcoin, brilliantly solved this problem by developing a technology called Simplified Payment Verification, or SPV. These wallets use some neat cryptographic tricks to avoid downloading the whole blockchain, at the expense of a minimal amount of trust required to verify currency ownership.

So how do SPV wallets work? When an SPV node needs to verify ownership of a user’s funds (in order to create a new transaction where they send money to someone else), the node makes special requests to full nodes it can find on the network. Instead of asking for the whole blockchain, it only asks for specific bits of information it needs to cryptographically verify that the wallet user owns their money.

An SPV wallet only downloads what are called the block headers. These headers store important metadata about the transactions included in that block, including the Merkle root, a sort of cryptographic summary of the block’s transactions computed with a structure called a Merkle tree. Next, the SPV wallet will ask other nodes on the network for transaction data that is relevant specifically to the user’s wallet, like previous transactions that send money to the user’s Bitcoin addresses.

By getting the basic transaction data from other nodes and the block headers, the SPV wallet can use cryptography that verifies that the transaction does indeed belong in a particular block (by verifying it against the Merkle root in the block header). The SPV node can then verify that the blockchain is valid by checking that all the block headers are valid and have sufficient “proof of work”. It turns out that if the transaction the wallet needs to verify is several blocks “deep” (that is, behind the latest block proved and added to the chain), the wallet can generally trust that the funds do indeed belong to the user without having to verify the whole blockchain!
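Here’s a simplified Python sketch of the Merkle check an SPV wallet relies on. It uses double SHA-256 and duplicates the last hash on odd-sized levels, as Bitcoin does, but it skips details like byte ordering and the network messages involved. The function names and the proof format (a list of sibling hashes with a left/right flag) are my own assumptions for illustration.

```python
import hashlib


def dsha256(data):
    """Double SHA-256, the hash used for Bitcoin's Merkle trees."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def next_level(level):
    """Hash pairs of nodes together to build the next level up the tree."""
    if len(level) % 2:
        level = level + [level[-1]]  # odd count: duplicate the last hash
    return [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    level = list(leaves)
    while len(level) > 1:
        level = next_level(level)
    return level[0]


def merkle_proof(leaves, index):
    """What a full node would send: sibling hashes from the leaf up to the root."""
    proof, level = [], list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        level = next_level(level)
        index //= 2
    return proof


def verify_proof(leaf, proof, root):
    """What the SPV wallet does: rebuild the root from one leaf and its proof."""
    current = leaf
    for sibling, sibling_is_left in proof:
        pair = sibling + current if sibling_is_left else current + sibling
        current = dsha256(pair)
    return current == root


# Five made-up "transaction hashes" stand in for a block's transactions.
tx_hashes = [dsha256(bytes([i])) for i in range(5)]
root = merkle_root(tx_hashes)
print(verify_proof(tx_hashes[3], merkle_proof(tx_hashes, 3), root))
```

The nice part is the size of the proof: the wallet only needs a handful of sibling hashes, not every transaction in the block.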

It is important to note that in order to prevent being scammed by one rogue node on the network, SPV wallets connect to many full nodes to request transaction data. It is far less likely that all the peers an SPV node connects to are trying to scam that node with falsified transaction data, so it is generally considered secure to use SPV nodes for everyday transfers. If a user wants the most secure wallet experience, a full node is a bit better since it verifies the whole blockchain and doesn’t have to trust other parties on the network.

SPV – Wallets, simplified!

Thanks to the interesting cryptography of Merkle trees, proof of work, and block chaining, SPV wallets do not need to download the entire blockchain to securely check if a user owns their Bitcoin. By asking for specific transaction data, an SPV wallet can check that transactions sent to a user’s address belong to a block using a Merkle tree. And by verifying block chaining and proof of work, the node can trust that said transaction has been accepted as part of the Bitcoin history and is therefore owned by the user. Since SPV nodes communicate with multiple full nodes, it is generally true that SPV wallets are secure despite the fact that they do not download and validate the entire blockchain. So never fear – if you’re using a mobile phone wallet or a wallet on your netbook, you can participate in Bitcoin in a way that is secure!

What’s in Your Wallet? Understanding Private Key Control

Overview

Just like your cash, cards, and ID, your cryptocurrency assets live in something called a “wallet”. Almost all forms of digital money implement this concept in some form, and understanding wallets is critical to safely storing and using your favorite digital currency.

Much like your physical wallet, your Bitcoin, Monero, or Ethereum wallet gives you direct access to the funds inside. A crypto-wallet isn’t like a credit card – if a stranger gets a hold of it, you can’t cancel it. Much like cash, the money stolen would be theirs!

But how does this work? How can a digital asset act like cash when all other forms of digital monetary transactions (credit cards, bank transfers, PayPal) can be “cancelled” if stolen? We must first understand a bit about what a “private key” is, and why who controls it is so important to the security of your cryptocurrency funds.

A word on private keys

Without getting too far into the technical details, let’s discuss a bit about what a “private key” is and why it is so important. Remember how I said that your crypto-wallet is like digital cash, and your wallet “stores” your Bitcoin or other currency? Well, that’s not quite how that works…

In reality, the amount of Bitcoin that you own is stored on a worldwide, completely public ledger/database called the “blockchain”. This ledger stores a public record of all of the Bitcoin transfers ever conducted, so anyone can see exactly who owns what. Sound scary and insecure? How can you control your digital cash if everyone has access to this open blockchain??

This is where private keys come in. Bitcoin (and other crypto currencies) use a form of cryptography called “elliptic curve cryptography” to generate the Bitcoin “addresses” people can use to send you money. The address is completely public; you can give it to anyone and they can send you funds. However, behind this address is a special “private key” used to access those funds on the blockchain. Your address is generated from this randomly generated private key by using this form of cryptography.

The cryptography used in address generation makes it so that you can’t figure out the private key by going backwards from the public key, or Bitcoin address. However, the private key is used to prove that you own the address without ever revealing it, thanks to the magic of elliptic curve cryptography. It is critical that the private key is always kept secret, because anyone with the private key can access the Bitcoin at the associated address.
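For the curious, here’s a bare-bones Python sketch of the one-way step from private key to public key on secp256k1, the elliptic curve Bitcoin uses. The constants are the curve’s published parameters; the function names are mine. Turning the public key into an address involves extra hashing and encoding that I’ve left out, and this code is for learning only: please don’t use it to protect real funds.

```python
# secp256k1 parameters: the curve is y^2 = x^3 + 7 over the integers mod P.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a point plus its mirror image
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1, P - 2, P) % P   # tangent line
    else:
        slope = (y2 - y1) * pow(x2 - x1, P - 2, P) % P    # line through a and b
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return x3, y3


def public_key(private_key):
    """Compute private_key * G with double-and-add."""
    if not 0 < private_key < N:
        raise ValueError("private key out of range")
    result, addend = None, G
    while private_key:
        if private_key & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        private_key >>= 1
    return result


# A made-up private key for demonstration; real keys come from a secure
# random source and must never be published.
x, y = public_key(0x1234567890ABCDEF)
print(hex(x), hex(y))
```

Going forward (private key to public key) takes a fraction of a second. Going backward would mean solving the elliptic curve discrete logarithm problem, which nobody knows how to do in any practical amount of time.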

Levels of Private Key control in Wallets

Now that we understand the basics of private keys and their importance, we can talk a bit about how different wallets keep these keys safe from the prying eyes of crypto-thieves. All wallets must work in some way that keeps the private keys, well, private so that control of the funds lies with their rightful owner. There are three general approaches to private key control in wallets: a full control model, a hybrid model, and a custodial model.

Full control wallets

Full control wallets offer the obvious – complete and total control of the private keys. With a full control model, the private keys are generated and stored on the user’s device, be it a desktop computer, mobile phone, or even a hardware wallet like the Trezor. With this model, the private keys never leave the user’s device in any shape or form.

The advantage to this model should be fairly obvious – it is by far the most secure model. There is no trust involved with a third party; the funds are completely controlled by you. Users should still exercise care around other security parameters (ensuring a virus-free machine, for example), but generally these wallets offer the most hardened approach to keeping private keys safe.

The disadvantage here is the lack of convenience and ease of use. These wallets require the most technical savvy of these three models, although most “power users” will have no problem understanding and securing these wallets. It is extremely important that the users of these wallets understand how to back up their private keys. If the device is fried or lost and there is no accessible backup, all funds will be lost! Fortunately again, BIP39 mnemonic backups make this task easier than it was with the first few Bitcoin wallets.

Hybrid wallets

Some major players in the crypto space have created an interesting hybrid model for private key storage. Web wallets like those at blockchain.info or btc.com implement this model. With hybrid wallets, private keys are generated and then encrypted on the user’s machine (usually in the web browser) before being stored on the company’s server. With these wallets, private keys are only known and accessible to the user, while the company keeps an encrypted backup safe on their servers.

With this model, security is still pretty strong. Because strong encryption is done on the user’s machine, no one with access to the company’s servers has access to the actual private keys without the decryption passphrase, which lies safely with the user. This model requires that the user trust that the company’s code (which is preferably open source) is soundly implemented and doesn’t contain secret backdoors. However, if the encryption is done right no one but the user can actually access the keys. Overall, this model is only slightly less secure than a full control model.

Although there is a small amount of security tradeoff here, this model comes with increased convenience to the user. Most web wallets have a more traditional username and password login interface, so the user only needs to create and remember a secure passphrase to access their funds, with the site taking care of backups for them. This may be easier for a beginner crypto-enthusiast, and any good site will still offer mnemonic backups and private key exports for the savvy user.

Custodial wallets

The final model we’ll discuss here is the custodial wallet. Many exchanges like Coinbase offer custodial wallets. Like a hybrid wallet, all you need to do is create a username and password and log into a website to access funds. However, the critical difference is that with a custodial wallet, the user doesn’t know their private keys at all! With custodial wallets, the website takes care of generating and storing all the private keys without revealing them to the user. No backups to manage, and no need to understand how to do much more than log in to a website to use this kind of wallet.

The security pitfalls of this model are pretty serious, in my opinion. With these wallets, the user has no way to back up their private keys. What’s more, the user must completely trust the company or individual implementing this kind of wallet. These companies must have significant security measures in place to avoid attacks on their servers, and they must be trusted to mitigate the access rogue employees could have to users’ money.

In fact, these types of wallets completely break a fundamental security principle of Bitcoin – the user controls their keys, therefore the user controls their money. Custodial wallets far more closely resemble the centralized model of traditional banks.

Don’t worry though – these wallets aren’t all scary! The cost of security comes with a large benefit – ease of use! These wallets have a much smaller learning curve for complete beginners. Just sign up for a wallet account just like you would a forum, email address, or social media account. For someone with little understanding of the world of cryptocurrencies, this type of wallet offers a gentle introduction.

My only advice would be that given the security pitfalls of this model, only use custodial wallets to store small amounts or for buying and selling. Most custodial wallets live in currency exchanges, so use them to buy your crypto of choice and send the funds to a more secure wallet.

Know Your Keys

The most important takeaway from this discussion of private key control models is that it is important for a wallet user to know where their private keys are. Again, in Bitcoin and other cryptos, control over your private keys is control of your money. Anyone with access to the keys has access to your money and can spend it freely, so either keep them to yourself or make sure the holder is a wallet maker you trust.

By understanding these different models, users have more control over how they choose to secure their wallets and keep their funds safe. Understanding the pros and cons of full control, hybrid, and custodial wallets allows a cryptocurrency user to choose the best wallet for their needs. Ultimately, an understanding of these models allows for better security and comfort with digital money, because a person knows who truly owns their keys and is responsible for keeping them safe.

Playing With Blocks: The Basics of Blockchain Databases (Part 2 – Blockchain for Techies)

Overview

In our non-technical overview of blockchain, we discussed what a blockchain database is – a distributed, cryptographically secured, and immutable data structure used in applications like digital currencies. We discussed how blocks are cryptographically linked together to ensure that old records cannot be changed, and why these types of databases are useful for applications where the immutability of data is key.

But how does this work technically? Let’s take a deeper look at how blockchains are secured by the application of hashing and proof-of-work.

It’s all in the block’s head

The block header

The key to understanding blockchain lies in a data structure included in every block of every blockchain. This data structure is called the block header, and it contains several critical bits of information needed to secure the chain as it grows.

Let’s look at the Bitcoin block header as an example. Each Bitcoin block contains an 80 byte header with the following information:

  • Version – the software version of the Bitcoin protocol
  • Timestamp – expressed in seconds since the Unix Epoch
  • Merkle Root – for our purposes here, we’ll say this is a “fingerprint” of all the transactions in this block
  • Difficulty target – A compact 32-bit encoding of the 256-bit target used in calculating proof of work
  • Nonce – The value added to the block header to demonstrate proof of work
  • Previous block hash – The double SHA-256 hash of the previous block header

All of this data is important and serves a purpose in the Bitcoin protocol. However, the previous block hash deserves special attention because it is particularly important when discussing how the blockchain is secured!
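To see where the 80 bytes come from, here’s a short Python sketch that packs the six fields listed above with the standard `struct` module. The field order follows Bitcoin’s header layout (version, previous block hash, Merkle root, timestamp, difficulty bits, nonce). The sample values are placeholders I made up, and `serialize_header` is a hypothetical helper name. Note that the target is stored here in its compact 4-byte form.

```python
import struct


def serialize_header(version, prev_hash, merkle_root, timestamp, bits, nonce):
    """Pack the header fields into 80 bytes.

    '<' means little-endian with no padding, 'L' is a 4-byte unsigned integer,
    and '32s' is a 32-byte string: 4 + 32 + 32 + 4 + 4 + 4 = 80 bytes.
    """
    return struct.pack("<L32s32sLLL", version, prev_hash, merkle_root,
                       timestamp, bits, nonce)


# Placeholder values for illustration only (not a real block).
header = serialize_header(
    version=1,
    prev_hash=bytes(32),      # 32 zero bytes standing in for a real hash
    merkle_root=bytes(32),
    timestamp=2000,
    bits=0x1D00FFFF,
    nonce=2356343,
)
print(len(header))
```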

A quick review of hashes

Before we discuss the critical role that the previous block hash section of the block header plays in securing the blockchain, let’s step back and recall what a hash function does. When “hashing” data, a special algorithm called a “hash function” takes the data and outputs a unique “fingerprint” of the input data. These functions (at least if they are implemented properly) have two very important properties.

First, a particular chunk of data always produces the same hash (or “fingerprint”) every time it is run through the function. If you run Hello through the SHA-256 hashing algorithm, the result will always be 185f8db32271fe25f561a6fc938b2e264306ec304eda518007d1764826381969.

Second, good hash functions avoid collisions, where different data results in the same hash output. Using a proven algorithm such as SHA-256 means that, for all practical purposes, every different input produces a different hash. Even if a single bit of input changes, the hash is radically different. “Hello” will produce a very different output than “hello”, even though only a single bit of the input is changed.
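You can try both properties yourself with Python’s built-in `hashlib` module. This is just a quick sketch; `sha256_hex` is a small helper I wrote for the example.

```python
import hashlib


def sha256_hex(text):
    """Return the SHA-256 hash of a string as a hex fingerprint."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Same input, same fingerprint, every time.
print(sha256_hex("Hello"))
print(sha256_hex("Hello"))

# One letter changes case and the fingerprint looks completely unrelated.
print(sha256_hex("hello"))
```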

Hashes inside hashes inside hashes

Understanding these properties of hashes, we can better understand the interesting approach that blockchains take to securing the integrity of data in previous blocks when combined with proof of work.

Each time a block is generated, proof of work is generated in the form of the nonce included in the block header. This is computationally intensive and essentially proves that a miner spent a good bit of CPU power to find the answer to this cryptographic puzzle.

Now, when the nonce is included in the block header with the other data including the previous block hash, we can hash the entire block header to generate the unique fingerprint. This “block hash” is unique, and changing any data whatsoever in the block header would create a radically different block hash.
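The nonce search described above can be sketched as a toy miner. This is a simplified illustration, not Bitcoin's actual algorithm: it uses a single SHA-256 pass, an 8 byte nonce, a made-up header string, and a deliberately easy target (TOY_TARGET) so that it finishes in a fraction of a second.

```python
import hashlib

def block_hash(header_data, nonce):
    """Hash the header bytes plus a nonce and return the result as an integer."""
    payload = header_data + nonce.to_bytes(8, "little")
    return int.from_bytes(hashlib.sha256(payload).digest(), "big")

def mine(header_data, target):
    """Try nonces one by one until the block hash falls below the target."""
    nonce = 0
    while block_hash(header_data, nonce) >= target:
        nonce += 1
    return nonce

# Toy target: the hash must start with 12 zero bits (about 4096 tries on average)
TOY_TARGET = 1 << (256 - 12)

# Stand-in for the rest of the header fields (hypothetical data)
header_data = b"version|previous block hash|merkle root|timestamp|target"

nonce = mine(header_data, TOY_TARGET)
print("found nonce:", nonce)

# Verifying the work takes a single hash, no matter how long the search took
print("valid:", block_hash(header_data, nonce) < TOY_TARGET)
```

Finding the nonce takes thousands of attempts, while checking it takes one. Lowering the target makes the search harder without making verification any slower.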

Okay, so each block has an associated hash. What’s the big deal? How does that help secure the blockchain? The magic of blockchain lies in that critical piece of data known as the previous block hash. Recall that changing any bit of data in the input radically changes the output of a hash function. So what would happen if we tried to change a transaction 2 blocks back in our blockchain?

If a node tries to broadcast a fake blockchain to the network with a fake transaction 2 blocks back, the block hashes for each subsequent block would be radically different.

Let’s look at an example. Let’s say we have a really simple blockchain with some transaction data like so (Note – these hashes are made up for demonstration purposes):

Block 2:
Time - 3000
Difficulty - 1000000000000000000000000000000000000000000000000000000000
Nonce - 5345245
Prev block hash - 33a0b89fcce723e9f41f5d756ab1c20584afbe6dfa9ea18838ff3caf0915b5f5
Transaction: Bob pays Alice 6 units

Block 1:
Time - 2000
Difficulty - 1000000000000000000000000000000000000000000000000000000000
Nonce - 2356343
Prev block hash - f4ebb8b56f590188f5824276af552cd51a48ba774e3ad1350c2800b116d8f6f5
Transaction: Alice pays Bob 5 units

Block 0:
Time - 1000
Difficulty - 1000000000000000000000000000000000000000000000000000000000
Nonce - 1232341234
Prev block hash - 0000000000000000000000000000000000000000000000000000000000000000
Transaction: Alice pays Bob 1 unit

Now let’s say Bob gets greedy and tries to say that Alice paid him 10 units in the first transaction:

Block 2:
Time - 3000
Difficulty - 1000000000000000000000000000000000000000000000000000000000
Nonce - 9987983
Prev block hash - 9ae343e333cbb96427eb333bb8c443359e3cf926c9de9845ceb583577b945afb
Transaction: Bob pays Alice 6 units

Block 1:
Time - 2000
Difficulty - 1000000000000000000000000000000000000000000000000000000000
Nonce - 390970
Prev block hash - 3f82f4cfe059b5a69a0fd5b4d34774af5ecdc672d988320d5fd186998969a645
Transaction: Alice pays Bob 5 units

Block 0:
Time - 1000
Difficulty - 1000000000000000000000000000000000000000000000000000000000
Nonce - 235235
Prev block hash - 0000000000000000000000000000000000000000000000000000000000000000
Transaction: Alice pays Bob 10 units

Notice how different the hashes are for blocks 1 and 2 in Bob’s fake blockchain. If these hashes are to be considered valid by a node in this currency’s network, then each block must also demonstrate proof of work. Since the data has changed in a block, a new nonce must be found to show that work was done.

Here is the most critical part – since the hash for block 0 has changed and is included in block 1, proof of work has to be re-done for block 1. And since block 1’s hash is included in block 2, proof of work has to be re-done for block 2. In other words, to fake a transaction 2 blocks back, Bob has to re-do proof of work for 3 whole blocks!! In the meantime, legitimate nodes only have to find a solution for the current block. It is clearly impractical, if not impossible, to “fake” a blockchain more than one or two blocks old, because proof of work has to be redone for many blocks in the time the legitimate network only has to prove one.
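The chaining effect in Bob's example can be reproduced with a short sketch. To keep it small, proof of work is left out entirely and each "header" is just the previous hash plus one transaction string; header_hash, build_chain, and is_linked are names invented for this illustration.

```python
import hashlib

def header_hash(prev_hash, transaction):
    """Fingerprint a toy block header made of a previous hash and a transaction."""
    return hashlib.sha256(f"{prev_hash}|{transaction}".encode("utf-8")).hexdigest()

def build_chain(transactions):
    """Link blocks together by storing each block's predecessor hash."""
    chain = []
    prev_hash = "0" * 64  # block 0 has no predecessor
    for transaction in transactions:
        chain.append({"prev_hash": prev_hash, "transaction": transaction})
        prev_hash = header_hash(prev_hash, transaction)
    return chain

def is_linked(chain):
    """Check that every block really points at the hash of the block before it."""
    for earlier, later in zip(chain, chain[1:]):
        expected = header_hash(earlier["prev_hash"], earlier["transaction"])
        if later["prev_hash"] != expected:
            return False
    return True

chain = build_chain([
    "Alice pays Bob 1 unit",
    "Alice pays Bob 5 units",
    "Bob pays Alice 6 units",
])
valid_before = is_linked(chain)

# Bob edits history in block 0 without touching the later blocks
chain[0]["transaction"] = "Alice pays Bob 10 units"
valid_after = is_linked(chain)

print(valid_before, valid_after)
```

The chain checks out before the edit and fails after it, because block 1 still stores the hash of the original block 0. To repair the links, Bob would have to rebuild block 1 and block 2 as well, and in a real network that means redoing their proof of work.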

Faking a blockchain takes too much work!

Due to the interesting combination of hashing and proof-of-work algorithms, it is incredibly difficult, if not impossible, to fake history in a blockchain database. Because each block contains the hash of the previous block, changing history several blocks back means that every block from that point forward must be forged. While legitimate nodes only have to prove work for one block in that span of time, an attacker would have to redo the work for many. Unless a malicious party controls more computing power than the rest of the network combined, it’s nearly impossible to do so.

For the extra curious, Satoshi covers the math behind this concept extensively in section 11 of the Bitcoin whitepaper. While I don’t claim to understand this math very well myself, the paper does a good job of explaining its conclusions that forging blockchain history is a fool’s errand.
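For readers who would rather run the numbers than read the derivation, section 11 of the whitepaper includes a short C function for this calculation. Below is my Python translation of it, offered as a sketch rather than an authoritative implementation. Here q is the attacker's share of the total hash power and z is the number of blocks the attacker has to catch up.

```python
import math

def attacker_success_probability(q, z):
    """Probability that an attacker with hash-power share q ever catches up from z blocks behind."""
    p = 1.0 - q            # share of hash power held by honest nodes
    lam = z * (q / p)      # expected attacker progress while honest nodes mine z blocks
    total = 1.0
    for k in range(z + 1):
        # Poisson probability that the attacker has mined exactly k blocks
        poisson = math.exp(-lam)
        for i in range(1, k + 1):
            poisson *= lam / i
        # Subtract the cases where the attacker still fails to catch up
        total -= poisson * (1 - (q / p) ** (z - k))
    return total

# An attacker holding 10% of the network's hash power
for z in (0, 1, 2, 5, 10):
    print(z, f"{attacker_success_probability(0.1, z):.7f}")
```

With 10% of the hash power, the attacker's odds start at certainty for zero blocks, fall to roughly 20% at one block, and shrink rapidly with every block after that.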

Playing With Blocks: The Basics of Blockchain Databases (Part 1 – Blockchain for Everyone)

Overview

Blockchain is the latest and greatest buzzword in the information technology world. From open source, decentralized cryptocurrencies like Bitcoin to traditional financial institutions, it seems as though everyone is dying to create and release their own blockchain based applications. But what is blockchain? Why is it such a popular concept, and what is it actually good for? Let’s discuss.

What and why: Blockchain simplified

What is blockchain?

So you’ve heard that blockchain is going to revolutionize everything, but what is it exactly? Let’s cut through the hype and discuss the technical foundations of Blockchain.

A blockchain is a distributed, cryptographically secured database that focuses on making historical data immutable.

In a traditional database, information is often stored on one or a few machines, controlled by a central authority. Access is controlled by this authority (think IT administrator) and the data is kept secure by granting credentials to modify that data to a select few trusted parties. By contrast, a blockchain database is governed by what is called distributed consensus, using mechanisms such as proof-of-work. For more information on proof-of-work, you can read my series of articles on it. The important thing to note is that (in general) no one central person or authority decides what data is “verified” in a blockchain. A community of network nodes and software does.

If anyone can modify the data in a blockchain rather than a trusted party, then how is this consensus on what is correct achieved? Again, the secret lies in the science of cryptography. Through a mechanism like proof-of-work, a cryptographic puzzle is solved by software with some incentive to do so. In Bitcoin, the node that solves this puzzle is granted new currency. The real magic, however, is the fact that any other node in the network can verify that this answer is correct in a split second, so anyone can independently verify that a block meets the cryptographic standards set by that blockchain’s protocol.

You may be wondering how the cryptography in each block keeps the overall blockchain secure. This is the question of immutability, or how easy it is to modify the history stored in the blockchain. Blockchains solve this by cryptographically “linking” each block to the previous block, thereby making each individual block a critical part of the history stored by that “chain”. Each block has a header full of useful metadata about that block – a timestamp, a “summary” of the included data or transactions, a difficulty target and nonce for mining (part of proof-of-work), and the hash of the previous block’s header. Each block header is run through a one-way, cryptographically secure function called a “hash function” that creates a unique digital fingerprint for the data.

Immutability is achieved when combining the proof-of-work consensus mechanism with this system of chaining each block together. In order to create each block, the cryptographic puzzle solved by the proof-of-work algorithm allows a unique block header hash to be generated. It is computationally difficult to get this value, but very easy to verify it is correct. Now, it’s not that hard to re-solve that hard problem in a matter of minutes…it would be easy to create a fake block at the top of the chain. But what about 10 blocks back? Well, since each block contains a hash of the previous block header that is generated by solving this hard problem, you would have to now fake history for ten whole blocks! It is exponentially more difficult to do so the further back in the chain that you go. Unless you can truly do the work required to fake history in a blockchain, any independent network node could easily see that the rest of your history on forward is invalid. The immense difficulty of “faking” history in a blockchain gives it the most important property it has, its immutability.

Cool, so why is it useful then?

By far the most important aspect of blockchain, in my opinion, is its ability to decentralize applications. With a traditional database, a central authority has to be trusted, which can be a disadvantage in applications that are controversial or have high incentives for fraud. For example, previous attempts at digital money like DigiCash had central services for issuing currency and validating transactions. These were promptly shut down by governments that didn’t like independent currencies very much.

With blockchain, it is possible to have things like completely peer-to-peer money as with Bitcoin, Litecoin, and countless others because no central government or individual has to be trusted! The network is secured by math (cryptography) rather than trust thanks to the blockchain. You don’t have to trust anyone to not defraud you of your money, because the math cannot lie about who owns what.

The other critical function of blockchains beyond decentralization is the preservation of history. Because blockchains are immutable, they can be useful for keeping things like medical records, property transactions, court histories, and more secure from the kind of malicious tampering that a traditional database is vulnerable to. This does rely on some degree of decentralization, but even within a single company a blockchain is far harder to tamper with than a traditional database.

Cool, now I want a blockchain!

Blockchains are a fascinating and novel way to handle problems with traditional databases in certain applications. Thanks to the decentralized and cryptographically secure nature of these databases, it’s possible to create peer-to-peer applications that don’t require trusting a third party – a key problem to solve for concepts like digital money. As well, their immutability makes them useful even beyond the first few money-centric applications that existed – they may be coming to a real-estate authority, doctor’s office, or justice system near you!

Round and Round – Using Generators in Python

Overview

Almost all modern programming languages support constructs for storing lists of data – think C++ arrays, Java ArrayLists, and Python lists. Any time we have a list of data, it’s often necessary to use loops to perform some operation on each item in that list. Again, most modern languages support ways of looping over these lists of data, most commonly “for” loops.

In Python, it’s trivial to iterate over a list using a for loop. However, Python lists are stored in memory. What happens if the data set you need to operate on is large and therefore memory inefficient? Python offers constructs called iterators that allow one to loop over data sets that are not stored in memory. Some iterators are built in for things like file reading, but you can easily create your own custom iterators using a concept called generators!

Iterators? Generators? Combobulators?

Traditional List Iteration

It’s easy to create a list of numbers in Python and loop over that list. Let’s say, for example, we want to create a list of multiples of 2. We’ll store 5 numbers in this list. For each number in this list, we’ll just print it out to the screen for now:

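def print_multiples():
    # Fetch the whole list, then print each value in turn
    multiples = get_multiples()
    for m in multiples:
        print(m)

def get_multiples():
    # The entire list is built and held in memory at once
    return [2, 4, 6, 8, 10]

if __name__ == "__main__":
    print_multiples()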

Let’s step through this code a bit and explain how it works. When we enter the print_multiples function from main, the first call multiples = get_multiples() assigns the return value of get_multiples() to the multiples variable.

The multiples variable now stores an in-memory list with all of our multiples of 2 – 2, 4, 6, 8, and 10. We next go to the for loop for m in multiples:. These Python for loops are slick and fairly straightforward – on each pass through the loop, the next value in the list is stored directly in the variable m. The loop continues until every value in the list has been used.

The output looks like this:

python test.py
2
4
6
8
10

So what’s a generator look like?

Now let’s try this code again, using Python’s generator construct. Here’s what that looks like:

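def print_multiples():
    # get_multiples() now hands back a generator, not a list
    multiples = get_multiples()
    for m in multiples:
        print(m)

def get_multiples():
    # Produce one multiple of 2 at a time instead of building a list
    for i in range(1, 6):
        yield i * 2

if __name__ == "__main__":
    print_multiples()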

Notice our output is the same:

python test.py
2
4
6
8
10

This code requires some closer examination to understand how it works. When we assign multiples = get_multiples(), we don’t assign a list, we assign what’s known in Python as an iterator (more precisely a generator object, which is one kind of iterator). An iterator object exposes a __next__ method (spelled next in Python 2) that allows for loops to retrieve each sequential item in a list or other iterable object.

When we enter our for loop this time, we don’t iterate over the list – instead our iterator uses the get_multiples() generator function to retrieve each item one by one. The first time we go around the for loop in print_multiples, we enter the get_multiples function and enter its for loop.

Now, you’ll notice a different keyword being used in get_multiples – instead of returning an entire list, the function only yields one item, the result of i * 2. The value is passed back through the iterator to the for loop and printed to the screen. The next time around the for loop in print_multiples, the code resumes at the same spot the value was yielded from in get_multiples. The return keyword hands control back to the caller for good, whereas the yield keyword only temporarily yields control back to the caller until the next time the caller needs a value from the iterator. The get_multiples function’s for loop continues, and yields the next multiple of 2. The function will continue yielding multiples of 2 until its for loop ends, with the final value yielded being 10.
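To watch this pause-and-resume behavior without a for loop in the way, we can pull values out by hand with Python's built-in next() function. This is a small sketch that reuses the get_multiples generator from above.

```python
def get_multiples():
    for i in range(1, 6):
        yield i * 2

multiples = get_multiples()   # nothing inside get_multiples has run yet

first = next(multiples)       # runs until the first yield
second = next(multiples)      # resumes right after that yield
rest = list(multiples)        # drains whatever values remain

# Once the generator is finished, asking for more raises StopIteration.
# A for loop catches this for us and simply stops looping.
try:
    next(multiples)
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, rest, exhausted)  # 2 4 [6, 8, 10] True
```

A for loop is doing exactly this behind the scenes: calling next() until StopIteration tells it there is nothing left.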

Cool, so why would I want to use generators instead of a list?

Our trivial example shows how generators work, but it doesn’t explain why they’re actually useful. Why all the extra complexity to avoid storing a whole 5 numbers in memory? The use case of generators goes far beyond small lists of numbers.

First, what if we wanted to display the first one billion multiples of 2? In that case, it becomes much more expensive to store one billion integers in memory – it would eat up well over a gigabyte!
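We can get a rough feel for the difference with sys.getsizeof. This sketch uses 100,000 values instead of a billion so it runs instantly, and the exact byte counts will vary between Python versions and platforms.

```python
import sys

COUNT = 100_000  # scaled down from one billion so the demo runs instantly

# A list holds every value at once; a generator expression holds only its state
as_list = [i * 2 for i in range(1, COUNT + 1)]
as_generator = (i * 2 for i in range(1, COUNT + 1))

# getsizeof reports only the list's own array of references, not the integer
# objects it points to, so the list's true footprint is even larger than this
list_bytes = sys.getsizeof(as_list)
generator_bytes = sys.getsizeof(as_generator)
print("list:", list_bytes, "bytes")
print("generator:", generator_bytes, "bytes")

# The generator can still be consumed like any other iterable
total = sum(as_generator)
print("sum of multiples:", total)
```

The list's size grows with COUNT, while the generator stays the same small size no matter how many values it will eventually produce.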

In real-world software engineering applications, we often use generators to deal with large datasets even beyond simple calculations such as this. Generators can be used to operate on large database queries or data read from files on disk, where it would be inefficient or even impossible to read the information into memory.
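As a sketch of the file case, here is a generator that hands back one line at a time instead of reading the whole file into a list. The function name read_lines and the temporary demo file are assumptions made for this example.

```python
import os
import tempfile

def read_lines(path):
    """Yield one line at a time so the whole file never sits in memory."""
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            yield line.rstrip("\n")

# Write a small throwaway file so the example is self-contained
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write("alpha\nbeta\ngamma\n")
    demo_path = tmp.name

# Only one line is held in memory at any moment during this loop
lengths = [len(line) for line in read_lines(demo_path)]
os.remove(demo_path)

print(lengths)  # [5, 4, 5]
```

The same pattern works for a file with millions of lines: the calling code stays the same, and memory usage stays flat.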

Generate your own generators

In this article, we’ve explained how to go beyond memory-stored lists of information and create our own generators. Instead of hogging memory for large data operations, we can make our own memory-efficient iterators. Now when you have large calculations, database queries, or file reads to worry about, you can keep your memory usage low and your code easy to understand thanks to Python!

BIP39 Mnemonics Made Easy (Part 2 – The Tech of Bits to Backups)

Overview

In the last article, we discussed a high level overview of BIP39 mnemonics and their value as a simplified backup tool. Mnemonics make it much easier to take a single seed, back it up, and ensure access to an entire wallet of private keys, addresses, and transactions. But how do we go from a random set of bits to a list of words? Let’s discuss the technical side of BIP39.

Bits to Backups – The Steps for Generating a Mnemonic

First, Chaos

In order to generate a good seed, a fair amount of entropy or “randomness” is desirable. Good random number generators are hard to get, but modern operating systems like Linux do a pretty good job of sourcing entropy from user input and hard drive activity, and something like /dev/urandom on a daily driver machine should be sufficiently secure for generating the entropy we need.

Now how many random bits do we need? The BIP39 standard specifies 128-256 bits of entropy, in multiples of 32 bits, to be used for generating the seed. This will correspond to 12-24 words later on when we “map” the entropy to the words.

First, a warning: DO NOT USE any of the examples in this article to generate a wallet – your funds will be stolen!

With that out of the way, let’s look at an example. First, let’s generate 128 bits of entropy using os.urandom() in Python. Represented as binary, our entropy looks like this:

10111110011001010101110111001111010100011111011010110001110101111011110111000101101001100011110100010100011101000011011011100000
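A minimal sketch of this step might look like the following. The helper name generate_entropy_bits is made up for this example, and as with everything else in this article, its output should not be used for a real wallet.

```python
import os

def generate_entropy_bits(num_bits=128):
    """Return random entropy as raw bytes plus its binary string form."""
    # BIP39 allows 128 to 256 bits of entropy, in steps of 32 bits
    if num_bits % 32 != 0 or not 128 <= num_bits <= 256:
        raise ValueError("entropy must be 128-256 bits, in multiples of 32")
    entropy = os.urandom(num_bits // 8)
    # Format each byte as exactly 8 binary digits, keeping leading zeros
    bits = "".join(f"{byte:08b}" for byte in entropy)
    return entropy, bits

entropy, bits = generate_entropy_bits(128)
print(bits)       # different on every run
print(len(bits))  # 128
```

Padding each byte to 8 digits matters here: dropping leading zeros would shift every later bit and change the words that come out at the end.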

Next, a checksum

In order to better secure the seed, we’ll add a checksum to the end of the entropy. This makes it easier for wallet software to validate a backup seed.

To get the checksum, we’ll first take the SHA-256 hash of our entropy. Then, we take the first N/32 bits of the hash, where N is the length of the entropy in bits, and append them to the entropy.

In our case, 128/32 bits gives us a 4 bit checksum size. In our example, the 4 bit checksum will be 0101. We’ll append that to the entropy to give us a 132 bit value:

101111100110010101011101110011110101000111110110101100011101011110111101110001011010011000111101000101000111010000110110111000000101
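Here is a sketch of the checksum step in Python. The function names are invented for this example. To keep the demo repeatable it uses 16 zero bytes as the entropy, which is about the worst entropy imaginable, so it goes without saying that this is for illustration only.

```python
import hashlib

def checksum_bits(entropy):
    """Return the first N/32 bits of SHA-256(entropy), where N is the entropy size in bits."""
    checksum_length = len(entropy) * 8 // 32
    digest = hashlib.sha256(entropy).digest()
    digest_bits = "".join(f"{byte:08b}" for byte in digest)
    return digest_bits[:checksum_length]

def add_checksum(entropy):
    """Return the entropy as a bit string with its checksum bits appended."""
    entropy_bits = "".join(f"{byte:08b}" for byte in entropy)
    return entropy_bits + checksum_bits(entropy)

# All-zero entropy keeps the example deterministic. Never use it for a wallet!
zero_entropy = bytes(16)
print(checksum_bits(zero_entropy))      # 0011
print(len(add_checksum(zero_entropy)))  # 132
```

Note that the hash is taken over the raw entropy bytes, not over the text of the binary string.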

Dividing and our Dictionary

The final step of the process involves dividing our checksummed bits into “chunks” and mapping those chunks to the mnemonic words from the dictionary. The BIP39 standard specifies that the chunks will always be 11 bits long. So, we divide our 132 bit checksummed entropy into 12 chunks of 11 bits each:

  1. 10111110011
  2. 00101010111
  3. 01110011110

Now, each of these 11 bit chunks can be interpreted as an unsigned 11 bit integer value ranging from 0-2047. This “maps” to a word from the dictionary of 2048 words directly! These are standardized and listed in alphabetic order. So, we can take the 11 bit chunk as an index in the dictionary to extract the words we need:

  1. 10111110011 = 1523 -> salmon
  2. 00101010111 = 343 -> cliff
  3. 01110011110 = 926 -> inherit
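The chunking step can be sketched in a few lines of Python. The function name split_into_indexes is made up for this example, and the input is just the first three chunks shown above rather than the full 132 bits.

```python
def split_into_indexes(bits, chunk_size=11):
    """Split a bit string into 11-bit chunks and read each as an unsigned integer."""
    if len(bits) % chunk_size != 0:
        raise ValueError("bit string length must be a multiple of the chunk size")
    return [int(bits[start:start + chunk_size], 2)
            for start in range(0, len(bits), chunk_size)]

# The first three 11-bit chunks from the example above
first_three_chunks = "10111110011" + "00101010111" + "01110011110"
indexes = split_into_indexes(first_three_chunks)
print(indexes)  # [1523, 343, 926]
```

Turning the indexes into words is then a plain list lookup: with the 2048 dictionary words loaded in order into a list (call it wordlist, which is not included here), each word is simply wordlist[index].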

The overall mnemonic we generate in this example turns out to be:

  1. salmon
  2. cliff
  3. inherit
  4. physical
  5. help
  6. type
  7. warfare
  8. regular
  9. dial
  10. photo
  11. asset
  12. scheme

Mnemonics – from Entropy to Dictionary Entries

The process of generating a mnemonic seed is both ingenious and straightforward. One can easily create a secure wallet seed of 12-24 words by generating some entropy, checksumming the data, and mapping to a standard dictionary.

I’ve written a project called MnemonicGen that generates mnemonics using these steps. Take a look at this project to see these steps implemented in Python. This code should be considered academic/experimental – use it to create wallets at your own risk. Other proven implementations such as Ian Coleman’s BIP39 are also available to study.

Happy generating!