This article intends to give a thorough introduction to the fundamentals of blockchains and why they are truly different from previous technology. Everything is explained without prerequisite knowledge, so depending on your expertise there may be entire sections you can skip. There are certainly a lot of details that will be glossed over or entire subjects that are not mentioned, but hopefully this post is enough of a primer such that you can easily explore the rest on your own.
This technology is very powerful and lately the markets which trade blockchain based assets have surprised everyone. It’s hard to predict the future use cases of this technology - it is similar to predicting Google, Facebook, Amazon and Twitter in 1990 when people were just starting to get online for the first time. As we were back then, we are on the cusp of an entirely new way for humans to interact with each other. If the internet was about revolutionizing information flow, blockchains are about revolutionizing finance, business and governance.
- Traditional Databases
- A high level view on the internals of a blockchain
- How computers see data
- Cryptographic Hash Functions
- Digital Signatures
- Accounting within the blockchain
- Consensus Algorithm
- Block Rewards
- The Longest Chain Wins
- Protocol Governance and Evolution
As crazy as it sounds, right now, there are software engineers all over the world building businesses and entire economic systems that operate entirely through software - no people involved. Traditional companies are incorporated and function on top of a traditional banking system because that was the only economic system available. Recently though, breakthroughs in computer science allow us to create economic systems directly from the keyboard. This is a new era of human organization.
To name just a few applications, developers are building currency exchanges, voting systems, prediction markets, lending platforms, insurance companies and services much like Uber, AWS and Facebook.
Blockchains allow a developer to assume a level of trust and authority that typically only a bank, government or corporate law could give to a user. These services can operate without registering as a business entity and they do not need bank accounts to receive payments or pay employees. The simple reason for all of this is that a blockchain can, for the first time, create scarce digital assets. Previously all digital state, like the contents of a file, could be copied infinitely.
Blockchain technology has received a particularly large amount of attention in last 6 months. It’s been almost one decade since the blockchain was invented and now the community working on this technology is blooming. There are many digital currencies today, all with different features and tradeoffs. At the time of writing this, the total cryptocurrency market cap is a little over $750 billion USD. “Blockchain” has now joined the buzz-word list along with machine learning, AI and VR.
A revolution in cryptography over the last decade has allowed for blockchains to exist. Cryptography is a branch of computer science interested in manipulating data in such a way that certain properties can be safely assumed. Encryption for example allows you to secretly transport data which is unreadable without a key. Digital signatures allow you to verify the authenticity of digital statements.
In this article we will explore the elements that comprise a blockchain and why these elements are necessary. I will use bitcoin to help illustrate what a vanilla blockchain does, but it should be noted that today there are many variants on this technology.
Databases are a critical component of pretty much every institution, government agency and technology product you can think of. A database is a special piece of software whose primary function is to store and retrieve data. There are many databases that exist today and some are optimized for speed, scalability or durability. Databases are usually modelled as a table with rows and columns not unlike an excel spreadsheet. You could think of a database like a spreadsheet but it is accessed by a program (and really, excel under the hood is using a database).
Governments use databases to keep records of things like social security numbers, passports, driver’s licenses and who owns what land. A bank will use it to store your account balance. Facebook will use databases to store your posts while a telecoms company will store information about how much data you use so they can bill you. An insurance company will probably use a database for storing information about its customers, their status and any claims they have made. Database models often look like this:
All classical databases are simple computers which means you can tell them to do whatever you want. If you are the administrator of the database, you can change any value at any time. In the old days, databases were big books or ledgers and likewise, an employee of a bank could update or amend the ledger as needed. Similarly, today banks employ software engineers to write code that will modify field in the database including account balances.
Due to this historic property of information (that ye who controls the persistence layer can manipulate any data they want) some common points of trust must emerge. If you are keeping records for yourself, then you have no issues provided you can trust yourself. When you need to keep records for more than yourself things get difficult. For instance, two cases worth discussing are around storing data within an organization and storing data for use between parties.
For the first case, let’s think back to the example where a business (say a bank) is storing account balances (or it could just be anything of value). You have to trust that this business will not randomly change your balance to zero, or increase the balances of others (which would inflate the value of your money away). There may be policies in place to ensure that deployed software that can update the database is not malicious. Perhaps policies that ensure that privileged database access is only for necessary work (perhaps upgrading software or performing maintenance). Even with internal policies, you will have to trust that those exist and are enforced. Permission to run a business, which stores funds like this, is hard to get because the government wants to protect users. This leads to heavy regulation and enormous barriers to entry which has truly stifled innovation in the banking industry.
Note that in this scenario, you not only have to trust the bank, but also the government’s regulation as well as the state’s ability to facilitate fair legal proceeding, in the case that the bank wrongs you.
The second case is manifested in the subject of moving money. We prefer to live in a world where transferring money does not involve one physically giving gold coins to another person. To spend money, the merchant and I need to have access to some data that says, “I have the required amount of funds, and after the transaction completes, the merchant has this money”. Naturally the way you will store data is with a database and so the question quickly becomes, who will be the database administrator?
Let’s imagine Sam is at a pub and wants to buy a pint. To buy the pint, the pub will need to believe that the money they are receiving wasn’t printed out of thin air - this would make Sam’s money worthless. They also need to believe that they received the correct amount of money, and they can use that money for other expenses later on. The way we solve this problem is by trusting some third party, typically a bank, to maintain the database correctly. We trust that they (likely to be an entire network of financial institutions) will subtract the cost of the drink from Sam’s balance and add it to the balance belonging to the pub.
Anytime you want to send money around the world in electronic form, you will need a trusted third party. This is the entire business of Western Union and a lot of what makes a bank a bank. Through the whole digital age this was the case until 2008 when the bitcoin white paper was released anonymously. It gave the blueprint for a peer-to-peer cash system which required no third party. The great innovation was a novel way to use a lot of cryptography in concert to create a new type of database which had no single administrator. Instead, that supreme privilege required to change the values in the database would be delegated to a collective majority of the users of the network.
Today, this magical database is usually what people are talking about when they say the word “blockchain”. The author of the paper, under the pseudonym Satoshi Nakamoto, went on to write most of the initial code and kicked off the network with the open source community on January 9, 2009.
The quick answer to the 3rd party trust issues we outlined above is that now this trustless database can be used to do the record keeping that we needed. A bank or the Visa network is no longer required to move money. Updates to balances are allowed only if they make sense according to some hard coded rule set (for example, they don’t print money out of thin air). When I pay for a good, the merchant normally knows that I am not forging money because they trust the bank. If I pay in bitcoin, they believe that I am not forging money because it would be computationally infeasible for me to do so. Effectively we have replaced trust in the institution or government with trust in mathematics, physics, and time.
The term hard coded rule set in the above paragraph is important, and bitcoin is largely defined by this rule set. Keep this idea in the back of your mind for now and we will get back to it when we talk about ethereum (a more generalized blockchain) in the following post.
The blockchain was a step function difference in database technology and it laid the foundation for all the digital assets you see today. This does not mean that blockchains are better than all other databases. Blockchains are very useful in specific circumstances and wildly inefficient for pretty much all other circumstances. The next section will explain how a blockchain can be used as a database.
A high level view on the internals of a blockchain
A blockchain uses many pieces of cryptography. First, I will explain the individual parts, and then I will illustrate how the whole is greater than the sum.
Single functions that do cryptographic operations, known as crypto-primitives, are used to implement the layers in a blockchain (which are identified in the diagram above). Crypto-primitives are to blockchain as arithmetic operations are to mathematics. Addition, multiplication, and exponentiation are single operations and using them together allows for more powerful objects like polynomials.
Each crypto-primitive is simply a function which takes an input, does some computation, and produces an output. The interesting part is the properties they guarantee between the inputs and outputs.
The primitives that we are interested in are hash functions and digital signatures. Each will be explained below in their respective section.
How computers see data
Since these functions will operate on data, it is important to remember how computers see data. Computers only understand binary data which is in the form of zeros and ones. Even this text on the screen that you are reading is in zeros and ones (known as bits) somewhere in the memory of your computer. Text, modeled as a sequence of characters, is called a string in software engineering. Each character in a string is represented by a number (A = 65, B = 66). Those translations are shown in this table, known as the ASCII table.
As you might already know, bits can be grouped together, 8 at a time, to create bytes. Each byte can itself be represented as binary (zeros and ones), decimal numbers, or hexadecimal numbers (explained below). These concepts will be used throughout the next sections.
Cryptographic Hash Functions
Hash functions are interesting creatures. They allow you take a snapshot of data, or fingerprint it, so that it can be uniquely identified. These functions take 1 input and give you 1 output. You could model them like this:
or this in binary, which is what the computer really sees:
So what just happened? Basically the hash function takes the input data, mashes it up, and gives you something that is unrecognizable. The function is static, which means that if you give it the same input you will always get the same output. “SHA256” is one type of hash function. We could each compute the SHA256 of “golden ticket,” and we will both get the same value.
sha256("golden ticket") = a5cf75865b718bf518d2e66c9a3ccc07e31a4aa5f1b417c4bbe448a607ec083e
You can try that now by typing into the box below and it will compute the SHA256 on the fly.
The output of the hash function is shown to you in hexadecimal form. Hexadecimal is counting in base 16 and is commonly used to represent data in computer science. In decimal (base 10), the way we usually count, the digits go from 0 to 9. This gives us 10 different symbols to use before we run out and have to start using another column. With few exceptions, all human civilizations chose to count in base 10, which is likely because we have 10 fingers.
In hexadecimal we have 16 symbols to play with. We take 0-9 for the first 10 symbols and then rather boringly, we delegate the duty of the remaining 6 symbols to the letters A-F. So 6 in decimal is just 6 in hex but 10 in decimal is not 10 in hex since we still have some characters left to blow before we resort to adding another column.
We use A for 10, B for 11, C for 12, and so on … until we run out of course. If we want to go beyond F (15 in decimal), we must record that we have maxed out the ones column once by putting a 1 in the next column and put the counter for the ones column back to 0. So the number “16” in decimal is “10” in hex. 1 is in the 16’s column so we take one 16 and we have a 0 in the ones column so we take none of those. The number
3C means take 3 16’s and C amount of 1’s. So . Okay, back to hashes.
If you change the input to a hash function ever so slightly, by just a single bit, the output will change dramatically and in an unpredictable way. For example, here are the 2 hashes for “charlie” and “charlid” which each only differ by a single bit as you can see from observing the binary of the input.
Their outputs are not related at all and also seem to have no relation to the inputs. They are completely random.
The other thing to note is that every output from the SHA256 hash function is always 256 bits long. The input on the other hand can be anything. It could be the letter “a” or it could be the entire text of Gulliver’s Travels. Both will give unrelated 256 bit long outputs.
This leads to an important property of hashes known as preimage resistance. This is a fancy way of saying that if I give you an output, there is no way for you to efficiently find the input that gave me this output. Efficiently here means that if you had all the computers in the world, bruteforcing the problem by just trying every input until you find the right one, you would not succeed for millions of years. For example, I challenge you to find the input that gave me this hash:
You can try with the tool above - but please don’t; it’s a waste of time.
This is a crazy property, but it makes sense. Consider the output which is 256 bits long. This gives you possible outputs. To find the correct output with a 50% probability you will need to try at half that many inputs on average. is such a massive number that it’s incomprehensible. For perspective, it’s a bit smaller than the number of particles in the observable universe, ~.
There are also other properties of hash functions, but we won’t need them. For now, we just need to keep the concept of a hash function in the back of our minds. Later on, we’ll see how these get used.
Digital signature are also pretty magical. Historically, they used some clever properties of mathematics, but we will focus on the properties and not dive head first into number theory. A digital signature scheme is a way for someone to verify that you intended to do something - quite like a signature you may use on a legal document. The difference is that signatures with pens are incredibly forgeable. I just need to see your signature once or could probably guess what it is based on your name. Furthermore, there is usually no way to verify a signature, and if there is a reference, that reference reveals how to forge it. It is shocking that banks still regularly use signatures for authorization.
A digital signature on the other hand is impossible to forge, and there is a convenient asymmetry such that giving someone a signature does not reveal how to generate more signatures.
Glossing over some detail here, we start by generating 2 large pieces of data that are random but related. We call one of them the private key and the other one the public key. These are large enough that it will be impossible for anyone to guess the values you generate. You can press the button below to generate some key pairs.
They certainly look random but not obviously related. Under the hood, the private key is used to generate the public key, but you cannot do the reverse operation of using the public key to generate the private key. Sometimes people use analogies with paint to describe asymmetric cryptography. If the private key were a color, let’s say blue, and I had some base white color, I could create light blue by mixing the white and the blue - this would be our public key. But with light blue, I cannot get back to the original blue; I cannot unmix the colors.
Once we have a key pair, we can announce to the world that we own this particular public key and anyone who cares will record that fact. One day, we decide that we want to announce a scheme to the world where we will release 5 golden tickets - how will the public verify that we actually did decide this?. We take the message (string) “I am going to release 5 golden tickets” and sign this data. That means we do some mathematical calculation involving our private key and this message.
If the message is too long, we actually cannot produce a signature due to the nature of the underlying number theory. To get around this, we can use the hash function introduced above to map all messages to a standard 256 bit size.
The signature produced is computationally unforgeable. Someone who does not know our private key will never be able to generate a valid signature. What is valid? A signature is valid if it can satisfy a verification algorithm which can be run against the public key the world already knows. This is how someone can be sure that you did intend to release that message.
Verification of signatures require another algorithm, abstracted away in the box above, and it will tell you if the signature is correct or not.
Now you may ask, why don’t we just tweet “I am going to release 5 golden tickets” and then the world will learn the same. Well yes you can do this, but how do people who go to twitter.com know that they are getting their feed from the real Twitter servers? Actually, through the same system of digital signatures! When Twitter sends you your feed, they sign that data and your browser checks the signature against a list of public keys that it already has. If the signatures pass verification, you get the green lock in the top left of the URL bar (exactly like you see on this site now). This is the basis of the Transport Layer Security (TLS) system that secures the web today.
All digital currencies are what you might formally call a distributed system. This is a system which is realized when a bunch of computers all follow a certain protocol. Rather than the system be a single program run on a single machine, the concert of computers work together. From this you get an emergent system - similar to many employees running a company but no single employee providing the service of the company on their own.
Anyone who would like to partake in this system can volunteer their hardware and run free software to participate in the network. The software that speaks the protocol is usually called a node. Computers speaking the same protocol can communicate to each other in meaningful ways just like humans speaking the same language.
This idea is not unfamiliar to you. For example, there are many servers and web browsers (Chrome, FireFox, Safari, etc) which all speak the protocol HTTP. The internet is also an open network, so anyone can set up a server speaking the HTTP protocol, which will allow any HTTP browser to talk to it.
All nodes for a particular cryptocurrency can talk to each other by implementing the protocol that is defined by the currency. To help the network stay in sync with the latest information, nodes will relay information they learn from other nodes. I want you to imagine that whenever someone wants to make an update to the database/blockchain, they input this change into their node which then tells its immediately connected nodes which then tells more nodes and so on.
For example, it might be a transaction moving money from Sam to Jules and so Sam will broadcast a statement like “I am giving Jules 10 of my coins.” That message is then relayed from Sam’s node to its peers and then on to their peers and so on until the whole network has heard about it. We call this the gossip protocol.
Information propagation like this is the most important part of blockchain networking and is necessary to understand how information moves in a decentralized system. In centralized systems, for example Facebook servers, information (like a status post) will flow from your laptop directly to the servers and when a friend of yours loads their feed, that post goes from the Facebook servers to their laptop.
Accounting within the blockchain
A blockchain will only accept updates if the proposed update satisfies some specific rules. The word accept here is a bit strange because in traditional databases, any update would be accepted. If you have the correct credentials, the computer will do as you tell it. Blockchains are different because each node in the network is programed to first check if the update satisfies particular rules. You could choose to have your node not follow these rules, but then you are probably going to allow people to defraud you, and your version of reality (who has what balances etc.) will be distorted from everyone else’s. Instead, by everyone accepting and rejecting the same updates to the database (i.e. implementing the same protocol), everyone shares a synchronized state.
To keep it simple, let’s use the bitcoin blockchain as our example blockchain. The rules it enforces on updates are concerned with keeping account balances sound - something you would expect from a currency centric blockchain. Remember that sending or spending money in this system, just like normal record keeping in banks, means subtracting from one balance and adding to another. The two most important rules are:
- Thou shalt not create or destroy money. - you can only spend what you have.
- Thou shalt spend only thy money (not anyone else’s) - only subtract from your balance.
So in the networking section we learnt that when a transaction is made, all the nodes will hear about it. They then run some checks to make sure the transaction is legitimate. In our example of Sam sending coins to Jules, they would ask:
- Did Sam have at least the amount coins being spent?
- Did Sam mean to send the coins?
We now translate these two English statements into cryptographic statements which a computer can verify.
The answer to the first question is simple - due to the design of bitcoin every node has knowledge of every transaction ever made. A node can look at the history of the ledger to see if Sam has the coins they are trying to spend. Sam will have enough coins, provided that they received at least 5 coins in the past. Similarly, the money in your bank account is the sum of receiving it from somewhere else.
There isn’t an account called “Sam” in the blockchain; instead there is, in this case, a bitcoin address that you can think of as an account number. It is an identifier for an account or a place where the bitcoin is located. A bitcoin address looks like this:
It is a random string of letters and numbers and you create it by taking the hash1 of a public key.
Let’s walk through what happens when Jules wants to receive bitcoin from Sam. Jules will first generates a bitcoin address and then send that string to Sam. Oh yeah, by the way, Sam and Jules are these 2 fairies …
Sam must already have some coins at an address that they control, and we can assume that they got those from previous transactions. Control means that Sam has the private key for the address.
The transaction that Sam will generate is just a statement saying that, in this case, 5 coins will move from their address to Jules’s address.
The raw transaction shown here can be produced by anyone and does not prove any authenticity. For the transaction to be honoured by the network, Sam would reveal the public key behind the address and generate a signature for the transaction with the private key corresponding to that public key.
Next the transaction would get broadcasted and the network would learn about it through the gossip protocol. Each node in the network will then ask the questions necessary for this blockchain to be updated.
The first check each node will perform is to ask, does
13j8VaRTAnbsKHqu3hHAj8KdPXRRvr8r9e have at least 5 coins to send? Provided the node is synced with the network, it can look back through the history of valid transactions to calculate the current balance on this address. If there are currently at least 5 coins at this address, we go to the second check.
Second check - did address
13j8VaRTAnbsKHqu3hHAj8KdPXRRvr8r9e mean to send these coins? To answer this question we use the digital signature that came with the transaction. Remember that digital signatures are checked against a known public key. The public key is provided with the transaction but we also need to check that this public key corresponds to the address sending coins because accepting a signature from a random key would prove nothing about authenticity.
Is the public key provided actually the input for the hash that made this address?
Does the signature verify correctly with this public key?
Yes it does. You could therefore say that the owner of the address has signed the transaction to send funds from this address, which means it must not be a fraudulent transaction.
Cool, at this point we have a system which is sound for sending and receiving coins. This is quite like normal money transfers; however, there is still a vulnerability. Sam, now wearing devil horns, could make two transactions for the same coins. The first transaction can send 10 coins to Jules as expected. Another transaction can be made sending the same 10 coins to another address that Sam owns. For simplicity in the diagram, I will replace the addresses with names.
Since Sam has the private key for the Sam_1 address, they can make valid signatures for spending all 10 coins twice. The network will run the checks on both transactions to ensure validity and they will both pass.
If each node was to calculate the latest balances based on all valid transactions, they would see that Sam has now minted 10 coins out of nothing. This has inflated away the value of everyone else’s coins slightly and by sending to themselves many times they could print infinite money!
Luckily, a node has a simple fix for this attack - just arbitrarily pick one transaction and reject the other one. Now Sam cannot print money.
This works, except what if different nodes select and reject different transactions? It doesn’t matter which transaction is rejected, but if different nodes reject transaction A and other nodes reject transaction B, the network will have split versions of reality. In this diagram, by randomly picking a transaction, the nodes end up with different balances for each address.
This would make bitcoin pretty useless if no one could agree on any balances. Bitcoin has no hierarchy and so there is no authority for the nodes to turn to - this would be a centralized system otherwise. We solve this infamous double spend problem in the next section with something called a consensus algorithm.
A consensus algorithm is a methodology of definitively picking which version of reality the group will accept. We have a current state that everyone agrees with and then the network has to make a choice between following branch A or branch B for the next update to the database. Which branch is picked does not matter, but what does matter is everyone selecting the same branch simultaneously.
Now, it turns out that making these decisions takes a fair bit of time and this is necessary for security reasons (we will see why a bit later). In the Bitcoin network this takes 10 minutes and in Ethereum it’s about 14 seconds. To facilitate more throughput than just 1 transaction per 10 minutes, we will actually bundle transactions into what we call blocks. This is what the consensus map looks like when we have blocks of transactions.
It’s probably worth noting that the same transaction might appear in multiple candidate blocks which is fine. If a transaction is in every candidate block, then it is assured to be confirmed in the next round. A block will not have conflicting transactions (like the double spend from Sam) within it because that entire block would get rejected from the network since it print money and violates the bitcoin rule set.
So how does the network synchronously make the decision about which block will become the next chapter in bitcoin’s history? Some set of voluntary nodes in the network will select a candidate block and then engage in an activity known as mining. We will see why it is given this name a bit later. Normally in centralized systems, collective decision making is not a problem, because it’s not a collective decision. One authority or server just makes a decision and the rest follow.
Mining is the act of spending time and energy computing the solution to a problem, based on the chosen block, and then presenting this solution to the network. This problem is kind of like winning at bingo. You play and play and play and then eventually, someone randomly wins. When someone wins, they broadcast their solution (in the bingo example, they show everyone at the table that their card has a sequence of matches) and then whichever block they had curated is the one the network chooses. Who wins is just random (as explained below), and that’s fine. We just need someone to definitely be given the conch for a split second so that they can tell the network which way to go. After that, everyone goes back to the next round of mining again - the conch is passed along.
It’s almost as if blockchains manage to remove the central authority by delegating the power of a central authority randomly for a brief moment, and then it is assigned randomly again.
What is this bingo like problem that miners are solving? The problem proposed is to find a value, that we call a nonce, which will allow for to have some number of leading zero bits (“000…..000001”) in the equation2 below.
On the left, we have . simply means concatenate - so for example “abc” “xyz” is “abcxyz”. A nonce is just some number (or more precisely some bytes) which we put on the end of the block. Hash the whole thing to get and then check if it has the properties we are looking for. Namely, the property we are looking for is that , when put in binary form, has leading 0s. could be , , - just some reasonable positive integer. The nonce cannot be found algorithmically beyond just trying values until you get the right one, “Is it 0? Nope, try 1, nope, try 2, nope, try 3, nope …”
Imagine you’re a miner and you have chosen the block in the diagram below - try to find a valid nonce. I have hardcoded the block below and you can pick a nonce in the box. See what it takes to get at least 3 leading 0s for the block.
2 leading zero(s)
2 leading zero(s)
Each bit has a 50-50 chance of being a zero. So for 3 zeros, on any one attempt, you are looking at a of % chance of getting something useful. Try some nonces until you get something valid.
Did you get it? Turns out, that in this case, you get at least 3 zeros for value . So is a valid nonce, but not the only one - there is another valid nonce at for example. There are actually an infinite number of solutions, but we only need one.
Since computers can try much faster than a human clicking a button, we ask computers to find a nonce for us, and to get an idea of what the difficulty is like in real bitcoin mining, we are looking for a hash with ~70 leading zeros.
Notice that while finding a valid nonce might take minutes, it’s instant to verify the solution. If someone tells me is a valid nonce, I can put it into the algorithm, and in one operation I will see if it gives an output of leading zeros.
So this is what mining is - brute-forcing a solution to this game of randomness. When many people are doing it, someone will win before everyone else. The chosen one then broadcasts the valid nonce along with the block they curated. If the block is valid (all signatures are valid, no conflicting transactions) and the nonce is valid, we accept it as the next block and new history is written.
This is how we keep everyone’s understanding of the current state of the world in sync. Someone randomly gets momentary privileges to tell the network what’s next, and provided that what they proposed is valid by the rules of the given blockchain, everyone follows the decision and the process repeats. A valid nonce is also often called a proof of work since producing it required some intensive computation to be done. This necessarily slow proof of work also ensures some natural rate limiting for people claiming the conch and the rate at which new coins are minted.
So now we have a way to ensure:
- That transactions require authorization.
- No one can mint money fraudulently.
- Attempts to double spend will not result in network splits into different mutually exclusive versions of reality.
Miners spend money on hardware and electricity to serve a necessary function of determining which block will be the next block. To incentivise people to mine, every time a miner finds a valid nonce, the community actually awards them some cash and this payment is called the block reward. When a miner submits a nonce, they also announce an address where these new coins will appear out of thin air. The network agrees (by partaking in the system) that these coins now exist. The first balance of this address will be the size of the block reward. Let’s assume the block reward is 50 coins.
If you go back far enough in history you can trace all coins back to this special genesis transaction. Since mining requires time and energy, there is a natural rate at which block rewards are given out, and it’s not something you can just generate en mass and therefore print lots of money. Really you’re converting time & electricity, 2 things bounded by universal laws of physics, into money.
Think back to when I said that for Sam to send coins to Jules, they had to have previously received coins. Where do the coins initially come from then? Mining is the answer.
The term mining was chosen because time and energy are required to generate this valuable asset, quite like mining gold out of the ground requires time and energy. Profit from this industry comes from selling the coins that are created through this work and only in some circumstances is it profitable. Cryptocurrency mining tends to be concentrated in China and Iceland because electricity is especially cheap in those countries.
It’s worth noting, that a built-in feature of bitcoin and many cryptocurrencies is that the block rewards exponentially decays. So in bitcoin, we halve the block reward (called the halvening) every 4 years. In the early days, the bitcoin block reward was 50 coins but back then, bitcoin was also worth much less and it was very unclear if bitcoin would survive the next month. It’s a cool system because it incentivizes early adopters to take a leap of faith, but as the system matures, this reward gets less and less. Hopefully the value of the coins is also going up as the system matures and so the USD value of the block reward kind of maintains itself.
The Longest Chain Wins
There is still another class of attacks that Sam can pull off! We need a few more things to make our blockchain secure.
When you join the network and receive updates from others nodes, how do you know that the data you’re given is legitimate? The node hierarchy is flat so anyone’s word is as good as anyone else’s - there is no central authority. If you have been on the network for a while and you’re are told information that conflicts with your recorded history, that’s confusing and maybe you can call this out as malicious - but how do you know the original history you received was correct in the first place. If you’re new to the network, well that’s even worse. You have no idea what’s correct and you have to trust what your peers tell you about the current state of the network.
Let’s imagine Sam goes rouge and tries to disseminate false history. This could be beneficial if for instance, they have already spent coins in the past to purchase something, but then change the history to show that those coins were sent elsewhere (probably to an address they control).
Perhaps nodes could take a poll? The history they accept could be based on a survey of the network and whichever history is most frequent is the chosen one.
This is not a very good plan since deploying computers on a network is trivial and cheap. This makes it easy to tip the vote in your favor. If the difference between A and B is a billion dollars, you will go out and buy all the machines necessary. Better yet, with Amazon Web Services et al, you can deploy thousands of computers from your keyboard. This attack is called a Sybil attack.
What we need is a system which allows you to determine which of two proposed histories is more accurate. The indicator should be something very expensive to forge. We know proof of work is costly in time and energy - is this enough?
Sam could forge a proof of work for each of the new blocks they claim are part of real history since it only takes about 10 minutes. Not a huge penalty at all. In the diagram above, Sam mines a new nonce for only the modified block and then inserts it into the history.
So to solve this issue, the protocol says that each block is to be linked to its previous block so that you cannot just swap blocks in and out.
This is where the chain part of blockchain comes in. The way we will link blocks is through our old friend, the cryptographic hash, and this will force a sort of avalanche effect where by changing a block in any way, will cause the next to change and so on.
First, we need a way to fingerprint or summarize the data in the previous block. Hash function sounds perfect, so first we will take a hash of the whole block (it’s a bit more elegant than this in the real thing, but this is good enough for explanation).
We then include this hash in the next block so that the previous block is immutably part of the next.
The goal of this linking is to ensure that if we are at block , and you want to change a block , you will have to recalculate the nonces for block , and . Changing anything about block will result in the nonce needing to be recalculated and also a new block hash. This hash must be part of the next block, which means data in block has now changed, and so you need to calculate another nonce. This cascading effect will happen up to the tip of the chain.
See this avalanche effect happen when you change the contents of the block in the diagram below.3 We will assume that the mining difficulty is 5 zeros.
5 leading zero(s)
If all the nonces become invalid, the network will reject the proposed blocks, and now to fix this, Sam will need to spend time and energy finding each nonce again. That is not so bad. Each one takes 10 minutes in bitcoin, so this will take 30 minutes to compute three blocks worth. Sam probably does not control all of the network’s hash power so it will really take them much longer, but it can be done.
The challenge for Sam will be generating 3 nonces quick enough because blockchain clients will also follow an important rule - “trust the longest chain.” 4
In the time that Sam was recomputing nonces for blocks , and , the network also progressed another 3 blocks. To a new node joining the network, they are presented with Sam’s chain which is blocks long, and another chain blocks in length. If the node follows the rule that the longest chain is the more valid one, then they will not be duped by Sam’s chain.
If every node follows the longest chain, then someone’s ability to change history is drastically reduced. They would somehow need to produce valid blocks faster than the rest of the network. This is possible, but that would require them to have 51% of the computer power on the network.
Since mining is an economically competitive field, many people join the mining arms race so that they can earn more and more. This makes getting 51% of the mining power extremely difficult and likely to cost much more money than you will get out of changing some ledger history. Recall that the block reward is a fixed amount of bitcoin. This means that the amount of mining power invested into the network is directly proportional to the value of bitcoin; because the more valuable bitcoin is, the more people will invest in mining rigs.
The incentive to attack the network is also proportional to the value of bitcoin since this determines how much wealth you can gain from the attack. If it’s not profitable to pull off a 51% attack today, it might be later on if the bitcoin price doubles for instance. Quite elegantly, since a higher bitcoin price will also mean more money invested into mining, the security of the network moves proportionally to the incentive to attack the network. This is brilliant because it means that a 51% attack, where you buy enough compute power to mine faster than everyone else combined, is likely to never be worthwhile. This should be so expensive that you would have been better off keeping the cash rather than spending it on computers and electricity.
Quick side note, since people have often asked me who sets the price of a given coin - the exchange rate of any digital currency is just supply and demand. There are lots of cryptocurrency exchanges and they have order books which reflect how the market values each currency.
This longest chain rule is the final nail in the coffin for credible attacks against blockchains. Blockchains maintain their security out of an amazing balance of game theory and cryptography. Amazingly, it works and there is one more subtle kind of defense in the system too. Since the value of the coins are derived entirely out of belief and trust in the system, if you did attack the network and someone noticed (and that is very likely since it’s a public network), you would drastically reduce the value of the coins and so your heist might become pointless. This is not a core defense, but it is a nice bonus. If anything, it forces people to improve the system and innovate so that the coin rises in value even more.
Protocol Governance and Evolution
There is probably one last topic worth mentioning about blockchains which is that they are fully open source and community driven. This is a boon to innovation, experimentation and progress. In theory, anyone can propose edits to the protocol and even write the necessary changes to the codebase. If enough of the community likes it, they will download the latest software, update their nodes, and the blockchain has evolved slightly.
It’s very empowering when anyone in the community can improve the system used by all. Not many products are like this and especially in gridlocked political systems, progress can be very slow. In reality though, there are usually large debates and many conflicting interests for significant changes to a cryptocurrency. The pragmatism and ideals of each the community varies from currency to currency; but at worst case, if you believe a group of people are better served with a change that has not been accepted by the community, you can do what’s called a fork.
A fork is when you propose a change to the network that will be incompatible with the previous network after the change is made. At the time of the fork, the history between the two currencies is perfectly shared but from that point on they will diverge.
You could copy the original code base, make the necessary changes and then your followers will download and run your version of the software. This allows for everyone to choose the network that best fits their needs and ideals.
For example, there are 2 bitcoin currencies now - bitcoin and bitcoin-cash. Bitcoin is quite conservative and pretty resistant to most change, which may be good for stability and long term stores of value. Bitcoin-cash wanted to increase the maximum block size - this would allow for more network throughput and lower transaction fees.5 There is an exchange rate between the 2 currencies which we could arguably see as the market perceived value of each coin.
The main point to note is that that people now have a choice of which system to use and each community might be stronger on their own now that they can cater to their own needs rather than trying to serve lots of different interests simultaneously. This may explain how after the fork the sum of the prices of the two coins was larger than the original bitcoin price pre-fork!
Think of all the other financial systems out there and note that they are all run by governments. Typically citizens of a country cannot just chose an alternative financial system nor can they propose direct changes to their system. At best, if you’re lucky enough to live in a democracy, you can nudge politicians towards certain policies. The ability to innovate freely with economic rules and choose the financial system that works best for you is part of the beauty of cryptocurrency networks.
Sometimes people say bitcoin isn’t backed by anything, and therefore it cannot have any value. Well, it’s an interesting comment since neither are US dollars backed by anything (in 1971 the gold standard was ditched). Effectively US dollars, a truly fiat currency, are backed by a belief that someone else will accept it as valuable later on when you go to spend them. On a macro scale, this is reflecting trust that this resource is scarce (hopefully not anyone can forge US cash or update their bank accounts) and that the US government will not collapse.
I think, in a similar way you could decide that bitcoin is backed by mathematics, time and physics. The currency cannot be forged unless you have some magical access to lots of energy or you have a time machine. To be disjointed from the political system and backed by something untouchable by humans is a pretty cool quality for a currency.
That was a lot! Unfortunately there are lots of moving parts but blockchains, amazingly, do work!
When put all together in the right way, we get amazing emergent properties:
- We can definitively store data without relying on any single entity being the dictator of truth.
- We can update the data only in a way that everyone agrees to in accordance to some rule set useful for the given application (did they have the coins?, did they want to move the coins? …).
- We can believe that the data in the blockchain is correct because of the properties of cryptographic functions and the economic incentives of a majority of nodes to be doing the right thing.
For most of the explanation thus far we were using bitcoin as an example use case of a blockchain. Blockchains are updated according to some rule set; bitcoin has a rule set useful for a currency. In the next post we will talk about Ethereum, a platform designed to make it easy for developers to leverage the powers of blockchains. The hard coded rule set that made bitcoin bitcoin, will become programmable. This drastically reduces the amount of work required to use a blockchain in an application.
Thank you so much to Sam, Margaret, Brittany and Ana for reviewing my post.
It’s actually multiple hash functions (including our friend SHA256) put together. ↩
The real equation it a bit more complex. There are fields which have been left out for simplicity and the full block is not used but instead a fingerprint of the block (done via more hashing). ↩
Fun fact: I actually had to mine these blocks with a script to make the initial state in the diagram work. Of course, how else would you get a valid proof of work? ↩
Technically, it’s not the longest chain but the chain with the most work (proof of work work) invested in it. Just going after the longest chain is not secure because an attacker could tell you that the history of the chain uses a much weaker proof of work (less leading zeros) and this will make it easier for them to generate blocks faster than the rest of the network. ↩
I didn’t talk about transaction fees anywhere, but the short story is that with each transaction you burn a little bit of money in a fee which is given to the miner for their work. Miners therefore earn coins via both the block reward and fees. Fees are necessary to stop spam and when the block reward gets really really small, it will probably account for most of the miners’ income. Fees vary widely between networks - bitcoin has a small blocks and the network is overcrowded, fee for the average transaction might be around $15. Bitcoin-cash has fees under 1 cent. ↩