We Need To Talk About Blockchain(i)…
and why it’s all hot air…(except, just sometimes, when it maybe isn’t.)
The blockchain juggernaut may be slowing slightly as “Data Science” takes over as the latest shiny tech bauble on the street, but it is still only just past the Peak of Inflated Expectations and certainly still some way off the Plateau of Productivity (cf. Wikipedia “hype cycle”). Given the number of invitations I keep getting to learn all about it, and how it’s “the most in-demand computing skill on the market” right now (I kid you not), it’s tempting to climb aboard the hype train. As someone who once (briefly) made a living building Expert Systemsii I am well aware of the risk of overselling new tech, so it’s high time we critically appraised the technology and its purported application areas.
I’m not saying that blockchain is intrinsically bad; I’m saying that it’s not a universal panacea, and that there are many charlatans and Snake Oil Salesmen out there trying to make a quick buck out of our ignorance, our susceptibility to advertising and our fear of being left behind. Secondly, and perhaps most importantly, much of the press conflates blockchain and cryptocurrencies. One is a technology with undeniable merits. The other is a currency/commodity (you choose) that may or may not leverage blockchain. The conflation is an error, but because of the prevailing noise I’m going to focus here on picking holes in “blockchain” in the context of cryptocurrencies.
For the sake of clarity I should also say that I’m not professing any deep intuitive understanding of how to solve all the blockchain conundrums, so don’t expect a divine revelation in the last quarter (although I’ll save my favourite issue until last). What I’m trying to do is enumerate the issues and then look at what is being done to address them, not prepare my own batch of Snake Oil. Not every issue applies to every blockchain implementation, but most apply to many. All of these issues have been discussed elsewhere, just not necessarily as a thumbnail “here’s the list of dumbass issues we have to fix”. Nor is there any guarantee that they can all be resolved in a way that lets blockchain be deployed for every use case it is posited for. There is a plethora of blockchain implementations around, but I’ll stick largely with Bitcoin and Ethereum for my examples since they’re the most widely known. So let’s get on and fling some dirt at those two. Note that these are both cryptocurrency examples, so the commentary is skewed accordingly.
The Ugly List
Proof of Work (PoW)
PoW is a fundamentally bad idea: not scalable, and an all-round bad thing for the planet. Thousands upon thousands of servers burn up the earth’s resources just to verify transactions, secure the network and mint another bitcoin. It isn’t scalable, sensible or viable long term. The ‘work’ has to keep getting harder at a rate comparable to Moore’s Law and the number of participants, otherwise it’s no longer ‘hard’. That’s just dumb. Calculating a nonce (the cryptographic one, not the pejorative British variant) is designed entirely to equalise the likelihood of a bunch of computationally similar servers hitting the jackpot. That’s it – totally artificial. Andrew Tayo gives as good an exposition as any as to why PoW is a dumb idea here. (2022 Addendum: as of this year Ethereum has stopped using PoW and switched to Proof of Stake. Kudos to them, although there’s no perfect solution to this conundrum.)
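To see just how artificial the ‘work’ is, here’s a minimal nonce-search sketch in Python. It is illustrative only: real Bitcoin mining double-SHA-256 hashes an 80-byte block header against a numeric target, but the brute-force character is the same.

```python
import hashlib

def mine(block_data: bytes, difficulty: int) -> int:
    """Search for a nonce whose SHA-256 hash of (data + nonce) starts
    with `difficulty` hex zeros. Purely illustrative of the brute-force
    nature of PoW, not a real Bitcoin miner."""
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(block_data + str(nonce).encode()).hexdigest()
        if digest.startswith(target):
            return nonce
        nonce += 1

# Each extra zero of difficulty multiplies the expected search effort
# by 16 -- the 'hardness' is tuned upward as hardware improves.
nonce = mine(b"block payload", 4)
```

The only output of all that hashing is a number whose sole property is that it was expensive to find.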
CAP Theorem
Blockchain isn’t magic. It’s still bound by some of the rules of physics, or distributed computing at least. Let’s consider the CAP Theorem, aka Brewer’s Theorem. As a refresher CAP stands for Consistency, Availability, Partition tolerance. Pick any 2 from 3 (or 1 from 2 as we’ll see in a minute). You can’t have all of them.
- Consistency: Every read receives the most recent write or an error.
- Availability: Every request receives a (non-error) response, unless the node has failed.
- Partition tolerance: The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes, ie Consistency and Availability persist.
But the 2 from 3 thing is only a big deal when there’s a network error, ie a partition in the network. Then you end up choosing between consistency and availability. Imagine you have 2 blockchain nodes, NodeA and NodeB, which lose the connection to each other, so the network is partitioned. Then a new block X gets written to NodeA. Any read from NodeB should return X. However, the network is partitioned, so if NodeB returns the latest block it has (say the previous write, X-1), it is not consistent. Alternatively it can wait to confirm the latest block written, but then it’s not available. The only other alternative is that block X gets communicated to NodeB, but that requires NodeB to be able to see NodeA, which means the network is not partitioned after all. So really it’s a choice between Consistency and Availability.
If you have to be consistent, and that’s a pretty fundamental tenet of blockchain, and cross-nodal consistency cannot be checked, then an error must result. But ‘Availability’ asserts that you cannot receive an error response. That’s a problem.
Let’s reiterate. You can only assert that a blockchain is consistent following the addition of a new block when you know that every copy of the blockchain is updated with the same new block. But if there’s a network break the consistency cannot be checked, so it’s unknown, and so either cannot be consistent (with universal confirmation) or cannot be available. That’s a problem.
As soon as you get a network partition blockchain stops because the nodes can’t collectively validate an addition. To avoid inconsistency blockchain has to stop everywhere. That’s a big problem.
It’s not a total disaster: as Yaron Goland showsiii, a blockchain can be made AP or CP, but blockchain has no special sauce to circumvent the CAP theorem.
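The trade-off can be made concrete with a toy two-node model (illustrative Python, nothing like a real node implementation): once the link between nodes drops, a read must either return possibly stale data or refuse to answer.

```python
class Node:
    """Toy blockchain node: stores a list of blocks and knows one peer."""
    def __init__(self):
        self.chain = []
        self.peer = None
        self.partitioned = False

    def append(self, block):
        self.chain.append(block)
        if not self.partitioned:          # replicate while the link is up
            self.peer.chain.append(block)

    def read_latest(self, prefer="availability"):
        if self.partitioned and prefer == "consistency":
            # CP choice: refuse to answer rather than risk a stale read
            raise TimeoutError("cannot confirm latest block")
        # AP choice: answer, but possibly with a stale block
        return self.chain[-1] if self.chain else None

a, b = Node(), Node()
a.peer, b.peer = b, a
a.append("X-1")                       # replicated to b
a.partitioned = b.partitioned = True  # the network splits
a.append("X")                         # b never sees this block

b.read_latest("availability")    # returns "X-1": available but inconsistent
# b.read_latest("consistency")   # raises: consistent but unavailable
```

Whichever branch NodeB takes, one of the two guarantees is gone for the duration of the partition.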
Growth
If every machine has to maintain the whole blockchain there’s a limit to its size. Typically distributed storage is used for one of two reasons: either to site the data close to where it will be used, or because there’s too much to store in one place/server. The latter option is off the table if everything has to be stored everywhere.
There are other issues. Although the Bitcoin blockchain can only grow by about 52.5GB per year (one 1MB block every 10 minutes), ensuring universal consistency is still expensive, since unless there’s some ability to shard you have to store the whole chain at every node. If you shard you no longer have all the information, so you can’t claim to know everything. Given a network containing n copies of the blockchain, each time you add a blockchain copy at a new node the total cost of confirmation effort in compute increases linearly from nt to (n+1)t (if all the nodes’ compute times are added up).
Each time you add a block to the chain the number of blocks stored across all copies increases by n+1. Distributed computing was generally intended to partition a problem rather than increasing its size and complexity.
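A back-of-envelope sketch of the full-replication cost, using the ~52.5GB/year Bitcoin growth figure quoted above (the node count is an assumed round number, purely for illustration):

```python
def total_storage_gb(chain_gb: float, nodes: int) -> float:
    """Full replication: every node stores the entire chain, so the
    network-wide footprint is simply chain size times node count."""
    return chain_gb * nodes

# Ten years of chain at ~52.5 GB/year, replicated across a
# hypothetical 10,000 full nodes:
chain = 52.5 * 10                         # 525 GB of chain data
held = total_storage_gb(chain, 10_000)    # 5,250,000 GB network-wide
```

The chain itself stays modest; it’s the everything-everywhere replication that multiplies the bill.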
Decentralised…or not
The idea of decentralisation is that there’s no bottleneck. There is no single arbiter, or director that must be consulted before anything can be approved. However, if every node has to store the entire chain that creates a whole new kind of bottleneck – everyone has to be consulted and agree before anyone can say that it’s approved. To get around this there’s the Delegated Consensus Protocol whereby a few nodes are the chosen ones who get to make a call. But this pushes us back to a centralised network again, and each of these nodes risks becoming a bottleneck.
Scale rate and Block Time
There’s a limit to how fast blockchain can process transactions (theoretically 7 per second for Bitcoin, but in practice 3 or 4), which puts a natural ceiling on its use. Ethereum is cited as being quicker (12 to 30 transactions per second, depending on who you ask), but a live mass-transaction ledger it is not. Visa runs at around 2,000 transactions per second; even PayPal manages around 200 per second, so blockchain is still at least a couple of orders of magnitude off anything we’d generally consider mass transactional.
The “block time” (the average time taken to add a new block to the chain) is currently in the 14 to 15 second range for Ethereumiv but 10 minutes for Bitcoin. Recently Ethereum block sizes have been between 15K and 35Kv. The number of transactions per block is limited in Ethereum by the amount of “gas” each transaction consumes. Gas is the cost of adding the transaction to the chain; essentially it’s a measure of the computational effort required to execute it. Simple transactions use less gas; large smart contracts with embedded functions use more. Will it scale? Not while PoW remains the arbiter of inclusion and everything has to be stored everywhere, even if Ethereum transactions are typically smaller than Bitcoin ones.vi
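As a rough sanity check on those throughput figures, here’s the arithmetic, assuming a block gas limit of around 8 million (roughly the 2018 figure; it varies over time) and the 21,000 gas cost of a simple ether transfer:

```python
GAS_LIMIT = 8_000_000        # assumed per-block gas limit (c. 2018)
GAS_PER_TRANSFER = 21_000    # gas consumed by a simple ether transfer
BLOCK_TIME_S = 14            # block time quoted above

# Best case: a block packed with nothing but simple transfers
txs_per_block = GAS_LIMIT // GAS_PER_TRANSFER   # 380 transactions
tps = txs_per_block / BLOCK_TIME_S              # roughly 27 tx/s
```

That lands inside the 12-to-30 range cited above; blocks containing heavier smart-contract calls sit towards the bottom of it.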
Bandwidth
Blockchain is very ‘chatty’ – everything needs to agree with everything else, which naturally requires lots of bandwidth. The more agreement across nodes you need, the more bandwidth you need. If data volumes increase linearly (or worse) with the number of nodes storing the blockchain, then network contention is a risk. A 2018 studyvii assessing the impact of a large-scale deployment of blockchain for IoT usage concluded that “a remarkable amount of downlink resources” was required to maintain the veracity of the network. That’s data travelling down to the nodes to ensure correctness, rather than because a node has requested it or needs that specific data.
51% attack
The basis of the “51% attack” is that if someone takes majority control of the mining power they are in a strong position to determine which block gets added to the chain. It doesn’t strictly require one party to hold 51% of the mining capability, but an attack is significantly less likely without it. It’s a risk for public blockchains, even if only a small one. Given the vast quantity of global computing power currently devoted to mining bitcoinviii it’s a pretty improbable scenario, but theoretically possible.
Why is blockchain the right solution for your use case?
Here’s the number one gnawing question in my head when I think of blockchain. There seem to be a vast number of people claiming “We can use blockchain for that!” Actually what they should really be doing is asking “What’s the best solution for our use case?”. Just because it’s there and everyone’s talking about it doesn’t mean it’s right – I’ve never yet needed a torque wrench, jig saw or trolley jack to assemble Ikea furniture. Similarly there are plenty of solutions that need to be considered before you decide that blockchain’s the answer to your problem. Other solutions are battle-hardened, well understood and probably cheaper to implement. I once inherited a system from a previous developer who had implemented all inter-process communication using the current vogue tech, CORBA. It was wholly inappropriate so the first thing I did was rip it all out and replace it with good old-fashioned TCP/IP (sockets) which worked better, faster, and didn’t require us to licence an ORB with every distribution.
Conclusion (Part 1)
In conclusion, blockchain’s not going to replace the mass transaction platforms such as Mastercard or Visa. It doesn’t scale fast enough or large enough, it breaks most of the rules of good distributed computing practice while still being bound by all its laws, and (in its PoW form) it’s fundamentally bad for the human race. No wonder Satoshi Nakamoto used a pseudonym.
Many tasks currently proposed as being “solved” by blockchain can be resolved without blockchain at all. It’s just that everyone wants to jump on the bandwagon and claim they’re doing the first/most fabulous implementation of blockchain to resolve their burning issue. Sometimes blockchain is being developed in isolation from any issue at all: “company X is developing the best blockchain ever!” Why? What’s the problem it’s solving? Many of them seem to be attempting to address one or more of the issues listed above, but the proposed remedies are not without flaws either.
So that’s the bad news people, your blockchain startup needs a fundamental rethink before the VC pitch. But let’s see what some of the potential fixes look like.
Is blockchain doomed?
In a word, ‘no’. But we have to look ahead to the Plateau of Productivity before it starts to make sense. There are alternatives to PoW – Proof of Stake (PoS), for example, is currently under review for a hard fork by the Ethereum community. It does not require the same level of compute, and the Casper protocol has been designed to be resilient to DDoS attacks. See https://blockgeeks.com/guides/proof-of-work-vs-proof-of-stake/ for a reasoned exposition, but briefly: a pool of “validators” (or “forgers”) post some of their own Ether (the Ethereum currency) and are then the good guys permitted to determine whether a new block gets added or not. If they do something bad, like creating an invalid block, they lose their validator status and the Ether they posted. If they play nicely they get compensated at a yet-to-be-announced rate. The jury’s still out on this one for a variety of reasons, but it shows that the issues are being considered.
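The stake-weighted selection idea can be sketched in a few lines of Python. This is a toy model under my own simplifying assumptions, not the Casper protocol itself, which adds epochs, finality votes and formal slashing conditions:

```python
import random

def pick_validator(stakes: dict, rng=random.random):
    """Stake-weighted draw: a validator's chance of being chosen to
    propose the next block is proportional to the Ether it has posted."""
    total = sum(stakes.values())
    r = rng() * total
    for validator, stake in stakes.items():
        r -= stake
        if r <= 0:
            return validator
    return validator  # guard against floating-point round-off

def slash(stakes: dict, offender: str) -> dict:
    """A validator caught creating a bad block forfeits its posted stake."""
    remaining = dict(stakes)
    remaining.pop(offender, None)
    return remaining

stakes = {"alice": 10, "bob": 90}
# bob, holding 90% of the stake, wins ~90% of the draws...
stakes = slash(stakes, "bob")   # ...until he misbehaves and loses his 90 Ether
```

The economic bet is that losing real stake deters bad blocks the way wasted electricity deters them under PoW, minus the electricity.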
The issue of chain growth is a thorny one. Ethereum has coherent plans to support shardingix, but then you no longer have full visibility of the chain – you only get to see, well, the bit you get to see. If that’s all you care about then that’s fine, but it’s incomplete. The main obstacle is the complex inter-shard communication needed to implement this solution.
Vitalik Buterin has recently claimed that, using sharding together with Plasma (a technology intended to enable Ethereum to handle much larger datasets), mass-volume transactions are possiblex. So currently vapourware, but hope persists.
A forward peek into the Plateau of Productivity
The tribulations of public blockchain are many, but in the near term we can still consider cases where the permissioned (aka private) blockchain can present opportunities. Let’s get specific and consider fintech. In most cases private is going to be an easier start than diving straight in with public. Some of the issues and opportunities that blockchain may be able to address include:
1. Reconciliation reduction – if we know that our chains are identical then we know that the contracts within the chain must match. If they canonically and completely describe the transaction then both parties can rest assured that no further checks and balances are required. That’s a big ‘if’, of course, since we typically restructure trades as they pass between our internal systems to re-render them into the format each system expects. What the blockchain industry seems to be proposing is: “if everyone’s system reads the blockchain then everyone will know that they have the same transaction data, because the chain is immutable”. But this also presupposes that everyone interprets the content of the data element of the block identically. That’s a pre-existing problem, and not one that blockchain necessarily helps to fix.
2. Immutable recording – asset ownership is a case where we need a monotonically recorded sequence of transactions that cannot be altered. For example, if a bond is issued and the owner recorded on the blockchain, then we can traverse the chain to identify the current owner and pay them their due coupon. Naturally, any such implementation needs to address a current problem rather than moving an existing, functioning process onto new technology for its own sake.
3. Disintermediation – I see lots of use of the word “potential” but I’m still waiting for the killer app. Disintermediation of systems, firms or individuals that add no tangible benefit to the transactional process but still levy a tax on it must be encouraged. The challenge remains to identify suitable candidates.
4. Smart contracts – I’m hearing that they can be embedded into a block to enable things like automated payments. This is great provided there’s a clear benefit, but isn’t it going to either increase the block size or reduce the number of transactions a block can support? I suspect the smart money is on minimising the amount of data stored on the chain and keeping ancillary information such as reference data elsewherexi. Both Bitcoin (the Lightning network) and Ethereum (Raiden) are considering off-chain transacting as an option to solve some of these issues.
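The reconciliation argument in point 1 can be sketched as a simple digest comparison (a hypothetical illustration, not any vendor’s product): if both parties canonicalise their records the same way, matching digests imply matching books.

```python
import hashlib
import json

def chain_digest(blocks) -> str:
    """Digest of a canonical serialisation of the chain. Matching digests
    imply matching transaction records -- provided both parties serialise
    (ie interpret) the data identically, which is exactly the caveat
    raised in point 1 above."""
    canonical = json.dumps(blocks, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()

ours = [{"trade": 1, "qty": 100}, {"trade": 2, "qty": 250}]
theirs = [{"qty": 100, "trade": 1}, {"qty": 250, "trade": 2}]  # same data, different key order
assert chain_digest(ours) == chain_digest(theirs)   # books reconcile
```

Note that `sort_keys=True` is doing the real work here: without an agreed canonical form, identical data can still produce different digests, which is the interpretation problem all over again.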
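The off-chain idea behind Lightning and Raiden in point 4 can be sketched as a toy payment channel. This is my own simplified illustration of the pattern, not either project’s actual protocol (which adds signed commitments, timeouts and dispute resolution):

```python
class PaymentChannel:
    """Toy payment channel: many balance updates happen off-chain, and
    only the final net position is settled on-chain as one transaction."""
    def __init__(self, alice: int, bob: int):
        self.balances = {"alice": alice, "bob": bob}
        self.updates = 0

    def pay(self, frm: str, to: str, amount: int):
        assert self.balances[frm] >= amount, "insufficient channel balance"
        self.balances[frm] -= amount
        self.balances[to] += amount
        self.updates += 1            # off-chain: consumes no block space

    def settle(self) -> dict:
        return dict(self.balances)   # the single on-chain transaction

ch = PaymentChannel(alice=100, bob=0)
for _ in range(50):
    ch.pay("alice", "bob", 1)        # 50 off-chain updates...
final = ch.settle()                  # ...but only 1 on-chain settlement
```

Fifty payments cost the chain one transaction, which is how these schemes hope to dodge the throughput ceiling discussed earlier.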
In January 2018 Forbes carried a story listing 35 applications of blockchainxii; it will be telling to review the list in January 2019 to see which persist.
Conclusion (Part 2)
Many of the initial opportunities for blockchain remain, but there is a range of issues to be addressed before mass adoption is practical. In the near term, private blockchain with PoS or some related validation mechanism seems to offer the best opportunities for adoption. There are enough smart folk out there working on the issues, amidst all the chatter and charlatans, to make me think we will achieve a truly beneficial implementation in the not-too-distant future. It just won’t be everywhere. In the meantime, watch out for anyone offering to rewrite all your applications to use blockchain. First ask: “What’s in it for me?”
i With apologies to Lynne Ramsay.
ii Those under the age of, say, 45 should reach for Google now.
iii See http://www.goland.org/blockchain_and_cap/
iv See etherscan.io for the current chart.
v See https://etherscan.io/chart/blocksize
vi See https://ethereum.stackexchange.com/questions/30175/what-is-the-size-bytes-of-a-simple-ethereum-transaction-versus-a-bitcoin-trans?rq=1 – note that the gas limit on Ethereum means block sizes are not fixed as in bitcoin, but further investigation of the nuances of block size are beyond the scope of this blog entry.
vii See https://arxiv.org/pdf/1711.00540.pdf for details. Note that their analysis used local validator nodes, ie delegated consensus – this in itself should have considerably reduced the network load, yet it was still considered problematic.
viii Needlessly (QED).
ix Other new implementations, notably Zilliqa have implemented their own variant of sharding they’re calling “network sharding” from scratch. They refer to Ethereum’s implementation as “state sharding” to differentiate the approaches.
x Look out for the OmiseGO AMA session on YouTube for an exposition. It’s worth watching just for Buterin’s awesome T-shirt not to mention his intellect and clarity of explanation.
xi Take a look at what VAKT are proposing for physical post-trade processing. They are starting with oil but with an eye on the commodity industry in general. Crucially, they are building an enterprise-grade system that uses blockchain for storage rather than an enterprise blockchain – it’s a nuanced but important distinction. Equally important is the buy-in of a significant quorum of market participants – VAKT have this in place. Look up a chap called Adam Vile for details – he’s both compos mentis and knowledgeable on the matter.
xii See https://www.forbes.com/sites/bernardmarr/2018/01/22/35-amazing-real-world-examples-of-how-blockchain-is-changing-our-world/#34f5be4b43b5