what-is-blockchain.md
1 filename: what-is-blockchain.html
2 title: What is blockchain?
3 description: "What is blockchain?" was the one question people at the IBM Garage asked the most after several new Garage locations were announced, due to the high demand for consultancy services related to blockchain-as-a-service. With my experience working at a Bitcoin startup, I aim to answer that question by walking you through smart contracts, a bit of Bitcoin's history, and the differences between Bitcoin and private blockchains.
4 keywords: blockchain bitcoin IBM
5 created: 2017-03-01
6 updated: 2017-03-01
7
8
9 <nav aria-labelledby="toc-heading">
10 <h2 id="toc-heading">Table of Contents</h2>
11 <ol>
12 <li><a href="#generic-use-case">Generic Use-case Scenario</a></li>
13 <li><a href="#tsa">TSA: Timestamping Authority</a></li>
14 <li><a href="#hashing">Hashed documents?</a></li>
15 <li><a href="#timestamping">Timestamping</a></li>
16 <li><a href="#security">Security of blockchain</a></li>
17 <li><a href="#origin">Origin of blockchain</a></li>
18 <li><a href="#blockchain-as-a-service">Blockchain as a service</a></li>
19 <li><a href="#smart-contracts">Smart Contracts</a></li>
20 <li><a href="#opinions">My opinions</a></li>
21 </ol>
22 </nav>
23
24 You may have tried googling this question, only to come up short or, even worse, end up more confused than when you started. That's because people interpret the term "blockchain" in different ways. Like the word meme...
25
26 > ...meme was coined by Richard Dawkins to represent a "social gene" — a meaning that only exists within social constructs. Winking in Ireland is a meme for being friendly. It's socially passed down from one person to the next like a gene, but it was not passed down completely, which is why it means something different — promiscuous — in the US. So when you're in the US, don't wink at strangers (wink wink).
27
28 Today, the social web has extended the term "meme" to mean any funny joke. The term blockchain is no different. It means something different depending on which community uses it.
29
30 That causes a lot of confusion, and it has made "What is blockchain?" the question that people at IBM's Bluemix Garage ask the most. The question took off after several new Garage locations were announced in response to the high demand for blockchain-as-a-service consultancy. The service is powered by the open source project Fabric, [IBM's blockchain merged with digital assets](https://github.com/hyperledger/hyperledger/blob/3ae1e266f3a715b129d26169d166e56aeabcb1f6/README.md#fabric-incubator), which is now the leading incubation project for [Hyperledger](https://www.hyperledger.org/).
31
32 I'm asked this question a lot because I've been in the Bitcoin loop for years and founded a Bitcoin startup before joining the Garage. Luckily, Hyperledger is very different from Bitcoin and much simpler to understand because of its centralized security model.
33
34 To put the question to rest, here is my observation from being part of the Bitcoin community, following what Ethereum has been up to, and listening to some leaders of the Hyperledger project explain blockchain to large crowds: all definitions of blockchain intersect at one point. That point most closely resembles [Haber and Stornetta's](https://www.anf.es/pdf/Haber_Stornetta.pdf) [timestamping solution](http://danielsuo.com/projects/bitcoin/literature/improving-time-stamping.pdf) (circa 1990) as referenced in [Bitcoin's whitepaper](https://bitcoin.org/bitcoin.pdf).
35
36 ## Generic Use-case Scenario {#generic-use-case}
37
38 To understand the timestamping problem, here's an example scenario:
39
40 * Alice owns property X and wants to transfer ownership to Bob.
41 * Alice digitally signs a contract — stamped with the date and time — that transfers ownership of X to Bob.
42 * Bob can prove ownership of X since the date and time on the contract.
43
44 The problem is that nothing stops Alice from digitally signing a second contract with an earlier timestamp to transfer ownership of X to Charlie. Charlie can get the authorities involved and claim Bob stole X from him. Bob shows his contract, and now the spotlight is on Alice. Why did she sign two contracts? Alice can invent a story that she didn't sign Bob's contract: he stole her private key and signed the contract himself.
45
46 So to whom does X belong? Alice, Bob, or Charlie? No one outside the negotiation knows, not even the authorities, so it can't be settled in court. That makes Alice's timestamp useless.
47
48 That's the problem. The timestamp must come from a trustworthy legal entity. This is the same problem that digital currencies, including Bitcoin, have to solve. It has been known for decades as the double-spending attack.
49
50 ## TSA: Timestamping Authority {#tsa}
51
52 The easy solution is to use a third-party service that everyone can trust, like a bank, to sign and timestamp contracts. Such a service is known as a Timestamping Authority (TSA). Now, Alice can digitally sign a contract that transfers ownership of X to Bob and go to the TSA to have the contract signed and timestamped with the current date and time.
53
54 Bob can then accept the contract since it was signed by the TSA. Alice can sign a second contract to transfer the ownership of X to Charlie, but the TSA will not timestamp the contract because property X belongs to Bob, not Alice. That solves the problem...sort of.
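
The TSA's role described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the class name `TimestampingAuthority`, its methods, and the ownership table are hypothetical, and real digital signatures are left out entirely so the focus stays on the TSA refusing a second transfer.

```python
from datetime import datetime, timezone


class TimestampingAuthority:
    """Toy TSA (hypothetical): tracks owners and stamps only valid transfers."""

    def __init__(self, initial_owners):
        self.owners = dict(initial_owners)  # property -> current owner
        self.log = []                       # stamped transfers, oldest first

    def stamp_transfer(self, prop, seller, buyer):
        # Refuse to stamp if the seller is not the current owner.
        if self.owners.get(prop) != seller:
            return None
        stamped_at = datetime.now(timezone.utc).isoformat()
        self.owners[prop] = buyer
        record = (prop, seller, buyer, stamped_at)
        self.log.append(record)
        return record


tsa = TimestampingAuthority({"X": "Alice"})
first = tsa.stamp_transfer("X", "Alice", "Bob")       # accepted and stamped
second = tsa.stamp_transfer("X", "Alice", "Charlie")  # refused: Bob owns X now
print(first is not None, second is None)
```

Notice that everything hinges on the one `owners` table inside the TSA. Whoever controls that object controls the story.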
55
56 The problem with that solution is that everyone has to trust the TSA. This leaves the TSA with too much authority over signing and timestamping contracts. If the TSA somehow benefits financially from siding with Charlie and Alice and screwing over Bob, Bob will get screwed just like in the initial scenario. Trusting the TSA doesn't guarantee a trustworthy system.
57
58 Haber and Stornetta's timestamping proposal helps reduce — not fix — the problem, by minimizing the TSA's authoritative power. Instead of the TSA signing and timestamping contracts, that authority is left to the parties of the contract, greatly reducing the TSA's privilege. The TSA only acts as a ticking machine that sends out a signal at an interval to aggregate a bunch of contracts and transparently broadcast that information to an *open network of witnesses*.
59
60 You can't simply trust a random person, so to join this network of witnesses, or let's call it "the gang," you have to be invited. That means having a registration system and identifying each member of your gang. It can just be your friends or your business colleagues. The larger the gang, the better, since it makes the gang more distributed. So, let's say the interval is 10 minutes. During those 10 minutes, the gang will spend time collecting all hashed documents that have to be timestamped.
61
62 ## Hashed documents? {#hashing}
63
64 Hashed documents. Say it with me. Hashed documents. If you want a blockchain, you don't want a weak chain. You want a strong chain! Yeah! Think of it as a physical chain. What kind of chain do you use to tie your bike? A plastic one? I don't think so. You want a steel chain that is hard to break. The same goes for blockchain. Hashing is the steel. You don't need to know how it works, only what it does.
65
66 A cryptographic hash function takes a text of any length and produces a fixed-size, random-looking text called a hash. That's it. The same text always produces the same hash. If you take a legal document text and hash it, it will produce a short, random-looking text. Hashing is magical for several reasons you should know:
67
68 1. It's practically impossible to get any information about the legal document from the hash alone, so you can safely share the hash with anyone in the world. Earth getting struck by an asteroid is significantly more probable than someone recovering your document from the hash.
69 2. It's practically impossible to find two different documents that generate the same hash, so the hash acts as the identifier of the legal document.
70 3. It looks random. Changing a single character in the legal document text will generate a completely different hash.
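
To see those properties for yourself, here is a minimal sketch using SHA-256 from Python's standard library. The choice of SHA-256 and the helper name `hash_document` are my assumptions for illustration; the article does not name a specific hash function.

```python
import hashlib


def hash_document(text: str) -> str:
    """Return the SHA-256 hash of a document as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


original = "Alice transfers ownership of X to Bob."
altered = "Alice transfers ownership of X to Rob."  # one character changed

h1 = hash_document(original)
h2 = hash_document(altered)

print(h1)
print(h2)
print(len(h1) == len(h2) == 64)       # fixed size, whatever the input length
print(hash_document(original) == h1)  # same document, same hash
print(h1 != h2)                       # tiny change, different hash
```

The two printed hashes share no visible pattern, even though the documents differ by a single letter.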
71
72 ## Timestamping {#timestamping}
73
74 Okay, now that you know hashing, you're halfway to knowing how blockchain works! Yay! Stay with me! Back to how timestamping works. When a client wants to timestamp a legal document, the client hashes the document and gives the hash to the gang. The gang continues to collect all hashed documents until the interval is up.
75
76 <figure class="light-bg">
77 <img src="/images/what-is-blockchain/what-is-blockchain-1.webp" alt="Diagram showing a sequence of documents before it reaches the 10-minute interval">
78 <figcaption>Fig 1. Collect all hashed documents until the interval is up.</figcaption>
79 </figure>
80
81 When the interval is up, all hashed documents are hashed together, pair by pair, until a single hash represents all the documents collected as one "block".
82
83 <figure class="light-bg">
84 <img src="/images/what-is-blockchain/what-is-blockchain-2.webp" alt="Diagram showing a sequence of documents once it reached the 10-minute interval">
85 <figcaption>Fig 2. Collect all hashed documents once the interval is up.</figcaption>
86 </figure>
87
88 <figure class="light-bg">
89 <img src="/images/what-is-blockchain/what-is-blockchain-3.webp" alt="Diagram showing a sequence of documents hashed together as a binary tree">
90 <figcaption>Fig 3. All hashed documents are hashed together one by one creating a block.</figcaption>
91 </figure>
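
The binary tree in the figure can be sketched as follows. This construction is commonly called a Merkle tree. The function names and the rule of duplicating the last hash when a level has an odd count are assumptions I made for the sketch, not something the original proposal dictates.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 of raw bytes (the hash function is an assumption)."""
    return hashlib.sha256(data).digest()


def hash_block(document_hashes):
    """Combine document hashes pairwise until a single hash remains."""
    if not document_hashes:
        raise ValueError("a block needs at least one document hash")
    level = list(document_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the odd one out
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


documents = [b"contract A", b"contract B", b"contract C"]
block = hash_block([h(d) for d in documents])
print(block.hex())
```

Change any one of the three documents and the final block hash changes with it.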
92
93 The same happens at each interval.
94
95 <figure class="light-bg">
96 <img src="/images/what-is-blockchain/what-is-blockchain-4.webp" alt="Diagram showing a sequence of documents hashed together as a binary tree with an additional interval collecting more documents for the second, upcoming block">
97 <figcaption>Fig 4. Repeat the process for every interval.</figcaption>
98 </figure>
99
100 But the blocks don't stand alone. The block hash is hashed with the previous block's hash. So the hash of one block depends on the one before it, and the one before it depends on the one before that one, and so on. Because of the hashing properties I mentioned, this literally ties all blocks together, creating a chain.
101
102 <figure class="light-bg">
103 <img src="/images/what-is-blockchain/what-is-blockchain-5.webp" alt="Diagram showing a couple of blocks hashed together">
104 <figcaption>Fig 5. The block hash is hashed with the previous block's hash.</figcaption>
105 </figure>
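
Linking blocks takes very little code. In this sketch the `link` helper, the all-zero starting value, and the placeholder block hashes are hypothetical names chosen for illustration.

```python
import hashlib


def link(previous_hash: str, block_hash: str) -> str:
    """Hash a block's own hash together with the previous block's hash."""
    return hashlib.sha256((previous_hash + block_hash).encode("utf-8")).hexdigest()


GENESIS = "0" * 64  # assumption: placeholder "previous hash" for the first block

chained = []
previous = GENESIS
for block_hash in ["block-1-hash", "block-2-hash", "block-3-hash"]:
    previous = link(previous, block_hash)
    chained.append(previous)

for value in chained:
    print(value)
```

Each printed value depends on every value printed before it, which is the whole trick.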
106
107 That builds up a chain of blocks, a.k.a. a blockchain.
108
109 <figure class="light-bg">
110 <img src="/images/what-is-blockchain/what-is-blockchain-6.webp" alt="Diagram showing multiple blocks hashed together">
111 <figcaption>Fig 6. Multiple blocks hashed together create a blockchain.</figcaption>
112 </figure>
113
114 Changing a single character in any hashed document, or adding a block in the middle of the chain, requires changing everything in that block and in every block after it. That is a very important tamper-proof property that blockchain provides, and it is one of the biggest selling points anyone will pitch to you. Think of blockchain like a timeline: the past should not be changed.
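
Here is a small sketch of how a witness could catch a rewritten past. It is simplified on purpose: each block is hashed as a flat list of documents rather than a tree, and `build_chain` and `first_mismatch` are hypothetical helper names.

```python
import hashlib


def sha(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_chain(blocks):
    """Return one chained hash per block; each depends on all earlier blocks."""
    hashes, previous = [], ""
    for documents in blocks:
        previous = sha(previous + "".join(sha(d) for d in documents))
        hashes.append(previous)
    return hashes


def first_mismatch(blocks, published_hashes):
    """Index of the first block whose recomputed hash differs, or None."""
    for i, (new, old) in enumerate(zip(build_chain(blocks), published_hashes)):
        if new != old:
            return i
    return None


blocks = [["X: Alice -> Bob"], ["Y: Dave -> Erin"], ["Z: Erin -> Frank"]]
published = build_chain(blocks)

tampered = [list(b) for b in blocks]
tampered[0][0] = "X: Alice -> Charlie"  # rewrite history in the first block

print(first_mismatch(blocks, published))    # None: nothing changed
print(first_mismatch(tampered, published))  # 0: the first block no longer matches
```

Because of the chaining, the hashes of the second and third blocks change as well, so the forger would have to replace all of them.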
115
116 In that sense, the name "timestamping" is misleading because the solution never records an absolute time, only a relative one[^1]. What matters most is order preservation. That is sufficient to help poor Bob not get screwed like last time.
117
118 ## Security of blockchain {#security}
119
120 This is my favorite topic of all, and I can spend hours talking about it, but I'll just get to the point. I lied earlier. Blockchain is not 100% tamper-proof. In other words, it can be tampered with. It's technically tamper-evident. If something changes, the hashes will change. As I mentioned earlier, changing a single character in any hashed document or adding a block in the middle of the chain requires changing everything in that block and in every block after it.
121
122 One question I skipped was: how does the gang agree on who adds the next block to the blockchain? Another way to phrase it: What is the consensus?
123
124 A simple consensus could be a roll of the die. Whoever in the gang gets the highest roll gets to add the next block. That's just an example. You can get creative and come up with any algorithm you think works best. What matters is that the consensus is fair and doesn't benefit anyone.
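
The die-roll idea can be sketched like this. It is a toy: the function name and the tie-breaking rule (tied members roll again) are my assumptions, and it quietly assumes everyone sees the same honest rolls, which a real network cannot take for granted.

```python
import random


def pick_block_producer(members, rng):
    """Every member rolls a six-sided die; tied members roll again."""
    contenders = list(members)
    while len(contenders) > 1:
        rolls = {member: rng.randint(1, 6) for member in contenders}
        highest = max(rolls.values())
        contenders = [m for m in contenders if rolls[m] == highest]
    return contenders[0]


gang = ["Alice", "Bob", "Charlie", "Dave"]
winner = pick_block_producer(gang, random.Random())
print(winner)
```

Every member has the same chance of winning, which is what makes this consensus fair.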
125
126 But when there is consensus, there needs to be cooperation. Gang members need to be honest. If one member is dishonest, that's okay; the majority will overpower the dishonest member. Still, this particular scenario goes back to the initial problem. There are two blockchains. There are two stories. There's the honest story and the dishonest story. There's Alice's story and Bob's story. If it happens that the majority of the gang is dishonest because they want to screw you-know-who, poor ol' Bob, the gang can do it, so the blockchain model can break. It's much harder to break, but it can break.
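
A rough sketch of the majority rule makes the weak spot visible. The member names, the labels for each version of history, and the `accepted_history` helper are all hypothetical.

```python
from collections import Counter


def accepted_history(votes):
    """Return the history backed by a strict majority, or None if there is none."""
    history, count = Counter(votes.values()).most_common(1)[0]
    return history if count > len(votes) / 2 else None


honest_gang = {"m1": "bob-owns-x", "m2": "bob-owns-x", "m3": "charlie-owns-x"}
captured_gang = {"m1": "bob-owns-x", "m2": "charlie-owns-x", "m3": "charlie-owns-x"}

print(accepted_history(honest_gang))    # bob-owns-x
print(accepted_history(captured_gang))  # charlie-owns-x
```

The function has no idea which story is true. It only counts heads.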
127
128 We went from trusting Alice, which was a disaster, to trusting the TSA, to trusting the gang, which is better but can still fail in the worst-case scenario.
129
130 Bitcoin further improves the solution in a very clever way that works remarkably well in practice. It uses a method known as Proof of Work, in which gang members are not registered or identified, allowing Bitcoin to be decentralized and censorship-resistant. The member you trust to add the next block to the chain is the member that's most financially invested in the gang. Bitcoin's drastic innovations have introduced new challenges, and it is still vulnerable to some hypothetical attacks. There has been lots of research on alternatives like variations of proof of virtual work. Only time will tell, but the future looks bright!
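
To get a feel for Proof of Work, here is a heavily simplified sketch. It is not Bitcoin's actual algorithm, which double-hashes a block header and compares it against a numeric target. Here the `mine` function and the "leading zero hex digits" rule are stand-ins I chose for illustration.

```python
import hashlib


def mine(block_data: str, difficulty: int):
    """Try nonces until the hash starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


nonce, digest = mine("block with Bob's contract", difficulty=4)
print(nonce, digest)
```

Finding the nonce takes many guesses, while checking it takes a single hash. Each extra zero multiplies the expected work by 16, and that work costs hardware and electricity, which is where the financial investment comes in.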
131
132 Lesson learned: *trust is the essence of security, and you cannot escape it.*
133
134 ## Origin of blockchain {#origin}
135
136 Satoshi Nakamoto never mentioned "block chain" or "blockchain" when he announced his invention of Bitcoin — he mentioned "block" and "chaining" but never block chain.
137
138 The first time *block chain* was mentioned in a Bitcoin-related discussion was in an exchange between Satoshi and Hal Finney[^2] in the [cryptography mailing list, cypherpunk](https://satoshi.nakamotoinstitute.org/emails/cryptography/6/). The term *block chain* made sense at the time because it wasn't the first time that name had been used in cryptography. *Blockchaining* is an encryption method that has existed since 1976 and is still used today, and it works similarly to the chaining in blockchain. To further support my definition of blockchain, Hal Finney used *block chain* in the context of timestamping.
139
140 ## Blockchain as a service {#blockchain-as-a-service}
141
142 ... has existed for over 20 years. That's right. Tell that to your friend and leave them baffled. In 1994, Haber and Stornetta saw the potential in their discovery and started a company called [Surety](http://surety.com/).
143
144 However, that doesn't mean that Surety competes against IBM and other companies invested in blockchain, because everyone's definition of blockchain is different. As I mentioned in the security section, blockchain includes proof of work when you speak of it in Bitcoinland. It includes smart contracts when you speak of it in the Hyperledger world.
145
146 ## Smart Contracts {#smart-contracts}
147
148 The goal of Hyperledger is to build a platform for smart contracts, or *chaincode*, as they're known in the Hyperledger project. The term *smart contract* was coined by Nick Szabo, the creator of BitGold, a proposal that is awfully similar to Bitcoin, and arguably the leading contender for Satoshi Nakamoto's true identity.
149
150 > [Smart contracts] embed contracts in all sorts of property that is valuable and controlled by digital means. Smart contracts reference that property in a dynamic, often proactively enforced form, and provide much better observation and verification where proactive measures must fall short.
151 > — Nick Szabo, https://web.archive.org/web/20160312140021/http://szabo.best.vwh.net/idea.html
152
153 Whenever there are two or more legal personalities, like a person or corporation, there are usually contracts involved. If a party breaks the contract, the counter-party will take them to court and settle the issue, justly or unjustly — it will be settled with human intervention. This is a reactive solution. Smart contracts are proactive.
154
155 The (application's) code is the contract, and there is no way around it; there shouldn't be any human intervention at its optimal use case. Of course, there is the possibility of a bug being present in the smart contract code, but it's argued that a bug is part of the contract; it's a feature, not a vulnerability. Taking advantage of the feature is not a criminal activity.
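
To make "the code is the contract" concrete, here is a toy escrow written in plain Python. It is not Hyperledger chaincode or any real smart contract API; the class, its methods, and its rules are invented for illustration.

```python
class EscrowContract:
    """Toy contract (hypothetical): the rules below are the whole agreement."""

    def __init__(self, buyer, seller, price):
        self.buyer, self.seller, self.price = buyer, seller, price
        self.deposited = 0
        self.delivered = False
        self.paid_out_to = None

    def deposit(self, sender, amount):
        if sender != self.buyer or amount != self.price or self.deposited:
            raise ValueError("deposit rejected by contract rules")
        self.deposited = amount

    def confirm_delivery(self, sender):
        if sender != self.buyer or not self.deposited:
            raise ValueError("only the paying buyer can confirm delivery")
        self.delivered = True

    def settle(self):
        """Pay the seller once delivery is confirmed; no judge involved."""
        if not self.delivered or self.paid_out_to is not None:
            raise ValueError("nothing to settle")
        self.paid_out_to = self.seller
        return self.seller, self.deposited


contract = EscrowContract(buyer="Bob", seller="Alice", price=100)
contract.deposit("Bob", 100)
contract.confirm_delivery("Bob")
payout = contract.settle()
print(payout)  # ('Alice', 100)
```

Whatever these methods allow is allowed, including anything I overlooked while writing them. That is exactly the sense in which a bug becomes part of the contract.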
156
157 A perfect demonstration of this is *the DAO*, a smart contract application that allows investors to crowdsource funding in a decentralized environment, with no identities and no registration system. [The DAO received over $152 million in funding, making it the largest crowdsourced project in history](http://www.nytimes.com/2016/05/22/business/dealbook/crypto-ether-bitcoin-currency.html).
158
159 Later, it was discovered that the numbers in the DAO contract didn't add up because someone was siphoning money from the DAO into their own account using a bug no one was aware of. Over $60 million was drained from the DAO using *this feature*. Call it stealing, but the code is the contract, and the attacker used it as their [get-out-of-jail card](http://pastebin.com/CcGUBgDG).
160
161 What does this mean for the Hyperledger project? Hyperledger doesn't focus on decentralized blockchains. In a centralized environment, anyone with the slightest privilege has an identity, so if they try to pull off something clever like the DAO attack, they'll most likely go to court. This arguably defeats the whole purpose of smart contracts, but the economics of a private blockchain are still unknown. We'll just have to wait and see how it goes and how IBM handles its end.
162
163 ## My opinions {#opinions}
164
165 Haber and Stornetta's timestamping proposal, which we now know as blockchain, is not new (relative to internet technology). The innovative part of Bitcoin is the economics and game theory involved in the security of decentralized blockchains, not blockchain itself. Many companies can benefit from private (centralized) blockchains, but they may be in for disappointment if they end up in a worst-case scenario regarding security. Because of that, IBM should start looking into best practices when deploying blockchains for customers.
166
167 [^1]: Even though blockchain only provides relative timestamping, a time period can be guaranteed by *anchoring* blocks. To anchor the blocks in a fixed time period, a *widely-witnessed* piece of information, like a New York Times article title with the published date, is copied to a block. That guarantees the anchored block was never created before the published date. [Satoshi Nakamoto did this in the very first block of the Bitcoin blockchain to guarantee that it was never created before January 3rd of 2009](https://blockchain.info/tx/4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b).
168
169 [^2]: Who is Hal Finney? https://bitcointalk.org/index.php?topic=155054.0