The Business Of
By Josh Moir
From looking to build a decentralised internet where users own their data to revolutionising how data is stored, an interview with the founder of Sia, David Vorick.
The Business Of Interview Series
29th March 2021, 22:30 GMT
Hello, David, thank you very much for taking the time to speak to us today. We're honoured to have you and are looking forward to asking you some questions.
Hey, great to be here, super excited.
So we'd like to begin really at the start and talk about what you were doing before you got involved in Blockchain, can you talk to us a little bit about what you were doing before founding Sia? You know, sort of what you were studying and then going on to work as a software developer for IBM, I believe.
Yeah, so I was an undergraduate in college and really I ran into Bitcoin in 2011, my freshman year. It was introduced to me by a friend and kind of right away, literally within like two days, I knew that the rest of my life would be orbiting blockchain technology. It was at the intersection of a bunch of different interests I had, namely, like computer science, economics and that kind of cypherpunk, crypto-punk, you know, be your own bank sort of feeling, and it just connected with me really well. And then I did spend the summer working at IBM my junior year, but that was just an internship. And then I got a job offer from Google and I told them that I would work for them if they let me drop out. And so I got the job offer with two semesters left of college, and so I said, let me drop out. Let me skip some semesters, like I'll save some money. You'll get me working immediately. You know, I've already taken all the hard classes, so it doesn't matter. And they said no. And then in that time, the entrepreneurial department kind of took advantage of all my free time at school.
And some of the professors there and mentors and entrepreneurs there convinced me not to go to Google and to start my own company. So I ended up about three months later calling the Google recruiter, telling them I changed my mind, turning down the job offer and starting what turned into Sia. I think at the time it was called Bytecoin.
Right, yeah, and then sort of when you found out about Bitcoin and blockchain technology, what was it that really got your interest? And you sort of said, you know, I want to go into this and want to learn more about this and eventually found Sia.
Yeah. So it was really, really the idea that you could have money without a bank. I think at that point in time I'd kind of become a little bit disenfranchised with the government, which is, you know, maybe a phase that everyone goes through when they're 18. But I felt like, you know, the Fed was doing a lot of inflation, had control over our finances, and then you'd see things like hyperinflation in places like Venezuela. And so the idea that you could make something that was immune to all this but still functional as money was very interesting to me. And there was, especially in 2011, just like this open field of unexplored ideas. It opened a whole bunch of possibilities that had never been possible before and nobody had explored those possibilities yet. It was something that I was very early to and it was very easy to be at the forefront of because the forefront was so near the original idea at that point. So, yeah, that's what lured me in.
And so what was this process of almost coming up with Sia and then launching Sia? So what was that like?
Yeah, so I started with, I think, you know, every engineer goes through this when they learn about Bitcoin. They're like, wow, this is a terrible system and it's super wasteful and it doesn't scale at all. And it just has all these really awful properties. And it's like, I can do better than that. I can do way better than that. And so that's kind of one of the early traps that Bitcoin throws people into. And I definitely fell into that trap. And so, yeah, I spent probably the first three years of my Bitcoin time just trying to think of ways to make Bitcoin better. And eventually I kind of came to the conclusion, like, instead of wasteful computations, the proof of work could be based on useful storage, and we can make a storage system that is both building consensus and also storing all the data in the world. And so that was kind of the 2013 idea. And then I actually had the fortune of getting into a group of a bunch of Bitcoin developers who were very patient and very generous with their time, and so they listened to a lot of my ideas, corrected a lot of my misconceptions. And so I spent probably, on and off, a year talking to these guys and basically threw away everything that I had come up with. I kind of realized, you know, I got out of the trap at that point of thinking that Bitcoin is terrible. And I started to understand why Bitcoin had made all the decisions that it did and what the sort of limitations of the technology were and how all this stuff was elegant and made sense. And so at that point, we redesigned Sia.
So we threw everything away, about nine months of work at this point. We just completely tossed tens of thousands of lines of code. And we started from scratch. We spent an entire month not writing any code and just building out a brand new design for the Sia network. And in this one, instead of having the storage build consensus, we use proof of work to build consensus. And we built the entire network from the ground up with the idea of how can we make it the most efficient, fastest, like, best network possible for data storage.
I think we did a much better job on our second draft. And actually the design we came up with, this is now late 2014, November 2014. So the design we came up with at that point is the design we're using today and something that I still think holds up very well. There are not very many changes I would make if I were to start over today. So I think we did a great job.
Yeah, and then could you give a brief overview of what is Sia to our audience and then explain what was it to you that attracted almost an interest in creating this decentralized cloud storage platform?
Yeah, so Sia is a decentralised cloud storage platform, the idea being kind of parallel to Bitcoin. So Bitcoin is money without a bank, and then Sia was the idea, can you have a cloud without a cloud provider?
And so we wanted to create an iCloud or even something like S3 or Google Drive, but without Apple and without Amazon and without Google, and that's kind of where we ended up. So we have a way for people to store data and retrieve data that is completely decentralized. So your data is not controlled or owned by us or any one particular entity. It's spread out all over the world. And it's governed by smart contracts on a blockchain as opposed to just a centralized administration service.
And so the vision that we had was to give people control of their data again, and I think in 2015 the problem actually wasn't as bad as it is now. But today I would say that, you know, with centralized platforms like Reddit and Discord and Twitch and YouTube, the Internet has transformed over the past 10 years, and especially over the past five years, from this enormous archipelago of very diverse services to really just like five to 10 places that most people spend all their time on. And so we've kind of lost a lot of the individuality of the Internet as we've congregated on to a very small number of very large platforms. And in the process of doing that, we've completely lost control of the data. You know, content creators on YouTube and Twitch and Instagram and all these services alike don't have control over their audience.
They don't have confidence that they won't be de-platformed or that they won't be de-monetized. And like if there's a creator who wants an audience and has an audience, they can't actually guarantee that they'll continue to have access to their audience. Their lives are in the hands of Google or in the hands of Amazon. And so that's something that we think can be a lot better.
Yeah, and then so could you talk us through the technology Sia uses to ensure privacy when transferring these files, and how is this different to traditional cloud storage platforms?
Yeah, so we use basically three major technologies to protect the user's data. The first, of course, is the blockchain, and the role that the blockchain plays is one of almost like a cryptographic SLA. So when you are storing files on the Sia network, those files are on a large number of independent machines. And then those machines have actually put up money, like kind of collateral. And they said, we'll store your data for three months or for six months. And if we don't, this collateral out of pocket will be destroyed. So the hosts that are keeping your data have skin in the game. So that's the first component. The second component is encryption. So none of the data that you put onto this network is visible to the people storing the data.
It gets wrapped into your encryption and it can only be decrypted by people with the private key. And so if it's personal data, you're the only one who can decrypt that data. If it's public data, maybe you publish the encryption key broadly so that your followers can see that data and access that. But the important thing is that data is only visible if you want it to be visible. And then the third technology is something called erasure coding, which basically means it's like really smart redundancy.
So we put the data in typically 30 different places. And then out of those 30 places, any 10, it doesn't matter which 10, can be used to recover the data. So let's say you upload a file and then tomorrow something goes wrong and half the world goes offline. Well, that means roughly 15 out of your 30 hosts have gone offline. But that's OK because you only need 10 and it doesn't matter which 10. So we don't even need to know which half of the network went offline. And it can be any half. We still have access to our data. So those are the three major things that make Sia possible.
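To make the "any 10 out of 30" idea concrete, here is a minimal sketch in Python. It assumes a toy polynomial code over a small prime field and is not Sia's actual implementation; the names `P`, `encode` and `decode` are invented for illustration, and a production system would work on much larger pieces with an optimised code.

```python
P = 257  # small prime field, chosen only so that byte values 0..255 fit (assumption)

def _interpolate_at(points, x):
    """Evaluate at x the unique polynomial through `points`, working mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def encode(data, n=30):
    """Turn k data values into n pieces (x, y); the first k pieces hold the data itself."""
    k = len(data)
    points = [(x, data[x - 1]) for x in range(1, k + 1)]
    return [(x, _interpolate_at(points, x)) for x in range(1, n + 1)]

def decode(pieces, k=10):
    """Rebuild the k original values from any k surviving pieces."""
    if len(pieces) < k:
        raise ValueError("not enough pieces to recover the data")
    points = pieces[:k]
    return [_interpolate_at(points, x) for x in range(1, k + 1)]

data = list(b"hello sia!")       # 10 values
pieces = encode(data)            # 30 pieces, one per host
survivors = pieces[20:30]        # pretend the first 20 hosts vanished
recovered = decode(survivors)    # the last 10 pieces are enough
```

Because any 10 points pin down a polynomial of degree 9, it does not matter which 10 pieces survive, which is the property described above.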
Yeah, and then so Sia has obviously created a marketplace for both these renters who want to store files and then the storage providers who will host the files, and then all transactions through this are then paid with Siacoin. Could you discuss the decentralized server costs for renters and how they compare to incumbent cloud storage providers?
Yes, I think that's one of the things that we did very well with the design of our network, which is that we made it an open marketplace. So when you store data on this network today, there are about four hundred different choices for hosting providers that you can pick. And they all have different properties. Each host can choose their own prices. And each host, of course, is in a different place. So you're going to have a different latency to downloading the file and different bandwidth. And so there are a bunch of different metrics.
But it's this open market and you, as the uploader, get to pick the 30 hosts that you think are best for you. And what that means in turn is that these four hundred entities are in continuous price competition with each other, because one of the biggest criteria for who you pick to store your data with is how much money it's going to cost you. And so, generally speaking, over the history of the network, it's been about two dollars per terabyte per month. Right now, we're actually in the middle of a period of big growth. And so I think it's closer to five dollars per terabyte per month today. But we generally expect that to return back to two dollars per terabyte per month once the storage network has caught up to all the new demand.
Yeah, and then almost linking on with that marketplace discussion, could you walk us through the process of how these files are sent to hosts? I believe this is done through file contracts. Could you explain exactly what this is and how the process works for our audience?
Yeah. So all file contracts are this two party agreement. So you can think of file contracts as like blockchain escrow for data. So you as the renter put some money into a contract and the host, the person storing your data, also puts some money into the contract, and then you record in that contract what data is being stored and then you record some amount of time that data is required to be stored. And so it's like file X must be stored for six months and I'm going to pay one dollar to store that file. And then the host might put up three dollars of collateral. And so if the host successfully holds the file for six months, the host will get four dollars back. At the end, they'll get the three dollars they put up and also my dollar. And then if the host cannot prove at the end of the file contract that they still have the data, everything gets destroyed.
And so the host doesn't get their three dollars back and they also don't get my one dollar. I can go into it a little bit more if you want, but essentially we have a cryptographic proof that the host can submit to the blockchain that says, look, I have the data, I've been a good host.
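The one dollar and three dollar example can be written down as a small sketch. The `FileContract` class and `settle` function below are hypothetical names for illustration only; they model the outcome described in the answer, not the on-chain contract format.

```python
from dataclasses import dataclass

@dataclass
class FileContract:
    renter_payment: int    # dollars the renter pays for storage
    host_collateral: int   # dollars the host puts at risk
    duration_months: int   # how long the data must be kept

def settle(contract: FileContract, proof_ok: bool) -> int:
    """Return what the host receives when the contract ends.

    With a valid storage proof the host gets its collateral back plus the
    renter's payment; without one, both amounts are destroyed.
    """
    if proof_ok:
        return contract.renter_payment + contract.host_collateral
    return 0

contract = FileContract(renter_payment=1, host_collateral=3, duration_months=6)
good_host = settle(contract, proof_ok=True)    # 4 dollars
bad_host = settle(contract, proof_ok=False)    # 0 dollars
```

The point of the collateral is that a host who loses the data is worse off than one who never took the contract, which is what gives hosts skin in the game.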
Yeah, so that's sort of linking onto my next question, actually. So how do the hosts prove they're storing the renter's data? I understand this is done through something called storage proofs, I believe. Can you explain what this process is?
Yeah, so we take the data and divide it into little tiny chunks, 64-byte chunks that we hash together, and then that creates a set of hashes. We hash those together, which creates another set of hashes, and it builds up into something called a Merkle tree. So you start with whatever the data size is. And then each time you hash, the number of hashes gets cut in half until at the very end you have a single piece of data called the Merkle root. And so that Merkle root goes onto the blockchain so everyone can see what the hash of the data is.
And then the blockchain essentially requests that the host show a tiny piece of data from the file. So it might say, like, show me the one million four hundred thousandth 64-byte piece of the file. And so if the host is storing the file, they will have that piece. And so they can show what that piece is and then they can also make something called a Merkle proof and prove that the data is part of the file contract, or part of the file that they are supposed to be storing.
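As a rough illustration of the Merkle root and Merkle proof just described, here is a minimal Python sketch. The 64-byte segment size comes from the interview; SHA-256, the `0x00`/`0x01` prefixes and the rule of duplicating the last hash on odd levels are assumptions made for this example, and the function names are invented rather than taken from Sia's code.

```python
import hashlib

SEGMENT_SIZE = 64  # bytes per leaf, as described in the interview

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()  # hash choice is an assumption

def _levels(data: bytes):
    """Split data into segments and build every level of the tree, leaves first."""
    segments = [data[i:i + SEGMENT_SIZE] for i in range(0, len(data), SEGMENT_SIZE)] or [b""]
    level = [_h(b"\x00" + s) for s in segments]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:                       # odd count: pair the last hash with itself
            level = level + [level[-1]]
        level = [_h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return segments, levels

def merkle_root(data: bytes) -> bytes:
    return _levels(data)[1][-1][0]

def prove(data: bytes, index: int):
    """Host side: return the requested segment plus the sibling hashes up to the root."""
    segments, levels = _levels(data)
    path, idx = [], index
    for level in levels[:-1]:
        sibling = idx ^ 1
        path.append(level[sibling] if sibling < len(level) else level[idx])
        idx //= 2
    return segments[index], path

def verify(root: bytes, index: int, segment: bytes, path) -> bool:
    """Verifier side: recompute the root from one segment and its sibling hashes."""
    current = _h(b"\x00" + segment)
    for sibling in path:
        if index % 2 == 0:
            current = _h(b"\x01" + current + sibling)
        else:
            current = _h(b"\x01" + sibling + current)
        index //= 2
    return current == root

file_data = bytes(range(256)) * 5          # 1280 bytes, i.e. 20 segments
root = merkle_root(file_data)              # this is what would go on chain
segment, path = prove(file_data, 13)       # host answers a challenge for segment 13
ok = verify(root, 13, segment, path)
```

The verifier only needs the root, one 64-byte segment and a handful of hashes, so the check stays small no matter how large the file is.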
And so this is a probabilistic proof. If the host throws away half the data, they have no way to predict which segment is going to be requested in advance. And so if they're cheating or if they have corruption and they've lost a fraction of the data, they have a percentage chance of failing the storage proof because the blockchain might ask for a piece of data that the host no longer has access to. Let me know if that makes sense.
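The "percentage chance" can be put into numbers with a short sketch. It assumes the challenged segment is picked uniformly at random; the idea of several independent challenges (for example across successive contracts) is an assumption added here for illustration, and the function names are hypothetical.

```python
def pass_probability(lost_fraction: float, challenges: int = 1) -> float:
    """Chance that a host missing `lost_fraction` of the segments passes every challenge."""
    return (1.0 - lost_fraction) ** challenges

def detection_probability(lost_fraction: float, challenges: int = 1) -> float:
    """Chance that at least one challenge hits a segment the host no longer has."""
    return 1.0 - pass_probability(lost_fraction, challenges)

# A host that threw away half the data fails a single proof half the time,
# and the odds of getting away with it shrink with every further challenge.
single = detection_probability(0.5)        # one challenge
repeated = detection_probability(0.5, 3)   # three independent challenges
```

A host that has lost only a small fraction of the data is unlikely to be caught by any single challenge, which is one reason the renter-side redundancy matters as well.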
Yeah, I think that's a good explanation of it. I'm sure our audience will understand it better now. Could you go on to talk about what happens when these individuals go offline? You already discussed it briefly, but sort of how the renter's data is moved through your file repair process.
Got it. Yeah, so the renters kind of keep a continuous monitor on the network. So they'll continually ping the network and see which hosts are online and offline and kind of do random checks, almost like the blockchain does. But this is all off chain. It's all out of band. They just do random checks to see which data's still online and how healthy it is. And if they find that some of the hosts that they store data on are offline, they will do a repair process where they download, so again, we said you need 10 out of 30 pieces and let's say seven of those pieces are missing.
You're going to go ahead and download 10 out of the remaining twenty three and then you're going to use the 10 that you downloaded to rebuild the original 30 pieces. And so you'll have the seven missing ones again. You'll pick seven new hosts and upload that data to the seven new hosts.
So as the network continues, if data is going offline and hosts are going offline, the renter will just continually repair and make sure their files are in good health and have a big safety buffer in case a bunch of hosts go offline all at once.
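The repair steps described above can be sketched as a small planning function. This is only an illustration under stated assumptions: `plan_repair`, its arguments and the host names are hypothetical, and the actual rebuilding of pieces (the erasure decoding) is left out.

```python
import random

DATA_PIECES = 10    # pieces needed to recover a file
TOTAL_PIECES = 30   # pieces stored in total

def plan_repair(piece_hosts, online, spare_hosts, rng=random):
    """Decide what to download and where to re-upload.

    piece_hosts: dict mapping piece index -> host currently holding that piece
    online:      set of hosts that answered the renter's latest checks
    spare_hosts: list of new hosts that can take the rebuilt pieces
    """
    healthy = [i for i, host in piece_hosts.items() if host in online]
    missing = [i for i, host in piece_hosts.items() if host not in online]
    if len(healthy) < DATA_PIECES:
        raise RuntimeError("fewer than 10 pieces left, file cannot be rebuilt")
    if len(spare_hosts) < len(missing):
        raise RuntimeError("not enough spare hosts for the missing pieces")
    download = rng.sample(healthy, DATA_PIECES)   # any 10 surviving pieces will do
    uploads = dict(zip(missing, spare_hosts))     # rebuilt piece -> new host
    return download, uploads

hosts = {i: f"host{i}" for i in range(TOTAL_PIECES)}
online = {f"host{i}" for i in range(7, TOTAL_PIECES)}    # hosts 0..6 went offline
download, uploads = plan_repair(hosts, online, [f"new{i}" for i in range(7)])
```

In the scenario from the interview this yields 10 pieces to download from the remaining 23 hosts and 7 rebuilt pieces to place on 7 new hosts.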
Yes, and then sort of a feature that I find very interesting with Sia is how the files are divided prior to upload, and I think this makes Sia highly redundant. Could you explain why you've done this and how this gives you an edge over traditional cloud storage providers?
Yeah, so the really important idea behind Sia is that we don't particularly expect any individual host to be super reliable. And so I kind of go back to that we saw something very similar in the hard drive world, I think, around the year 2000. So in the year 2000, there was this concept of an enterprise storage drive. So if you were a company offering data services to a customer, it was really important that your hard drive not fail and that you have your data available for the customer.
You didn't want the customer to come to you and then you go, oh, man, the hard drive broke, we can't get you your data back. So enterprises would pay something like one hundred X the price, just this really large premium to get these ultra reliable hard drives that wouldn't fail under any circumstances.
And eventually this practice was made completely obsolete because someone realized, wait, we're paying one hundred X the price for these hard drives that are, you know, one hundred X as reliable. But we could get the same reliability just by buying three normal hard drives and storing the data three times over, because the chance that all three of them fail at the same time is much smaller even than the chance that a single enterprise grade hard drive fails. So it's like, why? Why would we pay one hundred X when we could pay three X and get even better reliability? And so this kind of idea was called just a bunch of disks. And so enterprises started replacing all these ultra expensive high reliability drives with a really low cost but highly redundant set of hard drives.
And so we've kind of done the same thing. The way data centres work today is sort of the same as those ultra expensive hard drives from the 2000s. You have these data centres with like multiple power companies and multiple ISPs and like technicians on site 24/7 and these really smart architects, and they're again like one hundred X as expensive as they have to be. But the thing is, these data centres don't go down. Amazon gets a 99.99% uptime, and that's extremely difficult to achieve from a single data centre. But it's like, why? Why would we go through all the trouble of running this ultra professional, like, insanely engineered, high quality data centre when we could just have like three random cheap data centres run by people who aren't even around most of the time. And then sometimes they go offline and they're gone for a weekend. But then the engineer can show up or whoever can restart it and get it going again.
And so that was kind of the idea behind Sia, that we can bring costs way down if we also bring our reliability requirements way down. And instead of putting data on only one data centre, we would just put it on a whole bunch of data centres. And in practice, this has played out super well. Sometimes data centre engineers get kind of uneasy when they hear about uptime requirements of 95%. They're like, that's ultra low. But just like with enterprise hard drives, the reason that we only need 95% uptime from our data centres is because we're using 30 of them. And so if three to five data centres are offline at a time, it's actually not a problem. Our data is completely safe.
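A quick back-of-the-envelope check shows why 95% uptime per host can be enough. The sketch below assumes hosts fail independently of each other, which is a simplification, and uses the 10-of-30 scheme mentioned earlier in the interview; `unavailability` is a hypothetical helper name.

```python
from math import comb

def unavailability(n: int = 30, k: int = 10, uptime: float = 0.95) -> float:
    """Probability that fewer than k of n hosts are online at a given moment.

    Assumes every host is up independently with probability `uptime`.
    """
    return sum(
        comb(n, online) * uptime ** online * (1 - uptime) ** (n - online)
        for online in range(k)          # 0 .. k-1 hosts online means the file is unreachable
    )

single_host = unavailability(n=1, k=1)      # one 95% host: unreachable about 5% of the time
thirty_hosts = unavailability()             # 10-of-30 across 95% hosts
print(f"single host down: {single_host:.2%}")
print(f"10-of-30 unreachable: {thirty_hosts:.3e}")
```

Under these assumptions the file is unreachable only if 21 or more of the 30 hosts are down at once, which is many orders of magnitude less likely than a single highly engineered data centre missing its uptime target. Correlated outages would make the real figure worse than this idealised estimate.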
Yeah, and then almost going into the process for our audience. Sort of how does the process of storing this data work for renters? Can you talk a bit about the prepaying for the storage, along with the typical length of a file contract and then the process of renewing these contracts?
Yeah, so when you create a file contract, it works very similar to something called like a payment channel or a state channel. You put a whole bunch of money into the contract up front. So in essence, you're sort of prepaying for the host storing a bunch of data, even if you haven't uploaded it yet. But at that point in time, none of the money is actually controlled by the host. So you have this escrow contract on the blockchain. It says, like, the renter has put in $100 and the host has put in $200. Actually, usually it's much less than that. So we'll say the renter has put in $5 and the host has put in $10-15. And if we close out this contract, the renter is going to get back the $5 and the host is going to get back $10. And then as the renter starts to upload data, the contract will change to reflect that, like, OK, now the renter gets back $4 and the host gets back $11, and then the renter gets back $3 and the host gets back $12.
And the amount of data that the file contract says the host is responsible for storing will grow, and the typical duration for these contracts is 3 to 6 months. I think most everyone's on the three month rotation. Some people are on a six month rotation. And at the end of that three to six months, we'll say three months, the renter can negotiate a renewal with that host and can say, instead of having me re-upload all this data again on a new contract, let's just take the data we have. I'll put more money in, you'll put more money in, and then we'll just extend it out another three months. And so that's kind of the general process. If you're storing data for two years on the Sia network, normally there's going to be somewhere around six renewals of that data.
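The way payouts shift as data is uploaded can be sketched as a sequence of contract revisions. `ContractState`, `open_contract` and `revise` are hypothetical names for this illustration, and the dollar figures follow the $5 and $10 example from the answer; the real contract format is not shown in the interview.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ContractState:
    revision: int        # how many times the contract has been updated
    renter_payout: int   # dollars the renter gets back if the contract closed now
    host_payout: int     # dollars the host gets back if the contract closed now
    stored_bytes: int    # data the host is currently responsible for

def open_contract(renter_funds: int = 5, host_collateral: int = 10) -> ContractState:
    """Both sides lock money up front; nothing has moved to the host yet."""
    return ContractState(0, renter_funds, host_collateral, 0)

def revise(state: ContractState, price: int, uploaded_bytes: int) -> ContractState:
    """Record an upload: move `price` from the renter's side to the host's side."""
    if price > state.renter_payout:
        raise ValueError("renter has not prepaid enough for this upload")
    return ContractState(
        state.revision + 1,
        state.renter_payout - price,
        state.host_payout + price,
        state.stored_bytes + uploaded_bytes,
    )

state = open_contract()                              # renter 5, host 10
state = revise(state, price=1, uploaded_bytes=10**9) # renter 4, host 11
state = revise(state, price=1, uploaded_bytes=10**9) # renter 3, host 12
```

The total locked in the contract never changes; each revision only changes how it would be split, which is why the renter can keep uploading without a new on-chain transaction for every file.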
Yeah, that's really interesting, I think, you know, we've really covered the bulk of what Sia is, and I'd like to sort of move on now to Skynet, which is this idea of building a free Internet I believe. Could you talk about what Skynet is and then the difference between Skynet and the traditional Internet?
Yeah, so Skynet was a major technology breakthrough that we had in 2020. So for the longest time, Sia was about personal storage. Everything was your data and your file contracts. And there actually wasn't very good infrastructure for sharing that data with other people. It was very difficult to do. And then one day we kind of realized, oh, we can extend the Sia platform in a way that allows us to share our data with other people, but then we realized we could go a lot further than that. We could have data that is shared across multiple people and across multiple applications. And we realized you could create things like an entire Reddit where every piece of data is owned by the users. And actually, I think for the purposes of example, I'm going to jump to YouTube.
So right now on YouTube, everything is owned by Google. You have a bunch of videos, you have a bunch of content creator accounts and channels, you have a bunch of comments and likes and just user activity data and all that is on Google, which means Google has all the power over YouTube. And if they want to deplatform someone, if they want to delete a video forever, if they want to make changes in any way to how everything works, that's their prerogative. They can do anything they want to the YouTube platform.
But on Skynet, it's possible to create it so that the creator owns their own channel and the data stored for the creator's channel, their videos and their comments and their playlists. All of that is owned entirely by the creator and controlled by the creator. And then the users who have likes, who have history, who have maybe videos that they've saved or playlists that they've made, all of those are owned by each individual user, which means nobody has the power to kind of step in and make arbitrary changes. You can't just step into a decentralized YouTube and take something down because you don't own the rights to that. You don't control the servers that thing is running on.
And so with Skynet, we can completely, like completely rebuild every piece of the Internet to be user controlled. And I think that that's something that is a very big idea. It's extremely exciting and almost a little terrifying for what it might turn into.
Yeah, I think it will catch a lot of our audience's attention and then I guess, could you almost explain to our audience how may they use Skynet? And I'm sure a lot of them are then wondering, how is this Skynet infrastructure paid for?
Yeah, so right now, there's a website called siasky.net. So if you just want to upload files and share them with friends, you can go to siasky.net. But if you want to use the more exciting stuff, that's actually on a per application basis, almost like the Internet is. If you ask someone, how do you use the Internet, the real answer is that you actually have to know what applications you want to use.
So there are things like a decentralized Dropbox called Marstorage, or there's a decentralized Twitter called SkyFeed, or there are decentralized code editors, there's something called, I believe, Hackerpaste. And so we have all these decentralized applications that are being built by developers. Some of them are ready, some of them are early, but it's almost like a restart button on the Internet where when you're using Skynet, you know that you and everyone you're interacting with has control of the data and owns the data.
Yeah, and I think we'll be sure to put the relevant links in our article for our audience, to check that out. And it's very exciting, both Sia and Skynet.
I'd now like to move on to some more personal questions for you in particular, starting with who is someone that you look up to and what is it about that person that inspires you?
Yeah. So I would say, you know, this might be kind of cliché, but one person that I definitely look up to is Elon Musk. I think that he has a reputation both for being an incredibly intelligent engineer, a savvy businessman, but also just he has a vision for how the world should be. And then just a very, like, no-nonsense attitude, or just a personality that just blasts through barriers in his way. And so he's like, we're going to change the entire world to run on electric cars instead of gas cars. And there are a hundred million reasons why that's an insane idea, and it's like, Elon Musk, one man just can't accomplish that.
That's a big task for you to say, I'm just going to do it. But then he was like, no, I'm just going to do it. And it took a decade or whatever, but he did it. And then a similar thing with SpaceX, right? This wasn't just a one off thing. He's like, you know, I want to put humans on Mars. We're going to colonize Mars. I'm going to colonize Mars. And of course, we haven't seen that play out yet. Elon Musk has not yet put someone on Mars. But when you look at the progress that SpaceX has made, he's a million times further than, you know, even his own mother probably would have given him credit for when he set out to accomplish this. And so I think I just take a huge amount of inspiration from just how bold he's been in deciding what he's going to do and how he's going to change the world.
And then also kind of putting his money where his mouth is, going all in on these ideas and then actually blasting through barriers that everybody thought would be impossible to overcome. I find that extremely inspirational as an entrepreneur.
Yeah, I mean, he is someone that, you know, we've had a few interviews and almost 50 percent of them said their inspiration was Elon Musk, haha.
Sort of going on to what are the attributes you would attribute your success thus far with Sia to, for example, maybe your work ethic or your belief in your idea, something like this. What are three that you would pick?
Yeah, so I think one, and probably our strongest suit, is technical innovation. I think our team has a way of looking at technical challenges and saying we're going to make it happen. And so early on, we will basically split into two paths. Either we will do it or we will prove it's impossible. And if we can't prove that it's impossible, we just assume it's possible and we go for it. And I think we've been very effective at accomplishing technical feats that people just assume were way over our heads. And I think Skynet is like kind of the pinnacle of just this attitude of like we will make it so that people can store their own data and that you can have an Internet where every single person owns all of their own data. And that's something that I think would have sounded absolutely ridiculous in 2015, just from a technical standpoint. But here we are today.
And then another thing that I think is very defining of our team is how kind of mission driven we are. Our team is not focused on making money as the number one priority. That's not to say that like we shun money or that we're not looking to make money in some form, but really we're looking to change the world. And so the money aspect of what we're doing is, you know, you need money to motivate people to change the world. But like we want to change the world above everything else. And I think that also really comes through in a lot of the decisions we've made. We've been through three bull markets now and I think through that we've been very, very successful at staying focused on accomplishing a world where people own their data rather than getting distracted by a three billion dollar market cap.
Yes, and then almost apart from the blockchain space and blockchain technology. Are there any other industries and spaces that you really have a particular interest in, maybe artificial intelligence, for example?
Yeah, there are a few. So at least at this point, it looks like I'm going to continue being in blockchain for a long time. I think that we're in a very good spot in terms of like special knowledge and also just like in a privileged position to make positive change in the world. But a couple other things that have my attention. One I would say is open source hardware, which is another one of those things that if you talk to all the established players in hardware, you know, they just kind of laugh off the idea of open source hardware. I don't think software people realize just how bad the hardware world is. We've dipped our toes. We've made hardware and shipped it before.
But it's like in software, everything's open source. And so if you want to, like, sort data, you can just go to Stack Overflow and someone will tell you how to sort data. And if you want to use a different programming language, that programming language is free. And if you want to do linting or vetting or, like, there's all of these tools, all of these libraries, there's like databases that are free, there are entire operating systems that are free. Like trillions of dollars of R&D, that's all completely free in the software world. And in the hardware world, everything costs money. If you want to sort integers in hardware, you can pay $50,000 for the privilege of doing that. It's like not even close to cheap. You have to be an established, like funded company to be able to do anything sophisticated in hardware because even the most basic things cost you anywhere from ten thousand dollars to several million dollars for the IP to use in your project.
And I think that this is going to change. And it's something that if blockchain didn't exist, I think I would be very focused on. I'd be working to make laptops completely open source, smartphones completely open source, cars and self-driving cars completely open source, TVs, just like all of this stuff that people aren't used to having the right to see how it was made. I think it's time to change hardware and make it an open source world. So that's kind of one thing.
Another thing is space. I really like the idea of humans being a multi star system race, and I would love to contribute to that. So whether it's like space manufacturing or asteroid mining or some other, I would love to find an industrial reason to go to space. And something that we can manufacture in space that we couldn't on Earth to kind of accelerate the process of getting humans into space.
Yes, I agree. I think both of those industries you mentioned are very interesting. I want to thank you very much for taking the time to participate in this interview. Do you have any additional information you wanted to add?
No, it's been a pleasure and yeah, it's really, really fun to talk about everything.
I thank you very much.
Ok, take care.
In This Interview
Your Involvement and studies before blockchain.
What about Blockchain excited you?
An Overview of what Sia is.
An explanation of the technology used to keep files secure.
The costs of decentralised file storage.
How files are sent to hosts on Sia.
How hosts prove they are storing renters' data.
The data storage process and how file contracts work.
Details about Skynet and creating a new internet.
Who Inspires You?
Key Attributes to Achieve Success.
Outlook on the Future and other Industries.
To Learn More about Sia and what they are building you can visit: