Review: Picolo, the future of querying decentralized databases?

July 09, 2018

by Editorial Team

Contributors: Erick Pinos, Lead Researcher, Game Theory Group

Problem

The first generation of dapps stored their data entirely on the blockchain. Newer dapps store data on decentralized file storage systems like IPFS while storing only the hashes of the data on the blockchain. This works for dapps with simple storage needs like images, videos, documents, etc. But it is limiting for dapps with more complex needs, such as storing data in an easily queryable database. Sia, Storj, and IPFS are not currently suited for storing structured data and do not support complex queries beyond simple keyword search. Dapps are turning to cloud-hosted databases like Google’s Cloud SQL, but that introduces centralization into the system.

Solution

Picolo is an open decentralized database network that combines the benefits of a traditional database and p2p software. Anyone with spare computing power and disk space can join and get rewarded for hosting and serving structured data. The Picolo overlay can be implemented on top of any datagram network protocol such as UDP or IP. Other blockchains, like Ethereum, can offload data to the Picolo network similar to IPFS. Picolo can also be used to index data from other storage networks like IPFS and Storj.

There are four types of participants in the Picolo network:

● Storage providers - provide computing resources (disk space, CPU, memory) for PINT tokens. Providers are required to put up security deposits in PINTs. Malicious nodes are punished by having their deposits slashed. Honest nodes are rewarded for diagnosing and reporting Byzantine behavior. Proof-of-Query is used for the diagnosis, and deposit slashing is enforced by a smart contract system.

● Storage consumers - pay providers and use the resources. Dapp developers can use the network with the same ease of use as a cloud hosted database but at a much cheaper rate.

● Data providers - people who want to share their data and sell access to it. Ex: a program that reads the state of Ethereum and stores it in query-able format on Picolo. Storage consumers can also be data providers. Ex: an end user of a dapp that stores data on Picolo and puts data control in the hands of the user can become a data provider for other dapps.

● Data consumers - people/applications/nodes interested in data shared by data providers. Users can solely control their data and allow applications to provide services by selectively giving access to their data.

Use Cases

● Dapps that need to be able to make complex queries to their databases (beyond keyword search)

● A program that reads the state of Ethereum and stores it in query-able format on Picolo.

● Ethereum can offload data to the Picolo network.

● Picolo can be used to index data from other storage networks like IPFS and Storj.

● Enigma can use Picolo to safely store their “secret contract” state of secure computations and avoid bloating blocks.

● Decentralized twitter that gives users control over their tweets. Users can use any other client to interact with their tweets.

● Decentralized ticketing app where data control stays with the app and thus users can’t delete data about which tickets they bought.

● A program that monitors the contract state of CryptoKitties and keeps track of stats (kitten births, times of creation, trading, popularity) and sells that data to developers/researchers.

Technology

Picolo is a p2p network that any node can join. Nodes are connected to each other, and discover each other in a decentralized manner, through a distributed hash table (DHT) overlay network. Each node maintains just enough routing information to discover the topology of the whole network.
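To make the DHT idea concrete, here is a minimal sketch of how a key can be mapped deterministically to a node. It assumes a Kademlia-style XOR distance over SHA-1 identifiers; this review does not say which DHT scheme Picolo uses, and the names `node_id`, `responsible_node`, and the node labels are hypothetical. For simplicity the function sees the full node list, whereas in a real DHT each node holds only a partial routing table and reaches the owner in several hops.

```python
import hashlib


def node_id(name: str) -> int:
    """Hash an arbitrary name (node address or data key) into a 160-bit ID."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


def responsible_node(key: str, nodes: list) -> str:
    """Return the node whose ID is closest to the key's ID by XOR distance.

    Assumption: XOR distance as in Kademlia-style DHTs; not confirmed for Picolo.
    """
    target = node_id(key)
    return min(nodes, key=lambda n: node_id(n) ^ target)


# Hypothetical node names and key, for illustration only.
nodes = ["node-a", "node-b", "node-c", "node-d"]
owner = responsible_node("users:42", nodes)
```

Because the mapping depends only on the hashes, any participant that knows the same set of nodes computes the same owner for a key, with no central directory involved.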

Picolo offers a familiar SQL interface for developers to interact with, as well as distributed transactions, external consistency, lock-free reads and snapshot reads.

Node Clusters

Storage providers power the network and are organized into clusters. Each cluster serves a storage consumer, typically a dapp, and consists of shards. Each shard holds a small piece of the data set, defined by a key range. Each key range is replicated for durability and availability. Clusters are horizontally scalable by adding more nodes and splitting data into new shards.
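The cluster layout described above can be sketched roughly as follows. This is an illustration under assumptions, not Picolo's implementation: the `Shard` and `Cluster` classes, the in-memory `rows` dictionary, and the node names are all hypothetical, and replication is represented only as a list of node names rather than actual copies.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class Shard:
    start: str                      # inclusive lower bound of this shard's key range
    replicas: list                  # names of the nodes holding a copy of this range
    rows: dict = field(default_factory=dict)


class Cluster:
    """One consumer's cluster: an ordered list of shards covering the key space."""

    def __init__(self, replicas):
        # A new cluster starts as a single shard covering every key ("" sorts first).
        self.shards = [Shard(start="", replicas=list(replicas))]

    def _index_for(self, key):
        starts = [s.start for s in self.shards]
        # Index of the rightmost shard whose start is <= key.
        return bisect.bisect_right(starts, key) - 1

    def shard_for(self, key):
        return self.shards[self._index_for(key)]

    def put(self, key, value):
        self.shard_for(key).rows[key] = value

    def get(self, key):
        return self.shard_for(key).rows.get(key)

    def split(self, split_key, new_replicas):
        """Move keys >= split_key out of their current shard into a new shard."""
        idx = self._index_for(split_key)
        old = self.shards[idx]
        if split_key == old.start:
            raise ValueError("a shard already starts at this key")
        new = Shard(start=split_key, replicas=list(new_replicas))
        for key in [k for k in old.rows if k >= split_key]:
            new.rows[key] = old.rows.pop(key)
        self.shards.insert(idx + 1, new)


# Hypothetical usage: scale out by handing the upper key range to new nodes.
cluster = Cluster(["node-1", "node-2", "node-3"])
for k, v in [("apple", "1"), ("mango", "2"), ("zebra", "3")]:
    cluster.put(k, v)
cluster.split("m", ["node-4", "node-5", "node-6"])
```

After the split, reads and writes for keys from "m" onward are routed to the new replica set, while the original nodes keep serving the lower range.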

Proof-of-Query

Nodes in a database cluster ping each other for health checks, requesting small, randomly sampled pieces of the data that the target node is responsible for storing. If a node fails the check, its security deposit is slashed. Thus, storage consumers can be reasonably confident their data is being reliably stored.
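A rough sketch of such a check is shown below. The details are assumptions made for illustration: the challenger is taken to hold a replica of the same key range, answers are compared as SHA-256 digests, and the `samples` and `slash_fraction` parameters are invented. In Picolo the slashing is described as being enforced by a smart contract system, which this in-memory example does not model.

```python
import hashlib
import random


class Node:
    def __init__(self, name, deposit, rows):
        self.name = name
        self.deposit = deposit      # security deposit, in PINTs
        self.rows = dict(rows)      # the data this node claims to store

    def answer(self, key):
        """Reply to a challenge with a digest of the stored value, or None if missing."""
        value = self.rows.get(key)
        if value is None:
            return None
        return hashlib.sha256(value.encode("utf-8")).hexdigest()


def proof_of_query(challenger, target, rng, samples=3, slash_fraction=0.5):
    """Challenge `target` on randomly sampled keys; slash its deposit on failure.

    `samples` and `slash_fraction` are hypothetical parameters.
    """
    keys = rng.sample(sorted(challenger.rows), k=min(samples, len(challenger.rows)))
    for key in keys:
        if target.answer(key) != challenger.answer(key):
            target.deposit -= target.deposit * slash_fraction
            return False
    return True


# Hypothetical data: the faulty node has silently dropped everything.
rows = {"k1": "alice", "k2": "bob", "k3": "carol", "k4": "dave"}
challenger = Node("challenger", 100.0, rows)
honest = Node("honest", 100.0, rows)
faulty = Node("faulty", 100.0, {})

rng = random.Random(0)
honest_ok = proof_of_query(challenger, honest, rng)
faulty_ok = proof_of_query(challenger, faulty, rng)
```

Because the sampled keys are unpredictable, a node cannot pass reliably by keeping only part of its assigned range. Note that the sketch trusts the challenger's copy; a real protocol would also need a way to handle a dishonest challenger.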

Network Subsystem

The network layer is a fully decentralized, p2p overlay-based routing layer among the nodes participating in the Picolo network.

Location independent routing - class of techniques for locating objects based on content rather than their location.

Picolo’s design offers the following properties:

● Deterministic node mapping - able to locate objects anywhere in the network.

● Low routing inefficiency - routes have low stretch (the ratio between the network distance traveled by a query to an object and the minimum possible network distance from the query to the object). Optimally, a query would take the shortest route, giving a stretch of 1.

● Balanced load - load should be evenly distributed across the nodes in the network

● Dynamic membership - the system allows arrival and departure of nodes while maintaining functionality. This is important for handling failures and Byzantine behavior of nodes.
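The stretch metric from the list above is simple to compute once the hop distances are known. The sketch below uses made-up latency figures and a hypothetical `stretch` helper purely to illustrate the definition.

```python
def stretch(route_hops, direct_distance):
    """Ratio of the distance a query actually travels to the shortest possible distance."""
    if direct_distance <= 0:
        raise ValueError("direct_distance must be positive")
    return sum(route_hops) / direct_distance


# Hypothetical example: a query crosses three overlay hops of 10, 15 and 5 ms,
# while the shortest possible path to the object would take 20 ms.
example = stretch([10.0, 15.0, 5.0], 20.0)   # 30 / 20 = 1.5
```

A stretch of 1.0 means the overlay route is as short as the best possible route; larger values measure how much the overlay's detours cost.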

Market Size

The cloud storage market size is expected to grow from USD 30.70 billion in 2017 to USD 88.91 billion by 2022 (Markets and Markets Report).

Competitors

PeerDB - full-fledged data management system that supports fine-grained content-based searching on a distributed network of nodes with heterogeneous schemas (mismatching data). Academic project, not industry. Introduced the “code goes to data” paradigm, where computation is done at the peer nodes with mobile agents to reduce network bandwidth.

PIER - provides a relational data model and query operators on top of any distributed storage system. Does not provide replication or the ACID (Atomicity, Consistency, Isolation, Durability) guarantees of a database.

Piazza - Peers have pairwise schema mappings between heterogeneous schemas. Works for a few reformulations, but a large number of them may result in info loss or returning of irrelevant results.

Filecoin, Storj, Sia - great for storing large unstructured files like images, videos, documents, etc, but not suitable for storing structured data and do not support complex queries beyond simple keyword search.

BigchainDB - offers blockchain-like characteristics on top of a MongoDB engine. Can be queried. However, membership is controlled by the devops team for a particular BigchainDB instance.

Bluzelle - open participation model, but only supports key value lookup semantics, has no notion of data sharing and access control, does not support distributed query processing.

Mediachain - decentralized data network for tracking ownership of creative arts like music. Supports a query language that enables complex querying. Not in active development.

Token Utility/Economics

Any node that runs Picolo’s software can deposit PINTs (Picolo Network Tokens) and become a storage provider eligible to earn rewards.

There will be a fixed number of PINTs, all minted at genesis, distributed amongst investors, developers, partners, and network participants.

Picolo uses the work token model. Network tokens are only required for storage providers to earn the right to join the network. Providers can, however, accept payment from users in any form they prefer.

When demand for the network goes up, more providers will want to join the network and will have to purchase PINTs to use as a stake.

Community/Social

Telegram: 665 members

Twitter: 29 followers

Team

No info on the website or LinkedIn yet. Adi Kancherla is a Telegram admin:

Arunesh Mishra:

https://www.linkedin.com/in/aruneshm/

Adi Kancherla:

● Software engineer at Google for almost 2 years (SF)

● Software engineer at Apigee for 9 months (SF)

● Software development engineer intern at Amazon for 4 months (Seattle)

● Senior member of Technical staff at Oracle (Bangalore)

● Associate Technology at Sapient for almost 2 years (India)

● Software Developer intern at Knolskape Solutions for 6 months (Bangalore)

● Master’s degree in computer science from University of Wisconsin-Madison

● Bachelor of Engineering from Birla Institute of Technology and Science, Pilani

Found some contributors on github (https://github.com/picolonetwork/alpha/graphs/contributors):

Thomas J. Hu:

● Software Engineer at CryptoParency for less than a year

● Undergraduate researcher at University of Western Ontario for 3 years

○ Research Topics: Rule-Based Integration Systems, Symbolic Integration, Lambert W function

● B.Sc in Mathematics from the University of Western Ontario

Arjun Sharma:

● Software engineer at IBM Cloud for 3 years in SF (developed VMware Cloud Foundation solutions on IBM Cloud and Network Service Applications for IBM Bluemix Cloud)

● Software engineer intern at Ericsson for 5 months in San Jose

● Software engineer at Computer Sciences Corporation for 3 years

● Master of Science in Computer Networks from University of Southern California

Opportunities

● Picolo is aiming to build a data network where queries can be executed across all nodes with different schemas: the machine-readable Web (commonly known as Web 3.0).

● Picolo is building towards the idea of data sovereignty. Two different schema types: application controlled and user controlled.

● Can integrate with Ethereum. Ethereum can offload to Picolo for easier, faster querying, and cheaper storage.

● Can integrate with IPFS, Sia, and Storj. So maybe it can’t beat them with actual storage, but it can become an indexing solution for them.

Risks

● Lots of use of the phrase “we aim” to do this/that. Very early stages.

Thoughts/Thesis

Decentralized storage in general has been a wildly popular and straightforward use case of blockchain. Look at how early Sia was to the game and how much Filecoin has raised. People get it. Picolo is sufficiently different from decentralized storage systems like Storj, IPFS, or Sia, which work for general file storage but are not optimized for constructing a queryable database for a dapp. Concern: how hard is it to add a queryable database solution to IPFS, Storj, or Sia, and are any in development? (EDIT: WHOA. Picolo IS the queryable database solution to IPFS, Storj, and Sia!) Not only can people store data on the storage providers’ disk space on the Picolo network, but the Picolo network can also be used to index and create a query-able database of other decentralized storage solutions like IPFS, Storj, and Sia. The flexibility of the Picolo network, if it works, is incredible and I highly suggest exploring this opportunity further.

External Links

Website: https://picolo.network/

Whitepaper: https://picolo.network/paper

Telegram: https://t.me/picolonet

Twitter: https://twitter.com/picolonetwork