Barrel Protocol aims to become the infrastructure that will facilitate a free-flowing and compliant data economy. Using the blockchain and smart contracts, Barrel automatically verifies that data is completely clean from any identifiable information, ensuring data compliance, privacy, and security. After the data has been verified, it is sealed and certified by the Barrel Protocol.
Barrel’s mission is to build an eco-system that will ensure and certify that data is clean and compliant with laws and regulations, thereby removing the barriers to a free-flowing data economy.
After months of hard work developing Barrel Protocol’s Alpha version, we’re proud to present it and demonstrate its utility to the blockchain community. Throughout this article we will use the following sample CSV file containing basic data from ride hailing e-receipts:
Our protocol is divided into three distinct roles: Data Provider, Data Consumer and Validators. We’ll show you how the file is uploaded to Barrel Network by a Data Provider, then certified as anonymised by the Validators, and finally how this piece of data can be decrypted and downloaded from the network by a Data Consumer.
This whole eco-system is summarised in the following diagram:
We will explain in detail each and every step of the workflow, by describing each role of the eco-system:
Data Provider
The Data Provider wants to sell some data that belongs to them and upload it to Barrel Network. This data can be any valuable asset that is supported by our network. At launch, Barrel Protocol will support e-receipts and SMS receipts data from online/in-app purchases and ride-hailing/food delivery. Thereafter, developers will have the opportunity to add support for other data types.
When a data file is uploaded to Barrel Network, the first step is to check that the file structure is correct so that the protocol can understand it.
In our example case, we’ll check that the file format corresponds to the generic Ride Hailing e-receipts format. More precisely, our protocol will check that all required fields are present in the file, and that any other fields belong to the optional fields of this format. In addition, each column in the data file holds data of a specific type, and we need to check its compliance. For example, distance is a column of type float (real number), so we need to check that every value in this column is a float. This is a simple one to check, but if we look at the source_address column, which is a column of type address, we’ll need to check that every value in it is an address, and this requires somewhat more advanced verifications.
At this point, our data file is structurally valid so we can start uploading it to Barrel Network! When a Data File is uploaded, each column is separated and divided into same-size chunks, and each chunk is sent to the Verification Network.
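The structural check described above can be sketched as follows. This is only an illustration and not Barrel’s actual implementation: the schema layout, the currency field, the helper names and the very rough address check are all assumptions made for the example. Only distance and source_address come from the sample format.

```python
import csv
import io

# Hypothetical schema: only `distance` and `source_address` come from the
# article; the optional `currency` field and the type names are assumptions.
REQUIRED_FIELDS = {"distance": "float", "source_address": "address"}
OPTIONAL_FIELDS = {"currency": "text"}


def is_float(value):
    try:
        float(value)
    except ValueError:
        return False
    return True


def is_address(value):
    # Placeholder only: a real address check needs far more than this.
    text = value.strip()
    return len(text) >= 5 and any(ch.isalpha() for ch in text)


TYPE_CHECKS = {"float": is_float, "address": is_address, "text": lambda v: True}


def validate_structure(csv_text):
    """Return a list of problems; an empty list means the file is valid."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = set(reader.fieldnames or [])
    schema = {**REQUIRED_FIELDS, **OPTIONAL_FIELDS}

    # Every required field must be present, and nothing outside the schema.
    errors = [f"missing required field: {name}"
              for name in sorted(set(REQUIRED_FIELDS) - header)]
    errors += [f"unknown field: {name}"
               for name in sorted(header - set(schema))]

    # Every value must match the type declared for its column.
    for line_no, row in enumerate(reader, start=2):
        for name in sorted(header & set(schema)):
            value = row.get(name) or ""
            if not TYPE_CHECKS[schema[name]](value):
                errors.append(
                    f"line {line_no}: {value!r} is not a valid "
                    f"{schema[name]} in column {name}"
                )
    return errors
```

An empty result means the file can move on to the chunking step; any other result lists what has to be fixed first.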
SHA-256 stands for Secure Hash Algorithm, 256-bit, and is a hash function commonly used in blockchains. A hash function is a mathematical function that takes input data of any size and turns it into an output of a fixed length, called a hash, which serves as the fingerprint of that data. Example of a SHA-256 hash: df0342321b232ddc463d7490fb75ac42cc9dfd342126de6e81ce110dee2758413
A chunk name has the following format: {chunk hash}_{chunk type}.txt, where chunk hash is the 256-bit fingerprint of the Data Chunk (as explained above).
When all Data Chunks of the same file have been uploaded to the Verification Network, our protocol outputs the Merkle Root Key, which contains all hashes of the file’s chunks in the right order. Since a chunk hash represents the fingerprint of the data inside this chunk, the Merkle Root Key represents the fingerprint of the whole Data File. This means that once uploaded to the network, a Data File is accessible only by providing the Merkle Root Key; without it, the chunks remain independent and carry no meaning.
At the end of the process, Barrel Network computes an amount of BRN Tokens which will be the file’s cost on the network, according to several parameters that include the number of rows in the file, the data set type and the quality of the data, which was calculated during the upload process.
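To make the chunking and naming steps concrete, here is a minimal sketch using Python’s standard hashlib module. The chunk size, the way a chunk is serialised before hashing (values joined by newlines, UTF-8 encoded) and the function names are assumptions for illustration; the article only fixes the {chunk hash}_{chunk type}.txt naming format.

```python
import hashlib


def split_column(values, chunk_size):
    """Split one column into consecutive chunks of `chunk_size` values.

    The last chunk may be shorter; padding it to full size is left out here.
    """
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def chunk_name(chunk, chunk_type):
    """Build the `{chunk hash}_{chunk type}.txt` name for a chunk.

    Assumption: the chunk is serialised as newline-joined UTF-8 text
    before being hashed with SHA-256.
    """
    payload = "\n".join(chunk).encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()  # 64 hex characters
    return f"{digest}_{chunk_type}.txt"


# Usage: chunk a `distance` column and name every chunk.
distances = ["3.2", "7.9", "1.4", "12.0", "5.5"]
names = [chunk_name(chunk, "float") for chunk in split_column(distances, 2)]
```

Because the name is derived from the content, two chunks with identical content and type end up with the same name.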
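The exact construction of the Merkle Root Key is not detailed here, so the following is only a sketch under assumptions: it takes the ordered list of chunk hashes and folds it into a single fingerprint using the textbook Merkle tree construction (pairwise SHA-256, duplicating the last hash on odd levels). The function names are hypothetical.

```python
import hashlib


def sha256_hex(data):
    return hashlib.sha256(data).hexdigest()


def merkle_root(chunk_hashes):
    """Fold an ordered list of hex chunk hashes into one root hash.

    Textbook construction, assumed for illustration: hash neighbouring
    pairs level by level; an odd level duplicates its last element.
    """
    if not chunk_hashes:
        raise ValueError("at least one chunk hash is required")
    level = list(chunk_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [
            sha256_hex(bytes.fromhex(level[i]) + bytes.fromhex(level[i + 1]))
            for i in range(0, len(level), 2)
        ]
    return level[0]


# Usage: the order of the chunks matters, so swapping two chunks
# produces a different fingerprint for the file.
hashes = [sha256_hex(part) for part in (b"chunk-1", b"chunk-2", b"chunk-3")]
root = merkle_root(hashes)
```

Changing any chunk, or the order of the chunks, changes the root, which is what lets a single value stand in for the whole Data File.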
Let’s recap this process with the help of the following diagram:
1. Upload Data File to Barrel Protocol
2. Transfer BRN Tokens as a reward for Validators
3. Create Data Chunks from Data File
4. Upload Data Chunks to Barrel Verification Network
5. Compute and upload Merkle Root Key to Barrel Protocol
6. When Data Consumer is buying file, transfer BRN Tokens to Data Provider
Validators
The Validators’ role is to ensure that all Data Chunks that have been sent to Barrel Verification Network are verified and anonymised. Think of it as a stamp on your data that gives it a certain certification, whose requirements are defined by Barrel Protocol.
Personally identifiable information (PII) is any data that could potentially identify a specific individual. Any information that can be used to distinguish one person from another and can be used for de-anonymising anonymous data can be considered PII.
Every node involved in Barrel Protocol will check whether a Data Chunk contains PII. Remember that we saved the chunk type in its name (each chunk name is {chunk hash}_{chunk type}.txt)? Each chunk type is tied to a verification function which removes all possible PII content from the chunk. So if we want our Data Chunk to be PII-cleaned, we need to trigger the relevant verification function. Let’s get back to our example: we have already verified that the source_address column contains only addresses, but we still need to verify whether those addresses are PII. Each data input is therefore processed by the address verification function, which ensures that it does not contain PII.
By applying this process to each and every Data Chunk, we ensure that compliance is enforced per data set and not per company, since every data set type has a global verification function that is shared by all data of this type within Barrel Protocol. Likewise, if two Data Consumers are interested in buying the same Data File, the process happens only once, since each Data Chunk has already been validated on the Barrel Network.
When a Data Chunk has been validated by a Verification Node, we encrypt its content and move this freshly created Encrypted Data Chunk to the Barrel Anonymised Network. By completing this action, the Node Verifier is rewarded with BRN Tokens that were transferred by the Data Provider when uploading the Data File.
To certify that a chunk is clean, we ask for a certain number of confirmations from Validators (“Validations Threshold”). Once a Data Chunk is considered clean, other Validators don’t need to check it again and waste computational power.
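The link between a chunk type and its verification function can be sketched as a simple lookup table. Everything below is illustrative: the PII rules (rejecting anything that looks like an e-mail address or a phone number inside an address) and the function names are assumptions, not the checks Barrel Protocol actually runs.

```python
import re

# Assumed PII patterns, for illustration only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s-]{7,}\d")


def address_is_clean(value):
    return not EMAIL.search(value) and not PHONE.search(value)


def float_is_clean(value):
    try:
        float(value)
    except ValueError:
        return False
    return True


# One verification function per chunk type (hypothetical registry).
VERIFIERS = {"address": address_is_clean, "float": float_is_clean}


def chunk_type_from_name(name):
    """Extract `{chunk type}` from `{chunk hash}_{chunk type}.txt`."""
    stem = name[:-len(".txt")] if name.endswith(".txt") else name
    _chunk_hash, _sep, chunk_type = stem.partition("_")
    return chunk_type


def verify_chunk(name, lines):
    """Run the verification function tied to the chunk's type on every line."""
    chunk_type = chunk_type_from_name(name)
    if chunk_type not in VERIFIERS:
        raise ValueError(f"unsupported chunk type: {chunk_type!r}")
    check = VERIFIERS[chunk_type]
    return all(check(line) for line in lines)
```

Supporting a new data type then comes down to registering one more function in the table.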
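A minimal sketch of the Validations Threshold idea follows. The class name, the validator identifiers and the choice to count each Validator only once are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ChunkStatus:
    """Tracks confirmations for one Data Chunk (hypothetical structure)."""
    threshold: int
    confirmations: set = field(default_factory=set)

    @property
    def is_certified(self):
        return len(self.confirmations) >= self.threshold

    def confirm(self, validator_id):
        """Record a confirmation.

        Returns False when the chunk is already certified, so the caller
        knows no further verification work is needed.
        """
        if self.is_certified:
            return False
        self.confirmations.add(validator_id)  # a set counts each Validator once
        return True


# Usage: with a threshold of 2, the third Validator is turned away.
status = ChunkStatus(threshold=2)
status.confirm("validator-a")
status.confirm("validator-b")
needed = status.confirm("validator-c")
```

After the second confirmation the chunk is certified, so the third call returns False without recording anything.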
Let’s recap this process with the help of the following diagram:
1. Download Data Chunks to Barrel Verification Node
2. Verify and encrypt Data Chunks
3. Upload Encrypted Data Chunks to Barrel Anonymised Network
4. Transfer BRN Tokens from Data Provider to Node Verifier
Data Consumer
The Data Consumer wants to buy some data that was uploaded to Barrel Network. They want this data to be PII-cleaned, certified by Validators and easily accessible.
When acquiring data on Barrel Network, the Data Consumer needs to transfer an amount of BRN Tokens to the Data Provider, equal to the amount computed by Barrel Protocol when the Data Provider uploaded the file.
After the requested amount is paid, the Merkle Root Key of the file we want to download is delivered, which allows Barrel Protocol to reassemble the file. How exactly does it work? Going through the Data Chunk hashes line by line, we look up each chunk on the Barrel Anonymised Network and check whether it has received enough confirmations from Validators (“Validations Threshold”). If it has, we access the Data Chunk and retrieve its content. After all chunks’ contents have been retrieved, the Data File is restored and decrypted entirely so that the Data Consumer can download it.
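The reassembly loop can be sketched like this. The storage layout (a dictionary keyed by chunk hash), the StoredChunk type, the integrity re-check after decryption and the XOR stand-in cipher are all assumptions for illustration; the actual encryption scheme is not specified here, and XOR with a fixed byte is not secure.

```python
import hashlib
from typing import NamedTuple


class StoredChunk(NamedTuple):
    ciphertext: bytes
    confirmations: int


def toy_cipher(data):
    # Stand-in for the real (unspecified) encryption: XOR with a fixed
    # byte, which is its own inverse. Not secure, illustration only.
    return bytes(b ^ 0x5A for b in data)


def reassemble(ordered_hashes, network, decrypt, threshold):
    """Rebuild a file by walking the chunk hashes in order."""
    parts = []
    for chunk_hash in ordered_hashes:
        stored = network[chunk_hash]
        if stored.confirmations < threshold:
            raise ValueError(f"chunk {chunk_hash} is not certified yet")
        plaintext = decrypt(stored.ciphertext)
        # Extra safety check (an assumption): the decrypted content must
        # match the fingerprint it was requested under.
        if hashlib.sha256(plaintext).hexdigest() != chunk_hash:
            raise ValueError(f"chunk {chunk_hash} failed its integrity check")
        parts.append(plaintext)
    return b"".join(parts)


# Usage with two small chunks and a threshold of 3 confirmations.
chunks = [b"3.2\n", b"7.9\n"]
hashes = [hashlib.sha256(c).hexdigest() for c in chunks]
network = {h: StoredChunk(toy_cipher(c), 3) for h, c in zip(hashes, chunks)}
restored = reassemble(hashes, network, toy_cipher, threshold=3)
```

Without the ordered list of hashes there is no way to tell which chunks belong together or in which order, which is why the key is delivered only after payment.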
Let’s recap this process with the help of the following diagram:
1. Transfer BRN Tokens to Data Provider who uploaded this Data File
2. Upload Merkle Root Key from Barrel Protocol
3. Download Encrypted Data Chunks from Barrel Anonymised Network
4. Reassemble Data File from Merkle Root Key
5. Download Data File to Data Consumer
How will Data be validated?
Throughout this article we showed you a way to compute over encrypted data called Secure Multi-party Computation (MPC), which emulates a trusted third party by using a group of untrusted parties. In our case, each Data Chunk is validated by an untrusted party, yet the chunks put together can reconstruct the original data.
You can limit the risk of giving a validator access to data by ensuring that their specific slice of data is so small that it is insignificant, for example by reducing a ride-hailing transaction to its components, like just a date field or part of an address. While reducing the size of the Data Chunk helps reduce the risk, certain fields require further encryption. In our case we are implementing Homomorphic Encryption, which enables computing on encrypted data. In layman’s terms, this means running a set of algorithms, and ensuring they were in fact executed, without having access to the underlying data.
Although we will write another article that focuses on this topic, our choice will be based primarily on scalability.
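To give a feel for what computing on encrypted data means, here is a toy version of the Paillier cryptosystem, a well-known additively homomorphic scheme. It is not necessarily the scheme Barrel Protocol will use, and the tiny primes make it completely insecure; it only shows that two encrypted numbers can be added without ever being decrypted.

```python
import math
import secrets


def keygen(p, q):
    """Toy Paillier key generation from two primes (generator g = n + 1)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse, needs Python 3.8+
    return n, (lam, mu)


def encrypt(n, message):
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    # (1 + n)^m mod n^2 equals 1 + m*n, so no exponentiation is needed for g.
    return ((1 + message * n) * pow(r, n, n2)) % n2


def decrypt(n, private_key, ciphertext):
    lam, mu = private_key
    x = pow(ciphertext, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)


# Usage: whoever computes `total` never sees 12, 30 or 42 in the clear.
n, private_key = keygen(61, 53)
total = add_encrypted(n, encrypt(n, 12), encrypt(n, 30))
result = decrypt(n, private_key, total)
```

Only the holder of the private key can read the result, while anyone holding the public value n can perform the addition.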