Proof of data possession

From CryptoWiki
Jump to: navigation, search

Proof of data possession protocols (PDP) provide the integrity of test data stored on remote servers, and provide a number of requirements: easy and reliable test, which is based on various assumptions, security and computational efficiency.


Relevance of the use of Proof of data possession protocols

Modern technological advances have led to an increase in the popularity of the cloud storage and services. Digital content grows continuously, and therefore there grow the demand for data storage. The remote storage systems is growing interest on the part of users and companies, since they increase the cost-effectiveness of the existing architecture. These solutions support the transmission, remote storage and data-intensive. However, these forward-looking at first glance, in the design the cloud services generate a large number of complex issues, mostly related to the loss of end-user control of the basic properties of information in the process of remote processing. These issues are directly related to the state security of the cloud system. Critical security problems associate with the storage of sensitive data in the cloud services, with users required proof of safety of services that enable trusted providers of commercial cloud services. As storage-outsourcing services and resource-sharing networks have become popular, the problem of efficiently proving the integrity of data stored at untrusted servers has received increased attention. In the provable data possession (PDP) model, the client preprocesses the data and then sends it to an untrusted server for storage, while keeping a small amount of meta-data. The client later asks the server to prove that the stored data has not been tampered with or deleted (without downloading the actual data). However, the original PDP scheme applies only to static (or append-only) files. One of the most acute problems associated with the storage of data from remote services is the problem of ensuring the integrity and availability of data to untrusted systems. Typically, providers of cloud servers argue that storing user files made redundant, sufficient to provide protection from data loss. Most systems used solutions providing distributed data storage fragments. The distribution of data across multiple independent software and hardware platform provides resistance against the problems associated with the physical violation of the integrity of computing and disk resources (hardware problems, breakdowns, natural disasters and others). However, in order to reduce operating costs and keep a place in the "cloud" deliberately dishonest suppliers may neglect the replication procedures that can lead to permanent damage or data loss. The attacks of various kinds by third parties can also lead to data leakage. Consequently, the client who uses the cloud service, must have an effective way to periodically check the basic properties of information stored in the service, without having to download the data to a computer. Worry client increase its limited storage capabilities, relatively low processing power of client equipment and large amounts of information. The solution may be a deterministic protocol PDP, which, in turn, have a number of advantages:

Firstly, such protocols supported by the public that the data owners frees the necessity of periodic inspection.

Second, it provide a constant complex switching in which messages are exchanged between the storage server and the client.

Third, it's effective and safe, so it used the approach to ensure the confidentiality of the data is resistant to attack and eliminate the leak-tested data.


Proof of data possession is a protocol that allows the client to verify the data file stored on a remote server cloud, the availability of the original version. The client and server exchange messages according to the model Request - Answer.

Protocol PDP consists of four treatments: pre-process, inquiry, confirmation, check. The client C (data owner) preprocesses the file, generating a small piece of metadata that is stored locally, transmits the file to the server S, and may delete its local copy. The server stores the file and responds to challenges issued by the client. To verify ownership of a file, the client sends a random challenge to the server to check for evidence of the specified file. The server generates a proof in response. This calculation requires the possession of the original data and the data of the current request, to avoid repeated attacks. Upon receipt, the client compares the evidence with a locally stored file metadata.

Types PDP-protocols

PDP-protocols are widely analyzed in two categories, depending on the role of inspection in the protocol: private check (private verifiability), where only the data owner can check the data on the server and multiple (public) verification (public verifiability), where any competent authority can perform verification procedure.

The simplest PDP based on a hash function

The simplest solution for the development of the PDP is hash function-based way. The client pre-computes a k random calls Pdp pic 1.png ‎ and calculates the relevant evidence Pdp pic 2.png. During the procedure, the client sends a request Pdp pic 3.png server, which computes Pdp pic 4.png. If the comparison is made, the client assumes that the server stores the file with the correct data. This solution has the disadvantage - the client can verify the authenticity of the files on the server only k times.

Requirements for PDP-protocol

1. Public verifiability: the public data possession verification is an important requirement, permitting any authorized entity to verify the correctness of outsourced data. Thus, the data owner can be relieved from the burden of storage and computation. 2. Stateless verification: proofs should be generated according to a randomly produced challenge. Thus, stateless verification requires the use of unpredictable values. 3. Low computation overhead: on one hand, for scalability reasons, the amount of computation at the cloud storage server should be also minimized, as the server may be involved in concurrent interactions. On the other hand, the proposed algorithms should also have low processing complexity, at the client side. 4. Low communication overhead: an efficient PDP should minimize the usage of bandwidth, relying on low communication cost. 5. Low storage cost: the limited storage capacities of the user devices has a critical importance in designing our solution. So, low storage cost at the client side is highly recommended. 6. Unlimited challenges: the number of challenges should be unlimited. This condition is considered as important to the efficiency of a PDP scheme.

Probabilistic proof of ownership of the data

The protocol is based on the generation of probabilistic evidence of ownership of random sample sets of server units, which significantly reduces costs. The client maintains a constant amount of metadata to test the evidence. Request / response protocols transmit a small, constant amount of data which minimizes communication costs. The protoсol PDP for remote data monitoring is light weight and supports large data sets in a distributed storage system. Experiments using this implementation checked practicality PDP. It's performance is limited more by hardware rather than cryptographic computations.


Formulation of the problem. The client C wants to store on the server S a file F which is a finite ordered collection of F blocks: Pdp pic 5.png.We denote the output x of an algorithm A by Pdp pic 6.png. We denote by |x| the length of x (in bits). Homomorphic Verifiable Tags (HVTs). Lets introduce the concept of homomorphic checked the tag that will be used as a building blocks for systems PDP.

For the message b (out of a block of a file) will be denoted by Pdp pic 7.png its homomorphic checked tag. Tags will be stored on the server with the file F. Homomorphic tags checks serve to check the metadata file blocks. It’s genuine and have a tendency to Blockless-check: using HVTs, the server can build the evidence that will allow the client to check whether the server block definition files, even if the client does not It has access to the actual file blocks.

Given the two values Pdp pic 8.png and Pdp pic 9.png, we can combine them into a value of Pdp pic 10.png, corresponding to the amount of messages Pdp pic 11.png .

HVT is a pair of values Pdp pic 12.png, where Wi is a random value obtained from an index i and Pdp pic 34.png is stored at the server. The index i can be seen as a one-time index because it is never reused for computing tags (a simple way to ensure that every tag uses a different index i is to use a global counter for i). The random value Wi is generated by concatenating the index i to a secret value, which ensures that Wi is different and unpredictable each time a tag is computed. HVTs and their corresponding proofs have a fixed constant size and are (much) smaller than the actual file blocks. The client must be able to verify specific tag file block, even if the client does not possess any of these blocks.

Definition Protocol PDP in the context of the task. The protocol is a set of four PDP polynomial-time algorithmsKeygen, TagBlock, GenProof, CheckProof.

Pdp pic 13.png - a probabilistic algorithm for generating keys, which is run by the client needed to adjust the scheme. It takes the security option k input and returns a pair: public and private keys - Pdp pic 14.png.

Pdp pic 15.png - algorithm for generating test launch client metadata; the input of the algorithm takes the public pk' and a secret sk' keys and block the file b and returns metadata validation Pdp pic 7.png.

Pdp pic 16.png - an algorithm that runs on the server to generate the proof of ownership of data; takes the input public key pk', an ordered set of F units, the request chal' and ordered set of Pdp pic 17.png, which is a test of metadata corresponding block in the F. The algorithm returns the proof of ownership of the data Pdp pic 18 .png for units F, which are defined by the query chal.

Pdp pic 19.png - an algorithm executed by the client to check for proof of ownership data server. Takes as input the public key pk, the secret key sk, request chal and proof of ownership Pdp pic 18 .png. The algorithm returns a response, whether correct proof of ownership for units installed chal.

The two phases of construction of the PDP-protocol

Setup and Request.

Settings. Clent has the file F and running Pdp pic 13.png, followed by the implementation of Pdp pic 15.png, for all [[File: Pdp_pic_20.png] ]. C save a couple of (sk, pk). C sends pk, F and Pdp pic 21.png S in storage and may remove F and Pdp pic 17.png.

Request. C generates the call of chal, indicating concrete blocks, for which C wants to check the proof of ownership. C sends chal to S. S launches Pdp pic 16.png and sends proof of ownership Pdp pic 18 .png to C. Now, C can check the validity of the evidence Pdp pic 18 .png, running the Pdp pic 19.png.

In phase Settings, C computes the tags for each block of the file and stores them with the file on S. In the phase Request C asks for the proof of ownership for a subset of blocks in F. This phase can be done unlimited number of times in order to find out whether the all S has selected block. Note that GenProof and CheckProof may receive different input values for chal, because these algorithms run by S and C, respectively.

Additional designations.

Let Pdp pic 22.png and Pdp pic 23.png - simple and let Pdp pic 24.png - module of RSA. Let g - Generator Pdp pic 25.png, in a unique cyclic subgroup Pdp pic 26.png order Pdp pic 27.png. We can get g, such Pdp pic 28.png, where Pdp pic 29.png, such that Pdp pic 30.png. All exponentiation performed modulo N, and for simplicity, sometimes it will be skipped. Let Pdp pic 31.png - deterministic hash function that displays rows evenly in Pdp pic 25.png.

In summary, look at two sheme embodiments of the algorithm.

Pdp pic 32.png

Pdp pic 33.png



Go to list of references to the section " Proof of Data Possession, PDP "

Kiryakina, S. 2015