Content area
The authors consider the private information retrieval (PIR) problem, in particular, the problem of ensuring secure queries to a database. Previously, the authors considered this problem for a cloud database in the presence of an active adversary who does not interfere with the execution of the PIR protocol but can carry out an attack with known open queries. In the proposed algorithms, bit number i is represented as an l-ary number with d digits. An algorithm for database placement on the cloud and an algorithm for bit querying with the use of permutations in the digits of a bit number, where bit number i is defined in a base-l number system, are proposed. The permutations are regarded as secret encryption keys. The communication complexity and the probability of guessing the bit number in a single attack with a known open query for bit number i, as well as in an attack with an unlimited number of known open queries, are estimated.
INTRODUCTION
In [1, 2], the authors investigated the private information retrieval (PIR) problem of organizing confidential queries to a replicated cloud database in the presence of an active adversary who acts in accordance with the PIR protocol but can carry out an attack with known open queries.
The PIR problem was formulated in information-theoretic terms in 1995 by Chor, Goldreich, Kushilevitz, and Sudan [3].
The classical formulation of the PIR problem is as follows:
given a database—binary string X = (x1, …, xn) of length n—hosted on a cloud server,
the client wants to retrieve one information bit xi from database X in such a way that no other party can determine from which position bit i was retrieved.
It was shown in [3, 4] that, if a database is hosted on a single server completely controlled by the adversary, then the information-theoretic confidentiality condition of the correct PIR protocol can only be satisfied when the client queries the entire database. However, in that case, the communication complexity of the protocol, i.e., the total number of bits exchanged between PIR parties, is not less than the size of the database, which makes the protocol practically inapplicable.
To reduce the communication complexity, a model of a replicated database was proposed in [3, 4], where several identical copies of row X = (x1, …, xn) were stored on different servers. The client was assumed to communicate with each of these servers; however, the servers themselves had no communication with each other. It was also assumed that the adversary could observe any of the servers that host the database, possibly different servers during different PIR sessions.
Obviously, in cloud information systems, it is impossible to ensure that the copies of a distributed database are isolated from each other. Since neither the user nor the client can control the cloud, it is always assumed that there is an adversary on the cloud. Taking this into account, a model for further research is developed. An important difference from the classical model is that, in the model under consideration, adversaries on different servers can communicate.
In [1, 2], a model that included a cloud consisting of several servers, each storing a copy of the same database, a dealer, an authentication center, users, clients, and an adversary was proposed.
The cloud servers used insecure channels to communicate with each other and with the dealer. The cloud servers and insecure communication channels were accessible to an outside observer. The dealer was located outside the cloud and was inaccessible to the adversary. The main function of the dealer was to encrypt and decrypt cloud data. It was assumed that the adversary not only passively observes but carries out an attack with known open queries.
In [1, 2], the communication complexity of the PIR protocol, as well as the probabilities for the adversary acting in accordance with the protocol to guess the number of bit i in a single attack and in an attack with an unlimited number of known open queries, were estimated.
Unlike [1, 2], this study uses permutations of bit numbers rather than encryption.
Permutation-based algorithms are proposed to represent the bit number in a new number system, generate queries to the cloud, process the cloud’s response, and retrieve the value of the requested bit. New estimates for the communication complexity of the algorithms and the probability for the adversary to guess the number of the requested bit are provided.
Instead of encryption used by the algorithms considered in [1, 2], it is proposed to use permutations of the digits of bit numbers in a selected number system. In the new algorithm, the adversarial search for adjacent bits is hindered, and replacing a shift in the interval of a set of bits with a permutation does not allow the adversary to use one requested set to draw any conclusion about the remaining sets that do not intersect with this set.
The probability of guessing the bit number is estimated for a single attack with a known open query for bit number i and for an attack with an unlimited number of known open queries in the presence of the adversary. The communication complexity of the algorithm is also estimated.
COMPUTATIONAL MODEL, BASIC DEFINITIONS, AND PROBLEM STATEMENT
Structure of the Model
The computational model for which the cloud version of the PIR problem is investigated in this paper is similar in its structure to the model considered in [2].
The model has the following components.
Cloud consists of several servers, which store copies of the same database. The copies on the servers have different permutations of bit numbers. The servers use insecure channels to communicate with each other and with the dealer.
User stores data on the cloud. It is assumed that k copies of databases are stored on the cloud. To retrieve data, the user contacts the dealer via a communication channel that uses a standard cryptographic protocol for encrypted data exchange.
Clients request information from the database. For this purpose, they contact the dealer via communication channels that use a standard cryptographic protocol for encrypted data exchange.
Authentication center verifies the identity of the user and clients.
Dealer is located outside the cloud and is inaccessible to the adversary. Its communication channel with the cloud is not secured. The dealer’s memory capacity for permanent data storage is negligible compared to n. Having received client data that need to be placed on the cloud, the dealer carries out a permutation unique for each copy of the database and places the data on the cloud. When requested, the dealer returns the value to the client.
Hereinafter, for simplicity, we assume that the database is uploaded by one user. It is also assumed that the database cannot be modified or completely changed once it is uploaded. In addition, we consider the case of a single bit query.
Basic Definitions
Adversary does not interfere with the execution of the cryptographic protocol. He or she has access to the cloud database, each of its copies, and communication channels, as well as can create fake clients that act in accordance with the protocol. In other words, the adversary not only passively observes but also can carry out an attack with known open queries. The adversary can force the dealer to send a query to the cloud. In that case, the adversary knows the cloud querying algorithm but does not know its specific parameters. The adversary also knows the dealer’s algorithm for generating a response to the client based on the response received from the cloud.
Threat is the guess of the number of a bit requested by the client from the database.
Attack is divided into two types:
a passive attack on the cloud consists in observing the dealer’s query and the cloud’s response to the dealer (this information is used for statistics and analysis);
an active attack (attack with known open queries) consists in creating and manipulating fake clients. Using fake clients, an active adversary can generate an arbitrary number of queries, which, in combination with the passive attack, makes it possible to collect and analyze information on the entire database.
Problem Statement and Notation
In contrast to the classical setting of the PIR problem, we consider a database (binary string) X = (x0, …, xn –1) of length n, where elements of X are enumerated beginning with zero. This makes it easier to work with different number systems.
The client wants to retrieve one bit of information xi with number i from database X in such a way that the adversary does not know the position i from which the bit is retrieved.
The cloud stores copies of the database. Suppose that k = 2d, where d ≥ 2. In practice, 4 ≤ k ≤ 32. Hereinafter, we assume that these constraints hold for k.
Without loss of generality, it is assumed that , i.e., d = and l = . Note that l ≥ 2, where l is the base of the number system, and d is selected so that . Suppose that Lp = .
With l = , we round up to an integer.
An element of group is denoted by x, and the number of a database copy is denoted by h (h ∈ {1, …, k}).
REPRESENTATION OF THE BIT NUMBER IN THE -ARY NUMBER SYSTEM
The bit number is represented by digits in the l-ary number system. The choice of l and d is based on the well-known model [1, 2], where the number of bit is represented in the -ary notation: i = + + … + . The digits of the -ary notation are elements of the tuple , …, of length d, i.e., an -ary digit is in the jth place of the tuple.
These d elements of the tuple can be interpreted as a point with integer coordinates in a d-dimensional hypercube with sides of length , where l = . This set of points of the d-dimensional cube can be regarded as a set of words of length d with alphabet (0, …, l – 1)
CLOUD DATA PLACEMENT
Initialization of Array
Recall that and . Let us select integers and lu ≥ 2 such that ≤ Lp and denote = ( ≤ Lp). Suppose that l* = , where l* ≥ l and nu = . It is shown below that the communication complexity depends on l*. To ensure that the communication complexity does not increase, it is required to select and so that difference Lp – is minimum. Below, we describe a particular choice of and that not only minimizes Lp – but also reduces the number of computational operations carried out by the dealer.
With , we add bits into the database. The added bits can have arbitrary values. These bits are added at the end of the database.
Suppose that and are integers such that and nc ≥ nu; then, nc ≥ n.
Array , where nc ≥ n, is uploaded to the cloud, i.e., bits are uploaded. The enumeration of array begins with zero. Increasing the number of elements in the array is necessary to use secret permutation of bits instead of bit encryption. This makes it possible to reduce the amount of memory required for the dealer to store data, as well as reduces the communication complexity.
Construction of a Permutation Matrix
The number of the requested bit in the -ary number system can be represented as an integer point of a -dimensional hypercube with sides of length . In this case, the jth coordinate (j ∈ {1, …, dc}) of the hypercube point is the th digit of the number’s representation in the -ary number system.
Let us modify the numbers of bits by changing their digits in the -ary representation. The digits are changed by permutation from 0 to . For each digit, its own permutation is calculated. With the permutations being random, they may coincide for different digits.
Suppose that we have a database copy and is a random permutation of numbers from 0 to lc – 1, where j ∈ {1, …, dc} is the number of a digit in the lc-ary number system. Let us represent the permutation as a table:
Then, is a set of digit-wise permutations in the -ary number system.
Let us construct matrix , where is the number of database copies and is an integer such that nc = . An element of matrix is a random permutation of integers from 0 to lc – 1. Thus, for each database copy, random permutations are carried out.
To generate digit permutations for bit numbers in the lc-ary number system, a pseudorandom generator PRNG(db, m) is used. The input of PRNG(db, m) is a unique value for a particular database and the number of a database copy. With the same initialization, the pseudorandom generator carries out the same permutation.
Since storing matrix requires a significant amount of memory, it is restored (if necessary) using the pseudorandom generator.
Mapping
Bit number i is represented in the lc-ary number system with dc digits:
For database copy h, the permutation of the digits of i is carried out:
Let us denote this transformation by , where βh = is a row of matrix Mβ. The elements of matrix Mβ required for the algorithm are generated by the pseudorandom generator.
Auxiliary Constructions Necessary for the Bit Querying Algorithm: Matrix G
Let us construct matrix G:
The first column of this matrix is a sequence of integers in [0, nc– 1]. The second column of matrix G is formed by the elements of Xc = .
Construction of Matrices Gh Based on Matrix G for Each Database Copy
The algorithm for construction of matrices Gh is as follows.
Input: matrix G.
Output: matrices Gh for each .
Step 1. Select a database copy h.
Step 2. Copy matrix G into matrix Gaux.
Step 3. For each element i, i = 0, …, nc – 1, from the first column of matrix Gaux, carry out transformation .
Step 4. Sort the rows of matrix Gaux by its first column to obtain matrix Gh.
Step 5. Perform steps 1–4 for each h ∈ {1, …, k} .
As a result, we obtain Gh (h ∈ {1, …, k}) matrices for each database copy
Cloud Data Placement
The cloud data placement algorithm is similar to the algorithm considered in [1, 2].
The input of the algorithm is binary array X = (x1, ..., xn). For convenience, it is assumed that X = (x0, …, xn– 1). As a result, the algorithm places k permuted database copies on the cloud.
The cloud data placement algorithm is as follows.
Input: X.
Output: placement of Gh, where h ∈ {1, …, k}, on the cloud.
Step 1. The dealer authenticates the user. The user passes the dealer the array of bit values X = (x0, …, xn– 1). The dealer converts array X = (x0, …, xn– 1) to array Xc = .
Step 2. The dealer generates matrix G = ||gi,j||, where i = 0, .., nc – 1, j = 1, 2.
Step 3. The dealer generates elements of matrix .
Step 4. The dealer creates matrix Gh, where h ∈ {1, …, k}.
Step 5. The dealer places matrix Gh, where h ∈ {1, …, k}, on the cloud.
Step 6. The dealer performs steps 4–5 in a loop for each database copy. Once k database copies are processed, the algorithm terminates.
The number of elements transferred to each of k copies is 2nc.
It should be noted that, since matrix G is not expected to be restored in the future, the dealer does not store it.
To carry out the permutation, the dealer requires the amount of memory comparable to that for storage of one database copy. It is assumed that the dealer can process several databases and there is no need to store them simultaneously. Once the database is uploaded to the cloud, the dealer’s memory is freed for the next database.
In the algorithm considered below, the dealer must perform steps 4–5 sequentially for each database copy. Otherwise, the dealer has to store all k copies in its memory, which increases the memory requirements.
EXECUTION OF THE CLIENT QUERY BY THE DEALER
To hide the bit number, instead of the number requested by the client, the dealer requests a set of bit numbers with cardinality l* from the cloud. In this case, only one of these bit numbers is requested by the client. Each element from the set of bit numbers lies in only one element of the decomposition (partition) of cardinality of this set.
Using Multiple Bit Numbers in a Query to Hide the True Bit Number
Suppose that the client is interested in element . This element corresponds to the ith row of matrix , with . The sequence of integers from to includes l* intervals with elements in each.
Number i lies only in one of these intervals. For this element i, one element is selected in each of the other l* – 1 intervals (the corresponding algorithm is described in Section 5.2). Thus, each number i is associated with other l* – 1 numbers that fall within the remaining intervals different from the interval of i.
As mentioned above, each interval can be interpreted as a -dimensional discrete hypercube with sides of length lu.
Let us select interval . Suppose that is a random permutation of lu numbers from 0 to lu – 1, where j ∈ {1, …, du} is the digit of a number in the lu-ary number system. As above, the permutation is represented as follows:
Then, is a set of digit-wise permutations in the lu-ary number system with digits.
Let us construct matrix (by analogy with matrix ), where l* is the number of intervals and du is an integer such that . The element of matrix is a random permutation of integers from 0 to . Thus, for each interval into which the database is divided, random permutations are carried out.
Pseudorandom generator is used to generate digit permutations of bit numbers from matrix . Here, is a unique value for a particular database, m is the number of an interval (l* intervals in total), and j ∈ {1, …, du} is a digit in the lu-ary number system. With the same initialization, the pseudorandom generator carried out the same permutation.
Since matrix requires a significant amount of memory, the elements of are restored using the pseudorandom generator.
With the sequence of integers from to being divided into l* intervals with elements in each interval, any number from this set falls within one of these intervals.
Let us find the number w of the interval that contains number i upon dividing the sequence of integers from to into l* intervals:
Note that .
Let us obtain modulo while denoting the result by = i. Then, is the ordinal number in the wth interval.
Ordinal number in the wth interval is represented in the -ary number system by = , where aj ∈ {0, …, lu – 1}.
Let us consider sum , where is the inverse permutation such that = aj.
By , we denote a secret known only to the dealer.
In each interval m (m ∈ [1, l*]), we find a number such that = , where j = 1, …, du. For all intervals, there are l* such numbers. Note that, for interval w, it is number .
By , we denote the set of l* numbers obtained by Algorithm 5.2 (the algorithm for constructing by the dealer).
Construction of by the Dealer
Algorithm for constructingby the dealer.
Input: , i, l*, , .
Output: .
// find the interval number
// find the ordinal number of i
// in the corresponding partition interval
// represent number in the lu-ary notation
,
// find coordinates
for j ← 1 to du do
for m ← 1 to l*do
Listing 1. Algorithm for constructing .
Client Query for Bit i: Preliminaries
Suppose that the client requests a bit with number i.
Once the authentication center verifies the client, the dealer allows the client to send a query.
Let us recall that matrices Gh, , are physically stored on the cloud. A row of matrix Gh consists of a row number and a bit value. Without loss of generality (as mentioned above), the database elements are enumerated beginning with zero.
As in [1, 2], for each element of Seti, the number of a database copy is randomly and uniformly selected. All elements of Seti for which database copy is selected are denoted by , where h ∈ {1, …, k}. Obviously, do not intersect because all elements of Seti are different and each element is associated with exactly one (database copy). The union of contains all elements of Seti, i.e., Seti = , where h ∈ {1, …, k}.
Suppose that a database copy is selected for the requested number i in Seti.
For each element of , permutation is carried out. Suppose that . The elements obtained upon the permutation are sorted. We denote this set by . The position of element in is stored.
During the execution of the query, the dealer generates a row vector with three components, snum = (s1, s2, s3):
the first component of row vector snum contains number ;
the second component of snum contains the number of the copy of the database that stores ;
the third component of snum contains the position of number in .
Row vector snum allows the dealer, upon receiving the response from the cloud, to immediately discard all data except for the data necessary to execute the query.
Row vector snum is generated by the dealer and is known only to the dealer. Row vector snum is kept in secret.
To database copies, the dealer sends l* numbers in total.
Upon executing the query, the dealer uses row vector snum to obtain the value of the requested element and sends it to the client.
Cloud querying algorithm.
Input: , , l*, , and .
Output: and snum.
Step 1. The client contacts the dealer and requests the value of a bit with number .
Step 2. The dealer generates Seti.
Step 3. When executing the query for the ith bit, the dealer reserves memory for row vector snum.
Step 4. The dealer writes into the first component of snum.
Step 5. The dealer divides Seti into disjoint subsets , where , by randomly and uniformly selecting the number of a database copy for each element of Seti.
The dealer writes the database copy number corresponding to into the second component of snum.
Step 6. The dealer carries out permutation for the elements of ; then, the resulting numbers are sorted. Thus, is obtained. This step is performed for all .
Step 7. The dealer writes the position of number in into the third component of snum.
Execution of the Query to the Cloud and the Cloud’s Response
Algorithm for information exchange with the cloud.
Input: .
Output: bit values in the same order in which the bit numbers from were sent to each database copy (h ∈ {1, …, k}).
Step 1. To each database copy, the dealer sends .
Step 2. For the requested numbers of bits, the cloud returns the values of bits from the second column of matrix , where (h ∈ {1, …, k}). The order of the returned bit values corresponds to the original order of elements in .
Cloud Response Processing
Algorithm for cloud response processing.
Input: snum and bit values in the same order in which the bit numbers from were sent to each database copy (h ∈ {1, …, k}).
Output: value of bit .
Step 1. The dealer uses row vector snum to select the requested bit value; the other bit values are discarded.
Step 2. The dealer receives the bit value for the first component of row vector snum. By construction of snum, the requested number is
Step 3. The dealer sends the value of bit i to the client.
MEMORY REQUIREMENTS AND ESTIMATION OF ALGORITHM COMPLEXITY
Amount of Information the Dealer Needs to Store
Throughout the life cycle of the database, the dealer stores information for the database of size :
value for initializing pseudorandom generators PRNG(db, m) and PRNG(db, m, j);
values , , and for pseudorandom generator PRNG(db, m), which generates matrix (an element of matrix is a random permutation of integers from 0 to );
values , , , and l* for pseudorandom generator PRNG(db, m, j), which generates matrix (an element of matrix is a random permutation of integers from 0 to ).
The amount of information stored by the dealer is much less than .
Characteristics of the Proposed Scheme
The scheme considers the active adversary who acts in accordance with the PIR protocol. The adversary has access to the database on the cloud, to each of its copies, and to communication channels on the cloud, as well as can create fake clients that operate in accordance with the PIR protocol. In other words, the adversary not only passively observes but can also make queries by using fake clients, i.e., the adversary carries out an attack with known open queries. The adversary does not have access to the dealer.
Main Results
In the proposed scheme, the communication complexity is the total number of bits required for information exchange between the dealer and the cloud. In other words, it is the sum of the number of bits sent by the dealer to the cloud and the number of bits received by the dealer from the cloud that are required for the dealer to find the value of the requested bit. Suppose that s is the number of bits needed to represent the bit number. Recall that l* is the cardinality of Seti.
Statement 1.The communication complexity of the scheme for retrieving the bit number by the dealer without disclosing it isbits.
Proof. To retrieve the bit value, the dealer sends l* bit numbers of length s each to the database copies. The cloud responds with l* bits.
Let us analyze the probability for the adversary to guess the number of the requested bit.
If the queries are made to one set, then they are indistinguishable. Thus, if the adversary makes n or more queries, then he or she will not know the number of a particular bit. The best information that the adversary can get is the subset of Seti such that, when querying for each element from this subset, all bits from Seti and only they are accessed on the cloud. This means that the adversary does not know, but only guesses, which bit from Seti was requested.
Statement 2.In a single attack with a known open query for bit number i under the assumption that there is a passive adversary on the cloud, the probability for the adversary to guess the bit number does not exceed.
Proof.
Seti includes l* elements, and equality Seti= Setj holds for each j ∈ Seti, i.e., there is an identical set for l* different bit numbers. Thus, the probability of guessing the number from Seti is , because Seti has cardinality l*.
Having made or more queries, the adversary receives information about the number of real bits in each . The number of these bits is less than or equal to l*. Thus, the adversary understands that the number requested by the client is among the true bit numbers. The probability that the client requests a particular number of a real bit is the same for all real numbers of bits from .
To estimate the probability of guessing a particular number of a bit from , it is required to calculate the average number of true bits from Seti.
The cardinality of set is , where the elements of correspond to elements. From set , l* elements are selected. In this case, the question arises: how many elements from a random sample of cardinality l* correspond on average to elements of set X?
Suppose that is the probability that a sample of cardinality l* includes exactly r elements that correspond to elements of set X. Then, is found by the formula
Then, the average number of true bits in Seti is
Statement 3.In an attack with an unlimited number of known open queries under the assumption that there is a passive adversary on the cloud, the probability for the adversary to guess the bit number averagely does not exceed.
Proof. Upon executing queries, the adversary determines the elements included in each Seti. Even if fake clients make any number of queries that exceeds , the adversary still does not receive any additional information.
The number of real bits included in each sample of cardinality l* is on average N. Then, the probability of guessing number i averagely does not exceed .
Statement 4.In an attack with an unlimited number of known open queries under the assumption that there is a passive adversary on the cloud, the probability for the adversary to guess the bit number does not exceed.
Proof. By construction, the hypercubes, which correspond to intervals, are filled in such a way that dummy numbers of bits are added only to an interval with number l*. Suppose that l* – 1 hypercubes are filled with bit numbers, while the interval with number l* is filled only with dummy bits.
In this case, having executed an arbitrary number of queries, the adversary determines the number of a dummy bit from the interval with number l*, because only one bit is selected from each interval to construct . Therefore, for any number of queries, the probability for the adversary to guess the bit number does not exceed .
Value k is small compared to n. Let us prove that the proposed protocol satisfies the conciseness condition.
As mentioned above, for the communication complexity not to increase, values and need to be selected in such a way as to minimize difference . This follows from the fact that l = and l* = ; then, l* ≈ l and therefore l* n.
Theorem 1.For the proposed scheme for organizing the database of sizen,
• The communication complexity of a query isbits.
• In a single attack with a known open query for bit number i under the assumption that there is a passive adversary on the cloud, the probability for the adversary to guess the bit number does not exceed.
• In an attack with an unlimited number of known open queries under the assumption that there is a passive adversary on the cloud, the probability for the adversary to guess the bit number averagely does not exceed.
• In an attack with an unlimited number of known open queries under the assumption that there is a passive adversary on the cloud, the probability for the adversary to guess the bit number does not exceed.
Proof. Statement 1 implies that the communication complexity of a query is bits.
Statement 2 implies that, for a single query, the probability of guessing the bit number is .
Statement 3 implies that, for any number of queries made by fake clients in the presence of the adversary on the cloud, the probability of guessing the bit number averagely does not exceed .
Statement 4 implies that, for any number of queries made by fake clients in the presence of the adversary on the cloud, the probability for the adversary to guess the bit number does not exceed .
EVALUATION OF THE INFORMATION GENERATED BY THE DEALER
Recall that and . By the above constructions, is the size of the hypercube and is the length of its side. To minimize the amount of computations carried out by the dealer, it is required to reduce the size of matrices and . For this purpose, it is necessary to reduce the size of rows in these matrices, i.e., minimize (lc = ) and (lu = ).
Let us consider function , which relates the length of the hypercube’s side and its dimension. We investigate function to minimize the amount of information generated by the dealer. Note that x ≥ 1.
Using well-known formula = , we find the derivative of :
The derivative is zero at the extremum. Let us find the coordinate of the extremum. Since > 0, the second factor vanishes:
Then,
Now, let us find out whether this extremum is a minimum or a maximum.
Thus, the minimum of function is at point x = lnn. Hence, the minimum number of elements of matrices and that need to be generated by the dealer is achieved at x = lnn. Value x must be an integer because the number of digits and the base of the number system must be integers.
The choice of or affects the number of intervals l* = ( = ). If the interval contains more elements, then l* decreases; otherwise, l* increases. Value l* affects the communication complexity.
COMPARISON WITH THE PREVIOUSLY PROPOSED ALGORITHMS
The communication complexity of the proposed algorithm is .
In practice, for a database of size , the number of bits used to represent number s does not exceed 40 bits ≤26 (the bits are enumerated beginning with zero).
In [2], the communication complexity is ⋅ d! ⋅ Len(K, Kenc), where Len(K, Kenc) is the sum of ciphertext lengths. In practice, d ≤ 4.
By construction of the intervals,
For popular probabilistic encryption algorithms (AES and RSA), Len(K, Kenc) is not less than 128 + 128 = 256 bits ().
Let us compare l* and :
In practice, the communication complexity of the proposed algorithm is = 210 × 26; the communication complexity of the algorithm from [2] is
Recall that, when using permutations, the guessing probability with a single query is = .
Using encryption, for ,
When increasing l* (suppose that ), the guessing probability with a single query is the same for the previously considered and proposed algorithm. In this case, the communication complexity of the proposed algorithm is
Thus, in practice, with similar guessing probabilities, the communication complexity of the proposed algorithm is at least two times better.
Let us estimate the amount of memory the dealer needs in practice. With the minimum of function f(x) = being at , we select integer for . For x = 28, the base of the number system is ≈ 3; for , it is ≈ 3. In both the cases, the permutation is carried out for the sequence of three numbers and it does not depend on the integer part of . However, the dimension of the hypercube depends on the integer part of . If the dimension of the hypercube is larger, then l* is smaller, and vice versa. As shown above, when l* increases, the guessing probability decreases, whereas the communication complexity increases, and vice versa. The choice of the integer part of depends on our goal: reducing the guessing probability or reducing the communication complexity.
Thus, for the selected parameters, the number of permutations is and the length of a permutation is . For , it is required to generate 28 × 3 = 84 elements to calculate a new number when sending a query to the cloud.
When generating , once the next number in the interval is received, the dealer’s memory is freed.
Permutations for the entire database and intervals can be generated in different threads. For each of l* intervals, permutations of digits when forming can be carried out in parallel, which speeds up the algorithm.
CONCLUSIONS
The proposed algorithm has significant advantages over the algorithms described in [1, 2].
These advantages are as follows.
Instead of encrypting the bit number, permutations of digits in a selected number system are used. In practice, encrypting and decrypting the bit number requires more operations than the permutation of bits in a selected number system.
Since the permutation is carried out when the database is uploaded to the cloud, the query for adjacent bits is most often executed for the bits that are not adjacent on the cloud. This makes it difficult for the adversary to find the requested adjacent bits.
All shifts in the intervals are different for any Seti. This does not allow the adversary to draw any conclusion about Setj based on Seti. Previously, in [2], vector b was used, which contained an interval shift for each element of any Seti. When the adversary guessed vector b, this allowed him or her to judge about Setj based on Seti.
It should also be noted that, when implementing these computations in practice, it is easy to organize parallel processes that speed up the execution of the algorithm.
ACKNOWLEDGMENTS
The authors are especially grateful to D.O. Lazarev for his valuable remarks during the preparation of this paper.
FUNDING
This work was supported by ongoing institutional funding. No additional funding to carry out or direct this particular research were obtained.
CONFLICT OF INTEREST
The authors of this work declare that they have no conflicts of interest.
Translated by Yu. Kornienko
Publisher’s Note.
Pleiades Publishing remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
AI tools may have been used in the translation or editing of this article.
REFERENCES
1 Martishin, S.A., Khrapchenko, M.V., and Shokurov, A.V., Organization of a secure query to a database in the cloud, Tr. Inst. Sist. Program. Ross. Akad. Nauk, 2022, vol. 34, no. 3, pp. 173–188.
2 Varnovskiy, N.P., Martishin, S.A., Khrapchenko, M.V., and Shokurov, A.V., About cloud request protection, Tr. Inst. Sist. Program. Ross. Akad. Nauk, 2023, vol. 35, no. 5, pp. 37–54.
3 Chor, B., Goldreich, O., Kushilevitz, E., and Sudan, M., Private information retrieval, Proc. IEEE Annu. Symp. Foundations of Computer Science, 1995, pp. 41–50.
4 Chor, B.; Goldreich, O.; Kushilevitz, E.; Sudan, M. Private information retrieval. J. ACM; 1998; 45, pp. 965-982.MathSciNet ID: 1678848[DOI: https://dx.doi.org/10.1145/293347.293350]
© Pleiades Publishing, Ltd. 2024. ISSN 0361-7688, Programming and Computer Software, 2024, Vol. 50, No. 6, pp. 425–434. © Pleiades Publishing, Ltd., 2024. Russian Text © The Author(s), 2024, published in Programmirovanie, 2024, Vol. 50, No. 6.