Skip to Main content Skip to Navigation

Privacy preserving top-k query processing over outsourced data

Sakina Mahboubi 1
1 ZENITH - Scientific Data Management
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract : Outsourcing corporate or individual data at a cloud provider, e.g. using Database-as-a-Service, is practical and cost-effective. But it introduces a major problem: how to preserve the privacy of the outsourced data, while supporting powerful user queries. A simple solution is to encrypt the data before it is outsourced. Then, to answer a query, the user client can retrieve the encrypted data from the cloud, decrypt it, and evaluate the query over plaintext (non encrypted) data. This solution is not practical, as it does not take advantage of the computing power provided by the cloud for evaluating queries.In this thesis, we consider an important kind of queries, top-k queries,and address the problem of privacy-preserving top-k query processing over encrypted data in the cloud.A top-k query allows the user to specify a number k, and the system returns the k tuples which are most relevant to the query. The relevance degree of tuples to the query is determined by a scoring function.We first propose a complete system, called BuckTop, that is able to efficiently evaluate top-k queries over encrypted data, without having to decrypt it in the cloud. BuckTop includes a top-k query processing algorithm that works on the encrypted data, stored at one cloud node,and returns a set that is proved to contain the encrypted data corresponding to the top-k results. It also comes with an efficient filtering algorithm that is executed in the cloud on encypted data and removes most of the false positives included in the set returned.When the outsourced data is big, it is typically partitioned over multiple nodes in a distributed system. For this case, we propose two new systems, called SDB-TOPK and SD-TOPK, that can evaluate top-k queries over encrypted distributed data without having to decrypt at the nodes where they are stored. In addition, SDB-TOPK and SD-TOPK have a powerful filtering algorithm that filters the false positives as much as possible in the nodes, and returns a small set of encrypted data that will be decrypted in the user side. We analyze the security of our system, and propose efficient strategies to enforce it.We validated our solutions through implementation of BuckTop , SDB-TOPK and SD-TOPK, and compared them to baseline approaches over synthetic and real databases. The results show excellent response time compared to baseline approaches. They also show the efficiency of our filtering algorithm that eliminates almost all false positives. Furthermore, our systems yieldsignificant reduction in communication cost between the distributed system nodes when computing the query result.
Document type :
Complete list of metadata

Cited literature [96 references]  Display  Hide  Download
Contributor : Abes Star :  Contact
Submitted on : Wednesday, January 23, 2019 - 6:36:22 PM
Last modification on : Wednesday, March 16, 2022 - 3:14:08 AM


Version validated by the jury (STAR)


  • HAL Id : tel-01946258, version 2


Sakina Mahboubi. Privacy preserving top-k query processing over outsourced data. Cryptography and Security [cs.CR]. Université Montpellier, 2018. English. ⟨NNT : 2018MONTS026⟩. ⟨tel-01946258v2⟩



Record views


Files downloads