Doctor of Philosophy in Computing Sciences - (Ph.D.)
Cloud storage services allow data owners to outsource their data, reducing the workload and cost of data storage and management. However, most data owners today are still reluctant to outsource their data to cloud storage providers (CSPs) because they do not trust the CSPs and have no confidence that the CSPs will secure their valuable data. This dissertation focuses on Remote Data Checking (RDC), a collection of protocols that allow a client (data owner) to check the integrity of data outsourced to an untrusted server, and thus to audit whether the server fulfills its contractual obligations.
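To make the idea of an RDC audit concrete, the following is a minimal sketch of a spot-checking challenge-response exchange. It is an illustration under stated assumptions, not one of the schemes developed in the dissertation: the names `split_blocks`, `tag`, `Server`, and `audit` are hypothetical, per-block HMAC tags stand in for the more compact tags that practical RDC schemes use, and the tiny block size is chosen only for readability.

```python
import hashlib
import hmac
import random

BLOCK_SIZE = 4  # bytes per block; unrealistically small, for illustration only


def split_blocks(data: bytes) -> list[bytes]:
    """Split the file into fixed-size blocks before outsourcing it."""
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def tag(key: bytes, index: int, block: bytes) -> bytes:
    """MAC over (index, block); binding the index prevents the server
    from answering a challenge with a different, still valid, block."""
    return hmac.new(key, index.to_bytes(8, "big") + block, hashlib.sha256).digest()


class Server:
    """Hypothetical untrusted store holding the blocks and their tags."""

    def __init__(self, blocks, tags):
        self.blocks = list(blocks)
        self.tags = list(tags)

    def respond(self, indices):
        return [(self.blocks[i], self.tags[i]) for i in indices]


def audit(key, server, num_blocks, sample_size, rng):
    """Challenge a random sample of block indices and verify each tag.
    The client needs only the key and the block count."""
    indices = rng.sample(range(num_blocks), sample_size)
    response = server.respond(indices)
    return all(
        hmac.compare_digest(tag(key, i, block), t)
        for i, (block, t) in zip(indices, response)
    )


key = b"client-secret-key"
blocks = split_blocks(b"outsourced file contents")
server = Server(blocks, [tag(key, i, b) for i, b in enumerate(blocks)])
rng = random.Random(0)  # assumption: a real client would use a CSPRNG

print(audit(key, server, len(blocks), 3, rng))            # True: data intact
server.blocks[2] = b"XXXX"                                # simulate corruption
print(audit(key, server, len(blocks), len(blocks), rng))  # False: all blocks checked
```

Sampling only a few blocks per audit keeps the check cheap, at the price of detecting corruption only with some probability. The final call challenges every block so that the outcome is deterministic.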
Robustness has not been considered for dynamic RDC schemes in the literature. The R-DPDP scheme designed in this dissertation is the first RDC scheme that provides robustness and, at the same time, supports dynamic data updates while requiring only small, constant client storage. The main challenge to overcome is reducing client-server communication during updates in an adversarial setting. A security analysis of R-DPDP is provided.
Single-server RDC schemes are useful for detecting server misbehavior, but have no provisions for recovering damaged data. In practice, they should therefore be extended to a distributed setting in which the data is stored redundantly at multiple servers. The client can use RDC to check each server and, upon detecting a corrupted server, repair it by retrieving data from healthy servers so that the reliability level is maintained. Previously, RDC has been investigated for replication-based and erasure coding-based distributed storage systems. However, RDC has not been investigated for network coding-based distributed storage systems that rely on untrusted servers. RDC-NC is the first RDC scheme for network coding-based distributed storage systems that ensures data remain intact in the face of data corruption, replay, and pollution attacks. Experimental evaluation shows that RDC-NC is inexpensive for both clients and servers.
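The check-then-repair cycle described above can be sketched for the simplest case, plain replication. This is only an illustration under stated assumptions: it is not RDC-NC, it involves no network coding, and it does not address replay or pollution attacks. The names `Replica`, `is_healthy`, and `check_and_repair` are hypothetical, and every block is verified (rather than a random sample) to keep the example short.

```python
import hashlib
import hmac


def tag(key: bytes, index: int, block: bytes) -> bytes:
    """MAC over (index, block), computed by the client before outsourcing."""
    return hmac.new(key, index.to_bytes(8, "big") + block, hashlib.sha256).digest()


class Replica:
    """Hypothetical untrusted server holding one full copy of the data."""

    def __init__(self, blocks, tags):
        self.blocks = list(blocks)
        self.tags = list(tags)


def is_healthy(key, replica, num_blocks):
    """Verify every block; a real audit would challenge a random sample."""
    if len(replica.blocks) != num_blocks or len(replica.tags) != num_blocks:
        return False
    return all(
        hmac.compare_digest(tag(key, i, b), t)
        for i, (b, t) in enumerate(zip(replica.blocks, replica.tags))
    )


def check_and_repair(key, replicas, num_blocks):
    """Audit each replica, then restore corrupted ones from a healthy copy.
    Returns the indices of the replicas that were repaired."""
    status = [is_healthy(key, r, num_blocks) for r in replicas]
    healthy = [r for r, ok in zip(replicas, status) if ok]
    if not healthy:
        raise RuntimeError("no healthy replica left to repair from")
    source = healthy[0]
    repaired = []
    for i, (replica, ok) in enumerate(zip(replicas, status)):
        if not ok:
            # The client relays the data here, which is the repair
            # burden that client-driven repair places on the data owner.
            replica.blocks = list(source.blocks)
            replica.tags = list(source.tags)
            repaired.append(i)
    return repaired


key = b"client-secret-key"
blocks = [b"blk0", b"blk1", b"blk2"]
tags = [tag(key, i, b) for i, b in enumerate(blocks)]
replicas = [Replica(blocks, tags) for _ in range(3)]

replicas[1].blocks[0] = b"bad!"                        # simulate a corrupted server
print(check_and_repair(key, replicas, len(blocks)))    # [1]
print(all(is_healthy(key, r, len(blocks)) for r in replicas))  # True
```

In this sketch the client downloads a full copy from a healthy replica and uploads it to the corrupted one, which shows why repair can dominate the client's computation and communication costs in a distributed setting.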
The setting considered so far outsources the storage of the data, but the data owner is still heavily involved in the data management process, especially during the repair of damaged data. A new paradigm is proposed in which the data owner fully outsources both the storage and the management of the data. In traditional distributed RDC schemes, the repair phase imposes a significant burden on the client, who must expend a significant amount of computation and communication, which makes it very difficult to keep the client lightweight. A new self-repairing concept is developed in which the servers are responsible for repairing the corruption, while the client acts as a lightweight coordinator during repair. To realize this concept, two novel RDC schemes, RDC-SR and ERDC-SR, are designed for replication-based distributed storage systems; they enable Server-side Repair and minimize the load on the client side.
Version control systems (VCS) provide the ability to track and control changes made to data over time. The changes are usually stored in a VCS repository which, due to its massive size, is often hosted at an untrusted CSP. RDC can be used to address concerns about the untrusted nature of the VCS server by allowing a data owner to periodically check that the server continues to store the data. The RDC-AVCS scheme designed in this dissertation relies on RDC to ensure that all data versions remain retrievable from the untrusted server over time. The RDC-AVCS prototype, built on top of Apache SVN, incurs only a modest decrease in performance compared to a regular (non-secure) SVN system.
Chen, Bo, "New directions for remote data integrity checking of cloud storage" (2014). Dissertations. 174.