King's College London

Research portal

Reverse-Safe Data Structures for Text Indexing

Research output: Chapter in Book/Report/Conference proceedingChapter

Giulia Bernardini, Huiping Chen, Gabriele Fici, Grigorios Loukidis, Solon Pissis

Original languageEnglish
Title of host publicationReverse-Safe Data Structures for Text Indexing
DOIs
Accepted/In press6 Oct 2019
Published5 Jan 2020

King's Authors

Abstract

We introduce the notion of reverse-safe data structures. These are data structures that prevent the reconstruction of the data they encode (i.e., they cannot be easily reversed). A data structure D is called z-reverse-safe when there exist at least z datasets with the same set of answers as the ones stored by D. The main challenge is to ensure that D stores as many answers to useful queries as possible, is constructed efficiently, and has size close to the size of the original dataset it encodes. Given a text of length n and an integer z, we propose an algorithm which constructs a z-reverse-safe data structure that has size O(n) and answers pattern matching queries of length at most d optimally, where d is maximal for any such z-reverse-safe data structure. The construction algorithm takes O(n ω log d) time, where ω is the matrix multiplication exponent. We show that, despite the n ω factor, our engineered implementation takes only a few minutes to finish for million-letter texts. We further show that plugging our method in data analysis applications gives insignificant or no data utility loss. Finally, we show how our technique can be extended to support applications under a realistic adversary model.

View graph of relations

© 2020 King's College London | Strand | London WC2R 2LS | England | United Kingdom | Tel +44 (0)20 7836 5454