Representative Subsets for Preference Queries


dc.contributor.author Chester, Sean
dc.date.accessioned 2013-08-26T17:50:39Z
dc.date.available 2013-08-26T17:50:39Z
dc.date.copyright 2013 en_US
dc.date.issued 2013-08-26
dc.identifier.uri http://hdl.handle.net/1828/4833
dc.description.abstract We focus on the two overlapping areas of preference queries and dataset summarization. A (linear) preference query specifies the relative importance of the attributes in a dataset and asks for the tuples that best match those preferences. Dataset summarization is the task of representing an entire dataset by a small, representative subset. Within these areas, we focus on three important sub-problems, significantly advancing the state of the art in each. We begin with an investigation into a new formulation of preference queries, identifying a neglected and important subclass that we call threshold projection queries. While the literature typically constrains the attribute preferences (which are real-valued weights) such that their sum is one, we show that this introduces bias when querying by threshold rather than cardinality. Using projection, rather than inner product as in that literature, removes the bias. We then give algorithms for building and querying indices for this class of query, based, in the general case, on geometric duality and halfspace range searching, and, in an important special case, on stereographic projection. In the second part of the dissertation, we investigate the monochromatic reverse top-k (mRTOP) query in two dimensions. Given a tuple and a dataset, an mRTOP query asks for the linear preference queries on the dataset whose results include the given tuple. Towards this goal, we consider the novel scenario of building an index to support mRTOP queries, using geometric duality and plane sweep. We show theoretically and empirically that the index is quick to build, small on disk, and very efficient at answering mRTOP queries. As a corollary to these efforts, we define the top-k rank contour, which encodes the k-ranked tuple for every possible linear preference query. 
This is tremendously useful in answering mRTOP queries, but also, we posit, of significant independent interest for its relation to myriad linear preference query problems. Intuitively, the top-k rank contour is the minimum possible representation of knowledge needed to identify the k-ranked tuple for any query, without a priori knowledge of that query. We also introduce k-regret minimizing sets, a very succinct approximation of a numeric dataset. The purpose of the approximation is to represent the entire dataset by just a small subset that nonetheless will contain a tuple within or near to the top-k for any linear preference query. We show that the problem of finding k-regret minimizing sets (and, indeed, the problem in the literature that it generalizes) is NP-hard. Still, for the special case of two dimensions, we provide a fast, exact algorithm based on the top-k rank contour. For arbitrary dimension, we introduce a novel greedy algorithm based on linear programming and randomization that performs very well in our empirical investigation. en_US
dc.language English eng
dc.language.iso en en_US
dc.subject databases en_US
dc.subject computational geometry en_US
dc.subject top-k queries en_US
dc.subject preference queries en_US
dc.subject k-regret minimizing sets en_US
dc.subject depth contours en_US
dc.subject indexing en_US
dc.subject reverse data management en_US
dc.subject stereographic projection en_US
dc.subject plane sweep en_US
dc.subject linear programming en_US
dc.subject computational complexity en_US
dc.subject algorithms en_US
dc.subject NP-hardness en_US
dc.subject randomization en_US
dc.subject summarization en_US
dc.subject duality en_US
dc.title Representative Subsets for Preference Queries en_US
dc.type Thesis en_US
dc.contributor.supervisor Thomo, Alex
dc.contributor.supervisor Srinivasan, Venkatesh
dc.contributor.supervisor Whitesides, Sue H.
dc.degree.department Dept. of Computer Science en_US
dc.degree.level Doctor of Philosophy Ph.D. en_US
dc.rights.temp Available to the World Wide Web en_US
dc.description.scholarlevel Graduate en_US
dc.description.proquestcode 0984 en_US
