Tracking and visualizing dimension space coverage for exploratory data analysis

dc.contributor.authorSarvghad Batn Moghaddam, Ali
dc.contributor.supervisorTory, Melanie
dc.date.accessioned2016-08-15T16:02:05Z
dc.date.available2016-08-15T16:02:05Z
dc.date.copyright2016en_US
dc.date.issued2016-08-15
dc.degree.departmentDepartment of Computer Science
dc.degree.levelDoctor of Philosophy Ph.D.en_US
dc.description.abstractIn this dissertation, I investigate interactive visual history for collaborative exploratory data analysis (EDA). In particular, I examine the use of analysis history for improving awareness of dimension space coverage to better support data exploration. Commonly, interactive history tools facilitate data analysis by capturing and representing information about the analysis process. These tools can support a wide range of use-cases, from simple undo and redo to complete reconstructions of the visualization pipeline. In the context of exploratory collaborative Visual Analytics (VA), history tools are commonly used for reviewing and reusing past states/actions and do not efficiently support other use-cases such as understanding the past analysis from the angle of dimension space coverage. However, such knowledge is essential for exploratory analysis, which requires constant formulation of new questions about data. To carry out exploration, an analyst needs to understand “what has been done” versus “what is remaining” to explore. Lack of such insight can result in premature fixation on certain questions, compromising the coverage of the data set and breadth of exploration [80]. In addition, exploration of large data sets sometimes requires collaboration among a group of analysts who might be in different time/location settings. In this case, in addition to personal analysis history, each team member needs to understand what aspects of the problem his or her collaborators have explored. Such scenarios are common in domains such as science and business [34], where analysts explore large multi-dimensional data sets in search of relationships, patterns and trends. Currently, analysts typically rely on memory and/or externalization to keep track of investigated versus uninvestigated aspects of the problem. 
Although analysis history mechanisms have the potential to assist analyst(s) with this problem, most common visual representations of history are geared towards reviewing and reusing the visualization pipeline or visualization states. I started this research with an observational user study to gain a better understanding of analysts’ history needs in the context of collaborative exploratory VA. This study showed that understanding the coverage of dimension space by using linear history was cumbersome and inefficient. To address this problem, I investigated how alternate visual representations of analysis history could support this use-case. First, I designed and evaluated Footprint-I, a visual history tool that represented analysis from the angle of dimension space coverage (i.e., history of investigation of data dimensions; specifically, this approach revealed which dimensions had been previously investigated and in which combinations). I performed a user study that evaluated participants’ ability to recall the scope of past analysis using my proposed design versus a linear representation of analysis history. I measured participants’ task duration and accuracy in answering questions about a past exploratory VA session. Findings of this study showed that participants with access to dimension space coverage information were both faster and more accurate in understanding dimension space coverage information. Next, I studied the effects of providing coverage information on collaboration. To investigate this question, I designed and implemented Footprint-II, the next version of Footprint-I. In this version, I redesigned the representation of dimension space coverage to be more usable and scalable. I conducted a user study that measured the effects of presenting history from the angle of dimension space coverage on task coordination (tacit breakdown of a common task between collaborators). 
I asked each participant to assume the role of a business data analyst and continue an exploratory analysis that had been started by a collaborator. The results of this study showed that providing dimension space coverage information helped participants to focus on dimensions that were not investigated in the initial analysis, hence improving tacit task coordination. Finally, I investigated the effects of providing live dimension space coverage information on VA outcomes. To this end, I designed and implemented a standalone prototype VA tool with a visual history module. I used scented widgets [76] to incorporate real-time dimension space coverage information into the GUI widgets. Results of a user study showed that providing live dimension space coverage information increased the number of top-level findings. Moreover, it expanded the breadth of exploration (without compromising the depth) and helped analysts to formulate and ask more questions about their data.en_US
dc.description.proquestcode0984en_US
dc.description.proquestemailali.sarvghad@gmail.comen_US
dc.description.scholarlevelGraduateen_US
dc.identifier.urihttp://hdl.handle.net/1828/7442
dc.languageEnglisheng
dc.language.isoenen_US
dc.rightsAvailable to the World Wide Weben_US
dc.rights.urihttp://creativecommons.org/licenses/by-nc-nd/2.5/ca/*
dc.subjectExploratory Data Analysisen_US
dc.subjectAnalysis Historyen_US
dc.subjectDimension Space Coverageen_US
dc.subjectVisualizationen_US
dc.subjectTabular Dataen_US
dc.subjectEmpirical laboratory studyen_US
dc.subjectScented widgetsen_US
dc.subjectCollaborationen_US
dc.subjectLarge Interactive Surfacesen_US
dc.titleTracking and visualizing dimension space coverage for exploratory data analysisen_US
dc.typeThesisen_US

Files

Original bundle
Name:
Sarvghad_Ali_PhD_2016.pdf
Size:
15.22 MB
Format:
Adobe Portable Document Format
Description:
Dissertation
License bundle
Name:
license.txt
Size:
1.74 KB
Format:
Item-specific license agreed upon to submission