Trigger: A Hybrid Model for Low-Latency Processing of Large Data Sets

dc.contributor.author Xiang, Min
dc.date.accessioned 2015-08-12T20:51:48Z
dc.date.available 2015-08-12T20:51:48Z
dc.date.copyright 2015 en_US
dc.date.issued 2015-08-12
dc.identifier.uri http://hdl.handle.net/1828/6431
dc.description.abstract Large data sets now need to be processed at close to real-time speeds. For example, video hosting sites like YouTube and Netflix handle a huge amount of traffic every day, and large amounts of data need to be processed on demand so that statistics, analytics, or application logic can generate content for user queries. In such cases, data can be stream processed or batch processed. Stream processing treats the incoming data as a stream and processes it through a processing pipeline as soon as the data is gathered. It is more computationally intensive but grants lower latency. Batch processing tries to gather more data before processing it. It consumes fewer resources, but at the cost of higher latency. This project explores an adaptable model that allows a developer to strike a balance between the efficient use of computational resources and the amount of latency involved in processing a large data set. The proposed model uses an event-triggered batch processing method to balance resource utilization against latency. The model is also configurable, so it can adapt to different tradeoffs according to application-specific needs. In a very simple application and an extreme best-case scenario, we show that this model offers a 1 second latency when applied to a video hosting site where a traditional batch processing method introduced 1 minute of latency. When the initial system has low latency, this model will not increase the latency when appropriate parameters are chosen. en_US
dc.language.iso en en_US
dc.rights Available to the World Wide Web en_US
dc.rights.uri http://creativecommons.org/publicdomain/zero/1.0/ *
dc.subject low latency en_US
dc.subject batch processing en_US
dc.subject live data sets en_US
dc.title Trigger: A Hybrid Model for Low-Latency Processing of Large Data Sets en_US
dc.type project en_US
dc.contributor.supervisor Coady, Yvonne
dc.degree.department Department of Computer Science en_US
dc.degree.level Master of Science M.Sc. en_US
dc.description.scholarlevel Graduate en_US
