Trigger: A Hybrid Model for Low-Latency Processing of Large Data Sets

dc.contributor.author: Xiang, Min
dc.contributor.supervisor: Coady, Yvonne
dc.date.accessioned: 2015-08-12T20:51:48Z
dc.date.available: 2015-08-12T20:51:48Z
dc.date.copyright: 2015 (en_US)
dc.date.issued: 2015-08-12
dc.degree.department: Department of Computer Science (en_US)
dc.degree.level: Master of Science M.Sc. (en_US)
dc.description.abstract: Large data sets now need to be processed at close to real-time speeds. For example, video hosting sites like YouTube and Netflix have a huge amount of traffic every day, and large amounts of data need to be processed on demand so that statistics, analytics, or application logic can generate content for user queries. In such cases, data can be stream processed or batch processed. Stream processing treats the incoming data as a stream and sends it through a processing pipeline as soon as it is gathered. It is more computationally intensive but grants lower latency. Batch processing tries to gather more data before processing it. It consumes fewer resources but at the cost of higher latency. This project explores an adaptable model that allows a developer to strike a balance between the efficient use of computational resources and the amount of latency involved in processing a large data set. The proposed model uses an event-triggered batch processing method to balance resource utilization against latency. The model is also configurable, so it can adapt to different tradeoffs according to application-specific needs. In a very simple application and an extreme best-case scenario, we show that this model offers a 1-second latency when applied to a video hosting site where a traditional batch processing method introduced a 1-minute latency. When the initial system has low latency, this model will not increase the latency when appropriate parameters are chosen. (en_US)
dc.description.scholarlevel: Graduate (en_US)
dc.identifier.uri: http://hdl.handle.net/1828/6431
dc.language.iso: en (en_US)
dc.rights: Available to the World Wide Web (en_US)
dc.rights.uri: http://creativecommons.org/publicdomain/zero/1.0/ (*)
dc.subject: low latency (en_US)
dc.subject: batch processing (en_US)
dc.subject: live data sets (en_US)
dc.title: Trigger: A Hybrid Model for Low-Latency Processing of Large Data Sets (en_US)
dc.type: project (en_US)
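
The abstract describes event-triggered batch processing only in general terms. The following is a minimal sketch of how such a scheme might look, not the report's implementation: records are buffered as in batch processing, but a designated event (here, an incoming user query) forces the pending batch to be processed immediately. The class name `TriggeredBatcher`, the `is_trigger` predicate, the `max_batch` parameter, and the sample records are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TriggeredBatcher:
    """Buffers records and processes them as one batch when a trigger fires.

    Hypothetical illustration; names and parameters are assumptions.
    """
    process: Callable[[List[str]], None]   # batch job to run on each flush
    is_trigger: Callable[[str], bool]      # does this record force a flush?
    max_batch: int = 100                   # fallback flush size (tunable)
    buffer: List[str] = field(default_factory=list)

    def add(self, record: str) -> None:
        """Buffer a record; flush on a trigger event or a full buffer."""
        self.buffer.append(record)
        if self.is_trigger(record) or len(self.buffer) >= self.max_batch:
            self.flush()

    def flush(self) -> None:
        """Hand the buffered records to the batch job and clear the buffer."""
        if self.buffer:
            batch, self.buffer = self.buffer, []
            self.process(batch)


# Usage: collect the processed batches in a list so they can be inspected.
batches: List[List[str]] = []
batcher = TriggeredBatcher(
    process=batches.append,
    is_trigger=lambda r: r.startswith("query"),  # assumed trigger event
    max_batch=3,
)
for rec in ["view:a", "query:top", "view:b", "view:c", "view:d", "view:e"]:
    batcher.add(rec)
```

In this sketch, a smaller `max_batch` or a broader trigger predicate moves the behaviour toward stream processing (lower latency, more frequent processing), while a larger `max_batch` and a narrower predicate move it toward conventional batch processing.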

Files

Original bundle
Name: Xiang_Min_MSc_2015.pdf
Size: 191.32 KB
Format: Adobe Portable Document Format
Description: Main report
License bundle
Name: license.txt
Size: 1.74 KB
Format: Item-specific license agreed upon to submission