UVicSpace

Data Cleaning using a Matching Dependency Technique

dc.contributor.author Jain, Shashank
dc.date.accessioned 2019-01-03T00:47:56Z
dc.date.available 2019-01-03T00:47:56Z
dc.date.copyright 2018 en_US
dc.date.issued 2019-01-02
dc.identifier.uri http://hdl.handle.net/1828/10477
dc.description.abstract In today’s digital society, people are often required to enter their home or office addresses on online forms. Minor mistakes, such as misspelled addresses or incorrect postal/zip codes, are common. Such mistakes can be quite problematic when automated systems must process the request. For example, if a person orders something online and provides an incorrect postal code in the address, the mistake could delay delivery of the item or, even worse, leave it undelivered. To avoid such situations, these systems often use a machine learning technique called ‘Matching Dependency’, which has proven helpful in recommending corrections for incorrect values in the input address. This technique uses a binary search algorithm to reduce the number of cycles the process has to go through to make recommendations. Our exploration of one possible implementation of this algorithm uses our own synthesized sample data sets in place of real user input, together with external data. The external data serves as the authenticated source against which the user input is verified: we compare our synthesized user input data with the external data, which is considered completely trustworthy. The system then makes possible recommendations based on the correctness of the user input. The evaluation was mainly done on data sets of two different sizes, 1000 and 15000. The results had zero false negatives, few false positives, and mostly relevant recommendations. en_US
dc.language.iso en en_US
dc.rights Available to the World Wide Web en_US
dc.title Data Cleaning using a Matching Dependency Technique en_US
dc.type Project en_US
dc.contributor.supervisor Coady, Yvonne
dc.degree.department Department of Computer Science en_US
dc.degree.level Master of Science M.Sc. en_US
dc.description.scholarlevel Graduate en_US
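
The abstract describes checking user-entered addresses against trusted external data, with a binary search to limit the number of comparisons. The following is a minimal sketch of that general idea, not the project's actual implementation. The reference records, the field names, the `recommend` function, and the 0.85 similarity threshold are all assumptions made for illustration. The rule applied is: if the entered street and city closely match a trusted record, the remaining fields should agree with that record.

```python
from bisect import bisect_left
from difflib import SequenceMatcher

# Hypothetical trusted reference data (street, city, postal_code).
# These records are invented for illustration only.
TRUSTED = [
    ("12 Oak Street", "Springfield", "A1A 1A1"),
    ("98 Pine Avenue", "Riverton", "B2B 2B2"),
    ("5 Elm Road", "Lakeside", "C3C 3C3"),
]


def _key(street, city):
    """Build a normalized, case-insensitive lookup key."""
    return " ".join(f"{city} {street}".lower().split())


# Sort once so each lookup can use binary search.
REFERENCE = sorted((_key(s, c), s, c, p) for s, c, p in TRUSTED)
KEYS = [row[0] for row in REFERENCE]


def recommend(street, city, postal_code, threshold=0.85):
    """Return a dict of suggested corrections, {} if the input already
    agrees with the trusted record, or None if no trusted record is
    similar enough. The threshold is an assumed tuning value."""
    key = _key(street, city)
    i = bisect_left(KEYS, key)  # O(log n) position in the sorted keys

    # Only the neighbours of the insertion point are compared.
    best, best_score = None, 0.0
    for j in (i - 1, i):
        if 0 <= j < len(KEYS):
            score = SequenceMatcher(None, key, KEYS[j]).ratio()
            if score > best_score:
                best, best_score = REFERENCE[j], score

    if best is None or best_score < threshold:
        return None

    _, t_street, t_city, t_postal = best
    entered = {"street": street, "city": city, "postal_code": postal_code}
    trusted = {"street": t_street, "city": t_city, "postal_code": t_postal}
    # Suggest the trusted value for every field that disagrees.
    return {f: trusted[f] for f in entered if entered[f] != trusted[f]}
```

Checking only the neighbours of the insertion point keeps each lookup cheap, but it can miss a misspelling that changes the first characters of the key; a production system would likely need a more robust similarity index.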

