Privacy preservation for training datasets in database: application to decision tree learning

dc.contributor.author: Fong, Pui Kuen
dc.contributor.supervisor: Weber, Jens H.
dc.date.accessioned: 2008-12-15T22:11:32Z
dc.date.available: 2008-12-15T22:11:32Z
dc.date.copyright: 2008 [en_US]
dc.date.issued: 2008-12-15T22:11:32Z
dc.degree.department: Department of Computer Science
dc.degree.level: Master of Science M.Sc. [en_US]
dc.description.abstract: Privacy preservation is important for machine learning and data mining, but measures designed to protect private information sometimes result in a trade-off: reduced utility of the training samples. This thesis introduces a privacy-preserving approach that can be applied to decision-tree learning without concomitant loss of accuracy. It describes an approach to preserving the privacy of collected data samples in cases where information from the sample database has been partially lost. This approach converts the original sample datasets into a group of unreal datasets, from which an original sample cannot be reconstructed without the entire group of unreal datasets. The approach does not perform well for sample datasets with low frequency, or when there is low variance in the distribution of all samples. However, this problem can be solved, at the cost of some extra storage, through a modified implementation of the approach introduced later in the thesis. [en_US]
dc.identifier.uri: http://hdl.handle.net/1828/1291
dc.language: English [eng]
dc.language.iso: en [en_US]
dc.rights: Available to the World Wide Web [en_US]
dc.subject: Database [en_US]
dc.subject: Data-mining [en_US]
dc.subject: Decision-tree [en_US]
dc.subject: Privacy Preservation [en_US]
dc.subject: Machine-learning [en_US]
dc.subject: ID3 [en_US]
dc.subject: Data Perturbation [en_US]
dc.subject: Entropy [en_US]
dc.subject: Data Complementation [en_US]
dc.subject: Dataset [en_US]
dc.subject: Accuracy [en_US]
dc.subject.lcsh: UVic Subject Index::Sciences and Engineering::Applied Sciences::Computer science [en_US]
dc.title: Privacy preservation for training datasets in database: application to decision tree learning [en_US]
dc.type: Thesis [en_US]

Files

Original bundle
Name: BThesis(11-4-Final).pdf
Size: 606.73 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.95 KB
Format: Item-specific license agreed upon to submission