Privacy-Preserving Data Mining: Models and Algorithms

Privacy-Preserving Data Mining: Models and Algorithms

http://kingcheapebook.blogspot.com/2013/12/privacy-preserving-data-mining-models.html

Privacy-Preserving Data Mining: Models and Algorithms (Advances in Database Systems) by Charu C. Aggarwal and Philip S. Yu

A fruitful direction for future data mining research will be the development of technique that incorporates privacy concerns. Specifically, we address the following question. Since the primary task in data mining is the development of models about aggregated data, can we develop accurate models without access to precise information in individual data records? We analyze the possibility of privacy in data mining techniques in two phasesrandomization and reconstruction. Data mining services require accurate input data for their results to be meaningful, but privacy concerns may influence users to provide spurious information. To preserve client privacy in the data mining process, techniques based on random perturbation of data records are used. Suppose there are many clients, each having some personal information, and one server, which is interested only in aggregate, statistically significant, properties of this information. The clients can protect privacy of their data by perturbing it with a randomization algorithm and then submitting the randomized version. This approach is called randomization. The randomization algorithm is chosen so that aggregate properties of the data can be recovered with sufficient precision, while individual entries are significantly distorted. For the concept of using value distortion to protect privacy to be useful, we need to be able to reconstruct the original data distribution so that data mining techniques can be effectively utilized to yield the required statistics. Analysis Let xi be the original instance of data at client i. We introduce a random shift yi using randomization technique explained below. The server runs the reconstruction algorithm (also explained below) on the perturbed value zi = xi + yi to get an approximate of the original data distribution suitable for data mining applications.
Ebook Format: PDF
Ebook page: 524
File size: 5.59 MB

$50.00

Post a Comment

 

Copyright © 2014. King Cheap eBook - All Rights Reserved