Sunday 29 January 2012

A New Privacy Measure for Data Publishing


1. ABSTRACT
The k-anonymity privacy requirement for publishing microdata requires that each equivalence class (i.e., a set of records that are indistinguishable from each other with respect to certain "identifying" attributes) contains at least k records. It has recently been recognized that k-anonymity cannot prevent attribute disclosure. The notion of ℓ-diversity has been proposed to address this; ℓ-diversity requires that each equivalence class has at least ℓ well-represented values for each sensitive attribute. Here, we show that ℓ-diversity has a number of limitations. In particular, it is neither necessary nor sufficient to prevent attribute disclosure. Motivated by these limitations, we propose a new notion of privacy called "closeness". We first present the base model, t-closeness, which requires that the distribution of a sensitive attribute in any equivalence class is close to the distribution of the attribute in the overall table (i.e., the distance between the two distributions should be no more than a threshold t). We then propose a more flexible privacy model called (n, t)-closeness that offers higher utility. We describe our desiderata for designing a distance measure between two probability distributions and present two distance measures. We discuss the rationale for using closeness as a privacy measure and illustrate its advantages through examples and experiments.
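
The t-closeness condition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not name its two distance measures, so the sketch uses the variational (total variation) distance as a stand-in, and the function names and the sample disease values are hypothetical.

```python
from collections import Counter

def distribution(values):
    """Empirical distribution of a list of values as {value: probability}."""
    counts = Counter(values)
    total = len(values)
    return {v: c / total for v, c in counts.items()}

def variational_distance(p, q):
    """Half the sum of absolute probability differences (an assumed stand-in
    for the distance measures mentioned in the abstract)."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)

def satisfies_t_closeness(classes, t):
    """classes: one list of sensitive values per equivalence class.
    True if every class distribution is within distance t of the overall one."""
    overall = distribution([v for cls in classes for v in cls])
    return all(variational_distance(distribution(cls), overall) <= t
               for cls in classes)

# Hypothetical sensitive values for two equivalence classes.
classes = [
    ["flu", "flu", "cancer"],
    ["flu", "cancer", "cancer"],
]

print(satisfies_t_closeness(classes, 0.2))
print(satisfies_t_closeness(classes, 0.1))
```

In this toy table each class is at distance 1/6 from the overall distribution, so the table meets the requirement for t = 0.2 but not for t = 0.1. A smaller t gives stronger privacy and generally leaves less room for useful detail in the published classes.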

2. EXISTING SCHEME
Organizations often need to publish microdata, e.g., medical data or census data, for research and other purposes. Typically, such data is stored in a table, and each record (row) corresponds to one individual. Each record has a number of attributes, which can be divided into the following three categories.
(1) Attributes that clearly identify individuals. These are known as explicit identifiers and include, e.g., Social Security Number.
(2) Attributes whose values, when taken together, can potentially identify an individual. These are known as quasi-identifiers and may include, e.g., Zip-code, Birth-date, and Gender.
(3) Attributes that are considered sensitive, such as Disease and Salary.
When releasing microdata, it is necessary to prevent the sensitive information of the individuals from being disclosed. Two types of information disclosure have been identified in the literature: identity disclosure and attribute disclosure.
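
As a rough illustration of the three attribute categories, the sketch below drops the explicit identifier from each record and groups the remaining records by their quasi-identifier values. The column names, the generalized values such as "476**", and the sample records are all hypothetical and serve only to make the example runnable.

```python
from collections import defaultdict

# Hypothetical column names for the three attribute categories.
EXPLICIT_IDENTIFIERS = ["ssn"]
QUASI_IDENTIFIERS = ["zip", "birth_year", "gender"]
SENSITIVE = ["disease"]

def strip_explicit_identifiers(records):
    """Remove explicit identifiers before the table is published."""
    return [{k: v for k, v in r.items() if k not in EXPLICIT_IDENTIFIERS}
            for r in records]

def equivalence_classes(records):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for r in records:
        key = tuple(r[a] for a in QUASI_IDENTIFIERS)
        groups[key].append(r)
    return dict(groups)

# Made-up records with already generalized quasi-identifiers.
records = [
    {"ssn": "000-00-0001", "zip": "476**", "birth_year": "197*", "gender": "F", "disease": "flu"},
    {"ssn": "000-00-0002", "zip": "476**", "birth_year": "197*", "gender": "F", "disease": "cancer"},
    {"ssn": "000-00-0003", "zip": "479**", "birth_year": "198*", "gender": "M", "disease": "flu"},
]

published = strip_explicit_identifiers(records)
for key, members in equivalence_classes(published).items():
    print(key, [m["disease"] for m in members])
```

Removing explicit identifiers alone is not enough: the quasi-identifiers that remain decide which records fall into the same group, and the sensitive values inside each group are what an attribute disclosure would reveal.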
               
3. PROPOSED SCHEME

To effectively limit disclosure, we need to measure the disclosure risk of an anonymized table. To this end, k-anonymity was introduced as the property that each record is indistinguishable from at least k-1 other records with respect to the quasi-identifier. In other words, k-anonymity requires that each equivalence class contains at least k records. While k-anonymity protects against identity disclosure, it does not provide sufficient protection against attribute disclosure. We therefore propose two instantiations of the closeness privacy notion: a base model called t-closeness and a more flexible privacy model called (n, t)-closeness. We explain the rationale of the (n, t)-closeness model and show that it achieves a better balance between privacy and utility.
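
The gap between identity disclosure and attribute disclosure can be seen in a small example. The sketch below checks k-anonymity and then lists equivalence classes in which every record carries the same sensitive value. The function names, column names, and records are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

def is_k_anonymous(records, quasi_identifiers, k):
    """True if every equivalence class contains at least k records."""
    sizes = Counter(tuple(r[a] for a in quasi_identifiers) for r in records)
    return all(size >= k for size in sizes.values())

def homogeneous_classes(records, quasi_identifiers, sensitive):
    """Equivalence classes whose records all share one sensitive value.
    Anyone known to be in such a class has that value disclosed."""
    values = defaultdict(set)
    for r in records:
        values[tuple(r[a] for a in quasi_identifiers)].add(r[sensitive])
    return [key for key, vals in values.items() if len(vals) == 1]

# Made-up anonymized table with two equivalence classes of size 2.
table = [
    {"zip": "476**", "age": "2*", "disease": "cancer"},
    {"zip": "476**", "age": "2*", "disease": "cancer"},
    {"zip": "479**", "age": "3*", "disease": "flu"},
    {"zip": "479**", "age": "3*", "disease": "cancer"},
]

qi = ["zip", "age"]
print(is_k_anonymous(table, qi, 2))
print(homogeneous_classes(table, qi, "disease"))
```

The table is 2-anonymous, yet the first class contains a single disease value, so knowing that a person falls in that class reveals their disease without identifying their record. This is the kind of attribute disclosure that the closeness models are meant to limit.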

     
4. HARDWARE REQUIREMENTS

         System         : Pentium IV 2.4 GHz
         Hard Disk      : 40 GB
         Floppy Drive   : 1.44 MB
         Monitor        : 15-inch VGA Colour
         Mouse          : Logitech
         RAM            : 256 MB



5. SOFTWARE REQUIREMENTS

         Operating System   : Windows XP Professional
         Front End          : ASP.NET 2.0
         Coding Language    : Visual C# .NET

