ABSTRACT:
A data distributor has given sensitive data to a set
of supposedly trusted agents (third parties). Some of the data is leaked and
found in an unauthorized place (e.g., on the web or somebody’s laptop). The distributor
must assess the likelihood that the leaked data came from one or more agents,
as opposed to having been independently gathered by other means. We propose
data allocation strategies (across the agents) that improve the probability of
identifying leakages. These methods do not rely on alterations of the released
data (e.g., watermarks). In some
cases we can also inject “realistic but fake” data records to further improve
our chances of detecting leakage and identifying the guilty party.
EXISTING SYSTEM:
Traditionally, leakage detection is handled by
watermarking, e.g., a unique code is embedded in each distributed copy. If that
copy is later discovered in the hands of an unauthorized party, the leaker can
be identified. Watermarks can be very useful in some cases, but again, involve
some modification of the original data. Furthermore, watermarks can sometimes
be destroyed if the data recipient is malicious. Such data sharing is common in
practice: a hospital may give patient records to researchers who will devise new
treatments; a company may have partnerships with other companies that require
sharing customer data; and an enterprise may outsource its data processing, so
data must be given to various other companies. We call
the owner of the data the distributor and the supposedly trusted third parties the
agents.
PROPOSED SYSTEM:
Our goal is to detect when the distributor’s sensitive
data has been leaked by agents, and if possible to identify the agent that
leaked the data. Perturbation is a useful technique in which the data is modified
and made “less sensitive” before being handed to agents; in contrast, we develop
unobtrusive techniques for detecting leakage of a set of objects or records
without altering them.
In this section we develop a model for assessing the “guilt”
of agents. We also present algorithms for distributing objects to agents, in a
way that improves our chances of identifying a leaker. Finally, we also
consider the option of adding “fake” objects to the distributed set. Such
objects do not correspond to real entities but appear realistic to the agents.
In a sense, the fake objects act as a type of watermark for the entire set,
without modifying any individual members. If it turns out that an agent was given
one or more fake objects that were leaked, then the distributor can be more
confident that the agent was guilty.
Problem Setup and Notation:
A distributor owns a set T = {t1, ..., tm} of valuable data objects. The distributor
wants to share some of the objects with a set of agents U1, U2, ..., Un, but does not
wish the objects to be leaked to other third parties. The objects in T could be of
any type and size; e.g., they could be tuples in a relation, or relations in a
database. An agent Ui receives a subset of objects Ri ⊆ T, determined either by a
sample request or an explicit request (both are sketched in the code below):
1. Sample request: the agent asks for some number mi of objects, and the distributor chooses which objects from T to send.
2. Explicit request: the agent asks for all objects in T that satisfy a stated condition.
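A minimal C# sketch of the two request types (C# per the software listed below; all names here are illustrative, not code from the paper):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class Requests
{
    static readonly Random Rng = new Random();

    // Explicit request: agent Ui receives every object in T that
    // satisfies its stated condition.
    public static List<TObj> Explicit<TObj>(IEnumerable<TObj> t,
                                            Func<TObj, bool> condition)
        => t.Where(condition).ToList();

    // Sample request: agent Ui receives mi objects from T; the
    // distributor is free to choose which ones (here: at random).
    public static List<TObj> Sample<TObj>(IList<TObj> t, int mi)
        => t.OrderBy(_ => Rng.Next()).Take(mi).ToList();
}
```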
Guilt Model Analysis:
To see how our model parameters interact, and to check whether the interactions
match our intuition, in this section we study two simple scenarios: the impact of
the guessing probability p, and the impact of the overlap between Ri and S. In
each scenario we have a target that has obtained all the distributor’s objects,
i.e., T = S.
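To make the model concrete, the sketch below computes a guilt estimate for one agent under the standard assumptions of this kind of model: p is the probability that the target could have guessed an object on its own, and a leaked object t is equally likely to have come from any of the |Vt| agents that received it. The method and parameter names are illustrative.

```csharp
using System.Collections.Generic;

static class GuiltModel
{
    // Pr{Gi | S}: probability that agent Ui is guilty given leaked set S.
    // leakedAndReceived = S ∩ Ri; holders[t] = |Vt|, the number of agents
    // that received object t; p = probability the target guessed t alone.
    public static double GuiltProbability(
        IEnumerable<string> leakedAndReceived,
        IReadOnlyDictionary<string, int> holders,
        double p)
    {
        double notGuilty = 1.0;   // probability Ui leaked none of S ∩ Ri
        foreach (var t in leakedAndReceived)
            notGuilty *= 1.0 - (1.0 - p) / holders[t];
        return 1.0 - notGuilty;
    }
}
```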
Algorithms:
1. Evaluation of Explicit Data Request Algorithms
The goal of these experiments was, first, to see whether fake objects in the
distributed data sets yield a significant improvement in our chances of detecting
a guilty agent, and second, to evaluate our e-optimal algorithm relative to a
random allocation (a sketch of the contrast follows).
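The contrast being evaluated can be sketched as follows. The greedy placement below is only an illustrative stand-in for the e-optimal idea, under the assumption that a fake object helps most where agents' sets overlap most; a random baseline would pick agents uniformly instead.

```csharp
using System.Collections.Generic;
using System.Linq;

static class FakeAllocation
{
    // Give each of b fake objects to the agent whose set currently
    // overlaps the most with the other agents' sets; a random baseline
    // would instead pick the receiving agent uniformly at random.
    public static void AllocateFakesGreedy(List<HashSet<string>> agentSets, int b)
    {
        for (int k = 0; k < b; k++)
        {
            int target = Enumerable.Range(0, agentSets.Count)
                .OrderByDescending(i => agentSets
                    .Where((s, j) => j != i)
                    .Sum(s => s.Intersect(agentSets[i]).Count()))
                .First();
            agentSets[target].Add($"fake-{k}");   // hypothetical record id
        }
    }
}
```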
2. Evaluation of Sample Data Request Algorithms
With sample data requests, agents are not interested in particular objects, so
object sharing is not explicitly defined by their requests. The distributor is
“forced” to allocate certain objects to multiple agents only if the total number
of requested objects exceeds the number of objects in set T. The more data
objects the agents request in total, the more recipients on average an object
has; and the more objects are shared among different agents, the more difficult
it is to detect a guilty agent. A sample allocator in this spirit is sketched
below.
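A minimal sketch of a sample-request allocator consistent with this description, assuming each request is just a size mi and the distributor may pick any objects:

```csharp
using System.Collections.Generic;
using System.Linq;

static class SampleAllocation
{
    // Each agent asks only for a number of objects (requestSizes[i] = mi).
    // Handing out the least-shared objects first keeps overlap low until
    // total demand exceeds |T|.
    public static List<HashSet<string>> AllocateSamples(
        IList<string> t, IList<int> requestSizes)
    {
        var holders = t.ToDictionary(o => o, _ => 0);  // object -> #recipients
        var result = new List<HashSet<string>>();
        foreach (int mi in requestSizes)
        {
            var ri = holders.OrderBy(kv => kv.Value)   // least-shared first
                            .Take(mi)
                            .Select(kv => kv.Key)
                            .ToHashSet();
            foreach (var o in ri) holders[o]++;
            result.Add(ri);
        }
        return result;
    }
}
```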
MODULES:
1. Data Allocation Module:
The main focus of our project is the data allocation problem: how the
distributor can “intelligently” give data to agents in order to improve the
chances of detecting a guilty agent.
2. Fake Object Module:
Fake objects are objects generated by
the distributor in order to increase the chances of detecting agents that leak
data. The distributor may be able to add fake objects to the distributed data
in order to improve his effectiveness in detecting guilty agents. Our use of
fake objects is inspired by the use of “trace” records in mailing lists.
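For illustration only, a fake record might be generated as below, using a hypothetical patient schema that echoes the hospital example above (this is not the paper's generator; a real deployment would draw values from the actual schema):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical schema echoing the hospital example; not the paper's code.
public record Patient(string Name, DateTime BirthDate, string Diagnosis);

static class FakeObjects
{
    static readonly string[] Names = { "J. Doe", "A. Smith", "R. Brown" };
    static readonly string[] Codes = { "I10", "E11.9", "J45" };  // ICD-10-like

    // Build a realistic-looking record that matches no real patient, and
    // register it so the distributor remembers which fakes were planted.
    public static Patient CreateFake(Random rng, ISet<Patient> planted)
    {
        var fake = new Patient(
            Names[rng.Next(Names.Length)],
            DateTime.Today.AddDays(-rng.Next(8000, 30000)),  // plausible age
            Codes[rng.Next(Codes.Length)]);
        planted.Add(fake);
        return fake;
    }
}
```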
3. Optimization Module:
In the Optimization Module, the distributor’s data allocation to agents has one
constraint and one objective. The distributor’s constraint is to satisfy agents’
requests, by providing them with the number of objects they request or with all
available objects that satisfy their conditions. His objective is to be able to
detect an agent who leaks any portion of his data.
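One way to make this objective concrete (an illustrative scoring assumption, not spelled out above): score an allocation by the sum of pairwise relative overlaps between agents' sets, and prefer allocations that minimize it.

```csharp
using System.Collections.Generic;
using System.Linq;

static class Optimization
{
    // Sum over ordered pairs (i, j), i != j, of |Ri ∩ Rj| / |Ri|.
    // Lower values mean the agents' sets are more distinguishable,
    // which makes a guilty agent easier to single out.
    public static double OverlapObjective(List<HashSet<string>> agentSets)
    {
        double total = 0.0;
        for (int i = 0; i < agentSets.Count; i++)
            for (int j = 0; j < agentSets.Count; j++)
                if (i != j)
                    total += (double)agentSets[i].Intersect(agentSets[j]).Count()
                             / agentSets[i].Count;
        return total;
    }
}
```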
4. Data Distributor:
The distributor is the owner of the sensitive data that is given to the
supposedly trusted agents. When some of the data is leaked and found in an
unauthorized place (e.g., on the web or on somebody’s laptop), the distributor
must assess the likelihood that the leaked data came from one or more agents, as
opposed to having been independently gathered by other means.
Hardware Required:
- System: Pentium IV, 2.4 GHz
- Hard Disk: 40 GB
- Floppy Drive: 1.44 MB
- Monitor: 15" VGA colour
- Mouse: Logitech
- Keyboard: 110 keys enhanced
- RAM: 256 MB
Software Required:
- O/S: Windows XP
- Language: ASP.NET, C#
- Database: SQL Server 2005