SlideShare a Scribd company logo
1 of 6
Download to read offline
International
OPEN

Journal

ACCESS

Of Modern Engineering Research (IJMER)

Data Leakage Detectionusing Distribution Method
Anis Sultana Syed, Mini
Dept of CS, Nimra College of Engineering and Tech...

ABSTRACT: Modern business activities rely on extensive email exchange. Email leakages have
become widespread, and the severe damage caused by such leakages constitutes a disturbing problem
for organizations. We study the following problem: A data distributor has given sensitive data to a set
of supposedly trusted agents (third parties). If the data distributed to third parties is found in a
public/private domain then finding the guilty party is a nontrivial task to distributor. Traditionally, this
leakage of data is handled by water marking technique which requires modification of data. If the
watermarked copy is found at some unauthorized site then distributor can claim his ownership. To
overcome the disadvantages of using watermark [2], data allocation strategies are used to improve the
probability of identifying guilty third parties. The distributor must assess the likelihood that the leaked
came from one or more agents, as opposed to having been independently gathered by other means. In
this project, we implement and analyze a guilt model that detects the agents using allocation strategies
without modifying the original data. The guilty agent is one who leaks a portion of distributed data. We
propose data allocation strategies that improve the probability of identifying leakages. In some cases
we can also inject “realistic but fake” data record to further improve our changes of detecting leakage
and identifying the guilty party. The algorithms implemented using fake objects will improve the
distributor chance of detecting guilty agents. It is observed that by minimizing the sum objective the
chance of detecting guilty agents will increase. We also developed a framework for generating fake
objects.

Keywords: Sensitive Data, Fake Objects, Data Allocation Strategies

I. INTRODUCTION
Demanding market conditions encourage many companies to outsource certain business processes (e.g.
marketing, human resources) and associated activities to a third party. This model is referred as Business Process
Outsourcing (BPO) and it allows companies to focus on their core competency by subcontracting other activities
to specialists, resulting in reduced operational costs and increased productivity. Security and business assurance
are essential for BPO. In most cases, the service providers need access to a company's intellectual property and
other confidential information to carry out their services. For example a human resources BPO vendor may need
access to employee databases with sensitive information (e.g. social security numbers), a patenting law firm to
some research results, a marketing service vendor to the contact information for customers or a payment service
provider may need access to the credit card numbers or bank account numbers of customers.
The main security problem in BPO is that the service provider may not be fully trusted or may not be
securely administered. Business agreements for BPO try to regulate how the data will be handled by service
providers, but it is almost impossible to truly enforce or verify such policies across different administrative
domains. Due to their digital nature, relational databases are easy to duplicate and in many cases a service
provider may have financial incentives to redistribute commercially valuable data or may simply fail to handle it
properly. Hence, we need powerful techniques that can detect and deter such dishonest.
We study unobtrusive techniques for detecting leakage of a set of objects or records. Specifically, we
study the following scenario: After giving a set of objects to agents, the distributor discovers some of those same
objects in an unauthorized place. (For example, the data may be found on a web site, or may be obtained through
a legal discovery process.) At this point the distributor can assess the likelihood that the leaked data came from
one or more agents, as opposed to having been independently gathered by other means. We develop a model for
assessing the “guilt” of agents. We also present algorithms for distributing objects to agents, in a way that
improves our chances of identifying a leaker. Finally, we also consider the option of adding “fake” objects to the
distributed set.

| IJMER | ISSN: 2249–6645 |

www.ijmer.com

| Vol. 4 | Iss. 1 | Jan. 2014 |146|
Data Leakage Detectionusing Distribution Method
II. PROBLEM DEFINITION
Suppose a distributor owns a set T = {t1,tm } of valuable data objects. The distributor wants to share
some of the objects with a set of agents U1,U2,… ,Un but does wish the objects be leaked to other third parties. An
agent Ui receives a subset of objects Ri which belongs to T, determined either by a sample request or an explicit
request,
Sample Request Ri = SAMPLE ( T,mi ) : Any subset of mi records from T can be given to Ui. Explicit
Request Ri = EXPLICIT ( T,condi ) : Agent Ui receives all the T objects that satisfy condi .
The objects in T could be of any type and size, e.g., they could be tuples in a relation, or relations in a
database. After giving objects to agents, the distributor discovers that a set S of T has leaked. This means that
some third party called the target has been caught in possession of S. For example, this target may be displaying S
on its web site, or perhaps as part of a legal discovery process, the target turned over S to the distributor. Since
the agents U1,U2,… ,Un, have some of the data, it is reasonable to suspect them leaking the data. However, the
agents can argue that they are innocent, and that the S data was obtained by the target through other means.
2.1 Agent Guilt Model
Suppose an agent Ui is guilty if it contributes one or more objects to the target. The event that agent Uiis
guilty for a given leaked set S is denoted by Gi | S. The next step is to estimate Pr {Gi | S }, i.e.,the probability
that agent Gi is guilty given evidence S.
To compute the Pr {Gi| S}, estimate the probability that values in S can be “guessed” by the target. For
instance, say some of the objects in t are emails of individuals. Conduct an experiment and ask a person to find
the email of say 100 individuals, the person may only discover say 20, leading to an estimate of 0.2. Call this
estimate as Pt, the probability that object t can be guessed by the target.
The two assumptions regarding the relationship among the various leakage events.
Assumption 1: For all t, t ∈ S such that t ≠ ťthe provenance of t is independent of the provenance ofť.The term
provenance in this assumption statement refers to the source of a value t that appears in the leaked set. The source
can be any of the agents who have t in their sets or the target itself.
Assumption 2: An object t ∈ S can only be obtained by the target in one of two ways.A single agent Ui leaked t
from its own Ri set, or
The target guessed (or obtained through other means) t without the help of any of the n agents.
To find the probability that an agent Ui is guilty given a set S, consider the target guessed t1 with probability p
and that agent leaks t1 to S with the probability 1-p. First compute the probability that he leaks a single object t to
S. To compute this, define the set of agents Vt = { Ui| t_Ri } that have t in their data sets. Then using Assumption
2 and known probability p, we have
Pr {some agent leaked t to S} = 1- p
………..…1.1
Assuming that all agents that belong to Vt can leak t to S with equal probability and using Assumption 2 obtain,
Pr { __ leaked _ to _} =
____|__|,
__ __ ∈ __ $

…1.2

0,
__ !"_#
Given that agent Ui is guilty if he leaks at least one value to S, with Assumption 1 and Equation 1.2 compute the
probability Pr { Gi| S}, agent Ui is guilty,

…..1.3
2.2 Data Allocation Problem
The distributor “intelligently” gives data to agents in order to improve the chances of detecting a guilty
agent. There are four instances of this problem, depending on the type of data requests made by agents and
whether “fake objects” [4] are allowed. Agent makes two types of requests, called sample and explicit. Based on
the requests the fakes objects are added to data list. Fake objects are objects generated by the distributor that are
not in set T. The objects are designed to look like real objects, and are distributed to agents together with the T
objects, in order to increase the chances of detecting agents that leak data.

| IJMER | ISSN: 2249–6645 |

www.ijmer.com

| Vol. 4 | Iss. 1 | Jan. 2014 |147|
Data Leakage Detectionusing Distribution Method

The Fig. 1 represents four problem instances with the names EF, E%, SF an&S%& , where E stands for
explicit requests, S for sample requests, F for the use of fake objects, and % for the case where fake objects are
not allowed.
The distributor may be able to add fake objects to the distributed data in order to improve his
effectiveness in detecting guilty agents. Since, fake objects may impact the correctness of what agents do, so they
may not always be allowable. Use of fake objects is inspired by the use of “trace” records in mailing lists. The
distributor creates and adds fake objects to the data that he distributes to agents. In many cases, the distributor
may be limited in how many fake objects he can create.
In EF problems, objective values are initialized by agent’s data requests. Say, for example, that T = {t1,t2
} and there are two agents with explicit data requests such that R1 = {t1,t2 } and R2 = {t1}. The distributor cannot
remove or alter the R1 or R2 data to decrease the overlap R1 R2 . However, say the distributor can create one fake
object (B = 1) and both agents can receive one fake object (b1=b2 = 1). If the distributor is able to create more
fake objects, he could further improve the objective.
2.3 Optimization Problem
The distributor’s data allocation to agents has one constraint and one objective. The distributor’s
constraint is to satisfy agents’ requests, by providing them with the number of objects they request or with all
available objects that satisfy their conditions. His objective is to be able to detect an agent who leaks any portion
of his data.
We consider the constraint as strict. The distributor may not deny serving an agent request and may not
provide agents with different perturbed versions of the same objects. The fake object distribution as the only
possible constraint relaxation
The objective is to maximize the chances of detecting a guilty agent that leaks all his data objects.
The Pr {Gj|S =Ri } or simply Pr { Gj| Ri } is the probability that Uj agent is guilty if the distributor
discovers a leaked table S that contains all Ri objects. The difference functions
∆(_, )* is defined as:
∆(_, )*= Pr {Gi |Ri}- Pr {Gj|Ri }
………..1.4
1) Problem definition: Let the distributor have data requests from n agents. The distributor wants to give tables
R1, …..Rnto agents U1 , . . . ,Unrespectively, so that
Distribution satisfies agents’ requests; and
Maximizes the guilt probability differences (i, j ) for all i, j = 1. . . n and i = j.
Assuming that the Rt sets satisfy the agents’ requests, we can express the problem as a multi-criterion 2)
Optimization problem:
Maximize (. . . , ( i, j ), . . . ) i ! = j
……….… 1.5
(over R1, . . . ,Rn )
The approximation [3] of objective of the above equation does not depend on agent’s probabilities and
therefore minimize the relative overlap among the agents as
Minimize (. . . ,

|+,∩+.|
..…..1.6
+,

, . . . ) i != j

(over R1, . . . ,Rn )
| IJMER | ISSN: 2249–6645 |

www.ijmer.com

| Vol. 4 | Iss. 1 | Jan. 2014 |148|
Data Leakage Detectionusing Distribution Method
This approximation is valid if minimizing the relative overlap,|+,∩++,

.|

maximizes

( i, j ).

2.4 Objective Approximation
In case of sample request, all requests are of fixed size. Therefore, maximizing the chance of detecting a
guilty agent that leaks all his data by minimizing,|+,∩++,.| is equivalent to minimizing (|Ri∩Rj|). The
minimum
value of (|Ri∩Rj|) maximizes Π(|Ri∩Rj|) and ( i, j ) , since Π(|Ri |) is fixed.
∩
If agents have explicit data requests, then overlaps |R
R | are defined by their own requests and
|R∩R | are fixed. Therefore, minimizing | R | j is equivalenti jto maximizing | R | (with the addition of
i
j
i
i
∩
fake objects). The maximum value of | Ri | minimizes Π(Ri ) and maximizes ( i, j ), since Π( RiRj) is fixed.
Our paper focuses on identifying the leaker. So we propose to trace the ip address of the leaker. The file
is send to the agents in the form of email attachments which need a secret key to download it. This secret key is
generated using a random function and send to the agent either on the mobile number used at registration or to the
other global email service account such as gmail.
Whenever the secret key mismatch takes place the fake file gets downloaded. To further enhance our
objective approximation ip address tracking [5][6][7] is done of the system where fake object is downloaded.
Various commands are available for getting ip address information. Ping, tracert, nslookup, etc anyone may be
used to get it. The ip address is traced with the time so as to overcome problem of dynamic ip addressing. But as
we are doing this for an organisation there is no problem of dynamic ip. Or else if looking for the ip address
universally it is unique for that period of time therefore it can be traced to unique system of the leaker.

III. ALLOCATION STRATEGIES
In this section the allocation strategies [1] solve exactly or approximately the scalar versions of Equation
1.7 for the different instances presented in Fig. 1. In Section A deals with problems with explicit data requests
and in Section B with problems with sample data requests.
3.1 Explicit Data Request
In case of explicit data request with fake not allowed, the distributor is not allowed to add fake objects to
the distributed data. So Data allocation is fully defined by the agent’s data request.
In case of explicit data request with fake allowed, the distributor cannot remove or alter the requests R
from the agent. However distributor can add the fake object. In algorithm for data allocation for explicit request,
the input to this is a set of request R1,R2,……,Rn from n agents and different conditions for requests. The eoptimal algorithm finds the agents that are eligible to receiving fake objects. Then create one fake object in
iteration and allocate it to the agent selected. The e-optimal algorithm minimizes every term of the objective
summation by adding maximum number bi of fake objects to every set Ri yielding optimal solution.
Step 1: Calculate total fake records as sum of fake records allowed. Step 2: While total fake objects > 0
Step 3: Select agent that will yield the greatest improvement in the sum objective
i.e. i =
Step 4: Create fake record
Step 5: Add this fake record to the agent and also to fake record set. Step 6: Decrement fake record from total
fake record set.
Algorithm makes a greedy choice by selecting the agent that will yield the greatest improvement in the sumobjective.
3.2 Sample Data Request
With sample data requests, each agent Ui may receive any T from a subset out of

different ones.

Hence, there are
different allocations. In every allocation, the distributor can permute T objects and
keep the same chances of guilty agent detection. The reason is that the guilt probability depends only on which
agents have received the leaked objects and not on the identity of the leaked
objects. Therefore, from the distributor’s perspective there
/ |T| are different allocations. An object
allocation that satisfies requests and ignores the distributor’s objective is to give each agent a unique subset of T
of size m. The s-max algorithm allocates to an agent the data record that yields the minimum increase of the
maximum relative overlap among any pair of agents. The s-max algorithm is as follows.
| IJMER | ISSN: 2249–6645 |

www.ijmer.com

| Vol. 4 | Iss. 1 | Jan. 2014 |149|
Data Leakage Detectionusing Distribution Method
Step 1: Initialize Min_overlap←1, the minimum out of the maximum relative overlaps that the
allocations of different objects to U
i
Step 2: for k∈ {k |tk∈ Ri} do
Initialize max_rel_ov ← 0, the maximum relative overlap between Ri and any set Rj that the
allocation of t toU
Step 3: for allkj = 1,i..., n : j = i and tk∈ RjdoCalculate absolute overlap as
abs_ov ← |Ri∩Rj | + 1 Calculate relative overlap as rel_ov ← abs_ov / min (mi ,mj )
Step 4: Find maximum relative as max_rel_ov ← MAX (max_rel_ov, rel_ov) If max_rel_ov ≤ min_overlap then
min_overlap ← max_rel_ov
ret_k ← k Return ret_k
It can be shown that algorithm s-max is optimal for the sum-objective and the max-objective in problems where
M ≤ |T| and n < |T|. It is also optimal for the max-objective if |T| ≤ M ≤ 2 |T| or all agents request data of the same
size.
It is observed that the relative performance of algorithm and main conclusion do not change. If p approaches to 0,
it becomes easier to find guilty agents and algorithm performance converges. On the other hand, if p approaches
1, the relative differences among algorithms grow since more evidence is needed to find an agent guilty.
The algorithm presented implements a variety of data distribution strategies that can improve the distributor’s
chances of identifying a leaker. It is shown that distributing objects judiciously can make a significant difference
in identifying guilty agents, especially in cases where there is large overlap in the data that agents must receive.

IV. RELATED WORK
The presented guilt detection approach is related to the data provenance problem [8]: tracing the lineage
of S objects implies essentially the detection of the guilty agents. Suggested solutions are domain specific, such
as lineage tracing for data warehouses [9], and assume some prior knowledge on the way a data view is created
out of data sources. Our problem formulation with objects and sets is more general and simplifies lineage tracing,
since we do not consider any data transformation from Ri sets to S.
As far as the allocation strategies are concerned, our work is mostly relevant to watermarking that is
used as a means of establishing original ownership of distributed objects. Watermarks were initially used in
images, video and audio data [2] whose digital representation includes considerable redundancy. Our approach
and watermarking are similar in the sense of providing agents with some kind of receiver-identifying information.
However, by its very nature, a watermark modifies the item being watermarked. If the object to be watermarked
cannot be modified then a watermark cannot be inserted.
In such cases methods that attach watermarks to the distributed data are not applicable. Finally, there are
also lots of other works on mechanisms that allow only authorized users to access sensitive data. Such approaches
prevent in some sense data leakage by sharing information only with trusted parties. However, these policies are
restrictive and may make it impossible to satisfy agents’ requests.

V. CONCLUSION
In doing a business there would be no need to hand over sensitive data to agents that may unknowingly
or maliciously leak it. And even if we had to hand over sensitive data, in a perfect world we could watermark
each object so that we could trace its origins with absolute certainty. However, in many cases we must indeed
work with agents that may not be 100% trusted, and we may not be certain if a leaked object came from an agent
or from some other source. In spite of these difficulties, we have shown it is possible to assess the likelihood that
an agent is responsible for a leak, based on the overlap of his data with the leaked data and the data of other
agents, and based on the probability that objects can be “guessed” by other means.
Our model is relatively simple, but we believe it captures the essential trade-offs. The algorithms we
have presented implement a variety of data distribution strategies that can improve the distributor’s chances of
identifying a leaker. We have shown that distributing objects judiciously can make a significant difference in
identifying guilty agents, especially in cases where there is large overlap in the data that agents must receive.
Our future work includes the investigation of agent guilt models that capture leakage scenarios that are
not studied in this paper. For example, what is the appropriate model for cases where agents can collude and
identify fake tuples? A preliminary discussion of such a model is available in. Another open problem is the
| IJMER | ISSN: 2249–6645 |

www.ijmer.com

| Vol. 4 | Iss. 1 | Jan. 2014 |150|
Data Leakage Detectionusing Distribution Method
extension of our allocation strategies so that they can handle agent requests in an online fashion (the presented
strategies assume that there is a fixed set of agents with requests known in advance).

REFERENCES
[1].
[2].
[3].
[4].
[5].
[6].
[7].

Rudragouda G Patil, “Development of Data leakage Detection Using Data Allocation Strategies International
Journal of Computer Applications in Engineering Sciences [VOL I, ISSUE II, JUNE 2011, [ISSN: 2231-4946].
S. Czerwinski, R. Fromm, and T. Hodes. Digital music distribution and audio watermarking.
L. Sweeney. Achieving k-anonymity privacy protection using generalization and suppression, 2002.
S. U. Nabar, B. Marthi, K. Kenthapadi, N. Mishra, and R. Motwani. Towards robustness in query auditing. In
VLDB ’06.
Stevens Le Blond, Chao Zhang Arnaud Legout, Keith Ross, Walid Dabbous, Exploiting P2P Communications to
Invade Users’ Privacy
How to Trace an IP Address http://www.wikihow.com/Trace-an-IP-Address
P. Buneman, S. Khanna, and W. C. Tan. Why and where: A characterization of data provenance. In J. V. den
Bussche and V. Vianu, editors, Database Theory - ICDT 2001, 8th International Conference, London, UK, January
4-6, 2001, Proceedings, volume 1973 of Lecture Notes in Computer Science, pages 316-330. Springer, 2001

| IJMER | ISSN: 2249–6645 |

www.ijmer.com

| Vol. 4 | Iss. 1 | Jan. 2014 |151|

More Related Content

What's hot

Subscription fraud analytics using classification
Subscription fraud analytics using classificationSubscription fraud analytics using classification
Subscription fraud analytics using classificationSomdeep Sen
 
Data leakage detection
Data leakage detectionData leakage detection
Data leakage detectionVikrant Arya
 
Data leakage detection
Data leakage detectionData leakage detection
Data leakage detectionbunnz12345
 
Fraud Detection using Data Mining Project
Fraud Detection using Data Mining ProjectFraud Detection using Data Mining Project
Fraud Detection using Data Mining ProjectAlbert Kennedy III
 
Data Allocation Strategies for Leakage Detection
Data Allocation Strategies for Leakage DetectionData Allocation Strategies for Leakage Detection
Data Allocation Strategies for Leakage DetectionIOSR Journals
 
Pollyanna Document Classifier
Pollyanna Document ClassifierPollyanna Document Classifier
Pollyanna Document ClassifierVijay PG
 
Privacy Preserving Based Cloud Storage System
Privacy Preserving Based Cloud Storage SystemPrivacy Preserving Based Cloud Storage System
Privacy Preserving Based Cloud Storage SystemKumar Goud
 
IRJET-Fake Product Review Monitoring
IRJET-Fake Product Review MonitoringIRJET-Fake Product Review Monitoring
IRJET-Fake Product Review MonitoringIRJET Journal
 
IRJET- Fake Profile Identification using Machine Learning
IRJET-  	  Fake Profile Identification using Machine LearningIRJET-  	  Fake Profile Identification using Machine Learning
IRJET- Fake Profile Identification using Machine LearningIRJET Journal
 
Game Theory Approach for Identity Crime Detection
Game Theory Approach for Identity Crime DetectionGame Theory Approach for Identity Crime Detection
Game Theory Approach for Identity Crime DetectionIOSR Journals
 
Modeling and Detection of Data Leakage Fraud
Modeling and Detection of Data Leakage FraudModeling and Detection of Data Leakage Fraud
Modeling and Detection of Data Leakage FraudIOSR Journals
 
A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime...
A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime...A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime...
A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime...Zakaria Zubi
 
Credit card fraud detection
Credit card fraud detectionCredit card fraud detection
Credit card fraud detectionvineeta vineeta
 
A COMPARATIVE ANALYSIS OF DIFFERENT FEATURE SET ON THE PERFORMANCE OF DIFFERE...
A COMPARATIVE ANALYSIS OF DIFFERENT FEATURE SET ON THE PERFORMANCE OF DIFFERE...A COMPARATIVE ANALYSIS OF DIFFERENT FEATURE SET ON THE PERFORMANCE OF DIFFERE...
A COMPARATIVE ANALYSIS OF DIFFERENT FEATURE SET ON THE PERFORMANCE OF DIFFERE...ijaia
 
Application of Data Mining and Machine Learning techniques for Fraud Detectio...
Application of Data Mining and Machine Learning techniques for Fraud Detectio...Application of Data Mining and Machine Learning techniques for Fraud Detectio...
Application of Data Mining and Machine Learning techniques for Fraud Detectio...Christian Adom
 
Crime analysis mapping, intrusion detection using data mining
Crime analysis mapping, intrusion detection using data miningCrime analysis mapping, intrusion detection using data mining
Crime analysis mapping, intrusion detection using data miningVenkat Projects
 
A Comparative Analysis of Different Feature Set on the Performance of Differe...
A Comparative Analysis of Different Feature Set on the Performance of Differe...A Comparative Analysis of Different Feature Set on the Performance of Differe...
A Comparative Analysis of Different Feature Set on the Performance of Differe...gerogepatton
 
Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning AlgorithmsCredit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning AlgorithmsHariteja Bodepudi
 
credit card fraud analysis using predictive modeling python project abstract
credit card fraud analysis using predictive modeling python project abstractcredit card fraud analysis using predictive modeling python project abstract
credit card fraud analysis using predictive modeling python project abstractVenkat Projects
 

What's hot (20)

Subscription fraud analytics using classification
Subscription fraud analytics using classificationSubscription fraud analytics using classification
Subscription fraud analytics using classification
 
547 551
547 551547 551
547 551
 
Data leakage detection
Data leakage detectionData leakage detection
Data leakage detection
 
Data leakage detection
Data leakage detectionData leakage detection
Data leakage detection
 
Fraud Detection using Data Mining Project
Fraud Detection using Data Mining ProjectFraud Detection using Data Mining Project
Fraud Detection using Data Mining Project
 
Data Allocation Strategies for Leakage Detection
Data Allocation Strategies for Leakage DetectionData Allocation Strategies for Leakage Detection
Data Allocation Strategies for Leakage Detection
 
Pollyanna Document Classifier
Pollyanna Document ClassifierPollyanna Document Classifier
Pollyanna Document Classifier
 
Privacy Preserving Based Cloud Storage System
Privacy Preserving Based Cloud Storage SystemPrivacy Preserving Based Cloud Storage System
Privacy Preserving Based Cloud Storage System
 
IRJET-Fake Product Review Monitoring
IRJET-Fake Product Review MonitoringIRJET-Fake Product Review Monitoring
IRJET-Fake Product Review Monitoring
 
IRJET- Fake Profile Identification using Machine Learning
IRJET-  	  Fake Profile Identification using Machine LearningIRJET-  	  Fake Profile Identification using Machine Learning
IRJET- Fake Profile Identification using Machine Learning
 
Game Theory Approach for Identity Crime Detection
Game Theory Approach for Identity Crime DetectionGame Theory Approach for Identity Crime Detection
Game Theory Approach for Identity Crime Detection
 
Modeling and Detection of Data Leakage Fraud
Modeling and Detection of Data Leakage FraudModeling and Detection of Data Leakage Fraud
Modeling and Detection of Data Leakage Fraud
 
A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime...
A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime...A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime...
A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime...
 
Credit card fraud detection
Credit card fraud detectionCredit card fraud detection
Credit card fraud detection
 
A COMPARATIVE ANALYSIS OF DIFFERENT FEATURE SET ON THE PERFORMANCE OF DIFFERE...
A COMPARATIVE ANALYSIS OF DIFFERENT FEATURE SET ON THE PERFORMANCE OF DIFFERE...A COMPARATIVE ANALYSIS OF DIFFERENT FEATURE SET ON THE PERFORMANCE OF DIFFERE...
A COMPARATIVE ANALYSIS OF DIFFERENT FEATURE SET ON THE PERFORMANCE OF DIFFERE...
 
Application of Data Mining and Machine Learning techniques for Fraud Detectio...
Application of Data Mining and Machine Learning techniques for Fraud Detectio...Application of Data Mining and Machine Learning techniques for Fraud Detectio...
Application of Data Mining and Machine Learning techniques for Fraud Detectio...
 
Crime analysis mapping, intrusion detection using data mining
Crime analysis mapping, intrusion detection using data miningCrime analysis mapping, intrusion detection using data mining
Crime analysis mapping, intrusion detection using data mining
 
A Comparative Analysis of Different Feature Set on the Performance of Differe...
A Comparative Analysis of Different Feature Set on the Performance of Differe...A Comparative Analysis of Different Feature Set on the Performance of Differe...
A Comparative Analysis of Different Feature Set on the Performance of Differe...
 
Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning AlgorithmsCredit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
 
credit card fraud analysis using predictive modeling python project abstract
credit card fraud analysis using predictive modeling python project abstractcredit card fraud analysis using predictive modeling python project abstract
credit card fraud analysis using predictive modeling python project abstract
 

Viewers also liked

Regression techniques to study the student performance in post graduate exam...
Regression techniques to study the student performance in post  graduate exam...Regression techniques to study the student performance in post  graduate exam...
Regression techniques to study the student performance in post graduate exam...IJMER
 
A Novel Method for Data Cleaning and User- Session Identification for Web Mining
A Novel Method for Data Cleaning and User- Session Identification for Web MiningA Novel Method for Data Cleaning and User- Session Identification for Web Mining
A Novel Method for Data Cleaning and User- Session Identification for Web MiningIJMER
 
Numerical Analysis of Rotating Mixing of Fluids in Container Induced by Contr...
Numerical Analysis of Rotating Mixing of Fluids in Container Induced by Contr...Numerical Analysis of Rotating Mixing of Fluids in Container Induced by Contr...
Numerical Analysis of Rotating Mixing of Fluids in Container Induced by Contr...IJMER
 
Colour Rendering For True Colour Led Display System
Colour Rendering For True Colour Led Display SystemColour Rendering For True Colour Led Display System
Colour Rendering For True Colour Led Display SystemIJMER
 
Optimized FIR filter design using Truncated Multiplier Technique
Optimized FIR filter design using Truncated Multiplier TechniqueOptimized FIR filter design using Truncated Multiplier Technique
Optimized FIR filter design using Truncated Multiplier TechniqueIJMER
 
Role of IT in Lean Manufacturing: A brief Scenario
Role of IT in Lean Manufacturing: A brief ScenarioRole of IT in Lean Manufacturing: A brief Scenario
Role of IT in Lean Manufacturing: A brief ScenarioIJMER
 
A Bayesian Probit Online Model Framework for Auction Fraud Detection
A Bayesian Probit Online Model Framework for Auction Fraud DetectionA Bayesian Probit Online Model Framework for Auction Fraud Detection
A Bayesian Probit Online Model Framework for Auction Fraud DetectionIJMER
 
Advanced Micro-Grids
Advanced Micro-Grids Advanced Micro-Grids
Advanced Micro-Grids IJMER
 
Performance Analysis of the Constructed Updraft Biomass Gasifier for Three Di...
Performance Analysis of the Constructed Updraft Biomass Gasifier for Three Di...Performance Analysis of the Constructed Updraft Biomass Gasifier for Three Di...
Performance Analysis of the Constructed Updraft Biomass Gasifier for Three Di...IJMER
 
An Investigative Review on Thermal Characterization of Hybrid Metal Matrix C...
An Investigative Review on Thermal Characterization of Hybrid  Metal Matrix C...An Investigative Review on Thermal Characterization of Hybrid  Metal Matrix C...
An Investigative Review on Thermal Characterization of Hybrid Metal Matrix C...IJMER
 
On Contra – RW Continuous Functions in Topological Spaces
On Contra – RW Continuous Functions in Topological SpacesOn Contra – RW Continuous Functions in Topological Spaces
On Contra – RW Continuous Functions in Topological SpacesIJMER
 
Design &Analysis of Pure Iron Casting with Different Moulds
Design &Analysis of Pure Iron Casting with Different MouldsDesign &Analysis of Pure Iron Casting with Different Moulds
Design &Analysis of Pure Iron Casting with Different MouldsIJMER
 
Cm32996999
Cm32996999Cm32996999
Cm32996999IJMER
 
Experimental Study Using Different Tools/Electrodes E.G. Copper, Graphite on...
Experimental Study Using Different Tools/Electrodes E.G.  Copper, Graphite on...Experimental Study Using Different Tools/Electrodes E.G.  Copper, Graphite on...
Experimental Study Using Different Tools/Electrodes E.G. Copper, Graphite on...IJMER
 
Br31263266
Br31263266Br31263266
Br31263266IJMER
 
An Approach for Project Scheduling Using PERT/CPM and Petri Nets (PNs) Tools
An Approach for Project Scheduling Using PERT/CPM and Petri Nets (PNs) ToolsAn Approach for Project Scheduling Using PERT/CPM and Petri Nets (PNs) Tools
An Approach for Project Scheduling Using PERT/CPM and Petri Nets (PNs) ToolsIJMER
 
Neural Network Based Optimal Switching Pattern Generation for Multiple Pulse ...
Neural Network Based Optimal Switching Pattern Generation for Multiple Pulse ...Neural Network Based Optimal Switching Pattern Generation for Multiple Pulse ...
Neural Network Based Optimal Switching Pattern Generation for Multiple Pulse ...IJMER
 
Minkowski Distance based Feature Selection Algorithm for Effective Intrusion ...
Minkowski Distance based Feature Selection Algorithm for Effective Intrusion ...Minkowski Distance based Feature Selection Algorithm for Effective Intrusion ...
Minkowski Distance based Feature Selection Algorithm for Effective Intrusion ...IJMER
 
High speed customized serial protocol for IP integration on FPGA based SOC ap...
High speed customized serial protocol for IP integration on FPGA based SOC ap...High speed customized serial protocol for IP integration on FPGA based SOC ap...
High speed customized serial protocol for IP integration on FPGA based SOC ap...IJMER
 
Dr3211721175
Dr3211721175Dr3211721175
Dr3211721175IJMER
 

Viewers also liked (20)

Regression techniques to study the student performance in post graduate exam...
Regression techniques to study the student performance in post  graduate exam...Regression techniques to study the student performance in post  graduate exam...
Regression techniques to study the student performance in post graduate exam...
 
A Novel Method for Data Cleaning and User- Session Identification for Web Mining
A Novel Method for Data Cleaning and User- Session Identification for Web MiningA Novel Method for Data Cleaning and User- Session Identification for Web Mining
A Novel Method for Data Cleaning and User- Session Identification for Web Mining
 
Numerical Analysis of Rotating Mixing of Fluids in Container Induced by Contr...
Numerical Analysis of Rotating Mixing of Fluids in Container Induced by Contr...Numerical Analysis of Rotating Mixing of Fluids in Container Induced by Contr...
Numerical Analysis of Rotating Mixing of Fluids in Container Induced by Contr...
 
Colour Rendering For True Colour Led Display System
Colour Rendering For True Colour Led Display SystemColour Rendering For True Colour Led Display System
Colour Rendering For True Colour Led Display System
 
Optimized FIR filter design using Truncated Multiplier Technique
Optimized FIR filter design using Truncated Multiplier TechniqueOptimized FIR filter design using Truncated Multiplier Technique
Optimized FIR filter design using Truncated Multiplier Technique
 
Role of IT in Lean Manufacturing: A brief Scenario
Role of IT in Lean Manufacturing: A brief ScenarioRole of IT in Lean Manufacturing: A brief Scenario
Role of IT in Lean Manufacturing: A brief Scenario
 
A Bayesian Probit Online Model Framework for Auction Fraud Detection
A Bayesian Probit Online Model Framework for Auction Fraud DetectionA Bayesian Probit Online Model Framework for Auction Fraud Detection
A Bayesian Probit Online Model Framework for Auction Fraud Detection
 
Advanced Micro-Grids
Advanced Micro-Grids Advanced Micro-Grids
Advanced Micro-Grids
 
Performance Analysis of the Constructed Updraft Biomass Gasifier for Three Di...
Performance Analysis of the Constructed Updraft Biomass Gasifier for Three Di...Performance Analysis of the Constructed Updraft Biomass Gasifier for Three Di...
Performance Analysis of the Constructed Updraft Biomass Gasifier for Three Di...
 
An Investigative Review on Thermal Characterization of Hybrid Metal Matrix C...
An Investigative Review on Thermal Characterization of Hybrid  Metal Matrix C...An Investigative Review on Thermal Characterization of Hybrid  Metal Matrix C...
An Investigative Review on Thermal Characterization of Hybrid Metal Matrix C...
 
On Contra – RW Continuous Functions in Topological Spaces
On Contra – RW Continuous Functions in Topological SpacesOn Contra – RW Continuous Functions in Topological Spaces
On Contra – RW Continuous Functions in Topological Spaces
 
Design &Analysis of Pure Iron Casting with Different Moulds
Design &Analysis of Pure Iron Casting with Different MouldsDesign &Analysis of Pure Iron Casting with Different Moulds
Design &Analysis of Pure Iron Casting with Different Moulds
 
Cm32996999
Cm32996999Cm32996999
Cm32996999
 
Experimental Study Using Different Tools/Electrodes E.G. Copper, Graphite on...
Experimental Study Using Different Tools/Electrodes E.G.  Copper, Graphite on...Experimental Study Using Different Tools/Electrodes E.G.  Copper, Graphite on...
Experimental Study Using Different Tools/Electrodes E.G. Copper, Graphite on...
 
Br31263266
Br31263266Br31263266
Br31263266
 
An Approach for Project Scheduling Using PERT/CPM and Petri Nets (PNs) Tools
An Approach for Project Scheduling Using PERT/CPM and Petri Nets (PNs) ToolsAn Approach for Project Scheduling Using PERT/CPM and Petri Nets (PNs) Tools
An Approach for Project Scheduling Using PERT/CPM and Petri Nets (PNs) Tools
 
Neural Network Based Optimal Switching Pattern Generation for Multiple Pulse ...
Neural Network Based Optimal Switching Pattern Generation for Multiple Pulse ...Neural Network Based Optimal Switching Pattern Generation for Multiple Pulse ...
Neural Network Based Optimal Switching Pattern Generation for Multiple Pulse ...
 
Minkowski Distance based Feature Selection Algorithm for Effective Intrusion ...
Minkowski Distance based Feature Selection Algorithm for Effective Intrusion ...Minkowski Distance based Feature Selection Algorithm for Effective Intrusion ...
Minkowski Distance based Feature Selection Algorithm for Effective Intrusion ...
 
High speed customized serial protocol for IP integration on FPGA based SOC ap...
High speed customized serial protocol for IP integration on FPGA based SOC ap...High speed customized serial protocol for IP integration on FPGA based SOC ap...
High speed customized serial protocol for IP integration on FPGA based SOC ap...
 
Dr3211721175
Dr3211721175Dr3211721175
Dr3211721175
 

Similar to Data Leakage Detectionusing Distribution Method

Data Leakage Detection
Data Leakage DetectionData Leakage Detection
Data Leakage DetectionAshwini Nerkar
 
A Robust Approach for Detecting Data Leakage and Data Leaker in Organizations
A Robust Approach for Detecting Data Leakage and Data Leaker in OrganizationsA Robust Approach for Detecting Data Leakage and Data Leaker in Organizations
A Robust Approach for Detecting Data Leakage and Data Leaker in OrganizationsIOSR Journals
 
Jpdcs1(data lekage detection)
Jpdcs1(data lekage detection)Jpdcs1(data lekage detection)
Jpdcs1(data lekage detection)Chaitanya Kn
 
83504808-Data-Leakage-Detection-1-Final.ppt
83504808-Data-Leakage-Detection-1-Final.ppt83504808-Data-Leakage-Detection-1-Final.ppt
83504808-Data-Leakage-Detection-1-Final.pptnaresh2004s
 
164788616_Data_Leakage_Detection_Complete_Project_Report__1_.docx.pdf
164788616_Data_Leakage_Detection_Complete_Project_Report__1_.docx.pdf164788616_Data_Leakage_Detection_Complete_Project_Report__1_.docx.pdf
164788616_Data_Leakage_Detection_Complete_Project_Report__1_.docx.pdfDrog3
 
IRJET- Data Leakage Detection System
IRJET- Data Leakage Detection SystemIRJET- Data Leakage Detection System
IRJET- Data Leakage Detection SystemIRJET Journal
 
data-leakage-detection
data-leakage-detectiondata-leakage-detection
data-leakage-detectionNagendra Kumar
 
Jpdcs1 data leakage detection
Jpdcs1 data leakage detectionJpdcs1 data leakage detection
Jpdcs1 data leakage detectionChaitanya Kn
 
Data leakage detection
Data leakage detectionData leakage detection
Data leakage detectionkalpesh1908
 
Data leakage detbxhbbhhbsbssusbgsgsbshsbsection.pdf
Data leakage detbxhbbhhbsbssusbgsgsbshsbsection.pdfData leakage detbxhbbhhbsbssusbgsgsbshsbsection.pdf
Data leakage detbxhbbhhbsbssusbgsgsbshsbsection.pdfnaresh2004s
 
dataleakagedetection-1811210400vgjcd01.pptx
dataleakagedetection-1811210400vgjcd01.pptxdataleakagedetection-1811210400vgjcd01.pptx
dataleakagedetection-1811210400vgjcd01.pptxnaresh2004s
 
Data leakage detection
Data leakage detectionData leakage detection
Data leakage detectiongaurav kumar
 
Data leakage detection
Data leakage detectionData leakage detection
Data leakage detectionMohit Pandey
 
Evolution of Forensic Data Analytics - EY India
Evolution of Forensic Data Analytics - EY IndiaEvolution of Forensic Data Analytics - EY India
Evolution of Forensic Data Analytics - EY Indiagauravmiishra701
 
Evolution of Forensic Data Analytics - EY India
Evolution of Forensic Data Analytics - EY IndiaEvolution of Forensic Data Analytics - EY India
Evolution of Forensic Data Analytics - EY Indiaaparnatikekar4
 

Similar to Data Leakage Detectionusing Distribution Method (20)

Data Leakage Detection
Data Leakage DetectionData Leakage Detection
Data Leakage Detection
 
A Robust Approach for Detecting Data Leakage and Data Leaker in Organizations
A Robust Approach for Detecting Data Leakage and Data Leaker in OrganizationsA Robust Approach for Detecting Data Leakage and Data Leaker in Organizations
A Robust Approach for Detecting Data Leakage and Data Leaker in Organizations
 
Jpdcs1(data lekage detection)
Jpdcs1(data lekage detection)Jpdcs1(data lekage detection)
Jpdcs1(data lekage detection)
 
Sub1555
Sub1555Sub1555
Sub1555
 
83504808-Data-Leakage-Detection-1-Final.ppt
83504808-Data-Leakage-Detection-1-Final.ppt83504808-Data-Leakage-Detection-1-Final.ppt
83504808-Data-Leakage-Detection-1-Final.ppt
 
164788616_Data_Leakage_Detection_Complete_Project_Report__1_.docx.pdf
164788616_Data_Leakage_Detection_Complete_Project_Report__1_.docx.pdf164788616_Data_Leakage_Detection_Complete_Project_Report__1_.docx.pdf
164788616_Data_Leakage_Detection_Complete_Project_Report__1_.docx.pdf
 
IRJET- Data Leakage Detection System
IRJET- Data Leakage Detection SystemIRJET- Data Leakage Detection System
IRJET- Data Leakage Detection System
 
data-leakage-detection
data-leakage-detectiondata-leakage-detection
data-leakage-detection
 
Jpdcs1 data leakage detection
Jpdcs1 data leakage detectionJpdcs1 data leakage detection
Jpdcs1 data leakage detection
 
purnima.ppt
purnima.pptpurnima.ppt
purnima.ppt
 
Data leakage detection
Data leakage detectionData leakage detection
Data leakage detection
 
DLD_SYNOPSIS
DLD_SYNOPSISDLD_SYNOPSIS
DLD_SYNOPSIS
 
709 713
709 713709 713
709 713
 
Data leakage detbxhbbhhbsbssusbgsgsbshsbsection.pdf
Data leakage detbxhbbhhbsbssusbgsgsbshsbsection.pdfData leakage detbxhbbhhbsbssusbgsgsbshsbsection.pdf
Data leakage detbxhbbhhbsbssusbgsgsbshsbsection.pdf
 
dataleakagedetection-1811210400vgjcd01.pptx
dataleakagedetection-1811210400vgjcd01.pptxdataleakagedetection-1811210400vgjcd01.pptx
dataleakagedetection-1811210400vgjcd01.pptx
 
Data leakage detection
Data leakage detectionData leakage detection
Data leakage detection
 
Data leakage detection
Data leakage detectionData leakage detection
Data leakage detection
 
Fake News Detection
Fake News DetectionFake News Detection
Fake News Detection
 
Evolution of Forensic Data Analytics - EY India
Evolution of Forensic Data Analytics - EY IndiaEvolution of Forensic Data Analytics - EY India
Evolution of Forensic Data Analytics - EY India
 
Evolution of Forensic Data Analytics - EY India
Evolution of Forensic Data Analytics - EY IndiaEvolution of Forensic Data Analytics - EY India
Evolution of Forensic Data Analytics - EY India
 

More from IJMER

A Study on Translucent Concrete Product and Its Properties by Using Optical F...
A Study on Translucent Concrete Product and Its Properties by Using Optical F...A Study on Translucent Concrete Product and Its Properties by Using Optical F...
A Study on Translucent Concrete Product and Its Properties by Using Optical F...IJMER
 
Developing Cost Effective Automation for Cotton Seed Delinting
Developing Cost Effective Automation for Cotton Seed DelintingDeveloping Cost Effective Automation for Cotton Seed Delinting
Developing Cost Effective Automation for Cotton Seed DelintingIJMER
 
Study & Testing Of Bio-Composite Material Based On Munja Fibre
Study & Testing Of Bio-Composite Material Based On Munja FibreStudy & Testing Of Bio-Composite Material Based On Munja Fibre
Study & Testing Of Bio-Composite Material Based On Munja FibreIJMER
 
Hybrid Engine (Stirling Engine + IC Engine + Electric Motor)
Hybrid Engine (Stirling Engine + IC Engine + Electric Motor)Hybrid Engine (Stirling Engine + IC Engine + Electric Motor)
Hybrid Engine (Stirling Engine + IC Engine + Electric Motor)IJMER
 
Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...
Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...
Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...IJMER
 
Geochemistry and Genesis of Kammatturu Iron Ores of Devagiri Formation, Sandu...
Geochemistry and Genesis of Kammatturu Iron Ores of Devagiri Formation, Sandu...Geochemistry and Genesis of Kammatturu Iron Ores of Devagiri Formation, Sandu...
Geochemistry and Genesis of Kammatturu Iron Ores of Devagiri Formation, Sandu...IJMER
 
Experimental Investigation on Characteristic Study of the Carbon Steel C45 in...
Experimental Investigation on Characteristic Study of the Carbon Steel C45 in...Experimental Investigation on Characteristic Study of the Carbon Steel C45 in...
Experimental Investigation on Characteristic Study of the Carbon Steel C45 in...IJMER
 
Non linear analysis of Robot Gun Support Structure using Equivalent Dynamic A...
Non linear analysis of Robot Gun Support Structure using Equivalent Dynamic A...Non linear analysis of Robot Gun Support Structure using Equivalent Dynamic A...
Non linear analysis of Robot Gun Support Structure using Equivalent Dynamic A...IJMER
 
Static Analysis of Go-Kart Chassis by Analytical and Solid Works Simulation
Static Analysis of Go-Kart Chassis by Analytical and Solid Works SimulationStatic Analysis of Go-Kart Chassis by Analytical and Solid Works Simulation
Static Analysis of Go-Kart Chassis by Analytical and Solid Works SimulationIJMER
 
High Speed Effortless Bicycle
High Speed Effortless BicycleHigh Speed Effortless Bicycle
High Speed Effortless BicycleIJMER
 
Integration of Struts & Spring & Hibernate for Enterprise Applications
Integration of Struts & Spring & Hibernate for Enterprise ApplicationsIntegration of Struts & Spring & Hibernate for Enterprise Applications
Integration of Struts & Spring & Hibernate for Enterprise ApplicationsIJMER
 
Microcontroller Based Automatic Sprinkler Irrigation System
Microcontroller Based Automatic Sprinkler Irrigation SystemMicrocontroller Based Automatic Sprinkler Irrigation System
Microcontroller Based Automatic Sprinkler Irrigation SystemIJMER
 
On some locally closed sets and spaces in Ideal Topological Spaces
On some locally closed sets and spaces in Ideal Topological SpacesOn some locally closed sets and spaces in Ideal Topological Spaces
On some locally closed sets and spaces in Ideal Topological SpacesIJMER
 
Intrusion Detection and Forensics based on decision tree and Association rule...
Intrusion Detection and Forensics based on decision tree and Association rule...Intrusion Detection and Forensics based on decision tree and Association rule...
Intrusion Detection and Forensics based on decision tree and Association rule...IJMER
 
Natural Language Ambiguity and its Effect on Machine Learning
Natural Language Ambiguity and its Effect on Machine LearningNatural Language Ambiguity and its Effect on Machine Learning
Natural Language Ambiguity and its Effect on Machine LearningIJMER
 
Evolvea Frameworkfor SelectingPrime Software DevelopmentProcess
Evolvea Frameworkfor SelectingPrime Software DevelopmentProcessEvolvea Frameworkfor SelectingPrime Software DevelopmentProcess
Evolvea Frameworkfor SelectingPrime Software DevelopmentProcessIJMER
 
Material Parameter and Effect of Thermal Load on Functionally Graded Cylinders
Material Parameter and Effect of Thermal Load on Functionally Graded CylindersMaterial Parameter and Effect of Thermal Load on Functionally Graded Cylinders
Material Parameter and Effect of Thermal Load on Functionally Graded CylindersIJMER
 
Studies On Energy Conservation And Audit
Studies On Energy Conservation And AuditStudies On Energy Conservation And Audit
Studies On Energy Conservation And AuditIJMER
 
An Implementation of I2C Slave Interface using Verilog HDL
An Implementation of I2C Slave Interface using Verilog HDLAn Implementation of I2C Slave Interface using Verilog HDL
An Implementation of I2C Slave Interface using Verilog HDLIJMER
 
Discrete Model of Two Predators competing for One Prey
Discrete Model of Two Predators competing for One PreyDiscrete Model of Two Predators competing for One Prey
Discrete Model of Two Predators competing for One PreyIJMER
 

More from IJMER (20)

A Study on Translucent Concrete Product and Its Properties by Using Optical F...
A Study on Translucent Concrete Product and Its Properties by Using Optical F...A Study on Translucent Concrete Product and Its Properties by Using Optical F...
A Study on Translucent Concrete Product and Its Properties by Using Optical F...
 
Developing Cost Effective Automation for Cotton Seed Delinting
Developing Cost Effective Automation for Cotton Seed DelintingDeveloping Cost Effective Automation for Cotton Seed Delinting
Developing Cost Effective Automation for Cotton Seed Delinting
 
Study & Testing Of Bio-Composite Material Based On Munja Fibre
Study & Testing Of Bio-Composite Material Based On Munja FibreStudy & Testing Of Bio-Composite Material Based On Munja Fibre
Study & Testing Of Bio-Composite Material Based On Munja Fibre
 
Hybrid Engine (Stirling Engine + IC Engine + Electric Motor)
Hybrid Engine (Stirling Engine + IC Engine + Electric Motor)Hybrid Engine (Stirling Engine + IC Engine + Electric Motor)
Hybrid Engine (Stirling Engine + IC Engine + Electric Motor)
 
Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...
Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...
Fabrication & Characterization of Bio Composite Materials Based On Sunnhemp F...
 
Geochemistry and Genesis of Kammatturu Iron Ores of Devagiri Formation, Sandu...
Geochemistry and Genesis of Kammatturu Iron Ores of Devagiri Formation, Sandu...Geochemistry and Genesis of Kammatturu Iron Ores of Devagiri Formation, Sandu...
Geochemistry and Genesis of Kammatturu Iron Ores of Devagiri Formation, Sandu...
 
Experimental Investigation on Characteristic Study of the Carbon Steel C45 in...
Experimental Investigation on Characteristic Study of the Carbon Steel C45 in...Experimental Investigation on Characteristic Study of the Carbon Steel C45 in...
Experimental Investigation on Characteristic Study of the Carbon Steel C45 in...
 
Non linear analysis of Robot Gun Support Structure using Equivalent Dynamic A...
Non linear analysis of Robot Gun Support Structure using Equivalent Dynamic A...Non linear analysis of Robot Gun Support Structure using Equivalent Dynamic A...
Non linear analysis of Robot Gun Support Structure using Equivalent Dynamic A...
 
Static Analysis of Go-Kart Chassis by Analytical and Solid Works Simulation
Static Analysis of Go-Kart Chassis by Analytical and Solid Works SimulationStatic Analysis of Go-Kart Chassis by Analytical and Solid Works Simulation
Static Analysis of Go-Kart Chassis by Analytical and Solid Works Simulation
 
High Speed Effortless Bicycle
High Speed Effortless BicycleHigh Speed Effortless Bicycle
High Speed Effortless Bicycle
 
Integration of Struts & Spring & Hibernate for Enterprise Applications
Integration of Struts & Spring & Hibernate for Enterprise ApplicationsIntegration of Struts & Spring & Hibernate for Enterprise Applications
Integration of Struts & Spring & Hibernate for Enterprise Applications
 
Microcontroller Based Automatic Sprinkler Irrigation System
Microcontroller Based Automatic Sprinkler Irrigation SystemMicrocontroller Based Automatic Sprinkler Irrigation System
Microcontroller Based Automatic Sprinkler Irrigation System
 
On some locally closed sets and spaces in Ideal Topological Spaces
On some locally closed sets and spaces in Ideal Topological SpacesOn some locally closed sets and spaces in Ideal Topological Spaces
On some locally closed sets and spaces in Ideal Topological Spaces
 
Intrusion Detection and Forensics based on decision tree and Association rule...
Intrusion Detection and Forensics based on decision tree and Association rule...Intrusion Detection and Forensics based on decision tree and Association rule...
Intrusion Detection and Forensics based on decision tree and Association rule...
 
Natural Language Ambiguity and its Effect on Machine Learning
Natural Language Ambiguity and its Effect on Machine LearningNatural Language Ambiguity and its Effect on Machine Learning
Natural Language Ambiguity and its Effect on Machine Learning
 
Evolvea Frameworkfor SelectingPrime Software DevelopmentProcess
Evolvea Frameworkfor SelectingPrime Software DevelopmentProcessEvolvea Frameworkfor SelectingPrime Software DevelopmentProcess
Evolvea Frameworkfor SelectingPrime Software DevelopmentProcess
 
Material Parameter and Effect of Thermal Load on Functionally Graded Cylinders
Material Parameter and Effect of Thermal Load on Functionally Graded CylindersMaterial Parameter and Effect of Thermal Load on Functionally Graded Cylinders
Material Parameter and Effect of Thermal Load on Functionally Graded Cylinders
 
Studies On Energy Conservation And Audit
Studies On Energy Conservation And AuditStudies On Energy Conservation And Audit
Studies On Energy Conservation And Audit
 
An Implementation of I2C Slave Interface using Verilog HDL
An Implementation of I2C Slave Interface using Verilog HDLAn Implementation of I2C Slave Interface using Verilog HDL
An Implementation of I2C Slave Interface using Verilog HDL
 
Discrete Model of Two Predators competing for One Prey
Discrete Model of Two Predators competing for One PreyDiscrete Model of Two Predators competing for One Prey
Discrete Model of Two Predators competing for One Prey
 

Recently uploaded

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 

Recently uploaded (20)

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 

Data Leakage Detectionusing Distribution Method

  • 1. International OPEN Journal ACCESS Of Modern Engineering Research (IJMER) Data Leakage Detectionusing Distribution Method Anis Sultana Syed, Mini Dept of CS, Nimra College of Engineering and Tech... ABSTRACT: Modern business activities rely on extensive email exchange. Email leakages have become widespread, and the severe damage caused by such leakages constitutes a disturbing problem for organizations. We study the following problem: A data distributor has given sensitive data to a set of supposedly trusted agents (third parties). If the data distributed to third parties is found in a public/private domain then finding the guilty party is a nontrivial task to distributor. Traditionally, this leakage of data is handled by water marking technique which requires modification of data. If the watermarked copy is found at some unauthorized site then distributor can claim his ownership. To overcome the disadvantages of using watermark [2], data allocation strategies are used to improve the probability of identifying guilty third parties. The distributor must assess the likelihood that the leaked came from one or more agents, as opposed to having been independently gathered by other means. In this project, we implement and analyze a guilt model that detects the agents using allocation strategies without modifying the original data. The guilty agent is one who leaks a portion of distributed data. We propose data allocation strategies that improve the probability of identifying leakages. In some cases we can also inject “realistic but fake” data record to further improve our changes of detecting leakage and identifying the guilty party. The algorithms implemented using fake objects will improve the distributor chance of detecting guilty agents. It is observed that by minimizing the sum objective the chance of detecting guilty agents will increase. We also developed a framework for generating fake objects. Keywords: Sensitive Data, Fake Objects, Data Allocation Strategies I. INTRODUCTION Demanding market conditions encourage many companies to outsource certain business processes (e.g. marketing, human resources) and associated activities to a third party. This model is referred as Business Process Outsourcing (BPO) and it allows companies to focus on their core competency by subcontracting other activities to specialists, resulting in reduced operational costs and increased productivity. Security and business assurance are essential for BPO. In most cases, the service providers need access to a company's intellectual property and other confidential information to carry out their services. For example a human resources BPO vendor may need access to employee databases with sensitive information (e.g. social security numbers), a patenting law firm to some research results, a marketing service vendor to the contact information for customers or a payment service provider may need access to the credit card numbers or bank account numbers of customers. The main security problem in BPO is that the service provider may not be fully trusted or may not be securely administered. Business agreements for BPO try to regulate how the data will be handled by service providers, but it is almost impossible to truly enforce or verify such policies across different administrative domains. Due to their digital nature, relational databases are easy to duplicate and in many cases a service provider may have financial incentives to redistribute commercially valuable data or may simply fail to handle it properly. Hence, we need powerful techniques that can detect and deter such dishonest. We study unobtrusive techniques for detecting leakage of a set of objects or records. Specifically, we study the following scenario: After giving a set of objects to agents, the distributor discovers some of those same objects in an unauthorized place. (For example, the data may be found on a web site, or may be obtained through a legal discovery process.) At this point the distributor can assess the likelihood that the leaked data came from one or more agents, as opposed to having been independently gathered by other means. We develop a model for assessing the “guilt” of agents. We also present algorithms for distributing objects to agents, in a way that improves our chances of identifying a leaker. Finally, we also consider the option of adding “fake” objects to the distributed set. | IJMER | ISSN: 2249–6645 | www.ijmer.com | Vol. 4 | Iss. 1 | Jan. 2014 |146|
  • 2. Data Leakage Detectionusing Distribution Method II. PROBLEM DEFINITION Suppose a distributor owns a set T = {t1,tm } of valuable data objects. The distributor wants to share some of the objects with a set of agents U1,U2,… ,Un but does wish the objects be leaked to other third parties. An agent Ui receives a subset of objects Ri which belongs to T, determined either by a sample request or an explicit request, Sample Request Ri = SAMPLE ( T,mi ) : Any subset of mi records from T can be given to Ui. Explicit Request Ri = EXPLICIT ( T,condi ) : Agent Ui receives all the T objects that satisfy condi . The objects in T could be of any type and size, e.g., they could be tuples in a relation, or relations in a database. After giving objects to agents, the distributor discovers that a set S of T has leaked. This means that some third party called the target has been caught in possession of S. For example, this target may be displaying S on its web site, or perhaps as part of a legal discovery process, the target turned over S to the distributor. Since the agents U1,U2,… ,Un, have some of the data, it is reasonable to suspect them leaking the data. However, the agents can argue that they are innocent, and that the S data was obtained by the target through other means. 2.1 Agent Guilt Model Suppose an agent Ui is guilty if it contributes one or more objects to the target. The event that agent Uiis guilty for a given leaked set S is denoted by Gi | S. The next step is to estimate Pr {Gi | S }, i.e.,the probability that agent Gi is guilty given evidence S. To compute the Pr {Gi| S}, estimate the probability that values in S can be “guessed” by the target. For instance, say some of the objects in t are emails of individuals. Conduct an experiment and ask a person to find the email of say 100 individuals, the person may only discover say 20, leading to an estimate of 0.2. Call this estimate as Pt, the probability that object t can be guessed by the target. The two assumptions regarding the relationship among the various leakage events. Assumption 1: For all t, t ∈ S such that t ≠ ťthe provenance of t is independent of the provenance ofť.The term provenance in this assumption statement refers to the source of a value t that appears in the leaked set. The source can be any of the agents who have t in their sets or the target itself. Assumption 2: An object t ∈ S can only be obtained by the target in one of two ways.A single agent Ui leaked t from its own Ri set, or The target guessed (or obtained through other means) t without the help of any of the n agents. To find the probability that an agent Ui is guilty given a set S, consider the target guessed t1 with probability p and that agent leaks t1 to S with the probability 1-p. First compute the probability that he leaks a single object t to S. To compute this, define the set of agents Vt = { Ui| t_Ri } that have t in their data sets. Then using Assumption 2 and known probability p, we have Pr {some agent leaked t to S} = 1- p ………..…1.1 Assuming that all agents that belong to Vt can leak t to S with equal probability and using Assumption 2 obtain, Pr { __ leaked _ to _} = ____|__|, __ __ ∈ __ $ …1.2 0, __ !"_# Given that agent Ui is guilty if he leaks at least one value to S, with Assumption 1 and Equation 1.2 compute the probability Pr { Gi| S}, agent Ui is guilty, …..1.3 2.2 Data Allocation Problem The distributor “intelligently” gives data to agents in order to improve the chances of detecting a guilty agent. There are four instances of this problem, depending on the type of data requests made by agents and whether “fake objects” [4] are allowed. Agent makes two types of requests, called sample and explicit. Based on the requests the fakes objects are added to data list. Fake objects are objects generated by the distributor that are not in set T. The objects are designed to look like real objects, and are distributed to agents together with the T objects, in order to increase the chances of detecting agents that leak data. | IJMER | ISSN: 2249–6645 | www.ijmer.com | Vol. 4 | Iss. 1 | Jan. 2014 |147|
  • 3. Data Leakage Detectionusing Distribution Method The Fig. 1 represents four problem instances with the names EF, E%, SF an&S%& , where E stands for explicit requests, S for sample requests, F for the use of fake objects, and % for the case where fake objects are not allowed. The distributor may be able to add fake objects to the distributed data in order to improve his effectiveness in detecting guilty agents. Since, fake objects may impact the correctness of what agents do, so they may not always be allowable. Use of fake objects is inspired by the use of “trace” records in mailing lists. The distributor creates and adds fake objects to the data that he distributes to agents. In many cases, the distributor may be limited in how many fake objects he can create. In EF problems, objective values are initialized by agent’s data requests. Say, for example, that T = {t1,t2 } and there are two agents with explicit data requests such that R1 = {t1,t2 } and R2 = {t1}. The distributor cannot remove or alter the R1 or R2 data to decrease the overlap R1 R2 . However, say the distributor can create one fake object (B = 1) and both agents can receive one fake object (b1=b2 = 1). If the distributor is able to create more fake objects, he could further improve the objective. 2.3 Optimization Problem The distributor’s data allocation to agents has one constraint and one objective. The distributor’s constraint is to satisfy agents’ requests, by providing them with the number of objects they request or with all available objects that satisfy their conditions. His objective is to be able to detect an agent who leaks any portion of his data. We consider the constraint as strict. The distributor may not deny serving an agent request and may not provide agents with different perturbed versions of the same objects. The fake object distribution as the only possible constraint relaxation The objective is to maximize the chances of detecting a guilty agent that leaks all his data objects. The Pr {Gj|S =Ri } or simply Pr { Gj| Ri } is the probability that Uj agent is guilty if the distributor discovers a leaked table S that contains all Ri objects. The difference functions ∆(_, )* is defined as: ∆(_, )*= Pr {Gi |Ri}- Pr {Gj|Ri } ………..1.4 1) Problem definition: Let the distributor have data requests from n agents. The distributor wants to give tables R1, …..Rnto agents U1 , . . . ,Unrespectively, so that Distribution satisfies agents’ requests; and Maximizes the guilt probability differences (i, j ) for all i, j = 1. . . n and i = j. Assuming that the Rt sets satisfy the agents’ requests, we can express the problem as a multi-criterion 2) Optimization problem: Maximize (. . . , ( i, j ), . . . ) i ! = j ……….… 1.5 (over R1, . . . ,Rn ) The approximation [3] of objective of the above equation does not depend on agent’s probabilities and therefore minimize the relative overlap among the agents as Minimize (. . . , |+,∩+.| ..…..1.6 +, , . . . ) i != j (over R1, . . . ,Rn ) | IJMER | ISSN: 2249–6645 | www.ijmer.com | Vol. 4 | Iss. 1 | Jan. 2014 |148|
  • 4. Data Leakage Detectionusing Distribution Method This approximation is valid if minimizing the relative overlap,|+,∩++, .| maximizes ( i, j ). 2.4 Objective Approximation In case of sample request, all requests are of fixed size. Therefore, maximizing the chance of detecting a guilty agent that leaks all his data by minimizing,|+,∩++,.| is equivalent to minimizing (|Ri∩Rj|). The minimum value of (|Ri∩Rj|) maximizes Π(|Ri∩Rj|) and ( i, j ) , since Π(|Ri |) is fixed. ∩ If agents have explicit data requests, then overlaps |R R | are defined by their own requests and |R∩R | are fixed. Therefore, minimizing | R | j is equivalenti jto maximizing | R | (with the addition of i j i i ∩ fake objects). The maximum value of | Ri | minimizes Π(Ri ) and maximizes ( i, j ), since Π( RiRj) is fixed. Our paper focuses on identifying the leaker. So we propose to trace the ip address of the leaker. The file is send to the agents in the form of email attachments which need a secret key to download it. This secret key is generated using a random function and send to the agent either on the mobile number used at registration or to the other global email service account such as gmail. Whenever the secret key mismatch takes place the fake file gets downloaded. To further enhance our objective approximation ip address tracking [5][6][7] is done of the system where fake object is downloaded. Various commands are available for getting ip address information. Ping, tracert, nslookup, etc anyone may be used to get it. The ip address is traced with the time so as to overcome problem of dynamic ip addressing. But as we are doing this for an organisation there is no problem of dynamic ip. Or else if looking for the ip address universally it is unique for that period of time therefore it can be traced to unique system of the leaker. III. ALLOCATION STRATEGIES In this section the allocation strategies [1] solve exactly or approximately the scalar versions of Equation 1.7 for the different instances presented in Fig. 1. In Section A deals with problems with explicit data requests and in Section B with problems with sample data requests. 3.1 Explicit Data Request In case of explicit data request with fake not allowed, the distributor is not allowed to add fake objects to the distributed data. So Data allocation is fully defined by the agent’s data request. In case of explicit data request with fake allowed, the distributor cannot remove or alter the requests R from the agent. However distributor can add the fake object. In algorithm for data allocation for explicit request, the input to this is a set of request R1,R2,……,Rn from n agents and different conditions for requests. The eoptimal algorithm finds the agents that are eligible to receiving fake objects. Then create one fake object in iteration and allocate it to the agent selected. The e-optimal algorithm minimizes every term of the objective summation by adding maximum number bi of fake objects to every set Ri yielding optimal solution. Step 1: Calculate total fake records as sum of fake records allowed. Step 2: While total fake objects > 0 Step 3: Select agent that will yield the greatest improvement in the sum objective i.e. i = Step 4: Create fake record Step 5: Add this fake record to the agent and also to fake record set. Step 6: Decrement fake record from total fake record set. Algorithm makes a greedy choice by selecting the agent that will yield the greatest improvement in the sumobjective. 3.2 Sample Data Request With sample data requests, each agent Ui may receive any T from a subset out of different ones. Hence, there are different allocations. In every allocation, the distributor can permute T objects and keep the same chances of guilty agent detection. The reason is that the guilt probability depends only on which agents have received the leaked objects and not on the identity of the leaked objects. Therefore, from the distributor’s perspective there / |T| are different allocations. An object allocation that satisfies requests and ignores the distributor’s objective is to give each agent a unique subset of T of size m. The s-max algorithm allocates to an agent the data record that yields the minimum increase of the maximum relative overlap among any pair of agents. The s-max algorithm is as follows. | IJMER | ISSN: 2249–6645 | www.ijmer.com | Vol. 4 | Iss. 1 | Jan. 2014 |149|
  • 5. Data Leakage Detectionusing Distribution Method Step 1: Initialize Min_overlap←1, the minimum out of the maximum relative overlaps that the allocations of different objects to U i Step 2: for k∈ {k |tk∈ Ri} do Initialize max_rel_ov ← 0, the maximum relative overlap between Ri and any set Rj that the allocation of t toU Step 3: for allkj = 1,i..., n : j = i and tk∈ RjdoCalculate absolute overlap as abs_ov ← |Ri∩Rj | + 1 Calculate relative overlap as rel_ov ← abs_ov / min (mi ,mj ) Step 4: Find maximum relative as max_rel_ov ← MAX (max_rel_ov, rel_ov) If max_rel_ov ≤ min_overlap then min_overlap ← max_rel_ov ret_k ← k Return ret_k It can be shown that algorithm s-max is optimal for the sum-objective and the max-objective in problems where M ≤ |T| and n < |T|. It is also optimal for the max-objective if |T| ≤ M ≤ 2 |T| or all agents request data of the same size. It is observed that the relative performance of algorithm and main conclusion do not change. If p approaches to 0, it becomes easier to find guilty agents and algorithm performance converges. On the other hand, if p approaches 1, the relative differences among algorithms grow since more evidence is needed to find an agent guilty. The algorithm presented implements a variety of data distribution strategies that can improve the distributor’s chances of identifying a leaker. It is shown that distributing objects judiciously can make a significant difference in identifying guilty agents, especially in cases where there is large overlap in the data that agents must receive. IV. RELATED WORK The presented guilt detection approach is related to the data provenance problem [8]: tracing the lineage of S objects implies essentially the detection of the guilty agents. Suggested solutions are domain specific, such as lineage tracing for data warehouses [9], and assume some prior knowledge on the way a data view is created out of data sources. Our problem formulation with objects and sets is more general and simplifies lineage tracing, since we do not consider any data transformation from Ri sets to S. As far as the allocation strategies are concerned, our work is mostly relevant to watermarking that is used as a means of establishing original ownership of distributed objects. Watermarks were initially used in images, video and audio data [2] whose digital representation includes considerable redundancy. Our approach and watermarking are similar in the sense of providing agents with some kind of receiver-identifying information. However, by its very nature, a watermark modifies the item being watermarked. If the object to be watermarked cannot be modified then a watermark cannot be inserted. In such cases methods that attach watermarks to the distributed data are not applicable. Finally, there are also lots of other works on mechanisms that allow only authorized users to access sensitive data. Such approaches prevent in some sense data leakage by sharing information only with trusted parties. However, these policies are restrictive and may make it impossible to satisfy agents’ requests. V. CONCLUSION In doing a business there would be no need to hand over sensitive data to agents that may unknowingly or maliciously leak it. And even if we had to hand over sensitive data, in a perfect world we could watermark each object so that we could trace its origins with absolute certainty. However, in many cases we must indeed work with agents that may not be 100% trusted, and we may not be certain if a leaked object came from an agent or from some other source. In spite of these difficulties, we have shown it is possible to assess the likelihood that an agent is responsible for a leak, based on the overlap of his data with the leaked data and the data of other agents, and based on the probability that objects can be “guessed” by other means. Our model is relatively simple, but we believe it captures the essential trade-offs. The algorithms we have presented implement a variety of data distribution strategies that can improve the distributor’s chances of identifying a leaker. We have shown that distributing objects judiciously can make a significant difference in identifying guilty agents, especially in cases where there is large overlap in the data that agents must receive. Our future work includes the investigation of agent guilt models that capture leakage scenarios that are not studied in this paper. For example, what is the appropriate model for cases where agents can collude and identify fake tuples? A preliminary discussion of such a model is available in. Another open problem is the | IJMER | ISSN: 2249–6645 | www.ijmer.com | Vol. 4 | Iss. 1 | Jan. 2014 |150|
  • 6. Data Leakage Detectionusing Distribution Method extension of our allocation strategies so that they can handle agent requests in an online fashion (the presented strategies assume that there is a fixed set of agents with requests known in advance). REFERENCES [1]. [2]. [3]. [4]. [5]. [6]. [7]. Rudragouda G Patil, “Development of Data leakage Detection Using Data Allocation Strategies International Journal of Computer Applications in Engineering Sciences [VOL I, ISSUE II, JUNE 2011, [ISSN: 2231-4946]. S. Czerwinski, R. Fromm, and T. Hodes. Digital music distribution and audio watermarking. L. Sweeney. Achieving k-anonymity privacy protection using generalization and suppression, 2002. S. U. Nabar, B. Marthi, K. Kenthapadi, N. Mishra, and R. Motwani. Towards robustness in query auditing. In VLDB ’06. Stevens Le Blond, Chao Zhang Arnaud Legout, Keith Ross, Walid Dabbous, Exploiting P2P Communications to Invade Users’ Privacy How to Trace an IP Address http://www.wikihow.com/Trace-an-IP-Address P. Buneman, S. Khanna, and W. C. Tan. Why and where: A characterization of data provenance. In J. V. den Bussche and V. Vianu, editors, Database Theory - ICDT 2001, 8th International Conference, London, UK, January 4-6, 2001, Proceedings, volume 1973 of Lecture Notes in Computer Science, pages 316-330. Springer, 2001 | IJMER | ISSN: 2249–6645 | www.ijmer.com | Vol. 4 | Iss. 1 | Jan. 2014 |151|