Abstract:
As Cloud Computing becomes
prevalent, more and more sensitive information are being centralized into the
cloud. Although traditional searchable encryption schemes allow a user to
securely search over encrypted data through keywords and selectively retrieve
files of interest, these techniques support only exact keyword search. In this
paper, for the first time we formalize and solve the problem of effective fuzzy
keyword search over encrypted cloud data while maintaining keyword privacy.
Fuzzy keyword search greatly enhances system usability by returning the
matching files when users’ searching inputs exactly match the predefined
keywords or the closest possible matching files based on keyword similarity
semantics, when exact match fails. In our solution, we exploit edit distance to
quantify keywords similarity and develop two advanced techniques on
constructing fuzzy keyword sets, which achieve optimized storage and representation
overheads. We further propose a brand new symbol-based trie-traverse searching
scheme, where a multi-way tree structure is built up using symbols transformed
from the resulted fuzzy keyword sets. Through rigorous security analysis, we
show that our proposed solution is secure and privacy-preserving, while
correctly realizing the goal of fuzzy keyword search. Extensive experimental
results demonstrate the efficiency of the proposed solution.
Algorithm / Technique used:
String Matching Algorithm
Algorithm Description:
The approximate string matching algorithms
among them can be classified into two categories: on-line and off-line. The
on-line techniques, performing search without an index, are unacceptable for
their low search efficiency, while the off-line approach, utilizing indexing
techniques, makes it dramatically faster. A variety of indexing algorithms,
such as suffix trees, metric trees and q-gram methods, have been presented. At
the first glance, it seems possible for one to directly apply these string
matching algorithms to the context of searchable encryption by computing the
trapdoors on a character base within an alphabet. However, this trivial
construction suffers from the dictionary and statistics attacks and fails to
achieve the search privacy. An instance M of the data type string-matching is an
object maintaining a pattern and a string. It provides a collection of
different algorithms for computation of the exact string matching problem. Each
function computes a list of all starting positions of occurrences of the
pattern in the string.
System Architecture:
Existing System:
This
straightforward approach apparently provides fuzzy keyword search over the
encrypted files while achieving search privacy using the technique of secure
trapdoors. However, this approaches serious efficiency disadvantages. The
simple enumeration method in constructing fuzzy key-word sets would introduce
large storage complexities, which greatly affect the usability.
For example, the
following is the listing variants after a substitution operation on the first
character of keyword
CASTLE: {AASTLE, BASTLE, DASTLE,
YASTLE, ZASTLE}.
Proposed System:
Main Modules:
1.
Wildcard – Based Technique
2.
Gram - Based Technique
3.
Symbol – Based Trie – traverse Search Scheme
1. Wildcard – Based Technique:
In the above
straightforward approach, all the variants of the keywords have to be listed
even if an operation is performed at the same position. Based on the above
observation, we proposed to use an wildcard to denote edit operations at the
same position. The wildcard-based fuzzy set edits distance to solve the problems.
For example, for the keyword CASTLE with the pre-set edit distance 1, its wildcard based fuzzy keyword set
can be constructed as
SCASTLE, 1 = {CASTLE, *CASTLE,*ASTLE, C*ASTLE, C*STLE, CASTL*E, CASTL*, CASTLE*}.
Edit Distance:
a. Substitution
b. Deletion
c. Insertion
a) Substitution : changing one character to another
in a word;
b)
Deletion : deleting
one character from a word;
c) Insertion: inserting a single character
into a word.
2. Gram – Based Technique:
Another efficient
technique for constructing fuzzy set is based on grams. The gram of a string is
a substring that can be used as a
signature for efficient approximate search. While gram has been widely used for
constructing inverted list for approximate string search, we use gram for the
matching purpose. We propose to utilize the fact that any primitive edit
operation will affect at most one specific character of the keyword, leaving
all the remaining characters untouched. In other words, the relative order of
the remaining characters after the primitive operations is always kept the same
as it is before the operations.
For
example, the gram-based fuzzy set SCASTLE, 1 for keyword CASTLE can be constructed
as
{CASTLE, CSTLE, CATLE, CASLE, CASTE, CASTL, ASTLE}.
3. Symbol – Based Trie – traverse Search
Scheme
To enhance the search efficiency, we now
propose a symbol-based trie-traverse search scheme, where a multi-way tree is constructed for
storing the fuzzy keyword set over a finite symbol set. The key idea behind
this construction is that all trapdoors sharing a common prefix may have common
nodes. The root is associated with an empty set and the symbols in a trapdoor
can be recovered in a search from the root to the leaf that ends the trapdoor.
All fuzzy words in the trie can be found by a depth-first search.
In this section, we consider a natural extension from the previous
single-user setting to multi-user setting, where a data owner stores a file
collection on the cloud server and allows an arbitrary group of users to search
over his file collection.
System Requirements:
Hardware Requirements:
•
System :
Pentium IV 2.4 GHz.
•
Hard Disk : 40 GB.
•
Floppy Drive : 1.44 Mb.
•
Monitor : 15 VGA Colour.
•
Mouse : Logitech.
•
Ram : 512 Mb.
Software Requirements:
•
Operating system : - Windows XP.
•
Coding Language : DOT NET
•
Data Base : SQL Server 2005
Conclusion:
1. In
this paper, for the first time we formalize and solve the problem of supporting
efficient yet privacy-preserving fuzzy search for achieving effective
utilization of remotely stored encrypted data in Cloud Computing.
2. We design two advanced techniques (i.e., wildcard-based and gram-
based techniques) to construct the storage-efficient fuzzy keyword sets by
exploiting two significant observations on the similarity metric of edit distance.
3. Based on the constructed fuzzy keyword sets, we further propose a
brand new symbol-based trie-traverse searching scheme, where a multi-way tree
structure is built up using symbols transformed from the resulted fuzzy keyword
sets.
4. Through rigorous security analysis, we show that our proposed
solution is secure and privacy- preserving, while correctly realizing the goal
of fuzzy keyword search. Extensive experimental results demonstrate the
efficiency of our solution.
No comments:
Post a Comment