Friday, 10 August 2012

Design and Production Process Project


Providing Security against SQL Injection Attack in Web Applicatio 
            
                                                                                       MAMTA SAHARE 44
Algorithm of the Needleman-Wunsch Algorithm
1. Initialization
C(0, 0) = 0
C(0, i) = −i * d
C(j, 0) = −j * d

2. Main Iteration
For each i = 1 . . . M
For each j = 1 . . . N
C(i, j) = max
                       { C(i − 1, j − 1) + s(xi, yj), case 1
                         C(i − 1, j) − d , case2
                      C(i, j − 1) − d , case3
                     }
Ptr(i, j) =
                { DIAG , if case 1
                LEFT , if case 2
               UP , if case 3
              }
Where C is the Score Matrix and Ptr is the Traceback Matrix.
Suppose there are two sequences
e.g.               Normal:- AM EQ (profile amino acid code)
                     Attacked:- AM EQ AM
The Needleman-Wunsch algorithm consists of three steps
1.    Initialization of the Score Matrix.
Score Matrix is a matrix which is used in Needleman wunsch algorithm to calculate best score. It also helps to construct a matrix which in turn helps to deduce best alignment between two sequences.
2.    Calculation of scores and Filling the Traceback matrix.
Traceback Matrix is used to deduce a Traceback path. This Traceback path helps to find best alignment.
3.    Deducing the alignment from the Traceback matrix.
 Initialization of the Score Matrix
The cells of the Score Matrix are labeled C(i, j) where
i = 1; 2; …..;N and j = 1; 2;….;M

A
M
E
Q
A
M
C(1,1)
C(1,2)
C(1,3)
C(1,4)
C(1,5)
C(1,6)
C(1,7)
A
C(2,1)
C(2,2)
C(2,3)
C(2,4)
C(2,5)
C(2,6)
C(2,7)
M
C(3,1)
C(3,2)
C(3,3)
C(3,4)
C(3,5)
C(3,6)
C(3,7)
E
C(4,1)
C(4,2)
C(4,3)
C(4,4)
C(4,5)
C(4,6)
C(4,7)
Q
C(5,1)
C(5,2)
C(5,3)
C(5,4)
C(5,5)
C(5,6)
C(5,7)

SCORE MATRIX
This is the initialization of first row and first column of the score matrix. This can be done by any value. Here the initialization is done with the values 0, -1,-2, …………..     
A
M
E
Q
A
M
 0  
-1
-2
-3
-4
-5
-6
A
-1
M
-2
E
-3
Q
-4
                                                                
TRACEBACK MATRIX
This is the initialization of first row and first column of Traceback Matrix.
First row is initialized with a value ‘left’ and first column is initialized with a value ‘up’. This will help to deduce Traceback Path.
A
M
E
Q
A
M
done
left
left
left
left
Left
left
A
up
M
up
E
up
Q
up




3. Scoring
Formula for calculating the identity score
The score of any cell C(i , j) is the maximum of:
    qdiag = C(i -1, j - 1) + S(i ; j)                       
qup = C(i -1,j) + g
qleft = C(i , j - 1) + g                                    ----------------(1)
Where S(i , j)   is a score given to whether i-th element and j-th element of alignment are identical or not. Gap is a cut in score for insertion of blank, which fills matrix C. The score lower right end of matrix C is the maximum score of global alignment.
 The calculation for the cell C(2 , 2):-
           i=2, j=2, g=(-1)
qdiag = C(1, 1) + S(A,A) = 0 + 1 = 1
qup = C(1, 2) + g = -1 + (-1) = -2
qleft = C(2, 1) + g = -1 + (-1) = -2
Here the maximum value is 1 so we put C(2,2) =1 in score matrix and fill diag into the Traceback matrix in the cell Ptr(2,2)=diag.
SCORE MATRIX
In this matrix the value for the cell c(2,2) is filled as calculated with the  formula 1. 
A
M
E
Q
A
M
0
-1
-2
    -3
-4
-5
-6
A
   -1
  1
M
   -2
E
   -3
Q
   -4

TRACEBACK MATRIX
Since the value calculated for the cell c(2,2) has come from qleft. Thus ptr(2,2)=diag is filled in the Traceback Matrix.
A
M
E
Q
A
M
 done
left
left
left
left
left
Left
A
up
diag
M
up
E  
up
Q
up

This process is continued until all the blank cells are fulfilled.
SCORE MATRIX     
Here the values are calculating for all the cells of score matrix. The completely filled matrix is given below
A
M
E
Q
A
M
 0
-1
-2
-3
-4
-5
-6
A
-1
1
0
-1
-2
-3
-4
M
-2
0
2
1
0
-1
0
E
-3
-2
1
3
2
1
0
Q
-4
-3
0
2
4
3
2




TRACEBACK MATRIX
After filling Score Matrix, different cells of Traceback Matrix are to be filled according to their values in Score Matrix. The final Traceback Matrix obtained is given below
A
M
E
Q
A
M
done
left
left
left
left
left
Left
A
up
diag
left
left
left
left
Left
M
up
up
diag
left
left
left
Diag
E
up
up
up
diag
left
left
left
Q
up
up
up
up
diag
left
Left

For deducing best alignment, Traceback path is to be formed. For calculating Traceback path, start from rightmost lower corner. In the first cell i.e. c(5,7)=left is written, so now move in the leftward direction. In the next cell again a left is found, so again move in leftward direction. Now in the next cell a diag is found, so now move in diagonal direction. Proceeding in the similar manner, traverse until done is found.
The path formed in this manner is called Traceback path. The final matrix formed is given below 
A
M
E
Q
A
M
done
left
left
 left
left
left
left
A
up
diag
left
Left
left
left
left
M
up
up
diag
 left
left
left
diag
E
up
up
up
diag
left
left
left
Q
up
up
up
up
diag
left
left

For deducing best alignment, start traversing Traceback path. Start again from rightmost lower corner. The rule to be followed for deducing the alignment is as follows
1. If a left is encountered, place a gap in leftward sequence.
2. If a up is encountered, place a gap in upward sequence.
3. If a diag is encountered, then it means that the two characters are aligned. 
With the help of given formulas, traverse the Traceback path to introduce gap in the sequences.
AM EQ AM
AM EQ
Best score is=2
Step 5 Measure similarity of Na Code Amino and Aa Code Amino
Alignment identity I for measuring similarity to detect whether Pa Code Amino and Aa Code Amino are injected is computed as shown in formula.

I = (alignment match code (Pa Code Amino, Aa Code Amino)/
Aa Code Amino length + gap)) %100

The identity for the measurement of similarity is computed by dividing identity scores of two sequence alignments by sum of blank length and whole sequence length. Namely, upon calculating similarity, score is computed in consideration of sequence length both in the case of long sequence length with high identity score and in the case of short sequence length with low identity score, which secures balance. Alignment is compared in order to judge which alignment is superior. If the code sequences are 100% identical then this query is allowed to be executed otherwise this is rejected. The identity is calculated for given two sequences


No comments:

Post a Comment