A module that determines similarity between two handwritten signatures using the Pearson product-moment correlation coefficient
This module uses the Pearson product-moment correlation coefficient to determine how similar two handwritten signatures are to one another.
Identifying an individual's signature across a multitude of organisations and documents is tough. Hancock doesn't attempt to determine to whom each signature belongs; it only gives a value (between 0 and 1) which can aid a human in identifying a pattern.
Hancock doesn't compare each image pixel-for-pixel. It generates a profile of each signature by adding up the number of black pixels along the Y-axis for each position along the X-axis of an image. These values are then used to plot a curve which can be compared to other curves (generated in the same manner) with the Pearson correlation coefficient.
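To make the comparison step concrete, here is a minimal sketch of the Pearson correlation coefficient applied to two profiles. It is not Hancock's own code: the function name `pearson` and the choice to return `0.0` for a completely flat profile are assumptions made for illustration. Note that the raw coefficient ranges from -1 to 1; how Hancock arrives at its 0 to 1 value isn't described here, so the sketch returns the raw coefficient.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Deviations of each value from its sequence mean.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    covariance = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        # Assumption: a flat profile carries no shape information,
        # so we treat it as uncorrelated rather than dividing by zero.
        return 0.0
    return covariance / sqrt(var_x * var_y)


# Two hypothetical profiles (black-pixel counts per column).
profile_a = [1, 2, 3]
profile_b = [2, 4, 6]
score = pearson(profile_a, profile_b)
```

Because the coefficient measures the shape of the curves rather than their absolute heights, two profiles that rise and fall together score highly even if one signature was drawn with a thicker pen.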
Before a profile is generated for comparison, each image is first normalised to remove erroneous data.
Each image to be compared has a profile generated. These profiles are the items that are compared, not the images themselves.
An array (the `counts` array) is created with the size of the image width. Each index in this one-dimensional array corresponds to a position along the X-axis of the image we're generating a profile for.
For each black pixel, we increment the value at the index of the `counts` array which represents the X value of the pixel that we're checking.
(We also determine the locations and sizes of peaks in the counts data, which could be used for further analysis, but Hancock doesn't currently use this information.)
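The profile step can be sketched roughly as follows. This is an illustration under stated assumptions, not Hancock's implementation: the image is assumed to be already normalised and represented as a list of rows, where each pixel is `1` for black and `0` for white, and `column_profile` is a hypothetical name.

```python
def column_profile(image):
    """Count the black pixels in each column of a binary image.

    `image` is assumed to be a list of rows, each row a list of
    0 (white) or 1 (black) values, all rows the same width.
    """
    if not image:
        return []
    width = len(image[0])
    # One counter per X position, so the array is as wide as the image.
    counts = [0] * width
    for row in image:
        for x, pixel in enumerate(row):
            if pixel:
                # A black pixel at this X position adds one to its column.
                counts[x] += 1
    return counts


# A tiny hypothetical 3x3 "signature".
image = [
    [0, 1, 1],
    [0, 1, 0],
    [1, 1, 0],
]
counts = column_profile(image)  # [1, 3, 1]
```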
At this point, we have two (or more) one-dimensional arrays, one per image, each containing the number of black pixels along the Y-axis for every position along the X-axis. We don't do a straight comparison between the datasets to return a value. Instead, we break each curve up into 10 equal sections and apply dynamic time warping to each section to try to find the best correlation with each section's closest neighbour. The closest correlations for all sections are added together, and their average is returned.
The `counts` array from the profile process is broken up into 10 equal sections.
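The sectioned comparison might look something like the sketch below. It makes several assumptions that should be read loudly: it does not implement dynamic time warping, and instead substitutes a much simpler search in which each section is correlated against the section at the same position and its immediate neighbours in the other profile, keeping the best score. The names `split_sections` and `compare`, the handling of widths that don't divide evenly by 10, and the truncation of sections to a common length are all illustrative choices rather than anything taken from Hancock.

```python
from math import sqrt

SECTIONS = 10  # the number of sections described above


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 for a flat sequence (assumption)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(var_x * var_y)


def split_sections(counts, n=SECTIONS):
    """Split counts into n contiguous sections of near-equal length."""
    size, extra = divmod(len(counts), n)
    sections, start = [], 0
    for i in range(n):
        # The first `extra` sections absorb one leftover element each.
        end = start + size + (1 if i < extra else 0)
        sections.append(counts[start:end])
        start = end
    return sections


def compare(counts_a, counts_b, n=SECTIONS):
    """Average, over all sections, of the best neighbouring correlation."""
    a_sections = split_sections(counts_a, n)
    b_sections = split_sections(counts_b, n)
    best_scores = []
    for i, sec_a in enumerate(a_sections):
        candidates = []
        # Simplified stand-in for DTW: look one section left and right.
        for j in (i - 1, i, i + 1):
            if 0 <= j < n:
                sec_b = b_sections[j]
                m = min(len(sec_a), len(sec_b))
                if m >= 2:
                    candidates.append(pearson(sec_a[:m], sec_b[:m]))
        best_scores.append(max(candidates) if candidates else 0.0)
    return sum(best_scores) / n


# Hypothetical profile compared with itself.
profile = list(range(40))
same = compare(profile, profile)
```

Allowing each section to match a neighbour gives the comparison some tolerance for signatures that are slightly stretched or shifted horizontally, which is the same problem dynamic time warping addresses more thoroughly.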