Kindly look at the graph plot and data. What method or algorithm would find the closest match (the highest degree of similarity) between plot "px4" and the other plots? Any suggestions would be appreciated, in particular whether there is a C# library, or whether a VF-Graph recognition algorithm can be adapted for this problem.
DATA
enr px1 px2 px3 px4 px5 px6
1 90 5 15 20 60 10
2 70 10 20 30 85 15
3 100 15 15 10 32 18
4 80 20 8 3 9 44
5 60 25 3 5 15 12
6 50 30 12 8 24 16
7 70 18 28 24 70 25
8 90 12 32 28 84 22
9 75 20 12 15 45 16
10 65 10 20 18 54 25
I'm no stats expert, but I'd take a plot and compare each point with its equivalent point on another plot, one point at a time. I'd use Math.Abs() to turn each of these 10 differences into a positive number and then use whatever method you wish (mean, median, etc.) to average the 10 differences. I'd repeat the comparison for every other plot. Most of the intermediate values can be discarded along the way; you only need to keep the average for each plot. The plot with the smallest average would probably be the closest match.
Because I've not got much to do today...
// Values of each plot at enr = 1..10, taken from the table above.
Dictionary<string, int[]> plots = new Dictionary<string, int[]>();
plots.Add("px1", new int[] { 90, 70, 100, 80, 60, 50, 70, 90, 75, 65 });
plots.Add("px2", new int[] { 5, 10, 15, 20, 25, 30, 18, 12, 20, 10 });
plots.Add("px3", new int[] { 15, 20, 15, 8, 3, 12, 28, 32, 12, 20 });
plots.Add("px4", new int[] { 20, 30, 10, 3, 5, 8, 24, 28, 15, 18 });
plots.Add("px5", new int[] { 60, 85, 32, 9, 15, 24, 70, 84, 45, 54 });
plots.Add("px6", new int[] { 10, 15, 18, 44, 12, 16, 25, 22, 16, 25 });

string test = "px4";
string winner = string.Empty;
double smallestAverage = double.MaxValue;

foreach (string key in plots.Keys)
{
    // Don't compare the test plot with itself.
    if (key == test)
    {
        continue;
    }

    int[] a = plots[test];
    int[] b = plots[key];

    // Sum of the absolute point-by-point differences.
    double count = 0;
    for (int point = 0; point < a.Length; point++)
    {
        count += Math.Abs(a[point] - b[point]);
    }

    // Mean absolute difference (every plot has 10 points).
    double average = count / a.Length;
    if (average < smallestAverage)
    {
        smallestAverage = average;
        winner = key;
    }
}

Console.WriteLine("Winner: {0}", winner);
average; you're performing an integer division. Note that if you don't calculate the average at all, but use the sum/count instead, you'll still get the same result - Rawling 2012-04-04 09:26
count from an int to a double. The division by 10 is pretty redundant too; it'd only be useful if you need to record or display the averages - Adrian Thompson Phillips 2012-04-04 09:34
There are literally innumerable ways of defining the "difference" between two of your graphs.
If you treat your graphs as 10-dimensional vectors, you could use a vector norm of the difference between two of them.
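As a rough sketch of the vector idea, here are three common norms (Manhattan, Euclidean and maximum) applied to the difference between "px4" and each other plot. The class and method names (VectorNormDemo, L1, L2, LInf, Closest) are made up for this illustration and are not from any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VectorNormDemo
{
    // Plot values at enr = 1..10, copied from the question's table.
    public static readonly Dictionary<string, int[]> Plots = new Dictionary<string, int[]>
    {
        { "px1", new[] { 90, 70, 100, 80, 60, 50, 70, 90, 75, 65 } },
        { "px2", new[] { 5, 10, 15, 20, 25, 30, 18, 12, 20, 10 } },
        { "px3", new[] { 15, 20, 15, 8, 3, 12, 28, 32, 12, 20 } },
        { "px4", new[] { 20, 30, 10, 3, 5, 8, 24, 28, 15, 18 } },
        { "px5", new[] { 60, 85, 32, 9, 15, 24, 70, 84, 45, 54 } },
        { "px6", new[] { 10, 15, 18, 44, 12, 16, 25, 22, 16, 25 } },
    };

    // Manhattan (1-norm) distance: sum of absolute differences.
    public static double L1(int[] a, int[] b) =>
        a.Zip(b, (x, y) => (double)Math.Abs(x - y)).Sum();

    // Euclidean (2-norm) distance: square root of the sum of squared differences.
    public static double L2(int[] a, int[] b) =>
        Math.Sqrt(a.Zip(b, (x, y) => (double)(x - y) * (x - y)).Sum());

    // Maximum (infinity-norm) distance: largest single absolute difference.
    public static double LInf(int[] a, int[] b) =>
        a.Zip(b, (x, y) => (double)Math.Abs(x - y)).Max();

    // Name of the plot with the smallest distance to the target plot.
    public static string Closest(string target, IDictionary<string, int[]> plots,
                                 Func<int[], int[], double> distance) =>
        plots.Keys.Where(k => k != target)
                  .OrderBy(k => distance(plots[target], plots[k]))
                  .First();

    public static void Main()
    {
        Console.WriteLine("1-norm:        {0}", Closest("px4", Plots, L1));
        Console.WriteLine("2-norm:        {0}", Closest("px4", Plots, L2));
        Console.WriteLine("infinity-norm: {0}", Closest("px4", Plots, LInf));
    }
}
```

On this particular data all three norms pick px3, but that agreement is a property of the data, not something the norms guarantee in general.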
If you want to treat them as real-valued functions on the interval [1, 10], you could use the norm on an L^p-space. (While this should involve integration, since your functions are all made up of straight segments you could calculate this norm exactly without having to do numerical approximation of the integral.)
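To make the "no numerical integration needed" remark concrete, here is a minimal sketch for p = 1 and p = 2. It assumes the plots are joined by straight lines between points spaced one unit apart (enr = 1, 2, ..., 10), so the difference of two plots is itself linear on each segment. With endpoint differences d0 and d1, the integral of the squared difference over a segment is (d0² + d0·d1 + d1²)/3, and the integral of the absolute difference is a trapezoid, or two triangles when the lines cross. The names FunctionNormDemo, L1 and L2 are invented for the example.

```csharp
using System;
using System.Collections.Generic;

public static class FunctionNormDemo
{
    // Exact L^1 norm of f - g for piecewise-linear f, g on unit-spaced points.
    public static double L1(int[] f, int[] g)
    {
        double total = 0;
        for (int i = 0; i + 1 < f.Length; i++)
        {
            double d0 = f[i] - g[i], d1 = f[i + 1] - g[i + 1];
            if (d0 * d1 >= 0)
                // No sign change on this segment: area of a trapezoid.
                total += (Math.Abs(d0) + Math.Abs(d1)) / 2;
            else
                // The lines cross: areas of the two triangles either side.
                total += (d0 * d0 + d1 * d1) / (2 * (Math.Abs(d0) + Math.Abs(d1)));
        }
        return total;
    }

    // Exact L^2 norm of f - g under the same assumptions.
    public static double L2(int[] f, int[] g)
    {
        double total = 0;
        for (int i = 0; i + 1 < f.Length; i++)
        {
            double d0 = f[i] - g[i], d1 = f[i + 1] - g[i + 1];
            // Integral of (d0 + (d1 - d0) t)^2 for t from 0 to 1.
            total += (d0 * d0 + d0 * d1 + d1 * d1) / 3;
        }
        return Math.Sqrt(total);
    }

    public static void Main()
    {
        var plots = new Dictionary<string, int[]>
        {
            { "px1", new[] { 90, 70, 100, 80, 60, 50, 70, 90, 75, 65 } },
            { "px2", new[] { 5, 10, 15, 20, 25, 30, 18, 12, 20, 10 } },
            { "px3", new[] { 15, 20, 15, 8, 3, 12, 28, 32, 12, 20 } },
            { "px5", new[] { 60, 85, 32, 9, 15, 24, 70, 84, 45, 54 } },
            { "px6", new[] { 10, 15, 18, 44, 12, 16, 25, 22, 16, 25 } },
        };
        int[] px4 = { 20, 30, 10, 3, 5, 8, 24, 28, 15, 18 };

        foreach (var pair in plots)
        {
            Console.WriteLine("{0}: L1 = {1:F2}, L2 = {2:F2}",
                pair.Key, L1(px4, pair.Value), L2(px4, pair.Value));
        }
    }
}
```

For the question's data, px3 has the smallest value under both norms, the same ranking winner as the point-by-point comparison.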
Really, you need to decide how you want to define "similar", and then pick a method that acts like you'd expect.
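To illustrate why the definition matters, the sketch below compares two measures on the question's data: the mean absolute difference (closeness in value) and the cosine of the angle between the two vectors (closeness in shape, ignoring overall scale). Cosine similarity is just one possible shape-based measure that I've picked for the example; the names SimilarityChoiceDemo, MeanAbsDiff and Cosine are likewise invented here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SimilarityChoiceDemo
{
    public static readonly Dictionary<string, int[]> Plots = new Dictionary<string, int[]>
    {
        { "px1", new[] { 90, 70, 100, 80, 60, 50, 70, 90, 75, 65 } },
        { "px2", new[] { 5, 10, 15, 20, 25, 30, 18, 12, 20, 10 } },
        { "px3", new[] { 15, 20, 15, 8, 3, 12, 28, 32, 12, 20 } },
        { "px4", new[] { 20, 30, 10, 3, 5, 8, 24, 28, 15, 18 } },
        { "px5", new[] { 60, 85, 32, 9, 15, 24, 70, 84, 45, 54 } },
        { "px6", new[] { 10, 15, 18, 44, 12, 16, 25, 22, 16, 25 } },
    };

    // Smaller is more similar: average absolute point-by-point difference.
    public static double MeanAbsDiff(int[] a, int[] b) =>
        a.Zip(b, (x, y) => (double)Math.Abs(x - y)).Average();

    // Larger is more similar: cosine of the angle between the two vectors.
    public static double Cosine(int[] a, int[] b)
    {
        double dot = a.Zip(b, (x, y) => (double)x * y).Sum();
        double normA = Math.Sqrt(a.Sum(x => (double)x * x));
        double normB = Math.Sqrt(b.Sum(x => (double)x * x));
        return dot / (normA * normB);
    }

    public static string ClosestByValue(string target) =>
        Plots.Keys.Where(k => k != target)
                  .OrderBy(k => MeanAbsDiff(Plots[target], Plots[k]))
                  .First();

    public static string ClosestByShape(string target) =>
        Plots.Keys.Where(k => k != target)
                  .OrderByDescending(k => Cosine(Plots[target], Plots[k]))
                  .First();

    public static void Main()
    {
        Console.WriteLine("Closest by value: {0}", ClosestByValue("px4"));
        Console.WriteLine("Closest by shape: {0}", ClosestByShape("px4"));
    }
}
```

On this data the two measures disagree: the mean absolute difference picks px3, while the cosine measure picks px5, whose values are roughly three times those of px4. Which of the two is the "right" answer depends entirely on what you mean by similar.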