Closest matching (recognition) line graph plot method or algorithm

1

Kindly look at the graph plot and data. What method or algorithm can find the closest match (highest degree of similarity) between plot "px4" and the other plots? Any suggestion would be appreciated, in particular whether a C# library exists for this or whether the VF-Graph recognition algorithm can be adapted to this problem.

[Image: px plots from the data]

DATA

    enr px1 px2 px3 px4 px5 px6
    1   90  5   15  20  60  10
    2   70  10  20  30  85  15
    3   100 15  15  10  32  18
    4   80  20  8   3   9   44
    5   60  25  3   5   15  12
    6   50  30  12  8   24  16
    7   70  18  28  24  70  25
    8   90  12  32  28  84  22
    9   75  20  12  15  45  16
    10  65  10  20  18  54  25
2012-04-04 07:46
by Raj
At the end of the day, the answer depends on why you need to do this. If it's homework for a Programming in C# course, then you're probably okay using any old method, and the more you show investigation into different methods the better. However, if you need to do this to analyse government medical data, then you'd be best off contracting someone with some sort of statistics doctorate. :-) - Adrian Thompson Phillips 2012-04-04 09:55


3

I'm no stats expert. But... I'd take a plot and compare the difference between each point and its equivalent point on another plot, one point at a time. I'd use Math.Abs() to turn each of these 10 differences into a positive number and then use whatever method you wish (mean, median, etc.) to take an average of the 10 differences. I'd repeat the comparison for every other plot. Most of the calculations can be ditched along the way; you only need to keep the average for each plot. The smallest average would probably belong to the plot that most closely matches.

Because I've not got much to do today...

    // One entry per plot; each array holds the 10 values from the DATA table.
    Dictionary<string, int[]> plots = new Dictionary<string, int[]>();

    plots.Add("px1", new int[] { 90, 70, 100, 80, 60, 50, 70, 90, 75, 65 });
    plots.Add("px2", new int[] { 5, 10, 15, 20, 25, 30, 18, 12, 20, 10 });
    plots.Add("px3", new int[] { 15, 20, 15, 8, 3, 12, 28, 32, 12, 20 });
    plots.Add("px4", new int[] { 20, 30, 10, 3, 5, 8, 24, 28, 15, 18 });
    plots.Add("px5", new int[] { 60, 85, 32, 9, 15, 24, 70, 84, 45, 54 });
    plots.Add("px6", new int[] { 10, 15, 18, 44, 12, 16, 25, 22, 16, 25 });

    string test = "px4";
    string winner = string.Empty;
    double smallestAverage = double.MaxValue;

    foreach (string key in plots.Keys)
    {
        // Don't compare the test plot with itself.
        if (key == test)
        {
            continue;
        }

        int[] a = plots[test];
        int[] b = plots[key];

        // Running sum of the absolute point-by-point differences.
        double count = 0;

        for (int point = 0; point <= 9; point++)
        {
            count += Math.Abs(a[point] - b[point]);
        }

        double average = count / 10;

        // Keep the plot with the smallest average difference so far.
        if (average < smallestAverage)
        {
            smallestAverage = average;
            winner = key;
        }
    }

    Console.WriteLine("Winner: {0}", winner);
2012-04-04 08:03
by Adrian Thompson Phillips
This is essentially using the vector 1-norm. Be careful with your average; you're performing an integer division. Note that if you don't calculate the average at all, but use the sum (count) instead, you'll still get the same result - Rawling 2012-04-04 09:26
Cheers, you're right, I've changed count from an int to a double. The division by 10 is pretty redundant too, it'd only be useful if you need to record or display the averages - Adrian Thompson Phillips 2012-04-04 09:34


2

There are literally innumerable ways of defining the "difference" between two of your graphs.

If you treat your graphs as 10-dimensional vectors, you could use a vector norm.
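As a rough sketch of the vector-norm idea (not a definitive implementation), the snippet below computes the p-norm of the point-by-point differences. The names VectorNormDemo, Distance and Closest are made up for illustration, and it assumes all plots have the same number of points.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class VectorNormDemo
{
    // Distance between two equal-length plots under the vector p-norm.
    // p = 1 is the sum of absolute differences, p = 2 is Euclidean distance,
    // and double.PositiveInfinity gives the largest single-point difference.
    public static double Distance(int[] a, int[] b, double p)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Plots must have the same number of points.");

        double[] diffs = a.Zip(b, (x, y) => Math.Abs((double)x - y)).ToArray();

        if (double.IsPositiveInfinity(p))
            return diffs.Max();

        return Math.Pow(diffs.Sum(d => Math.Pow(d, p)), 1.0 / p);
    }

    // Name of the plot (other than the target) with the smallest distance.
    public static string Closest(IDictionary<string, int[]> plots, string target, double p)
    {
        return plots.Keys
            .Where(k => k != target)
            .OrderBy(k => Distance(plots[target], plots[k], p))
            .First();
    }

    static void Main()
    {
        var plots = new Dictionary<string, int[]>
        {
            { "px1", new[] { 90, 70, 100, 80, 60, 50, 70, 90, 75, 65 } },
            { "px2", new[] { 5, 10, 15, 20, 25, 30, 18, 12, 20, 10 } },
            { "px3", new[] { 15, 20, 15, 8, 3, 12, 28, 32, 12, 20 } },
            { "px4", new[] { 20, 30, 10, 3, 5, 8, 24, 28, 15, 18 } },
            { "px5", new[] { 60, 85, 32, 9, 15, 24, 70, 84, 45, 54 } },
            { "px6", new[] { 10, 15, 18, 44, 12, 16, 25, 22, 16, 25 } }
        };

        foreach (double p in new[] { 1.0, 2.0, double.PositiveInfinity })
        {
            Console.WriteLine("p = {0}: closest to px4 is {1}", p, Closest(plots, "px4", p));
        }
    }
}
```

Larger values of p penalise a single big gap more heavily than many small ones, so the choice of p is itself part of how you define "similar".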

If you want to treat them as real-valued functions on the interval [1, 10], you could use the norm on an L^p-space. (While this should involve integration, since your functions are all made up of straight segments you could calculate this norm exactly without having to do numerical approximation of the integral.)
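To illustrate the "exact, no numerical approximation" remark, here is a minimal sketch for p = 2. It assumes the points are joined by straight segments and sit at unit spacing (enr 1 to 10), so the difference of two plots is linear on each segment, running from d0 to d1, and the integral of its square over that segment is (d0² + d0·d1 + d1²) / 3. The names FunctionNormDemo and L2Distance are hypothetical.

```csharp
using System;

static class FunctionNormDemo
{
    // L2 distance between two piecewise-linear plots sampled at unit spacing.
    public static double L2Distance(int[] a, int[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Plots must have the same number of points.");

        double integral = 0;

        for (int i = 0; i + 1 < a.Length; i++)
        {
            double d0 = a[i] - b[i];         // difference at the left end of the segment
            double d1 = a[i + 1] - b[i + 1]; // difference at the right end

            // Exact integral of the squared linear difference over a unit-width segment.
            integral += (d0 * d0 + d0 * d1 + d1 * d1) / 3.0;
        }

        return Math.Sqrt(integral);
    }

    static void Main()
    {
        int[] px3 = { 15, 20, 15, 8, 3, 12, 28, 32, 12, 20 };
        int[] px4 = { 20, 30, 10, 3, 5, 8, 24, 28, 15, 18 };

        Console.WriteLine("L2 distance px4 to px3: {0:F3}", L2Distance(px4, px3));
    }
}
```

Unlike the plain vector norm, this takes the area between the lines into account, including the stretches where the two plots cross between sample points.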

Really, you need to decide how you want to define "similar", and then pick a method that acts like you'd expect.
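To show how much the definition matters, here is a sketch that ranks by Pearson correlation instead of a norm. This is a swapped-in measure, not something the norms above give you: it compares the shape of two plots and ignores their scale and offset. CorrelationDemo, Pearson and MostCorrelated are illustrative names only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class CorrelationDemo
{
    // Pearson correlation coefficient of two equal-length plots.
    // Returns NaN if either plot is constant (zero variance).
    public static double Pearson(int[] a, int[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Plots must have the same number of points.");

        double meanA = a.Average();
        double meanB = b.Average();
        double cov = 0, varA = 0, varB = 0;

        for (int i = 0; i < a.Length; i++)
        {
            double da = a[i] - meanA;
            double db = b[i] - meanB;
            cov += da * db;
            varA += da * da;
            varB += db * db;
        }

        return cov / Math.Sqrt(varA * varB);
    }

    // Name of the plot (other than the target) with the highest correlation.
    public static string MostCorrelated(IDictionary<string, int[]> plots, string target)
    {
        return plots.Keys
            .Where(k => k != target)
            .OrderByDescending(k => Pearson(plots[target], plots[k]))
            .First();
    }

    static void Main()
    {
        var plots = new Dictionary<string, int[]>
        {
            { "px1", new[] { 90, 70, 100, 80, 60, 50, 70, 90, 75, 65 } },
            { "px2", new[] { 5, 10, 15, 20, 25, 30, 18, 12, 20, 10 } },
            { "px3", new[] { 15, 20, 15, 8, 3, 12, 28, 32, 12, 20 } },
            { "px4", new[] { 20, 30, 10, 3, 5, 8, 24, 28, 15, 18 } },
            { "px5", new[] { 60, 85, 32, 9, 15, 24, 70, 84, 45, 54 } },
            { "px6", new[] { 10, 15, 18, 44, 12, 16, 25, 22, 16, 25 } }
        };

        Console.WriteLine("Most correlated with px4: {0}", MostCorrelated(plots, "px4"));
    }
}
```

On the data in the question this ranks px5 first, because px5 is roughly three times px4 at every point, whereas a sum of absolute differences picks px3. Neither is wrong; they answer different questions.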

2012-04-04 09:04
by Rawling
Thanks, got the idea... I am considering correlation and a regression line as the matching criteria. Thanks again - Raj 2012-04-05 02:44