I am working on a web application that includes long listings of names. The client originally wanted the names split up into divs by letter so that it is easy to jump to a particular name on the list.
Now, looking at the list, the client pointed out several letters that have only one or two names associated with them. He now wants to know if we can combine several consecutive letters if there are only a few names in each.
(Note that letters with no names are not displayed at all.)
What I do right now is have the database server return a sorted list, then keep a variable containing the current character. I run through the list of names, incrementing the character and printing the opening and closing div and ul tags as I get to each letter. I know how to adapt this code to combine some letters; the one thing I'm not sure how to handle is deciding whether a particular combination of letters is the best possible one. In other words, say that I have:
A - 12 names
B - 2 names
C - 1 name
D - 1 name
E - 1 name
F - 23 names
I know how to end up with a group A-C and then have D by itself. What I'm looking for is an efficient way to realize that A should be by itself and then B-D should be together.
I am not really sure where to start looking at this.
If it makes any difference, this code will be used in a Kohana Framework module.
UPDATE 2012-04-04:
Here is a clarification of what I need:
Say the minimum number of items I want in a group is 30. Now say that letter A has 25 items, letters B, C, and D have 10 items each, and letter E has 32 items. I want to leave A alone because it will be better to combine B+C+D. The simple way to combine them is A+B, C+D+E, which is not what I want.
In other words, I need the best fit that comes closest to the minimum per group.
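One possible way to make "best fit" concrete is a small dynamic program over the per-letter counts. The sketch below is only an illustration under stated assumptions: the function name `bestFitGroups`, the `[label, count]` input shape, and the squared-distance penalty are all hypothetical choices, not something given in the question. It tries every way of cutting the ordered letters into consecutive groups and keeps the one whose group sizes stay closest to the target.

```javascript
// Hypothetical helper: split ordered [label, count] pairs into consecutive
// groups whose sizes are as close as possible to `target`.
function bestFitGroups(counts, target) {
  const n = counts.length;
  // prefix[i] = total number of names in the first i letters
  const prefix = [0];
  for (const [, c] of counts) prefix.push(prefix[prefix.length - 1] + c);

  // best[i] = lowest total penalty for grouping the first i letters
  // cut[i]  = index where the last group of that grouping starts
  const best = new Array(n + 1).fill(Infinity);
  const cut = new Array(n + 1).fill(0);
  best[0] = 0;
  for (let i = 1; i <= n; i++) {
    for (let j = 0; j < i; j++) {
      const size = prefix[i] - prefix[j];
      // Assumed cost: squared distance of the group size from the target
      const penalty = best[j] + (size - target) ** 2;
      if (penalty < best[i]) {
        best[i] = penalty;
        cut[i] = j;
      }
    }
  }

  // Walk the cut points backwards to recover the groups in order
  const groups = [];
  for (let i = n; i > 0; i = cut[i]) {
    const letters = counts.slice(cut[i], i).map(([label]) => label);
    groups.unshift({ letters, size: prefix[i] - prefix[cut[i]] });
  }
  return groups;
}

const example = [['A', 25], ['B', 10], ['C', 10], ['D', 10], ['E', 32]];
console.log(bestFitGroups(example, 30).map(g => g.letters.join('-')));
// prints [ 'A', 'B-C-D', 'E' ]
```

With at most a few dozen distinct initials the quadratic loop is negligible. A different penalty (for example, one that only punishes groups below the minimum) would change which split wins, so the cost function is the part to tune.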
If a letter contains more than 10 names, or whatever reasonable limit you set, do not combine it with the next one. However, if you start combining letters, you might have it run until 15 or so names are collected if you want, as long as no individual letter has more than 10. That's not a universal solution, but it's how I'd solve it.
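A rough sketch of that rule, for illustration only: the name `groupSmallLetters`, the `[label, count]` input shape, and the default limits of 10 and 15 are assumptions taken from the numbers above, not a tested solution.

```javascript
// Hypothetical helper: letters with more than `bigLetter` names stand alone;
// smaller letters are merged until the group holds about `groupCap` names.
function groupSmallLetters(counts, bigLetter = 10, groupCap = 15) {
  const groups = [];
  let current = null; // group of small letters currently being built
  for (const [label, count] of counts) {
    if (count > bigLetter) {
      // A big letter closes any open group and is never combined
      if (current) {
        groups.push(current);
        current = null;
      }
      groups.push({ letters: [label], size: count });
      continue;
    }
    if (!current) current = { letters: [], size: 0 };
    current.letters.push(label);
    current.size += count;
    // Stop collecting once enough names have accumulated
    if (current.size >= groupCap) {
      groups.push(current);
      current = null;
    }
  }
  // Keep a trailing group that never reached the cap
  if (current) groups.push(current);
  return groups;
}

const sample = [['A', 12], ['B', 2], ['C', 1], ['D', 1], ['E', 1], ['F', 23]];
console.log(groupSmallLetters(sample).map(g => g.letters.join('-')));
// prints [ 'A', 'B-C-D-E', 'F' ]
```

This is a single pass with no lookahead, so it will not always find the split closest to a target size; it only guarantees that large letters are left alone.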
I came up with this function using PHP. It combines consecutive letters until the group holds at least $ammount names.
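function split_by_initials($names, $ammount, $tollerance = 0) {
    // Bucket the names by first character; $names is assumed to be sorted already.
    $filtered = array();
    foreach ($names as $name) {
        $filtered[$name[0]][] = $name;
    }
    $result = array();
    $count = 0;
    $key = '';
    $temp = array();
    foreach ($filtered as $initial => $split) {
        $count += count($split);
        // Append (rather than prepend) so the names keep their original order.
        $temp = array_merge($temp, $split);
        $key .= $initial.'-';
        // Close the group once it is within $tollerance of $ammount.
        if ($count >= $ammount - $tollerance) {
            $result[$key] = $temp;
            $count = 0;
            $key = '';
            $temp = array();
        }
    }
    // Keep any trailing letters that never reached the threshold.
    if ($temp) {
        $result[$key] = $temp;
    }
    return $result;
}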
The third parameter is for when you want a letter to stand alone even though it does not quite reach the amount specified but is close enough.
For example, say I want to split into groups of 30 but A has only 25 names: if you set a $tollerance of 5, A will be left alone and the other letters will be grouped. Note that the tolerance applies to every group, not only to single letters.
I forgot to mention that it returns a multidimensional array whose keys are the letters in each group and whose values are the names in that group. Something like:
Array ( [A-B-C-] => Array ( [0] => Bandice Bergen [1] => Arey Lowell [2] => Carmen Miranda ) )
It is not exactly what you needed, but I think it's close enough.
Using the jsfiddle that mrsherman posted, I came up with something that could work: http://jsfiddle.net/F2Ahh/
Obviously that should be treated as pseudocode; some techniques could be applied to make it more efficient. But it gets the job done.
JavaScript version: an enhanced version with sorting and symbol grouping
// Requires jQuery for $.each.
function group_by_initials(names, ammount, tollerance) {
    tollerance = tollerance || 0;
    var filtered = {};
    var result = {};
    // Names starting with any character outside this set share the 'sym' bucket.
    var pattern = /[a-zA-Z0-9&_\.-]/;
    $.each(names, function (key, value) {
        var val = value.trim();
        if (!val) {
            return; // skip empty names
        }
        var initial = pattern.test(val[0]) ? val[0] : 'sym';
        if (!(initial in filtered)) {
            filtered[initial] = [];
        }
        filtered[initial].push(val);
    });
    var count = 0;
    var key = '';
    var temp = [];
    $.each(Object.keys(filtered).sort(), function (ky, value) {
        count += filtered[value].length;
        temp = temp.concat(filtered[value]);
        key += value + '-';
        // Close the group once it is within tollerance of ammount.
        if (count >= ammount - tollerance) {
            // Drop the trailing '-' from the key.
            result[key.substring(0, key.length - 1)] = temp;
            count = 0;
            key = '';
            temp = [];
        }
    });
    // Keep any trailing letters that never reached the threshold.
    if (temp.length) {
        result[key.substring(0, key.length - 1)] = temp;
    }
    return result;
}