How to generate a sequence of n random positive integers which add up to some value?

I'm trying to generate an array of integers contains randoms adding up to a particular value. Here's my code:

private long[] getRandoms(long size , long sum) throws Exception {
  double iniSum = 0;
  System.out.println("sum = " + sum);
  long[] ret = new long[(int) size];
  for (int i = 0 ; i < ret.length; i++) {
    ret[i] = randomInRange(1, sum);
    iniSum += ret[i];
  }

  double finSum = 0;
  for (int i = 0 ; i < ret.length; i++) {
    ret[i] =  Math.round((sum * ret[i]) / iniSum);
    System.out.println("ret[" + i +"] = " + ret[i]);
    finSum += ret[i];
  }

  if (finSum != sum) throw new Exception("Could not find " + size + " numbers adding up to " + sum  + " . Final sum = " + finSum);
  return ret;
}



private long randomInRange(long min , long max) {
  Random rand = new Random();
  long ret = rand.nextInt((int) (max - min + 1)) + min;
  System.out.println("ret = " + ret);
  return ret;
}

However, the results are not accurate, for instance:

Could not find 100 numbers adding up to 4194304 . Final sum = 4194305.0

I think I'm losing accuracy in this bit:

(sum * ret[i]) / iniSum

Can you recommend an alternative algorithm or a fix in my code which can help me achieve this objective?

2012-04-05 02:12
by Dhruv

Is the same exception thrown every time, or is the exception different when you run the program more than once - Adam Mihalcin 2012-04-05 02:15

"I'm trying to generate an array of integers contains randoms adding up to a particular value." (scratches head) Would that not mean that at least one of the values in the the collection was not random - Andrew Thompson 2012-04-05 02:16

@AndrewThompson Had a good laugh - assylias 2012-04-05 02:17

So your idea appears to be pick 100 random integers, sum them, then scale every integer by the ratio of the desired sum to the sum of the original 100 numbers. Not clear to me that would ever work, except by luck - QuantumMechanic 2012-04-05 02:19

Pseudorandom numbers which simulate blocks of memory adding to a certain size in unit test. Is that better? Do you have a solution - Dhruv 2012-04-05 02:20

@AdamMihalcin The exception is different each time because there's a call to Math.random in the helper method - Dhruv 2012-04-05 02:22

(shrugs) have an indefinite number of blocks of memory (of random size), when the remainder gets below a limit, assign it to the last block - Andrew Thompson 2012-04-05 02:22

@emory Good point. 'pseudo-random' is typically good enough for computing devices, but that does not make it the same as 'random' - Andrew Thompson 2012-04-05 02:23

For what it's worth, unit tests should not have random behavior. They should have specifically designed behavior to test a specific case - Reverend Gonzo 2012-04-05 02:23

Why are you using double for your sums vs long? And you are aware, aren't you, that random behavior assures that the algorithm will fail most of the time - Hot Licks 2012-04-05 02:30

@QuantumMechanic: Taking n random numbers (in any range, as long as the distribution is uniform), summing them, and dividing them all by the sum is the standard way to generate n random "fair" numbers from [0,1] which sum to 1. It should be obvious that multiplying them all by another number k will give you n random "fair" numbers from [0,k] - BlueRaja - Danny Pflughoeft 2012-04-06 16:08

Each time you scale a value with ret[i] = Math.round((sum * ret[i]) / iniSum), you will lose some precision, partly from the division operation itself, but mostly from storing the scaled value as an integer. The later situation is similar to proportional electoral systems where a small number of seats must be allocated in proprtion to a larger numbers of votes.

Two techniques for mitigating this:

First scale all the values in the list, but keep track of the difference between the ideal scaled value (a real number) and stored scaled value. Use truncation instead of rounding, so that the discrepency will always be positive. If there is a discrepency, you can increment some of the values in order of the difference between the ideal amount and the current stored amount.

long finSum = 0;  // Might as well be a long
float[] idealValues = new float[size];
for (int i = 0 ; i < ret.length; i++) {
    ideal[i] = (sum * ret[i]) / iniSum;
    ret[i] = (long) (ideal[i]);  // Truncate, not round
    System.out.println("ret[" + i +"] = " + ret[i]);
    finSum += ret[i];
}

/* There are iniSum - finSum extra amounts to add. */
for (int i = 0; i < iniSum - finSum; i++)
{
    /* Pick a target slot to increment.  This could be done by keeping a priority queue. */
    int target = 0;
    float bestScore = 0.0;
    for (int j = 0; j < size; j++) {
        float score = (ideal[j] - ret[j])/ret[j];
        if (score > bestScore) {
            target = j;
            bestScore = score;
        }
    }

    /* Allocate an additional value to the target. */
    ret[target]++;
}

Or more simply, you could just set the last value in the list to whatever is outstanding after scaling doing all the others. That does statistically skew the output, however.

2012-04-05 02:35
by Edmund

I like this approach, however you could just put out those extra amount by choosing a random slot. I don't think it will bias the distribution any worse then the existing method - Michael Anderson 2012-04-05 02:55

This solution mitigates the floating-point inaccuracy problem, but does not eliminate it. There are solutions that do not rely on floating-point at all, such as @woodings or my answer - BlueRaja - Danny Pflughoeft 2012-04-06 18:44

Just got an idea. Initialize the array to all 0. Randomly pick a number in the array and increase by 1 and decrease sum by 1 until sum is 0. It's not practical at all when sum is large:)

long ret = new long[size];
for(int i=0;i<size;i++) ret[i]=0;
for(int i=0;i<sum;i++) {
  int n = random(0,size);
  ret[n]++;
}

2012-04-05 02:25
by woodings

This has the advantage of being unbiased, but the disadvantage of being slow - Michael Anderson 2012-04-05 02:38

A modification to this is to divide the number up equally (or nearly eaually) into N pieces, and then repeatedly pick two pieces at random and shift some of the values between them. Kinda like shuffling. I'm not sure how much you'd need to shuffle to get something truely random. You might also do this shifting gradually decresing amounts about too .. start large, and work to small .. this may give a better distribution of numbers - Michael Anderson 2012-04-05 02:43

Yes, there is a standard way to do this which avoids floating-point inaccuracies. It is related to the method of counting the number of ways for n numbers to sum to s. It is not complicated, but oddly enough no one has mentioned it yet, so here goes:

Imagine we have a string with n-1 X's in it and s o's. So, for example, for s=5, n=3, one example string would be

oXooXoo

Notice that the X's divide the o's into three distinct groupings: one of length 1, length 2, and length 2. This corresponds to the solution [1,2,2]. Every possible string gives us a different solution, by counting the number of o's grouped together (a 0 is possible: for example, XoooooX would correspond to [0,5,0]).

So, all we need to do is imagine creating such a string by choosing random positions for the n-1 X's, then figure out what solution that corresponds to. Here is a more detailed breakdown:

Generate n-1 unique random numbers between [1, s+n-1]. Doing this is simple enough with rejection sampling - if a number is chosen twice, just drop it and pick another one.
Sort them, then figure out the number of o's between each X. This number turns out to be currentValue - previousValue - 1.

Finally, here is some example Java (untested) which should do this:

private List<long> getRandomSequenceWithSum(int numValues, long sum)
{
    List<long> xPositions = new List<long>();
    List<long> solution = new List<long>();
    Random rng = new Random();

    //Determine the positions for all the x's
    while(xPositions.size() < numValues-1)
    {
        //See https://stackoverflow.com/a/2546186/238419
        //for definition of nextLong
        long nextPosition = nextLong(rng, sum+numValues-1) + 1; //Generates number from [1,sum+numValues-1]
        if(!xPositions.contains(nextPosition))
            xPositions.add(nextPosition);
    }

    //Add an X at the beginning and the end of the string, to make counting the o's simpler.
    xPositions.add(0);
    xPositions.add(sum+numValues);
    Collections.sort(xPositions);

    //Calculate the positions of all the o's
    for(int i = 1; i < xPositions.size(); i++)
    {
        long oPosition =  xPositions[i] - xPositions[i-1] - 1;
        solution.add(oPosition);
    }

    return solution;
}

2012-04-06 16:49
by BlueRaja - Danny Pflughoeft

You can avoid having to use rejection sampling (Which becomes really nasty in some edge cases), by splitting intervals (in your case runs of "o"s), weighting the choice of interval by number of internal o-o boundaries. (e.g. id you've split you sum like this ooooXoXoo You have 3 intervals , the first with weight 3, the second with weight 0 and the third with weight 1. So picking a number from 1 to 4 will let you chose the spot to split. (This is basically the method I outline in my answer above) - Michael Anderson 2012-04-09 02:57

I believe your issue lies with Math.round try modifying your code to use doubles and avoid any loss of precision

2012-04-05 02:16
by RyanS

A good method is to work with a list of intervals which you then split at each step. Here's the pseudo-code

 array intervals = [(0,M)]
   while number intervals<desired_number
     pick an split point x
     find the interval containing x
     split that interval at x

If you want to be really careful you need to check that your split point x is not an end point of an interval. You can do this by rejection, or you can pick the interval to split, then where to split that interval, but in that case you need to be careful about not introducing bias. (If you care about bias).

2012-04-05 02:35
by Michael Anderson

How about this:

long finSum = 0;
for (int i = 0 ; i < ret.length; i++) {
  ret[i] =  Math.round((sum * ret[i]) / iniSum);
  System.out.println("ret[" + i +"] = " + ret[i]);
  finSum += ret[i];
}
ret[0] += sum - finSum;

In other words, put all the rounding errors into the first element.

2012-04-05 02:29
by QuantumMechanic

Suggestions:

  for (int i = 0 ; i < ret.length; i++) {
    ret[i] = randomInRange(1, sum);
    iniSum += ret[i];
  }

In the first for loop, instead of iterating from 0 to (ret.length-1), use iterations only from 0 to (ret.length-2). Keep the last element for adjustment.

Also dont call randomInRange(1, sum), use randomInRange(1, sum-iniSum) instead. This will reduce the normalization required.

In the end, the last output number will be (sum-iniSum). Thus, the 2nd for loop can be removed.

2012-04-05 02:29
by Tejas Patil

Take a number. Subtract a reasonable random number from it. Repeat until you have gotten enough numbers. Their sum is by definition fixed and is equal to the initial number. You make exactly n operations where n is the quantity of numbers you need; no guessing.

Here's Python code:

import random
def generate_randoms(initial, n):
  # generates n random numbers that add up to initial.
  result = [] # empty list
  remainder = initial # we'll subtract random numbers and track the remainder
  numbers_left = n
  while numbers_left > 1:
    # guarantee that each following step can subtract at least 1
    upper_bound = remainder - numbers_left 
    next_number = random.randrange(1, upper_bound) # [1, upper_bound) 
    result.append(next_number)
    remainder -= next_number # chop off
    numbers_left -= 1
  result.append(remainder) # we've cut n-1 numbers; don't forget what remains
  return result

It works, but it has a downside: the farther are the numbers, the smaller they are. You might need to shuffle the resulting list to fight this, or play with smaller upper bounds.

2012-04-05 02:44
by 9000

This sure is an interesting question (upvote!), and I look forward to seeing some more creative solutions. One issue I have with the code you've listed is that you're using the Random class, which only returns integers, not longs, but then you just cast it to a long. So if your function really needs longs, you'll have to find a different random number generator.

The "ideal" solution would be to generate all possible additive combinations (called partitions) and then pick one randomly. However, the best-known algorithms for doing this are very expensive indeed. There's lots of good source material out there on this problem. Here's a particularly good read on partitioning algorithms:

http://www.site.uottawa.ca/~ivan/F49-int-part.pdf

Part of the problem is figuring out what you mean by "random" in this instance. For example, if you're looking for an array of 5 numbers that sum to 1000, {1000,0,0,0,0} is just as valid as {200,200,200,200,200}, which is just as valid as {136,238,124,66,436}. An algorithm that has an equal chance of generating any of these sets would be truly random.

However, I'm guessing that you're not looking for a completely random distribution, but something in between.

Unfortunately, my Java is pretty rusty, and I'm doing most everything in C# these days, so I'll present in C#...it shouldn't take too much translation to convert these algorithms into Java.

Here's my take at the problem:

public int[] GetRandoms( int size, int sum, Random r=null ) {
  if( r==null ) r = new Random();
  var ret = new int[size];
    while( sum > 0 ) {
      var n = r.Next( 1, (int)Math.Ceiling((double)sum/size)+1 );
      ret[r.Next(size)] += n;
      sum -= n;
    }
    return ret;
}

(On a practical note, see how I'm passing in my random number generator? I default it to null and just create a new one for convenience's sake, but now I can pass in a seeded random number generator, and generate the same results if I wish, which is great for testing purposes. Whenever you have a method that utilizes a random number generator, I highly recommend taking this approach.)

This algorithm basically keeps adding random numbers to random entries in the array until the sum has been reached. Because of its additive nature, it will favor larger numbers (i.e., it will be extremely unlikely that small numbers will appear in the result). However, the numbers will appear to be random, and they will sum appropriately.

If your target number is small (under 100 say), you can actually achieve the true random approach of generating all of the partitions and just picking one at random. For example, here's a partitioning algorithm (courtesy http://homepages.ed.ac.uk/jkellehe/partitions.php, translated into C#):

public List<int[]> GetPartitions( int n ) {
var result = new List<int[]>();
var a = new int[n+1];
var k = 1;
a[0] = 0;
var y = n - 1;
while( k != 0 ) {
    var x = a[k-1] + 1;
    k--;
    while( 2*x <= y ) {
        a[k] = x;
        y -= x;
        k++;
    }
    var l = k + 1;
    while( x <= y ) {
        a[k] = x;
        a[l] = y;
        result.Add( a.Take( k + 2 ).ToArray() );
        x++;
        y--;
    }
    a[k] = x + y;
    y = x + y - 1;
    result.Add( a.Take( k + 1 ).ToArray() );
}
return result;
}

(To aid you in the Java translation, a.Take(x) means take the first x elements out of array a...I'm sure there's some equivalent magic in Java these days for doing that.)

This returns a list of partitions...just pick one at random, and you'll have a perfect random distribution.

2012-04-05 07:55
by Ethan Brown

-How about this?

private long[] getRandoms(int size , long sum) {
    long[] array = new long[size];
    long singleCell = sum / size;
    Arrays.fill(array, singleCell);
    array[size-1] += sum % singleCell;
    int numberOfCicles = randomInRange(size, size*10);
    for (int i = 0; i < numberOfCicles; i++) {
        int randomIndex = randomInRange(0, size);
        long randomNumToChange = randomInRange(0, array[randomIndex]);
        array[randomIndex] -= randomNumToChange;
        array[randomInRange(0, size)] += randomNumToChange;
    }
    return array;
}

2012-04-05 08:27
by shem