Minimization with constraint on all parameters in R

I want to minimize a simple linear function Y = x1 + x2 + x3 + x4 + x5 using ordinary least squares with the constraint that the sum of all coefficients have to equal 5. How can I accomplish this in R? All of the packages I've seen seem to allow for constraints on individual coefficients, but I can't figure out how to set a single constraint affecting coefficients. I'm not tied to OLS; if this requires an iterative approach, that's fine as well.

2012-04-03 19:34
by eykanal

If the coefficients must sum to 5, then you can drop the last parameter and set it equal to 5-sum(p[1:4]) ... You could conceivably do the calculus yourself and get a closed-form expression .. - Ben Bolker 2012-04-03 19:48

@BenBolker Thanks for the suggestion. How would that work? If I phrase the equation as Y ~ x1+x2+x3+x4+x5, how do I indicate to the minimizing function that I want to keep the parameter for x5 set to 5-sum(x[1:4])? I can't just solve for Y ~ x1+x2+x3+x4, because that (appears to me to be) a completely different optimization problem - eykanal 2012-04-03 19:55

At present I think this problem is ill-defined. Suppose (for simplicity) that n=3 and sum(p)=C. The original linear problem (without constraints) is ill-posed, because we can make a1*x1+a2*x2+a3*x3 as small as we want by setting the coefficients to large negative numbers if x is positive and vice versa. Putting the constraint on (a1+a2+a3=C) transforms this to a lower-dimensional, but still ill-posed problem, i.e. minimizing a1*(x1-x3)+a2*(x2-x3)+C*x3). Care to clarify the problem ... ? (Perhaps you mean you want to fit a linear least-squares problem?? - Ben Bolker 2012-04-03 20:05

If the last sentence of the previous comment is what you want, then I think you can do this by transforming your parameters and using an offset ... I won't bother to spell it out unless you clarify what you actually want to do .. - Ben Bolker 2012-04-03 20:07

@BenBolker Sorry if my terminology is wrong. I'm currently minimizing a linear function using an ordinary least squares approach. If this requires using optim or nlminb or something, though, that's fine as well. So yes, that's what I'm looking for, and clarification will be rewarded with cupcakes (actual cupcakes not included) - eykanal 2012-04-03 20:10

The basic math is as follows: we start with

mu = a0 + a1*x1 + a2*x2 + a3*x3 + a4*x4

and we want to find a0-a4 to minimize the SSQ between mu and our response variable y.

if we replace the last parameter (say a4) with (say) C-a1-a2-a3 to honour the constraint, we end up with a new set of linear equations

mu = a0 + a1*x1 + a2*x2 + a3*x3 + (C-a1-a2-a3)*x4
   = a0 + a1*(x1-x4) + a2*(x2-x4) + a3*(x3-x4) + C*x4

(note that a4 has disappeared ...)

Something like this (untested!) implements it in R.

Original data frame:

d <- data.frame(y=runif(20),
                x1=runif(20),
                x2=runif(20),
                x3=runif(20),
                x4=runif(20))

Create a transformed version where all but the last column have the last column "swept out", e.g. x1 -> x1-x4; x2 -> x2-x4; ...

dtrans <- data.frame(y=d$y,
                     sweep(d[,2:4],
                           1,
                           d[,5],
                           "-"),
                     x4=d$x4)

Rename to tx1, tx2, ... to minimize confusion:

names(dtrans)[2:4] <- paste("t",names(dtrans[2:4]),sep="")

Sum-of-coefficients constraint:
```
constr <- 5  
```

Now fit the model with an offset:

lm(y~tx1+tx2+tx3,offset=constr*x4,data=dtrans)

It wouldn't be too hard to make this more general.

This requires a little more thought and manipulation than simply specifying a constraint to a canned optimization program. On the other hand, (1) it could easily be wrapped in a convenience function; (2) it's much more efficient than calling a general-purpose optimizer, since the problem is still linear (and in fact one dimension smaller than the one you started with). It could even be done with big data (e.g. biglm). (Actually, it occurs to me that if this is a linear model, you don't even need the offset, although using the offset means you don't have to compute a0=intercept-C*x4 after you finish.)

2012-04-03 20:20
by Ben Bolker

I'm a little confused by your first sentence. You refer to constraining x4 be equal to 5-x1-x2-x3; I'm looking to constrain the coefficients, not the variables themselves. How would I set up the constrain a4=5-a1-a2-a3 - eykanal 2012-04-04 11:31

sorry, typo (fixed now) -- but the rest of it should be correct, I thin - Ben Bolker 2012-04-04 12:22

I think the equation following that one may need to be corrected also... shouldn't y=a0+a1*(x1-x4)+a2*(x2-x4)+a3*(x3-x4)+C*x4 be y=a0 + (a1-a4)*x1 + ... + (a3-a4)*x3 - eykanal 2012-04-04 12:41

Nice exposition, this should partially clear up why it is ill defined question to begin with. In the pyschometrics literature such "a constrain of the coefficients" would be considered a formative measurement model or possible a composite measurement model in the words of Ken Bollen. See Bollen & Bauldry, 2011 - Andy W 2012-04-04 12:41

Another way to think of the problem is (which isn't any different than what @Ben did here), fit the ols model with the original data, and then just make an arbitrary linear transformation to one (or more) of the independent variables to fit the constraints placed on the estimated regression coefficients. Similar problem to say if time is one of the independent variables we could measure time in a variety of ways (and change the estimated coefficient) but what is being optimized is still the same - Andy W 2012-04-04 12:50

@eykanal, I think you're wrong: a4 doesn't appear in the equation any more, it has been transformed out. Have you worked through the algebra yourself - Ben Bolker 2012-04-04 13:18

@BenBolker - I did, I'll check it over again, I probably made a stupid mistake. Thanks for all your help - eykanal 2012-04-04 14:49

Since you said you are open to other approaches, this can also be solved in terms of a quadratic programming (QP):

Minimize a quadratic objective: the sum of the squared errors,

subject to a linear constraint: your weights must sum to 5.

Assuming X is your n-by-5 matrix and Y is a vector of length(n), this would solve for your optimal weights:

library(limSolve)
lsei(A = X,
     B = Y,
     E = matrix(1, nrow = 1, ncol = 5),
     F = 5)

2012-04-03 23:55
by flodel