This method was used by Gleb in his fmtools and aotools package. The fitting of any function to data using least absolute deviation, i.e. ∑ | [ ∑ w_i *f*(x_i) ] –*g*(y) | where *f* and *g* do not depend on **w** can be converted into a linear program, which drastically simplifies the weights identification.

The procedure will always converge to a global minimum, although there may be multiple possibilities. Here is the code for a quadratic mean, i.e. y = ( ∑ w_i x^2 )^(1/2), but obviously you can replace the generating function. Even though the function is non-linear, the inputs are transformed such that we only fit to a linear relationship.

**Pseudocode**

1. Read data as x_1, x_2, … x_n, y – identify n, y-column, number of data

2. Construct the constraints matrix:

a. For the i-th input of the j-th datum, the constraints matrix will have C_ij = (x_ij)^2

b. to the right of the n_th column, append a diagonal matrix of -1s, and a diagonal matrix of 1s

c. add a row with the first n entries equal to 1 and the rest z0s.

d. combine a diagonal matrix (n x n) with a zero matrix and append it below

3. Construct the right-hand-side constraints vector

a. for each data j, the j-th entry will be (y_j)^2

b. append a 1 to the end

c. leave the rest 0

So this constraints matrix corresponds to (QM is the quadratic mean)

QM(x_j) – y_j – (r_j +) + (r_j -) = 0 (for j = 1… num data)

∑w_i = 0

w_i ≥ 0 (for all i)

4. Write the objective function. The LAD minimizes the absolute differences, so in order to express this we simply give a weight of 1 to each of the decision variables representing residuals, and 0 to each of the weights.

5. Use the lpsolve to find the w_i values.

**R implementation**

# read data set to fit to.fit <- read.table("documents/R/train1.txt") n <- ncol(to.fit)-1 ycol <- ncol(to.fit) instances <- nrow(to.fit) # specify values of p p<- 2 # create x_i^p strings (beginning of constr.matrix) all.const <- array(0,0) for(k in 1:instances) { const.i <- array(0,0) for(i in 1:n) { const.i<-c(const.i,(to.fit[k,i]^p)) } all.const <- rbind(all.const,const.i) } # residual coefficients resid.pos <- -1*diag(instances) resid.neg <- diag(instances) all.const <- cbind(all.const,resid.pos,resid.neg) # then to enforce weights >0, and sum to 1 w.geq0 <- diag(n) w.geq0 <- cbind(w.geq0,array(0,c(n,2*instances))) all.const <- rbind(all.const,c(array(1,n),array(0,2*instances))) all.const<-rbind(all.const,w.geq0) # now create the rhs of the constr constr.v <- array(0,nrow(all.const)) for(i in 1:instances) constr.v[i] <- to.fit[i,ycol]^p constr.v[instances+1] <- 1 constr.d <- c(array("==",(instances+1)),array(">=",n*(n-1))) # obj.function is just sum of the residuals obj.coeff <- c(array(0,n),array(1,2*instances)) lp.output<-lp(direction="min",obj.coeff,all.const,constr.d,constr.v)$solution w.weights <- array(lp.output[1:n])

Hello,

I have a similar problem and am trying to replicate your example. However, I dont know the contents of your train1.txt data file. Is there any place where I can download or view the dat within the file? Thanks in advance.

Apologies I didn’t reply to this sooner, Nyg – the train1.txt file just needs to be any matrix of rows columns, where the last column is the observed outputs and each of the others relates to a variable. E.g.

4 5 2 3

2 3 2 2

would be where we want f(4,5,2) = 3 and f(2,3,2) = 2.