# Piecewise Linear Data Scaling

The following was first developed to fit the Choquet integral to journal rankings data.  The Choquet integral requires its inputs to be commensurable, but journal indices such as impact factor are distributed in a way that leaves the usual scaling techniques (normalisation, standardisation or functional transformations) with little ability to distinguish between journals of differing quality.  What we want is for a journal with 0.8 across all categories to have an output of 0.8 (idempotency).

## Relevant Papers

## Pseudocode

1. Read the data as x1 … xn y

2. For each variable, calculate the median for each y-label, sort the medians, and associate each one with the desired output for its label.

3. Define the transformation function u(x) as the piecewise-linear function which interpolates the medians.

4. Transform the data and output as a new table.
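
Step 3 can be written out explicitly. As a sketch, suppose a variable has sorted medians $m_1 < \dots < m_K$ with desired outputs $t_1 < \dots < t_K$, and that the smallest observed value $m_0$ is mapped to $t_0 = 0$. The symbols are introduced here for illustration, and anchoring the minimum at 0 is an assumption rather than something the steps above prescribe.

$$
u(x) =
\begin{cases}
t_{k-1} + (t_k - t_{k-1})\,\dfrac{x - m_{k-1}}{m_k - m_{k-1}}, & m_{k-1} \le x < m_k,\quad k = 1, \dots, K,\\[2ex]
t_K, & x \ge m_K.
\end{cases}
$$

By construction $u(m_k) = t_k$ for every label, so an item sitting at the median of label $k$ on every variable receives the value $t_k$ on every input.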

## R implementation

Required input:

Change the names of the input file and output file, and the class values, to match your data.

```
# read the data: columns x1 ... xn followed by the class label y
# ("data.txt" is a placeholder; change it to the name of your input file)
data <- read.table("data.txt", header = TRUE)

# class labels in ascending order and the output assigned to each label
# (both are placeholders; change them to match your data)
labels <- c("Y1", "Y2", "Y3", "Y4")
out <- c(0.25, 0.5, 0.75, 1)

# input variables: every column except the label column y
xcols <- setdiff(names(data), "y")

# create the median matrix: one row per label, one column per variable
mdat <- matrix(NA, nrow = length(labels), ncol = length(xcols))
groups <- split(data, data$y)
for (j in seq_along(xcols)) {
  for (k in seq_along(labels)) {
    mdat[k, j] <- median(groups[[labels[k]]][, xcols[j]])
  }
}

# enforce monotonicity of the medians within each variable
# (sorting is idempotent, so a single pass is enough)
for (j in seq_along(xcols)) mdat[, j] <- sort(mdat[, j])

# create the new data set: for each variable, interpolate linearly through
# (min, 0), (median 1, out[1]), ..., (median 4, out[4]);
# values above the top median receive the top output
data1 <- data
for (j in seq_along(xcols)) {
  x <- data[, xcols[j]]
  data1[, xcols[j]] <- approx(c(min(x), mdat[, j]), c(0, out),
                              xout = x, rule = 2, ties = max)$y
}

# plot the utility function of the first variable
plot(data[, 1], data1[, 1])

# write to a file (change "newdata.txt" to the name of your output file)
write.table(data1, "newdata.txt")
```
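
As a quick check of the scaling on a single variable, the sketch below wraps the interpolation in a small function and applies it to made-up numbers. The function name `scale_piecewise`, the medians and the output values are all hypothetical, and mapping the minimum to 0 is an assumption.

```r
# Hypothetical helper: piecewise-linear scaling of one variable.
#   medians: per-label medians of the variable
#   outputs: desired output for each label (same length, ascending)
scale_piecewise <- function(x, medians, outputs) {
  knots <- c(min(x), sort(medians))  # assumed anchor: min(x) maps to 0
  approx(knots, c(0, outputs), xout = x, rule = 2, ties = max)$y
}

# made-up example values
x <- c(0, 1, 2, 3, 4, 6, 8, 10)
u <- scale_piecewise(x, medians = c(1, 2, 4, 8),
                     outputs = c(0.25, 0.5, 0.75, 1))
print(u)
```

Each median lands on its assigned output (1 maps to 0.25, 2 to 0.5, 4 to 0.75, 8 to 1), values in between are interpolated linearly (3 maps to 0.625, 6 to 0.875), and anything above the top median is capped at 1.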