Following our submission to IPMU, Aoi and I worked on this article for MDAI, which was held in Kitakyushu, Japan. In this case we were interested in how the overall function behaviour changes with respect to the different components. (available online)
Title: Orness and cardinality indices for averaging inclusion-exclusion integrals
Authors: A. Honda, S. James and S. Rajasegarar
The inclusion-exclusion integral is a generalization of the discrete Choquet integral, defined with respect to a fuzzy measure and an interaction operator that replaces the minimum function in the Choquet integral’s Möbius representation. While in general this means that the resulting operator can be non-monotone, we have previously proposed using averaging aggregation functions for the interaction component, which under certain requirements can produce non-linear, but still averaging, operators. Here we consider how the orness of the overall function changes depending on the chosen component functions and hence propose a simplified calculation for approximating the orness of an averaging inclusion-exclusion integral.
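For readers unfamiliar with orness, here is a minimal Python sketch of the usual global definition for an averaging function f on [0,1]^n: the average value of f, rescaled so that the minimum scores 0 and the maximum scores 1. It is not the simplified calculation proposed in the paper. The function name `orness`, the `grid` parameter and the midpoint-grid approximation are assumptions made for illustration only.

```python
from itertools import product


def orness(f, n, grid=20):
    """Approximate the orness of an averaging function f on [0,1]^n (n >= 2).

    Illustrative sketch only: the integral of f is approximated with a
    midpoint grid, which costs grid**n evaluations of f.
    """
    if n < 2:
        raise ValueError("orness needs at least two inputs")
    points = [(k + 0.5) / grid for k in range(grid)]
    total = sum(f(x) for x in product(points, repeat=n))
    mean_f = total / grid ** n
    mean_min = 1.0 / (n + 1)   # exact average of min over the unit cube
    mean_max = n / (n + 1)     # exact average of max over the unit cube
    return (mean_f - mean_min) / (mean_max - mean_min)


arithmetic_mean = lambda x: sum(x) / len(x)
print(orness(arithmetic_mean, 3))  # close to 0.5
print(orness(max, 2, grid=50))     # close to 1
print(orness(min, 2, grid=50))     # close to 0
```

The grid grows exponentially with n, so this is only practical for a handful of inputs.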
Our Robust OWA work, which we started earlier last year and submitted to FUZZIEEE 2016, was accepted to IEEE Transactions on Fuzzy Systems. We also had a write-up as a Deakin Media Release. I find robust statistics and robust methods both interesting and pertinent given how blindly some machine learning methods can be applied these days. (available online)
Title: Robustifying OWA operators for aggregating data with outliers
Authors: G. Beliakov, S. James and T. Wilkin
We propose a version of Ordered Weighted Averaging (OWA) operators which are robust against inputs with outliers. Outliers may heavily bias the outputs of the standard OWA. The penalty-based method proposed here comprises both outlier detection and reallocation of weights of the OWA. At the first stage the outliers are identified based on a robust criterion that can accommodate up to half the inputs being outliers, but at the same time not removing the inputs unnecessarily. Three numerical algorithms for calculating the optimal value of this criterion are proposed. At the second stage the OWA weights are recalculated for a subset of clean data while preserving the overall character of the weighting vector. The method is numerically tested on simulated data and exemplified on aggregating a large number of online ratings where the outliers represent biased, missing or erroneous evaluations.
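To make the outlier problem concrete, here is a small Python sketch of the standard (non-robust) OWA operator, showing how a single erroneous rating shifts the output. It does not implement the penalty-based detection or weight reallocation from the paper. The function name `owa` and the example ratings are made up for illustration.

```python
def owa(values, weights):
    """Standard OWA: sort inputs in non-increasing order, then take the
    weighted sum. Weights are assumed non-negative and summing to one."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    ordered = sorted(values, reverse=True)
    return sum(w * x for w, x in zip(weights, ordered))


# Hypothetical ratings on a 1-5 scale, with one erroneous entry (100).
clean = [4, 4, 5, 4, 3]
dirty = [4, 4, 5, 4, 100]
equal = [0.2] * 5            # equal weights reduce OWA to the arithmetic mean

print(owa(clean, equal))     # about 4.0
print(owa(dirty, equal))     # about 23.4, far outside the rating scale
```

With equal weights the OWA is just the mean, so one bad value drags the result well outside the rating scale.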
Marek and I have become intrigued by the problem of fitting the Sugeno integral, which has been relatively understudied. Some of the preliminary results were presented by Marek at EUSFLAT this year. (available online)
Title: Fitting Symmetric Fuzzy Measures for Discrete Sugeno Integration
Authors: M. Gagolewski and S. James
The Sugeno integral has numerous successful applications, including but not limited to the areas of decision making, preference modeling, and bibliometrics. Despite this, the current state of the development of usable algorithms for numerically fitting the underlying discrete fuzzy measure based on a sample of prototypical values – even in the simplest possible case, i.e., assuming the symmetry of the capacity – is yet to reach a satisfactory level. Thus, the aim of this paper is to present some results and observations concerning this class of data approximation problems.
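As a reminder of the object being fitted, here is a short Python sketch that evaluates the discrete Sugeno integral when the fuzzy measure is symmetric, i.e. the measure of a set depends only on its cardinality. It covers evaluation only, not the fitting problem studied in the paper. The names `sugeno_symmetric` and `h` are illustrative assumptions, with `h[k]` holding the measure of any set of size k.

```python
def sugeno_symmetric(values, h):
    """Discrete Sugeno integral on [0,1] with a symmetric fuzzy measure.

    h is a list of length n + 1 with h[0] = 0, h[n] = 1 and non-decreasing
    entries; h[k] is the measure of any subset with k elements.
    """
    n = len(values)
    if len(h) != n + 1 or h[0] != 0 or h[n] != 1:
        raise ValueError("h must have length n + 1 with h[0] = 0 and h[n] = 1")
    if any(a > b for a, b in zip(h, h[1:])):
        raise ValueError("h must be non-decreasing")
    ordered = sorted(values)  # x_(1) <= ... <= x_(n)
    # The i-th smallest input (0-based) is paired with the measure of the
    # n - i inputs that are at least as large as it.
    return max(min(x, h[n - i]) for i, x in enumerate(ordered))


x = [0.3, 0.9, 0.6]
print(sugeno_symmetric(x, [0, 1/3, 2/3, 1]))  # 0.6
print(sugeno_symmetric(x, [0, 1, 1, 1]))      # 0.9, behaves like max
print(sugeno_symmetric(x, [0, 0, 0, 1]))      # 0.3, behaves like min
```

Fitting then amounts to choosing the n - 1 free entries of `h` so that the outputs match the observed values as closely as possible.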
We conducted some research with Deakin’s DSTIL team and had a paper accepted to Applied Soft Computing as part of a special issue. Here we looked at learning weights for inequality- and aggregation-based indices for measuring traffic, looking particularly to find indices correlated with low traffic speeds. (available online)
Title: Measuring traffic congestion: An approach based on learning weighted inequality, spread and aggregation indices from comparison data
Authors: G. Beliakov, M. Gagolewski, S. James, S. Pace, N. Pastorello, E. Thilliez and R. Vasa
As cities increase in size, governments and councils face the problem of designing infrastructure and approaches to traffic management that alleviate congestion. The problem of objectively measuring congestion involves taking into account not only the volume of traffic moving throughout a network, but also the inequality or spread of this traffic over major and minor intersections. For modeling such data, we investigate the use of weighted congestion indices based on various aggregation and spread functions. We formulate the weight learning problem for comparison data and use real traffic data obtained from a medium-sized Australian city to evaluate their usefulness.
Along with researchers from the School of IT at Deakin, we have begun looking at the application of robust aggregation to peer-assessment. I find this pretty interesting as I see a lot of opportunities and good that could come from it in my lecturing roles. Looking forward to our future work in this area! Presented by Tim at FUZZIEEE in Sicily this year (available online).
Title: Online Peer Marking with Aggregation Functions
Authors: S. James, L. Pan, T. Wilkin and L. Yin
With the rise of Massive Open Online Courses (MOOCs), online peer marking is an attractive contemporary tool for educational assessment. However, its widespread use faces serious challenges, most significantly in the perceived and actual reliability of assessment grades, which can be affected by the ability of peers to mark accurately and the potential for collusion and bias. There exist a number of aggregation approaches for alleviating the impact of biased scores, usually involving either the down-weighting or removal of outliers. Here we investigate the use of the least trimmed squares (LTS) and Huber mean for the aggregation step, comparing their performance to weighting of markers based on divergence from other peers’ marks. We design an experimental setup to generate scores and test a number of conditions. Overall we find that for a feasible number of peer markers, when the student pool comprises a significant number of ‘biased’ markers, outlier removal techniques are likely to result in a number of very unfair assessments, while more standard approaches will have more grades unfairly influenced but to a lesser extent.
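For a feel of the two robust aggregators named above, here is a minimal Python sketch of univariate LTS and Huber location estimates applied to a set of peer marks. It is a textbook-style illustration, not the experimental code from the paper. The function names, the default subset size `n // 2 + 1`, the threshold `delta` and the example marks are all assumptions.

```python
from statistics import median


def lts_mean(values, keep=None):
    """Least trimmed squares location: the mean of the `keep` points with
    the smallest sum of squared deviations. In one dimension the optimal
    subset is contiguous in sorted order, so every window is scanned."""
    xs = sorted(values)
    n = len(xs)
    h = keep if keep is not None else n // 2 + 1
    best_mean, best_ss = None, float("inf")
    for start in range(n - h + 1):
        window = xs[start:start + h]
        m = sum(window) / h
        ss = sum((x - m) ** 2 for x in window)
        if ss < best_ss:
            best_mean, best_ss = m, ss
    return best_mean


def huber_mean(values, delta=2.0, tol=1e-9, max_iter=200):
    """Huber location estimate via iteratively reweighted averaging,
    started from the median. Points further than delta from the current
    estimate are down-weighted rather than removed."""
    mu = median(values)
    for _ in range(max_iter):
        weights = [1.0 if abs(x - mu) <= delta else delta / abs(x - mu)
                   for x in values]
        new_mu = sum(w * x for w, x in zip(weights, values)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


marks = [68, 70, 71, 72, 95]     # hypothetical peer marks, one inflated
print(sum(marks) / len(marks))   # plain mean, pulled up by the 95
print(lts_mean(marks))           # mean of the tightest three marks
print(huber_mean(marks))         # down-weights 68 and 95
```

LTS discards marks outright, whereas the Huber mean keeps every mark with a reduced weight, which is the removal versus down-weighting distinction drawn in the abstract.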
This is the first collaboration with Aoi Honda, looking at some extensions of the inclusion-exclusion integral she proposed to generalize the Choquet integral. This paper explored the conditions under which our extension satisfied monotonicity and averaging behaviour. It was presented at IFSA held in Otsu, Japan this year. (available online)
Title: Averaging Aggregation Functions Based on Inclusion-exclusion Integrals
Authors: A. Honda and S. James
The inclusion-exclusion integral, defined with respect to a fuzzy measure and interaction operator, generalizes the Choquet integral. Here we look at some of its interesting properties and investigate the conditions on the interaction operator which ensure the integral is averaging. We present some illustrative examples.
An extended stability paper has been accepted to Fuzzy Sets and Systems. This was the full version of our conference paper submitted to AGOP in 2015. R-code and some extra tables/data relating to the paper are available here.
Title: Approaches to learning strictly-stable weights for data with missing values
Authors: G. Beliakov, D. Gómez, S. James, J. Montero, J.T. Rodríguez
The problem of missing data is common in real-world applications of supervised machine learning such as classification and regression. Such data often gives rise to the need for functions defined for varying dimension. Here we propose optimization methods for learning the weights of quasi-arithmetic means in the context of data with missing values. We investigate some alternative approaches depending on the number of variables that have missing values and show results for several numerical experiments.
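To illustrate what a function "defined for varying dimension" can look like, here is a hedged Python sketch of a weighted quasi-arithmetic mean that skips missing inputs and renormalizes the remaining weights. Renormalization is just one simple convention, assumed here for illustration. It is not necessarily the strictly-stable construction or the learning method from the paper (see the R code mentioned above for that). The name `quasi_arithmetic_mean` and the use of `None` for missing values are also assumptions.

```python
import math


def quasi_arithmetic_mean(values, weights, g=lambda t: t, g_inv=lambda t: t):
    """Weighted quasi-arithmetic mean g_inv(sum_i w_i * g(x_i)).

    Missing inputs are given as None; their weights are dropped and the
    remaining weights are rescaled to sum to one (an assumed convention).
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    observed = [(x, w) for x, w in zip(values, weights) if x is not None]
    total = sum(w for _, w in observed)
    if total <= 0:
        raise ValueError("no observed inputs with positive weight")
    return g_inv(sum(w / total * g(x) for x, w in observed))


w = [0.5, 0.3, 0.2]
print(quasi_arithmetic_mean([1.0, 2.0, 3.0], w))   # weighted mean, about 1.7
print(quasi_arithmetic_mean([1.0, None, 3.0], w))  # weights rescaled to 5/7, 2/7
# Geometric mean, using generator log and inverse exp:
print(quasi_arithmetic_mean([2.0, 8.0], [0.5, 0.5], math.log, math.exp))
```

Passing a different generator `g` and its inverse gives other members of the family, such as the geometric or harmonic mean.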