Thursday, January 19, 2006

The research community doesn't understand the customer

Folks in the Machine Learning 'research community' could think a bit more about their 'customers'.
It's a generic problem in academic research. In private companies you learn fast that the customer is king. In the 'academic community', though, researchers are more interested in their own problems and tend to sideline applications and users.
For example, Machine Learning helps users classify or rank their data. It should also help users understand their problem domain. I see two areas where Machine Learning is not evolving in that direction.
First, in feature selection, the ultimate goal is to optimize the set of attributes for classification. But in doing so, the process can reveal a lot about the problem domain itself. I can't find much in the literature on how to use the result of feature selection to interpret the original problem domain. For instance, the literature uses a lot of Information Theory, but it doesn't show how much information the attributes convey conditionally on the class.
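To make the point concrete, here is a minimal sketch of the kind of report I have in mind, under one reading of the idea: for each attribute, print how many bits it shares with the class. The toy data and the names `entropy` and `mutual_information` are made up for illustration, and the attributes are assumed to be discrete.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

# Hypothetical toy data: two attributes and a class label per row.
colour = ["red", "red", "blue", "blue"]
size = ["big", "small", "big", "small"]
label = ["yes", "yes", "no", "no"]

# Report, per attribute, the information it shares with the class.
for name, column in [("colour", colour), ("size", size)]:
    print(name, round(mutual_information(column, label), 3))
```

In this toy set, colour fully determines the label (one bit), while size shares no information with it. A user reading such a table learns something about the domain, not just about the classifier.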
The second symptom of the problem is again the assumption that you are always dealing with a Euclidean space. Mercer's theorem, as used in SVM, assumes a vector space. Researchers should draw on abstract algebra and category theory to reformulate the theory, so that the machines can manipulate other objects directly.
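As a rough illustration of what "manipulating other objects directly" could look like, here is a sketch of a similarity function defined on strings themselves rather than on hand-built vectors. It assumes a k-spectrum kernel (counting shared substrings of length k); that choice, and the names `kmer_counts` and `spectrum_kernel`, are mine for illustration and not a proposal for how the theory should be rebuilt.

```python
from collections import Counter

def kmer_counts(s, k):
    """Count every contiguous substring of length k in s."""
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))

def spectrum_kernel(s, t, k=2):
    """Sum, over substrings of length k common to s and t, of the
    product of their occurrence counts in each string."""
    cs, ct = kmer_counts(s, k), kmer_counts(t, k)
    return sum(cs[u] * ct[u] for u in cs.keys() & ct.keys())

# The user hands over raw strings; no feature vector is ever written down.
print(spectrum_kernel("abab", "abab"))
print(spectrum_kernel("abab", "cdcd"))
```

The user only ever supplies the objects and a notion of similarity between them, which is closer to how people think about their own data.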
