Self-Organising Maps For Colour Recognition

Introduction

I recently read the tutorial 'Color Recognition and Neural Networks' which tried a neural network solution where you have to know the colors before hand to be able to train the network to. So i tried to figure out a different way to implement it one that has no prior knowledge of the color it trains to. It should just group similar colors together. After some searching i came up with two different techniques. 'Learning Vector Quantization' (LVQ) and 'Self Organizing Maps' (SOM). I chose the latter because it can learn unsupervised.

The Algorithms

Both algorithms use a set of training data (samples) and a set of 'neurons' (nodes) consisting of n-dimensional vectors. The idea is to move the the nodes into the places so that their corresponding points in the searchspace are closest to that node. The search space will be speterated into different areas wich are the corresponding to the nodes type.

LVQ

The LVQ algorithm needs some prior knowlegde you have to build different subsets (classes) and determine which nodes correspond to the different classes.

LVQ pseudocode

 for each element in samples S
	c=arg min(k) |S	- nodes[k]|
	nodes[c] = nodes[c] sgn a(t) [S - nodes[c]]

The sgn function is positive when the samples class S is corresponding to corresponds to the node otherwise it is negative. The || function can be any function to determine the distance between the both vectors. The a(t) ( t element ]0,1[ ) function is decreasing monotnically over time.

Ok so if you want to know which class a new vector falls in you determine just use the distance function to determine the nearest node. The vector falls in the same class as this node.

SOM

The SOM algorithm is similar to the LVQ algorithm. The nodes in the SOM algorithm have some toplogogy (so that each nodes has some neighbours) this is call map.

SOM psudocode:

 for each element in samples S
	c=arg min(k) |S - nodes[k]|
	for each element in nodes N
		N = N + a(t) NF(t, |MP(nodes[c]) - MP(N)| ) [S - N]

The MP (map position) function retrieves the postion of the node in the map.

The NF(t,distance) function (Neighbouring Function) is a function that determines the distance of two nodes and returns some positve value for neigbouring nodes (most likely decresing over the distance). Its purpose is to get similar nodes further to the sample. The use of t will be explained in the implementation section.

There is no sgn function because we dont know about the classes before. After the training you can retrieve the 'class' (which is unspecified) with the same technique as in the LVQ algorithm.

Implementation

As i mentioned above i used the SOM algorithm. Now id like to go a little bit more into the detail of the implementation. Its improtant to have a look at the different functions. Troughput my implementation i used simple eucledian distance/length.

 l = sqrt( v[1]^2 + v[2]^2 + ... v[n]^2 ) ( v[n] element |R )

Before i can expailn the neigbouring function and the map position function ill have to tell you which topology i have chosen. The topology i use for the map is a simple one dimensional array. The map could as well use any other topology you can think of. But the array is the most simple one because i can just use the nodes index of the array in which i store them.

So MP function retrieves just the index fo the given node and the distance determined between the two map pionts is the difference between the indices.

The NF is a function which decreases linearly over distance and returns 1.0 for the distance 0.0 and 0.0 for the distance of a certain radius. The radius decreases monotonicaly over time but will never become 0.0.

 factor = 1.0 - (distance / radius) 
                (factor, radius element | R  | distance elment | N) 

The NF could also use a gaussian function which would be nicer. The radius determines how many neigbours a node has. If it was three it would retrieve the two previous nodes of the array as well as the following two. The nodes next to the node are weightened 0.66 the outer ones 0.33. So they will move slightly in the same direction as the node.

I reapeat the whole algorithm until the nodes move only a very small distance.

Application

Ok here we go. Now that we know how the SOM algorithm works we are having a look at how we can use it for color recognition. Every color can be understood as a vector of three components (red, green, blue) well use a color just like a point in 3d space. Then we need a set of sample colors which i generate at random this will be our the colors to be categorised. Nodes are just generated in the same way though it might be wiser to actually set some of both sets by hand I will explain in the conclusion why. Now that we have all the things we actually need to get started id like to have a look at the different functions used in the application.

I used Common Lisp for the application so i will stick to its notation from now on which is quiet easy. You will find the sourcecode at the end of the article and i would like you to get a common lisp interpreter to try it out. You just will have to follow the instructions below to get the programm running.

The learn function is just the SOM algorithm and Nkernel is the neighbouring function. The train function will start the learning.

When you start the common lisp interpreter you see a prompt. Then just type:

 > (train)

and it will start the learning. The numbers wich you will see are the total difference of the movement of nodes which i use to determine if the nodes are close to selltlement. When the algorithm has stopped you will see a list of all the trained nodes.

The whatis function is the function you use to identify a color you can use it after the train function:

 > (whatis '(r g b))

r g and b are nartural numbers ranging from 0 to 255. If the color is close to some node it will retrieve the name of the node if it has one already. Otherwise it will ask you to specify the type of the nearest node. The name is then saved in a hashtable with the index of the nearest node as its key.

Example:

 > (whatis '(0 0 128))
 "I dont know this yet please declare: " red

 > (whatis '(0 0 230))
 "This is RED"

Conclusion

You might experience some color mismatching for extreme values like black and white or red ( 12 0 0 ) im not totally sure why this happens but i think that it is related to the random samples and nodes but i chose to randomize them because its easyier than typeing in a lot of colors.

Further assumptions were to write a kernel wich kills nodes that are the same color if they are close and lay nodes together in a higher dimension ( so that nodes wich are seperated at the map which are nearly the same color will be closer together in this new space). And it might be good to introduce distance to the whatis function so that if a node is really far he would say it might be the color and adds a new node at this point to the samples and a node near it and train again. And maybe some look for different topologies of the map might be good. I havent found any good explanation how the map dimensions correlate to the overall performance. A graphical output of the map might be fine to show the process.

It would also be very fine to normalize the color and train the SOM to detect the satuation and brightness of a color. The neighbouring function could be speed up by some structuring like and octree or something like that.

Pros

Cons

Links

You can download the source code with this tutorial right here: color.lisp

SOM

http://davis.wpi.edu/~matt/courses/soms/#Determine%20Neighbors
http://www.cis.hut.fi/research/som-research/

LISP

http://ww.telent.net/cliki/index ( good point to start if you are interested in lisp )
http://clisp.sourceforge.net/ ( a free CL implemenation which runs on Mac, Win32 and Linux )

Remember you can visit the Message Store to discuss this tutorial. Comments are always welcome!. There are already replies in the thread, why not join in?