A learning mouse
The moving grey object in the left panel represents a mouse, bouncing around forever in its rectangular world. Besides the mouse, this world contains two plants, represented by circles. To the mouse, the green plant tastes good, while the red one has a foul taste. When the simulation starts, the mouse moves much like a billiard ball (with a small random component). But if you muster the patience to observe its behaviour for some time, you'll notice that the mouse develops a growing aversion to the red plant and an increasing attraction to the green one.
The mouse is equipped with a neural network that learns to translate color perceptions into steering commands. While the mouse moves around, the network stores associations between perception and steering as correlations between neuron activities. Every time the mouse hits a plant (marked by a beeping sound), its taste provides binary feedback on its previous movement sequence: "Well done!" or "Not good!". Guided by that feedback, the network strengthens or weakens connections between neurons, in order to enhance or counter the existing relation between perception and steering.
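The learning scheme described above can be sketched as a reward-modulated correlation rule. This is only an illustration, not the simulation's actual code: the function names, the decaying trace, and the `decay` and `rate` parameters are all assumptions made for the example.

```python
def accumulate(trace, pre, post, decay=0.9):
    """Fold the current pre/post activity correlation into a decaying trace.

    trace[i][j] remembers how strongly perception-side neuron i and
    steering-side neuron j were recently active together (assumed mechanism).
    """
    return [
        [decay * trace[i][j] + pre[i] * post[j] for j in range(len(post))]
        for i in range(len(pre))
    ]


def apply_feedback(weights, trace, reward, rate=0.1):
    """Strengthen (reward=+1) or weaken (reward=-1) the remembered relation."""
    return [
        [w + rate * reward * t for w, t in zip(w_row, t_row)]
        for w_row, t_row in zip(weights, trace)
    ]


# Two perception-side neurons, two steering neurons, no prior knowledge.
weights = [[0.0, 0.0], [0.0, 0.0]]
trace = [[0.0, 0.0], [0.0, 0.0]]

# Neuron 0 was active while the mouse steered with +0.5 / -0.5 ...
trace = accumulate(trace, pre=[1.0, 0.0], post=[0.5, -0.5])
# ... and then the mouse hit the green plant: "Well done!"
weights = apply_feedback(weights, trace, reward=+1)
```

With `reward=-1` the same trace would push the weights in the opposite direction, countering the relation between perception and steering instead of enhancing it.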
Behind the scenes
The panel on the right shows the inner workings of the mouse. The visual input is shown in the rectangle at top left; the mouse perceives no images, but just colors: red / green, left / right. These same signals are also displayed in the bar graph at top right. (Apparently the human eye is better at judging relative bar heights than brightnesses, especially when different colors are involved.)
Below the visual input is the neural network. At left a schematic diagram of its anatomy is drawn, with neurons represented by circles and connections between neurons by lines. The bar graphs beside this diagram monitor the signal processing by the neural network. The activities of neurons are shown by broad, black, fast-changing bars. The strengths of the connections between the neurons are shown as narrow bars (vertical lines).
Points to note:
- the brightness of the neurons in the diagram corresponds to the absolute value of their activation levels as seen in the bar graphs;
- the activities of the input neurons (top layer) correspond exactly to the strengths of the visual signals indicated directly above them;
- the connections between the input layer and the middle layer of neurons have preset strengths, unaffected by learning;
- the connections from the middle neuron layer to the output neurons start out with zero strength, and only acquire their (positive or negative) value as learning progresses;
- the output neurons have a random activity component, which is most apparent at the beginning of the learning process, when no signals from other neurons reach them.
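The points above suggest a three-layer signal flow, which might look roughly like the sketch below. The layer sizes, the values in `FIXED_W`, the `tanh` squashing, and the Gaussian noise are assumptions chosen for illustration; the text does not specify them.

```python
import math
import random

# Hypothetical fixed input-to-middle weights (4 inputs x 4 middle neurons).
# The real simulation's values are not given in the text.
FIXED_W = [
    [1.0, -1.0, 0.0, 0.0],
    [-1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, -1.0],
    [0.0, 0.0, -1.0, 1.0],
]


def forward(inputs, learned_w, noise=0.1, rng=random):
    """Return (middle, output) activities for one time step.

    inputs    -- visual signals, e.g. [red_left, red_right, green_left, green_right]
    learned_w -- middle-to-output weights, one row per middle neuron
    noise     -- standard deviation of the random output component (assumed Gaussian)
    """
    # Middle layer: fixed weights, never changed by learning.
    middle = [
        math.tanh(sum(x * FIXED_W[i][j] for i, x in enumerate(inputs)))
        for j in range(len(FIXED_W[0]))
    ]
    # Output layer: learned weights plus a random activity component.
    output = [
        sum(m * learned_w[j][k] for j, m in enumerate(middle)) + rng.gauss(0.0, noise)
        for k in range(len(learned_w[0]))
    ]
    return middle, output


# Before learning, all middle-to-output weights are zero,
# so the steering outputs consist of noise only.
untrained = [[0.0, 0.0] for _ in range(4)]
middle, output = forward([0.8, 0.1, 0.0, 0.3], untrained, rng=random.Random(1))
```

Passing `noise=0.0` with untrained weights yields outputs of exactly zero, which makes the role of the random component at the start of learning easy to see.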
Below the neural network and its activity, the number of times the mouse hits each plant is counted (in corresponding color), the percentage of collisions with the green plant is computed (in black), and this percentage is graphed to visualize the learning process.
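The bookkeeping behind that graph could be as simple as the following sketch. It assumes the percentage is cumulative over all collisions since the start; the function names are made up for the example.

```python
def green_percentage(green_hits, red_hits):
    """Percentage of all plant collisions that were with the green plant."""
    total = green_hits + red_hits
    if total == 0:
        return None  # no collisions yet, nothing to plot
    return 100.0 * green_hits / total


def learning_curve(collisions):
    """Percentage after each collision in a sequence of 'green'/'red' strings."""
    green = red = 0
    curve = []
    for plant in collisions:
        if plant == "green":
            green += 1
        else:
            red += 1
        curve.append(green_percentage(green, red))
    return curve


# One red hit followed by three green hits: the curve climbs towards 100%.
curve = learning_curve(["red", "green", "green", "green"])
```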
- After some time the mouse, when moving towards the green plant, tends to overcompensate for deviations from the direct course, resulting in a wriggling movement.
- Sometimes the neural network gets stuck in a local minimum. Ideally, learning results in (from left to right) up, down, down, up, down, up, up, and down spikes in the second connection layer, with the learning graph getting ever closer to 100%. In a local minimum, the first four spikes are up, up, down, and down (or, less frequently, down, down, up, and up). In that case the other four spikes form only partially (or hardly at all), and the learning graph stays much lower, though usually still above the 50% that chance alone would produce.
Source code: quick and dirty, with identifiers and (sparse) comments in Dutch. Copyright (c) 2019.