Behavioral and Cognitive Robotics
An adaptive perspective

Stefano Nolfi

© Stefano Nolfi, 2021

10. The Neural Basis of Cognition

10.1 Introduction

The term cognition does not have a shared and uncontroversial definition. Some use it to indicate only the high-level faculties that characterize human intelligence, such as planning, language, imagination and consciousness. Others include additional faculties such as perception, memory and selective attention. Still others extend the meaning of the term to all processes that regulate the interaction between the agent and the environment in a way that is instrumental to the achievement of the agent’s goals (Bourgine and Stewart, 2006).

According to the most extended definition, the examples that we reviewed in the previous chapters also involve cognitive robots. In the last part of this book, however, we will focus on complex cognitive faculties. More specifically, in this Chapter we will analyze the neural mechanisms and processes that can support the development of complex cognitive faculties. We will describe examples of cognitive robots, i.e. robots displaying high-level cognitive faculties, in the next Chapter.

10.2 Features extraction

As detailed in the previous chapters, the role of the robot’s brain consists in varying the state of the actuators on the basis of the current and possibly previous observations in order to orchestrate the dynamics originating from the interaction between the robot and the environment. The goal of such orchestration is to ensure the achievement of the robot’s goal/s.

When the observations directly encode the features that can be used to determine the state of the actuators, the robot can solve the problem by connecting the sensory neurons directly to the motor neurons. This is the case, for example, of the Braitenberg vehicles illustrated in Chapter 2. In the other cases, instead, the required features should be extracted from the sensory neurons by internal neurons or generated by performing actions that make it possible to later experience observations including the required features.

Figure 10.1. Illustration of a hypothetical brain including two sensory neurons (s1 and s2) and a binary motor neuron (m1). The vertical and horizontal axes delimit the two-dimensional sensory space. The neural networks on the right of the sensory space represent the minimal network that can perform an OR and XOR function (left and right Figures, respectively).

To illustrate the role of internal feature extraction, consider a minimal robot that should regulate the state of its single binary motor on the basis of the state of its two binary sensors. If the motor is on when the first or the second sensor is active, and is off otherwise, the brain of the robot performs an OR function (Figure 10.1, left). In this case, the state of each sensor can be used to determine the state of the motor independently from the state of the other sensor. In other words, the state of sensor s1 can be used to determine the state of motor m1, independently from the state of s2 and vice versa. Consequently, the sensor s1 can be connected directly to the motor m1 with a positive connection weight. Similarly, the sensor s2 can be wired directly to the motor m1 with a positive weight. From a geometrical point of view, problems of this kind are called linearly separable since the observations that should produce different actions can be separated with a single line (Figure 10.1, left).

Instead, if the actuator is on when the first sensor is active and the second sensor is not active or vice versa, and is off otherwise, the brain of the robot performs an XOR function. In this case, the state of each sensor cannot be used to determine the state of the motor independently from the state of the other sensor. Consequently, the sensors cannot be connected directly to the motor. From a geometrical point of view, the observations which produce different actions cannot be separated with a single line. They can be separated with two or more lines, e.g. a line that separates the observation [0, 0] from the other observations and a line that separates the observation [1, 1] from the other observations (see Figure 10.1, right). The problem requires a neural network including at least two internal neurons which separate the two sub-areas of the sensory space. The features extracted by these two internal neurons can then be used to determine directly the state of the motor, i.e. the internal neurons can be connected directly to the motor neuron.
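The difference between the two cases can be made concrete with a minimal sketch. The threshold neurons, the weights and the biases used below are illustrative assumptions, not values taken from Figure 10.1. The OR network connects the sensors directly to the motor, while the XOR network uses two internal neurons, each drawing one of the two separating lines.

```python
def step(net_input):
    """Binary threshold neuron: it is on (1) when its net input is positive."""
    return 1 if net_input > 0 else 0


def or_network(s1, s2):
    # Both sensors are wired directly to the motor with positive weights.
    return step(1.0 * s1 + 1.0 * s2 - 0.5)


def xor_network(s1, s2):
    # h1 separates the observation [0, 0] from the other observations.
    h1 = step(1.0 * s1 + 1.0 * s2 - 0.5)
    # h2 separates the observation [1, 1] from the other observations.
    h2 = step(1.0 * s1 + 1.0 * s2 - 1.5)
    # The motor is on when h1 is on and h2 is off.
    return step(1.0 * h1 - 1.0 * h2 - 0.5)


if __name__ == "__main__":
    for s1 in (0, 1):
        for s2 in (0, 1):
            print(s1, s2, "OR:", or_network(s1, s2), "XOR:", xor_network(s1, s2))
```

No choice of weights for a single threshold neuron connected directly to the two sensors reproduces the XOR column, which is why the internal neurons are needed.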

Unfortunately, there are no precise rules for determining the appropriate number of neurons and layers. We can only use simple heuristics: the higher the complexity of the problem, the higher the number of neurons and layers should be. Moreover, the larger the observation and action vectors, the higher the number of neurons and layers required.

In the next Chapter we will see how the extraction of useful features can be promoted through self-supervised learning.

For systems that are not embodied and situated, internal layers represent the only way to extract features. Embodied and situated systems, instead, can also extract features through actions, i.e. they can perform actions that enable them to later experience the required features in their future observations. We saw several examples of this kind of feature extraction in the previous chapters.

A first example is the humanoid robot evolved for the ability to discriminate spherical and ellipsoid objects on the basis of tactile information illustrated in Section 4.5. The robot solved the problem by manipulating the objects so as to converge on two types of hand/object postures, a first type of posture after interacting with spheres and a second type of posture after interacting with ellipsoids. In other words, the manipulation behavior makes it possible to later experience observations which include the features required to discriminate the category of the object. A second example is constituted by the robot evolved for the ability to navigate in the double T-Maze described in Section 4.7. In this case the way the robots respond to the perception of the green beacons ensures that they will later experience observations including the feature required to decide which way to turn at the T-junctions. A third example is constituted by the follow-border-look-robot behavior displayed by the robots reviewed in Section 9.3. The execution of this behavior enables the companion robots to experience observations including the required features, i.e. the direction of the target destination.

Clearly, features extracted through actions can be combined with features extracted internally. This is the case of the robots described in Sections 4.5 and 4.7, which combine features extracted externally and features extracted internally by integrating observations over time. The possibility to extract information from the variations of observations over time brings us to the topic discussed in the next Section.

10.3 Features extraction over time

In some cases, the relevant features cannot be extracted from the current observation but can be extracted from the way observations vary over time. A straightforward example is constituted by the extraction of velocity information from position sensors. Another example is constituted by the extraction of distance information from optic flow. Indeed, while the robot moves forward, the rate of variation of nearby pixels located in certain portions of the visual field provides a reliable indication of the proximity of the objects located in the corresponding direction. A third example concerns the anticipation of future states, for example the expected position, after a certain time period, of a target moving with a certain speed and direction.
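The first and third examples can be illustrated with a minimal sketch. It assumes a one-dimensional position sensor sampled at a fixed interval and a target moving at constant velocity; the function names are hypothetical.

```python
def estimate_velocity(previous_position, current_position, dt):
    """Velocity extracted from two successive position observations."""
    return (current_position - previous_position) / dt


def anticipate_position(current_position, velocity, horizon):
    """Expected position after `horizon` seconds, assuming constant velocity."""
    return current_position + velocity * horizon


if __name__ == "__main__":
    velocity = estimate_velocity(1.0, 1.5, dt=0.5)
    print("velocity:", velocity)
    print("expected position:", anticipate_position(1.5, velocity, horizon=2.0))
```

Neither quantity is available in a single observation: both require that the previous observation is retained in some form.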

The extraction of features from multiple observations experienced over time requires recurrent neural networks. In these networks internal neurons receive signals from other internal neurons that have not yet been updated in the current time step and consequently encode information about previous states. This property enables such networks to preserve in their internal states features extracted from previous observations.

Recurrent neural networks come in different varieties. Here we restrict our analysis to the two varieties that are particularly interesting for robotics: Continuous Time Recurrent Neural networks (CTRNN) and Long Short-Term Memory networks (LSTM).

CTRNN (Beer, 1995) use ordinary differential equations to update the activation of neurons. More specifically, the variation of the activation of a neuron i with action potential yi corresponds to:

$$\tau_i \dot{y}_i = -y_i + \sum_{j} W_{ij}\,\sigma(y_j + \Theta_j)$$

where τi is the time constant of the post-synaptic neuron, yi is the activation of the post-synaptic neuron, ẏi is the rate of change of the activation of the post-synaptic neuron, Wij are the connection weights between pre-synaptic neurons and the post-synaptic neuron, σ() is the logistic activation function, and Θj are the biases of the pre-synaptic nodes. The retention of information over time, therefore, is also ensured by the fact that CTRNN neurons partially preserve their previous state. The rate of preservation, and consequently the maximum rate of variation, depends on the time constant parameter.
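The role of the time constant can be illustrated with a minimal sketch that integrates the CTRNN dynamics with the forward Euler method. The sketch assumes the form tau_i * dy_i/dt = -y_i + sum_j W_ij * logistic(y_j + theta_j) without external input; the step size dt and the parameter values are illustrative assumptions.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def ctrnn_step(y, W, theta, tau, dt=0.01):
    """One forward Euler step of the CTRNN dynamics (assumed form, see lead-in)."""
    n = len(y)
    # Output of each pre-synaptic neuron j.
    out = [logistic(y[j] + theta[j]) for j in range(n)]
    new_y = []
    for i in range(n):
        net_input = sum(W[i][j] * out[j] for j in range(n))
        dy = (-y[i] + net_input) / tau[i]
        new_y.append(y[i] + dt * dy)
    return new_y


if __name__ == "__main__":
    # Two unconnected neurons that differ only in their time constant.
    y = [1.0, 1.0]
    W = [[0.0, 0.0], [0.0, 0.0]]
    theta = [0.0, 0.0]
    tau = [0.1, 1.0]
    for _ in range(100):
        y = ctrnn_step(y, W, theta, tau)
    print(y)  # the neuron with the larger time constant retains more of its state
```

With all weights set to zero each neuron simply decays towards zero, and the neuron with the larger time constant preserves its initial state for longer.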

LSTM (Hochreiter and Schmidhuber, 1997; Gers and Schmidhuber, 2001) use internal neurons constituted by complex units, called cells. Each cell includes three logistic neurons and one hyperbolic tangent neuron, and performs additive and multiplicative calculations (Figure 10.2). The activation of the four neurons is determined conventionally on the basis of the input vector (it), of the incoming weights of each neuron (Wf, Wi, Wo, and Wc), and of the activation functions of the neurons. The three gating neurons (f, i and o) use the logistic activation function, which returns a value in the range [0.0, 1.0]. The cell neuron (c), instead, uses the hyperbolic tangent function, which returns a value in the range [-1.0, 1.0]. The state of the cell is set equal to the state of the cell at time t-1 multiplied by the activation of the forget gate neuron (f), plus the activation of the cell neuron (c) multiplied by the activation of the input gate neuron (i). This implies that the forget gate neuron controls to what extent the old state of the unit is preserved and that the input gate neuron controls to what extent the new input alters the state of the cell. The output of the cell is calculated by multiplying the new state of the cell, squashed through a hyperbolic tangent function, by the activation of the output gate neuron. This implies that the output gate controls to what extent the internal state of the unit influences the post-synaptic neurons.

Figure 10.2. A schematization of an LSTM cell. The dotted circle represents the cell neuron that uses the hyperbolic tangent activation function. The full circles represent the three neurons that perform the role of forget (f), input (i) and output (o) gates and use the logistic activation function. The bottom rectangles indicate the incoming connection weights of the four neurons. The top arrow represents the state of the unit over time, i.e. at time t-1 and time t. The bottom line indicates the input vector (it) that is multiplied by the four vectors of weights (Wf, Wi, Wc and Wo) to generate the net input of the four corresponding neurons. The black boxes indicate multiplicative (x), additive (+), and hyperbolic tangent (tanh) operations. Finally, the bottom-right arrow indicates the output of the cell (ot).
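The update of a single cell can be sketched as follows. This is a simplified illustration that follows the description given in the text: the four neurons receive only the input vector, and biases are omitted. Common LSTM implementations also feed the previous output of the cells back to the four neurons. The function and variable names are hypothetical.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(weights, inputs):
    return sum(w * x for w, x in zip(weights, inputs))


def lstm_cell_step(inputs, previous_state, Wf, Wi, Wo, Wc):
    """Update one LSTM cell and return (new_state, output)."""
    f = logistic(dot(Wf, inputs))   # forget gate: how much old state is kept
    i = logistic(dot(Wi, inputs))   # input gate: how much new input is stored
    o = logistic(dot(Wo, inputs))   # output gate: how much state is exposed
    c = math.tanh(dot(Wc, inputs))  # cell neuron: candidate value to store
    new_state = f * previous_state + i * c
    output = o * math.tanh(new_state)
    return new_state, output


if __name__ == "__main__":
    # Forget gate nearly open and input gate nearly closed: the state is kept.
    state, output = lstm_cell_step([1.0], 0.5, Wf=[10.0], Wi=[-10.0], Wo=[10.0], Wc=[1.0])
    print(state, output)
```

In the usage example the weights are chosen so that the forget gate is close to 1 and the input gate is close to 0, so the cell preserves its previous state almost unchanged regardless of the candidate value.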

Integrating information over time requires performing three interdependent functions: storing relevant states in the activation state of internal neurons, preserving such internal states for the time necessary, and using the internal states to alter the activity of other neurons. The gating neurons specialized for these three functions make the LSTM more effective than vanilla recurrent neural networks and CTRNN.

Robots can also integrate information over time and store information externally, in their body or in the relation between their body and the environment. We have seen an example of how a robot can store information in its relation with the environment and later use this information in Section 4.7. Robots can also extract features over time from their own body and store information in their posture. For example, a walking robot can infer information about previously encountered conditions from its current posture since the posture might reflect the environmental conditions experienced previously. Moreover, a robot can store information for later use in its own posture. For example, a robot can slightly alter the position of its hands in response to a certain environmental stimulus that requires the execution of a specific action later on. Thus, the position of the hands is used to store information that can be later accessed through propriosensors.

10.4 Deep learning

The extraction of features that can then be used to determine actions is related to the notion of deep learning.

Learning or credit assignment consists in finding the weights that enable the robot to exhibit an effective behavior. Depending on the problem, this might require long causal chains of computational stages, where each stage transforms, usually in a non-linear way, the aggregate activation of the network. Deep learning refers to the possibility to assign credit across many stages of this kind (Schmidhuber, 2015).

Determining in advance whether a specific problem requires deep or shallow neural networks is not really possible since problems usually admit a large number of different solutions. Establishing whether a problem is deep would require verifying that none of the possible solutions can be implemented in shallow networks. Identifying all the possible solutions is generally not feasible. Indeed, as seen in previous chapters, certain solutions are hard to imagine from the point of view of a human observer. Determining whether a specific solution is deep or shallow, instead, can be difficult but is generally feasible. It implies determining the number of intermediate transformations performed by the solution network.

For feedforward neural networks that are not situated, i.e. that cannot alter their next observations through actions, the maximum length of the credit assignment path corresponds to the number of internal layers. In other words, un-situated feed-forward neural networks have a maximal depth bounded by the number of internal layers. This is the reason why, for these networks, the term deep is used to indicate networks with many internal layers.

Recurrent neural networks instead are unbounded, i.e. they can rely on a large (and potentially infinite) number of subsequent functional elaboration stages, regardless of the number of internal layers.

Situated neural networks are also unbounded, independently of the characteristics of their architecture, since they can alter their next observations through actions. In other words, they can use their actions to transform their observations several times in successive steps. For example, to access the observation x that makes it possible to achieve the robot’s goal through action y from state s, an embodied network can produce action 1, which permits later experiencing observation 2, which triggers action 2, which permits later experiencing observation 3, which triggers action 3, which finally permits experiencing the observation x.

Problems requiring deep solutions are challenging.

For reinforcement learning methods that backpropagate a gradient over intermediate layers and/or in time, through recurrent connections, the main challenge consists in the vanishing/exploding gradient problem (Hochreiter, 1991). The problem, which is caused by cumulative backpropagated error signals that either shrink exponentially or grow out of bounds, can be alleviated by using LSTM networks and suitable activation functions.
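The exponential shrinking or growth of the error signal can be illustrated with a minimal sketch. It assumes a chain of single logistic neurons, each connected to the next by the same weight, with every neuron operating at the point where the logistic function has its maximum slope (0.25). Under these assumptions the error is multiplied by weight * 0.25 at each stage.

```python
def backpropagated_factor(weight, depth, slope=0.25):
    """Factor by which an error signal is scaled after crossing `depth` stages.

    Each stage multiplies the signal by the connection weight and by the
    slope of the activation function (0.25 is the maximum for the logistic).
    """
    factor = 1.0
    for _ in range(depth):
        factor *= weight * slope
    return factor


if __name__ == "__main__":
    for depth in (1, 5, 10, 20):
        vanishing = backpropagated_factor(1.0, depth)
        exploding = backpropagated_factor(8.0, depth)
        print(depth, vanishing, exploding)
```

With a weight of 1.0 the factor is 0.25 raised to the depth and quickly approaches zero, while with a weight of 8.0 it is 2 raised to the depth and grows without bound.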

Evolutionary methods do not rely on backpropagation and consequently are not affected by the vanishing/exploding gradient problem. They avoid the credit assignment problem by simply retaining the variations that produce useful outcomes. They do not attempt to reconstruct the causal chain of events enabling the useful outcome. The main challenge for evolutionary methods is optimizing networks with a huge number of parameters. This problem can be alleviated by using effective stochastic optimizers, like Adam (Kingma & Ba, 2014).
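The principle of retaining useful variations can be sketched with a minimal hill climber in which a single parent generates one mutated child per generation. This is an illustrative sketch, not one of the evolutionary algorithms described in the previous chapters; the toy fitness function, the TARGET vector and the parameter values are hypothetical.

```python
import random

# Hypothetical toy problem: the fitness is maximal (0.0) when the
# parameters match TARGET.
TARGET = [1.0, -2.0, 0.5]


def fitness(params):
    return -sum((p - t) ** 2 for p, t in zip(params, TARGET))


def evolve(n_generations=200, sigma=0.1, seed=0):
    """Retain a random variation only when it does not reduce the fitness."""
    rng = random.Random(seed)
    parent = [0.0] * len(TARGET)
    parent_fitness = fitness(parent)
    history = [parent_fitness]
    for _ in range(n_generations):
        # Variation: perturb every parameter with Gaussian noise.
        child = [p + rng.gauss(0.0, sigma) for p in parent]
        child_fitness = fitness(child)
        # Selection: no gradient and no causal analysis, only the outcome.
        if child_fitness >= parent_fitness:
            parent, parent_fitness = child, child_fitness
        history.append(parent_fitness)
    return parent, history


if __name__ == "__main__":
    best, history = evolve()
    print("initial fitness:", history[0], "final fitness:", history[-1])
```

The algorithm never inspects why a variation is useful. It only compares the fitness obtained before and after the variation, which is why the depth of the solution does not affect it directly.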

10.5 Neuromodulation

The response of biological neurons is not only determined by the signals received through incoming connections. It is also influenced by neuromodulators, i.e. substances packaged into vesicles and released by neurons. Neuromodulators make it possible to dynamically control, in a context-dependent manner, the properties that determine how the neurons respond to incoming stimuli. In artificial neural networks, modulation mechanisms of this type can be realized by using special neurons that regulate the output of other neurons. We already encountered an example of regulatory neurons in the LSTM networks described above. For example, we have seen how within an LSTM cell the state of the input gate neuron regulates the output of the cell neuron in a multiplicative manner.

Although the study of neuromodulation in adaptive robots is still in an initial exploratory phase, a few works have investigated its utility in problems that require adapting to varying environmental conditions on the fly. Husband et al. (1998) and Philippides et al. (2005) demonstrated the utility of regulatory “gas” neurons, i.e. neurons diffusing a virtual gas that regulates the response of nearby neurons. Petrosino, Parisi & Nolfi (2013) demonstrated how the availability of logistic modulatory neurons, which regulate the output of associated standard neurons, facilitates the evolution of selective attention skills in evolving robots. Selective attention refers to the ability to appropriately respond to a first class of stimuli while ignoring the stimuli belonging to a second class in a certain context and to respond to the stimuli of the second class while ignoring the stimuli of the first class in a different context.
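The effect of a modulatory neuron can be illustrated with a minimal sketch. It assumes that the logistic modulatory neuron scales the output of its associated standard neuron multiplicatively, in analogy with the LSTM gates; the exact mechanism used in the cited works may differ, and the names and values below are hypothetical.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def modulated_output(standard_net_input, modulator_net_input):
    """Output of a standard neuron scaled by its logistic modulatory neuron."""
    standard = math.tanh(standard_net_input)
    gain = logistic(modulator_net_input)  # in the range [0.0, 1.0]
    return gain * standard


if __name__ == "__main__":
    stimulus = 2.0
    # Context A drives the modulator up: the stimulus is attended.
    print("context A:", modulated_output(stimulus, 10.0))
    # Context B drives the modulator down: the same stimulus is ignored.
    print("context B:", modulated_output(stimulus, -10.0))
```

The same stimulus thus produces a strong response in one context and almost no response in the other, which is the basic ingredient of selective attention.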

10.6 The architecture of the neural network

Finally, another important factor is the architecture of the robot’s brain, namely the number and type of neurons, the way they are interconnected, and possibly the constraints on the connection weights.

For many problems, feed-forward networks with one or two internal layers constitute a suitable architecture. As we mentioned above, there are no precise rules for determining the number of neurons and the number of layers. In general, the larger the dimensionality of the observation and action vectors, the larger the number of internal neurons required. Moreover, the greater the complexity of the problem, the greater the number of internal neurons and layers required. The network should include at least the minimal required number of neurons and layers. The usage of networks that are larger than required, on the other hand, does not usually hinder performance.

For problems that necessarily require integrating sensory-motor information over time, i.e. that require internal memory, recurrent neural networks are needed. As discussed in the previous chapters, determining whether a problem necessarily requires a recurrent network or not is not trivial since problems admit several different solutions and since identifying all possible solutions is not feasible. Consequently, experimenters should rely on their intuition. On the other hand, using an architecture with recurrent connections for a problem that does not require memory does not usually hinder performance.

Problems that require processing visual information benefit from the usage of convolutional neural networks. These networks are formed by a stack of progressively smaller convolutional layers, with shared weights, and pooling layers (Fukushima, 1979; Weng, Ahuja & Huang, 1993). The usage of convolutional layers with shared weights permits extracting the same classes of features from the different spatial portions of the image. Moreover, the hierarchical organization of the architecture facilitates the extraction of progressively higher-level features.
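The effect of shared weights and pooling can be illustrated with a minimal one-dimensional sketch. Real convolutional networks operate on two-dimensional images with many learned kernels; here a single hand-written kernel that detects dark-to-bright transitions is assumed for illustration.

```python
def convolve1d(signal, kernel):
    """Apply the same kernel (shared weights) at every position of the signal."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def max_pool1d(values, size=2):
    """Keep the maximum of each non-overlapping window, shrinking the layer."""
    return [
        max(values[i:i + size])
        for i in range(0, len(values) - size + 1, size)
    ]


if __name__ == "__main__":
    image_row = [0, 0, 1, 1, 0, 0, 1, 1]
    edge_kernel = [-1, 1]  # responds to a dark-to-bright transition
    features = convolve1d(image_row, edge_kernel)
    print(features)              # [0, 1, 0, -1, 0, 1, 0]
    print(max_pool1d(features))  # [1, 0, 1]
```

Because the same two weights are applied at every position, the two dark-to-bright transitions are detected in the same way although they occur in different portions of the input.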

Problems requiring the production of rhythmic behaviors, e.g. walking and swimming, can benefit from the availability of neural oscillators. Neural oscillators are a special type of neurons capable of producing periodic oscillatory outputs also in the absence of rhythmic inputs (see for example Kamimura et al., 2005). Oscillatory neurons of this type play a role analogous to Central Pattern Generators (CPGs), i.e. neural circuits producing self-sustained oscillatory signals that support the production of rhythmic behavior in natural organisms.
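The idea of a unit that oscillates without rhythmic input can be illustrated with a minimal sketch. It models the oscillator as a two-dimensional state that is rotated by a fixed angle at every time step. This is a generic illustration chosen for simplicity, not the oscillator model used by Kamimura et al. (2005); the function name and parameter values are hypothetical.

```python
import math


def oscillator_step(x, y, frequency, dt):
    """Rotate the state (x, y) by the angle covered in dt seconds."""
    angle = 2.0 * math.pi * frequency * dt
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return cos_a * x - sin_a * y, sin_a * x + cos_a * y


if __name__ == "__main__":
    x, y = 1.0, 0.0  # initial state; no input is provided afterwards
    for step in range(1, 101):
        x, y = oscillator_step(x, y, frequency=1.0, dt=0.01)
        if step % 25 == 0:
            print(step, round(x, 3), round(y, 3))
```

The first state variable can be used as the periodic output that drives a joint, and the frequency parameter sets the rhythm of the resulting behavior.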

A possible strategy for designing the architecture of a network consists in creating architectures analogous to those possessed by animals and humans. An example is constituted by the Darwin X robot (Krichmar et al., 2005). The brain of this robot is constituted by a neural network with 90,000 neurons and 1.4 million connections distributed in 50 neural areas. The connectivity within and among areas reproduces the characteristics of 50 corresponding neural areas of the human brain.

The architecture of the network can also be evolved. This can be realized by evolving genotypes that encode directly or indirectly the connectivity among neurons and the connection weights (Stanley & Miikkulainen, 2002; Stanley, D’Ambrosio & Gauci, 2009; Gauci & Stanley, 2010). Alternatively, it can be realized by evolving the architecture through an evolutionary algorithm and by learning the connection weights through a different algorithm (see for example Real et al., 2017).

10.7 Learn how

Read Section 13.14 and do Exercise 12 to learn how to implement a minimal reinforcement learning algorithm.


Beer R.D. (1995). On the dynamics of small continuous-time recurrent neural networks. Adaptive Behavior 3(4): 469-509.

Bourgine P. & Stewart J. (2006). Autopoiesis and cognition. Artificial Life 10(3): 327-345.

Fukushima K. (1979). Neural network model for a mechanism of pattern recognition unaffected by shift in position - Neocognitron. Trans. IECE, J62-A(10): 658–665.

Gauci J. & Stanley K.O. (2010). Autonomous evolution of topographic regularities in artificial neural networks. Neural Computation, 22: 1860–1898.

Gers F.A. & Schmidhuber J. (2001). LSTM recurrent networks learn simple context free and context sensitive languages. IEEE Transactions on Neural Networks, 12(6): 1333–1340.

Hochreiter S. & Schmidhuber J. (1997). Long Short-Term Memory. Neural Computation, 9(8): 1735–1780. Based on TR FKI-207-95, TUM (1995).

Hochreiter S. (1991). Untersuchungen zu dynamischen neuronalen Netzen. Diploma thesis, Institut fuer Informatik, Lehrstuhl Prof. Brauer, Tech. Univ. Munich.

Husband P., Smith T., Jakobi N. & O’Shea M. (1998). Better living through chemistry: Evolving GasNets for robot control. Connection Science 10(4): 185-210.

Kamimura A., Kurokawa H., Yoshida E., Murata S., Tomita K. & Kokaji S. (2005). Automatic locomotion design and experiments for a modular robotic system. IEEE/ASME Transactions on mechatronics, 10(3), 314-325.

Kingma D.P. & Ba J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

Krichmar J.L., Seth A.K., Nitz D.A., Fleischer J.G. & Edelman G.M. (2005). Spatial navigation and causal analysis in a brain-based device modeling cortical–hippocampal interactions. Neuroinformatics, 5: 197-222.

Petrosino G., Parisi D. & Nolfi S. (2013). Selective attention enables action selection: evidence from evolutionary robotics experiments. Adaptive Behavior, 21(5): 356-370.

Philippides A.O., Husband P., Smith T. & O’Shea M. (2005). Flexible coupling: Diffusing neuromodulators and adaptive robotics. Artificial Life 11(1-2): 139-160.

Rawat W. & Wang Z. (2017). Deep convolutional neural networks for image classification: A comprehensive review. Neural computation, 29(9), 2352-2449.

Real E., Moore S., Selle A., Saxena S., Suematsu Y.L., Tan J., Le Q.V. & Kurakin A. (2017). Large-scale evolution of image classifiers. In D. Precup & Y.W. Teh (Eds.), Proceedings of the 34th International Conference on Machine Learning, pp. 2902–2911.

Schmidhuber J. (2015). Deep learning in neural networks: An overview. Neural Networks, 61: 85-117.

Stanley K.O. & Miikkulainen R. (2002). Evolving neural networks through augmenting topologies. Evolutionary Computation 10: 99–127.

Stanley K.O., D’Ambrosio D.B. & Gauci J.A. (2009). Hypercube-based indirect encoding for evolving large-scale neural networks. Artificial Life 15: 185–212.

Weng J., Ahuja N. & Huang T.S. (1993). Learning recognition and segmentation of 3-D objects from 2-D images. Proceedings of the 4th International Conference on Computer Vision. Berlin, Germany. pp. 121-128.