This is the last post I’m going to write for this particular post-series on Predictive Processing (PP). Here are the links to parts 1, 2, 3, 4, and 5. I’ve already explored how a PP framework can account for folk psychological concepts like beliefs, desires, and emotions; for action, language and ontology, and knowledge; and also for perception, imagination, and reasoning. In this final post for this series, I’m going to explore consciousness itself and how some theories of consciousness fit very nicely within a PP framework.
Consciousness as Prediction (Predicting The Self)
Earlier in this post-series I explored how PP treats perception (whether online or an offline form like imagination) as simply predictions pertaining to incoming sensory information, with varying degrees of precision weighting assigned to the resulting prediction error. In a sense then, consciousness just is prediction. At the very least, it is a subset of these predictions, likely those higher up in the predictive hierarchy. This is all going to depend on which aspect or level of consciousness we are trying to explain, and as philosophers and cognitive scientists well know, consciousness is notoriously difficult to pin down and define in any precise way.
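To make the idea of precision weighting a little more concrete, here’s a minimal toy sketch in Python (my own illustration with made-up numbers, not any canonical PP algorithm) of a single predictive unit estimating a hidden cause from noisy sensory input, where each prediction error is weighted by the estimated precision (inverse variance) of the incoming signal:

```python
import random

# Toy sketch of precision-weighted prediction error (illustrative only).
# A reliable (high-precision) sensory stream moves the model's prediction
# quickly; a noisy (low-precision) stream moves it much more slowly.

def simulate(sensory_noise_sd, n_steps=20, learning_rate=0.1, hidden_cause=5.0):
    precision = 1.0 / sensory_noise_sd ** 2  # assume the precision is known
    prediction = 0.0                         # the model's initial best guess
    for _ in range(n_steps):
        sample = random.gauss(hidden_cause, sensory_noise_sd)  # bottom-up signal
        error = sample - prediction                            # prediction error
        prediction += learning_rate * precision * error        # precision-weighted update
    return prediction

random.seed(0)
print("estimate from reliable input:", round(simulate(sensory_noise_sd=0.5), 2))
print("estimate from noisy input:   ", round(simulate(sensory_noise_sd=3.0), 2))
```

Run it and the low-precision stream pulls the prediction toward the hidden cause far more slowly than the reliable one, which is the kind of reliability-sensitive updating that PP attributes to every level of the predictive hierarchy.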
By consciousness, we could mean any kind of awareness at all, or we could limit the term to systems that are aware of themselves. Either way, we have to be careful here if we’re looking to distinguish between consciousness generally speaking (consciousness in any form, which may be unrecognizable to us) and the unique kind of consciousness that we as human beings experience. If an ant is conscious, it likely doesn’t have any of the richness that we have in our experience, nor is it likely to have self-awareness like we do (whereas a dolphin, with its large neocortex and prefrontal cortex, is far more likely to). So we have to keep these different levels of consciousness in mind in order to properly assess how well any cognitive framework explains them.
Looking through a PP lens, we can see that what we come to know about the world and our own bodily states is a matter of predictive models pertaining to various inferred causal relations. These inferred causal relations ultimately stem from bottom-up sensory input. But when this information is abstracted at higher and higher levels, eventually one can (in principle) get to a point where those higher level models begin to predict the existence of a unified prediction engine. In other words, a subset of the highest-level predictive models may eventually predict itself as a self. We might describe this process as the emergence of some level of self-awareness, even if higher levels of self-awareness aren’t possible unless particular kinds of higher level models have been generated.
What kinds of predictions might be involved with this kind of emergence? Well, we might expect that predictions pertaining to our own autobiographical history, which is largely composed of episodic memories of our past experience, would contribute to this process (e.g. “I remember when I went to that amusement park with my friend Mary, and we both vomited!”). If we begin to infer what is common or continuous between those memories of past experiences (even if only 1 second in the past), we may discover that there is a form of psychological continuity or identity present. And if this psychological continuity (this type of causal relation) is coincident with an incredibly stable set of predictions pertaining to (especially internal) bodily states, then an embodied subject or self can plausibly emerge from it.
This emergence of an embodied self is also likely fueled by predictions pertaining to other objects that we infer to be subjects. For instance, in order to develop a theory of mind about other people, that is, in order to predict how other people will behave, we can’t simply model their external behavior as we can for something like a rock falling down a hill. That can work up to a point, but it’s just not good enough once behaviors become more complex. Animal behavior, most especially that of humans, is far more complex than that of inanimate objects, and so it is far more effective to infer some internal, hidden causes for that behavior. Our predictions work far better if they posit some kind of internal intentionality and goal-directedness operating within the animate object, thereby transforming that object into a subject. And this should be no less true for how we model our own behavior.
If an object that we’ve inferred to be a subject seems to behave in ways that we see ourselves behaving (especially if it’s another human being), then we can begin to infer some kind of equivalence despite our being separate subjects. We can begin to infer that they too have beliefs, desires, and emotions, and thus that they have an internal perspective that we can’t directly access, just as they can’t access ours. We can also come to see ourselves from a different vantage point based on how we see those other subjects from the outside. Since I can’t easily see myself from an external perspective, looking at others and inferring that we are similar kinds of beings lets me begin to see myself as having both an internal and an external side. I can begin to infer that others see me much the way I see them, further adding to my concept of self and the boundaries that define that self.
Multiple meta-cognitive predictions can then be inferred from all of these interactions with others, and from introspective interactions with our brain’s own models. Once this happens, a cognitive agent like ourselves can begin to think about thinking, and to think about being a thinking being, and so on. All of these cognitive moves would seem to provide varying degrees of selfhood or self-awareness, and they can all be thought of as the brain’s best guesses for accounting for its own behavior. In any case, it seems that the level of consciousness intrinsic to an agent’s experience is going to depend on what kinds of higher level models and meta-models are operating within that agent.
Consciousness as Integrated Information
One prominent theory of consciousness is the Integrated Information Theory of Consciousness, otherwise known as IIT. This theory, initially formulated by Giulio Tononi back in 2004 and developed ever since, posits that consciousness ultimately depends on the degree of information integration inherent in the causal properties of a system. Another way of saying this is that the causal system in question is unified, such that every part of the system is able to affect and be affected by the rest of the system. To quantify integration, you consider cutting the system apart in the way that makes the least difference to it (what IIT calls the minimum partition) and measure how much the system’s cause-effect structure changes as a result. If even this least disruptive cut produces a large change in the cause-effect structure, the system has a high degree of information integration, and vice versa. And again, a high degree of integration implies a high degree of consciousness.
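To give a rough feel for the minimum partition idea, here’s a toy sketch in Python (my own illustration, not the actual phi calculus that Tononi and colleagues use). It takes a tiny three-node network of logic gates, severs every possible bipartition by replacing the inputs crossing the cut with noise, and reports how much the least disruptive cut changes the system’s next-state behavior:

```python
import itertools

# Toy illustration of the "minimum partition" idea behind IIT. This is NOT
# the official phi measure, just a sketch under simplifying assumptions.
# Three binary nodes, each updated by a logic gate reading the full state.
GATES = {
    0: lambda s: s[1] ^ s[2],   # node 0 = XOR of nodes 1 and 2
    1: lambda s: s[0] & s[2],   # node 1 = AND of nodes 0 and 2
    2: lambda s: s[0] | s[1],   # node 2 = OR  of nodes 0 and 1
}
N = 3
STATES = list(itertools.product([0, 1], repeat=N))

def full_next(state):
    """Deterministic next state of the intact system."""
    return tuple(GATES[i](state) for i in range(N))

def cut_distribution(state, part):
    """Next-state distribution when the bipartition (part, rest) is severed:
    each node still sees its own side of the cut, but any input coming from
    the other side is replaced with uniform noise."""
    rest = [i for i in range(N) if i not in part]
    dist = {s: 0.0 for s in STATES}
    for noise in STATES:  # enumerate every possible noise assignment
        nxt = []
        for i in range(N):
            own = part if i in part else rest
            seen = tuple(state[j] if j in own else noise[j] for j in range(N))
            nxt.append(GATES[i](seen))
        dist[tuple(nxt)] += 1.0 / len(STATES)
    return dist

def toy_integration(state):
    """How much does the *least* disruptive cut change the system's behavior?
    (Measured as total variation distance; zero would mean no integration.)"""
    intact = {s: 0.0 for s in STATES}
    intact[full_next(state)] = 1.0
    best = float("inf")
    for k in range(1, N):
        for part in itertools.combinations(range(N), k):
            cut = cut_distribution(state, part)
            tv = 0.5 * sum(abs(intact[s] - cut[s]) for s in STATES)
            best = min(best, tv)
    return best

for s in STATES:
    print(s, "->", full_next(s), "  toy integration:", round(toy_integration(s), 3))
```

Real IIT does something far more elaborate with cause-effect repertoires, but the key move illustrated here is the same: integration is quantified by how much the least disruptive partition changes the system’s causal behavior.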
Notice that the integration axiom of IIT implies that a cognitive system which is entirely feed-forward will not be conscious. So if our brain processed incoming sensory information purely from the bottom up, with no top-down generative model feeding back down through the system, then IIT would predict that our brain couldn’t produce consciousness. PP, on the other hand, posits a recurrent feedback architecture (as opposed to a purely feed-forward one), where the bottom-up sensory information flowing upward is met by a downward flow of top-down predictions trying to explain that sensory information away. The brain’s predictions change the resulting prediction error, and this prediction error serves as feedback that modifies the brain’s predictions. Thus, a cognitive architecture like the one suggested by PP is predicted to produce consciousness according to this fundamental axiom of IIT.
Additionally, PP posits cross-modal sensory features and functionality, where the brain integrates predictions (especially lower level ones) spanning various spatio-temporal scales and different sensory modalities into a unified whole. For example, if I am looking at and petting a black cat lying on my lap while hearing it purr, PP posits that my perceptual experience and contextual understanding of that experience are based on having integrated the visual, tactile, and auditory expectations that I’ve come to associate with such an experience of a “black cat”. A unified experience is contingent on a conjunction of predictions occurring simultaneously, rather than on a barrage of millions or billions of separate causal relations (let alone ones stemming from different sensory modalities), or on millions or billions of separate conscious experiences (which would seem to require separate consciousnesses if they were all happening at the same time).
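One simple way to cash this out is precision-weighted cue combination, where each sense contributes to a unified estimate in proportion to how reliable it is. Here’s a tiny sketch in Python with made-up numbers (the single scalar “cat-ness” variable and the particular precisions are purely illustrative assumptions):

```python
# Toy sketch of cross-modal integration as precision-weighted cue combination.
# The modalities, numbers, and single "cat-ness" variable are illustrative.

def fuse(cues):
    """Combine (estimate, precision) pairs from different modalities into one
    unified estimate, weighting each cue by its precision (inverse variance)."""
    total_precision = sum(precision for _, precision in cues)
    fused = sum(estimate * precision for estimate, precision in cues) / total_precision
    return fused, total_precision

cues = [
    (0.9, 4.0),   # vision:  clear view of a black cat, highly reliable
    (0.8, 2.0),   # touch:   fur under my hand, somewhat reliable
    (0.6, 0.5),   # hearing: a purr, but it could be something else
]
estimate, precision = fuse(cues)
print(f"unified estimate: {estimate:.2f} (combined precision {precision:.1f})")
```

The point is just that the fused result is a single estimate, more precise than any individual cue, rather than three separate experiences running in parallel.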
Evolution of Consciousness
Since IIT identifies consciousness with integrated information, it can plausibly account for why consciousness evolved in the first place. The basic idea here is that a brain capable of integrating information is better able to exploit and understand an environment with a complex causal structure on multiple time scales than a brain made of informationally isolated modules. This idea has been tested and confirmed to some degree by artificial life simulations (animats) in which both adaptation and information integration can be tracked. The organism in these simulations was a Braitenberg-like vehicle that had to move through a maze. After 60,000 generations of simulated brains evolving through natural selection, it was found that there was a monotonic relationship between their ability to get through the maze and the amount of simulated information integration in their brains.
This increase in adaptation was the result of an effective increase in the number of concepts that the organism could make use of, given the limited number of elements and connections possible in its cognitive architecture. In other words, given a limited number of connections in a causal system (such as a group of neurons), you can pack more functions per element if the level of integration across those connections is high, thus giving an evolutionary advantage to organisms with higher integration. Therefore, when all else is equal in terms of neural economy and resources, higher integration gives an organism the ability to take advantage of more regularities in its environment.
From a PP perspective, this makes perfect sense, because the complex causal structure of the environment is modeled at many different levels of abstraction and at many different spatio-temporal scales. All of these modeled causal relations are also described as having a hierarchical structure, with models contained within models and many associations existing between them. These associations between models can be accounted for by a cognitive architecture that re-uses certain sets of neurons across multiple models, so the association is effectively instantiated by some literal degree of neuronal overlap. And of course, these associations between predictions at multiple levels allow the brain to exploit (and create!) as many regularities in the environment as possible. In short, both PP and IIT make a lot of practical sense from an evolutionary perspective.
That’s All Folks!
And this concludes my post-series on the Predictive Processing (PP) framework and how I see it as applicable to a far broader account of mentality and brain function than is generally assumed. If there are any takeaways from this post-series, I hope you can at least appreciate the parsimony and explanatory scope of predictive processing, and of viewing the brain as a creative and highly capable prediction engine.