In the first post of this series I introduced some of the basic concepts involved in the Predictive Processing (PP) theory of perception and action. I briefly tied together the notions of belief, desire, emotion, and action from within a PP lens. In this post, I'd like to discuss the relationship between language and ontology through the same framework. I'll also start discussing PP in an evolutionary context, though I'll have more to say about that in future posts in this series.
Active (Bayesian) Inference as a Source for Ontology
One of the main themes within PP is the idea of active (Bayesian) inference whereby we physically interact with the world, sampling it and modifying it in order to reduce our level of uncertainty in our predictions about the causes of the brain’s inputs. Within an evolutionary context, we can see why this form of embodied cognition is an ideal schema for an information processing system to employ in order to maximize chances of survival in our highly interactive world.
In order to reduce the amount of sensory information that has to be processed at any given time, it is far more economical for the brain to only worry about the prediction error that flows upward through the neural system, rather than processing all incoming sensory data from scratch. If the brain is employing a set of predictions that can "explain away" most of the incoming sensory data, then the downward flow of predictions effectively cancels out the upward flow of sensory information it correctly anticipates. The only thing that remains to propagate upward through the system and do any "cognitive work" on the predictive models flowing downward (i.e. the only thing that needs to be processed) is the remaining prediction error, that is, the discrepancy between the actual sensory input and the predicted sensory input. This is similar to data compression strategies for video files (for example) that only encode the information that changes over time (pixels that change in brightness or color) rather than re-encoding the pixels that stay constant from frame to frame.
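To make the compression analogy a bit more concrete, here is a minimal Python sketch of the same idea: only the residual between each frame and its prediction (here, simply the previous frame) is transmitted, and the full stream can still be reconstructed on the receiving end. The frame sizes, pixel values, and the choice of "previous frame as the prediction" are all illustrative assumptions, not anything specific to a particular codec or PP model.

```python
import numpy as np

rng = np.random.default_rng(0)

# A short "video": each frame is a small grid of pixel intensities,
# and most pixels stay the same from one frame to the next.
frames = [rng.integers(0, 256, size=(4, 4))]
for _ in range(3):
    nxt = frames[-1].copy()
    nxt[1, 2] = rng.integers(0, 256)   # only one pixel changes per frame
    frames.append(nxt)

# Predictive coding of the stream: the "prediction" for each frame is simply
# the previous frame, so only the prediction error (the residual) is sent.
first_frame = frames[0]                                 # sent in full, once
errors = [f - p for p, f in zip(frames, frames[1:])]    # actual minus predicted

# The receiver reconstructs every frame from its prediction plus the error.
reconstructed = [first_frame]
for err in errors:
    reconstructed.append(reconstructed[-1] + err)

assert all(np.array_equal(a, b) for a, b in zip(frames, reconstructed))
print("nonzero error entries per frame:", [int(np.count_nonzero(e)) for e in errors])
```

Almost all of each error frame is zero, so almost nothing needs to be transmitted (or, in the brain's case, processed) once the predictions are doing their job.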
The ultimate goal of this strategy, within an evolutionary context, is to allow the organism to understand its environment in the ways most salient to accomplishing goals related to survival. But once humans began to develop culture and evolve culturally, the predictive strategy gained a new kind of evolutionary breathing space, becoming able to predict increasingly complex causal relations and developing technology along the way. All of these inferred causal relations appear to me to be the very source of our ontology, as each hierarchically structured prediction and its ability to become associated with others provides an ideal platform for differentiating between any number of spatio-temporal conceptions and their categorical or logical organization.
An active Bayesian inference system also goes some way toward explaining our intellectual thirst, our curiosity, and our interest in novel experiences, because we learn more about the world (and ourselves) by interacting with it in new ways. In doing so, we are provided with a constant means of fueling and altering our ontology.
Language & Ontology
Language is an important component here too, and it fits naturally within a PP framework: it further links perception and action together, allowing us to make new kinds of predictions about the world that wouldn't have been possible without it. A tool like language also makes a lot of sense from an evolutionary perspective, since better predictions about the world result in a higher chance of survival.
When we use language by speaking or writing, we are performing an action, instantiated by the desire to do so (see the previous post about "desire" within a PP framework). When we interpret language by listening or reading, we are performing a perceptual task, which is again simply another set of predictions (in this case, pertaining to the specific causes leading to our sensory inputs). If we were simply sending and receiving meaningless, non-linguistic signals, then the same basic predictive principles underlying perception and action would still apply, but something new emerges when we send and receive actual language, which carries information. With language, we begin to associate certain sounds and visual patterns with some kind of meaning. Once we can do this, we can effectively share our thoughts with one another, or at least many aspects of them. This provides an enormous evolutionary advantage: we can now communicate almost anything we want to one another, store it in external forms of memory (books, computers, etc.), and further analyze or manipulate the information for various purposes (accounting, inventory, science, mathematics, etc.).
By being able to predict certain causal outcomes through the use of language, we are effectively using the lower-level predictions associated with perceiving and producing language to satisfy higher-level predictions related to more complex goals, including those that extend far into the future. Since the information that is sent and received amounts to testing or modifying our predictions of the world, we are effectively using language to share one brain's set of predictions with another brain and to modulate them. One important aspect of this process is that the information is inherently probabilistic, which is why language so often trips people up with ambiguities, nuances, multiple meanings behind words, and other attributes that lead to misunderstanding. Wittgenstein is one of the more prominent philosophers who caught onto this property of language and its consequences for philosophical problems and for how we see the world as structured. I think a lot of the problems Wittgenstein elaborated on with respect to language can be better accounted for by looking at language as dealing with probabilistic ontological/causal relations that serve some pragmatic purpose, with the meaning of any word or phrase best described by its use rather than by some clear-cut definition.
This probabilistic attribute of language, with the meanings of words having fuzzy boundaries, also tracks very well with the ontology that a brain currently has access to. Our ontology, or the kinds of things we think exist in the world, is often categorized in various ways, with some of the more concrete entities given names such as: "animals", "plants", "rocks", "cats", "cups", "cars", "cities", etc. But if I morph a wooden chair (say, by chipping away at parts of it with a chisel), eventually it will no longer be recognizable as a chair, and it may begin to look more like a table, or like nothing other than an oddly shaped chunk of wood. During this process, it may be difficult to point to the exact moment that it stopped being a chair and became a table or something else, and this makes sense if what we know to be a chair or table or what-have-you is nothing more than a probabilistic high-level prediction about certain causal relations. If my brain perceives an object that produces too high a prediction error under the predictive model of what a "chair" is, then it will try another model (such as the predictive model pertaining to a "table"), potentially moving to models that are less and less specific until it is satisfied with recognizing the object as merely a "chunk of wood".
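As a rough illustration of this fallback from more specific to less specific models, here is a small Python sketch. The feature vectors, the candidate models ("chair", "table", "chunk of wood"), and the error threshold are all invented for the example; nothing here is meant as the actual generative models a brain would use.

```python
import numpy as np

# Hypothetical prototype predictions over a few crude shape features
# (seat-like surface, backrest, legs, woodenness), ordered from most
# specific to least specific.
models = [
    ("chair",         np.array([1.0, 1.0, 1.0, 0.9])),
    ("table",         np.array([1.0, 0.0, 1.0, 0.9])),
    ("chunk of wood", np.array([0.3, 0.2, 0.2, 0.9])),
]

def recognize(observation, threshold=0.1):
    """Return the first (most specific) model whose prediction error is tolerable."""
    for label, prediction in models:
        error = float(np.mean((observation - prediction) ** 2))  # mean squared prediction error
        if error < threshold:
            return label, error
    return "unrecognized", None

# An object that has been chiselled until the backrest is mostly gone:
# the "chair" model now produces too much error, so the brain falls back.
half_chiselled = np.array([0.9, 0.2, 0.9, 0.9])
print(recognize(half_chiselled))   # settles on "table" rather than "chair"
```

The point is only the shape of the process: keep the most specific model whose prediction error stays within tolerance, and retreat to vaguer models when it doesn't.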
From a PP lens, we can consider lower-level predictions pertaining to more basic causes of sensory input (bright/dark regions, lines, colors, curves, edges, etc.) to form basic ontological building blocks; when they are assembled into higher-level predictions, the amount of integrated information increases. This integration process leads to condensed probabilities about increasingly complex causal relations, and it ends up reducing the dimensionality of the cause-effect space of the predicted phenomenon (a set of separate cause-effect repertoires is combined into a smaller number of them).
You can see the advantage here by considering what the brain might do if it's looking at a black cat sitting on a brown chair. What if the brain were to treat this scene as merely a set of pixels on the retina that change over time, with no expectation that any subset of pixels will change in ways that differ from any other subset? This wouldn't be very useful in predicting how the visual scene will change over time. What if instead, the brain differentiates one subset of pixels (those corresponding to what we call a cat) from all the rest, and does this in part by predicting proximity relations between neighboring pixels in the subset (so if some black pixels move from the right to the left visual field, then some number of neighboring black pixels are predicted to move with them)?
This latter method treats the subset of black-colored pixels as a separate object (as opposed to treating the entire visual scene as a single object), and doing this kind of differentiation in more and more complex ways leads to a well-defined object or concept, or a large number of them. Associating sounds like "meow" with this subset of black-colored pixels is just one example of yet another set of properties or predictions that further defines this perceived object as distinct from the rest of the perceptual scene. Associating this object with a visual or auditory label such as "cat" finally links this ontological object with language. As long as we agree on what is generally meant by the word "cat" (which we determine through its use), then we can share and modify the predictive models associated with such an object or concept, as we can with any other successful instance of linguistic communication.
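Here is a toy sketch of the "pixels that move together belong to one object" idea, comparing a grouped hypothesis (the black blob shifts left as a unit) against a structureless one. The two-frame scene and the grouping rule are illustrative assumptions, not a model of what visual cortex actually does.

```python
import numpy as np

# Two frames of a tiny scene: 1 marks black pixels (the "cat"), 0 is background.
frame_t0 = np.array([[0, 0, 1, 1],
                     [0, 0, 1, 1],
                     [0, 0, 0, 0]])
frame_t1 = np.array([[0, 1, 1, 0],
                     [0, 1, 1, 0],
                     [0, 0, 0, 0]])   # the same block of pixels, shifted one step left

def predict_shift_left(frame):
    """Prediction under the hypothesis that the black blob moves one pixel left.
    (np.roll wraps around, which is harmless here because the edge columns are empty.)"""
    return np.roll(frame, -1, axis=1)

# Hypothesis A: the black pixels form one object whose parts move left together.
pred_object = predict_shift_left(frame_t0)
# Hypothesis B: no structure is assumed, so the best guess is "nothing changes".
pred_static = frame_t0

err_object = int(np.sum(np.abs(frame_t1 - pred_object)))
err_static = int(np.sum(np.abs(frame_t1 - pred_static)))
print("grouped-object hypothesis error:", err_object)  # 0: the grouping explains the change
print("no-structure hypothesis error:  ", err_static)  # 4: prediction error remains
```

The grouped hypothesis explains away the change in the scene, which is exactly what makes "cat" a useful piece of ontology to carve out of the pixel stream.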
Language, Context, and Linguistic Relativism
However, it should be noted that as we get to more complex causal relations (more complex concepts/objects), we can no longer give these concepts a simple one-word label and expect to communicate information about them nearly as easily as we could for the concept of a "cat". Think about concepts like "love", "patriotism", or "transcendence": there are many different ways we use those terms and they can mean all sorts of different things, so our meaning behind those words is largely conveyed to others by the context they are used in. Context, within a PP framework, could be described as the (expected) conjunction of multiple predictive models (multiple sets of causal relations). The conjunction of the predictive models pertaining to food, pizza, and a desirable taste implies one particular use of the word "love", the one at work in the phrase "I love pizza". That use of "love" is different from one involving the conjunction of predictive models pertaining to intimacy, sex, infatuation, and care, implied in a phrase like "I love my wife". In any case, conjunctions of predictive models can get complicated, and that complexity carries over to how far we end up stretching our language.
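To make the conjunction idea slightly more concrete, here is a crude Python sketch in which a word's sense is picked out by which other concept-models are co-active. The concept sets, sense labels, and example sentences are invented purely for illustration; in a real system the co-active concepts would themselves be inferred rather than hand-specified.

```python
# Each sense of "love" is characterized by the other predictive models
# (concepts) it is expected to occur in conjunction with.
senses_of_love = {
    "gustatory enjoyment": {"food", "pizza", "taste"},
    "romantic attachment": {"intimacy", "partner", "care"},
}

def disambiguate(active_concepts):
    """Pick the sense whose expected conjunction best overlaps the currently active models."""
    return max(senses_of_love, key=lambda sense: len(senses_of_love[sense] & active_concepts))

# Concepts assumed to be co-activated by each utterance.
print(disambiguate({"pizza", "food", "hunger"}))        # -> gustatory enjoyment ("I love pizza")
print(disambiguate({"partner", "care", "commitment"}))  # -> romantic attachment ("I love my wife")
```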
Since we are immersed in language and since it is integral to our day-to-day lives, we also end up conditioned to think linguistically in a number of ways. For example, we often think with an interior monologue (e.g. "I am hungry and I want pizza for lunch") even when we don't plan on communicating this information to anyone else, so it's not as if we're simply rehearsing what we need to say before we say it. I tend to think, however, that this linguistic thinking (thinking in our native language) is more or less a result of the fact that the causal relations we think about have become so strongly associated with certain linguistic labels and propositions that we automatically think of the causal relations alongside the labels we "hear" in our head. This seems to be true even if the causal relations could, in principle, be thought of without any linguistic labels. We've simply learned to associate them so strongly with one another that in most cases separating the two is just not possible.
On the flip side, this tendency of language to attach itself to our thoughts also puts certain barriers or restrictions on them. If we are always preparing to share our thoughts through language, then we're going to become somewhat entrained to think in ways that can be most easily expressed in linguistic form. So although language may simply be along for the ride with many non-linguistic aspects of thought, our tendency to use it may also structure our thinking and reasoning in significant ways. This would account for why people raised in different cultures with different languages see the world in different ways, in part based on the structure of their language. While linguistic determinism (the strong version of the Sapir-Whorf hypothesis) seems to have been ruled out, there is still strong evidence to support linguistic relativism (the weak version of the Sapir-Whorf hypothesis), whereby one's language affects one's ontology and view of how the world is structured.
If language is so heavily used day-to-day, then this phenomenon makes sense as viewed through a PP lens: we end up putting a high weight on the predictions that link ontology with language, because those predictions have proven useful to us most of the time. Minimal prediction error means that our Bayesian evidence for them is further supported, and the higher the weight carried by these predictions, the more they will constrain our overall thinking, including how our ontology is structured.
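Here is a cartoon of that weighting claim in Python: a prediction that keeps producing small errors gets up-weighted over time, so it increasingly dominates later inference. The update rule, the learning rate, and the numbers are illustrative assumptions, not taken from any particular PP model.

```python
def update_weight(weight, prediction_error, learning_rate=0.2):
    """Increase a prediction's weight when its error is small, decrease it when large."""
    success = 1.0 - min(prediction_error, 1.0)   # 1 = perfect prediction, 0 = total miss
    return (1 - learning_rate) * weight + learning_rate * success

weight = 0.5                                # starting weight on a language-linked prediction
for error in [0.1, 0.05, 0.2, 0.0, 0.1]:    # a run of mostly successful predictions
    weight = update_weight(weight, error)

print(round(weight, 3))                     # ~0.777: the weight drifts upward as the prediction keeps paying off
```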
Moving on…
I think these are but a few of the interesting relationships between language and ontology, and a PP framework helps to put them all together nicely; I just haven't seen this kind of explanatory power and parsimony in any other conceptual framework for how the brain functions. This bodes well for the framework, and it's becoming less and less surprising to see it further supported over time by studies in neuroscience, cognition, and psychology, as well as those pertaining to pathologies of the brain, perceptual illusions, etc. In the next post in this series, I'm going to talk about knowledge and how it can be seen through the lens of PP.