In the previous post, which was part 3 in this series (click here for parts 1 and 2) on Predictive Processing (PP), I discussed how the PP framework can adequately account for traditional and scientific notions of knowledge by treating knowledge as a subset of all the predicted causal relations currently at our brain’s disposal. This subset of predictions that we tend to call knowledge has the special quality of especially high confidence levels (high Bayesian priors). Within a scientific context, knowledge tends to have an even stricter definition (and even higher confidence levels), and so we end up with a smaller subset of predictions: those that have been further verified by comparing them with the inferred predictions of others and by testing them with external instrumentation and some agreed-upon conventions for analysis.
However, no amount of testing or verification is going to give us direct access to any knowledge per se. Rather, the creation or discovery of knowledge has to involve the application of some kind of reasoning to explain the causal inputs, and only after this reasoning process can the resulting predicted causal relations be validated to varying degrees by testing them (through raw sensory data, external instruments, etc.). So getting an adequate account of reasoning within any theory or framework of overall brain function is going to be absolutely crucial, and I think that the PP framework is well-suited for the job. As has already been mentioned throughout this post-series, this framework fundamentally relies on a form of Bayesian inference (or some approximation of it), which is itself a type of reasoning. It is this inferential strategy, combined with a hierarchical neurological structure for it to work upon, that would allow our knowledge to be created in the first place.
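To make this Bayesian backbone a bit more concrete, here’s a minimal toy sketch (my own illustration, not any particular PP implementation, and all the numbers are made up) of how the confidence in a single predicted causal relation might be updated when new evidence arrives, using nothing more than Bayes’ theorem over two competing hypotheses:

```python
# Toy Bayesian update over a single prediction (illustrative numbers only).

def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Posterior probability that the prediction is true, given the prior
    and the likelihood of the observed evidence under each hypothesis."""
    evidence = likelihood_if_true * prior + likelihood_if_false * (1.0 - prior)
    return (likelihood_if_true * prior) / evidence

prior = 0.5  # start undecided about some predicted causal relation
# Observe evidence that is much more likely if the relation actually holds:
posterior = bayes_update(prior, likelihood_if_true=0.9, likelihood_if_false=0.2)
print(round(posterior, 2))  # ~0.82: this becomes the prior for the next update
```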
Rules of Inference & Reasoning Based on Hierarchical-Bayesian Prediction Structure, Neuronal Selection, Associations, and Abstraction
While PP tends to focus on perception and action in particular, I’ve mentioned that I see the same general framework as being able to account not only for the folk psychological concepts of beliefs, desires, and emotions, but also for language and ontology, with the hierarchical predictive structure it entails helping to explain the relationship between the two. It seems reasonable to me that the associations between all of these hierarchically structured beliefs or predicted causal relations, at varying levels of abstraction, can provide a foundation for our reasoning as well, whether in intuitive or logical forms.
To illustrate some of the importance of associations between beliefs, consider an example like the belief in object permanence (i.e. that objects persist or continue to exist even when I can no longer see them). This belief of ours has an extremely high prior because our entire life experience has only served to support this prediction in a large number of ways. As a result, it has become embedded or implicit in a number of other beliefs. If I didn’t predict that object permanence was a feature of my reality, then an enormous number of everyday tasks would become difficult if not impossible, because objects would be treated as if they were blinking into and out of existence.
We have a large number of beliefs that require object permanence (and which are thus associated with object permanence), and so it is a more fundamental, lower-level prediction (though not as low-level as sensory information entering the visual cortex), one we build upon to form any number of higher-level predictions in the overall conceptual/predictive hierarchy. When I put money in a bank, I expect to be able to spend it even if I can’t see it anymore (such as with a check or debit card). This is only possible if my money continues to exist even when out of view (regardless of whether the money is in paper, coin, or electronic form). This is just one of countless everyday tasks that depend on this belief. So it’s no surprise that this belief (this set of predictions) would have an incredibly high Bayesian prior, and therefore I would treat it as a non-negotiable fact about reality.
On the other hand, when I was a newborn infant, I didn’t have this belief of object permanence (or at best, it was a very weak belief). Most psychologists estimate that our belief in object permanence isn’t acquired until after several months of brain development and experience. This would translate to our having a relatively low Bayesian prior for this belief early on in our lives, and only once we begin to form predictions based on these kinds of recognized causal relations can we begin to increase that prior, perhaps eventually reaching a point that results in a subjective experience of a high degree of certainty for this particular belief. From that point on, we are likely to simply take that belief for granted, no longer questioning it. The most important thing to note here is that the more associations made between beliefs, the higher their effective weighting (their priors), and thus the higher our confidence in those beliefs becomes.
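As a toy illustration of how such a prior might climb with experience (my own sketch, with made-up likelihoods standing in for however the infant brain actually weights its evidence), repeatedly feeding confirming observations through a simple Bayesian update drives an initially weak belief toward near-certainty:

```python
# Illustrative only: a weak "object permanence" prior strengthened by
# repeated confirming observations (all numbers are assumptions).

def bayes_update(prior, likelihood_if_true=0.9, likelihood_if_false=0.3):
    evidence = likelihood_if_true * prior + likelihood_if_false * (1.0 - prior)
    return (likelihood_if_true * prior) / evidence

belief = 0.1  # a newborn's weak prior that hidden objects still exist
for n in range(10):  # ten experiences consistent with object permanence
    belief = bayes_update(belief)
    print(f"after experience {n + 1}: {belief:.3f}")
# The prior climbs toward 1.0, at which point the belief is effectively
# taken for granted and no longer questioned.
```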
Neural Implementation, Spontaneous or Random Neural Activity & Generative Model Selection
This all seems pretty reasonable if a neuronal implementation worked to strengthen Bayesian priors as a function of the neuronal/synaptic connectivity (among other factors), where neurons that fire together are more likely to wire together, and where connectivity strength increases the more often this happens. On the flip-side, the less often this happens (or if it isn’t happening at all), the more likely that connectivity is to be weakened or non-existent. So if a concept (or a belief composed of many conceptual relations) is represented by some cluster of interconnected neurons and their activity, then its applicability to other concepts increases its chances of not only firing but also strengthening its wiring with those other clusters of neurons, thus plausibly increasing the Bayesian priors for the overlapping concept or belief.
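A bare-bones sketch of that “fire together, wire together” idea might look something like the following (a deliberate simplification with invented parameters, not a claim about how real cortical plasticity is implemented): connections between co-active clusters get strengthened with use, while unused connections slowly decay.

```python
# Minimal Hebbian-style sketch: co-active "concept clusters" strengthen
# their connections; unused connections decay (all parameters invented).

import numpy as np

rng = np.random.default_rng(0)

n_clusters = 5                                 # imagined concept/belief clusters
weights = np.zeros((n_clusters, n_clusters))   # connection strengths
learning_rate, decay = 0.1, 0.01

for step in range(100):
    # Which clusters happen to be active together during this "experience":
    activity = (rng.random(n_clusters) < 0.4).astype(float)
    # Hebbian strengthening between co-active clusters:
    weights += learning_rate * np.outer(activity, activity)
    # Use-it-or-lose-it: every connection decays a little each step:
    weights *= (1.0 - decay)
    np.fill_diagonal(weights, 0.0)             # ignore self-connections

print(np.round(weights, 2))  # frequently co-active pairs end up most strongly wired
```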
Another likely important factor in the Bayesian inferential process, in terms of the brain forming new generative models or predictive hypotheses to test, is the role of spontaneous or random neural activity and neural cluster generation. This random neural activity could plausibly provide a means for randomly generated predictions or random changes in the pool of predictive models that our brain is able to select from. Similar to the role of random mutation in gene pools, which allows for differential reproductive rates and relative fitness of offspring, some amount of randomness in neural activity and the generative models that result would allow improved models to be naturally selected, favoring those that best minimize prediction error. The ability to minimize prediction error could be seen as a direct measure of the fitness of the generative model within this evolutionary landscape.
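Here’s a crude way to picture that selection process in code (a toy of my own devising, not Edelman’s actual model): a pool of randomly generated “models” (each just a guessed value here) is repeatedly mutated, and the variants whose predictions best minimize prediction error on the incoming data survive to the next round.

```python
# Toy "selection of generative models by prediction error" (illustrative only).

import random

random.seed(1)

HIDDEN_REGULARITY = 7.0  # the causal regularity the brain is trying to predict
data = [HIDDEN_REGULARITY + random.gauss(0, 1) for _ in range(50)]

def prediction_error(model):
    """Fitness measure: mean squared error of this model's prediction."""
    return sum((x - model) ** 2 for x in data) / len(data)

# Start with a pool of randomly generated candidate models:
pool = [random.uniform(0, 20) for _ in range(10)]

for generation in range(30):
    # Random variation: each model spawns a slightly mutated copy...
    pool += [m + random.gauss(0, 0.5) for m in pool]
    # ...and selection keeps the models that best minimize prediction error.
    pool = sorted(pool, key=prediction_error)[:10]

print(round(pool[0], 2))  # settles near 7.0, the error-minimizing prediction
```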
This idea is related to the late Gerald Edelman’s Theory of Neuronal Group Selection (NGS), also known as Neural Darwinism, which I briefly explored in a post I wrote long ago. I’ve long believed that this kind of natural selection process is applicable to a number of different domains (aside from genetics), and I think any viable version of PP is going to depend on it to at least some degree. This random neural activity (and the naturally selected products derived from it) could be thought of as contributing a steady supply of new generative models to choose from, and thus contributing to our overall human creativity as well, whether for reasoning and problem-solving strategies or simply for artistic expression.
Increasing Abstraction, Language, & New Rules of Inference
This kind of use-it-or-lose-it property of brain plasticity, combined with dynamic associations between concepts or beliefs and their underlying predictive structure, would allow the brain to accommodate learning by extracting statistical inferences (at increasing levels of abstraction) as they occur, and by modifying or eliminating those inferences (by changing their hierarchical associative structure) as prediction error is encountered. While some form of Bayesian inference (or an approximation to it) underlies this process, once lower-level inferences about certain causal relations have been made, I believe that new rules of inference can be derived from this basic Bayesian foundation.
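One way to picture the error-driven part of that process is a simple update rule of the following kind (my own bare-bones sketch, with an arbitrary “learning rate” standing in for however heavily the brain actually weights a given prediction error): an expectation at some level of the hierarchy gets nudged whenever it mispredicts the input, and keeps being revised as the statistics of the input change.

```python
# Bare-bones error-driven update of a single expectation (illustrative values).

expected = 0.0            # current prediction at some level of the hierarchy
learning_rate = 0.2       # stand-in for how heavily prediction error is weighted

observations = [5.0, 4.5, 5.5, 5.0, 4.8]   # made-up incoming evidence

for obs in observations:
    error = obs - expected                 # prediction error
    expected += learning_rate * error      # revise the inference to reduce it
    print(round(expected, 2))
# The expectation climbs toward the ~5 average, and would keep converging
# with more data (or be revised again if the input statistics changed).
```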
To see how these new rules of inference might be built up, consider how we acquire a skill like learning how to speak and write in some particular language. The rules of grammar, the syntactic structure, and so forth which underlie any particular language are learned through use. We begin to associate words with certain conceptual structures (see part 2 of this post-series for more details on language and ontology) and then we build up the length and complexity of our linguistic expressions by adding concepts built on higher levels of abstraction. To maximize the productivity and specificity of our expressions, we also learn more complex rules pertaining to the order in which we speak or write various combinations of words (which varies from language to language).
These grammatical rules can be thought of as just another higher-level abstraction, another higher-level causal relation that we predict will convey more specific information to whomever we are speaking to. If it doesn’t seem to do so, then we either modify what we have mistakenly inferred to be those grammatical rules, or, depending on the context, we may simply assume that the person we’re talking to hasn’t conformed to the language or grammar that our community seems to be using.
Just like with grammar (which provides a kind of logical structure to our language), we can begin to learn new rules of inference built on the same probabilistic predictive bedrock of Bayesian inference. We can learn some of these rules explicitly by studying logic, induction, deduction, etc., and consciously applying those rules to infer some new piece of knowledge, or we can learn these kinds of rules implicitly, based on successful predictions (pertaining to behaviors of varying complexity) that happen to result from stumbling upon this method of processing causal relations within various contexts. As mentioned earlier, this would be accomplished in part by the natural selection of randomly generated neural network changes that best reduce the incoming prediction error.
However, language and grammar are interesting examples of an acquired set of rules because they also happen to be the primary tool that we use to learn other rules (along with anything we learn through verbal or written instruction), including (as far as I can tell) various rules of inference. The logical structure of language (though it need not have an exclusively logical structure), its ability to be used for a number of cognitive shortcuts, and its influence on our thought complexity and structure mean that we are likely dependent on it during our reasoning processes as well.
When we perform any kind of conscious reasoning process, we are effectively running various mental simulations in which we can intentionally manipulate our various generative models to test new predictions (new models) at varying levels of abstraction, and thus we also manipulate the linguistic structure associated with those generative models. Since I have a lot more to say on reasoning as it relates to PP, including more on intuitive reasoning in particular, I’m going to expand on this further in my next post in this series, part 5. I’ll also be exploring imagination and memory, including how they relate to the processes of reasoning.