The Open Mind

Cogito Ergo Sum

Archive for the ‘Psycholinguistics’ Category

Predictive Processing: Unlocking the Mysteries of Mind & Body (Part IV)


In the previous post, part 3 in this series (click here for parts 1 and 2) on Predictive Processing (PP), I discussed how the PP framework can adequately account for traditional and scientific notions of knowledge by treating knowledge as a subset of all the predicted causal relations currently at our brain's disposal.  This subset of predictions that we tend to call knowledge has the special quality of especially high confidence levels (high Bayesian priors).  Within a scientific context, knowledge tends to have an even stricter definition (and even higher confidence levels), so we end up with a smaller subset of predictions: those that have been further verified by comparing them with the inferred predictions of others and by testing them with external means of instrumentation and some agreed-upon conventions for analysis.

However, no amount of testing or verification is going to give us direct access to any knowledge per se.  Rather, the creation or discovery of knowledge has to involve the application of some kind of reasoning to explain the causal inputs, and only after this reasoning process can the resulting predicted causal relations be validated to varying degrees by testing them (through raw sensory data, external instruments, etc.).  So getting an adequate account of reasoning within any theory or framework of overall brain function is going to be absolutely crucial, and I think that the PP framework is well-suited for the job.  As has already been mentioned throughout this post-series, this framework fundamentally relies on a form of Bayesian inference (or some approximation of it), which is a type of reasoning.  It is this inferential strategy, then, combined with a hierarchical neurological structure for it to work upon, that would allow our knowledge to be created in the first place.
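To make the Bayesian updating at the heart of this picture concrete, here is a minimal sketch (my own illustration, not from the post, with made-up likelihood numbers): a predicted causal relation begins as an uncertain hypothesis, and each piece of confirming evidence raises its posterior until it crosses whatever confidence threshold we end up labeling "knowledge".

```python
def bayes_update(prior, lik_if_true, lik_if_false):
    """Posterior probability of a hypothesis after one piece of evidence."""
    evidence = lik_if_true * prior + lik_if_false * (1 - prior)
    return (lik_if_true * prior) / evidence

# Hypothetical numbers: the observation is 4x more likely if the predicted
# causal relation holds (0.8) than if it doesn't (0.2).
confidence = 0.5  # initial prior: complete uncertainty
for _ in range(5):  # five confirming observations
    confidence = bayes_update(confidence, 0.8, 0.2)

print(confidence > 0.99)  # a handful of confirmations yields a high prior
```

The exact numbers are arbitrary; the point is only that repeated successful predictions compound multiplicatively on the prior odds.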

Rules of Inference & Reasoning Based on Hierarchical-Bayesian Prediction Structure, Neuronal Selection, Associations, and Abstraction

While PP tends to focus on perception and action in particular, I've mentioned that I see the same general framework as being able to account for not only the folk psychological concepts of beliefs, desires, and emotions, but also that the hierarchical predictive structure it entails should plausibly be able to account for language and ontology and help explain the relationship between the two.  It seems reasonable to me that the associations between all of these hierarchically structured beliefs or predicted causal relations at varying levels of abstraction can provide a foundation for our reasoning as well, whether for intuitive or logical forms of reasoning.

To illustrate some of the importance of associations between beliefs, consider an example like the belief in object permanence (i.e. that objects persist or continue to exist even when I can no longer see them).  This belief of ours has an extremely high prior because our entire life experience has only served to support this prediction in a large number of ways.  This means that it's become embedded or implicit in a number of other beliefs.  If I didn't predict that object permanence was a feature of my reality, then an enormous number of everyday tasks would become difficult if not impossible to do, because objects would be treated as if they were blinking into and out of existence.

We have a large number of beliefs that require object permanence (and which are thus associated with object permanence), and so it is a more fundamental, lower-level prediction (though not as low-level as sensory information entering the visual cortex), one that we build upon to form any number of higher-level predictions in the overall conceptual/predictive hierarchy.  When I put money in a bank, I expect to be able to spend it even if I can't see it anymore (such as with a check or debit card).  This is only possible if my money continues to exist even when out of view (regardless of whether the money is in paper/coin or electronic form).  This is just one of countless everyday tasks that depend on this belief.  So it's no surprise that this belief (this set of predictions) would have an incredibly high Bayesian prior, and therefore I would treat it as a non-negotiable fact about reality.

On the other hand, when I was a newborn infant, I didn't have this belief of object permanence (or at best, it was a very weak belief).  Most psychologists estimate that our belief in object permanence isn't acquired until after several months of brain development and experience.  This would translate to our having a relatively low Bayesian prior for this belief early on in our lives, and only once a person begins to form predictions based on these kinds of recognized causal relations can that prior increase, perhaps eventually reaching a point that results in the subjective experience of a high degree of certainty for this particular belief.  From that point on, we are likely to simply take that belief for granted, no longer questioning it.  The most important thing to note here is that the more associations made between beliefs, the higher their effective weighting (their priors), and thus the higher our confidence in those beliefs becomes.
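One simple way to picture this developmental trajectory (a toy model of mine, not the author's, with invented observation counts) is to treat the object-permanence prior as a Beta-distributed estimate that sharpens as confirming observations accumulate:

```python
def belief_strength(successes, failures, a=1.0, b=1.0):
    """Posterior mean of a Beta(a, b) prior after observed outcomes."""
    return (a + successes) / (a + b + successes + failures)

# Hypothetical counts: an infant has sparse, mixed evidence that objects
# persist out of view; an adult has a lifetime of confirmation.
newborn = belief_strength(successes=2, failures=1)
adult = belief_strength(successes=100000, failures=0)

print(newborn < 0.7 < 0.999 < adult)  # weak prior early, near-certainty later
```

On this picture, "taking the belief for granted" is just what a posterior that close to 1 feels like from the inside.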

Neural Implementation, Spontaneous or Random Neural Activity & Generative Model Selection

This all seems pretty reasonable if a neuronal implementation strengthens Bayesian priors as a function of neuronal/synaptic connectivity (among other factors), where neurons that fire together are more likely to wire together, and where connectivity strength increases the more often this happens.  On the flip side, the less often this happens (or if it isn't happening at all), the more likely the connectivity is to be weakened or non-existent.  So if a concept (or a belief composed of many conceptual relations) is represented by some cluster of interconnected neurons and their activity, then its applicability to other concepts increases its chances of not only firing but also strengthening its wiring with those other clusters of neurons, thus plausibly increasing the Bayesian priors for the overlapping concept or belief.
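The fire-together-wire-together dynamic can be sketched with a standard textbook Hebbian update (my own minimal illustration; the learning rate and decay constants are arbitrary): co-active units strengthen their connection, while unused connections decay toward zero.

```python
import numpy as np

def hebbian_step(weights, activity, lr=0.1, decay=0.01):
    """weights: (n, n) matrix; activity: (n,) firing vector of 0s and 1s."""
    coactivation = np.outer(activity, activity)
    np.fill_diagonal(coactivation, 0)  # no self-connections
    return weights + lr * coactivation - decay * weights

w = np.zeros((3, 3))
for _ in range(20):  # units 0 and 1 repeatedly fire together; unit 2 never does
    w = hebbian_step(w, np.array([1.0, 1.0, 0.0]))

print(w[0, 1] > w[0, 2])  # the co-firing pair ends up strongly wired
```

In PP terms, the strengthened weight between the two clusters would be the substrate of a raised prior for the association they jointly represent.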

Another likely important factor in the Bayesian inferential process, in terms of the brain forming new generative models or predictive hypotheses to test, is the role of spontaneous or random neural activity and neural cluster generation.  This random neural activity could plausibly provide a means for some randomly generated predictions or random changes in the pool of predictive models that our brain is able to select from.  Similar to the role of random mutation in gene pools which allows for differential reproductive rates and relative fitness of offspring, some amount of randomness in neural activity and the generative models that result would allow for improved models to be naturally selected based on those which best minimize prediction error.  The ability to minimize prediction error could be seen as a direct measure of the fitness of the generative model, within this evolutionary landscape.
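The selection dynamic described above can be made concrete with a toy evolutionary loop (my sketch, with an invented one-parameter "model" of a hidden environmental cause): random mutations perturb a pool of candidate models, and the models with the lowest prediction error survive each generation.

```python
import random

random.seed(0)
TRUE_CAUSE = 3.7  # the hidden environmental parameter being predicted

def prediction_error(model):
    return abs(model - TRUE_CAUSE)

pool = [random.uniform(0, 10) for _ in range(20)]  # random initial models
for _ in range(50):
    survivors = sorted(pool, key=prediction_error)[:10]      # selection
    mutants = [m + random.gauss(0, 0.3) for m in survivors]  # random variation
    pool = survivors + mutants

best = min(pool, key=prediction_error)
print(f"best model: {best:.2f}, error: {prediction_error(best):.3f}")
```

Fitness here is literally the negative of prediction error, matching the claim that error minimization measures a generative model's fitness within this landscape.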

This idea is related to the late Gerald Edelman's Theory of Neuronal Group Selection (NGS), also known as Neural Darwinism, which I briefly explored in a post I wrote long ago.  I've long believed that this kind of natural selection process is applicable to a number of different domains (aside from genetics), and I think any viable version of PP is going to depend on it to at least some degree.  This random neural activity (and the naturally selected products derived from it) could be thought of as contributing a steady supply of new generative models to choose from, and thus contributing to our overall human creativity as well, whether for reasoning and problem-solving strategies or simply for artistic expression.

Increasing Abstraction, Language, & New Rules of Inference

This kind of use-it-or-lose-it property of brain plasticity, combined with dynamic associations between concepts or beliefs and their underlying predictive structure, would allow the brain to accommodate learning by extracting statistical inferences (at increasing levels of abstraction) as they occur, and by modifying or eliminating those inferences (by changing their hierarchical associative structure) as prediction error is encountered.  While some form of Bayesian inference (or an approximation to it) underlies this process, once lower-level inferences about certain causal relations have been made, I believe that new rules of inference can be derived from this basic Bayesian foundation.

To see how this might work, consider how we acquire a skill like learning how to speak and write in some particular language.  The rules of grammar, the syntactic structure and so forth which underlie any particular language are learned through use.  We begin to associate words with certain conceptual structures (see part 2 of this post-series for more details on language and ontology) and then we build up the length and complexity of our linguistic expressions by adding concepts built on higher levels of abstraction.  To maximize the productivity and specificity of our expressions, we also learn more complex rules pertaining to the order in which we speak or write various combinations of words (which varies from language to language).

These grammatical rules can be thought of as just another higher-level abstraction, another higher-level causal relation that we predict will convey more specific information to whomever we are speaking to.  If it doesn't seem to do so, then we either modify what we have mistakenly inferred to be those grammatical rules, or, depending on the context, we may simply assume that the person we're talking to hasn't conformed to the language or grammar that our community seems to be using.

Just like with grammar (which provides a kind of logical structure to our language), we can begin to learn new rules of inference built on the same probabilistic predictive bedrock of Bayesian inference.  We can learn some of these rules explicitly by studying logic, induction, deduction, etc., and consciously applying those rules to infer some new piece of knowledge, or we can learn these kinds of rules implicitly based on successful predictions (pertaining to behaviors of varying complexity) that happen to result from stumbling upon this method of processing causal relations within various contexts.  As mentioned earlier, this would be accomplished in part by the natural selection of randomly-generated neural network changes that best reduce the incoming prediction error.

However, language and grammar are interesting examples of an acquired set of rules because they also happen to be the primary tool that we use to learn other rules (along with anything we learn through verbal or written instruction), including (as far as I can tell) various rules of inference.  The logical structure of language (though it need not have an exclusively logical structure), its ability to be used for a number of cognitive short-cuts, and its influence on the complexity and structure of our thoughts all mean that we are likely dependent on it during our reasoning processes as well.

When we perform any kind of conscious reasoning process, we are effectively running various mental simulations where we can intentionally manipulate our various generative models to test new predictions (new models) at varying levels of abstraction, and thus we also manipulate the linguistic structure associated with those generative models as well.  Since I have a lot more to say on reasoning as it relates to PP, including more on intuitive reasoning in particular, I’m going to expand on this further in my next post in this series, part 5.  I’ll also be exploring imagination and memory including how they relate to the processes of reasoning.


Predictive Processing: Unlocking the Mysteries of Mind & Body (Part II)


In the first post of this series I introduced some of the basic concepts involved in the Predictive Processing (PP) theory of perception and action.  I briefly tied together the notions of belief, desire, emotion, and action from within a PP lens.  In this post, I'd like to discuss the relationship between language and ontology through the same framework.  I'll also start talking about PP in an evolutionary context, though I'll have more to say about that in future posts in this series.

Active (Bayesian) Inference as a Source for Ontology

One of the main themes within PP is the idea of active (Bayesian) inference whereby we physically interact with the world, sampling it and modifying it in order to reduce our level of uncertainty in our predictions about the causes of the brain’s inputs.  Within an evolutionary context, we can see why this form of embodied cognition is an ideal schema for an information processing system to employ in order to maximize chances of survival in our highly interactive world.

In order to reduce the amount of sensory information that has to be processed at any given time, it is far more economical for the brain to only worry about the prediction error that flows upward through the neural system, rather than processing all incoming sensory data from scratch.  If the brain employs a set of predictions that can "explain away" most of the incoming sensory data, then the downward flow of predictions effectively cancels out the upward flow of sensory information, and the only thing that remains to propagate upward through the system and do any "cognitive work" on the predictive models flowing downward (i.e. the only thing that needs to be processed) is the remaining prediction error (prediction error = predicted sensory input minus actual sensory input).  This is similar to data compression strategies for video files (for example) that only worry about the information that changes over time (pixels that change brightness/color) and simply compress the information that remains constant (pixels that do not change from frame to frame).
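The compression analogy can be made literal with a few lines of code (my illustration): if the prediction is "nothing changes since the last frame", then the residual between prediction and reality is sparse, and only the mismatch needs to propagate.

```python
import numpy as np

# An 8x8 "retina": the next frame differs from the previous one by a
# single pixel brightening.
prev_frame = np.zeros((8, 8), dtype=int)
next_frame = prev_frame.copy()
next_frame[2, 3] = 255

prediction = prev_frame             # predict "nothing changes"
residual = next_frame - prediction  # the prediction error

print(np.count_nonzero(residual))   # only 1 of 64 values must propagate upward
```

Everything the prediction explained away (the other 63 values) costs nothing further to process, which is the economy the paragraph above describes.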

The ultimate goal for this strategy within an evolutionary context is to allow the organism to understand its environment in the most salient ways for the pragmatic purposes of accomplishing goals relating to survival.  But once humans began to develop culture and evolve culturally, the predictive strategy gained a new kind of evolutionary breathing space, being able to predict increasingly complex causal relations and developing technology along the way.  All of these inferred causal relations appear to me to be the very source of our ontology, as each hierarchically structured prediction and its ability to become associated with others provides an ideal platform for differentiating between any number of spatio-temporal conceptions and their categorical or logical organization.

An active Bayesian inference system is also ideal to explain our intellectual thirst, human curiosity, and interest in novel experiences (to some degree), because we learn more about the world (and ourselves) by interacting with it in new ways.  In doing so, we are provided with a constant means of fueling and altering our ontology.

Language & Ontology

Language is an important component here too, and it fits well within a PP framework, since it serves to further link perception and action together in a very important way, allowing us to make new kinds of predictions about the world that wouldn't have been possible without it.  A tool like language also makes a lot of sense from an evolutionary perspective, since better predictions about the world result in a higher chance of survival.

When we use language by speaking or writing it, we are performing an action which is instantiated by the desire to do so (see previous post about "desire" within a PP framework).  When we interpret language by listening to it or by reading, we are performing a perceptual task which is again simply another set of predictions (in this case, pertaining to the specific causes leading to our sensory inputs).  If we were simply sending and receiving non-lingual nonsense, then the same basic predictive principles underlying perception and action would still apply, but something new emerges when we send and receive actual language (which contains information).  With language, we begin to associate certain sounds and visual information with some kind of meaning or meaningful information.  Once we can do this, we can effectively share our thoughts with one another, or at least many aspects of them.  This provides an enormous evolutionary advantage, as now we can communicate almost anything we want to one another, store it in external forms of memory (books, computers, etc.), and further analyze or manipulate the information for various purposes (accounting, inventory, science, mathematics, etc.).

By being able to predict certain causal outcomes through the use of language, we are effectively using the lower-level predictions associated with perceiving and emitting language to satisfy higher-level predictions related to more complex goals, including those that extend far into the future.  Since the information that is sent and received amounts to testing or modifying our predictions of the world, we are effectively using language to share and modulate one brain's set of predictions with that of another brain.  One important aspect of this process is that this information is inherently probabilistic, which is why language often trips people up with ambiguities, nuances, multiple meanings behind words, and other attributes of language that often lead to misunderstanding.  Wittgenstein is one of the more prominent philosophers who caught onto this property of language and its consequences for philosophical problems and how we see the world structured.  I think a lot of the problems Wittgenstein elaborated on with respect to language can be better accounted for by looking at language as dealing with probabilistic ontological/causal relations that serve some pragmatic purpose, with the meaning of any word or phrase best described by its use rather than by some clear-cut definition.

This probabilistic attribute of language, in terms of the meanings of words having fuzzy boundaries, also tracks very well with the ontology that a brain currently has access to.  Our ontology, or what kinds of things we think exist in the world, is often categorized in various ways, with some of the more concrete entities given names such as "animals", "plants", "rocks", "cats", "cups", "cars", "cities", etc.  But if I morph a wooden chair (say, by chipping away at parts of it with a chisel), eventually it will no longer be recognizable as a chair, and it may begin to look more like a table than a chair, or like nothing other than an oddly shaped chunk of wood.  During this process, it may be difficult to point to the exact moment that it stopped being a chair and instead became a table or something else, and this would make sense if what we know to be a chair or table or what-have-you is nothing more than a probabilistic high-level prediction about certain causal relations.  If my brain perceives an object that produces too high of a prediction error based on the predictive model of what a "chair" is, then it will try another model (such as the predictive model pertaining to a "table"), potentially leading to models that are less and less specific until it is satisfied with recognizing the object as merely a "chunk of wood".
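That fallback process can be caricatured in a few lines (my own toy version; the feature templates are invented for illustration): try increasingly generic models until one fits the observed features with acceptably low prediction error.

```python
# Hypothetical feature templates: what each model predicts it will see.
MODELS = {
    "chair":         {"legs", "seat", "backrest", "wood"},
    "table":         {"legs", "flat_top", "wood"},
    "chunk of wood": {"wood"},
}
SPECIFICITY = ["chair", "table", "chunk of wood"]  # most to least specific

def recognize(observed, max_error=1):
    """Return the most specific model whose feature mismatch is tolerable."""
    for name in SPECIFICITY:
        error = len(MODELS[name] ^ observed)  # count of mismatched features
        if error <= max_error:
            return name
    return "unknown"

print(recognize({"legs", "seat", "backrest", "wood"}))  # matches "chair"
print(recognize({"legs", "flat_top", "wood"}))          # falls back to "table"
print(recognize({"wood"}))                              # only "chunk of wood" fits
```

The gradual chiseling of the chair corresponds to the observed feature set drifting until the "chair" model's error exceeds tolerance and a less specific model wins.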

From a PP lens, we can consider lower level predictions pertaining to more basic causes of sensory input (bright/dark regions, lines, colors, curves, edges, etc.) to form some basic ontological building blocks and when they are assembled into higher level predictions, the amount of integrated information increases.  This information integration process leads to condensed probabilities about increasingly complex causal relations, and this ends up reducing the dimensionality of the cause-effect space of the predicted phenomenon (where a set of separate cause-effect repertoires are combined into a smaller number of them).

You can see the advantage here by considering what the brain might do if it's looking at a black cat sitting on a brown chair.  What if the brain were to look at this scene as merely a set of pixels on the retina that change over time, where there are no expectations for any subset of pixels to change in ways that differ from any other subset?  This wouldn't be very useful in predicting how the visual scene will change over time.  What if instead, the brain differentiates one subset of pixels (that correspond to what we call a cat) from all the rest of the pixels, and it does this in part by predicting proximity relations between neighboring pixels in the subset (so if some black pixels move from the right to the left visual field, then some number of neighboring black pixels are predicted to move with it)?

This latter method treats the subset of black-colored pixels as a separate object (as opposed to treating the entire visual scene as a single object), and doing this kind of differentiation in more and more complex ways leads to a well-defined object or concept, or a large number of them.  Associating sounds like "meow" with this subset of black-colored pixels is just one example of yet another set of properties or predictions that further defines this perceived object as distinct from the rest of the perceptual scene.  Associating this object with a visual or auditory label such as "cat" finally links this ontological object with language.  As long as we agree on what is generally meant by the word "cat" (which we determine through its use), then we can share and modify the predictive models associated with such an object or concept, as we can do with any other successful instance of linguistic communication.

Language, Context, and Linguistic Relativism

However, it should be noted that as we get to more complex causal relations (more complex concepts/objects), we can no longer give these concepts a simple one-word label and expect to communicate information about them nearly as easily as we could for the concept of a "cat".  Think about concepts like "love" or "patriotism" or "transcendence": there are many different ways that we use those terms and all sorts of different things they can mean, so our meaning behind those words will be heavily conveyed to others by the context in which they are used.  Context, in a PP framework, could be described as simply the (expected) conjunction of multiple predictive models (multiple sets of causal relations).  For example, the conjunction of the predictive models pertaining to the concepts of food, pizza, and a desirable taste implies one particular use of the word "love", the one found in the phrase "I love pizza".  This use of the word "love" is different from one which involves the conjunction of predictive models pertaining to the concepts of intimacy, sex, infatuation, and care, implied in a phrase like "I love my wife".  In any case, conjunctions of predictive models can get complicated, and this carries over to our stretching our language to its very limits.
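The disambiguation-by-conjunction idea can be sketched as follows (an illustrative toy of mine; the sense labels and model sets are made up): the sense of an ambiguous word is picked by whichever conjunction of co-active predictive models overlaps the current context most.

```python
# Hypothetical predictive-model sets associated with two senses of "love".
SENSES = {
    "love/appetitive": {"food", "taste", "desire"},
    "love/romantic":   {"intimacy", "care", "infatuation"},
}

def disambiguate(context):
    """Pick the sense whose associated models best overlap the active context."""
    return max(SENSES, key=lambda sense: len(SENSES[sense] & context))

print(disambiguate({"food", "pizza", "taste"}))      # the "I love pizza" sense
print(disambiguate({"intimacy", "care", "spouse"}))  # the "I love my wife" sense
```

A real implementation would weight the overlap probabilistically rather than count it, but the structure is the same: context is just which other models happen to be active.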

Since we are immersed in language and since it is integral in our day-to-day lives, we also end up being conditioned to think linguistically in a number of ways.  For example, we often think with an interior monologue (e.g. "I am hungry and I want pizza for lunch") even when we don't plan on communicating this information to anyone else, so it's not as if we're simply rehearsing what we need to say before we say it.  I tend to think, however, that this linguistic thinking (thinking in our native language) is more or less a result of the fact that the causal relations we think about have become so strongly associated with certain linguistic labels and propositions that we sort of automatically think of the causal relations alongside the labels we "hear" in our head.  This seems to be true even if the causal relations could be thought of without any linguistic labels, in principle at least.  We've simply learned to associate them so strongly with one another that in most cases separating the two is just not possible.

On the flip side, this tendency of language to associate itself with our thoughts also puts certain barriers or restrictions on our thoughts.  If we are always preparing to share our thoughts through language, then we're going to become somewhat entrained to think in ways that can be most easily expressed in a linguistic form.  So although language may simply be along for the ride with many non-linguistic aspects of thought, our tendency to use it may also structure our thinking and reasoning in significant ways.  This would account for why people raised in different cultures with different languages see the world in different ways based on the structure of their language.  While linguistic determinism seems to have been ruled out (the strong version of the Sapir-Whorf hypothesis), there is still strong evidence to support linguistic relativism (the weak version of the Sapir-Whorf hypothesis), whereby one's language affects one's ontology and view of how the world is structured.

If language is so heavily used day-to-day, then this phenomenon makes sense through a PP lens: we end up putting a high weight on the predictions that link ontology with language, because these predictions have been demonstrated to be useful most of the time.  Minimal prediction error means that our Bayesian evidence is further supported, and the higher the weight carried by these predictions, the more they will restrict our overall thinking, including how our ontology is structured.

Moving on…

I think that these are but a few of the interesting relationships between language and ontology, and the PP framework helps to put it all together nicely; I just haven't seen this kind of explanatory power and parsimony in any other conceptual framework of how the brain functions.  This bodes well for the framework, and it's becoming less and less surprising to see it further supported over time by studies in neuroscience, cognition, and psychology, and by those pertaining to pathologies of the brain, perceptual illusions, etc.  In the next post in this series, I'm going to talk about knowledge and how it can be seen through the lens of PP.

Predictive Processing: Unlocking the Mysteries of Mind & Body (Part I)


I've been away from writing for a while because I've had some health problems relating to my neck.  A few weeks ago I had double-cervical-disc replacement surgery, and so I've been unable to write and respond to comments and so forth for a little while.  I'm in the second week following my surgery now and have finally been able to get back to writing, which feels very good given that I'm unable to lift or resume martial arts for the time being.  Anyway, I want to resume my course of writing beginning with a post-series that pertains to Predictive Processing (PP) and the Bayesian brain.  I wrote one post on this topic a little over a year ago (which can be found here), and I've been extremely interested in the topic for the last several years now.

The Predictive Processing (PP) theory of perception shows a lot of promise in terms of finding an overarching schema that can account for everything that the brain seems to do.  While its technical application is to account for the acts of perception and active inference in particular, I think it can be used more broadly to account for other descriptions of our mental life such as beliefs (and knowledge), desires, emotions, language, reasoning, cognitive biases, and even consciousness itself.  I want to explore some of these relationships as viewed through a PP lens more because I think it is the key framework needed to reconcile all of these aspects into one coherent picture, especially within the evolutionary context of an organism driven to survive.  Let’s begin this post-series by first looking at how PP relates to perception (including imagination), beliefs, emotions, and desires (and by extension, the actions resulting from particular desires).

Within a PP framework, beliefs can be best described as simply the set of particular predictions that the brain employs which encompass perception, desires, action, emotion, etc., and which are ultimately mediated and updated in order to reduce prediction errors based on incoming sensory evidence (and which approximates a Bayesian form of inference).  Perception then, which is constituted by a subset of all our beliefs (with many of them being implicit or unconscious beliefs), is more or less a form of controlled hallucination in the sense that what we consciously perceive is not the actual sensory evidence itself (not even after processing it), but rather our brain’s “best guess” of what the causes for the incoming sensory evidence are.

Desires can be best described as another subset of one’s beliefs, and a set of beliefs which has the special characteristic of being able to drive action or physical behavior in some way (whether driving internal bodily states, or external ones that move the body in various ways).  Finally, emotions can be thought of as predictions pertaining to the causes of internal bodily states and which may be driven or changed by changes in other beliefs (including changes in desires or perceptions).

When we believe something to be true or false, we are basically just modeling some kind of causal relationship (or its negation) which is able to manifest itself in a number of highly-weighted predicted perceptions and actions.  When we believe something to be likely true or likely false, the same principle applies but with a lower weight or precision on the predictions, corresponding directly to the degree of belief or disbelief (and so new sensory evidence will more easily sway such a belief).  And just like our perceptions, which are mediated by a number of low and high-level predictions pertaining to incoming sensory data, any prediction error that the brain encounters results in either updating the perceptual predictions to new ones that better reduce the prediction error and/or performing some physical action that reduces the prediction error (e.g. rotating your head, moving your eyes, reaching for an object, excreting hormones in your body, etc.).

In all these cases, we can describe the brain as having some set of Bayesian prior probabilities pertaining to the causes of incoming sensory data, with these priors changing over time in response to prediction errors arising from new incoming sensory evidence that fails to be "explained away" by the predictive models currently employed.  Strong beliefs are associated with high prior probabilities (highly-weighted predictions) and therefore need much more countervailing sensory evidence to be overcome or modified than weak beliefs, which have relatively low priors (low-weighted predictions).
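This asymmetry between strong and weak beliefs falls straight out of Bayes' rule, which we can check numerically (my own sketch, with made-up likelihoods): count how many pieces of disconfirming evidence it takes to pull a belief below 50% confidence, starting from a strong (0.99) versus a weak (0.60) prior.

```python
def updates_to_doubt(prior, lik_if_true=0.3, lik_if_false=0.7):
    """Each observation is disconfirming: more likely if the belief is false."""
    p, n = prior, 0
    while p >= 0.5:
        p = (lik_if_true * p) / (lik_if_true * p + lik_if_false * (1 - p))
        n += 1
    return n

strong = updates_to_doubt(0.99)  # a highly-weighted belief
weak = updates_to_doubt(0.60)    # a low-weighted belief

print(strong > weak)  # the strong belief survives far more counter-evidence
```

With these particular likelihoods the weak belief flips after a single observation, while the strong one takes several; the exact counts depend on the invented numbers, but the ordering is guaranteed.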

To illustrate some of these concepts, let’s consider a belief like “apples are a tasty food”.  This belief can be broken down into a number of lower level, highly-weighted predictions such as the prediction that eating a piece of what we call an “apple” will most likely result in qualia that accompany the perception of a particular satisfying taste, the lower level prediction that doing so will also cause my perception of hunger to change, and the higher level prediction that it will “give me energy” (with these latter two predictions stemming from the more basic category of “food” contained in the belief).  Another prediction or set of predictions is that these expectations will apply to not just one apple but a number of apples (different instances of one type of apple, or different types of apples altogether), and a host of other predictions.

These predictions may even result (in combination with other perceptions or beliefs) in an actual desire to eat an apple, which, under a PP lens, could be described as the highly-weighted prediction of what it would feel like to find an apple, to reach for it, to grab it, to bite off a piece, to chew it, and to swallow it.  If I merely imagine doing such things, the resulting predictions will necessarily carry such a small weight that they won’t be able to influence any actual motor actions (even if these imagined perceptions are able to influence other predictions that may eventually lead to some plan of action).  Imagined perceptions will also not carry enough weight (when my brain is functioning normally, at least) to trick me into thinking that they are actual perceptions (by “actual”, I simply mean perceptions that correspond to incoming sensory data).  This low-weighting attribute of imagined perceptual predictions thus provides a viable way for us to have an imagination and to distinguish it both from perceptions corresponding to incoming sensory data and from predictions that directly cause bodily action.  On the other hand, predictions that are weighted highly enough (among other factors) will be uniquely capable of affecting our perception of the real world and/or instantiating action.

This latter case of desire and action shows how the PP model takes the organism to be an embodied prediction machine, directly influencing and being influenced by the world its body interacts with, with the ultimate goal of reducing any prediction error encountered (which can be thought of as maximizing Bayesian evidence).  In this particular example, the highly-weighted prediction of eating an apple is simply another way of describing a desire to eat an apple, and it produces some degree of prediction error until the proper actions have taken place to reduce that error.  There are only two ways of reducing this prediction error: change (or eliminate) the desire so that it no longer involves eating an apple, and/or perform bodily actions that result in actually eating an apple.
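These two routes to error reduction can be sketched numerically (a hypothetical toy model of my own; the function names, rates, and one-dimensional state are invented for illustration, and real PP accounts involve hierarchical, precision-weighted updates):

```python
# Hypothetical sketch: the two routes for reducing prediction error.
# Either the prediction is revised to match the world ("perceive"),
# or action changes the world to match the prediction ("act");
# both shrink the error toward zero.

def reduce_error(predicted, actual, mode, rate=0.5, steps=10):
    for _ in range(steps):
        error = predicted - actual
        if mode == "perceive":
            predicted -= rate * error  # update the model toward the world
        else:
            actual += rate * error     # act: move the world toward the model
    return predicted, actual

# Start with a desire (prediction) that mismatches the current state:
p1, a1 = reduce_error(predicted=1.0, actual=0.0, mode="perceive")
p2, a2 = reduce_error(predicted=1.0, actual=0.0, mode="act")
# In both cases |predicted - actual| ends up near zero, but only in
# "act" mode does the prediction itself (the desire) survive unchanged.
```

The symmetry here is the point: from the model’s perspective, giving up the desire and satisfying it are both ways of making the prediction and the world agree.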

Perhaps if I realize that I don’t have any apples in my house but do have bananas, then my desire will change to one that predicts my eating a banana instead.  Another way of saying this is that my higher-weighted prediction of satisfying hunger supersedes my prediction of eating an apple specifically; thus one desire is able to supersede another.  However, if the prediction weight associated with my desire to eat an apple is high enough, my predictions may motivate me to avoid eating the banana and instead to predict what it is like to walk out of my house, go to the store, and actually get an apple (and therefore, to actually do so).  Furthermore, it may motivate me to predict actions that lead me to earn the money needed to purchase the apple (if I don’t already have it).  To do this, I would be employing a number of predictions having to do with performing actions that lead to my obtaining money, using money to purchase goods, etc.

This is but a taste of what PP has to offer, and of how we can look at basic concepts within folk psychology, cognitive science, and theories of mind in a new light.  Associated with all of these beliefs, desires, emotions, and actions (which again, are simply different kinds of predictions under this framework) are a number of elements pertaining to ontology (i.e. what kinds of things we think exist in the world) and to language as well, and I’d like to explore this relationship in my next post.  This link can be found here.

The Co-Evolution of Language and Complex Thought


Language appears to be the most profound feature that has arisen during the evolution of the human mind.  This feature of humanity has led to incredible thought complexity, while also providing the foundation for even the simplest thoughts imaginable.  Many of us may wonder how exactly language is related to thought, and how the evolution of language has affected the evolution of thought complexity.  In this post, I plan to discuss what I believe to be some evolutionary aspects of psycholinguistics.

Mental Languages

It is clear that humans think in some form of language, whether it is accomplished as an interior monologue using our native spoken language and/or some form of what many call “mentalese” (i.e. a means of thinking about concepts and propositions without the use of words).  Our thoughts are likely accomplished by a combination of these two “types” of language.  The fact that young infants and aphasics (for example) are able to think implies that not all thought is accomplished through a spoken language.  It is also likely that the aforementioned “mentalese” is some innate form of mental symbolic representation that is primary in some sense, supported by the fact that it appears to be necessary in order for spoken language to develop or exist at all.  The fact that words and sentences (at least non-iconic ones) lack any intrinsic semantic content or value illustrates that this “mentalese” is in fact a prerequisite for understanding or assigning the meaning of words and sentences.  Complex words can always be defined by a number of less complex words, but at some point a limit is reached whereby the simplest units of definition are composed of seemingly irreducible concepts and propositions.  Furthermore, those irreducible concepts and propositions do not require any words to have meaning (for if they did, we would have an infinite regress of words being defined by words being defined by words, ad infinitum).  The only time we theoretically require symbolic representation of semantic content using words is when the concepts are to be communicated to others (easily, if at all).

While some form of mentalese is likely the foundation or even ultimate form of thought, it is my contention that communicable language has likely had a considerable impact on the evolution of the human mind: not only in the most trivial or obvious way, whereby communicated words affect our thoughts (e.g. through inspiration, imagination, and/or reflection on new knowledge or perspectives), but also by serving as a secondary multidimensional medium for symbolic representation. That is, spoken language (as well as its subsequent offshoot, written language) has provided a form of combinatorial leverage somewhat independent of (although working in harmony with) the mental or cognitive faculties that innately exist for thought.

To be sure, spoken language has likely co-evolved with our mentalese, as they seem to affect one another in various ways.  As new types or combinations of propositions and concepts are discovered, the spoken language has to adapt in order to make those new propositions and concepts communicable to others.  What interests me more however, is how communicable language (spoken or written) has affected the evolution of thought complexity itself.

Communicable Language and Thought Complexity

Words and sentences, which developed primarily in order to communicate instances of our mental language to others, have also undoubtedly provided a secondary multidimensional medium for symbolic representation.  For example, words allow us to compress a large amount of information (i.e. many concepts and propositions) into small tokens of varying density.  This type of compression has provided a way to maximize our use of short-term and long-term memory, allowing more complex thoughts and mental capabilities to develop (whether that increase in complexity is defined as longer strings of concepts or propositions, or otherwise).

When we think of a sentence to ourselves, we end up utilizing a phonological/auditory loop, whereby we can better handle and organize information at any single moment by internally “hearing” it.  We can also visualize the words in multiple ways, including what the mouth movements of people speaking those words would look like (and we can use our tactile and/or motor memory to mentally simulate how our mouth feels when these words are spoken); and if a written form of the communicable language exists, we can actually visualize the words as they would appear in their written form (again using tactile/motor memory to mentally simulate how it feels to write those words).  On top of this, we can often visualize each glyph in multiple formats (i.e. different sizes, shapes, fonts, etc.).  This has provided a multidimensional memory tool, because it serves to represent the semantic information in a way that our brain can perceive and associate with multiple senses (in this case through our auditory, visual, and somatosensory cortices).  In some cases, when a particular written language uses iconic glyphs (as opposed to arbitrary symbols), the user can also visualize the concept represented by the symbol in an idealized fashion.  Associating information with multiple cognitive faculties or perceptual systems means that more neural network patterns of the brain will be involved with the attainment, retention, and recall of that information.  For those of us who have successfully used various mnemonic devices and other memory-enhancing “tricks”, the efficacy and importance of communicable language, and its relationship to how we think about and combine various concepts and propositions, is clear.

By enhancing our memory, communicable language has served as an epistemic catalyst, allowing us to build upon our previous knowledge in ways that would likely have been impossible without it.  Once written language was developed, we were no longer limited by our own short-term and long-term memory, for we had a way of recording as much information as we pleased, and this also allowed us to better formulate new ideas and consider thoughts that would otherwise have been too complex to mentally visualize or keep track of.  Mathematics, for example, increased exponentially in complexity once we were able to represent the relationships between variables in written form.  While previously we would have been limited by our short-term and long-term memory, written language allowed us to eventually formulate incredibly long (sometimes painfully long) mathematical expressions.  Once written language was converted further into an electro-mechanical language (i.e. through the use of computers), our “writing” mediums, information-acquisition mechanisms, and pattern-recognition capabilities were enhanced exponentially once more, providing yet another platform for increased epistemic or cognitive “breathing space”.  If our brains underwent particular mutations after communicable language evolved, this may have provided a way to ratchet our way into entirely new cognitive niches or capabilities.  That is, by providing us with new strings of concepts and propositions, communicable language may have created an unprecedented natural selection pressure or opportunity (if an advantageous brain mutation accompanied this new cognitive input) for our brains to obtain an entirely new and possibly more complex fundamental concept or way of thinking.

Summary

It seems evident to me that communicable language, once it had developed, served as an extremely important epistemic catalyst and multidimensional cognitive tool that likely had a great influence on the evolution of the human brain.  While some form of mentalese was likely a prerequisite and precursor to any subsequent forms of communicable language, the cognitive “breathing space” that communicable language provided seems to have had a marked impact on the evolution of human thought complexity, and on the amount of knowledge that we’ve been able to obtain from the world around us.  I have no doubt that the current external linguistic tools we use (i.e. written and electronic forms of handling information) will continue to significantly alter the ongoing evolution of the human mind.  Our biological forms of memory will likely adapt in order to work more efficiently with those external media.  Likewise, our increasing access to new types of information may have provided (and may continue to provide) a natural selective pressure or opportunity for our brains to evolve in order to think about entirely new and potentially more complex concepts, thereby periodically increasing the lexicon or conceptual database of our “mentalese” (assuming that those new concepts provide a survival/reproductive advantage).