The Open Mind

Cogito Ergo Sum

Archive for the ‘Memory’ Category

Predictive Processing: Unlocking the Mysteries of Mind & Body (Part V)

leave a comment »

In the previous post, part 4 in this series on Predictive Processing (PP), I explored some aspects of reasoning and how different forms of reasoning can be built from a foundational bedrock of Bayesian inference (click here for parts 1, 2, or 3).  This has a lot to do with language, but I also claimed that it depends on how the brain is likely generating new models, which I think is likely to involve some kind of natural selection operating on neural networks.  The hierarchical structure of the generative models for these predictions as described within a PP framework, also seems to fit well with the hierarchical structure that we find in the brain’s neural networks.  In this post, I’m going to talk about the relation between memory, imagination, and unconscious and conscious forms of reasoning.

Memory, Imagination, and Reasoning

Memory is of course crucial to the PP framework whether for constructing real-time predictions of incoming sensory information (for perception) or for long-term predictions involving high-level, increasingly abstract generative models that allow us to accomplish complex future goals (like planning to go grocery shopping, or planning for retirement).  Either case requires the brain to have stored some kind of information pertaining to predicted causal relations.  Rather than memories being some kind of exact copy of past experiences (where they’d be stored like data on a computer), research has shown that memory functions more like a reconstruction of those past experiences which are modified by current knowledge and context, and produced by some of the same faculties used in imagination.

This accounts for any false or erroneous aspects of our memories, where the recalled memory can differ substantially from how the original event was experienced.  It also accounts for why our memories become increasingly altered as more time passes.  Over time, we learn new things, continuing to change many of our predictive models about the world, and thus have a more involved reconstructive process the older the memories are.  And the context we find ourselves in when trying to recall certain memories, further affect this reconstruction process, adapting our memories in some sense to better match what we find most salient and relevant in the present moment.

Conscious vs. Unconscious Processing & Intuitive Reasoning (Intuition)

Another attribute of memory is that it is primarily unconscious, where we seem to have this pool of information that is kept out of consciousness until parts of it are needed (during memory recall or or other conscious thought processes).  In fact, within the PP framework we can think of most of our generative models (predictions), especially those operating in the lower levels of the hierarchy, as being out of our conscious awareness as well.  However, since our memories are composed of (or reconstructed with) many higher level predictions, and since only a limited number of them can enter our conscious awareness at any moment, this implies that most of the higher-level predictions are also being maintained or processed unconsciously as well.

It’s worth noting however that when we were first forming these memories, a lot of the information was in our consciousness (the higher-level, more abstract predictions in particular).  Within PP, consciousness plays a special role since our attention modifies what is called the precision weight (or synaptic gain) on any prediction error that flows upward through the predictive hierarchy.  This means that the prediction errors produced from the incoming sensory information or at even higher levels of processing are able to have a greater impact on modifying and updating the predictive models.  This makes sense from an evolutionary perspective, where we can ration our cognitive resources in a more adaptable way, by allowing things that catch our attention (which may be more important to our survival prospects) to have the greatest effect on how we understand the world around us and how we need to act at any given moment.

After repeatedly encountering certain predicted causal relations in a conscious fashion, the more likely those predictions can become automated or unconsciously processed.  And if this has happened with certain rules of inference that govern how we manipulate and process many of our predictive models, it seems reasonable to suspect that this would contribute to what we call our intuitive reasoning (or intuition).  After all, intuition seems to give people the sense of knowing something without knowing how it was acquired and without any present conscious process of reasoning.

This is similar to muscle memory or procedural memory (like learning how to ride a bike) which is consciously processed at first (thus involving many parts of the cerebral cortex), but after enough repetition it becomes a faster and more automated process that is accomplished more economically and efficiently by the basal ganglia and cerebellum, parts of the brain that are believed to handle a great deal of unconscious processing like that needed for procedural memory.  This would mean that the predictions associated with these kinds of causal relations begin to function out of our consciousness, even if the same predictive strategy is still in place.

As mentioned above, one difference between this unconscious intuition and other forms of reasoning that operate within the purview of consciousness is that our intuitions are less likely to be updated or changed based on new experiential evidence since our conscious attention isn’t involved in the updating process. This means that the precision weight of upward flowing prediction errors that encounter downward flowing predictions that are operating unconsciously will have little impact in updating those predictions.  Furthermore, the fact that the most automated predictions are often those that we’ve been using for most of our lives, means that they are also likely to have extremely high Bayesian priors, further isolating them from modification.

Some of these priors may become what are called hyperpriors or priors over priors (many of these believed to be established early in life) where there may be nothing that can overcome them, because they describe an extremely abstract feature of the world.  An example of a possible hyperprior could be one that demands that the brain settle on one generative model even when it’s comparable to several others under consideration.  One could call this a “tie breaker” hyperprior, where if the brain didn’t have this kind of predictive mechanism in place, it may never be able to settle on a model, causing it to see the world (or some aspect of it) as a superposition of equiprobable states rather than simply one determinate state.  We could see the potential problem in an organism’s survival prospects if it didn’t have this kind of hyperprior in place.  Whether or not a hyperprior like this is a form of innate specificity, or acquired in early learning is debatable.

An obvious trade-off with intuition (or any kind of innate biases) is that it provides us with fast, automated predictions that are robust and likely to be reliable much of the time, but at the expense of not being able to adequately handle more novel or complex situations, thereby leading to fallacious inferences.  Our cognitive biases are also likely related to this kind of unconscious reasoning whereby evolution has naturally selected cognitive strategies that work well for the kind of environment we evolved in (African savanna, jungle, etc.) even at the expense of our not being able to adapt as well culturally or in very artificial situations.

Imagination vs. Perception

One large benefit of storing so much perceptual information in our memories (predictive models with different spatio-temporal scales) is our ability to re-create it offline (so to speak).  This is where imagination comes in, where we are able to effectively simulate perceptions without requiring a stream of incoming sensory data that matches it.  Notice however that this is still a form of perception, because we can still see, hear, feel, taste and smell predicted causal relations that have been inferred from past sensory experiences.

The crucial difference, within a PP framework, is the role of precision weighting on the prediction error, just as we saw above in terms of trying to update intuitions.  If precision weighting is set or adjusted to be relatively low with respect to a particular set of predictive models, then prediction error will have little if any impact on the model.  During imagination, we effectively decouple the bottom-up prediction error from the top-down predictions associated with our sensory cortex (by reducing the precision weighting of the prediction error), thus allowing us to intentionally perceive things that aren’t actually in the external world.  We need not decouple the error from the predictions entirely, as we may want our imagination to somehow correlate with what we’re actually perceiving in the external world.  For example, maybe I want to watch a car driving down the street and simply imagine that it is a different color, while still seeing the rest of the scene as I normally would.  In general though, it is this decoupling “knob” that we can turn (precision weighting) that underlies our ability to produce and discriminate between normal perception and our imagination.

So what happens when we lose the ability to control our perception in a normal way (whether consciously or not)?  Well, this usually results in our having some kind of hallucination.  Since perception is often referred to as a form of controlled hallucination (within PP), we could better describe a pathological hallucination (such as that arising from certain psychedelic drugs or a condition like Schizophrenia) as a form of uncontrolled hallucination.  In some cases, even with a perfectly normal/healthy brain, when the prediction error simply can’t be minimized enough, or the brain is continuously switching between models, based on what we’re looking at, we experience perceptual illusions.

Whether it’s illusions, hallucinations, or any other kind of perceptual pathology (like not being able to recognize faces), PP offers a good explanation for why these kinds of experiences can happen to us.  It’s either because the models are poor (their causal structure or priors) or something isn’t being controlled properly, like the delicate balance between precision weighting and prediction error, any of which that could result from an imbalance in neurotransmitters or some kind of brain damage.

Imagination & Conscious Reasoning

While most people would tend to define imagination as that which pertains to visual imagery, I prefer to classify all conscious experiences that are not directly resulting from online perception as imagination.  In other words, any part of our conscious experience that isn’t stemming from an immediate inference of incoming sensory information is what I consider to be imagination.  This is because any kind of conscious thinking is going to involve an experience that could in theory be re-created by an artificial stream of incoming sensory information (along with our top-down generative models that put that information into a particular context of understanding).  As long as the incoming sensory information was a particular way (any way that we can imagine!), even if it could never be that way in the actual external world we live in, it seems to me that it should be able to reproduce any conscious process given the right top-down predictive model.  Another way of saying this is that imagination is simply another word to describe any kind of offline conscious mental simulation.

This also means that I’d classify any and all kinds of conscious reasoning processes as yet another form of imagination.  Just as is the case with more standard conceptions of imagination (within PP at least), we are simply taking particular predictive models, manipulating them in certain ways in order to simulate some result with this process decoupled (at least in part) from actual incoming sensory information.  We may for example, apply a rule of inference that we’ve picked up on and manipulate several predictive models of causal relations using that rule.  As mentioned in the previous post and in the post from part 2 of this series, language is also likely to play a special role here where we’ll likely be using it to help guide this conceptual manipulation process by organizing and further representing the causal relations in a linguistic form, and then determining the resulting inference (which will more than likely be in a linguistic form as well).  In doing so, we are able to take highly abstract properties of causal relations and apply rules to them to extract new information.

If I imagine a purple elephant trumpeting and flying in the air over my house, even though I’ve never experienced such a thing, it seems clear that I’m manipulating several different types of predicted causal relations at varying levels of abstraction and experiencing the result of that manipulation.  This involves inferred causal relations like those pertaining to visual aspects of elephants, the color purple, flying objects, motion in general, houses, the air, and inferred causal relations pertaining to auditory aspects like trumpeting sounds and so forth.

Specific instances of these kinds of experienced causal relations have led to my inferring them as an abstract probabilistically-defined property (e.g. elephantness, purpleness, flyingness, etc.) that can be reused and modified to some degree to produce an infinite number of possible recreated perceptual scenes.  These may not be physically possible perceptual scenes (since elephants don’t have wings to fly, for example) but regardless I’m able to add or subtract, mix and match, and ultimately manipulate properties in countless ways, only limited really by what is logically possible (so I can’t possibly imagine what a square circle would look like).

What if I’m performing a mathematical calculation, like “adding 9 + 9”, or some other similar problem?  This appears (upon first glance at least) to be very qualitatively different than simply imagining things that we tend to perceive in the world like elephants, books, music, and other things, even if they are imagined in some phantasmagorical way.  As crazy as those imagined things may be, they still contain things like shapes, colors, sounds, etc., and a mathematical calculation seems to lack this.  I think the key thing to realize here is the fundamental process of imagination as being able to add or subtract and manipulate abstract properties in any way that is logically possible (given our current set of predictive models).  This means that we can imagine properties or abstractions that lack all the richness of a typical visual/auditory perceptual scene.

In the case of a mathematical calculation, I would be manipulating previously acquired predicted causal relations that pertain to quantity and changes in quantity.  Once I was old enough to infer that separate objects existed in the world, then I could infer an abstraction of how many objects there were in some space at some particular time.  Eventually, I could abstract the property of how many objects without applying it to any particular object at all.  Using language to associate a linguistic symbol for each and every specific quantity would lay the groundwork for a system of “numbers” (where numbers are just quantities pertaining to no particular object at all).  Once this was done, then my brain could use the abstraction of quantity and manipulate it by following certain inferred rules of how quantities can change by adding to or subtracting from them.  After some practice and experience I would now be in a reasonable position to consciously think about “adding 9 + 9”, and either do it by following a manual iterative rule of addition that I’ve learned to do with real or imagined visual objects (like adding up some number of apples or dots/points in a row or grid), or I can simply use a memorized addition table and search/recall the sum I’m interested in (9 + 9 = 18).

Whether we consider imagining a purple elephant, mentally adding up numbers, thinking about what I’m going to say to my wife when I see her next, or trying to explicitly apply logical rules to some set of concepts, all of these forms of conscious thought or reasoning are all simply different sets of predictive models that I’m simply manipulating in mental simulations until I arrive at a perception that’s understood in the desired context and that has minimal prediction error.

Putting it all together

In summary, I think we can gain a lot of insight by looking at all the different aspects of brain function through a PP framework.  Imagination, perception, memory, intuition, and conscious reasoning fit together very well when viewed as different aspects of hierarchical predictive models that are manipulated and altered in ways that give us a much more firm grip on the world we live in and its inferred causal structure.  Not only that, but this kind of cognitive architecture also provides us with an enormous potential for creativity and intelligence.  In the next post in this series, I’m going to talk about consciousness, specifically theories of consciousness and how they may be viewed through a PP framework.

Advertisements

Predictive Processing: Unlocking the Mysteries of Mind & Body (Part II)

leave a comment »

In the first post of this series I introduced some of the basic concepts involved in the Predictive Processing (PP) theory of perception and action.  I briefly tied together the notions of belief, desire, emotion, and action from within a PP lens.  In this post, I’d like to discuss the relationship between language and ontology through the same framework.  I’ll also start talking about PP in an evolutionary context as well, though I’ll have more to say about that in future posts in this series.

Active (Bayesian) Inference as a Source for Ontology

One of the main themes within PP is the idea of active (Bayesian) inference whereby we physically interact with the world, sampling it and modifying it in order to reduce our level of uncertainty in our predictions about the causes of the brain’s inputs.  Within an evolutionary context, we can see why this form of embodied cognition is an ideal schema for an information processing system to employ in order to maximize chances of survival in our highly interactive world.

In order to reduce the amount of sensory information that has to be processed at any given time, it is far more economical for the brain to only worry about the prediction error that flows upward through the neural system, rather than processing all incoming sensory data from scratch.  If the brain is employing a set of predictions that can “explain away” most of the incoming sensory data, then the downward flow of predictions can encounter an upward flow of sensory information (effectively cancelling each other out) and the only thing that remains to propagate upward through the system and do any “cognitive work” (i.e. the only thing that needs to be processed) on the predictive models flowing downward is the remaining prediction error (prediction error = predictions of sensory input minus the actual sensory input).  This is similar to data compression strategies for video files (for example) that only worry about the information that changes over time (pixels that change brightness/color) and then simply compress the information that remains constant (pixels that do not change from frame-to-frame).

The ultimate goal for this strategy within an evolutionary context is to allow the organism to understand its environment in the most salient ways for the pragmatic purposes of accomplishing goals relating to survival.  But once humans began to develop culture and evolve culturally, the predictive strategy gained a new kind of evolutionary breathing space, being able to predict increasingly complex causal relations and developing technology along the way.  All of these inferred causal relations appear to me to be the very source of our ontology, as each hierarchically structured prediction and its ability to become associated with others provides an ideal platform for differentiating between any number of spatio-temporal conceptions and their categorical or logical organization.

An active Bayesian inference system is also ideal to explain our intellectual thirst, human curiosity, and interest in novel experiences (to some degree), because we learn more about the world (and ourselves) by interacting with it in new ways.  In doing so, we are provided with a constant means of fueling and altering our ontology.

Language & Ontology

Language is an important component as well and it fits well within a PP framework as it serves to further link perception and action together in a very important way, allowing us to make new kinds of predictions about the world that wouldn’t have been possible without it.   A tool like language makes a lot of sense from an evolutionary perspective as well since better predictions about the world result in a higher chance of survival.

When we use language by speaking or writing it, we are performing an action which is instantiated by the desire to do so (see previous post about “desire” within a PP framework).  When we interpret language by listening to it or by reading, we are performing a perceptual task which is again simply another set of predictions (in this case, pertaining to the specific causes leading to our sensory inputs).  If we were simply sending and receiving non-lingual nonsense, then the same basic predictive principles underlying perception and action would still apply, but something new emerges when we send and receive actual language (which contains information).  With language, we begin to associate certain sounds and visual information with some kind of meaning or meaningful information.  Once we can do this, we can effectively share our thoughts with one another, or at least many aspects of our thoughts with one another.  This provides for an enormous evolutionary advantage as now we can communicate almost anything we want to one another, store it in external forms of memory (books, computers, etc.), and further analyze or manipulate the information for various purposes (accounting, inventory, science, mathematics, etc.).

By being able to predict certain causal outcomes through the use of language, we are effectively using the lower level predictions associated with perceiving and emitting language to satisfy higher level predictions related to more complex goals including those that extend far into the future.  Since the information that is sent and received amounts to testing or modifying our predictions of the world, we are effectively using language to share and modulate one brain’s set of predictions with that of another brain.  One important aspect of this process is that this information is inherently probabilistic which is why language often trips people up with ambiguities, nuances, multiple meanings behind words and other attributes of language that often lead to misunderstanding.  Wittgenstein is one of the more prominent philosophers who caught onto this property of language and its consequence on philosophical problems and how we see the world structured.  I think a lot of the problems Wittgenstein elaborated on with respect to language can be better accounted for by looking at language as dealing with probabilistic ontological/causal relations that serve some pragmatic purpose, with the meaning of any word or phrase as being best described by its use rather than some clear-cut definition.

This probabilistic attribute of language in terms of the meanings of words having fuzzy boundaries also tracks very well with the ontology that a brain currently has access to.  Our ontology, or what kinds of things we think exist in the world, are often categorized in various ways with some of the more concrete entities given names such as: “animals”, “plants”, “rocks”, “cats”, “cups”, “cars”, “cities”, etc.  But if I morph a wooden chair (say, by chipping away at parts of it with a chisel), eventually it will no longer be recognizable as a chair, and it may begin to look more like a table than a chair or like nothing other than an oddly shaped chunk of wood.  During this process, it may be difficult to point to the exact moment that it stopped being a chair and instead became a table or something else, and this would make sense if what we know to be a chair or table or what-have-you is nothing more than a probabilistic high-level prediction about certain causal relations.  If my brain perceives an object that produces too high of a prediction error based on the predictive model of what a “chair” is, then it will try another model (such as the predictive model pertaining to a “table”), potentially leading to models that are less and less specific until it is satisfied with recognizing the object as merely a “chunk of wood”.

From a PP lens, we can consider lower level predictions pertaining to more basic causes of sensory input (bright/dark regions, lines, colors, curves, edges, etc.) to form some basic ontological building blocks and when they are assembled into higher level predictions, the amount of integrated information increases.  This information integration process leads to condensed probabilities about increasingly complex causal relations, and this ends up reducing the dimensionality of the cause-effect space of the predicted phenomenon (where a set of separate cause-effect repertoires are combined into a smaller number of them).

You can see the advantage here by considering what the brain might do if it’s looking at a black cat sitting on a brown chair.  What if the brain were to look at this scene as merely a set of pixels on the retina that change over time, where there’s no expectations of any subset of pixels to change in ways that differ from any other subset?  This wouldn’t be very useful in predicting how the visual scene will change over time.  What if instead, the brain differentiates one subset of pixels (that correspond to what we call a cat) from all the rest of the pixels, and it does this in part by predicting proximity relations between neighboring pixels in the subset (so if some black pixels move from the right to the left visual field, then some number of neighboring black pixels are predicted to move with it)?

This latter method treats the subset of black-colored pixels as a separate object (as opposed to treating the entire visual scene as a single object), and doing this kind of differentiation in more and more complex ways leads to a well-defined object or concept, or a large number of them.  Associating sounds like “meow” with this subset of black-colored pixels, is just one example of yet another set of properties or predictions that further defines this perceived object as distinct from the rest of the perceptual scene.  Associating this object with a visual or auditory label such as “cat” finally links this ontological object with language.  As long as we agree on what is generally meant by the word “cat” (which we determine through its use), then we can share and modify the predictive models associated with such an object or concept, as we can do with any other successful instance of linguistic communication.

Language, Context, and Linguistic Relativism

However, it should be noted that as we get to more complex causal relations (more complex concepts/objects), we can no longer give these concepts a simple one word label and expect to communicate information about them nearly as easily as we could for the concept of a “cat”.  Think about concepts like “love” or “patriotism” or “transcendence” and realize how there’s many different ways that we use those terms and how they can mean all sorts of different things and so our meaning behind those words will be heavily conveyed to others by the context that they are used in.  And context in a PP framework could be described as simply the (expected) conjunction of multiple predictive models (multiple sets of causal relations) such as the conjunction of the predictive models pertaining to the concepts of food, pizza, and a desirable taste, and the word “love” which would imply a particular use of the word “love” as used in the phrase “I love pizza”.  This use of the word “love” is different than one which involves the conjunction of predictive models pertaining to the concepts of intimacy, sex, infatuation, and care, implied in a phrase like “I love my wife”.  In any case, conjunctions of predictive models can get complicated and this carries over to our stretching our language to its very limits.

Since we are immersed in language and since it is integral in our day-to-day lives, we also end up being conditioned to think linguistically in a number of ways.  For example, we often think with an interior monologue (e.g. “I am hungry and I want pizza for lunch”) even when we don’t plan on communicating this information to anyone else, so it’s not as if we’re simply rehearsing what we need to say before we say it.  I tend to think however that this linguistic thinking (thinking in our native language) is more or less a result of the fact that the causal relations that we think about have become so strongly associated with certain linguistic labels and propositions, that we sort of automatically think of the causal relations alongside the labels that we “hear” in our head.  This seems to be true even if the causal relations could be thought of without any linguistic labels, in principle at least.  We’ve simply learned to associate them so strongly to one another that in most cases separating the two is just not possible.

On the flip side, this tendency of language to associate itself with our thoughts, also puts certain barriers or restrictions on our thoughts.  If we are always preparing to share our thoughts through language, then we’re going to become somewhat entrained to think in ways that can be most easily expressed in a linguistic form.  So although language may simply be along for the ride with many non-linguistic aspects of thought, our tendency to use it may also structure our thinking and reasoning in large ways.  This would account for why people raised in different cultures with different languages see the world in different ways based on the structure of their language.  While linguistic determinism seems to have been ruled out (the strong version of the Sapir-Whorf hypothesis), there is still strong evidence to support linguistic relativism (the weak version of the Sapir-Whorf hypothesis), whereby one’s language effects their ontology and view of how the world is structured.

If language is so heavily used day-to-day then this phenomenon makes sense as viewed through a PP lens since we’re going to end up putting a high weight on the predictions that link ontology with language since these predictions have been demonstrated to us to be useful most of the time.  Minimal prediction error means that our Bayesian evidence is further supported and the higher the weight carried by these predictions, the more these predictions will restrict our overall thinking, including how our ontology is structured.

Moving on…

I think that these are but a few of the interesting relationships between language and ontology and how a PP framework helps to put it all together nicely, and I just haven’t seen this kind of explanatory power and parsimony in any other kind of conceptual framework about how the brain functions.  This bodes well for the framework and it’s becoming less and less surprising to see it being further supported over time with studies in neuroscience, cognition, psychology, and also those pertaining to pathologies of the brain, perceptual illusions, etc.  In the next post in this series, I’m going to talk about knowledge and how it can be seen through the lens of PP.

Predictive Processing: Unlocking the Mysteries of Mind & Body (Part I)

leave a comment »

I’ve been away from writing for a while because I’ve had some health problems relating to my neck.  A few weeks ago I had double-cervical-disc replacement surgery and so I’ve been unable to write and respond to comments and so forth for a little while.  I’m in the second week following my surgery now and have finally been able to get back to writing, which feels very good given that I’m unable to lift or resume martial arts for the time being.  Anyway, I want to resume my course of writing beginning with a post-series that pertains to Predictive Processing (PP) and the Bayesian brain.  I’ve written one post on this topic a little over a year ago (which can be found here) as I’ve become extremely interested in this topic for the last several years now.

The Predictive Processing (PP) theory of perception shows a lot of promise in terms of finding an overarching schema that can account for everything that the brain seems to do.  While its technical application is to account for the acts of perception and active inference in particular, I think it can be used more broadly to account for other descriptions of our mental life such as beliefs (and knowledge), desires, emotions, language, reasoning, cognitive biases, and even consciousness itself.  I want to explore some of these relationships as viewed through a PP lens more because I think it is the key framework needed to reconcile all of these aspects into one coherent picture, especially within the evolutionary context of an organism driven to survive.  Let’s begin this post-series by first looking at how PP relates to perception (including imagination), beliefs, emotions, and desires (and by extension, the actions resulting from particular desires).

Within a PP framework, beliefs can be best described as simply the set of particular predictions that the brain employs which encompass perception, desires, action, emotion, etc., and which are ultimately mediated and updated in order to reduce prediction errors based on incoming sensory evidence (and which approximates a Bayesian form of inference).  Perception then, which is constituted by a subset of all our beliefs (with many of them being implicit or unconscious beliefs), is more or less a form of controlled hallucination in the sense that what we consciously perceive is not the actual sensory evidence itself (not even after processing it), but rather our brain’s “best guess” of what the causes for the incoming sensory evidence are.

Desires can be best described as another subset of one’s beliefs, and a set of beliefs which has the special characteristic of being able to drive action or physical behavior in some way (whether driving internal bodily states, or external ones that move the body in various ways).  Finally, emotions can be thought of as predictions pertaining to the causes of internal bodily states and which may be driven or changed by changes in other beliefs (including changes in desires or perceptions).

When we believe something to be true or false, we are basically just modeling some kind of causal relationship (or its negation) which is able to manifest itself into a number of highly-weighted predicted perceptions and actions.  When we believe something to be likely true or likely false, the same principle applies but with a lower weight or precision on the predictions that directly corresponds to the degree of belief or disbelief (and so new sensory evidence will more easily sway such a belief).  And just like our perceptions, which are mediated by a number of low and high-level predictions pertaining to incoming sensory data, any prediction error that the brain encounters results in either updating the perceptual predictions to new ones that better reduce the prediction error and/or performing some physical action that reduces the prediction error (e.g. rotating your head, moving your eyes, reaching for an object, excreting hormones in your body, etc.).

In all these cases, we can describe the brain as having some set of Bayesian prior probabilities pertaining to the causes of incoming sensory data, and these priors changing over time in response to prediction errors arising from new incoming sensory evidence that fails to be “explained away” by the predictive models currently employed.  Strong beliefs are associated with high prior probabilities (highly-weighted predictions) and therefore need much more counterfactual sensory evidence to be overcome or modified than for weak beliefs which have relatively low priors (low-weighted predictions).

To illustrate some of these concepts, let’s consider a belief like “apples are a tasty food”.  This belief can be broken down into a number of lower level, highly-weighted predictions such as the prediction that eating a piece of what we call an “apple” will most likely result in qualia that accompany the perception of a particular satisfying taste, the lower level prediction that doing so will also cause my perception of hunger to change, and the higher level prediction that it will “give me energy” (with these latter two predictions stemming from the more basic category of “food” contained in the belief).  Another prediction or set of predictions is that these expectations will apply to not just one apple but a number of apples (different instances of one type of apple, or different types of apples altogether), and a host of other predictions.

These predictions may even result (in combination with other perceptions or beliefs) in an actual desire to eat an apple which, under a PP lens could be described as the highly weighted prediction of what it would feel like to find an apple, to reach for an apple, to grab it, to bite off a piece of it, to chew it, and to swallow it.  If I merely imagine doing such things, then the resulting predictions will necessarily carry such a small weight that they won’t be able to influence any actual motor actions (even if these imagined perceptions are able to influence other predictions that may eventually lead to some plan of action).  Imagined perceptions will also not carry enough weight (when my brain is functioning normally at least) to trick me into thinking that they are actual perceptions (by “actual”, I simply mean perceptions that correspond to incoming sensory data).  This low-weighting attribute of imagined perceptual predictions thus provides a viable way for us to have an imagination and to distinguish it from perceptions corresponding to incoming sensory data, and to distinguish it from predictions that directly cause bodily action.  On the other hand, predictions that are weighted highly enough (among other factors) will be uniquely capable of affecting our perception of the real world and/or instantiating action.

This latter case of desire and action shows how the PP model takes the organism to be an embodied prediction machine that is directly influencing and being influenced by the world that its body interacts with, with the ultimate goal of reducing any prediction error encountered (which can be thought of as maximizing Bayesian evidence).  In this particular example, the highly-weighted prediction of eating an apple is simply another way of describing a desire to eat an apple, which produces some degree of prediction error until the proper actions have taken place in order to reduce said error.  The only two ways of reducing this prediction error are to change the desire (or eliminate it) to one that no longer involves eating an apple, and/or to perform bodily actions that result in actually eating an apple.

Perhaps if I realize that I don’t have any apples in my house, but I realize that I do have bananas, then my desire will change to one that predicts my eating a banana instead.  Another way of saying this is that my higher-weighted prediction of satisfying hunger supersedes my prediction of eating an apple specifically, thus one desire is able to supersede another.  However, if the prediction weight associated with my desire to eat an apple is high enough, it may mean that my predictions will motivate me enough to avoid eating the banana, and instead to predict what it is like to walk out of my house, go to the store, and actually get an apple (and therefore, to actually do so).  Furthermore, it may motivate me to predict actions that lead me to earn the money such that I can purchase the apple (if I don’t already have the money to do so).  To do this, I would be employing a number of predictions having to do with performing actions that lead to me obtaining money, using money to purchase goods, etc.

This is but a taste of what PP has to offer, and how we can look at basic concepts within folk psychology, cognitive science, and theories of mind in a new light.  Associated with all of these beliefs, desires, emotions, and actions (which again, are simply different kinds of predictions under this framework), is a number of elements pertaining to ontology (i.e. what kinds of things we think exist in the world) and pertaining to language as well, and I’d like to explore this relationship in my next post.  This link can be found here.

The illusion of Persistent Identity & the Role of Information in Identity

with 8 comments

After reading and commenting on a post at “A Philosopher’s Take” by James DiGiovanna titled Responsibility, Identity, and Artificial Beings: Persons, Supra-persons and Para-persons, I decided to expand on the topic of personal identity.

Personal Identity Concepts & Criteria

I think when most people talk about personal identity, they are referring to how they see themselves and how they see others in terms of personality and some assortment of (usually prominent) cognitive and behavioral traits.  Basically, they see it as what makes a person unique and in some way distinguishable from another person.  And even this rudimentary concept can be broken down into at least two parts, namely, how we see ourselves (self-ascribed identity) and how others see us (which we could call the inferred identity of someone else), since they are likely going to differ.  While most people tend to think of identity in these ways, when philosophers talk about personal identity, they are usually referring to the unique numerical identity of a person.  Roughly speaking, this amounts to basically whatever conditions or properties that are both necessary and sufficient such that a person at one point in time and a person at another point in time can be considered the same person — with a temporal continuity between those points in time.

Usually the criterion put forward for this personal identity is supposed to be some form of spatiotemporal and/or psychological continuity.  I certainly wouldn’t be the first person to point out that the question of which criterion is correct has already framed the debate with the assumption that a personal (numerical) identity exists in the first place and even if it did exist, it also assumes that the criterion is something that would be determinable in some way.  While it is not unfounded to believe that some properties exist that we could ascribe to all persons (simply because of what we find in common with all persons we’ve interacted with thus far), I think it is far too presumptuous to believe that there is a numerical identity underlying our basic conceptions of personal identity and a determinable criterion for it.  At best, I think if one finds any kind of numerical identity for persons that persist over time, it is not going to be compatible with our intuitions nor is it going to be applicable in any pragmatic way.

As I mention pragmatism, I am sympathetic to Parfit’s views in the sense that regardless of what one finds the criteria for numerical personal identity to be (if it exists), the only thing that really matters to us is psychological continuity anyway.  So despite the fact that Locke’s view — that psychological continuity (via memory) was the criterion for personal identity — was in fact shown to be based on circular and illogical arguments (per Butler, Reid and others), nevertheless I give applause to his basic idea.  Locke seemed to be on the right track, in that psychological continuity (in some sense involving memory and consciousness) is really the essence of what we care about when defining persons, even if it can’t be used as a valid criterion in the way he proposed.

(Non) Persistence & Pragmatic Use of a Personal Identity Concept

I think that the search for, and long debates over, what the best criterion for personal identity is, has illustrated that what people have been trying to label as personal identity should probably be relabeled as some sort of pragmatic pseudo-identity. The pragmatic considerations behind the common and intuitive conceptions of personal identity have no doubt steered the debate pertaining to any possible criteria for helping to define it, and so we can still value those considerations even if a numerical personal identity doesn’t really exist (that is, even if it is nothing more than a pseudo-identity) and even if a diachronic numerical personal identity does exist but isn’t useful in any way.

If the object/subject that we refer to as “I” or “me” is constantly changing with every passing moment of time both physically and psychologically, then I tend to think that the self (that many people ascribe as the “agent” of our personal identity) is an illusion of some sort.  I tend to side more with Hume on this point (or at least James Giles’ fair interpretation of Hume) in that my views seem to be some version of a no-self or eliminativist theory of personal identity.  As Hume pointed out, even though we intuitively ascribe a self and thereby some kind of personal identity, there is no logical reason supported by our subjective experience to think it is anything but a figment of the imagination.  This illusion results from our perceptions flowing from one to the next, with a barrage of changes taking place with this “self” over time that we simply don’t notice taking place — at least not without critical reflection on our past experiences of this ever-changing “self”.  The psychological continuity that Locke described seems to be the main driving force behind this illusory self since there is an overlap in the memories of the succession of persons.

I think one could say that if there is any numerical identity that is associated with the term “I” or “me”, it only exists for a short moment of time in one specific spatio-temporal slice, and then as the next perceivable moment elapses, what used to be “I” will become someone else, even if the new person that comes into being is still referred to as “I” or “me” by a person that possesses roughly the same configuration of matter in its body and brain as the previous person.  Since the neighboring identities have an overlap in accessible memory including autobiographical memories, memories of past experiences generally, and the memories pertaining to the evolving desires that motivate behavior, we shouldn’t expect this succession of persons to be noticed or perceived by the illusory self because each identity has access to a set of memories that is sufficiently similar to the set of memories accessible to the previous or successive identity.  And this sufficient degree of similarity in those identities’ memories allow for a seemingly persistent autobiographical “self” with goals.

As for the pragmatic reasons for considering all of these “I”s and “me”s to be the same person and some singular identity over time, we can see that there is a causal dependency between each member of this “chain of spatio-temporal identities” that I think exists, and so treating that chain of interconnected identities as one being is extremely intuitive and also incredibly useful for accomplishing goals (which is likely the reason why evolution would favor brains that can intuit this concept of a persistent “self” and the near uni-directional behavior that results from it).  There is a continuity of memory and behaviors (even though both change over time, both in terms of the number of memories and their accuracy) and this continuity allows for a process of conditioning to modify behavior in ways that actively rely on those chains of memories of past experiences.  We behave as if we are a single person moving through time and space (and as if we are surrounded by other temporally extended single person’s behaving in similar ways) and this provides a means of assigning ethical and causal responsibility to something or more specifically to some agent.  Quite simply, by having those different identities referenced under one label and physically attached to or instantiated by something localized, that allows for that pragmatic pseudo-identity to persist over time in order for various goals (whether personal or interpersonal/societal) to be accomplished.

“The Persons Problem” and a “Speciation” Analogy

I came up with an analogy that I thought was very fitting to this concept.  One could analogize this succession of identities that get clumped into one bulk pragmatic-pseudo-identity with the evolutionary concept of speciation.  For example, a sequence of identities somehow constitute an intuitively persistent personal identity, just as a sequence of biological generations somehow constitute a particular species due to the high degree of similarity between them all.  The apparent difficulty lies in the fact that, at some point after enough identities have succeeded one another, even the intuitive conception of a personal identity changes markedly to the point of being unrecognizable from its ancestral predecessor, just as enough biological generations transpiring eventually leads to what we call a new species.  It’s difficult to define exactly when that speciation event happens (hence the species problem), and we have a similar problem with personal identity I think.  Where does it begin and end?  If personal identity changes over the course of a lifetime, when does one person become another?  I could think of “me” as the same “me” that existed one year ago, but if I go far enough back in time, say to when I was five years old, it is clear that “I” am a completely different person now when compared to that five year old (different beliefs, goals, worldview, ontology, etc.).  There seems to have been an identity “speciation” event of some sort even though it is hard to define exactly when that was.

Biologists have tried to solve their species problem by coming up with various criteria to help for taxonomical purposes at the very least, but what they’ve wound up with at this point is several different criteria for defining a species that are each effective for different purposes (e.g. biological-species concept, morpho-species concept, phylogenetic-species concept, etc.), and without any single “correct” answer since they are all situationally more or less useful.  Similarly, some philosophers have had a persons problem that they’ve been trying to solve and I gather that it is insoluble for similar “fuzzy boundary” reasons (indeterminate properties, situationally dependent properties, etc.).

The Role of Information in a Personal Identity Concept

Anyway, rather than attempt to solve the numerical personal identity problem, I think that philosophers need to focus more on the importance of the concept of information and how it can be used to try and arrive at a more objective and pragmatic description of the personal identity of some cognitive agent (even if it is not used as a criterion for numerical identity, since information can be copied and the copies can be distinguished from one another numerically).  I think this is especially true once we take some of the concerns that James DiGiovanna brought up concerning the integration of future AI into our society.

If all of the beliefs, behaviors, and causal driving forces in a cognitive agent can be represented in terms of information, then I think we can implement more universal conditioning principles within our ethical and societal framework since they will be based more on the information content of the person’s identity without putting as much importance on numerical identity nor as much importance on our intuitions of persisting people (since they will be challenged by several kinds of foreseeable future AI scenarios).

To illustrate this point, I’ll address one of James DiGiovanna’s conundrums.  James asks us:

To give some quick examples: suppose an AI commits a crime, and then, judging its actions wrong, immediately reforms itself so that it will never commit a crime again. Further, it makes restitution. Would it make sense to punish the AI? What if it had completely rewritten its memory and personality, so that, while there was still a physical continuity, it had no psychological content in common with the prior being? Or suppose an AI commits a crime, and then destroys itself. If a duplicate of its programming was started elsewhere, would it be guilty of the crime? What if twelve duplicates were made? Should they each be punished?

In the first case, if the information constituting the new identity of the AI after reprogramming is such that it no longer needs any kind of conditioning, then it would be senseless to punish the AI — other than to appease humans that may be angry that they couldn’t themselves avoid punishment in this way, due to having a much slower and less effective means of reprogramming themselves.  I would say that the reprogrammed AI is guilty of the crime, but only if its reprogrammed memory still included information pertaining to having performed those past criminal behaviors.  However, if those “criminal memories” are now gone via the reprogramming then I’d say that the AI is not guilty of the crime because the information constituting its identity doesn’t match that of the criminal AI.  It would have no recollection of having committed the crime and so “it” would not have committed the crime since that “it” was lost in the reprogramming process due to the dramatic change in information that took place.

In the latter scenario, if the information constituting the identity of the destroyed AI was re-instantiated elsewhere, then I would say that it is in fact guilty of the crime — though it would not be numerically guilty of the crime but rather qualitatively guilty of the crime (to differentiate between the numerical and qualitative personal identity concepts that are embedded in the concept of guilt).  If twelve duplicates of this information were instantiated into new AI hardware, then likewise all twelve of those cognitive agents would be qualitatively guilty of the crime.  What actions should be taken based on qualitative guilt?  I think it means that the AI should be punished or more specifically that the judicial system should perform the reconditioning required to modify their behavior as if it had committed the crime (especially if the AI believes/remembers that it has committed the crime), for the better of society.  If this can be accomplished through reprogramming, then that would be the most rational thing to do without any need for traditional forms of punishment.

We can analogize this with another thought experiment with human beings.  If we imagine a human that has had its memories changed so that it believes it is Charles Manson, has all of Charles Manson’s memories and intentions, then that person should be treated as if they are Charles Manson and thus incarcerated/punished accordingly to rehabilitate them or protect the other members of society.  This is assuming of course that we had reliable access to that kind of mind-reading knowledge.  If we did, the information constituting the identity of that person would be what is most important — not what the actual previous actions of the person were — because the “previous person” was someone else, due to that gross change in information.

Darwin’s Big Idea May Be The Biggest Yet

with 13 comments

Back in 1859, Charles Darwin released his famous theory of evolution by natural selection whereby inherent variations in the individual members of some population of organisms under consideration would eventually lead to speciation events due to those variations producing a differential in survival and reproductive success and thus leading to the natural selection of some subset of organisms within that population.  As Darwin explained in his On The Origin of Species:

If during the long course of ages and under varying conditions of life, organic beings vary at all in the several parts of their organisation, and I think this cannot be disputed; if there be, owing to the high geometrical powers of increase of each species, at some age, season, or year, a severe struggle for life, and this certainly cannot be disputed; then, considering the infinite complexity of the relations of all organic beings to each other and to their conditions of existence, causing an infinite diversity in structure, constitution, and habits, to be advantageous to them, I think it would be a most extraordinary fact if no variation ever had occurred useful to each being’s own welfare, in the same way as so many variations have occurred useful to man. But if variations useful to any organic being do occur, assuredly individuals thus characterised will have the best chance of being preserved in the struggle for life; and from the strong principle of inheritance they will tend to produce offspring similarly characterised. This principle of preservation, I have called, for the sake of brevity, Natural Selection.

While Darwin’s big idea completely transformed biology in terms of it providing (for the first time in history) an incredibly robust explanation for the origin of the diversity of life on this planet, his idea has since inspired other theories pertaining to perhaps the three largest mysteries that humans have ever explored: the origin of life itself (not just the diversity of life after it had begun, which was the intended scope of Darwin’s theory), the origin of the universe (most notably, why the universe is the way it is and not some other way), and also the origin of consciousness.

Origin of Life

In order to solve the first mystery (the origin of life itself), geologists, biologists, and biochemists are searching for plausible models of abiogenesis, whereby the general scheme of these models would involve chemical reactions (pertaining to geology) that would have begun to incorporate certain kinds of energetically favorable organic chemistries such that organic, self-replicating molecules eventually resulted.  Now, where Darwin’s idea of natural selection comes into play with life’s origin is in regard to the origin and evolution of these self-replicating molecules.  First of all, in order for any molecule at all to build up in concentration requires a set of conditions such that the reaction leading to the production of the molecule in question is more favorable than the reverse reaction where the product transforms back into the initial starting materials.  If merely one chemical reaction (out of a countless number of reactions occurring on the early earth) led to a self-replicating product, this would increasingly favor the production of that product, and thus self-replicating molecules themselves would be naturally selected for.  Once one of them was produced, there would have been a cascade effect of exponential growth, at least up to the limit set by the availability of the starting materials and energy sources present.

Now if we assume that at least some subset of these self-replicating molecules (if not all of them) had an imperfect fidelity in the copying process (which is highly likely) and/or underwent even a slight change after replication by reacting with other neighboring molecules (also likely), this would provide them with a means of mutation.  Mutations would inevitably lead to some molecules becoming more effective self-replicators than others, and then evolution through natural selection would take off, eventually leading to modern RNA/DNA.  So not only does Darwin’s big idea account for the evolution of diversity of life on this planet, but the basic underlying principle of natural selection would also account for the origin of self-replicating molecules in the first place, and subsequently the origin of RNA and DNA.

Origin of the Universe

Another grand idea that is gaining heavy traction in cosmology is that of inflationary cosmology, where this theory posits that the early universe underwent a period of rapid expansion, and due to quantum mechanical fluctuations in the microscopically sized inflationary region, seed universes would have resulted with each one having slightly different properties, one of which that would have expanded to be the universe that we live in.  Inflationary cosmology is currently heavily supported because it has led to a number of predictions, many of which that have already been confirmed by observation (it explains many large-scale features of our universe such as its homogeneity, isotropy, flatness, and other features).  What I find most interesting with inflationary theory is that it predicts the existence of a multiverse, whereby we are but one of an extremely large number of other universes (predicted to be on the order of 10^500, if not an infinite number), with each one having slightly different constants and so forth.

Once again, Darwin’s big idea, when applied to inflationary cosmology, would lead to the conclusion that our universe is the way it is because it was naturally selected to be that way.  The fact that its constants are within a very narrow range such that matter can even form, would make perfect sense, because even if an infinite number of universes exist with different constants, we would only expect to find ourselves in one that has the constants within the necessary range in order for matter, let alone life to exist.  So any universe that harbors matter, let alone life, would be naturally selected for against all the other universes that didn’t have the right properties to do so, including for example, universes that had too high or too low of a cosmological constant (such as those that would have instantly collapsed into a Big Crunch or expanded into a heat death far too quickly for any matter or life to have formed), or even universes that didn’t have the proper strong nuclear force to hold atomic nuclei together, or any other number of combinations that wouldn’t work.  So any universe that contains intelligent life capable of even asking the question of their origins, must necessarily have its properties within the required range (often referred to as the anthropic principle).

After our universe formed, the same principle would also apply to each galaxy and each solar system within those galaxies, whereby because variations exist in each galaxy and within each substituent solar system (differential properties analogous to different genes in a gene pool), then only those that have an acceptable range of conditions are capable of harboring life.  With over 10^22 stars in the observable universe (an unfathomably large number), and billions of years to evolve different conditions within each solar system surrounding those many stars, it isn’t surprising that eventually the temperature and other conditions would be acceptable for liquid water and organic chemistries to occur in many of those solar systems.  Even if there was only one life permitting planet per galaxy (on average), that would add up to over 100 billion life permitting planets in the observable universe alone (with many orders of magnitude more life permitting planets in the non-observable universe).  So given enough time, and given some mechanism of variation (in this case, differences in star composition and dynamics), natural selection in a sense can also account for the evolution of some solar systems that do in fact have life permitting conditions in a universe such as our own.

Origin of Consciousness

The last significant mystery I’d like to discuss involves the origin of consciousness.  While there are many current theories pertaining to different aspects of consciousness, and while there has been much research performed in the neurosciences, cognitive sciences, psychology, etc., pertaining to how the brain works and how it correlates to various aspects of the mind and consciousness, the brain sciences (though neuroscience in particular) are in their relative infancy and so there are still many questions that haven’t been answered yet.  One promising theory that has already been shown to account for many aspects of consciousness is Gerald Edelman’s theory of neuronal group selection (NGS) otherwise known as neural Darwinism (ND), which is a large scale theory of brain function.  As one might expect from the name, the mechanism of natural selection is integral to this theory.  In ND, the basic idea consists of three parts as read on the Wiki:

  1. Anatomical connectivity in the brain occurs via selective mechanochemical events that take place epigenetically during development.  This creates a diverse primary neurological repertoire by differential reproduction.
  2. Once structural diversity is established anatomically, a second selective process occurs during postnatal behavioral experience through epigenetic modifications in the strength of synaptic connections between neuronal groups.  This creates a diverse secondary repertoire by differential amplification.
  3. Re-entrant signaling between neuronal groups allows for spatiotemporal continuity in response to real-world interactions.  Edelman argues that thalamocortical and corticocortical re-entrant signaling are critical to generating and maintaining conscious states in mammals.

In a nutshell, the basic differentiated structure of the brain that forms in early development is accomplished through cellular proliferation, migration, distribution, and branching processes that involve selection processes operating on random differences in the adhesion molecules that these processes use to bind one neuronal cell to another.  These crude selection processes result in a rough initial configuration that is for the most part fixed.  However, because there are a diverse number of sets of different hierarchical arrangements of neurons in various neuronal groups, there are bound to be functionally equivalent groups of neurons that are not equivalent in structure, but are all capable of responding to the same types of sensory input.  Because some of these groups should in theory be better than others at responding to some particular type of sensory stimuli, this creates a form of neuronal/synaptic competition in the brain, whereby those groups of neurons that happen to have the best synaptic efficiency for the stimuli in question are naturally selected over the others.  This in turn leads to an increased probability that the same network will respond to similar or identical signals in the future.  Each time this occurs, synaptic strengths increase in the most efficient networks for each particular type of stimuli, and this would account for a relatively quick level of neural plasticity in the brain.

The last aspect of the theory involves what Edelman called re-entrant signaling whereby a sampling of the stimuli from functionally different groups of neurons occurring at the same time leads to a form of self-organizing intelligence.  This would provide a means for explaining how we experience spatiotemporal consistency in our experience of sensory stimuli.  Basically, we would have functionally different parts of the brain, such as various maps in the visual centers that pertain to color versus others that pertain to orientation or shape, that would effectively amalgamate the two (previously segregated) regions such that they can function in parallel and thus correlate with one another producing an amalgamation of the two types of neural maps.  Once this re-entrant signaling is accomplished between higher order or higher complexity maps in the brain, such as those pertaining to value-dependent memory storage centers, language centers, and perhaps back to various sensory cortical regions, this would create an even richer level of synchronization, possibly leading to consciousness (according to the theory).  In all of the aspects of the theory, the natural selection of differentiated neuronal structures, synaptic connections and strengths and eventually that of larger re-entrant connections would be responsible for creating the parallel and correlated processes in the brain believed to be required for consciousness.  There’s been an increasing amount of support for this theory, and more evidence continues to accumulate in support of it.  In any case, it is a brilliant idea and one with a lot of promise in potentially explaining one of the most fundamental aspects of our existence.

Darwin’s Big Idea May Be the Biggest Yet

In my opinion, Darwin’s theory of evolution through natural selection was perhaps the most profound theory ever discovered.  I’d even say that it beats Einstein’s theory of Relativity because of its massive explanatory scope and carryover to other disciplines, such as cosmology, neuroscience, and even the immune system (see Edelman’s Nobel work on the immune system, where he showed how the immune system works through natural selection as well, as opposed to some type of re-programming/learning).  Based on the basic idea of natural selection, we have been able to provide a number of robust explanations pertaining to many aspects of why the universe is likely to be the way it is, how life likely began, how it evolved afterward, and it may possibly be the answer to how life eventually evolved brains capable of being conscious.  It is truly one of the most fascinating principles I’ve ever learned about and I’m honestly awe struck by its beauty, simplicity, and explanatory power.

The Origin and Evolution of Life: Part I

leave a comment »

In the past, various people have argued that life originating at all let alone evolving higher complexity over time was thermodynamically unfavorable due to the decrease in entropy involved with both circumstances, and thus it was believed to violate the second law of thermodynamics.  For those unfamiliar with the second law, it basically asserts that the amount of entropy (often referred to as disorder) in a closed system tends to increase over time, or to put it another way, the amount of energy available to do useful work in a closed system tends to decrease over time.  So it has been argued that since the origin of life and the evolution of life with greater complexity would entail decreases in entropy, these events are therefore either at best unfavorable (and therefore the result of highly improbable chance), or worse yet they are altogether impossible.

We’ve known for quite some time now that these thermodynamic arguments aren’t at all valid because earth isn’t a thermodynamically closed or isolated system due to the constant supply of energy we receive from the sun.  Because we get a constant supply of energy from the sun, and because the entropy increase from the sun far outweighs the decrease in entropy produced from all biological systems on earth, the net entropy of the entire system increases and thus fits right in line with the second law as we would expect.

However, even though the emergence and evolution of life on earth do not violate the second law and are thus physically possible, that still doesn’t show that they are probable processes.  What we need to know is how favorable the reactions are that are required for initiating and then sustaining these processes.  Several very important advancements have been made in abiogenesis over the last ten to fifteen years, with the collaboration of geologists and biochemists, and it appears that they are in fact not only possible but actually probable processes for a few reasons.

One reason is that the chemical reactions that living systems undergo produce a net entropy as well, despite the drop of entropy associated with every cell and/or it’s arrangement with respect to other cells.  This is because all living systems give off heat with every favorable chemical reaction that is constantly driving the metabolism and perpetuation of those living systems. This gain in entropy caused by heat loss more than compensates for the loss in entropy that results with the production and maintenance of all the biological components, whether lipids, sugars, nucleic acids or amino acids and more complex proteins.  Beyond this, as more complexity arises during the evolution of the cells and living systems, the entropy that those systems produce tends to increase even more and so living systems with a higher level of complexity appear to produce a greater net entropy (on average) than less complex living systems.  Furthermore, once photosynthetic organisms evolved in particular, any entropy (heat) that they give off in the form of radiation ends up being of lower energy (infrared) than the photons given off by the sun to power those reactions in the first place.  Thus, we can see that living systems effectively dissipate the incoming energy from the sun, and energy dissipation is energetically favorable.

Living systems seem to serve as a controllable channel of energy flow for that energy dissipation, just like lightning, the eye of a hurricane, or a tornado, where high energy states in the form of charge gradients or pressure or temperature gradients end up falling to a lower energy state by dissipating that energy through specific focused channels that spontaneously form (e.g. individual concentrated lightning bolts, the eye of a hurricane, vortices, etc.).  These channels for energy flow are favorable and form because they allow the energy to be dissipated faster since the channels are initiated by some direction of energy flow that is able to self-amplify into a path of decreasing resistance for that energy dissipation.  Life and the metabolic processes involved with it, seem to direct energy flow in ways that are very similar to these other naturally arising processes in non-living physical systems.  Interestingly enough, a relevant hypothesis has been proposed for why consciousness and eventually self-awareness would have evolved (beyond the traditional reasons proposed by natural selection).  If an organism can evolve the ability to predict where energy is going to flow, where an energy dissipation channel will form (or form more effective ones themselves), conscious organisms can then behave in ways that much more effectively dissipate energy even faster (and also by catalyzing more entropy production), thus showing why certain forms of biological complexity such as consciousness, memory, etc., would have also been favored from a thermodynamic perspective.

Thus, the origin of life as well as the evolution of biological complexity appears to be increasingly favored by the second law, thus showing a possible fundamental physical driving force behind the origin and evolution of life.  Basically, the origin and evolution of life appear to be effectively entropy engines and catalytic energy dissipation channels, and these engines and channels produce entropy at a greater rate than the planet otherwise would in the absence of that life, thus showing at least one possible driving force behind life, namely, the second law of thermodynamics.  So ironically, not only does the origin and evolution of life not violate the second law of thermodynamics, but it actually seems to be an inevitable (or at least favorable) result because of the second law.  Some of these concepts are still being developed in various theories and require further testing to better validate them but they are in fact supported by well-established physics and by consistent and sound mathematical models.

Perhaps the most poetic concept I’ve recognized with these findings is that life is effectively speeding up the heat death of the universe.  That is, the second law of thermodynamics suggests that the universe will eventually lose all of its useful energy when all the stars burn out and all matter eventually spreads out and decays into lower and lower energy photons, and thus the universe is destined to undergo a heat death.  Life, because it is producing entropy faster than the universe otherwise would in the absence of that life, is actually speeding up this inevitable death of the universe, which is quite fascinating when you think about it.  At the very least, it should give a new perspective to those that ask the question “what is the meaning or purpose of life?”  Even if we don’t think it is proper to think of life as having any kind of objective purpose in the universe, what life is in fact doing is accelerating the death of not only itself, but of the universe as a whole.  Personally, this further reinforces the idea that we should all ascribe our own meaning and purpose to our lives, because we should be enjoying the finite amount of time that we have, not only as individuals, but as a part of the entire collective life that exists in our universe.

To read about the newest and most promising discoveries that may explain how life got started in the first place, read part two here.

Knowledge: An Expansion of the Platonic Definition

with 12 comments

In the first post I ever wrote on this blog, titled: Knowledge and the “Brain in a Vat” scenario, I discussed some elements concerning the Platonic definition of knowledge, that is, that knowledge is ultimately defined as “justified true belief”.  I further refined the Platonic definition (in order to account for the well-known Gettier Problem) such that knowledge could be better described as “justified non-coincidentally-true belief”.  Beyond that, I also discussed how one’s conception of knowledge (or how it should be defined) should consider the possibility that our reality may be nothing more than the product of a mad scientist feeding us illusory sensations/perceptions with our brain in a vat, and thus, that how we define things and adhere to those definitions plays a crucial role in our conception and mutual understanding of any kind of knowledge.  My concluding remarks in that post were:

“While I’m aware that anything discussed about the metaphysical is seen by some philosophers to be completely and utterly pointless, my goal in making the definition of knowledge compatible with the BIV scenario is merely to illustrate that if knowledge exists in both “worlds” (and our world is nothing but a simulation), then the only knowledge we can prove has to be based on definitions — which is a human construct based on hierarchical patterns observed in our reality.”

While my views on what knowledge is or how it should be defined have changed somewhat in the past three years or so since I wrote that first blog post, in this post, I’d like to elaborate on this key sentence, specifically with regard to how knowledge is ultimately dependent on the recall and use of previously observed patterns in our reality as I believe that this is the most important aspect regarding how to define knowledge.  After making a few related comments on another blog (https://nwrickert.wordpress.com/2015/03/07/knowledge-vs-belief/), I decided to elaborate on some of those comments accordingly.

I’ve elsewhere mentioned how there is a plethora of evidence that suggests that intelligence is ultimately a product of pattern recognition (1, 2, 3).  That is, if we recognize patterns in nature and then commit them to memory, we can later use those remembered patterns to our advantage in order to accomplish goals effectively.  The more patterns that we can recognize and remember, specifically those that do in fact correlate with reality (as opposed to erroneously “recognized” patterns that are actually non-existent), the better our chances of predicting the consequences of our actions accurately, and thus the better chances we have at obtaining our goals.  In short, the more patterns that we can recognize and remember, the greater our intelligence.  It is therefore no coincidence that intelligence tests are primarily based on gauging one’s ability to recognize patterns (e.g. solving Raven’s Progressive Matrices, puzzles, etc.).

To emphasize the role of pattern recognition as it applies to knowledge, if we use my previously modified Platonic definition of knowledge, that is,  that knowledge is defined as “justified, non-coincidentally-true belief”, then I must break down the individual terms of this definition as follows, starting with “belief”:

  • Belief = Recognized patterns of causality that are stored into memory for later recall and use.
  • Non-Coincidentally-True = The belief positively and consistently correlates with reality, and thus not just through luck or chance.
  • Justified = Empirical evidence exists to support said belief.

So in summary, I have defined knowledge (more specifically) as:

“Recognized patterns of causality that are stored into memory for later recall and use, that positively and consistently correlate with reality, and for which that correlation has been validated by empirical evidence (e.g. successful predictions made and/or goals accomplished through the use of said recalled patterns)”.

This means that if we believe something to be true that is unfalsifiable (such as religious beliefs that rely on faith), since it has not met the justification criteria, it fails to be considered knowledge (even if it is still considered a “belief”).  Also, if we are able to make a successful prediction with the patterns we’ve recognized, yet are only able to do so once, due to the lack of consistency, we likely just got lucky and didn’t actually correctly identify a pattern that correlates with reality, and thus this would fail to count as knowledge.  Finally, one should also note that the patterns that are recognized were not specifically defined as “consciously” recognized/remembered, nor was it specified that the patterns couldn’t be innately acquired/stored into memory (through DNA coded or other pre-sensory neural developmental mechanisms).  Thus, even procedural knowledge like learning to ride a bike or other forms of “muscle memory” used to complete a task, or any innate form of knowledge (acquired before/without sensory input) would be an example of unconscious or implicit knowledge that still fulfills this definition I’ve given above.  In the case of unconscious/implicit knowledge, we would have to accept that “beliefs” can also be unconscious/implicit (in order to remain consistent with the definition I’ve chosen), and I don’t see this as being a problem at all.  One just has to keep in mind that when people use the term “belief”, they are likely going to be referring to only those that are in our consciousness, a subset of all beliefs that exist, and thus still correct and adherent to the definition laid out here.

This is how I prefer to define “knowledge”, and I think it is a robust definition that successfully solves many (though certainly not all) of the philosophical problems that one tends to encounter in epistemology.