Technology, Mass-Culture, and the Prospects of Human Liberation

Cultural evolution is arguably just as fascinating as biological evolution (if not more so), with new ideas and behaviors stemming from the same kinds of natural selective pressures that lead to new species along with their novel morphologies and capacities.  And just as biological evolution, in a sense, takes off on its own, unbeknownst to the organisms it produces and independent of their intentions (with our species being the notable exception, given our awareness of evolutionary history and our ever-growing control over genetics), so too does cultural evolution take off on its own: cultural changes are made manifest through a number of causal influences that we’re largely unaware of, even though we have some conscious influence over this vastly transformative process.

Alongside these cultural changes, human civilizations have striven to find new means of manipulating nature and to better predict the causal structure that makes up our reality.  One unfortunate consequence, as history has shown us, is that within any particular culture’s time and place, people have a decidedly biased overconfidence in the perceived truth of, or justification for, the status quo and their present worldview (both individually and collectively).  Undoubtedly, the “group-think” or “herd mentality” that precipitates from our simply having social groups often reinforces this overconfidence, despite the fact that what actually influences a mass of people to believe certain things or to behave as they do is highly contingent, unstable, and amenable to irrational forms of persuasion, including emotive, sensationalist propaganda that preys on our cognitive biases.

While we as a society have an unprecedented amount of control over the world around us, this type of control is perhaps best described as a system of bureaucratic organization and automated information processing that grants less and less individual autonomy, liberty, and basic freedom as it further expands its reach.  How much control do we as individuals really have over the information we have access to, and over the implied picture of reality that comes packaged with that information in the way it’s presented to us?  How much control do we have over the number of life trajectories and occupations made available to us, or over the educational and socioeconomic resources we can access given the particular family, culture, and geographical location we’re born and raised in?

As more layers of control have been added to our way of life and as certain criteria for organizational efficiency are continually implemented, our lives have become externally defined by increasing layers of abstraction, and our modes of existence are further separated cognitively and emotionally from an aesthetically and otherwise psychologically valuable sense of meaning and purpose.

While the Enlightenment slowly dragged our species, kicking and screaming, out of the theocratic, anti-intellectual epistemologies of the medieval period, the same forces that unearthed a long overdue appreciation for (and development of) rationality and technological progress unknowingly engendered a vulnerability to our misusing this newfound power.  Rationality overcompensated when it was (justifiably) deployed against the authoritarian dogmatism of Christianity and against the demonstrably unreliable nature of superstitious beliefs and of many of our intuitions.

This overcompensatory effect was in many ways accounted for, or anticipated, within the dialectical theory of historical development delineated by the German philosopher G.W.F. Hegel, and within some relevant reformulations of this dialectical process theorized by Karl Marx (among others).  Throughout history, we’ve had an endless clash of ideas whereby the prevailing worldviews are shown to be inadequate in some way: failing to account for some notable aspect of our perceived reality, or proving insufficient for meeting our basic psychological or socioeconomic needs.  With respect to any problem we’ve encountered, we search for a solution (or wait for one to present itself), become overconfident in the efficacy of that solution, and eventually overgeneralize its applicability.  Then the pendulum swings too far the other way, creating new problems in need of solutions, with this process seemingly repeating itself ad infinitum.

Despite the various woes of modernity explicated by the modern existentialist movement, it does seem that history, from a long-term perspective at least, has been moving in the right direction, not only with respect to our heightened capacity for improving our standard of living, but also in terms of the evolution of our social contracts and our conceptions of basic and universal human rights.  And we can plausibly reconcile this generally positive historical trend with the Hegelian view of historical development, and with the conflicts that arise in human history, by noting that our moral and epistemological progress often seems to take one step backward followed by two steps forward.

Regardless of the progress we’ve made, we seem to be at a crucial point in our history where the same freedom-limiting authoritarian reach that plagued humanity (especially during the Middle Ages) has undergone a kind of morphogenesis, having been reinstantiated albeit in a different form.  The elements of authoritarianism have become built into the very structure of mass-culture, with an anti-individualistic corporatocracy largely mediating the flow of information throughout this mass-culture, and also mediating its evolution over time as it becomes more globalized, interconnected, and cybernetically integrated into our day-to-day lives.

Coming back to the biological parallels I opened with, we can see human autonomy and our culture (ideas and behaviors) as having evolved in ways strikingly similar to the jump life made long ago, when single-celled organisms joined forces with one another to become multi-cellular.  That biological jump is analogous to the jump we made during the early onset of civilization, when we employed an increasingly complex division of labor and occupational specialization, allowing us to overcome many more environmental hurdles than ever before.  Once civilization began, the spread of culture became much more effective at transmitting ideas, both laterally within a culture and longitudinally from generation to generation, and this process was heavily enhanced by our adoption of various forms of written language, which let us store and transmit information in much more robust ways, similar to genetic information storage and transfer via DNA, RNA, and proteins.

Although the single-celled bacterium or amoeba (for example) may be thought of as having more “autonomy” than a cell that is forcefully interconnected within a multi-cellular organism, the range of capacities available to single cells was far more limited before they made the symbiotic jump, just as humans living before the onset of civilization had more “freedom” (at least of a certain type) even though the number of possible life trajectories and experiences open to them was minuscule compared to those of a human living in a post-cultural world.  But once multi-cellular organisms began to form a nervous system and eventually a brain, the entire collection of cells making up an organism became ultimately subservient to a centralized form of executive power — just as humans have become subservient to the executive authority of the state or government (along with various social pressures of conformity).

And just as the fate of each cell in a multi-cellular organism became predetermined and predictable given its particular set of available resources and the specific information it received from neighboring cells, so too our own lives are becoming increasingly predetermined and predictable given the socioeconomic resources made available to us and the information we’re given, which constitutes our mass-culture.  We are slowly morphing from individual brains into something akin to individual neurons within a global brain of mass-consciousness and mass-culture, having our critical thinking skills and creative aspirations exchanged for rehearsed responses and docile expectations that maintain the status quo and continually transfer our autonomy to an oligarchic power structure.

We might wonder whether this shift has been inevitable, possibly being yet another example of a “fractal pattern” recapitulated in sociological form out of the very same free-floating rationales that biological evolution has been making use of for eons.  In any case, it’s critically important that we become aware of this change so that we can actively achieve, and effectively maintain, the liberties and level of individual autonomy that we so highly cherish.  We ought to be thinking about how we can remain cognizant of, and critical toward, our culture and its products; how we can reconcile technological rationality and progress with a future world comprised of truly liberated individuals; and how to transform our corporatocratic capitalist society into one based on a mixed economy with a social safety net that even the wealthiest citizens would be content to live under, so as to maximize the actual creative freedom people have once their basic existential needs have been met.

Will unchecked capitalism, social media, mass media, and the false needs and epistemological bubbles they’re forming lead to our undoing and destruction?  Or will we find a way to rise above this technologically induced setback and take advantage of the opportunities it has afforded us, making the world and our technology truly compatible with our human psychology?  Whatever the future holds for us, it is undoubtedly going to depend on how many of us begin to think critically about how we can seriously restructure our educational system and the way we disseminate information, how we can re-prioritize and better reflect on what our personal goals ought to be, and how we ought to identify ourselves as free and unique individuals.

Predictive Processing: Unlocking the Mysteries of Mind & Body (Part VI)

This is the last post I’m going to write for this particular post-series on Predictive Processing (PP).  Here are the links to parts 1, 2, 3, 4, and 5.  I’ve already explored how a PP framework can account for folk psychological concepts like beliefs, desires, and emotions, as well as action, language and ontology, knowledge, perception, imagination, and reasoning.  In this final post of the series, I’m going to explore consciousness itself and how some theories of consciousness fit very nicely within a PP framework.

Consciousness as Prediction (Predicting The Self)

Earlier in this post-series I explored how PP treats perception (whether online, or an offline form like imagination) as simply predictions pertaining to incoming sensory information, with varying degrees of precision weighting assigned to the resulting prediction error.  In a sense then, consciousness just is prediction.  At the very least, it is a subset of the predictions, likely those higher up in the predictive hierarchy.  This is all going to depend on which aspect or level of consciousness we’re trying to explain, and as philosophers and cognitive scientists well know, consciousness is difficult to pin down and define in any precise way.

By consciousness, we could mean any kind of awareness at all, or we could limit the term to systems that are aware of themselves.  Either way, we have to be careful here if we’re looking to distinguish between consciousness generally speaking (consciousness in any form, which may be unrecognizable to us) and the unique kind of consciousness that we as human beings experience.  If an ant is conscious, it likely doesn’t have any of the richness that we have in our experience, nor is it likely to have self-awareness like we do (though a dolphin, with its large neocortex, is far more likely to).  So we have to keep these different levels of consciousness in mind in order to properly assess how they’re explained by any cognitive framework.

Looking through a PP lens, we can see that what we come to know about the world and our own bodily states is a matter of predictive models pertaining to various inferred causal relations.  These inferred causal relations ultimately stem from bottom-up sensory input.  But when this information is abstracted at higher and higher levels, eventually one can (in principle) get to a point where those higher level models begin to predict the existence of a unified prediction engine.  In other words, a subset of the highest-level predictive models may eventually predict itself as a self.  We might describe this process as the emergence of some level of self-awareness, even if higher levels of self-awareness aren’t possible unless particular kinds of higher level models have been generated.

What kinds of predictions might be involved with this kind of emergence?  Well, we might expect that predictions pertaining to our own autobiographical history, which is largely composed of episodic memories of our past experience, would contribute to this process (e.g. “I remember when I went to that amusement park with my friend Mary, and we both vomited!”).  If we begin to infer what is common or continuous between those memories of past experiences (even if only 1 second in the past), we may discover that there is a form of psychological continuity or identity present.  And if this psychological continuity (this type of causal relation) is coincident with an incredibly stable set of predictions pertaining to (especially internal) bodily states, then an embodied subject or self can plausibly emerge from it.

This emergence of an embodied self is also likely fueled by predictions pertaining to other objects that we infer to be subjects.  For instance, in order to develop a theory of mind about other people, that is, in order to predict how they will behave, we can’t simply model their external behavior as we can for something like a rock falling down a hill.  This can work up to a point, but eventually it’s not good enough as behaviors become more complex.  Animal behavior, most especially that of humans, is far more complex than that of inanimate objects, and as such it is far more effective to infer some internal hidden causes for that behavior.  Our predictions work well when they posit some kind of internal intentionality and goal-directedness operating within an animate object, thereby transforming that object into a subject.  This should be no less true for how we model our own behavior.

If this object that we’ve now inferred to be a subject behaves in ways that we see ourselves as behaving (especially if it’s another human being), then we can begin to infer some kind of equivalence despite our being separate subjects.  We can begin to infer that they too have beliefs, desires, and emotions, and thus that they have an internal perspective that we can’t directly access, just as they aren’t able to access ours.  And we can also come to see ourselves from a different perspective based on how we see those other subjects from an external perspective.  Since I can’t easily see myself from an external perspective, when I look at others and infer that we are similar kinds of beings, I can begin to see myself as having both an internal and an external side.  I can begin to infer that others see me much as I see them, thus further adding to my concept of self and the boundaries that define that self.

Multiple meta-cognitive predictions can be inferred from all of these interactions with others, and from introspective interactions with our brain’s own models.  Once this happens, a cognitive agent like ourselves may begin to think about thinking, and think about being a thinking being, and so on.  All of these cognitive moves would seem to provide varying degrees of selfhood or self-awareness, and they can all be thought of as the brain’s best guesses at accounting for its own behavior.  Either way, it seems that the level of consciousness intrinsic to an agent’s experience is going to depend on what kinds of higher level models and meta-models are operating within the agent.

Consciousness as Integrated Information

One prominent theory of consciousness is the Integrated Information Theory of Consciousness, otherwise known as IIT.  This theory, initially formulated by Giulio Tononi back in 2004 and developed ever since, posits that consciousness ultimately depends on the degree of information integration inherent in the causal properties of some system.  Another way of saying this is that the specified causal system is unified such that every part of the system must be able to affect, and be affected by, the rest of the system.  If you were to physically isolate one part of a system from the rest of it (choosing the partition that makes the least difference to the rest of the system), then the resulting change in the cause-effect structure of the system would quantify the degree of integration.  A large change in the cause-effect structure under this partition (what is called the minimum partition of the system) would imply a high degree of information integration, and vice versa.  And again, a high degree of integration implies a high degree of consciousness.
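
In rough schematic form (my own gloss; IIT’s actual formalism spells out the cause-effect structures, the distance measure, and the space of allowed partitions in far more detail), the integrated information Φ of a system S is the distance between the cause-effect structure of the intact system and that of the system cut along its minimum partition:

```latex
% Schematic only: CES(S) is the system's cause-effect structure, S/P is the
% system with the connections crossing partition P severed (noised), and
% D is a distance between cause-effect structures.
\Phi(S) \;=\; \min_{P \,\in\, \mathcal{P}(S)} D\big(\mathrm{CES}(S),\ \mathrm{CES}(S/P)\big)
```

The minimizing P is the minimum partition mentioned above.  Roughly speaking, a purely feed-forward system always admits a cut that makes no difference to its cause-effect structure, which is why IIT assigns it zero Φ, a point the next paragraph picks up.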

Notice how this information integration axiom in IIT implies that a cognitive system that is entirely feed-forward will not be conscious.  So if our brain processed incoming sensory information purely from the bottom up, with no top-down generative model feeding downward through the system, then IIT would predict that our brain couldn’t produce consciousness.  PP, on the other hand, posits a feedback system (as opposed to a feed-forward one) where the bottom-up sensory information that flows upward is met with a downward flow of top-down predictions trying to explain away that sensory information.  The brain’s predictions cause a change in the resulting prediction error, and this prediction error serves as feedback to modify the brain’s predictions.  Thus, a cognitive architecture like that suggested by PP is predicted to produce consciousness according to the most fundamental axiom of IIT.

Additionally, PP posits cross-modal sensory features and functionality whereby the brain integrates (especially lower level) predictions spanning various spatio-temporal scales from different sensory modalities into a unified whole.  For example, if I am looking at and petting a black cat lying on my lap and hearing it purr, PP posits that my perceptual experience, and my contextual understanding of that experience, are based on having integrated the visual, tactile, and auditory expectations that I’ve learned to associate with such an experience of a “black cat”.  A unified experience is contingent on a conjunction of predictions occurring simultaneously; otherwise we’d be left with a barrage of millions or billions of separate causal relations (including those stemming from different sensory modalities), or with millions or billions of separate conscious experiences, which would seem to necessitate separate consciousnesses if they were all happening at the same time.

Evolution of Consciousness

Since IIT identifies consciousness with integrated information, it can plausibly account for why consciousness evolved in the first place.  The basic idea is that a brain capable of integrating information is more likely to exploit and understand an environment with a complex causal structure on multiple time scales than a brain with informationally isolated modules.  This idea has been tested and confirmed to some degree by artificial life simulations (animats) in which adaptation and integration are both simulated.  The organisms in these simulations were Braitenberg-like vehicles that had to move through a maze.  After 60,000 generations of simulated brains evolving through natural selection, a monotonic relationship was found between their ability to get through the maze and the amount of simulated information integration in their brains.

This increase in adaptation resulted from an effective increase in the number of concepts the organism could make use of, given the limited number of elements and connections possible in its cognitive architecture.  In other words, given a limited number of connections in a causal system (such as a group of neurons), you can pack more functions per element if the level of integration with respect to those connections is high, thus giving an evolutionary advantage to those with higher integration.  Therefore, when all else is equal in terms of neural economy and resources, higher integration gives an organism the ability to take advantage of more regularities in its environment.

From a PP perspective, this makes perfect sense, because the complex causal structure of the environment is described as being modeled at many different levels of abstraction and at many different spatio-temporal scales.  All of these modeled causal relations are also described as having a hierarchical structure, with models contained within models and with many associations existing between various models.  These associations between models can be accounted for by a cognitive architecture that re-uses certain sets of neurons in multiple models, so the association is effectively instantiated by some literal degree of neuronal overlap.  And of course, these associations between multiply-leveled predictions allow the brain to exploit (and create!) as many regularities in the environment as possible.  In short, both PP and IIT make a lot of practical sense from an evolutionary perspective.

That’s All Folks!

And this concludes my post-series on the Predictive Processing (PP) framework and how I see it as being applicable to a far broader account of mentality and brain function than it is generally assumed to cover.  If there are any takeaways from this post-series, I hope you can at least appreciate the parsimony and explanatory scope of predictive processing, and of viewing the brain as a creative and highly capable prediction engine.

Predictive Processing: Unlocking the Mysteries of Mind & Body (Part II)

In the first post of this series I introduced some of the basic concepts involved in the Predictive Processing (PP) theory of perception and action.  I briefly tied together the notions of belief, desire, emotion, and action from within a PP lens.  In this post, I’d like to discuss the relationship between language and ontology through the same framework.  I’ll also start talking about PP in an evolutionary context, though I’ll have more to say about that in future posts in this series.

Active (Bayesian) Inference as a Source for Ontology

One of the main themes within PP is the idea of active (Bayesian) inference whereby we physically interact with the world, sampling it and modifying it in order to reduce our level of uncertainty in our predictions about the causes of the brain’s inputs.  Within an evolutionary context, we can see why this form of embodied cognition is an ideal schema for an information processing system to employ in order to maximize chances of survival in our highly interactive world.

In order to reduce the amount of sensory information that has to be processed at any given time, it is far more economical for the brain to only worry about the prediction error that flows upward through the neural system, rather than processing all incoming sensory data from scratch.  If the brain employs a set of predictions that can “explain away” most of the incoming sensory data, then the downward flow of predictions can meet the upward flow of sensory information and effectively cancel it out.  The only thing left to propagate upward through the system and do any “cognitive work” on the predictive models flowing downward (i.e. the only thing that needs to be processed) is the remaining prediction error, where prediction error = predicted sensory input minus actual sensory input.  This is similar to data compression strategies for video files (for example), which only encode the information that changes over time (pixels that change brightness or color) while compactly referencing the information that remains constant (pixels that do not change from frame to frame).
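
To make this “explaining away” concrete, here’s a minimal toy sketch (my own illustration, not anything from the PP literature; the generative mapping W, the noise level, and the update rate are all invented) in which only the prediction error does any work in updating the brain’s hypothesis about the hidden cause of its input:

```python
# Minimal predictive-coding sketch: a guess about a hidden cause is refined
# using nothing but the prediction error (the mismatch between the top-down
# prediction and the actual input; sign convention aside).
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical generative setup: a hidden cause produces sensory input via W.
W = rng.normal(size=(16, 4))              # mapping: latent cause -> sensory data
true_latent = rng.normal(size=4)
sensory_input = W @ true_latent + 0.05 * rng.normal(size=16)

latent = np.zeros(4)   # the brain's current best guess about the hidden cause
rate = 0.01            # update rate (playing the role of precision weighting)

for _ in range(1000):
    prediction = W @ latent              # top-down prediction of the input
    error = sensory_input - prediction   # the residual; the only signal used
    latent += rate * (W.T @ error)       # nudge the guess to explain the error away

print(np.round(latent, 2))       # the inferred cause ends up close to...
print(np.round(true_latent, 2))  # ...the cause that actually generated the input
```

The parallel with video compression is visible here too: once the model accounts for the bulk of the signal, only the residual needs to be carried forward.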

The ultimate goal for this strategy within an evolutionary context is to allow the organism to understand its environment in the most salient ways for the pragmatic purposes of accomplishing goals relating to survival.  But once humans began to develop culture and evolve culturally, the predictive strategy gained a new kind of evolutionary breathing space, being able to predict increasingly complex causal relations and developing technology along the way.  All of these inferred causal relations appear to me to be the very source of our ontology, as each hierarchically structured prediction and its ability to become associated with others provides an ideal platform for differentiating between any number of spatio-temporal conceptions and their categorical or logical organization.

An active Bayesian inference system is also ideal to explain our intellectual thirst, human curiosity, and interest in novel experiences (to some degree), because we learn more about the world (and ourselves) by interacting with it in new ways.  In doing so, we are provided with a constant means of fueling and altering our ontology.

Language & Ontology

Language is an important component as well, and it fits nicely within a PP framework, since it further links perception and action together and allows us to make new kinds of predictions about the world that wouldn’t have been possible without it.  A tool like language also makes a lot of sense from an evolutionary perspective, since better predictions about the world result in a higher chance of survival.

When we use language by speaking or writing, we are performing an action which is instantiated by the desire to do so (see the previous post about “desire” within a PP framework).  When we interpret language by listening or reading, we are performing a perceptual task, which is again simply another set of predictions (in this case, pertaining to the specific causes leading to our sensory inputs).  If we were simply sending and receiving non-lingual nonsense, the same basic predictive principles underlying perception and action would still apply, but something new emerges when we send and receive actual language (which carries information).  With language, we begin to associate certain sounds and visual information with some kind of meaning or meaningful information.  Once we can do this, we can effectively share our thoughts with one another, or at least many aspects of them.  This provides an enormous evolutionary advantage: now we can communicate almost anything we want to one another, store it in external forms of memory (books, computers, etc.), and further analyze or manipulate the information for various purposes (accounting, inventory, science, mathematics, etc.).

By being able to predict certain causal outcomes through the use of language, we are effectively using the lower level predictions associated with perceiving and emitting language to satisfy higher level predictions related to more complex goals, including those that extend far into the future.  Since the information that is sent and received amounts to testing or modifying our predictions of the world, we are effectively using language to share one brain’s set of predictions with another brain, and to modulate them.  One important aspect of this process is that the information is inherently probabilistic, which is why language often trips people up with ambiguities, nuances, multiple meanings behind words, and other attributes that frequently lead to misunderstanding.  Wittgenstein is one of the more prominent philosophers who caught onto this property of language and its consequences for philosophical problems and for how we see the world as structured.  I think a lot of the problems Wittgenstein elaborated on with respect to language can be better accounted for by treating language as dealing with probabilistic ontological/causal relations that serve some pragmatic purpose, with the meaning of any word or phrase best described by its use rather than by some clear-cut definition.

This probabilistic attribute of language, with the meanings of words having fuzzy boundaries, also tracks very well with the ontology that a brain currently has access to.  Our ontology, or what kinds of things we think exist in the world, is often categorized in various ways, with some of the more concrete entities given names such as: “animals”, “plants”, “rocks”, “cats”, “cups”, “cars”, “cities”, etc.  But if I morph a wooden chair (say, by chipping away at parts of it with a chisel), eventually it will no longer be recognizable as a chair, and it may begin to look more like a table, or like nothing other than an oddly shaped chunk of wood.  During this process, it may be difficult to point to the exact moment that it stopped being a chair and instead became a table or something else, and this would make sense if what we know to be a chair or table or what-have-you is nothing more than a probabilistic high-level prediction about certain causal relations.  If my brain perceives an object that produces too high a prediction error based on the predictive model of what a “chair” is, then it will try another model (such as the predictive model pertaining to a “table”), potentially moving to less and less specific models until it is satisfied with recognizing the object as merely a “chunk of wood”.
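
Here’s a toy sketch of that fallback process (entirely my own illustration; the features, prototypes, and error tolerances are invented for the example):

```python
# Toy sketch: category recognition as prediction-error testing, falling back
# to less specific models when the error for a specific model is too high.
import numpy as np

# Hypothetical prototype features: [seat_height_m, flat_top_area_m2, leg_count]
PROTOTYPES = {
    "chair":         np.array([0.45, 0.2, 4.0]),
    "table":         np.array([0.75, 1.0, 4.0]),
    "chunk of wood": np.array([0.5,  0.5, 0.0]),  # the least specific fallback
}
# Models are tried in order; the fallback model tolerates almost any error.
TOLERANCE = {"chair": 0.15, "table": 0.15, "chunk of wood": 10.0}

def recognize(observed):
    for label, prototype in PROTOTYPES.items():
        prediction_error = np.linalg.norm(observed - prototype)
        if prediction_error < TOLERANCE[label]:
            return label
    return "unknown"

# A chair being chiseled away drifts out of the "chair" model's tolerance:
print(recognize(np.array([0.45, 0.25, 4.0])))  # chair
print(recognize(np.array([0.70, 0.90, 4.0])))  # table
print(recognize(np.array([0.50, 0.60, 1.0])))  # chunk of wood
```

The particular numbers are beside the point; what matters is that category membership is graded by prediction error against a model, with less specific models tolerating more of it.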

From a PP lens, we can consider lower level predictions pertaining to more basic causes of sensory input (bright/dark regions, lines, colors, curves, edges, etc.) to form some basic ontological building blocks, and when they are assembled into higher level predictions, the amount of integrated information increases.  This information integration process leads to condensed probabilities about increasingly complex causal relations, and this ends up reducing the dimensionality of the cause-effect space of the predicted phenomenon (where a set of separate cause-effect repertoires is combined into a smaller number of them).

You can see the advantage here by considering what the brain might do when looking at a black cat sitting on a brown chair.  What if the brain were to treat this scene as merely a set of pixels on the retina that change over time, with no expectation that any subset of pixels will change in ways that differ from any other subset?  This wouldn’t be very useful in predicting how the visual scene will change over time.  What if instead, the brain differentiates one subset of pixels (corresponding to what we call a cat) from all the rest, and does this in part by predicting proximity relations between neighboring pixels in the subset (so if some black pixels move from the right to the left visual field, then some number of neighboring black pixels are predicted to move with them)?

This latter method treats the subset of black-colored pixels as a separate object (as opposed to treating the entire visual scene as a single object), and doing this kind of differentiation in more and more complex ways leads to a well-defined object or concept, or a large number of them.  Associating sounds like “meow” with this subset of black-colored pixels is just one example of yet another set of properties or predictions that further defines this perceived object as distinct from the rest of the perceptual scene.  Associating this object with a visual or auditory label such as “cat” finally links this ontological object with language.  As long as we agree on what is generally meant by the word “cat” (which we determine through its use), we can share and modify the predictive models associated with such an object or concept, as we can with any other successful instance of linguistic communication.
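
As a crude sketch of that grouping move (again my own toy example, not a model from the vision literature), here is the “pixels that change together form one object” idea in a dozen lines:

```python
# Toy sketch: carve an "object" out of the scene by grouping the pixels that
# change together between frames, instead of treating the scene as one blob.
import numpy as np
from scipy import ndimage

frame1 = np.zeros((8, 8))
frame2 = np.zeros((8, 8))
frame1[2:4, 2:4] = 1.0   # a small dark blob (the "cat") on the left...
frame2[2:4, 4:6] = 1.0   # ...which has moved two pixels to the right

changed = frame1 != frame2                    # pixels violating "nothing moves"
labels, num_objects = ndimage.label(changed)  # group neighboring changed pixels
print(num_objects)                            # 1: one contiguous moving region
```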

Language, Context, and Linguistic Relativism

However, it should be noted that as we get to more complex causal relations (more complex concepts/objects), we can no longer give these concepts a simple one-word label and expect to communicate information about them nearly as easily as we could for the concept of a “cat”.  Think about concepts like “love” or “patriotism” or “transcendence”: there are many different ways that we use those terms, they can mean all sorts of different things, and so our meaning behind those words is heavily conveyed to others by the context they are used in.  Context in a PP framework could be described as simply the (expected) conjunction of multiple predictive models (multiple sets of causal relations).  For example, the conjunction of the predictive models pertaining to food, pizza, and a desirable taste implies one particular use of the word “love”, as in the phrase “I love pizza”.  This use is different from one involving the conjunction of predictive models pertaining to intimacy, sex, infatuation, and care, as implied in a phrase like “I love my wife”.  In any case, conjunctions of predictive models can get complicated, and this carries over to our stretching our language to its very limits.
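
A toy sketch of that “context as a conjunction of models” idea (my own illustration; the concept sets are invented):

```python
# Toy sketch: disambiguate a word by checking which conjunction of currently
# active predictive models (concepts) it co-occurs with.
SENSES_OF_LOVE = {
    frozenset({"food", "pizza", "taste"}):      "love-as-enjoyment",
    frozenset({"intimacy", "care", "partner"}): "love-as-attachment",
}

def disambiguate(active_models):
    for conjunction, sense in SENSES_OF_LOVE.items():
        if conjunction <= active_models:   # the whole conjunction is active
            return sense
    return "underdetermined"               # context doesn't settle the use

print(disambiguate({"food", "pizza", "taste", "hunger"}))     # love-as-enjoyment
print(disambiguate({"intimacy", "care", "partner", "home"}))  # love-as-attachment
```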

Since we are immersed in language and since it is integral to our day-to-day lives, we also end up being conditioned to think linguistically in a number of ways.  For example, we often think with an interior monologue (e.g. “I am hungry and I want pizza for lunch”) even when we don’t plan on communicating this information to anyone else, so it’s not as if we’re simply rehearsing what we need to say before we say it.  I tend to think, however, that this linguistic thinking (thinking in our native language) is more or less a result of the fact that the causal relations we think about have become so strongly associated with certain linguistic labels and propositions that we automatically think of the causal relations alongside the labels we “hear” in our head.  This seems to be true even if the causal relations could, in principle at least, be thought of without any linguistic labels.  We’ve simply learned to associate the two so strongly that in most cases separating them is just not possible.

On the flip side, this tendency of language to attach itself to our thoughts also puts certain barriers or restrictions on those thoughts.  If we are always preparing to share our thoughts through language, then we’re going to become somewhat entrained to think in ways that can be most easily expressed in linguistic form.  So although language may simply be along for the ride with many non-linguistic aspects of thought, our tendency to use it may also structure our thinking and reasoning in significant ways.  This would account for why people raised in different cultures with different languages see the world differently based on the structure of their language.  While linguistic determinism (the strong version of the Sapir-Whorf hypothesis) seems to have been ruled out, there is still strong evidence to support linguistic relativism (the weak version of the Sapir-Whorf hypothesis), whereby one’s language affects one’s ontology and view of how the world is structured.

If language is so heavily used day-to-day, then this phenomenon makes sense as viewed through a PP lens: we’re going to end up putting a high weight on the predictions that link ontology with language, because these predictions have proven useful to us most of the time.  Minimal prediction error means that our Bayesian evidence is further supported, and the higher the weight carried by these predictions, the more they will constrain our overall thinking, including how our ontology is structured.

Moving on…

I think these are but a few of the interesting relationships between language and ontology, and a PP framework helps to put it all together nicely; I just haven’t seen this kind of explanatory power and parsimony in any other conceptual framework for how the brain functions.  This bodes well for the framework, and it’s becoming less and less surprising to see it further supported over time by studies in neuroscience, cognition, and psychology, as well as those pertaining to pathologies of the brain, perceptual illusions, etc.  In the next post in this series, I’m going to talk about knowledge and how it can be seen through the lens of PP.

The WikiLeaks Conundrum

I’ve been thinking a lot about WikiLeaks over the last year, especially given the consequences that have ensued with respect to the 2016 presidential election.  In particular, I’ve been thinking about the trade-offs that underlie any platform centered on publishing secret or classified information, news leaks, and the like.  I’m torn over the general concept, in terms of whether these kinds of platforms provide a net good for society, so I decided to write a blog post outlining my concerns through a brief analysis.

Make no mistake: I appreciate that there are people in the world who work hard, often taking huge risks to their own safety, in order to deliver any number of secrets to the general public, whether governmental, political, or corporate.  And this is by no means exclusive to WikiLeaks; it also applies to similar organizations and even to individual whistle-blowers like Edward Snowden.  In many cases, the information that is leaked to the public is vitally important in informing us about some magnate’s personal corruption, various forms of systemic corruption, or even outright violations of our constitutional rights (such as the NSA violating our right to privacy as outlined in the Fourth Amendment).

While the public tends to highly value the increased transparency that these kinds of leaks offer, they also open us up to a number of vulnerabilities.  One prominent example that illustrates some of these vulnerabilities is the influence on the 2016 presidential election resulting from the Clinton email leaks and the leaks pertaining to the DNC.  One might ask how exactly those leaks could have been a bad thing for the public.  After all, they just increased transparency and gave the public information that most of us felt we had a right to know.  Unfortunately, it’s much more complicated than that, as it can be difficult to know where to draw the line in terms of what should or should not be public knowledge.

To illustrate this point, imagine that you are a foreign or domestic entity that is highly capable of hacking.  Now imagine that you stand to gain an immense amount of land, money, or power if a particular political candidate in some election is elected, because you know about their current reach of power and their behavioral tendencies, their public or private ties to other magnates, and the kinds of policies they are likely to enact based on their public pronouncements in the media and their advertised campaign platform.  If you have the ability to hack into private information from every pertinent candidate and political party involved in that election, then you likely know secrets about the candidate whose win would benefit you (including their perspective on you as a foreign or domestic entity, and damning things about them that you could later use as leverage to bribe them once elected), and you also likely know damning things that could cripple the opposing candidate’s chances of being elected.

This point illustrates the following conundrum: while WikiLeaks can deliver important information to the public, it can also be used as a platform for malicious entities to influence our elections, to jeopardize our national or international security, or to cause any number of problems through “selective” sharing.  That is to say, such entities may have plenty of information that would be damning to both opposing political parties, but they may choose to deliver only half the story because of an underlying agenda to influence the election outcome.  This creates an obvious problem, not least because the public doesn’t consider the amount of hacked or leaked information that they didn’t get.  Instead they think they’ve simply become better informed about a political candidate or some policy issue, when in fact their judgment has been compromised: they’ve received a hyper-biased leak, one given to them intentionally to mislead them, even though its contents may in fact be true.  When people aren’t able to put new information in the proper context or perspective, that information can actually make them less informed.  It becomes an epistemological liability, distorting the facts without their knowing it and leading people to behave in ways they otherwise would not have, if only they’d had a few more pertinent details.

So now we have to ask ourselves: what can we do about this?  Should we just scrap WikiLeaks?  I don’t think that’s necessary, nor do I think it’s feasible even if we wanted to, since it would likely just be replaced by any number of other entities accomplishing the same ends (or it would become delocalized and revert to a bunch of disconnected sources).  Should we assume all leaked information has been leaked to serve some malicious agenda?

Well, a good dose of healthy skepticism could be part of the solution.  We don’t want to be irrationally skeptical of any and all leaks, but it makes sense to apply more scrutiny when it’s apparent that a leak could serve a malicious purpose.  We need to be deeply concerned about this unless or until hacking becomes so common that the number of leaks crosses a threshold past which it’s no longer pragmatically possible to share them selectively in service of these half-truth-driven political agendas.  Until that point is reached, if it ever is, given the arms race between encryption and hacking, we will have to question every seemingly important leak and work hard to make the public at large understand these concerns and take them seriously.  It’s too easy for the majority to be distracted by the proverbial carrot dangling in front of them, failing to realize that it may be some form of politically motivated bait.  In the meantime, we need to open up the conversation surrounding this issue and look into possible solutions to help mitigate our concerns.  Perhaps we’ll start seeing organizations that can better vet the sources of these leaks, or that can better analyze their immediate effects on the global economy, elections, etc., before deciding whether or not to release the information to the public.  This won’t be an easy task.

This brings me to my last point, which is that I don’t think people have a fundamental right to know every piece of information that’s out there.  If someone found a way to make a nuclear bomb using household ingredients, should that be public information?  Don’t people understand that many pieces of information are kept private or classified because that’s the only way some organizations can function, including organizations that strive to maintain or increase national and international security?  Do people want all information to be public even if it comes at the expense of creating humanitarian crises, or of further consolidating power in the hands of select plutocrats?  There’s long been debate over the trade-offs involved in giving up personal privacy to increase our safety.  Now the time has come to ask whether giving up some forms of privacy or secrecy on larger scales (whether we like it or not) is actually detracting from our safety or putting our democracy in jeopardy.

The Illusion of Persistent Identity & the Role of Information in Identity

After reading and commenting on a post at “A Philosopher’s Take” by James DiGiovanna titled Responsibility, Identity, and Artificial Beings: Persons, Supra-persons and Para-persons, I decided to expand on the topic of personal identity.

Personal Identity Concepts & Criteria

I think when most people talk about personal identity, they are referring to how they see themselves and how they see others in terms of personality and some assortment of (usually prominent) cognitive and behavioral traits.  Basically, they see it as what makes a person unique and in some way distinguishable from another person.  Even this rudimentary concept can be broken down into at least two parts, namely, how we see ourselves (self-ascribed identity) and how others see us (which we could call the inferred identity of someone else), since the two are likely to differ.  While most people tend to think of identity in these ways, when philosophers talk about personal identity, they are usually referring to the unique numerical identity of a person.  Roughly speaking, this amounts to whatever conditions or properties are both necessary and sufficient for a person at one point in time and a person at another point in time to be considered the same person, with a temporal continuity between those points in time.

Usually the criterion put forward for this personal identity is some form of spatiotemporal and/or psychological continuity.  I certainly wouldn’t be the first person to point out that the question of which criterion is correct has already framed the debate with the assumption that a personal (numerical) identity exists in the first place, and with the further assumption that, if it did exist, its criterion would be determinable in some way.  While it is not unfounded to believe that some properties exist that we could ascribe to all persons (simply because of what we find in common with all the persons we’ve interacted with thus far), I think it is far too presumptuous to believe that there is a numerical identity underlying our basic conceptions of personal identity, along with a determinable criterion for it.  At best, I think that if one finds any kind of numerical identity for persons that persists over time, it is not going to be compatible with our intuitions, nor is it going to be applicable in any pragmatic way.

Speaking of pragmatism, I am sympathetic to Parfit’s views in the sense that, regardless of what one finds the criteria for numerical personal identity to be (if it exists), the only thing that really matters to us is psychological continuity anyway.  So despite the fact that Locke’s view — that psychological continuity (via memory) is the criterion for personal identity — was shown to rest on circular and illogical arguments (per Butler, Reid, and others), I nevertheless applaud his basic idea.  Locke seemed to be on the right track, in that psychological continuity (in some sense involving memory and consciousness) really is the essence of what we care about when defining persons, even if it can’t be used as a valid criterion in the way he proposed.

(Non) Persistence & Pragmatic Use of a Personal Identity Concept

I think that the search for, and long debates over, the best criterion for personal identity have illustrated that what people have been trying to label as personal identity should probably be relabeled as some sort of pragmatic pseudo-identity.  The pragmatic considerations behind the common and intuitive conceptions of personal identity have no doubt steered the debate over any possible criteria for defining it, and we can still value those considerations even if a numerical personal identity doesn’t really exist (that is, even if it is nothing more than a pseudo-identity), and even if a diachronic numerical personal identity does exist but isn’t useful in any way.

If the object/subject that we refer to as “I” or “me” is constantly changing with every passing moment of time, both physically and psychologically, then I tend to think that the self (which many people take to be the “agent” of our personal identity) is an illusion of some sort.  I side more with Hume on this point (or at least with James Giles’ fair interpretation of Hume), in that my views seem to be some version of a no-self or eliminativist theory of personal identity.  As Hume pointed out, even though we intuitively ascribe a self and thereby some kind of personal identity, there is no logical reason supported by our subjective experience to think it is anything but a figment of the imagination.  This illusion results from our perceptions flowing from one to the next, with a barrage of changes to this “self” over time that we simply don’t notice — at least not without critical reflection on our past experiences of this ever-changing “self”.  The psychological continuity that Locke described seems to be the main driving force behind this illusory self, since there is an overlap in the memories of the succession of persons.

I think one could say that if there is any numerical identity associated with the term “I” or “me”, it exists only for a short moment of time in one specific spatio-temporal slice; as the next perceivable moment elapses, what used to be “I” becomes someone else, even if the new person that comes into being is still referred to as “I” or “me” by a person possessing roughly the same configuration of matter in its body and brain as the previous person.  Since the neighboring identities have an overlap in accessible memory (autobiographical memories, memories of past experiences generally, and memories pertaining to the evolving desires that motivate behavior), we shouldn’t expect this succession of persons to be noticed or perceived by the illusory self, because each identity has access to a set of memories sufficiently similar to the set accessible to the previous or successive identity.  And this sufficient degree of similarity in those identities’ memories allows for a seemingly persistent autobiographical “self” with goals.

As for the pragmatic reasons for considering all of these “I”s and “me”s to be the same person, some singular identity over time, we can note that there is a causal dependency between each member of this “chain of spatio-temporal identities”, so treating that chain of interconnected identities as one being is extremely intuitive and also incredibly useful for accomplishing goals (which is likely the reason evolution would favor brains that can intuit this concept of a persistent “self” and the near uni-directional behavior that results from it).  There is a continuity of memory and behaviors (even though both change over time, in terms of both the number of memories and their accuracy), and this continuity allows a process of conditioning to modify behavior in ways that actively rely on those chains of memories of past experiences.  We behave as if we are a single person moving through time and space (and as if we are surrounded by other temporally extended single persons behaving in similar ways), and this provides a means of assigning ethical and causal responsibility to something, or more specifically to some agent.  Quite simply, having those different identities referenced under one label and physically attached to, or instantiated by, something localized allows that pragmatic pseudo-identity to persist over time so that various goals (whether personal or interpersonal/societal) can be accomplished.

“The Persons Problem” and a “Speciation” Analogy

I came up with an analogy that I thought fit this concept very well.  One could analogize this succession of identities that get clumped into one bulk pragmatic pseudo-identity with the evolutionary concept of speciation.  A sequence of identities somehow constitutes an intuitively persistent personal identity, just as a sequence of biological generations somehow constitutes a particular species, due to the high degree of similarity between all the members.  The apparent difficulty lies in the fact that, after enough identities have succeeded one another, even the intuitive conception of a personal identity changes markedly, to the point of being unrecognizable from its ancestral predecessor, just as enough biological generations eventually lead to what we call a new species.  It’s difficult to define exactly when a speciation event happens (hence the species problem), and I think we have a similar problem with personal identity.  Where does it begin and end?  If personal identity changes over the course of a lifetime, when does one person become another?  I can think of “me” as the same “me” that existed one year ago, but if I go far enough back in time, say to when I was five years old, it is clear that “I” am a completely different person now compared to that five-year-old (different beliefs, goals, worldview, ontology, etc.).  There seems to have been an identity “speciation” event of some sort, even though it is hard to define exactly when it happened.

Biologists have tried to solve their species problem by coming up with various criteria, at the very least for taxonomical purposes, but what they’ve wound up with at this point is several different criteria for defining a species, each effective for different purposes (e.g. the biological-species concept, the morpho-species concept, the phylogenetic-species concept, etc.), with no single “correct” answer, since they are all more or less useful depending on the situation.  Similarly, some philosophers have had a persons problem that they’ve been trying to solve, and I gather that it is insoluble for similar “fuzzy boundary” reasons (indeterminate properties, situationally dependent properties, etc.).

The Role of Information in a Personal Identity Concept

Anyway, rather than attempt to solve the numerical personal identity problem, I think philosophers need to focus more on the concept of information and how it can be used to arrive at a more objective and pragmatic description of the personal identity of some cognitive agent (even if it is not used as a criterion for numerical identity, since information can be copied and the copies can be distinguished from one another numerically).  I think this is especially true once we take into account some of the concerns James DiGiovanna has raised concerning the integration of future AI into our society.

If all of the beliefs, behaviors, and causal driving forces in a cognitive agent can be represented in terms of information, then I think we can implement more universal conditioning principles within our ethical and societal framework, since they will be based more on the information content of the person’s identity, without putting as much importance on numerical identity or on our intuitions about persisting people (intuitions that will be challenged by several kinds of foreseeable future AI scenarios).

To illustrate this point, I’ll address one of James DiGiovanna’s conundrums.  James asks us:

To give some quick examples: suppose an AI commits a crime, and then, judging its actions wrong, immediately reforms itself so that it will never commit a crime again. Further, it makes restitution. Would it make sense to punish the AI? What if it had completely rewritten its memory and personality, so that, while there was still a physical continuity, it had no psychological content in common with the prior being? Or suppose an AI commits a crime, and then destroys itself. If a duplicate of its programming was started elsewhere, would it be guilty of the crime? What if twelve duplicates were made? Should they each be punished?

In the first case, if the information constituting the new identity of the AI after reprogramming is such that it no longer needs any kind of conditioning, then it would be senseless to punish the AI, other than to appease humans who may be angry that they couldn’t avoid punishment in the same way, having a much slower and less effective means of reprogramming themselves.  I would say that the reprogrammed AI is guilty of the crime only if its reprogrammed memory still includes information pertaining to having performed those past criminal behaviors.  If those “criminal memories” are gone after the reprogramming, then I’d say the AI is not guilty of the crime, because the information constituting its identity no longer matches that of the criminal AI.  It would have no recollection of having committed the crime, and so “it” would not have committed the crime, since that “it” was lost in the reprogramming process due to the dramatic change in information that took place.

In the latter scenario, if the information constituting the identity of the destroyed AI was re-instantiated elsewhere, then I would say that the new instantiation is in fact guilty of the crime, though not numerically guilty but rather qualitatively guilty (to differentiate between the numerical and qualitative personal-identity concepts embedded in the concept of guilt).  If twelve duplicates of this information were instantiated in new AI hardware, then likewise all twelve of those cognitive agents would be qualitatively guilty of the crime.  What actions should be taken based on qualitative guilt?  I think it means that the AI should be punished, or more specifically that the judicial system should perform whatever reconditioning is required to modify its behavior, as if it had committed the crime (especially if the AI believes or remembers that it committed the crime), for the good of society.  If this can be accomplished through reprogramming, then that would be the most rational course of action, with no need for traditional forms of punishment.
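As a toy illustration of this numerical/qualitative distinction (a sketch under my own assumptions; the agents, the memory sets, and the 95% threshold are all invented), numerical identity amounts to being the very same individual, while qualitative guilt amounts to a sufficiently close match in information content:

```python
from dataclasses import dataclass

@dataclass
class Agent:
    """A cognitive agent reduced, for illustration, to its information content."""
    memories: frozenset

def numerically_identical(a: Agent, b: Agent) -> bool:
    # Numerical identity: literally the same individual (the same object).
    return a is b

def qualitatively_guilty(suspect: Agent, criminal_record: frozenset,
                         threshold: float = 0.95) -> bool:
    # Qualitative guilt: the suspect's information content matches the
    # criminal's closely enough (here, overlap of memory sets).
    overlap = len(suspect.memories & criminal_record) / len(suspect.memories | criminal_record)
    return overlap >= threshold

criminal = Agent(frozenset({"memory-1", "memory-2", "committed-the-crime"}))
record = criminal.memories

duplicate = Agent(frozenset(record))                    # re-instantiated copy
reformed = Agent(frozenset({"memory-1", "memory-2"}))   # crime memories erased

print(numerically_identical(criminal, duplicate))   # False: a distinct individual
print(qualitatively_guilty(duplicate, record))      # True: same information content
print(qualitatively_guilty(reformed, record))       # False: the "criminal" identity is gone
```

On this picture, all twelve duplicates would come out qualitatively guilty while none would be numerically identical to the original, which is precisely why I think the information content should carry the ethical weight.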

We can run an analogous thought experiment with human beings.  Imagine a human whose memories have been changed such that they believe they are Charles Manson and have all of Charles Manson’s memories and intentions; that person should then be treated as if they are Charles Manson, and thus incarcerated or punished accordingly, in order to rehabilitate them or to protect the other members of society.  This assumes, of course, that we had reliable access to that kind of mind-reading knowledge.  If we did, the information constituting the identity of that person would be what matters most, not what the actual previous actions of the person were, because the “previous person” was someone else, given that gross change in information.

Conscious Realism & The Interface Theory of Perception

A few months ago I was reading an interesting article in The Atlantic about Donald Hoffman’s Interface Theory of Perception.  As a person highly interested in consciousness studies, cognitive science, and the mind-body problem, I found the basic concepts of his theory quite fascinating.  What was most interesting to me was the counter-intuitive connection between evolution and perception that Hoffman has proposed.  Now, it is certainly reasonable and intuitive to assume that evolutionary natural selection would favor perceptions that are closer to “the truth,” that is, closer to the objective reality that exists independent of our minds, simply because more accurate perceptions seem more likely to lead to survival than less accurate ones.  As an example, if I were to perceive lions as inert objects like trees, I would be more likely to be selected against and eaten by a lion than someone who perceives lions as mobile predators that could kill them.

While this is intuitive and reasonable to some degree, what Hoffman actually shows, using evolutionary game theory, is that among organisms of comparable complexity, those with perceptions closer to reality are never selected for nearly as much as those with perceptions tuned to fitness instead.  What’s more, truth in this case will be driven to extinction when it is up against perceptual models tuned to fitness.  That is to say, evolution will select for organisms that perceive the world in a way that is less accurate (in terms of the underlying reality), as long as the perception is tuned for survival benefits.  The bottom line is that, at any given level of complexity, it is more costly to process more information (costing more time and resources), so if a “heuristic” method for perception can evolve instead, one that “hides” all the complex information underlying reality and instead provides a species-specific guide to adaptive behavior, that will always be the preferred option.

To see this point more clearly, let’s consider an example.  Imagine there’s an animal that regularly eats some kind of insect, such as a beetle, but it needs to eat beetles of a particular size, or else it has a relatively high probability of eating the wrong kind of beetle (and we can assume that the “wrong” kind of beetle would be deadly to eat).  Now imagine two possible types of evolved perception.  It could have really accurate perceptions of the various sizes of beetles it encounters, so that it can distinguish many different sizes from one another (and then choose the proper size range to eat).  Or it could evolve less accurate perceptions, such that all beetles that are either too small or too large appear indistinguishable from one another (say all the wrong-sized beetles, whether too large or too small, look like indistinguishable red blobs), while all the beetles in the ideal size range for eating appear as green blobs (again, indistinguishable from one another).  So the only discrimination in this latter type of perception is between red and green blobs.

Both types of perception would solve the problem of which beetles to eat or avoid, but the latter type (even though much less accurate) would bestow a fitness advantage over the former, by allowing the animal to process far less information about its environment and to ignore relatively useless information (like specific beetle size).  In this case, with beetle size as the only variable under consideration for survival, evolution would select for the organism that knows less total information about beetle size, as long as it knows what matters most: distinguishing the edible beetles from the poisonous ones.  Now, we can imagine that in some cases the fitness function could align with the true structure of reality, but this is not what we expect to see generically in the world.  At best we may see some overlap between the two, but since there needn’t be any, truth will tend to go extinct.
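Here’s a rough simulation of this trade-off in Python.  All the numbers (the edible size range, the energy reward, and especially the per-encounter information-processing costs) are invented for illustration; this is not Hoffman’s actual model, only a sketch of the qualitative point that, once perception itself has a cost, the coarse red/green interface outcompetes the truth-tracking one even when both make the same eating decisions.

```python
import random

random.seed(1)

EDIBLE = (4.0, 6.0)   # beetles in this size range are safe to eat (invented)
COST = {"truth": 0.5, "interface": 0.1}   # assumed information-processing costs

def payoff(strategy, size):
    """Net payoff for one beetle encounter.  Both strategies make the same
    eat/avoid decision; they differ only in how much information they process."""
    eat = EDIBLE[0] <= size <= EDIBLE[1]   # red/green is enough for this choice
    reward = 5.0 if eat else 0.0           # energy from an edible beetle (invented)
    return reward - COST[strategy]

def mean_fitness(strategy, n=100_000):
    return sum(payoff(strategy, random.uniform(0, 10)) for _ in range(n)) / n

truth, interface = mean_fitness("truth"), mean_fitness("interface")
print(f"truth-tuned perceiver:  {truth:.2f}")      # lower net fitness
print(f"red/green interface:    {interface:.2f}")  # higher net fitness

# A crude replicator dynamic: each strategy's population share grows in
# proportion to its fitness, so the small per-encounter cost compounds.
share = 0.5   # truth perceiver's initial share of the population
for _ in range(50):
    share = share * truth / (share * truth + (1 - share) * interface)
print(f"truth perceiver's share after 50 generations: {share:.4f}")
```

The share collapsing toward zero is this sketch’s version of the claim that truth goes extinct whenever there is any cost to perceiving it.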

Perception is Analogous to a Desktop Computer Interface

Hoffman analogizes this concept of a “perception interface” with the desktop interface of a personal computer.  When we see icons of folders on the desktop and drag one of those icons to the trash bin, we shouldn’t take that interface literally, because there isn’t literally a folder being moved to a literal trash bin.  Rather, the interface hides most if not all of what is really going on in the background: all those diodes, resistors, and transistors being manipulated in order to modify stored information represented in binary code.

The desktop interface ultimately provides us with an easy and intuitive way of accomplishing these various information-processing tasks, because trying to do so in the most “truthful” way, by manually manipulating every diode, resistor, and transistor to accomplish the same task, would be far more cumbersome and less effective than using the interface.  The interface, by hiding this truth from us, therefore allows us to “navigate” that computational world with more fitness.  In this case, having more fitness simply means being able to accomplish information-processing goals more easily, with fewer resources, and so on.
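The layering can be made explicit in code.  In this toy sketch (all the class and method names are invented), the caller only ever sees folders and a trash bin, while the Disk class stands in for the hidden “diodes and transistors” level:

```python
class Disk:
    """Stand-in for the hidden layer: raw storage blocks the user never sees."""
    def __init__(self):
        self.blocks = {}                 # block address -> raw bytes

    def free_blocks(self, addresses):
        for addr in addresses:
            self.blocks.pop(addr, None)  # the low-level "truth" of deletion

class Desktop:
    """The interface layer: icons and a trash bin, nothing else is visible."""
    def __init__(self, disk):
        self._disk = disk
        self._folders = {}               # folder name -> block addresses

    def create_folder(self, name, addresses):
        self._folders[name] = addresses
        for addr in addresses:
            self._disk.blocks[addr] = b"..."

    def move_to_trash(self, name):
        # One intuitive gesture at the interface level triggers a cascade of
        # low-level operations the user never perceives.
        self._disk.free_blocks(self._folders.pop(name))

desktop = Desktop(Disk())
desktop.create_folder("manuscript", addresses=[16, 17, 18])
desktop.move_to_trash("manuscript")   # easy at the interface; hidden below
```

The user’s “fitness” here is the ease of the one-line gesture; the truth of what happened lives entirely in a layer they cannot see from the interface.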

Hoffman goes on to say that even though we shouldn’t take the desktop interface literally, we should obviously still take it seriously, because moving that folder to the trash bin can have direct implications for our lives, potentially destroying months’ worth of valuable work on the manuscript contained in that folder.  Likewise, we should take our perceptions seriously, even if we don’t take them literally.  We know that stepping in front of a moving train will likely end our conscious experience, even if this is for causal reasons that we have no epistemic access to via our perception, given the species-specific “desktop interface” that evolution has endowed us with.

Relevance to the Mind-body Problem

The crucial point of this analogy is that if our knowledge were confined to the desktop interface of the computer, we’d never be able to ascertain the underlying reality of the “computer,” because all the information about that underlying reality that we don’t need to know is hidden from us.  The same applies to our perception: it would be epistemically isolated from the underlying objective reality.  I want to add that even though it appears we have found the underlying guts of our consciousness, i.e., the findings in neuroscience, it would be a mistake to think that this approach will conclusively answer the mind-body problem.  The interface we’ve used to discover our brains’ underlying neurobiology is still the “desktop” interface.

So while we may think we’ve found the underlying guts of “the computer,” this is far from certain, given the plausibility of, and the support for, Hoffman’s theory.  This may end up being the reason why many philosophers claim there is a “hard problem” of consciousness, one that can’t be solved.  It could be that we are simply stuck in the desktop interface, with no way to find out about the underlying reality that gives rise to it.  All we could do is maximize our knowledge of the interface itself, and that would be our epistemic boundary.

Predictions of the Theory

Now, if this were just a fancy idea put forward by Hoffman, it would be interesting in its own right, but the fact that it is supported by evolutionary game theory and genetic-algorithm simulations shows that the theory is more than plausible.  Even better, it is actually a scientific theory (and not just a hypothesis), because it makes falsifiable predictions as well.  It predicts that “each species has its own interface (with some similarities between phylogenetically related species), almost surely no interface performs reconstructions (read the second link for more details on this), each interface is tailored to guide adaptive behavior in the relevant niche, much of the competition between and within species exploits strengths and limitations of interfaces, and such competition can lead to arms races between interfaces that critically influence their adaptive evolution.”  The theory predicts that interfaces are essential to understanding evolution and the competition between organisms, whereas the reconstruction theory makes such understanding impossible.  Thus, evidence of interfaces should be widespread throughout nature.

In his paper, Hoffman mentions the jewel beetle as a case in point.  This beetle has a perceptual category, desirable females, which works well in its niche, and it uses it to choose larger females because they are the best mates.  According to the reconstructionist thesis, the male’s perception of desirable females should incorporate a statistical estimate of the true sizes of the most fertile females, but it doesn’t.  Instead, it has a category based on “bigger is better,” and although this yields highly fit behavior for the male beetle in its evolutionary niche, when it comes into contact with a “stubbie” beer bottle it falls into an infinite loop, drawn to this supernormal stimulus because it is smooth, brown, and extremely large.  We can see that the “bigger is better” perceptual category relies on less information about the true nature of reality and instead takes an “informational shortcut.”  The evidence of supernormal stimuli, which have been found in many species, further supports the theory and counts against the reconstructionist claim that perceptual categories estimate the statistical structure of the world.
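A quick sketch of that heuristic in Python (the features and sizes are invented) makes the failure mode obvious: nothing in the category estimates whether the object is actually a female, so a supernormal stimulus hijacks it.

```python
def desirability(stimulus):
    """The male's perceptual category: smooth + brown + big = desirable.
    Nothing here estimates whether the object is actually a female."""
    if stimulus["smooth"] and stimulus["color"] == "brown":
        return stimulus["size"]          # "bigger is better," without limit
    return 0.0

female = {"smooth": True, "color": "brown", "size": 4.0, "is_female": True}
beer_bottle = {"smooth": True, "color": "brown", "size": 40.0, "is_female": False}

# Perfectly adaptive in the niche it evolved in (where only females fit the
# category), but it prefers the bottle the moment the environment changes.
best = max([female, beer_bottle], key=desirability)
print(best["is_female"])   # False: the supernormal stimulus wins
```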

More on Conscious Realism (Consciousness is all there is?)

This last link shows the mathematical formalism of Hoffman’s conscious-realist theory, with proofs by Chetan Prakash.  It contains a thorough explanation of conscious realism (which goes above and beyond the interface theory of perception), and it also provides answers to common objections put forward by other scientists and philosophers.