A voice without a mouth no more: The neurobiology of language and consciousness

Most research on the neurobiology of language ignores consciousness and vice versa. Here, language, with an emphasis on inner speech, is hypothesised to generate and sustain self-awareness, i.e., higher-order consciousness. Converging evidence supporting this hypothesis is reviewed. To account for these findings, a ‘HOLISTIC’ model of the neurobiology of language, inner speech, and consciousness is proposed. It involves a ‘core’ set of inner speech production regions that initiate the experience of feeling and hearing words. These take on affective qualities, deriving from activation of associated sensory, motor, and emotional representations, involving a largely unconscious dynamic ‘periphery’, distributed throughout the whole brain. Responding to those words forms the basis for sustained network activity, involving ‘default mode’ activation and prefrontal and thalamic/brainstem selection of contextually relevant responses. Evidence for the model is reviewed, supporting neuroimaging meta-analyses are conducted, and comparisons with other theories of consciousness are made. The HOLISTIC model constitutes a more parsimonious and complete account of the ‘neural correlates of consciousness’ that has implications for a mechanistic account of mental health and wellbeing.


Introduction
The average adult brain 'stores' more than 40,000 words (Brysbaert et al., 2016) and spends about 11 h a day engaged in tasks reliant on those words (like emailing, instant messaging, watching television, etc.) (Chui et al., 2012). Furthermore, we spend a large portion of our waking day talking to ourselves, i.e., using outer self-talk and inner speech (Alderson-Day and Fernyhough, 2015, 2014; Hurlburt et al., 2013). Thus, our brains are here estimated to minimally process some 150,000 word tokens a day and more than 3,500,000,000 words over an average human lifespan.
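As a rough check on these figures, the short Python sketch below multiplies the daily estimate out to a yearly total and asks how many years of exposure the lifetime figure implies. The 365-day year and the function names are illustrative assumptions, not part of the original estimate.

```python
# Back-of-the-envelope check of the word-exposure estimates quoted in the text.
WORDS_PER_DAY = 150_000          # daily estimate from the text
LIFETIME_WORDS = 3_500_000_000   # lifetime estimate from the text
DAYS_PER_YEAR = 365              # assumption: leap days are ignored


def words_per_year(words_per_day: int = WORDS_PER_DAY) -> int:
    """Yearly word exposure implied by a constant daily rate."""
    return words_per_day * DAYS_PER_YEAR


def years_to_reach(total_words: int = LIFETIME_WORDS,
                   words_per_day: int = WORDS_PER_DAY) -> float:
    """Years of exposure needed to accumulate total_words at the daily rate."""
    return total_words / words_per_year(words_per_day)


if __name__ == "__main__":
    print(f"{words_per_year():,} words per year")
    print(f"{years_to_reach():.1f} years to reach {LIFETIME_WORDS:,} words")
```

Under these assumptions, the daily rate accumulates to the lifetime figure in roughly 64 years, so the two estimates are mutually consistent for any lifespan longer than that.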
What are the consequences of processing 3.5 billion words for human cognition? Given this ubiquity, one might imagine words and language are a fundamental mechanism underlying and inseparable from most cognitive functioning. Yet, language is not considered central to the study or understanding of cognition and is even oddly marginalised (Lupyan, 2016). Most non-linguistic studies of cognition treat language merely as a vehicle for stimulus presentation and/or verbal report. Large numbers of language experiments focus on language competence (i.e., the knowledge that people have of phonemes, syllables, morphology, syntax, etc.) and not real-world language performance or use (Skipper, 2015). Yet, substantial empirical evidence now demonstrates that language use shapes and even determines visual and emotional perception, learning and memory, reasoning and social cognition and vice versa, even when words are not explicitly being used (Barrett et al., 2007; Holtgraves and Kashima, 2008; Lindquist, 2017; Lupyan, 2016, 2012; Lupyan et al., 2020; Ünal and Papafragou, 2016).
This puzzling lack of focus on language in cognitive science despite its pervasiveness extends to that most astonishing and seemingly mysterious cognitive process, consciousness. Most of the empirical research on the behavioural and 'neural correlates of consciousness' does not discuss language or does so only peripherally as just another cognitive process (Bisenius et al., 2015; Boly et al., 2017; Koch et al., 2016; Odegaard et al., 2017; Rees et al., 2002). Conversely, most language research does not concern itself with consciousness. Building on the work of many scholars (cited throughout), this article challenges these dogmas with the argument that language has penetrated the brain to such an extent that it is a fundamental mechanism for generating and maintaining consciousness.
To refine this claim, some definitions are needed. First, 'language' as used herein pertains mostly to the heard and vocally (externally and internally) produced variety, though it is believed the proposed model can be extended to signed languages (Alderson-Day and Fernyhough, 2015; Atkinson, 2006; Atkinson et al., 2007). Second, theory and empirical evidence suggest that consciousness should not be defined as being dichotomous (i.e., on or off) or even two-dimensional (e.g., arousal/wakefulness vs awareness/content) but, rather, as being multifactorial or consisting of a multidimensional state space (Bayne et al., 2016; Birch et al., 2020; Fazekas and Overgaard, 2017, 2016). For simplicity, consciousness is conceptualised here as occupying two gross regions of this space referred to as 'primary' and 'higher-order' consciousness (partly after Edelman et al., 2011).
Primary consciousness involves wakeful, attentive, and aware processing of the environment and is conceptually similar to other characterisations, including 'awareness', 'consistent' and 'sensorimotor' awareness, 'core' and 'minimal' consciousness, and the 'ecological self' (Morin, 2006). In contrast, higher-order consciousness involves levels of self-awareness (becoming 'the object of one's own attention') and meta-self-awareness (being aware of being self-aware) (Morin, 2006). It is conceptually similar to other characterisations like 'extended', 'recursive', 'reflective', and 'self' consciousness and the 'extended' and 'narrative' self, many of which highlight or even necessitate the role of language (Morin, 2006).
Thus, the specific argument advanced in this article is that the neurobiology of language, with an emphasis on inner speech, produces and sustains higher-order consciousness and modifies primary conscious experience (though this is not equivalent to saying one cannot be self-aware without language or that all thinking is in words) (Hurlburt and Akhter, 2008). To support this argument, the 'special' relationship among language, inner speech, and consciousness is first reviewed from a comparative cognition perspective (2.1). Next, the article reviews anecdotal, behavioural, and neurobiological evidence suggesting that language is a mechanism for producing consciousness (2.2 and 2.3). After describing relevant aspects of inner speech in more detail (3.1), a mechanistic account of how inner speech might lead to higher-order consciousness is proposed (3.2). This is followed by a formalisation of how these mechanisms operate in the 'HOLISTIC' model of the neurobiology of language, inner speech, and consciousness (4.1). Supporting evidence and neuroimaging meta-analyses are provided (4.2) and application of the model to mental health, wellbeing, and psychotherapy is discussed (4.3). Before concluding, the relationships between the HOLISTIC model and other theories of consciousness are discussed (5).

Comparative overview
Most people have an intuition that there is some 'special' relationship between language and consciousness. Where does this intuition derive from? In this section, this link is examined from a comparative cognition perspective (see also Bermúdez, 2007; Carruthers, 2002; Clark, 1998; Dennett, 1997; Frankish, 2018, 2002; Jackendoff, 1987; James, 1890; Jaynes, 2000; Rosenthal, 1990; Wiley, 2014). First, the belief that language is somehow important for consciousness comes from our phenomenological experience of the world. In Consciousness Explained, the philosopher Daniel C Dennett puts it this way:
We often do discover what we think (and hence what we mean) by reflecting on what we find ourselves saying. The fact that we said it gives it a certain personal persuasiveness or at least a presumption of authenticity. (Dennett, 1993)²
That is, it seems that we do not become aware of what we think until it is expressed in words. This is consistent with a second reason that emerges from a sort of folk comparative neuroscience. Namely, our pets seem reasonably conscious but not particularly self-aware. One straightforward difference between pets and humans is that humans have brains that use language. Indeed, consciousness is considered to be widespread in the animal kingdom, including cephalopods, birds, and mammals (Birch et al., 2020). While these animals (including our pets) exhibit primary consciousness, humans have higher-order consciousness that might be the result of a lifetime's exposure to more than three billion words. The neuroscientists Gerald M Edelman, Joseph A Gally, and Bernard J Baars elaborate:
By reference to linguistic tokens, humans can divorce themselves from the 'remembered present' of primary consciousness. What emerges as a result of the combination of primary and higher-order consciousness is a narrative capability encompassing past experience and future plans, as well as the ability to be conscious of being conscious. (Edelman et al., 2011)
Thus, it seems we somehow emerge from primary consciousness into higher-order consciousness by being able to use language. Indeed, animals other than humans do not display very sophisticated feats of self-awareness. For example, the available evidence is mixed as to whether primates other than apes and humans show mirror self-recognition (Anderson and Gallup, 2015) or if apes can explicitly represent the beliefs of others, an ability predicated on self-awareness (Bettle and Rosati, 2020; Horschler et al., 2020; Kano et al., 2020). In contrast, human children start representing others' beliefs implicitly around two and can explicitly do so at about four years of age, achievements often linked to language development (Milligan et al., 2007; Nilsson and de López, 2016).
By extension, if we simply teach non-human primates to use language they might display evidence of more complex forms of self-awareness. Several apes have been taught hundreds of signs that they appear to use adeptly (e.g., Gardner and Gardner, 1969). However, there is little evidence that these signs or other gestures are used to spontaneously refer to or discuss themselves or others. Rather, some have argued that these are mostly used to make requests and solicit rewards (Byrne et al., 2017; Cartmill et al., 2011; Terrace, 2005; Terrace et al., 1979). Thus, though having hundreds of signs is impressive, perhaps it is simply not enough. Indeed, human foetuses begin learning language in utero (Moon et al., 2013) and, by adulthood, have vocabularies in the range of 40,000-70,000 words (Brysbaert et al., 2016; Segbers and Schroeder, 2017) that are used to report on oneself or others in about 65% of all language content (Dunbar et al., 1997; Dunbar, 2004).
What explains these disparities between human and non-human primates? They might exist because humans evolved to be more socially cooperative compared to the more fundamentally competitive great ape phenotype (Bettle and Rosati, 2020; Tomasello et al., 2012). This might have led to pressures that selected for a brain architecture that could support language because language permits more sophisticated cooperation, e.g., through more sophisticated reasoning about others' minds (Bettle and Rosati, 2020). This leads to another reason for the intuition that there is a special relationship between language and higher-order consciousness. Namely, the social nature of language that allows us to reason about others' minds, when turned in on the self, allows us to have thoughts about the self. The psychologist Alain Morin has long insisted on a perspective like this (Morin, 2011, 2009, 2006, 2005, 2001, 1993; Morin and Everett, 1990; Morin and Michaud, 2007):
2 Dennett has been quoted as suggesting language is necessary for consciousness. See: https://www.edge.org/response-detail/11902

J.I. Skipper
The social mechanism initiating the taking of others' perspectives, and resulting in an objective vision of oneself, can be reproduced by self-talk; also, self-talk allows a reproduction for oneself of the appraisals we get from others. And finally, self-talk creates a redundancy of self-information within the self, and with it a distance (essential to self-awareness) between self-information and the individual (the self). (Morin, 1993)
Furthermore, much of this self-talk, initially meant to serve the social/cooperative needs of humans, cannot be readily made conscious without language. As Morin states:
though many public self-aspects (like physical features, gestures, postural and motor characteristics, mannerisms, etc.) and some private ones (e.g., bodily states) need not be verbalized in order to be 'seen,' it must be acknowledged that most private self-aspects like beliefs, attitudes, personality traits, or personal virtues, can hardly be brought to consciousness without self-verbalizations. (Morin and Everett, 1990)
That is, language allows us to talk about categories, groups of information key to human psychological function like concepts, emotion, intelligence, memory, and personality that are not likely natural kinds (Barrett, 2006; Machery, 2005; Michaelian, 2011). Though other primates might reason with categories, language clearly enhances and expands on these capacities dramatically (Cacchione et al., 2016; Gelman and Roberts, 2017).
To summarise, there is a fundamental relationship between language, inner speech, and higher-order consciousness that roughly seems to be associated with the separation of words (as felt and heard) from our other 'thoughts' (as in primary consciousness), a separation that allows us to describe the world and ourselves in a manner that both creates and extends the self in time. Before expanding on this description, evidence for this 'special' relationship is reviewed, deriving from various anecdotal reports, behavioural studies, and neurobiological evidence, from a diverse array of converging methods.

Anecdotal evidence
The anecdotal accounts all roughly describe episodes of not having speech as being in states of primary consciousness. The subsequent learning or recovery of language is described as the return of higher-order consciousness (though see Mitchell, 2009). This is clearly described by Helen Keller, who was deaf and blind and did not learn a language until middle childhood. In 'Before the Soul Dawn' (Chapter XI) of her book 'The World I Live In', she writes:
Before my teacher came to me, I did not know that I am. I lived in a world that was no world. I cannot hope to describe adequately that unconscious, yet conscious time of nothingness. I did not know that I knew aught or that I lived or acted or desired. Since I had no power of thought, I did not compare one mental state with another. When I learned the meaning of 'I' and 'me' and found that I was something, I began to think. Then consciousness first existed for me. ("The project Gutenberg eBook of the world I live in, by Helen Keller", 2009)
There are similar anecdotal descriptions of the phenomenological experience of recovering from aphasia, an impaired ability or inability to use language following brain damage (Morin, 2009; Moss, 1972; Ojemann, 1986). The author Lauren Marks (Marks, 2017) states that her 'sense of awareness lurched forward in stages'³ during recovery from aphasia due to a ruptured aneurysm. Before language returned, she writes:
I had a nothing mind, a flotsam mind. I was incredibly focused on the present, with very little awareness or interest in my past or future. My entire environment felt interconnected, like cells in a large, breathing organism. I felt less like myself and more like everything around me.⁴
Regarding inner speech, Marks later says:
I think it's very, very difficult to access a sense of self (personality and/or preferences) when you don't have access to your own inner voice. Ego feels partially linked to the linguistic skill of expressing that ego.⁵
The clinical psychologist, Claude S Moss, provides a similar account regarding the loss of inner speech following a stroke in his 1972 book 'Recovery with Aphasia':
I had also lost the ability to engage in self-talk. In other words, I did not have the ability to think about the future - to worry, to anticipate or perceive it - at least not with words. Thus for the first four or five weeks after hospitalization I simply existed. (As quoted in Morin, 2001)
Also with reference to inner speech, the neuroscientist Jill B Taylor (see Taylor, 2009) describes her experience of recovery from a stroke in a manner similar to Marks and Moss:
You wake up in the morning and the first thing your brain says is, 'Oh man, the sun is shining.' Well, imagine you don't hear that little voice that says, 'Man, the sun is shining.' You just experience the sun and the shining.⁶
There are some 10-20 accounts of 'aphasia from the inside' that typically have a similar flavour (e.g., as portrayed by Samuel Beckett, who provides the title of this article) (Ardila and Rubio-Bruno, 2018; Lecercle and Riley, 2004; Salisbury, 2008). In addition to these, there are numerous descriptions of children who were raised by non-human animals⁷ or in confinement and did not develop language. These individuals are regularly described as not initially being 'conscious' and lacking self-awareness, e.g., as putatively demonstrated by not having mirror self-recognition (Curtiss, 2014; McCrone, 2003; Newton, 2011, 1996).

Experimental evidence
These various anecdotal accounts suggest that language and inner speech produce higher-order consciousness. However, they are all post hoc reconstructions, relying on memories that are possibly fallible. Some individuals may have also had additional comorbid neurological problems other than simply losing or not having language. The anecdotes might also be subject to an over-representation bias given the stunning nature of the stories being told. In the case of feral children, source material inevitably reflects historical political and sociocultural biases (as potentially reflected in titles like the 'Savage Girl of Champagne'). Thus, to further evaluate the relationship between language, inner speech, and higher-order consciousness, the extant experimental behavioural and neurobiological literature is reviewed.

Behavioural
A number of behavioural studies suggest that words facilitate the conscious experience of non-linguistic stimuli, perhaps through categorically organising relevant sensory information. In particular, hearing or reading colour, motion, and object words promotes the detection of visual colour, motion, and objects (Boutonnet and Lupyan, 2015; Forder et al., 2017; Forder and Lupyan, 2019; Lupyan et al., 2007; Meteyard et al., 2007; Noorman et al., 2018). Learned verbal labels similarly aid tactile perception (Miller et al., 2018). Furthermore, without any explicit verbal cues, colour contrasts are perceptually facilitated for those contrasts verbally marked but not those unmarked in one's native language (Maier and Abdel Rahman, 2018). These studies imply that words gate what we consciously perceive, suggesting a deeper link between language and consciousness than is typically acknowledged. This is supported by studies making use of various 'trademark' paradigms used in consciousness research (Havlík et al., 2019). In the flash suppression paradigm, a visual image presented to one eye is suppressed by an image 'flashed' to the other. In some studies, participants are asked to indicate whether they see the image (e.g., of a zebra) after no cue or a matching (e.g., 'zebra') or nonmatching word (e.g., 'kangaroo'). Results suggest that words can help otherwise unconscious colour, face, and object visual information emerge into consciousness (Forder et al., 2016; Fugate et al., 2019; Lupyan and Ward, 2013; Ostarek and Huettig, 2017; Pinto et al., 2015). Similarly, using a related binocular rivalry paradigm, emotional words can help otherwise unconscious emotional faces emerge into consciousness (Fugate et al., 2019).
3 https://www.intandem.co.uk/the-quiet-before-the-word/
Inner speech has also been directly linked to consciousness. There are a number of studies demonstrating moderate positive correlations between questionnaires assessing inner speech (e.g., 'If I am not feeling well, I often talk to myself about my state') and scales assessing self-awareness ('I generally pay attention to my inner feelings'; for a review see Morin, 2018). In three experiments, Mikaël Bastian et al. (2017) tested the hypothesis that inner speech facilitates the awareness of mind-wandering, a process that typically goes unnoticed. In the first experiment, participants showed decreased conscious awareness of mind-wandering when inner speech was reduced using articulatory suppression. Additionally, self-caught instances of mind-wandering were reported to be more verbal than probe-caught instances. In a second laboratory experiment, participants showed increased conscious awareness of mind-wandering when inner speech was facilitated. Lastly, a more natural real-world experience sampling experiment showed that the vividness of inner speech but not auditory and visual imagery predicted awareness of mind-wandering.

Neurobiological
The existing behavioural evidence thus suggests that language and 'inner speech should be conceived as an active tool for consciousness' (Bastian et al., 2017). Neurobiological evidence supporting this view is reviewed next, including studies of overt damage (e.g., from lesions; 2.3.2.1), intact or less obviously damaged brains (i.e., epilepsy and depression; 2.3.2.2), recovery from disorders of consciousness (2.3.2.3), and 'non-verbal thought' (2.3.2.4). Across these, converging evidence is sought for the position that the hemisphere and brain regions putatively more important for language functioning are more associated with consciousness, while acknowledging that hemispheric dichotomies and the notion of fixed 'language regions' are overly simplistic (Jung-Beeman, 2005; Skipper, 2015).
Damaged.
Independent of language, studies of cortical damage suggest that consciousness is typically associated with the left hemisphere (Albert et al., 1976; Salazar et al., 1986; Schwartz, 1967; Serafetinides et al., 1965; Snider et al., 2020; though see Cucchiara et al., 2003). The left hemisphere plays a more significant role in at least some aspects of language than the right hemisphere in about 88% of people and this is not tied to handedness (Mazoyer et al., 2014). This suggests that the greater left hemisphere locus of consciousness is indeed tied to language in some manner. This is confirmed by specific studies of aphasia. About 20-25% of patients (and possibly more) with left hemisphere damage and aphasia also have anosognosia, or lack of awareness of their own language disability. This is particularly true of aphasias that affect comprehension more than production, as measured both directly (e.g., with the 'Visual-Analogue Test Assessing Anosognosia for Language Impairment', using minimal verbal report from patients and caregivers' evaluations) and indirectly (e.g., by quantifying patients' self-correction of naming errors) (Arantzeta et al., 2018; Cocchini et al., 2010; Dean et al., 2017; Kertesz and Benson, 1970; Lebrun, 1987; Marshall et al., 1998). Poor self-awareness of naming abilities in aphasia is more associated with inferior frontal gyrus lesions and possibly the entire lateral motor system whereas preserved awareness is associated with superior temporal lesions (van der Stelt et al., 2021). These are key 'language regions' in neurotypical individuals.
There have also been some studies of inner speech in patients with left hemisphere damage and aphasia. These show something like the converse of the above anecdotal accounts. That is, individuals with preserved inner speech, possibly without preserved production, exist in a relatively typical state of higher-order consciousness with awareness of their deficits (Fama et al., 2019a; Fama and Turkeltaub, 2020; Feinberg et al., 1986; Geva et al., 2011; Hayward et al., 2016; Sierpowska et al., 2020). However, patients who cannot reliably report their own inner speech seem to have anosognosia in that they detect and correct their naming errors less frequently than controls (Fama et al., 2019b).
Studies of white matter damage also suggest that consciousness is more associated with the left hemisphere. The corpus callosum exists only in placental mammals and is the largest axonal tract in the human brain, connecting the two hemispheres (Suárez et al., 2014). 'Split brain' patients have had a commissurotomy, callosotomy, or something in between to help with intractable epilepsy, resulting in relative hemispheric isolation (de Haan et al., 2020). Such patients have been shown to fabricate interpretive narratives for unconsciously processed information appearing in their right hemispheres, often based on information consciously available for verbal report in their left hemispheres. For example, a woman presented with the word 'laugh' (or a photo of a naked man) in the left visual field / right hemisphere is not conscious of seeing the word (or picture) but starts laughing. When asked why by the experimenter in front of her, she responds: 'What a way to make a living, doing this kind of testing every day!' (Cooney and Gazzaniga, 2003). In summarising split-brain studies of this form, Gazzaniga et al. conclude:
One theory of consciousness deriving from split brain patients is that a large part of our sense of conscious reality, we believe, comes from the verbal system attributing cause to exhibited behaviour. (Gazzaniga et al., 1977)
Originally, results supporting statements like this were interpreted to suggest that split brain patients simply had a conscious left hemisphere verbal 'interpreter' and an unconscious right hemisphere (Morin and Everett, 1990). This was later depicted less categorically, with the suggestion that commissurotomy unevenly divides consciousness, with the right hemisphere having some level of consciousness but one that is more like primary consciousness (Morin, 2001; though see Pinto et al., 2017).

Intact.
Studies of individuals with epilepsy and depression allow us to test whether language is tied to consciousness in individuals with intact brains or at least less obvious damage. Indeed, the left hemisphere and particularly the left temporal lobe are more associated with the loss of consciousness and underreporting of seizures while preserved speech is more associated with the right hemisphere (Blair, 2012; Detyniecki and Blumenfeld, 2014). Electroconvulsive therapy, used to treat intractable depression, supports these results. Specifically, several studies used it unilaterally as a test for cerebral 'dominance' and reported that left hemisphere shocks result in greater impairments on naming tests and a tendency toward a slower return to consciousness compared to the right hemisphere (Kriss et al., 1978; Pratt et al., 1971).
Related results are found with unilateral intracarotid anaesthetic, a component of the workup for surgery for intractable epilepsy. This produces hemianesthesia during which language functions of the unaffected hemisphere are tested. As with cortical damage and electroconvulsive therapy, patients show a greater lack of awareness (i.e., anosognosia) of their language comprehension compared to production deficits (75% vs 10% of patients) after left hemisphere anaesthetic (Banks et al., 2010). Finally, results are not only associated with simplistic left/right hemisphere dichotomies. Left sided injection of anaesthetic is more likely to result in unconsciousness in individuals whose tested language functioning is purported to rely more on the left hemisphere whereas individuals given the same tests who rely more on the right hemisphere were more likely to lose consciousness after right sided injection. Individuals with more balanced bilateral language functioning by these measures lost consciousness from injection to both hemispheres (Serafetinides et al., 1965).

Recovery.
If language plays a role in generating consciousness, language processing generally and activation of specific brain foci associated with language processing should predict disorders and recovery of consciousness. Indeed, electroencephalography-based speech tracking accurately predicts a diagnosis of minimal consciousness over vegetative states (or unresponsive wakefulness syndrome) and a subsequent return of consciousness and does so increasingly better as one ascends the 'linguistic hierarchy' (from rest to words to sentences) (Gui et al., 2020). Grey and white matter preservation in the left hemisphere in and around 'language regions', activity in 'language regions', and connectivity between these regions distinguish individuals in a vegetative state from a minimally conscious state with and without command following (though otherwise 'devoid of clinical verbal or nonverbal expression') (Bruno et al., 2012; Demertzi et al., 2014; Guldenmund et al., 2016). Using resting-state scans with no language input, the 'auditory network' does the best job in discriminating minimally conscious patients from those in a vegetative state (Demertzi et al., 2015). More specifically, responses in patients' auditory cortices to their own name spoken by a familiar voice predict recovery from a vegetative state to a minimally conscious state or emergence from a minimally conscious state (Wang et al., 2015).

Nonverbal.
Finally, if disorders of language are associated with a diminishment of consciousness, they should also be correlated with 'nonverbal' abilities to the extent these are consciously mediated. Though controversial, such a relationship has been observed in aphasia for cognitive (Fonseca et al., 2019; Gonzalez et al., 2020; Schumacher et al., 2019; Wall et al., 2017; Yao et al., 2020; though see Fedorenko and Varley, 2016; Varley, 2014; Woolgar et al., 2018) and meta-cognitive performance (Baldo et al., 2005; Hermer-Vazquez et al., 1999; Langland-Hassan et al., 2017). For example, people with aphasia are impaired in skills like drawing, block arrangement, and visual pattern completion (Gonzalez et al., 2020). People with aphasia with inner speech deficits, as determined by a silent rhyming task, can categorise as well as controls which visual objects go together, but are unreliable in their metacognitive judgements about their categorisations (Langland-Hassan et al., 2017). These findings are supported by studies of people with non- or minimally verbal autism who are also impaired on nonverbal tasks, which the authors in one review feel 'fundamentally questions the idea of the language-independence of thought in humans' (Hinzen et al., 2020).

Caveats
There are a number of caveats with regard to the available empirical evidence. These include some potential circularity in logic in that the assessment of consciousness might be complicated by language impairments because participants and patients cannot report on their states (e.g., see Majerus et al., 2009; Schnakers et al., 2015). Another possible issue is that damage to specific hemispheres or particular language loci might have been so extensive that some as-of-yet unknown non-linguistic progenitor of higher-order consciousness was damaged, e.g., explaining the effects on nonverbal processes. However, these caveats can likely be ruled out on the basis of parsimony and converging evidence from the totality of studies reviewed that do not all suffer from these issues.
To the extent that this is true, one can ask whether language and inner speech cause consciousness. Specific damage to 'language regions' resulting in the loss of higher-order consciousness might be considered causal in the field of cognitive neuroscience, particularly in the context of converging evidence (Vaidya et al., 2019). This leads to the question of whether language is a necessary and/or sufficient cause. However, simplistic conditions of necessity and sufficiency are inadequate to handle the complexities of consciousness if it is truly multidimensional (as the evidence suggests). Causality in a multidimensional framework is likely closer to a contributory cause, satisfying an 'INUS'⁸ condition where many factors act as partial causes contributing to one effect (Mackie, 2003; Nachev et al., 2018). Language is causal of consciousness from this perspective, even in the face of counter arguments, e.g., that there might be individuals without an 'inner monologue' (because they still have outer language and likely other contributing causes).

Inner speech
Thus, anecdotal, behavioural, and neurobiological evidence all suggest that language and inner speech produce at least higher-order consciousness, with evidence that primary consciousness is also affected. Given its importance, inner speech is now described in more detail, focusing on those aspects important for the model of the neurobiology of language and consciousness to be developed (for more comprehensive reviews, see Alderson-Day and Fernyhough, 2015; Fernyhough, 2016; Langland-Hassan, 2020; Perrone-Bertolotti et al., 2014). After this description, a mechanistic account of how inner speech mediates and supports higher-order consciousness is suggested.

Description
Inner speech can be 'defined as the subjective experience of language in the absence of overt and audible articulation' (Alderson-Day and Fernyhough, 2015). It goes by many names, including covert [speech / talk], [articulatory / auditory verbal / speech / voice] imagery, inner [dialogue / monologue / speech / talk / voice], mental verbalization, self [statements / talk], [silent / subvocal] speech, and verbal thinking. Perhaps this assortment of names reflects the variety of ways people experience inner speech, its frequency, and its usefulness.

Varieties
How do we experience inner speech? While it is often portrayed as being similar to overt speech production, it has a number of unique phenomenological properties (Jones and Fernyhough, 2007a). Based in part on the developmental theory of Lev S Vygotsky (Fernyhough, 2008, 1996), the 'Varieties of Inner Speech Questionnaire' (VISQ) (McCarthy-Jones and Fernyhough, 2011) and its revised version (VISQ-R) (Alderson-Day et al., 2018) assess the frequency of these properties. In the VISQ, they are the qualities of 'dialogicality', 'condensation', 'other people', and 'evaluative/motivational' (with the latter being replaced by 'evaluative/critical' and 'positive/regulatory' factors in the VISQ-R).
To discuss the two factors most relevant here, participants in the VISQ study reported some experience of 'dialogicality' in 77% of their inner speech (McCarthy-Jones and Fernyhough, 2011). This factor reflects the extent to which inner speech is the internalisation of the inherently social process of having dialogues or conversations. Participants in the VISQ also report some experience of 'condensation' in 36% of their inner speech (McCarthy-Jones and Fernyhough, 2011). Condensed inner speech can be thought of as being near the right end of a continuum, with 'expanded' or more sentence-like inner speech on the left, becoming increasingly more fragmentary and abbreviated on the right, perhaps losing many or most of the 'accoutrements of external language' (Fernyhough, 2004; Martínez-Manrique and Vicente, 2010). VISQ-R results were similar, with 71% and 43% of participants reporting having at least sometimes experienced 'dialogicality' and 'condensed' inner speech, respectively (Alderson-Day et al., 2018).
Descriptive experience sampling (DES) demonstrates a broader variety of phenomenological experiences than questionnaire data. This method involves 'beeping' participants at random times, at which point they write down their inner experience with some notes on it, and later discuss these with an investigator in an in-depth interview (Hurlburt and Akhter, 2006). DES results suggest that inner speech is most generally in complete sentences (that are sometimes dialogic). However, it ranges from sentences to a few words (with condensed speech being relatively infrequent) to the infrequently experienced partially worded (inner speech with 'holes') and unworded (though not unsymbolised) inner speech. DES also suggests a distinction between the infrequently experienced 'inner hearing' (hearing voices that are not there) and 'inner speaking', the majority of inner speech (Hurlburt et al., 2013; Hurlburt and Heavey, 2018).
Though inner speech might be most common in sentences, the variety of forms it can take might map onto differences in how we consciously perceive the world. Indeed, dialogic inner speech is more associated with awareness than condensed speech (Verhaeghen and Mirabito, 2021). Similarly, dialogic but not condensed inner speech is correlated with 'absorption', associated with expanded self and self-awareness (Pekala et al., 1985; Rosen et al., 2021, 2017) (for evidence pertaining to other 'modes' of inner speech, see Shahidi et al., 2021).

Frequency
If some or all of these varieties of inner speech are important for producing higher-order consciousness, inner speech might be expected to occur at high frequencies. DES is perhaps the gold standard for answering this question as it reduces problems of introspection and retrospective reporting (Hurlburt and Heavey, 2001; Nisbett and Wilson, 1977; Tourangeau, 2000). Research using DES suggests that inner speech occurs from 0% to 75% of the time, with an average frequency across participants of 26% of all thoughts (Heavey and Hurlburt, 2008; Mihelic, 2010). There is a large discrepancy between this value and self-reported inner speech frequencies as measured by the Nevada Inner Experience Questionnaire (NIEQ). The NIEQ frequencies range from 38% to 74% with an average of 71% (Heavey et al., 2018). The Self-Talk Scale results in similar percentages, with average self-reported frequencies ranging from 54% to 67% in various studies (Brinthaupt et al., 2015, 2009; Brinthaupt and Kang, 2014; Heavey et al., 2018).
Why do people believe they talk to themselves about three times more than they actually do? There are a number of reasons to think that 26% might be an underestimation. Participants might be reporting on a single dominant form of thought to the exclusion of other co-occurring forms (Martínez-Manrique and Vicente, 2010) or they might be experiencing inner speech but not know they are because they lack the appropriate label, which would likely be the case with 'condensation' (Alderson-Day and Fernyhough, 2014). However, there is another answer to this question that is pertinent to consciousness. As reviewed, inner speech predicts conscious experience (2.3.1). Conscious experience is by definition more salient than unconscious information and salient information is more likely to be remembered (e.g., Taylor and Fiske, 1978). Similarly, 72% of VISQ-R respondents reported that inner speech is 'positive' sometimes or more (Alderson-Day et al., 2018) and emotional information is also more likely to be remembered (Kensinger, 2009). Thus, inner speech might be reported to be more frequent than it actually is for the very reason it is under discussion here, i.e., because it generates higher-order consciousness.
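The 'about three times' figure can be checked with simple arithmetic on the averages reported above. The following is only a sketch: it assumes the DES and questionnaire percentages can be compared on a common scale, which the measures themselves do not guarantee, and the variable names are illustrative.

```python
# Averages quoted in the text (percent).
des_mean = 26.0                # sampled frequency (DES)
nieq_mean = 71.0               # self-reported frequency (NIEQ)
self_talk_range = (54.0, 67.0) # range of Self-Talk Scale study averages


def overestimate(self_reported, sampled):
    """Ratio of self-reported to sampled inner speech frequency."""
    return self_reported / sampled


ratio_nieq = overestimate(nieq_mean, des_mean)
ratio_self_talk = tuple(overestimate(v, des_mean) for v in self_talk_range)

print(round(ratio_nieq, 2))                      # 2.73
print([round(r, 2) for r in ratio_self_talk])    # [2.08, 2.58]
```

On these numbers, self-reports exceed the sampled average by a factor of roughly 2.7 for the NIEQ and roughly 2.1 to 2.6 for the Self-Talk Scale.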

Uses
Spending a quarter, and maybe more, of our lives using inner speech is still a lot. What does all this internal language do for us? Studies suggest that inner speech serves a diverse set of facilitatory functions in human behaviour. It begins life as 'private speech', emerging around 2-3 years of age (about the same time children start representing others' beliefs) and becomes increasingly more internalised in middle childhood. Private speech is involved in self-regulation of emotions, behaviour, cognition, planning, creativity, and theory of mind (perhaps in part because it maintains some of its 'dialogic' nature) (Alderson-Day and Fernyhough, 2015; Winsler, 2009). In adulthood, inner speech has variously been demonstrated to be a rehearsal tool for working memory and involved in cognitive flexibility and planning, 'reasoning about others, spatial orientation, categorization, cognitive control, and reading' and, more generally, in 'verbal self-guidance' through motivation and control in performance-related domains (Alderson-Day and Fernyhough, 2015).
Overall, these descriptions support the idea that inner speech has a number of functions beyond keeping ourselves company. This is consistent with the more general idea that language is not simply a medium of communication but, rather, serves a large number of cognitive functions (Carruthers, 2002; Clark, 1998; Jackendoff, 1996). By Ray Jackendoff's account, language helps us 'think' because it allows thoughts to be communicated, it makes thinking available, and gives percepts' affective quality a form that can be manipulated (Jackendoff, 1996). Building on a long tradition started by Vygotsky in the 1930s, Andy Clark suggests that language gives us the power to perform novel computations, i.e., 'memory augmentation', 'environmental simplification', 'coordination and the reduction of on-line deliberation', 'taming path-dependent learning', 'attention and resource allocation', and 'data manipulation and representation' (Clark, 1998). The 'moral' is:
The role of public language and text in human cognition is not limited to the preservation and communication of ideas. Instead, these external resources make available concepts, strategies and learning trajectories which are simply not available to individual, un-augmented brains. Much of the true power of language lies in its underappreciated capacity to re-shape the computational spaces which confront intelligent agents. (pg. 10) (Clark, 1998)

Mechanisms
Though the evidence reviewed suggests that the neurobiology of language and inner speech generates higher-order consciousness (2.2 and 2.3), nothing yet reviewed comprehensively explains how (Churchland, 1983). Synthesising the views discussed in the comparative overview (2.1) and prior section (3.1), a tentative taxonomy of mechanisms is suggested in this section and then fleshed out in terms of the neurobiology in the next (4). The proposed mechanisms are grouped into 'mechanical', 'functional', and 'linguistic' types, each composed of several subtypes that explain the emergence of higher-order consciousness. This taxonomy is not intended to be categorical as the types and subtypes are not easily separable or independent. And though they are presented roughly in order of descending significance, one or more of these mechanisms might be operational and more important at any specific moment (see 2.3.3 for a causal framework for these mechanisms).

Mechanical
These are the primary processes that generate higher-order consciousness. The self-awareness that originates from their operation is not as dependent on the specific words composing inner speech or their meanings but, rather, is a direct consequence of the inner production of those words.

Agency and ownership.
Self-awareness comes from a sense of agency and ownership of inner speech: loquor, ergo sum. That is, inner speech gives the phenomenological experience of producing, feeling, and listening to speech as with an interlocutor known well to the producer/perceiver. 'I' am aware that 'I' produced this because 'I' generated it and/or 'I' recognize 'my' voice and 'I' recognize 'my' own experiences and beliefs in the content of the speech. 'I' am also aware that 'I' produced this because these qualities could not have come from another interlocutor or an alien broadcast.

3.2.1.2. Broadcast distance.
Self-awareness comes from the sense of distance created by inner speech seemingly being broadcast in the head (thus, it is 'overheard'), drawing awareness to unconscious self-relevant perceptual, interoceptive, somatosensory, and cognitive information. That is, inner speech is somewhat like the announcer at the end of a moving walkway, drawing attention to what is right in front of you that you might otherwise not be aware of. 'I' just heard 'should I really be buying this?' coming from 'my' head area in 'my' voice and nobody else was speaking and nobody answered so 'I' must be alerting 'myself' to the possibility that this bottle of Margaux is too expensive or that there might be better alternatives.

Functional
These describe the process of turning words into more words (e.g., 'internal dialogues'), actions, and emotions that serve to guide adaptive behaviours. As such, they function to expand or change self-awareness through extension in time and learning.

Internal dialogue.
Self-awareness comes from inner speech being the internalisation of what was originally an externalised social dialogue in childhood (Fernyhough, 2016, 2009), increasing self-attention through extension in time as the 'conversation' continues. Additionally, self-awareness is a product of the fact that internal dialogues are typically used for functions like self-criticism, self-enhancement, self-guidance, self-improvement, self-management, and self-reinforcement (among other things) (Oleś et al., 2020; Puchalska-Wasyl et al., 2008). To give another metaphor, inner speech is somewhat like having your attention drawn to your driving when a deer crosses the road in the distance. What was mostly automatic driving behaviour just moments before becomes a set of plans to slow down, turn on your high beams, and be more attentive. 'I' hear myself say 'that was close', and respond, 'yeah, I should get a coffee at the next filling station'. Yet, 'I' respond 'but I won't be able to sleep'.

Affect and motivation.
Self-awareness comes from inner speech having an intrinsic emotional or affective tone that is self-relevant and has motivational value, leading the self into action and awareness of those actions. That is, the words comprising inner speech are experienced as having an affective quality from their past associations and these motivate one to contextually appropriate responses and actions. That made 'me' feel bad so 'I' think 'I'll' stop saying it to 'myself' or 'I' have felt this before and when 'I' do, 'I' have responded in this way and so 'I' shall again.

Linguistic
Though not the primary progenitors or expanders, this set of processes further amplifies and extends self-awareness. They are more reliant on the words used and their underlying meanings than are the mechanical or functional mechanisms.

Narrative self.
Self-awareness comes from inner speech forming bits of text that are the direct result of or can be easily fit into a remembered narrative that extends that self into the past and future. That is, inner speech can be related to a remembered self at any time, given the self-relevance of that information. It is also creative, generating new ideas about the self through those past associations, leading to new conceptual blends (Fauconnier and Turner, 2008; Thagard and Stewart, 2011). 'I' find myself saying 'boy am I being naughty' while trying a psychedelic drug but then 'well I was always a mischievous child' or 'this exam is hard' but keep calm as 'well, I do want to become a neuroscientist someday'. Perhaps 'I' will become a psychedelic neuroscientist.

3.2.3.2. Pronominal self.
Self-awareness comes from the pronouns used or implied during inner speech. That is, though obvious, it cannot be overstated how much experience we have using first- ('I', 'me', 'we', and 'us') or second-person ('you') pronouns to refer to self-relevant information (and, counterfactually, not some 'he', 'she', 'they', or 'it'). Indeed, 'I' and 'you' are the second and third most common spoken words in the British National Corpus. Even when these pronouns are not used, they are implied. 'I' hear myself say 'ouch, that hurt' and know it refers to 'me' as it really means 'that hurt me' and references the pain in my body that 'I' am experiencing, not my partner who looks relatively unperturbed.

3.2.3.3. Categorical self.
Self-awareness comes from the inherent property of words to group features of experience into categories that are abstractions. Though this is associated with the pronominal self, it also includes aspects of the self that are arguably minimally or not able to be represented without the words and narratives comprising inner speech. That is, some of our words represent categorisations that do not exist as natural kinds in nature and include most of those words we use to describe self-relevant features 'like beliefs, attitudes, personality traits, or personal virtues' (Morin and Everett, 1990). 'I think I love him' collects those features that I categorise as love from 'my' western perspective and experience in the world as a man attracted to other men.

HOLISTIC model
These mechanisms are situated in a neurobiological model called the 'higher order language and inner speech to "I" consciousness' or HOLISTIC model. In addition to forming a decent acronym, HOLISTIC reflects that the model is conceptually related to 'higher-order' theories of consciousness (discussed in 5.1), is intended to explain higher-order, i.e., 'I' or self-awareness, and that it involves the whole brain. First, a descriptive overview of the neurobiological implementation of the mechanical, functional, and linguistic mechanisms in the HOLISTIC model is presented (4.1) before discussing specific neurobiological features and evidence in greater detail (4.2).

Mechanisms
A caricature of the model is provided in Fig. 1 to help illustrate this description and the underlying brain anatomy. The sequence of activity patterns represented by '1.' through '6.' are derived from six large-scale neuroimaging meta-analyses (https://neuroquery.org/) (Dockès et al., 2020). These are 'speech production' ('1.'), 'language comprehension' ('2.'), 'default mode' ('3.'), 'semantic knowledge' ('4.'), 'cognitive control' ('5.'), and 'corticothalamic and thalamocortical' ('6.'). They are intended to illustrate the possible underlying anatomy and activity patterns of the HOLISTIC model and should not be taken too literally. For example, the 'semantic knowledge' meta-analysis is only a stand-in for the distributed pattern of activation associated with 'situation models' that would involve a dynamic pattern. Nor should the model be interpreted to have only six processing steps that are stage-like. Rather, there are many as-of-yet unidentified steps that are likely largely overlapping, dynamic, and not involving fixed regions.
To illustrate the model, imagine you are in a village in Cumbria, England. You have gone out to buy some Wellingtons and you walk past a tearoom. Such multisensory scenes continuously and unconsciously activate contextually associated words and word sequences (Carr et al., 1982; Chabal and Marian, 2015; Dell'Acqua and Grainger, 1999; Sperber et al., 1979; Zwitserlood et al., 2018), deriving from both the external context having culturally agreed upon labels (e.g., 'shop', 'tea', 'scone', 'eat some cake', etc.) and internal context associated with the memory of more idiosyncratic individual experiences (e.g., the memory of lines from your favourite movie involving a tearoom in an English village as in the movie 'Withnail and I') (Brandimonte et al., 1992; Xie et al., 2021).
The neurobiological representations of these words are implemented in distributed cortical/subcortical networks involving the whole brain. These include a network of core 'motor regions' to (inwardly) produce those word sequences, somatosensory and interoceptive regions representing the felt sensory properties, and auditory cortices representing the heard acoustic/phonological properties, of producing those words. These regions are connected to a widely distributed set of more dynamic peripheral networks including sensory, motor, and emotional regions encoding associated semantic and conceptual referents of words (e.g., motor regions involved in eating cake) (Pulvermüller, 2013; Pulvermüller and Fadiga, 2010). Activation of one or more of these regions might reinstate entire core-periphery networks and these can 'cooperate' and 'compete'. One network might activate an overlapping region of another network, increasing the activity of that network (cooperation). The activity of one network might increase with converging contextual information while others lose activity (competition; see Skipper, 2015).
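The notions of 'cooperation' and 'competition' can be illustrated with a deliberately simple toy sketch. Everything in it is an assumption made for illustration: the region labels are hypothetical placeholders, a network's activity is taken to be the fraction of its regions that are active, and competition is modelled as a normalised share of total activity. It is not a description of how the model is actually implemented in the brain.

```python
# Toy sketch of 'cooperation' and 'competition' between word networks.
# All network and region labels below are hypothetical placeholders.
NETWORKS = {
    "eat some cake": {"speech_core", "taste", "hand_to_mouth"},
    "tea": {"speech_core", "taste", "warmth"},
    "shop": {"speech_core", "walking"},
}


def network_activity(active_regions):
    """Activity of each network: fraction of its regions currently active."""
    return {
        name: len(regions & active_regions) / len(regions)
        for name, regions in NETWORKS.items()
    }


def relative_share(activity):
    """Normalise activities so a gain for one network is a loss for others."""
    total = sum(activity.values())
    if total == 0:
        return dict(activity)
    return {name: value / total for name, value in activity.items()}


# Cooperation: one shared region ('taste') raises two overlapping networks.
before = relative_share(network_activity({"taste"}))

# Competition: converging context ('hand_to_mouth') favours one network,
# so the relative share of the other falls.
after = relative_share(network_activity({"taste", "hand_to_mouth"}))

for name in NETWORKS:
    print(f"{name}: {before[name]:.2f} -> {after[name]:.2f}")
```

With these placeholder values, the shared 'taste' region initially supports 'eat some cake' and 'tea' equally (0.50 each); adding a second converging cue raises the share of 'eat some cake' to 0.67 while 'tea' drops to 0.33.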

Mechanical
4.1.1.1. Agency and ownership.
Outside the tearoom, you 'overhear' yourself inwardly saying 'We want to get in there don't we. Eat some cake.' Thus, external and internal context were strong enough to have increased the activation for this sequence of words to a high enough threshold that cortical/subcortical regions in the speech production system are engaged (Fig. 1, '1.'). This system shares with inner speech a mechanism for predicting the expected sensory consequences of producing speech through feedback to somatosensory, interoceptive, and auditory cortices (Fig. 1, '1.', black and white asterisks, respectively). These 'efference copies' can be subtracted from or compared with incoming sensory information, generating an 'error signal'. In contrast to overt production, there is no 'cancellation' of inner speech from reafferent sensory information. Thus, activation of speech production regions results in the sensation of willing speech and feedback activation to sensory cortices in the feeling/hearing of your own voice.

4.1.1.2. Broadcast distance.
These 'overheard' words appear in the head seemingly unbidden as we do not have direct conscious access to the processes occurring in the brain that led to the generation of inner speech. Thus, these are next comprehended and this process yields informative clues as to what your brain is doing (for supporting evidence, see Aucouturier et al., 2016; Lind et al., 2014) (though see Lind et al., 2015; Meekings et al., 2015) (Fig. 1, '2.'). In the example, it has interpreted the multisensory scene that is unfolding in front of you as containing a teashop with acceptable looking cakes (among other things). It seems to be suggesting some possible actions you might want to perform with this information (like going in and eating). It also might be suggesting some links to more internal processes like a positive affective evaluation of the teashop and that you are peckish. Though these may well be confabulations, they nonetheless more generally correspond to a 'higher-order' mechanism that gives you some conscious access to your own underlying brain processes.

Fig. 1. Caricature of the HOLISTIC model. Labelled sequence '1.' through '6.' and arrows roughly represent the progression of activity patterns associated with the model. In reality the model is not as stage-like, there are not just six steps, and the activity patterns are not as fixed. Activity patterns derive from six neuroimaging meta-analyses (see 4.2.6 for details). Each of these was thresholded at Z = 1.96 with a cluster size of 100 voxels for illustrative purposes, where red corresponds to positive and blue negative (decreasing or inhibited) activity. The black asterisk approximately indicates the location of primary motor and somatosensory cortex and the white asterisks primary auditory cortices.

Internal dialogue.
In response to 'eat some cake', you hear yourself saying: 'Cake and fine wine. We want the finest wines available to humanity, we want them here and we want them now.' The production of this 'dialogue' creates a continuous reverberation among motor, somatosensory, interoceptive, and auditory cortices over time, resulting in extended self-awareness of your own voice broadcasting information. Contrast this with the momentary and fleeting awareness of primary consciousness, with dialogue resulting in 'mental presence' lasting seconds or longer (Dorato and Wittmann, 2020; Wittmann, 2011).
The extension of inner speech to a 'conversation' allows you to, e.g., 'discuss' plans that might be suggested by your speech to take action, make decisions, and so on. This ability occurs in part because the memories associated with your inner speech are associated with learned and contextually appropriate or at least established responses to those memories. In particular, inner speech is connected to any number of competing unconscious 'situation models' that are activated by the 'default mode network', a distributed set of regions involved in the unconscious processing of autobiographical memory/self-knowledge (Fig. 1, '3.'). Situation models are mental representations of the actions, characters, events, objects, and places that are associated with speech but not necessarily contained within it (Zwaan and Radvansky, 1998). Thus, like words and word sequences, situation models unconsciously reactivate a distributed set of peripheral cortical/subcortical networks associated with the sensory-motor and affective properties of those models (indeed, the former are likely derived from the latter; Fig. 1, '4.').

Affect and motivation.
How are the most relevant situation models (activated by 'default mode' regions, Fig. 1, '3.' and '4.') selected? The connectivity of words and situation models to associated emotional properties and corresponding brain regions gives inner speech a felt affective flavour. This might motivate internal honing or selection of situation models or aspects of those situation models by prefrontal cortex regions (Fig. 1, '5.'), perhaps through overall levels of activation or inhibition (Radvansky, 1999; Radvansky et al., 2005b). Finally, honing might lead to the selection of a subsequent 'reply' or inner speech response or an action through corticothalamic/thalamocortical interactions (Fig. 1, '6.'). For example, after 'eat some cake', one activated situation model involves you entering the tearoom, sitting down, receiving a menu, and ordering cake. Aspects of that model might be selected because of the positive affective qualities attached to your memory of a prior teashop sharing similar characteristics.

Linguistic
However, the stronger situation model in this case involves you watching your favourite movie because your 'eat some cake' and 'finest wines' inner speech derive from that movie. Thus, the positive flavour of this model motivates you to locate a '53 Margaux' and return home to watch 'Withnail and I', inwardly proclaiming: 'I love this. It's been too long'. These fragments reflect a personal narrative in which your identity coincides with the time and morals, sense of humour, and literary pretensions on display in said movie. This is compounded by the many prior occasions you watched the film with a beloved friend. Thus, merely quoting the earlier lines imbues a sense of self-awareness extended across space and time (Narrative Self). Simply using the word 'I' increases self-awareness because it is habitually used to refer to self-relevant information and, supporting this, there are no other mouths moving or even people nearby (Pronominal Self). Furthermore, a word like 'love' is a relatively abstract category that you would only be aware of if you were using language encoded through your own social and emotional experiences (Categorical Self). All these processes involve self-relevant inner speech that has an effect on self-awareness by activating autobiographical memory via the 'default mode network' and increasing the probability of self-relevant situation models being selected, resulting in more self-relevant inner speech and so on (Fig. 1, '1.'-'6.', arrows).

Details
To summarise, inner speech occurs in a set of core regions overlapping with speech production, implementing predictive efference copy mechanisms that generate a felt and heard phenomenology. The 'predicted' words are connected to a whole-brain distribution of peripheral regions encoding their meaning, associated memories, and situation models. These peripheral networks form the basis for contextually appropriate responses in the form of inner dialogue or actions that are activated by the 'default mode network' and honed and selected by prefrontal regions. The felt/heard nature of inner speech, its recurrent activation, and the affective tone generated from memories and thalamocortical/corticothalamic interactions collectively constitute higher-order consciousness.
In what follows, further discussion and empirical evidence are reviewed for important components of the HOLISTIC model. These include sections on the role of prediction in inner speech (4.2.1), the felt/heard phenomenology it produces through interaction with the body (4.2.2), the whole-brain distribution of language that it activates, particularly in the form of situation models (4.2.3), the role of the 'default mode network' in activating (4.2.4) and prefrontal cortex in selecting (4.2.5) aspects of these situation models, and the neurobiological 'loci' of various forms of consciousness that emerge (4.2.6). Finally, aspects of the HOLISTIC model discussed in this section are further illustrated with supporting neuroimaging meta-analyses (4.2.7).

Prediction
The neurobiology of speech production involves a large, distributed set of regions that overlaps with those involved in speech perception (for a review, see Skipper et al., 2017). Core regions include those more involved in (1) movement through more direct innervation of musculature (the supplementary motor cortex or SMA in the medial superior frontal gyrus and primary motor cortex in the anterior central sulcus); (2) interoception (the anterior insula, e.g., for the voluntary control of breathing); (3) somatosensation (primary somatosensory cortex in the posterior central sulcus and postcentral gyrus); and (4) 'secondary' or 'higher-level' movement and somatosensation (e.g., the pars opercularis of the inferior frontal gyrus, a bit of 'Broca's area', premotor cortex in the precentral gyrus, pre-SMA, and other parietal cortices). Subcortical structures like the thalamus, basal ganglia, and cerebellum also play important roles in speech production.
Why does speech production involve intero- and exteroceptive somatosensory and auditory regions that also participate in speech perception? Speech production is a sensory-motor process that needs feedback control (Guenther et al., 2006; Hickok, 2012; Houde and Chang, 2015). Learning to speak requires a mechanism to monitor what we produce in order to compare to a model and make adjustments. Once learned, there needs to be a method to monitor what we say so that we can adjust our vocalisations to correct for speaking volume, rate, and errors. Vocal learning and online adjustment to articulatory perturbations are neatly explained by predictive models. In such models, 'motor' regions send 'efference copies' of the expected sensory consequences of a motor programme to interoceptive, somatosensory, and auditory cortices, which can be compared to incoming sensory information. If the input is not as expected, 'error signals' are generated that allow movements to be adjusted.
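The predictive feedback loop just described can be sketched as a minimal control loop. This is an illustration under strong simplifying assumptions, not an implementation of any of the cited models: speech is reduced to a single number (e.g., a pitch-like value in arbitrary units), the perturbation is a constant offset, and the function names, gain, and step count are all hypothetical.

```python
def produce(command, perturbation):
    """Toy vocal apparatus: the sensory outcome is the command plus an offset."""
    return command + perturbation


def adjust_to_perturbation(target, perturbation, gain=0.5, steps=30):
    """Adjust a motor command until sensory feedback matches the prediction.

    `expected` plays the role of the efference copy: the sensory consequence
    the motor programme is predicted to have. The error signal is the
    difference between incoming feedback and that prediction.
    """
    command = target
    expected = target
    errors = []
    for _ in range(steps):
        feedback = produce(command, perturbation)  # incoming sensory information
        error = feedback - expected                # 'error signal'
        errors.append(error)
        command -= gain * error                    # adjust the movement
    return command, errors


final_command, errors = adjust_to_perturbation(target=200.0, perturbation=20.0)
print(errors[0], abs(errors[-1]) < 1e-6)
```

In this sketch the first error equals the full perturbation and then shrinks on each step as the command compensates, which is the qualitative pattern that perturbation-adaptation accounts of speech rely on.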
Inner speech seems to mostly rely on the same brain regions as overt speech (for reviews, see Alderson-Day and Fernyhough, 2015; Grandchamp et al., 2019; Hubbard, 2010). Differences involve the relative engagement of 'lower-level' primary motor, somatosensory, auditory, and nearby cortices, with these typically being less engaged during covert speech (though this does not mean they are inactive) (Christoffels et al., 2007; Hurlburt et al., 2016; Martin et al., 2014; Pei et al., 2011; Shuster and Lemieux, 2005). Relative engagement is supported by behavioural studies of inner speech and inner speech errors that collectively suggest that 'lower-level' phonological processes, which would putatively engage these 'lower-level' regions, are variously engaged in different contexts (Corley et al., 2011; Filik and Barber, 2011; Oppenheim and Dell, 2010, 2008).
As in production, there is evidence that predictive models and efference copy operate between motor, somatosensory, and auditory regions during inner speech, which would provide the sensory experience of one's own voice (Aziz-Zadeh et al., 2005; Ford and Mathalon, 2004; Jack et al., 2019; Price et al., 2011; Scott, 2013; Scott et al., 2013; Shergill et al., 2002; Tian, 2010; Tian and Poeppel, 2013; Whitford et al., 2017) (for critiques and discussions, see Gregory, 2022; Jones and Fernyhough, 2007b). This is supported by evidence that inner speech is typically experienced in what is recognisably one's own voice and accent (Alderson-Day and Fernyhough, 2015; Filik and Barber, 2011). Feeling and hearing one's own voice might occur because the feedback signals are not cancelled in inner speech due to the lack of any actual speech being produced (Swiney and Sousa, 2014; Tian and Poeppel, 2012).
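The proposed lack of cancellation can be expressed as a one-line subtraction. The sketch below is purely illustrative: it assumes predicted and reafferent signals can be represented as lists of numbers on a common scale, and the feature values and function name are hypothetical.

```python
def uncancelled_prediction(predicted, reafferent):
    """Efference copy left over after subtracting actual sensory feedback."""
    return [p - r for p, r in zip(predicted, reafferent)]


# Hypothetical predicted sensory consequences of a word (arbitrary units).
predicted = [0.8, 0.3, 0.5]

# Overt speech: actual feedback arrives and matches the prediction.
overt = uncancelled_prediction(predicted, [0.8, 0.3, 0.5])

# Inner speech: nothing is articulated, so no reafferent feedback arrives.
inner = uncancelled_prediction(predicted, [0.0, 0.0, 0.0])

print(overt)  # [0.0, 0.0, 0.0]
print(inner)  # [0.8, 0.3, 0.5]
```

On this simplified reading, the prediction is cancelled in overt speech but survives intact in inner speech, which is one way to picture why one's own voice might be felt and heard in the absence of sound.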
That we in some sense feel and hear inner speech through efference copy-like mechanisms is further supported by research on auditory verbal hallucinations. These are typically experienced as heard and are often associated with changes in motor, somatosensory, and auditory regions (Allen et al., 2008; Diederen et al., 2011; Linden et al., 2010; Rapin et al., 2013; Jardri et al., 2011; Shergill et al., 2003). Aberrations of predictive models and efference copy between these regions have long been associated with auditory hallucinations as experienced in psychosis (Allen et al., 2007; Blakemore et al., 2000; Ford and Mathalon, 2005; Frith, 1992).

Body
Do we really feel and hear inner speech? During speech production, a host of internal body parts are actually moved. These include hair cells in the cochlea, various muscles, ligaments, and bones, vibrating vocal cords and nasal cavity, and the lungs. These cause physical changes in sensory receptors, activating interoceptive, proprioceptive, somatosensory, and auditory cortices. The suggestion here is that these same brain-body loops are engaged to some degree in inner speech. Thus, it literally feels like something to use both outer and inner speech because words are integrated with predicted and monitored information from the body, which forms a self reference frame. Because words can be processed unconsciously, this leads to the testable hypothesis that consciousness of inner speech will depend on the extent that the body is felt, a position adapted from research in the perceptual domain, for which there is decent evidence (Azzalini et al., 2019; Seth and Tsakiris, 2018; Tallon-Baudry et al., 2018).
Supporting evidence that inner speech engages the body and does so variably comes from electromyography and breathing pattern studies. Covertly reciting speech produces more activity in facial muscles than rest or visualising. Furthermore, the activity is muscle specific, with bilabials (e.g., the 'p' in 'lip') producing more lip activity than words involving the tongue (e.g., the 't' in 'tongue'), which produce more tongue activity (Chapell, 1994; Livesay et al., 1996; McGuigan and Dollins, 1989; McGuigan and Winstead, 1974). More naturalistic inner speech (e.g., ruminating for a few minutes) also shows increases in electromyography over rest but only sometimes compared to other tasks (Moffatt et al., 2020; Nalborczyk, 2022; Nalborczyk et al., 2021, 2017). Similarly, inner speech shows changes in the physical patterns of respiration that look more speech-like than rest, with variations depending on the form of the inner speech (Chapell, 1994; Conrad and Schönle, 1979).
Variability in the patterns of electromyography and breathing with the form of inner speech supports the hypothesis that consciousness will depend on the extent that the body is engaged. This follows from the data reviewed earlier that inner speech is more associated with self-awareness generally (2.3.1) and that dialogic inner speech is more associated with self-awareness than condensed inner speech (3.1.1). This is further supported by neuroimaging data showing that when people inwardly generate speech they produce more motor, interoceptive, and somatosensory cortex activity than during verbal mind wandering, which contains more condensed speech (Grandchamp et al., 2019) and is not typically associated with awareness (Smallwood and Schooler, 2015). On the other end of the spectrum, 'mind blankness', characterised by a lack of awareness (and inner speech), results in deactivation of, and therefore inactive or inhibited, speech production regions (Kawagoe et al., 2019).

Whole-brain
The HOLISTIC model maintains that language deeply penetrates every part of the human brain. Traditional models posit 'language regions' centred on 'Broca's' and 'Wernicke's' areas and, more recently, a small number of other regions (Geschwind, 1970; Hickok and Poeppel, 2007; Rauschecker and Scott, 2009). Here, those are considered only 'cores', i.e., regions of dense connectivity coordinating more loosely connected regions outside of those cores that form a more dynamic and reconfigurable periphery (Skipper, 2015). That is, language has a core-periphery network architecture in the brain (Bassett et al., 2013; Betzel et al., 2019; Borgatti and Everett, 2000; Chai et al., 2016; Csermely et al., 2013; Fedorenko and Thompson-Schill, 2014; Li et al., 2020; Shen et al., 2022).
Supporting this view, the available data suggest that, when words are read or heard, the sensory-motor and emotional associations of those words concomitantly activate 'modality-specific' brain regions associated with processing those features. Thus, an action word like 'walk' activates dorsal motor regions of the brain whose axons innervate the musculature of the legs (Hauk et al., 2004). Similarly, written words like 'doorbell' activate auditory cortex (Kiefer et al., 2008), 'blue' visual colour regions (Martin et al., 1995), 'limburger' olfactory cortex (González et al., 2006), and 'sad' the amygdala (Citron, 2012). Though these are the results of laboratory paradigms, e.g., involving single word reading, words engage the entire brain in a similar manner during natural language comprehension (de Heer et al., 2017; Huth et al., 2016). Furthermore, these sensory-motor and emotional activation patterns are not simply a post-perceptual process of imagery, following 'true' language processing. Rather, these effects occur as early as 50-150 ms after word onset, while the words are still being processed and before imagery is theoretically possible (Citron, 2012; García et al., 2019; Kiefer et al., 2008; MacGregor et al., 2012; Shtyrov et al., 2014).
The HOLISTIC model suggests that the sensory, motor, and emotional activity patterns elicited by words in these studies also occur during inner speech, i.e., the core inner speech areas are connected to the same dynamic peripheries. These patterns form the foundation for the mental simulation of the events and situations implicitly associated with sentences and larger narratives. Such 'situation models' are arguably required for making inferences that lead to successful language use (Zwaan and Radvansky, 1998). For example, if you were asked to 'place a pencil in the beaker' (compared to on the table), the fact that you are likely to pick a nearby pencil up by one end cannot be explained by the words alone. The situation model that led you to grasp the pencil in this manner was needed for you to act appropriately (Stanfield and Zwaan, 2001). Like individual words, sentences and narratives activate a whole-brain distribution of multifunctional regions (Chow et al., 2014; Desai et al., 2009; Ferstl et al., 2005; Ferstl and von Cramon, 2007; Speer et al., 2009). However, as exemplified by the pencil example, those responses are at least in part the result of the situation model and not the words themselves. For example, indirect requests like 'it is hot here' in a room (as opposed to a sandy desert) activate the motor system despite there being no reference to action (van Ackeren et al., 2012). Similarly, implied emotion in sentences activates regions important for emotional processing (Lai et al., 2015).

Activation
The HOLISTIC model proposes that inner speech activates a set of cores connected to whole-brain distributions of sensory-motor and emotional regions that form the basis for or are themselves situation models. Situation models are considered to incorporate autobiographical memory (Magliano et al., 2007; Radvansky et al., 2005a). Why are we not conscious of these memories or models? To explain, inspiration is taken from Wilder Penfield's (1958) model of consciousness, based on electrical brain stimulation of the temporal lobe in awake patients undergoing surgery for epilepsy (Penfield, 1958). He found that patients often had surprisingly detailed sensory, motor, and emotional experiences that they did not normally have access to but that could be reliably reactivated. For example, a patient is described who heard voices and saw circus wagons used to haul animals when stimulated and restimulated in a specific site. In Penfield's own words:
One must conclude that there is, hidden away in the brain, a record of the stream of consciousness. It seems to hold the detail of that stream as laid down during each [person]'s waking conscious hours. Contained in this record are all those things of which the individual was once aware… This is not a memory, as we usually use the word, although it may have some relation to it. No [person] can recall by voluntary effort such a wealth of detail. (Penfield, 1958)
Penfield suggests that detailed memories are stored to guide our behaviour through comparison. If we were always conscious of these, our senses would be overwhelmed and we would not be able to act in the world. Similarly, in the HOLISTIC model, situation models need to be activated and one or some aspect of that model selected to serve as a guide during inner speech, but we cannot be aware of these models or we would be paralysed.
This description corresponds well to current thinking about the role of the 'default mode network' in both internally and externally driven behaviour. This network was initially described as being a single 'task-negative' network supporting (Andrews-Hanna, 2012; Buckner et al., 2008):
internal mentation that is largely detached from the external world. Within this possibility, the default network plays a role in constructing dynamic mental simulations based on personal past experiences such as used during remembering, thinking about the future, and generally when imagining alternative perspectives and scenarios to the present. (Buckner et al., 2008)
The 'default mode network' is now believed to consist of multiple networks with specialised regions that 'echo' multiple cognitive states (Braga et al., 2013; Braga and Buckner, 2017; Buckner and DiNicola, 2019). Even as activity decreases in the 'default mode network' during demanding tasks, functional connectivity often increases with task associated networks (Krieger-Redwood et al., 2016; Palhano-Fontes et al., 2015). Results tend to support a proposal by which this involvement in tasks largely occurs when they benefit from memory guidance (Crittenden et al., 2015; Konishi et al., 2015; Murphy et al., 2019, 2018; Sormaz et al., 2018; Spreng et al., 2014; Vatansever et al., 2017). Some have even suggested that those memories take the form of contextually relevant situation models (Chen et al., 2017; Keidel et al., 2018; Ranganath and Ritchey, 2012; Smith et al., 2021, 2018).

Selection
How are components of situation models or competing situation models that are activated by the 'default mode network' selected so that they can guide further inner speech or actions? The 'creative cognition' literature addresses this question (Beaty et al., 2019; Jung et al., 2013; Kenett et al., 2018; Kleinmintz et al., 2019). This work collectively suggests that the 'default mode' and 'executive/cognitive control' networks collaborate, with the former generating ideas in the form of memories and the latter selecting them. Key regions in the 'default mode' include the angular gyrus, medial prefrontal cortex, and posterior cingulate cortex (see Fig. 1, '3.'). The key regions in the 'executive/cognitive control network' include anterior parietal and ventro- and dorsolateral prefrontal cortices (see Fig. 1, '5.'). Among other support for this proposal, a temporal connectivity analysis for novel metaphor production (a language task) shows that the 'default' and 'executive/cognitive control' networks are initially decoupled and then become coupled at later stages in processing (Beaty et al., 2017). Of the ventrolateral and dorsolateral prefrontal cortices, the former includes the inferior frontal gyrus and shows stronger functional connectivity with the default network during creativity tasks involving verbal responses (Beaty et al., 2021). The inferior frontal gyrus has long been associated with verbal selection among competing alternatives (Hagoort, 2013; Lau et al., 2008; Moss et al., 2005; Skipper et al., 2007a; Thompson-Schill et al., 1997; Wang et al., 2021).

Loci
The HOLISTIC model extends across the entire lateral and medial surface of the neocortex. This raises the question as to where in the brain, if anywhere, consciousness 'arises' in this model. To answer this question, the distinction between primary and higher-order consciousness is revisited, with the proposal that higher-order consciousness arises as a collaboration between subcortical structures more associated with primary consciousness and neocortical networks more associated with higher-order consciousness by virtue of their role in language and inner speech.
Primary consciousness in animals implies an origin of consciousness in evolutionarily older brain regions (Birch et al., 2020). Indeed, primary consciousness appears to derive from the upper brainstem/midbrain in mammals (henceforth, 'upper brainstem'). This conclusion is based on a panoply of evidence, including that upper brainstem lesions and stimulation cause loss of consciousness and, conversely, decorticated animals and human children born without a neocortex display conscious behaviours (a form of double dissociation) (Aleman and Merker, 2014; Merker, 2007; Moruzzi and Magoun, 1949; Panksepp, 1998; Parvizi and Damasio, 2001; Shewmon et al., 2007; Solms, 2018). These behaviours appear goal-directed and emotional, even excessively so. Consistent with this, brainstem structures are the source of powerful neuromodulators of mood like serotonin, dopamine, noradrenaline, and acetylcholine (Solms, 2018).

Based on these data, it has been suggested that primary consciousness is inherently affective and not fundamentally neocortical (Damasio, 2012; Solms, 2018). For example, by one proposal, the adaptive function of consciousness is said to be homeostasis, whereby conscious affective states or feelings generated by the brainstem serve as a sort of alarm mechanism to guide behaviour to reduce predictive uncertainty, which typically registers as negative affect in novel contexts (Solms, 2021, 2018; Solms and Friston, 2018). That is, consciousness gives experience an affective 'flavour', a diverse range of qualia that guides behaviour and learning from experience (Cleeremans and Tallon-Baudry, 2021; Solms, 2018). This is consistent with empirical evidence suggesting that consciousness is not likely directly involved in information processing or controlling behaviour but that its function is associated with subsequent flexible responding (Earl, 2014; Halligan and Oakley, 2021; Oakley and Halligan, 2017). From this perspective, the neocortex is a plastic memory store whose size scales with the flexibility and elaborateness of an organism's predictions for maintaining homeostasis.
How does this subcortical basis of primary consciousness relate to inner speech? To account for data like this, the HOLISTIC model places upper brainstem structures as the penultimate selection mechanism that comes after the massively parallel neocortical process of situation model honing, allowing for serial/sequenced behaviour (Fig. 1, '6.'). The upper brainstem selects the response, whether verbal or action, that maximises the homeostatic utility of the space of possible responses. This, in turn, has a felt quality that itself can be used to guide subsequent behaviour. By this model, language and inner speech are a particularly elaborate form of maintaining homeostasis. Furthermore, the bidirectional nature of corticothalamic and thalamocortical connectivity suggests that language and inner speech might substantially influence how primary consciousness is experienced.
Supporting this is the known subcortical involvement in language (Kotz and Schwartze, 2010; Nadeau and Crosson, 1997; Skipper and Lametti, 2021; Vos et al., 2021; Whelan et al., 2005). The upper brainstem has direct input into the thalamus and, from there, corticothalamic and thalamocortical and basal ganglia circuits are involved in language processing (among other things). This has been proposed to include engagement, information transfer, sharpening, and lexical selection functions (Crosson, 2013) and temporal prediction (Kotz and Schwartze, 2010). Indeed, lesions to the thalamus regularly result in different kinds of aphasia, including problems with communication, fluency, naming, and various semantic deficits (Crosson, 2019; Nishio et al., 2014; Rangus et al., 2021; Vos et al., 2021; Whelan et al., 2005).
What about other cortical regions that have been proposed to give rise to consciousness? As reviewed above, decorticated animals suggest that none are completely necessary. Consistent with this, there is not very strong evidence from stimulation or damage for either a front or back, i.e., prefrontal or posterior visual cortical origin of conscious experience (Boly et al., 2017; Odegaard et al., 2017; Raccah et al., 2021). Electrical stimulation that causes reports of conscious experience variously comes from the entire brain and, if anywhere, comes most frequently from 'somatomotor' regions that look suspiciously language-like (Fox et al., 2020; Koch, 2020). Whereas no single cortical lesion predicts loss of consciousness, connectivity with the brainstem and thalamus does (Afrasiabi et al., 2021; Snider et al., 2020).
A similar conclusion can be reached from the more specific literature on self-consciousness (for an extensive review, see Frewen et al., 2020). One prominent view puts the locus of self-awareness in the anterior insular cortex. However, lesion studies suggest that people maintain self-awareness even after bilateral damage to the entire insula (Damasio et al., 2013; Feinstein, 2013; Philippi et al., 2012). Similarly, self-awareness persists after damage to other putative self loci, like the anterior cingulate and medial prefrontal cortices (Philippi et al., 2012). Furthermore, people with severe memory impairments from brain trauma retain a continuous sense of self (Medved and Brockmeier, 2008; Philippi et al., 2012; Rathbone et al., 2009; Tippett et al., 2018). Indeed, sufficient preservation of semantic memory and life narratives has been suggested to be enough to maintain the belief in a persistent self in dementia (Strikwerda-Brown et al., 2019; Tippett et al., 2018). What is common across these cases is the relative preservation of the upper brainstem/thalamus and some ability to use language.
The 'default mode network' activation of situation models with their attendant sensory, motor, and emotional activity and the brainstem affective component of consciousness together might provide an account of so-called 'fringe' consciousness (Mangan, 1993). The latter was originally proposed by William James and refers to vague or dimly felt contextual information that is used to help guide behaviour, corresponding to a feeling of knowing, tip-of-the-tongue, or familiarity. Indeed, research suggests that affective responses do derive from the fringe and that these are used to guide judgements (Reber et al., 2002; Topolinski and Strack, 2009). 'Fringe' consciousness in the HOLISTIC model is suggested to come from the joint affective activations associated with situation models and upper brainstem functioning (see also Epstein, 2004, 2000). 'Unsymbolised thought' might be the result of aborted inner speech, producing a 'fringe' through these same affective activations but without any inner speech becoming conscious (Vicente and Jorba, 2019).

Meta-analyses
Several simple and informal neuroimaging meta-analyses were performed to further illustrate and provide support for the neuroanatomical basis of the HOLISTIC model (see Fig. 1 for a review). These use 'NeuroQuery' to predict the spatial distribution of activity from 418,772 activation locations associated with term frequencies in the full texts of 13,459 neuroimaging articles (https://neuroquery.org/) (Dockès et al., 2020). Fig. 2 shows activity associated with articles in which the term 'language' appears at a high frequency (this is the same meta-analysis as in Fig. 1, '2.'). This activity is projected onto a cortical surface based representation of the MNI aligned Colin27 brain (Holmes et al., 1998) using SUMA (Saad et al., 2004). Results include known language cores in the superior temporal plane and inferior frontal, precentral, and central sulcus regions bilaterally.
Fig. 2. Neuroimaging meta-analyses of 'language' (hot colours), 'autobiographical memories' (white outline), 'self-knowledge' (grey outline), and 'unconscious' (black outline). Results were thresholded at 97% with a cluster size of 100 voxels.
Next, additional meta-analyses were conducted to confirm the implication of the HOLISTIC model that language and self processing share resources. Overlaid on the 'language' meta-analysis in Fig. 2 are two additional NeuroQuery meta-analyses associated with activity for self-relevant information processing, including the related constructs of 'autobiographical memories' (Fig. 2, white outline) and 'self-knowledge' (Fig. 2, grey outline). The resulting activity patterns overlap with 33% of the 'language' activity, including overlap in the superior temporal plane and inferior frontal gyrus (see also Morin and Hamper, 2012). This suggests language and self-relevant processing share significant resources. To confirm this, an overlap map of the 'language', 'autobiographical memories', and 'self-knowledge' meta-analyses was created (not shown). The resulting spatial pattern of activity was then correlated with thousands of other meta-analyses. The overlap map was most correlated with the patterns of activity from language-related meta-analyses, e.g., 'sentences', 'language', 'linguistic', and 'language comprehension' meta-analyses (Yarkoni et al., 2011). Note that this result was not preordained; the overlap could have been associated with some other possible mechanism mediating both language and self processing (e.g., 'control processing').
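The two computations described in this paragraph, the percentage of 'language' activity overlapped by the self-relevant maps and the spatial correlation of the overlap map with other meta-analyses, can be sketched on toy data. This is a minimal illustration rather than the pipeline actually used: the maps are made-up lists of twelve 'voxels' already thresholded to 0/1, the comparison maps are invented, and the function names are hypothetical.

```python
from math import sqrt


def overlap_fraction(reference, others):
    """Fraction of active reference voxels also active in any of the other masks."""
    union = [any(voxels) for voxels in zip(*others)]
    shared = sum(1 for r, u in zip(reference, union) if r and u)
    return shared / sum(reference)


def pearson(x, y):
    """Pearson correlation between two equal-length spatial patterns."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Toy thresholded masks (1 = voxel survives threshold); values are invented.
language        = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
autobiographical = [1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0]
self_knowledge  = [0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0]

fraction = overlap_fraction(language, [autobiographical, self_knowledge])

# Overlap map: 'language' voxels shared with either self-relevant map.
overlap_map = [int(l and (a or s))
               for l, a, s in zip(language, autobiographical, self_knowledge)]

# Invented continuous maps standing in for other term-based meta-analyses.
term_maps = {
    "sentences":          [0.9, 0.8, 0.4, 0.3, 0.2, 0.1, 0, 0, 0, 0, 0, 0],
    "control processing": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0.7, 0.8, 0.9],
}
best_term = max(term_maps, key=lambda term: pearson(overlap_map, term_maps[term]))
print(f"{fraction:.0%} of 'language' voxels overlap; best match: {best_term}")
```

In this toy case the overlap map correlates positively with the invented 'sentences' map and negatively with the invented 'control processing' map, which is the kind of contrast the decoding step is meant to reveal.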
The patterns of overlap between the three meta-analyses described thus far also confirm another implication of the HOLISTIC model, namely that the activation of situation models by 'default mode' regions is a somewhat separable processing stage. That is, despite overlapping a great deal, one of the primary differences between the language activation pattern and the self-relevant meta-analyses patterns is the extent to which they show overlapping activation in medial regions typically associated with the 'default mode network'. These regions include the posterior cingulate cortex, precuneus, medial prefrontal cortex, and hippocampus. Indeed, the self-relevant meta-analyses activation patterns overlap with 53% of a 'default network' meta-analysis activation (not shown), suggesting that they also share significant resources.
Fig. 3. Predictive neuroimaging meta-analyses of the 'thalamus + speech-production' (left column) and 'thalamus + language' (right column). Hot and cool colours suggest positive or negative relationships, respectively. Black and white outlines encompass activity for the 'speech production' and 'language' meta-analysis for comparison. Results were thresholded at 97% with a cluster size of 100 voxels. See Fig. 2 for the full 'language' meta-analysis.
A final meta-analysis was conducted to confirm the implication of the HOLISTIC model that, whereas speech production and inner speech are said to generally be conscious processes, the activity of the periphery associated with situation models and their activation by 'default mode' regions is associated with unconscious processing. Overlaid onto Fig. 2 is the activity pattern associated with the term 'unconscious' (black outline). An examination of the articles that contributed to this meta-analysis indicates that activations mostly derive from studies involving subliminal presentation of stimuli or priming, with many using linguistic stimuli. Apart from the supplementary and presupplementary motor areas and inferior frontal and precentral regions, the results do not overlap much with activation from the 'language' meta-analysis. On the other hand, they do significantly overlap with the activation patterns from the meta-analyses of 'autobiographical memories', 'self-knowledge', and the 'default network', particularly in medial cortices.
To summarise, the results of these meta-analyses are consistent with the HOLISTIC model. They indicate an inseparable link between language and self-processing and a relationship between self-processing and the 'default mode network'. Furthermore, results are consistent with the proposal that language/inner speech cores are more associated with conscious processing, particularly in regions associated with speech perception and language comprehension (i.e., the superior and middle temporal gyri), while medial and other 'default mode' regions are more associated with unconscious processing.
Another set of NeuroQuery meta-analyses was conducted to test the HOLISTIC model proposal for an association between the upper brainstem and thalamus in the penultimate selection of speech production and inner speech responses (associated with consciousness; Fig. 1, '1.' and '6.') and not in comprehension/'default mode' situation model selection (which remains unconscious; Fig. 1, '2.'-'5.'). The 'thalamus and speech production' (Fig. 3, left column, hot colours) and 'speech production' (Fig. 3, black outline) meta-analyses show activation patterns that fully overlap. Also active and overlapping, though not visible in Fig. 3, are the brainstem, aspects of the basal ganglia, and cerebellum. In contrast, the relationship between the activation patterns for the 'thalamus and language' (Fig. 3, right column, hot and cool colours) and 'language' (Fig. 3, white outline) meta-analyses is quite different. These patterns positively overlap with each other and with the activity from the 'speech production' meta-analysis (Fig. 3, black outline) in the central sulcus, insula, and supplementary motor area (Fig. 3, right column, hot colours). Again, overlapping and active but not visible are the brainstem, aspects of the basal ganglia, and cerebellum. Strikingly, activation in the thalamus anti-correlates with a set of regions that strongly overlaps the activation pattern from the 'language' meta-analysis (Fig. 3, right column, cool colours and white outline). This negative relationship also includes regions implicated in autobiographical memories and self-relevant processing (compare with Fig. 2, white and grey outlines).
To summarise, this second set of meta-analytic results is consistent with the HOLISTIC model in that speech production and, by extension, inner speech production are associated with activation of the brainstem/thalamus, key subcortical structures that give rise to both primary and higher-order consciousness (4.2.6). In contrast, the more distributed regions of the brain involved in language, semantic processing, and self-processing (overlapping the 'default mode network') are inactivated when the thalamus is active. These patterns are consistent with the HOLISTIC model suggestion that we are only conscious of our inner voice and not associated semantic/situation model processing.

Mental health
The HOLISTIC model potentially has a number of applications beyond the cognitive sciences, extending to the clinical health sciences. Straightforwardly, selected responses (Fig. 1, '5.' or '6.') might become entrenched through learning and use, making them more automatic and less accessible to conscious awareness. Entrenched responses are likely to be adaptive but may not always be positive or appropriate (or might become less so over time), potentially leading to decrements in wellbeing and mental health. Psychotherapy might work by targeting these entrenched connections.
Given the current lack of mechanistic understanding, the HOLISTIC model might serve as a foundation for studying how therapy results in change (Carey et al., 2020; Cuijpers et al., 2019; Kazdin, 2007; Lemmens et al., 2016). Current proposals do not view language or inner speech as a primary mechanism of the therapeutic process itself. Yet, it is hard to understand proposed behavioural or neurobiological mechanisms of effect without invoking words (e.g., 'rumination' involves high rates of inner speech) (Goldwin et al., 2013; Goldwin and Behar, 2012; McLaughlin et al., 2007; Moffatt et al., 2020).

Other theories
How does the HOLISTIC model compare to other theories of consciousness? This is discussed in the context of six prominent theories or classes of theories (for reviews, see Brown et al., 2019; Northoff and Lamme, 2020; Sattin et al., 2021; Seth and Bayne, 2022; Signorelli et al., 2021). As none considers the role for language and inner speech in consciousness, they are reviewed with regard to how the HOLISTIC model might be related and add explanatory value. This section also brings together a number of themes running throughout in the context of those theories. To preview, they are the (1) higher-order theories (HOTs, where language is a 're-representation'); (2) global workspace theories (GWTs, where core language related brain regions are a global workspace); (3) reentrant processing theories (RPTs, where, e.g., inner dialogue results in recurrence and stabilisation); (4) integrated information theory (IIT, where, e.g., condensed is less conscious than dialogic inner speech); (5) temporo-spatial theory of consciousness (TTC, for which the HOLISTIC model might be a type); and (6) predictive models (for which the HOLISTIC model is a type). Before concluding, the nature of subjective experience is revisited in the context of these models.

HOTs
Higher-order theories address the issue that most of what the brain does seems not to be accessible to consciousness. This raises the question of how some unconscious processes become or are made available to consciousness. Higher-order theories propose that consciousness occurs when unconscious first-order representations (i.e., those of sensory states) are 're-represented' or 'redescribed', perhaps in a different format, by a higher-order representation (Brown et al., 2019; Lau and Rosenthal, 2011). However, it is not clear what higher-order representations are and, as such, they often seem to be defined anatomically (e.g., as 'something the prefrontal cortex does') (Brown et al., 2019). One solution to this issue is to make the higher-order representations linguistic (Morin, 2005). In the HOLISTIC model, core language and inner speech regions 're-represent' peripherally connected networks corresponding to, e.g., sensorimotor and emotional associations of those words or situation models (4.2.3).

GWTs
The HOLISTIC model is also related to global workspace (Baars, 2002; Edelman et al., 2011) and global neuronal workspace theories (Dehaene and Naccache, 2001). In these, as with HOTs, sensory representations are not sufficient for consciousness. Unlike HOTs, there is no need to 'redescribe' these first order representations. Rather, conscious thoughts occur with the ignition and stabilisation of a 'workspace' interconnecting and coordinating specialised (e.g., visual) brain regions. This makes such information globally available to the brain so that it can be acted on. Like HOTs, it is relatively unclear what constitutes a 'global workspace' or what happens in one and so it is also often defined anatomically (with an emphasis on prefrontal-parietal cortex). The HOLISTIC model solution is to identify the global workspace with language and inner speech cores. These interconnect a whole periphery and this information becomes available for use to be acted on through dialogue and/or other mechanisms (3.2 and 4.1), stabilising core activity.

RPTs
Like HOTs and GWTs, recurrent processing theories posit that consciousness does not arise from feedforward processing, e.g., from early visual cortices (Lamme, 2010, 2006). Rather, some form of reentrant, reverberative, or recurrent processing between brain regions is required. The HOLISTIC model is broadly in accord with RPTs in that inner speech creates recurrence in various manners, e.g., through predictive mechanisms in language and inner speech networks (4.2.1; see 5.6).

IIT
In IIT, consciousness is associated with integrated information above zero, quantified using an information theoretic value (Phi or Φ) (Tononi et al., 2016). As Φ is zero in feedforward systems and greater than zero in recurrent systems, IIT aligns with HOTs, GWTs, and RPTs in that consciousness likely requires feedback processing (and has the repercussion that thermostats are conscious). Φ might be used to test arguments made regarding aphasia (2.2 and 2.3.2) and the varieties of inner speech (3.1.1 and 4.2.2). Someone with more extensive damage to the distributed cores supporting speech production and inner speech (lower Φ) should be less conscious than someone with less extensive damage (higher Φ). Similarly, someone using condensed inner speech (lower Φ) should be less conscious than someone having a dialogue in their head (higher Φ).

TTC
Consciousness in the HOLISTIC model emerges as a complex and dynamic spatial arrangement that unfolds over multiple and extended temporal scales (4.2). As such, it is well aligned with the temporo-spatial theory of consciousness, which posits a more comparable neurobiological architecture. That is, the TTC posits a 'common currency' of nested temporal-spatial neurobiological brain states that align with levels, content, phenomenal aspects, and cognitive features of consciousness (Northoff et al., 2020; Northoff and Huang, 2017; Northoff and Zilio, 2022). The TTC does not take a traditional function-based view. Rather, it maintains a global brain dynamics view in which those functions are embedded (Northoff et al., 2020). In contrast, the HOLISTIC model takes a function-based global view, in which language, inner speech, and consciousness unfold at different temporal (e.g., condensed inner speech vs internal dialogue) and spatial scales (e.g., inner speech cores vs dynamic whole-brain peripheries).

Prediction
The HOLISTIC model is an example of a predictive model. This originates not in attempting a grand theory of everything but, rather, in the need to explain how humans understand language. That is, language is ambiguous at all levels of analysis, from phonemes to sentences to discourse. A solution to this ambiguity problem is that the brain uses context at all temporal/spatial scales to predict the acoustic patterns arriving in auditory cortices (Skipper, 2015, 2014; Skipper et al., 2021, 2017, 2007b, 2006, 2005; Skipper and Lametti, 2021). A consequence is that the HOLISTIC model aligns with a large number of theories that suggest consciousness is closely related to feedback predictions and feedforward prediction errors (for a review, see Seth and Bayne, 2022). More generally, predictive models align with the HOTs, GWTs, RPTs, and IIT in requiring feedback for conscious experience.

Experience
Cutting across the HOLISTIC model and other theories of consciousness is (presumably) a desire to understand where and how subjective experience arises. The 'easier' problem might be the 'where' question. In the HOLISTIC model, consciousness is the result of a complex set of cortico-cortical, cortical-subcortical, and bodily interactions (4). Roughly, the upper brainstem is more associated with primary consciousness and the neocortex more associated with higher-order consciousness (4.2.2). This is in contrast with the HOTs (front), GWTs (front), RPTs (back is sufficient), and IIT (back 'hot zone') that are preoccupied with 'the' locus of consciousness being in the front or back of the neocortex (e.g., Boly et al., 2017). This suggests that only the HOLISTIC and TTC models can account for the reviewed (e.g., brainstem) data because they allow the loci of consciousness to vary more dramatically than other models.
In addressing 'how' consciousness arises, a distinction is often made between 'phenomenal' (or 'what it is like') and 'access' consciousness (in which content is available for report) (Nagel, 1974). The HOLISTIC model concerns itself with both, again distinguishing primary (more phenomenal) and higher-order (more access) consciousness. In contrast, RPTs and IIT are more concerned with phenomenal consciousness, whereas HOTs and GWTs are more concerned with access consciousness (and the TTC with both). However, this is a gross characterisation, as subjective experience in the HOLISTIC model is multidetermined, resulting from seven mechanisms (3.2 and 4.1). For example, inner speech cores have a somatic feeling, producing a first-person perspective (4.2.2) imbued with a rich affective 'fringe' generated by peripheral and brainstem networks (4.2.6). As others argue, it is hard to separate phenomenal from access consciousness in this model, as they are likely highly correlated if not inseparable (Baars, 1995; Cohen and Dennett, 2011; Overgaard, 2018).
More generally, it can be asked whether the HOLISTIC model or any of the other theories cross the 'explanatory gap' between neurobiological processes described from the 3rd person and the 1st person experience of how things feel (Chalmers, 1995; Levine, 1983). It is fairly unclear how 'redescription' (HOTs), 'global workspaces' (GWTs), 'recursion' (RPTs), 'integration' (IIT), or 'temporal-spatial patterns' (TTC) result in subjective experience or qualia. Perhaps this is because the underlying theories are mostly ill equipped to explain any real-world behaviour, in that the model system is typically only visual consciousness, studied with 'trademark' paradigms (Havlík et al., 2019) that do not resemble anything the brain actually does. The model of consciousness that emerges will necessarily be impoverished because it cannot account for specific behaviours in context. The HOLISTIC model does better because it allows the factors giving rise to subjective experience to be specified for a specific ecological behaviour that requires embedding in the entire brain (with all of its putative component processes, e.g., vision, memory, emotion, etc.). That said, the HOLISTIC model still likely needs some help getting across the gap, e.g., by adopting the 'phenomenal concept strategy' (Carruthers and Veillet, 2007; Masciari and Carruthers, 2020) or 'dual-aspect monism' (Panksepp, 2005; Solms, 2018; Solms and Friston, 2018).

Conclusions
Converging evidence presented in this review suggests that, if we could precisely excise language and inner speech from the brain without further damage, we would be left without most if not all higher-order and some aspects of primary consciousness (2). The HOLISTIC model of the neurobiology of language, inner speech, and consciousness was provided to explain why this is, along with supporting data (3 and 4; Figs. 1-3). It is the first model of the neurobiology of language to consider consciousness and vice versa and suggests new possibilities for understanding mental health and wellbeing and mechanisms of psychotherapy (4). The level of mechanistic specificity addresses a number of problems associated with current theories of the 'neural correlates of consciousness' (5).
Thus, this review calls for a reconsideration and integration of the neurobiology of language and inner speech not only in the study of consciousness but also in the cognitive and clinical health sciences more generally. Language in these disciplines is typically relegated to the status of something we use for making stimuli or for having people respond to our questions. Yet, this review has highlighted a number of ways that language and inner speech play complex and deterministic roles in many if not most of the social, cognitive, and emotional processes we hold dear. The framework presented here sees the human brain as that of homo narrans, one largely formed by exposure to massive amounts of language from in utero to in coffin. The result of this inculcation is extensively language-mediated functioning, including the generation of conscious experience.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.