Some youthful observations about language

As a youth, I studied lots of languages [1]. As I matured I was struck at once by the delightful multitude of syntactic devices that languages invent for the expression of meaning. At the same time, I was struck by several things:

1.      How uncanny the similarities were between what people in these various languages chose to talk about.

They all talked about people, and space and time, and needs, plans and possessions, and change, completion, speed, stories and thought [2]. Out of all the possible realities that one might invent (given a supple imagination), how odd, I thought, that all cultures seemed to have converged on so much common ground [3]. If we had been born in a two-dimensional universe instead of a three-dimensional one, how different might our language be, I wondered. How very odd that humans all have so very much in common.

2.      A second thing that impressed me was that three of the languages I studied had two-part negation:

·         French -- ne verb pas

·         Navajo -- doo verb phrase da (no = dooda)

·         Quechua -- mana verb phrase çu (no = manaçu)

That gave me cause to think a bit, and made me quite ready to accept Herb Simon's argument against Chomsky when it came about a few years later. (I later discovered that Mongolian did the same thing.)

3.      The third thing was how cleverly some languages work to conserve vocabulary.

While I was quite familiar [4] with Charles Ogden's Basic English [5], and had some serious complaints about it [6], it had helped me develop an eye for semantic parsimony. An example from Quechua:


·         yaça-  (verb stem)

·         yaçay -- to know (inflected with regular inflection through person and number)

·         yaçaçi -- to teach; literally, to cause to know. (çi is the transitive particle that can be added to many intransitive stems)

·         yaçaçiq -- a teacher; literally one who causes to know. (q is a nominalizer restricted, I think, to humans)

I found hundreds of these examples in the languages I was studying. Navajo was particularly resistant to foreign loan words, and Quechua, while borrowing some vocabulary and syntax from Spanish, had remained relatively unaffected.
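The derivation chain in the example above can be sketched as simple suffix stacking. This is only an illustration under stated assumptions: the table names STEMS and SUFFIXES, the function derive, and the bracketed gloss notation are all hypothetical, and the lexicon contains nothing beyond the stem and the two suffixes quoted above.

```python
# Hypothetical mini-lexicon: only the stem and suffixes quoted in the text.
STEMS = {"yaça": "know"}
SUFFIXES = {
    "çi": "cause",    # "to cause to ..."
    "q": "one-who",   # nominalizer: "one who ..."
}

def derive(stem, suffixes):
    """Stack suffixes on a stem; return the surface form and a nested gloss."""
    form = stem
    gloss = STEMS[stem]
    for suffix in suffixes:
        form += suffix
        gloss = f"{SUFFIXES[suffix]}({gloss})"
    return form, gloss

print(derive("yaça", ["çi"]))       # ('yaçaçi', 'cause(know)')
print(derive("yaça", ["çi", "q"]))  # ('yaçaçiq', 'one-who(cause(know))')
```

Each new word costs one suffix rather than one new root, which is the parsimony the example is meant to show.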

Toward a youthful theory of semantics

Several thoughts came together: how parsimonious actual languages could be, how parsimonious Basic English had tried to be, and the astonishing cross-cultural similarity in what humans actually chose to talk about (out of all the possible realities that the 1960s had instructed us to be aware of). Together with a burgeoning interest in the semantics of mathematical systems, they gave me cause to reconsider Ogden's fundamental question:

What is the minimum number of semantic primitives necessary to express human thought?

I then undertook a several-year quest to develop and study just such a language. My answers to two of the simplest questions are:

1. Does this reductionist semantics really work?

·         Well, not exactly. However,…

Some small number of semantic primitives plus an appropriate syntax (e.g. one which is anaphoric or graph-theoretic), I think, suffices to encode most of the non-molecular cognitive reality of humans (and several hypothetical categories of sentient species). The molecular world populated by halibut, Coca-Cola, guitars and rhinos is likely to require an open and extensible format, but plain old human thought as expressed in philosophy, teleology and mechanism is likely not to require much more, until, perhaps, we, as a species, mutate.

2. For a given language, can a semantic reduction be done methodically? That is, given a monolingual dictionary for an arbitrary language, may we rewrite the definitions of words using a small “defining lexicon” of primitives, and systematically minimize the size of the defining lexicon?

·         No.

My article "The extraction of a minimum set of semantic primitives from a monolingual dictionary is NP-complete" (soon to be reprinted by MIT Press) concludes that it won't be easy in the general case (though that doesn't mean a clever human couldn't optimize a given language with a good set of heuristics and a working knowledge of the language).
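To make the question concrete, here is a minimal sketch of the minimization problem on a toy dictionary. It is not the construction used in the article. The dictionary TOY, the function names, and the modeling choice (a word counts as expressible once every word in its definition is expressible) are assumptions made for illustration. The search is brute force over subsets, so it is exponential in the size of the dictionary, which at least gives a feel for why the general case is hard.

```python
from itertools import combinations

# Hypothetical toy dictionary: headword -> set of words used in its definition.
# It contains three circular pairs (act/thing, think/person, cause/change).
TOY = {
    "act": {"thing"},
    "thing": {"act"},
    "think": {"act", "person"},
    "person": {"think", "thing"},
    "cause": {"act", "change"},
    "change": {"cause", "thing"},
    "know": {"think", "thing"},
    "teach": {"cause", "know"},
    "teacher": {"person", "teach"},
}

def expressible(dictionary, primitives):
    """Words reachable from `primitives` by repeatedly applying definitions."""
    known = set(primitives)
    changed = True
    while changed:
        changed = False
        for word, definition in dictionary.items():
            if word not in known and definition <= known:
                known.add(word)
                changed = True
    return known

def smallest_defining_lexicon(dictionary):
    """Try every subset of headwords, smallest first (exponential time)."""
    words = sorted(dictionary)
    for size in range(len(words) + 1):
        for candidate in combinations(words, size):
            if expressible(dictionary, candidate) >= set(words):
                return set(candidate)
    return set(words)

result = smallest_defining_lexicon(TOY)
print(sorted(result))  # ['act', 'cause', 'person']
```

In this toy case every circular pair of definitions forces at least one of its members into the defining lexicon, so three primitives are needed and three suffice. Other choices of equal size (for instance thing, change and think) would work just as well.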

Recent discovery of a similar approach

Another thing to say on this topic is that at about the same time (35 years ago) that I put this topic on hold as a research project in Boulder, Anna Wierzbicka was writing about a very similar project called Natural Semantic Metalanguage (NSM) in either Poland or Australia. I never learned about her project (which seems to have gathered some real steam since then) until a couple of weeks ago (September 2008), and have not yet really begun to digest it. On a superficial level:

·         some of the linguistic and inferential methodologies used by the approach seem similar to a paper I wrote in grad school discussing the implications of "verbs of influence" to converge on underlying meaning.

·         (good news!) many of the primitives that NSM advances are much the same as those I arrived at

I do not see in NSM the syntactic devices (the graph grammar) necessary to drive a more general theory of language from that theory of semantics contained within NSM. My goal was a bit different perhaps: not just to come up with an optimal set of primitives, but also to allow the creation of a language in which those primitives could be combined so as to express complex ideas. Maybe it is there; I just don't see it in the Wikipedia summary -- Duh!

Ultimately, it doesn't really matter, as what I write about here is not NSM, but rather, what I do know about. The purpose is not to re-invent a wheel but to describe one such wheel, as I attempt to use that wheel as a more general part of a machine that involves SVG as an expressive medium. See remarks on language as a two-dimensional exercise (under development as "Language" in Words, Meaning and Language).

Details, as time allows (the primitives and their justifications):

It should be noted that the theory (or is it merely a speculation?) put forth here underwent at least two distinct generations of formulation prior to arriving in the form shown here. The first was the development of a detailed cosmology in which the simplest of universes was imagined, and from which the evolution or ancestry of other primitives was derived, in classical cosmological narrative [7]. The second involved the creation of a writing system using an “alphabet of semantic primitives” [8] together with a syntax involving parentheses and subscripts. The primitives are clustered into topics, wherein the discussion of their inclusion and the subtleties associated with them may be had, though other organizations (which reflect others of my historic presentations) may prove preferable.

·         act (generic verb marker) -- to do or, in some cases, to be, as in to exist.


act agent -- agent acts (an agent is able to act)

act manner -- the way something is done

·         sense and think


·         thing and essence (generic noun markers) -- In English the morpheme -ness comes close to a generic essence. An essence with molecular presence would be a thing.

·         able/possible -- typically used as a verb ("can"), this form is used to imply that some act can be accomplished, that some fact is possible, or that some agent is able to do something.


(act can) agent -- agent can act (an agent is able to act)

can adj -- able

·         universal and existential (all/some)

·         poset (brings comparatives for ancestry, modal logic, ethics and spatial process)

·         value (manner, quality, magnitude, number) -- these are four aspects of how we assign values to events. The manner of an event is a non-quantitative description.

·         this/that/yonder/unknown (as in Navajo -- enables deixis for person, time, evaluative and space modalities)


this act -- to substitute for, to equal

          this     that          yonder       what
place     here     right there   over there   where
time      now      then          about then   when
person    I/me     you           he/she       who
quality   thus     that way      the way      how

·         need

·         adjectival mark (in context includes adverbial) -- -like (converts nouns or verbs to adjectival or adverbial modifiers)


person-like -- human

think-like -- thoughtful

magnitude-like -- big

quantity-like -- numerous

this-like -- similar


·         gender

·         negation (incl. voidance & reflection) -- In most languages there is a mechanism to differentiate between opposition (as in "the uncola", which means to invert along the "cola" axis) and the simple absence of a property (as in "asexual", where the value has merely been voided). Mathematically we may think of the distinction as that between multiplying by negative one (opposition) and multiplying by zero (absence).

·         time (including past, present and poset/hypothetical) -- There is a past (-ed-ness), there is a present (this-time-ness), and there are a variety of possible futures:

this time quantity < alpha time quantity

this time quantity < beta time quantity

for which alpha time quantity ~? beta time quantity

However, much of human discussion of futures seems to adhere to a series of relativizations (like viewports) in which transitivity of the < relationship on time quantities appears to apply.

·         space (including those which are metric but non-dimensional, but certainly including directional vectors in nonmetric spaces flavored by vector bundles)

·         iteration/extrapolation/completion (for continuative aspects of verbs)

·         the conjunctions of standard first-order logic {and, or, if/then, if and only if, negation, etc.} -- It so turns out that all of the binary truth operators (mapping a pair of Boolean values to either true or false) can be defined from one: the neither-nor operator. So, in a strictly reductionist way, neither-nor alone would suffice.

·         preventative and causative

·         change -- to become not-this-like or simply to become.

·         person -- this may be replaced by think thing -- a sentient essence.

·         etc.
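The remark in the list above about the neither-nor operator can be checked mechanically. The sketch below builds negation, or, and, if/then, and if-and-only-if from neither-nor alone and compares each against Python's built-in Boolean operators over all inputs. The function names are chosen here for illustration and are not part of the notation described in this article.

```python
from itertools import product

def nor(a, b):
    """Neither a nor b: the single primitive everything else is built from."""
    return not (a or b)

def not_(a):
    return nor(a, a)

def or_(a, b):
    return not_(nor(a, b))

def and_(a, b):
    return nor(not_(a), not_(b))

def implies(a, b):  # if a then b
    return or_(not_(a), b)

def iff(a, b):      # a if and only if b
    return and_(implies(a, b), implies(b, a))

# Exhaustive check against the built-in operators.
for a, b in product([False, True], repeat=2):
    assert not_(a) == (not a)
    assert or_(a, b) == (a or b)
    assert and_(a, b) == (a and b)
    assert implies(a, b) == ((not a) or b)
    assert iff(a, b) == (a == b)
print("all connectives recovered from nor")
```

Because and, or and negation are all recovered this way, any of the sixteen binary truth operators can be written with neither-nor alone, at the cost of longer expressions.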

[1] I had coursework in French, Russian, Navajo, and Quechua (including quick glances at half a dozen others). See: Whence this interest arose.

[2] This is of course contrary to the Sapir-Whorf hypothesis. I will confess I don't follow Whorf here. Hopi does have words for yesterday and for years ago, and it does articulate tense in its verb system -- at least so a check of my own informants and of a Hopi-English dictionary conveyed. To claim that a concept of time is missing is rather an extrapolation beyond the data, I fear.

[3] Perhaps the cultures whose realities were so imaginative that they could not be conveyed to neighboring cultures were simply eradicated through the intolerance of those neighbors. Humans have not always displayed exemplary levels of tolerance and understanding.

[4] Having read about it in my early readings on linguistics and having written poems that used its vocabulary and a random number generator in one of my college poetry classes, much to the delight of Professor Rodefer, but much to the consternation and disgust of some students in the class.

[5] A list of 850 words from which, purportedly, all meanings in English and other languages could be written.

[6] A good number of the words in the list seemed to me to be quite unnecessary. More critically, a good number of the "primitives" in that lexicon were used to stand for several of their polysemes. The ambiguity of primitives was, to my mathematical sensibilities, offensive.

[7] The cosmology (to be documented in more detail here at a future time) begins with Eternity (a moment forever). From Eternity emerges Thought (or consciousness); Thought creates the first difference (Change): the difference between itself (This) and other (That). On reflection upon Difference, Time is born, and with each new reflection it makes, Thought gives birth to more of the semantic primitives. Space, for example, is born as a metaphor for the posets traversed by Thought in its own reflections upon itself, and the history of its own activity (Action) in that spatial metaphor. As the Space of Thought’s activities becomes more familiar and cross-connected over Time, that Space grows toward a more continuous and less discrete space such as a Euclidean or axial space, not because Space itself is intrinsically Euclidean nor even axial, but because that metaphor comes to describe certain classes of familiar activity. The very early activities of Thought as it emerges from Eternity are also laden with emotive connotation, whence also is born a system of ethics, needs and goals.

[8] The term “alphabet” may be misleading. Linguistics often divides writing systems into alphabets, syllabaries, and ideographies (or symbologies), as exemplified, respectively, by Latin, Cherokee and Chinese. Here alphabets and syllabaries are considered to be based on the phonology of language, while ideographies are based on the semantics. The semantic writing system I developed (called Finger Thinking) was clearly ideographic, but since I wished to be able to write (and type) it easily, I used the Roman alphabet for the majority of its character set. In computing theory, the term “alphabet” is used, more broadly, to refer to any character set.

