Language
- Butterfield, Jeremy.
Damp Squid.
Oxford: Oxford University Press, 2008.
ISBN 978-0-19-923906-1.
-
Dictionaries attempt to capture how language (or at least the
words of which it is composed) is used, or in some cases
should be used according to the compiler of the
dictionary, and in rare examples, such as the
monumental
Oxford English Dictionary (OED),
to trace the origin and history of the use of words over
time. But dictionaries are no better than the source material
upon which they are based, and even the OED, with its millions
of quotations contributed by thousands of volunteer readers, can only
sample a small fraction of the written language. Further,
there is much more to language than the definitions of
words: syntax, grammar, regional dialects and usage,
changes due to the style of writing (formal, informal,
scholarly, etc.), associations of words with one another,
differences between spoken and written language, and
evolution of all of these matters and more over time.
Before the advent of computers and, more recently, the
access to large volumes of machine-readable text afforded
by the Internet, research into these aspects of linguistics
was difficult and extraordinarily tedious, and its accuracy
was suspect due to the small sample sizes necessarily used in
studies.
Corpus linguistics sets out to study how a language is actually
used by collecting a large quantity of text (called
a corpus), tagged with identifying information useful
for the intended studies, and permitting measurement of the statistics
of the content of the text. The first computerised corpus was
created in 1961, containing the then-staggering number of one million
words. (Note that since a corpus contains extracts of text, the
word count refers to the total number of words, not the number of
unique words—as we'll see shortly, a small number of words
accounts for a large fraction of the text.) The preeminent
research corpus today is the
Oxford
English Corpus which, in 2006, surpassed two billion words
and is presently growing at the rate of 350 million words a
year—ain't the Web grand, or what?
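The distinction between total words and unique words is easy to see in a few lines of Python. This is only a minimal sketch: the sample sentence and the function name tokens_and_types are my own inventions for illustration, and a real corpus tool would tokenise far more carefully than this regular expression does.

```python
import re

def tokens_and_types(text):
    """Return (total words, unique words) for a piece of text."""
    # Crude tokeniser: runs of letters and apostrophes, case-folded.
    words = re.findall(r"[a-z']+", text.lower())
    return len(words), len(set(words))

# Invented sample text, not taken from any real corpus.
sample = "the cat sat on the mat and the dog sat on the log"
total, unique = tokens_and_types(sample)
print(total, unique)
```

Corpus sizes such as "one million words" or "two billion words" refer to the first of these two numbers.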
This book, which is a pure delight, a compelling page-turner,
and a must-have for all fanatic “wordies”, is a light-hearted
look at the state of the English language today: not what it
should be, but what it is. Traditionalists
and fussy prescriptivists (among whom I count myself) will be
dismayed at the battles already lost: “miniscule”
and “straight-laced” already outnumber “minuscule”
and “strait-laced”, and many other barbarisms and
clueless coinages are coming on strong. Less depressing and more
fascinating is the empirical research on word frequency
(Zipf's Law
is much in evidence here, although it is never cited by name)—the ten
most frequent words make up 25% of the corpus, and the top one
hundred account for fully half of the text—word origins,
mutation of words and terms, association of words with one
another, idiomatic phrases, and the way context dictates the
choice of words which most English speakers would find almost
impossible to distinguish by definition alone. This amateur astronomer
finds it heartening to discover that the most common noun modified
by the adjective “naked” is “eye” (1398
times in the corpus; “body” is second at 1144 occurrences).
If you've ever been baffled by the origin of the idiom “It's
raining cats and dogs” in English, just imagine how puzzled
the Welsh must be by “Bwrw hen
wragedd a ffyn” (“It's raining old women
and sticks”).
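Both kinds of measurement mentioned above, the share of text covered by the most frequent words and the words which most often follow a given adjective, can be sketched in Python. The sample text and the function names here are hypothetical, chosen only for illustration. I'm also assuming that "the word immediately following" is a good enough stand-in for "the noun modified", whereas a real corpus would presumably rely on part-of-speech tags.

```python
from collections import Counter
import re

def tokenize(text):
    """Crude tokeniser: runs of letters and apostrophes, case-folded."""
    return re.findall(r"[a-z']+", text.lower())

def top_k_coverage(words, k):
    """Fraction of all tokens accounted for by the k most frequent words."""
    counts = Counter(words)
    covered = sum(n for _, n in counts.most_common(k))
    return covered / len(words)

def following_words(words, target):
    """Count the words that immediately follow each occurrence of target."""
    return Counter(b for a, b in zip(words, words[1:]) if a == target)

# Invented sample text, not drawn from the Oxford English Corpus.
sample = ("the planet is visible to the naked eye but the naked eye "
          "cannot see the rings and the naked body of the statue is marble")
words = tokenize(sample)

print(top_k_coverage(words, 1))                      # share taken by the commonest word
print(following_words(words, "naked").most_common(2))
```

Run on a large corpus, the first function with k set to 10 or 100 would yield the sort of 25% and 50% figures quoted above.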
The title? It's an example of an “eggcorn”
(pp. 58–59): a common word or phrase which mutates
into a similar-sounding one as speakers who can't puzzle out its original,
now obscure, meaning try to make sense of it. Now that the
safetyland culture has made most people unfamiliar with explosives,
“damp squib” becomes “damp squid” (although,
if you're a squid, it's not being damp that's a
problem). Other eggcorns marching their way through the language
are “baited breath”, “preying mantis”,
and “slight of hand”.
January 2009
- Houston, Keith.
Shady Characters.
New York: W. W. Norton, 2013.
ISBN 978-0-393-06442-1.
-
¶ The earliest written
languages seem mostly to have been mnemonic tools for recording
and reciting spoken text. As such, they had little need for
punctuation and many managed to get along withoutevenspacesbetweenwords.
If you read it out loud, it's pretty easy to sound out (although
words written without spaces can be used to create deliciously
ambiguous text).
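The ambiguity of unspaced text can be demonstrated with a short Python sketch which finds every way of splitting a string into words from a lexicon. The five-word lexicon and the example string are my own inventions, and the brute-force recursion is meant only as an illustration, not as a practical word-segmentation algorithm.

```python
def segmentations(text, lexicon):
    """Return every way to split text into a sequence of lexicon words."""
    if not text:
        return [[]]          # one way to split nothing: the empty sequence
    results = []
    for end in range(1, len(text) + 1):
        word = text[:end]
        if word in lexicon:
            # Keep this word and try every split of the remainder.
            for rest in segmentations(text[end:], lexicon):
                results.append([word] + rest)
    return results

# Hypothetical miniature lexicon, just large enough to show the ambiguity.
LEXICON = {"we", "are", "now", "here", "nowhere"}
readings = [" ".join(ws) for ws in segmentations("wearenowhere", LEXICON)]
for reading in readings:
    print(reading)
```

The same twelve letters yield two readings with opposite meanings, and only context (or a space) tells the reader which was intended.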
As the written language evolved to encompass scholarly and
sacred texts, commentaries upon other texts, fiction,
drama, and law, the structural complexity of the text grew
apace, and it became increasingly difficult to express this in
words alone. Punctuation was born.
In the third century B.C.
Aristophanes of Byzantium
(not to be confused with
the other fellow),
librarian at Alexandria,
invented a system of dots to denote logical breaks in
Greek texts of classical rhetoric, which were placed
after units called the komma, kolon,
and periodos. In a different graphical form, they are
with us still.
Until the introduction of movable type printing in Europe in the
15th century, books were hand-copied by scribes, each of whom was
free, within the constraints of their institutions, to innovate
in the presentation of the texts they copied. In the interest of
conserving rare and expensive writing materials such as papyrus and
parchment, abbreviations came into common use. The humble ampersand
(the derivation of whose English name is delightfully presented
here) dates to the shorthand invented by Cicero's personal
secretary/slave Tiro, who invented a mark to quickly write
“et” as his master
spoke.
Other punctuation marks co-evolved with textual criticism: quotation
marks allowed writers to distinguish text from other sources
included within their works, and asterisks, daggers, and other
symbols were introduced to denote commentary upon text. Once
bound books
(codices) printed
with wide margins became common, readers would annotate them as
they read, often ☛ pointing out key
passages. Even a symbol as with-it as the now-ubiquitous “@”
(which I recall around 1997 being called “the Internet logo”)
is documented as having been used in 1536 as an abbreviation for
amphorae of wine. And the ever-more-trending symbol prefixing #hashtags?
Isaac Newton used it in the 17th century, and the story of how it came
to be called an “octothorpe” is worthy of modern myth.
This is much more than a history of obscure punctuation. It traces how
we communicate in writing over the millennia, and how technologies such
as movable type printing, mechanical type composition, typewriting, phototypesetting,
and computer text composition have both enriched and impoverished our written
language. Impoverished? Indeed—I compose this on a computer able
to display in excess of 64,000 characters from the written languages used
by most people since the dawn of civilisation. And yet, thanks to the poisonous
legacy of the typewriter, only a few people seem to be aware of the distinction,
known to everybody setting type in the 19th century, among the em-dash—used
to set off a phrase; the en-dash, denoting “to” in constructions
like “1914–1918”; the hyphen, separating compound words such as
“anarcho-libertarian” or words split at the end of a line; the minus
sign, as in −4.221; and the figure dash, with the same width as numbers
in a font where all numbers have the same width, which permits setting tables
of numbers separated by dashes in even columns. People who appreciate typography
and use
TeX
are acutely aware of this and grind their teeth when reading documents
produced by demotic software tools such as Microsoft Word or reading postings
on the Web which, although they could be so much better, would have made
Mencken storm the Linotype floor of the Sunpapers had any of his writing been
so poorly set.
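Each of these marks has its own Unicode code point, distinct from the single hyphen-minus key the typewriter bequeathed to the keyboard. Here is a minimal Python sketch which looks them up with the standard unicodedata module. The labels in the dictionary are my own, and the characters are written as escape sequences so the listing survives any font or editor.

```python
import unicodedata

# Labels are informal; the characters are given as Unicode escapes.
DASHES = {
    "hyphen": "\u2010",
    "figure dash": "\u2012",
    "en dash": "\u2013",
    "em dash": "\u2014",
    "minus sign": "\u2212",
    "typewriter key": "\u002d",
}

def describe(char):
    """Return the code point and official Unicode name of a character."""
    return f"U+{ord(char):04X} {unicodedata.name(char)}"

table = {label: describe(ch) for label, ch in DASHES.items()}
for label, description in table.items():
    print(f"{label:15} {description}")
```

The last entry, which Unicode names HYPHEN-MINUS, is the all-purpose character that ends up standing in for the other five in most text typed directly at a keyboard.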
Pilcrows, octothorpes, interrobangs, manicules, and the centuries-long quest
for a typographical mark for irony (Like, we really need that¡)—this
is a pure typographical delight: enjoy!
In the Kindle edition, end-of-chapter notes
are bidirectionally linked (albeit with inconsistent
and duplicate reference marks), but end notes are not linked to their
references in the text—you must manually flip to the notes
and find the number. The end notes contain many references to Web
URLs, but these are not active links, just text: to follow them you
must copy and paste them into a browser address bar. The index is just
a list of terms, not linked to references in the text. There is no
way to distinguish examples of typographic symbols which are set in
red type from chapter note reference links set in an identical red
font.
October 2013
- Wolfe, Tom.
The Kingdom of Speech.
New York: Little, Brown, 2016.
ISBN 978-0-316-40462-4.
-
In this short (192 page) book, Tom Wolfe returns
to his roots in the “new journalism”, of which he was
a pioneer in the 1960s. Here the topic is the theory of evolution;
the challenge posed to it by human speech (because no obvious
precursor to speech occurs in other animals); attempts, from
Darwin to Noam Chomsky, to explain this apparent discrepancy and
preserve the status of evolution as a “theory of everything”;
and the evidence collected by linguist and anthropologist
Daniel Everett
among the
Pirahã
people of the Amazon basin in Brazil, which appears to falsify
Chomsky's lifetime of work on the origin of human language and
the universality of its structure. A second theme is contrasting
theorists and intellectuals such as Darwin and Chomsky
with “flycatchers” such as
Alfred Russel
Wallace, Darwin's rival for priority in publishing the theory
of evolution, and Daniel Everett, who work in the field—often in
remote, unpleasant, and dangerous conditions—to collect
the data upon which the grand thinkers erect their castles of hypothesis.
Doubtless fearful of the reaction if he suggested the theory of
evolution applied to the origin of humans, in his 1859 book
On the Origin of Species, Darwin only tiptoed
close to the question two pages from the end, writing,
“In the distant future, I see open fields for far more
important researches. Psychology will be securely based on a new
foundation, that of the necessary acquirement of each mental
power and capacity by gradation. Light will be thrown on the
origin of man and his history.” He needn't have been
so cautious: he fooled nobody. The very first review, five
days before publication, asked, “If a monkey has become
a man—…?”, and the tempest was soon at
full force.
Darwin's critics, among them
Max Müller,
German-born professor of languages at Oxford, and Darwin's
rival Alfred Wallace, seized upon human
characteristics which had no obvious precursors in the animals
from which man was supposed to have descended: a hairless body,
the capacity for abstract thought, and, Müller's emphasis,
speech. As Müller said, “Language is our Rubicon, and
no brute will dare cross it.” How could Darwin's theory,
which claimed to describe evolution from existing characteristics
in ancestor species, explain completely novel properties which animals
lacked?
Darwin responded with his 1871 The Descent of Man, and Selection
in Relation to Sex, which explicitly argued that there were
precursors to these supposedly novel human characteristics among
animals, and that, for example, human speech was foreshadowed by the
mating songs of birds. Sexual selection was suggested as the
mechanism by which humans lost their hair, and the roots of a number
of human emotions and even religious devotion could be found in the
behaviour of dogs. Many found these arguments, presented without any
concrete evidence, unpersuasive. The question of the origin of
language had become so controversial and toxic that a year later, the
Philological Society of London announced it would no longer accept
papers on the subject.
With the rediscovery of
Gregor Mendel's
work on genetics and subsequent
research in the field, a mechanism which could explain Darwin's
evolution was in hand, and the theory became widely accepted,
with the few discrepancies set aside (as had the Philological
Society) as things we weren't yet ready to figure out.
In the years after World War II, the social sciences became
afflicted by a case of “physics envy”. The contribution
to the war effort by their colleagues in the hard sciences in
areas such as radar, atomic energy, and aeronautics had been
handsomely rewarded by prestige and funding, while the more
squishy sciences remained in a prewar languor along with the
departments of Latin, Medieval History, and Drama. Clearly, what
was needed was for these fields to adopt a theoretical approach
grounded in mathematics which had served so well for chemists,
physicists, and engineers, and appeared to be working for the new
breed of economists.
It was into this environment that in the late 1950s a young
linguist named
Noam Chomsky
burst onto the scene. Over its century and a half of history,
much of the work of linguistics had been cataloguing and studying
the thousands of languages spoken by people around the world, much
as entomologists and botanists (or, in the pejorative term of
Darwin's age, flycatchers) travelled to distant lands to
discover the diversity of nature and try to make sense of
how it was all interrelated. In his 1957 book,
Syntactic Structures, Chomsky, then just twenty-eight
years old and working in the building at MIT where radar had been developed
during the war, said all of this tedious and messy field work
was unnecessary. Humans had evolved (note, “evolved”)
a “language organ”, an actual physical structure within
the brain—the “language acquisition device”—which
children used to learn and speak the language they heard from
their parents. All human languages shared a “universal
grammar”, on top of which all the details of specific
languages so carefully catalogued in the field were just fluff, like
the specific shape and colour of butterflies' wings. Chomsky
invented the “Martian linguist”, who was to come to
feature in his lectures and who, he claimed, arriving on Earth, would
quickly discover the unity underlying all human languages. No longer
need the linguist leave his air-conditioned office. As
Wolfe writes in chapter 4, “Now, all
the new, Higher Things in a linguist's life were to be found
indoors, at a desk…looking at learned journals filled with
cramped type instead of at a bunch of hambone faces in a cloud
of gnats.”
Given the alternatives, most linguists opted for the office, and for
the prestige that a theory-based approach to their field conferred,
and by the 1960s, Chomsky's views had taken over linguistics, with
only a few dissenters, at whom Chomsky hurled thunderbolts from his
perch on academic Olympus. He transmuted into a general-purpose
intellectual, pronouncing on politics, economics, philosophy,
history, and whatever occupied his fancy, all with the confidence
and certainty he brought to linguistics. Those who dissented
he denounced as “frauds”, “liars”, or
“charlatans”, including B. F. Skinner, Alan Dershowitz,
Jacques Lacan, Elie Wiesel, Christopher Hitchens, and Jacques
Derrida. (Well, maybe I agree when it comes to Derrida and Lacan.)
In 2002, with two colleagues, he published a new theory
claiming that recursion—embedding one thought within
another—was a universal property of human language and
component of the universal grammar hard-wired into the brain.
Since 1977, Daniel Everett had been living with and studying the
Pirahã in Brazil, originally as a missionary and later as
an academic linguist trained and working in the Chomsky tradition.
He was the first person to successfully learn the Pirahã
language, and documented it in publications. In 2005 he published
a paper in which he concluded that the language, one of the
simplest ever described, contained no recursion whatsoever.
It also lacked past and future tenses, terms for
relations beyond parents and siblings, gender, numbers, and many
other features found in other languages. But the absence of recursion
falsified Chomsky's theory, which pronounced it a fundamental
part of all human languages. Here was a field worker, a
flycatcher, braving not only gnats but anacondas, caimans,
and just about every tropical disease in the catalogue, knocking the
foundation from beneath the great man's fairy castle of theory.
Naturally, Chomsky and his acolytes responded with their
customary vituperation (this time, the adjective of choice for
Everett was “charlatan”). Just as they were preparing
the academic paper which would drive a stake through this nonsense,
Everett published
Don't Sleep, There Are Snakes,
a combined account of his thirty years with the Pirahã
and an analysis of their language. The book became a popular hit and
won numerous awards. In 2012, Everett followed up with
Language: The Cultural Tool,
which rejects Chomsky's view of language as an innate and universal
human property in favour of the view that it is one among a
multitude of artifacts created by human societies as a tool, and
necessarily reflects the characteristics of those societies.
Chomsky now refuses to discuss Everett's work.
In the conclusion, Wolfe comes down on the side of Everett, and
argues that the solution to the mystery of how speech evolved is
that it didn't evolve at all. Speech is simply a tool which humans
used their big brains to invent to help them accomplish their
goals, just as they invented bows and arrows, canoes, and
microprocessors. It doesn't make any more sense to ask how
evolution produced speech than it does to suggest it produced
any of those other artifacts not made by animals. He further
suggests that the invention of speech proceeded from initial
use of sounds as mnemonics for objects and concepts, then
progressed to more complex grammatical structure, but I found
little evidence in his argument to back the supposition, nor is
this a necessary part of viewing speech as an invented artifact.
Chomsky's grand theory, like most theories made up without grounding
in empirical evidence, is failing both by being falsified on its
fundamentals by the work of Everett and others, and also by the
failure, despite half a century of progress in neurophysiology,
to identify the “language organ” upon which it is
based.
It's somewhat amusing to see soft science academics rush to Chomsky's
defence, when he's arguing that language is biologically determined
as opposed to being, as Everett contends, a social construct whose
details depend upon the cultural context which created it. A hunter-gatherer
society such as the Pirahã, living in an environment where food is
abundant and little changes over time scales from days to
generations, doesn't need a language as complicated as that of
an agricultural society with division of labour, and it shouldn't be a
surprise to find their language is more rudimentary. Chomsky assumed
that all human languages were universal (able to express any concept),
in the sense
David Deutsch
defined universality in
The Beginning of Infinity, but why should
every people have a universal language when some cultures get along
just fine without universal number systems or alphabets? Doesn't
it make a lot more sense to conclude that people settle on a language,
like any other tool, which gets the job done? Wolfe then argues that
the capacity of speech is the defining characteristic of human
beings, and enables all of the other human capabilities and
accomplishments which animals lack. I'd consider this not proved. Why
isn't the definitive human characteristic the ability to make tools,
and language simply one among a multitude of tools humans have invented?
This book strikes me as one or two interesting blog posts
struggling to escape from a snarknado of Wolfe's 1960s-style
verbal fireworks, including Bango!, riiippp,
OOOF!, and “a regular crotch crusher!”.
At age 85, he's still got it, but I wonder whether he, or his
editor, questioned whether this style of journalism is as
effective when discussing evolutionary biology and linguistics
as in mocking sixties radicals, hippies, or pretentious
artists and architects. There is some odd typography, as well.
Grave accents are used in words like “learnèd”,
presumably to indicate it's to be pronounced as two syllables,
but then occasionally we get an acute accent instead—what's
that supposed to mean? Chapter endnotes are given as
superscript letters while source citations are superscript numbers,
neither of which is easy to select on a touch-screen
Kindle edition. There is no index.
January 2017