Zellig Sabbetai Harris (October 23, 1909 – May 22, 1992) was an American linguist. Originally a student of Semitic languages, he is best known for his work in structural linguistics and discourse analysis. He also contributed to the investigation of sub-language grammar, operator grammar, and a theory of linguistic information. Harris viewed his research not just as an academic exercise but as work with social application. Indeed, many applications, particularly in computing, trace their origins to his work. Harris regarded language as an essentially social activity, the basis of communication among people. His work on grammar and sub-languages reflected this belief: his theories treated grammatical form and semantic content as essentially connected, both conveying meaning in a social context. His studies of sub-languages and their development within specialized areas of work, such as medicine, are valuable in revealing how harmonious communication can be maintained among diverse groups within the larger society. As linguists come to understand the role of sub-languages, human society can maintain its coherence as a whole while encouraging a diversity of specializations, so that people achieve their full potential as individuals, maximizing their specific abilities while remaining well connected to the larger society.



Zellig Sabbetai Harris was born on October 23, 1909, in Balta, Russia (in present-day Ukraine). His middle name, “Sabbatai,” together with his brother’s first name, “Tzvee,” indicates that his parents were followers of Sabbatai Zevi, or Tsvee (1626–1676), a Jewish rabbi who claimed to be the Messiah.

Harris came with his family to Philadelphia, Pennsylvania, in 1913, when he was four years old. A student in the Oriental Studies department, he received his bachelor's degree in 1930, master's degree in 1932, and doctoral degree in 1934, all from the University of Pennsylvania. He spent his whole professional life at that institution.

Harris began teaching in 1931, and went on to found the linguistics department there in 1946, the first such department in the country. He started his career in Semitic languages, and spent some time studying Phoenician and Ugaritic. In 1939 he published his Development of the Canaanite Dialects, a study of the early history of the Canaanite branch of West Semitic, to which the Phoenician dialects, along with Hebrew, Moabite, and others, belong.

In the early 1940s, Harris turned his focus to the study of general linguistics, for which he eventually became famous. In 1951, he published his Methods in Structural Linguistics, which became the standard textbook for more than a decade. He also engaged with the new field of computational linguistics, which had just emerged with the advent of the first computers (Penn participated in the development of ENIAC, one of the first electronic computers).

In 1966, he was named the Benjamin Franklin Professor of linguistics at the University of Pennsylvania.

Harris spent many summers working on a kibbutz in Israel. His wife, Bruria Kaufman, was a professor at the Weizmann Institute in Rehovot, and also worked as an assistant to Albert Einstein at Princeton. Harris actively advocated for the independence of Israel, and was known as a zealous Zionist. He was active in Avukah, the student Zionist organization of that time, which flourished on the Penn campus during Harris’ time there.

Harris retired in 1979, and died at his home in New York City, on May 22, 1992.


It is widely believed that Harris carried the linguistic ideas of Leonard Bloomfield to their extreme development: the investigation of discovery procedures for phonemes and morphemes, based on the distributional properties of these units.
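The idea of a unit's "distributional properties" can be illustrated with a minimal sketch. The following Python example is only a toy, not Harris' own procedure: it works on whole words rather than phonemes or morphemes, and the corpus, the boundary marker "#", and the function names are all assumptions made for illustration. It records the environments (left and right neighbors) in which each word occurs and reports the environments that two words share.

```python
from collections import defaultdict

def environments(sentences):
    """Map each word to the set of (left, right) neighbor pairs it occurs in."""
    envs = defaultdict(set)
    for sentence in sentences:
        # "#" is an assumed marker for an utterance boundary
        words = ["#"] + sentence.split() + ["#"]
        for i in range(1, len(words) - 1):
            envs[words[i]].add((words[i - 1], words[i + 1]))
    return envs

def shared_environments(envs, a, b):
    """Return the environments in which both a and b are attested."""
    return envs[a] & envs[b]

# Hypothetical four-sentence corpus
corpus = [
    "the dog sleeps",
    "the cat sleeps",
    "the dog barks",
    "a cat sleeps",
]
envs = environments(corpus)
print(shared_environments(envs, "dog", "cat"))
```

In this toy corpus "dog" and "cat" both occur between "the" and "sleeps," which is the kind of evidence a distributional method would use to group them into one class.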

Harris' Methods in Structural Linguistics (1951) is the definitive formulation of descriptive structural work as developed up to 1946. This book made him famous, but was (and still is) frequently misinterpreted as a synthesis of a "neo-Bloomfieldian school" of structuralism. His discovery procedures are methods for verifying that results are validly derived from the data, freeing linguistic analysis from Positivist-inspired restrictions, such as the fear that to be scientific one must progress stepwise from phonetics, to phonemics, to morphology, and so on, without "mixing levels."

Beginning with the recognition that speaker judgments of phonemic contrast are the fundamental data of linguistics (not derived from distributional analysis of phonetic notations), his signal contributions in this regard during this period include discontinuous morphemes, componential analysis of morphology and long components in phonology, a substitution-grammar of phrase expansions that is related to immediate-constituent analysis, and above all a detailed specification of validation criteria for linguistic analysis. The book includes the first formulation of generative grammar.

Natural language, which demonstrably contains its own metalanguage, cannot be based in a metalanguage external to it, and any dependence on a priori metalinguistic notions obscures an understanding of the true character of language. Deriving from this insight, his aim was to constitute linguistics as a product of mathematical analysis of the data of language, an endeavor which he explicitly contrasted with attempts to treat language structure as a projection of language-like systems of mathematics or logic.

Linguistic transformation

As early as 1939, Harris began teaching his students about linguistic transformations and the regularizing of texts in discourse analysis. This aspect of his extensive work in diverse languages such as Kota, Hidatsa, and Cherokee, and of course Modern Hebrew, as well as English, did not begin to see publication until his "Culture and Style" and "Discourse Analysis" papers in 1952. Then in a series of papers beginning with "Co-occurrence and Transformations in Linguistic Structure" (1957) he put formal syntax on an entirely new, generative basis.

Harris recognized, as Sapir and Bloomfield had also stated, that semantics is included in grammar, not separate from it; form and information being two sides of the same coin. Grammar, as so far developed, could not yet consist of individual word combinations, but only of word classes. A sequence, or n-tuple, of word classes (plus invariant morphemes, termed "constants") specifies a subset of sentences that are formally alike. He investigated mappings from one such subset to another in the set of sentences. In linear algebra, a transformation is a mapping that preserves linear combinations, and that is the term that Harris introduced into linguistics.
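A mapping between two such subsets of sentences can be sketched in a few lines of Python. The example is a hedged illustration rather than Harris' formalism: the active-to-passive pair, the tiny lexicon, the participle table, and all names are assumptions chosen for the sketch. A sentence form is written as a tuple of word classes, and the mapping carries the same word choices into a second form that adds the constants "are" and "by."

```python
# Toy lexicon: word -> word class. All entries are illustrative assumptions.
WORD_CLASS = {
    "linguists": "N", "texts": "N", "students": "N",
    "analyze": "V", "read": "V",
}
# Assumed participle forms for the verbs above
PARTICIPLE = {"analyze": "analyzed", "read": "read"}

# A sentence form: an n-tuple of word classes
ACTIVE_FORM = ("N", "V", "N")

def matches(sentence, form):
    """True if the sentence's words belong, in order, to the classes in form."""
    words = sentence.split()
    return len(words) == len(form) and all(
        WORD_CLASS.get(w) == c for w, c in zip(words, form)
    )

def passive(sentence):
    """Map a sentence of form N1 V N2 to the form N2 are V-en by N1."""
    if not matches(sentence, ACTIVE_FORM):
        raise ValueError(f"not of form {ACTIVE_FORM}: {sentence!r}")
    n1, v, n2 = sentence.split()
    return f"{n2} are {PARTICIPLE[v]} by {n1}"

print(passive("linguists analyze texts"))
```

The point of the sketch is that the mapping is defined on the form, not on any single sentence: every sentence satisfying the first word-class sequence has an image in the second, with its particular words carried over unchanged.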

Harris' work on the set of transformations, factoring them into elementary sentence-differences as transitions in a derivational sequence, led to a partition of the set of sentences into two sub-languages: An informationally complete sub-language with neither ambiguity nor paraphrase, versus the set of its more conventional and usable paraphrases (Harris 1969). Morphemes in the latter may be present in reduced form, even reduced to zero; their fully explicit forms are recoverable by undoing deformations and reductions of phonemic shape that he termed "extended morphophonemics." Thence, in parallel with the generalization of linear algebra to operator theory, came Operator Grammar. Here at last was a grammar of the entry of individual words into the construction of a sentence. When the entry of an operator word on its argument word or words brings about the string conditions that a reduction requires, the reduction may be carried out; most reductions are optional. Operator Grammar resembles predicate calculus, and has affinities with Categorial Grammar, but these are findings after the fact which did not guide its development or the research that led to it.
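The entry of operator words on their arguments, followed by an optional reduction, can be sketched as a small Python example. This is a loose illustration under stated assumptions, not an implementation of Operator Grammar: the Node class, the rule that an operator is spoken after its first argument, the single reduction shown (zeroing a repeated operator under "and"), and the example sentence are all chosen for the sketch.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    word: str                                 # "" stands for a word reduced to zero
    args: list = field(default_factory=list)  # argument nodes; empty for argument-less words

def linearize(node):
    """List the words of a tree, placing each operator after its first argument (an assumption)."""
    if not node.args:
        return [node.word]
    first, *rest = node.args
    words = linearize(first) + [node.word]
    for arg in rest:
        words += linearize(arg)
    return words

def spell(node):
    """Join the words into a sentence, skipping zeroed words."""
    return " ".join(w for w in linearize(node) if w)

def zero_repeated_operator(node):
    """Optional toy reduction: under 'and', zero the second operator if it repeats the first."""
    if node.word == "and" and len(node.args) == 2:
        left, right = node.args
        if left.args and right.args and left.word == right.word:
            return Node("and", [left, Node("", right.args)])
    return node

def clause(subject, verb, obj):
    """Enter a two-argument operator (the verb) on its two argument words."""
    return Node(verb, [Node(subject), Node(obj)])

full = Node("and", [clause("John", "plays", "piano"),
                    clause("Mary", "plays", "violin")])
print(spell(full))
print(spell(zero_repeated_operator(full)))
```

The unreduced tree is spelled out as "John plays piano and Mary plays violin," and the reduced one as "John plays piano and Mary violin." Because the zeroed operator keeps its place in the tree, the fully explicit form remains recoverable from the reduced sentence.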

Since Harris was the teacher of Noam Chomsky, who began studying with him as an undergraduate in 1946, some linguists have questioned whether Chomsky's transformational grammar is as revolutionary as it has usually been considered. However, the two scholars developed their concepts of transformation on different premises. Chomsky early on adapted Post production systems as a formalism for generating language-like symbol systems, and used this for the presentation of immediate-constituent analysis. From this he developed phrase structure grammar and then extended it for the presentation of Harris' transformations, restated as operations mapping one phrase-structure tree to another. This led later to his redefinition of transformations as operations mapping an abstract "deep structure" into a "surface structure."

Sub-language analysis

In his work on sub-language analysis, Harris showed how the sub-language for a restricted domain can have a pre-existent external metalanguage, expressed in sentences in the language but outside of the sub-language, something that is not available to language as a whole. In the language as a whole, restrictions on operator-argument combinability can only be specified in terms of relative acceptability, and it is difficult to rule out any satisfier of an attested sentence-form as nonsense, but in technical domains, especially in sub-languages of science, metalanguage definitions of terms and relations restrict word combinability, and the correlation of form with meaning becomes quite sharp. It is perhaps of interest that the test and exemplification of this in The Form of Information in Science (1989) vindicates in some degree the Sapir-Whorf hypothesis. It also expresses Harris' lifelong interest in the further evolution or refinement of language in the context of problems of social amelioration and in possible future developments of language beyond its present capacities.

Later career

Harris' linguistic work culminated in the companion books A Grammar of English on Mathematical Principles (1982) and A Theory of Language and Information (1991). Mathematical information theory concerns only quantity of information; here for the first time was a theory of information content. In the latter work, also, Harris ventured to propose at last what might be the "truth of the matter" in the nature of language, what is required to learn it, its origin, and its possible future development. His discoveries vindicated Sapir's recognition, long disregarded, that language is predominantly a social artifact.

Harris applied discourse analysis to the languages of science. For example, he and his coworkers studied the sub-language of immunology. They argued that a change had occurred within a few years in the structure of the medical language as found in numerous immunological publications. They claimed that this change reflected the advancement of knowledge gained in this period. In 1989, he published a 590-page book on that topic.


Harris' enduring stature derives from the remarkable unity of purpose which characterizes his work. His rigor and originality, as well as the richness of his scientific understanding, allowed him to take linguistics to ever new stages of generality, often ahead of his time. He was always interested in the social usefulness of his work, and applications of it abound, ranging from medical informatics, to translation systems, to speech recognition, to the automatic generation of text from data as heard, for example, on automated weather radio broadcasts. Numerous computer applications, such as the Medical Language Processor and the Proteus Project, can trace their roots to Harris' work.

Many workers have continued to extend the lines of research that he opened. Other students of Harris, besides Noam Chomsky, include Joseph Applegate, Ernest Bender, William Evan, and Maurice Gross.


  • Harris, Zellig S. 1936. A Grammar of the Phoenician Language. Doctoral dissertation. Eisenbrauns. ISBN 0940490080
  • Harris, Zellig S. 1939. Development of the Canaanite Dialects: An Investigation in Linguistic History. Periodicals Service Co. ISBN 0527026905
  • Harris, Zellig S. 1951. Methods in Structural Linguistics. Chicago: University of Chicago Press.
  • Harris, Zellig S. 1962. String Analysis of Sentence Structure. Mouton.
  • Harris, Zellig S. 1968. Mathematical Structures of Language. Krieger Pub Co. ISBN 0882759582
  • Harris, Zellig S. 1969. The Two Systems of Grammar: Report and Paraphrase. University of Pennsylvania.
  • Harris, Zellig S. 1970. Papers in Structural and Transformational Linguistics. Dordrecht: Reidel.
  • Harris, Zellig S. 1976. Notes du cours de syntaxe. Paris: Éditions du Seuil.
  • Harris, Zellig S. [1981] 2001. Papers on Syntax. Springer. ISBN 9027712662
  • Harris, Zellig S. 1982. A Grammar of English on Mathematical Principles. John Wiley & Sons Inc. ISBN 0471029580
  • Harris, Zellig S. 1988. Language and Information. Columbia University Press. ISBN 0231066627
  • Harris, Zellig S. [1989] 2001. The Form of Information in Science: Analysis of an immunology sublanguage. Springer. ISBN 9027725160
  • Harris, Zellig S. 1991. A Theory of Language and Information: A Mathematical Approach. Oxford University Press. ISBN 0198242247
  • Harris, Zellig S. 1997. The Transformation of Capitalist Society. Rowman & Littlefield Publishers. ISBN 0847684121


  • Koerner, E. F. Konrad. 1993. "Zellig Sabbettai Harris: A Comprehensive Bibliography of his Writings 1932-1991" in Historiographia Linguistica XX. 509-522.
  • Murray, Stephen O. 1994. Theory Groups and the Study of Language in North America. Philadelphia: John Benjamins.
  • Nevin, Bruce E. 1993. "A Minimalist Program for Linguistics: The Work of Zellig Harris on Meaning and Information" in Historiographia Linguistica XX, 2/3, 355-398.
  • Nevin, Bruce E. 2002. The Legacy of Zellig Harris: Language and Information into the 21st Century (Volume 1). John Benjamins Publishing Co. ISBN 1588112462
  • Watt, W.C. 2005. Zellig Sabbatai Harris: A Biographical Memoir. The National Academy Press. Retrieved on March 5, 2007.

