Classification of Romance languages


The internal classification of the Romance languages is a complex and sometimes controversial topic which may not have one single answer. Several classifications have been proposed, based on different criteria.

Variation among languages

In spite of their common origin, the descendants of Vulgar Latin have many differences. These occur at all levels, including the sound systems, the orthography, the nominal, verbal, and adjectival inflections, the auxiliary verbs and the semantics of verbal tenses, the function words, the rules for subordinate clauses, and, especially, the vocabularies. While most of those differences are clearly due to independent development after the breakup of the Roman Empire, one must also consider the influence of prior languages in territories of Latin Europe that fell under Roman rule, and possible heterogeneity in Vulgar Latin itself.
Romanian, together with related languages such as Aromanian, has a number of grammatical features which are unique within Romance but are shared with non-Romance languages of the Balkans, such as Albanian, Bulgarian, Greek, Macedonian, Serbo-Croatian and Turkish. These include, for example, the structure of the vestigial case system and the placement of articles as suffixes of the nouns. This phenomenon, called the Balkan language area, may be due to contacts between those languages in post-Roman times.

Formation of plurals

Some Romance languages form plurals by adding -s, while others form the plural by changing the final vowel, a pattern usually traced to the Latin nominative plural endings of some masculine nouns.
For the comparative "more", some Romance languages use a descendant of Latin plus, others a descendant of magis.
Although the Classical Latin word for "nothing" is nihil, the common word for "nothing" became nulla in Italian, nudda in Sardinian, nada in Spanish, Portuguese, and Galician, rien in French, res in Catalan, cosa and res in Aragonese, ren in Occitan, nimic in Romanian, nagut in Romansh, gnente in Venetian and Piedmontese, gnent and nagott in Lombard, and nue and nuie in Friulian. Some argue that most of these roots derive from different parts of the Latin phrase nullam rem natam, an emphatic idiom for "nothing". Meanwhile, Italian and Venetian niente and gnente would seem to derive more plausibly from Latin ne(c) entem, ne inde or, more likely, ne(c) (g)entem, which also explains the French cognate néant. The Piedmontese negative adverb nen also comes directly from ne(c) (g)entem, while gnente is borrowed from Italian.

The number 16

Romanian constructs the names of the numbers 11–19 by a regular Slavic-influenced pattern that could be translated as "one-over-ten", "two-over-ten", etc. All the other Romance languages use a pattern like "one-ten", "two-ten", etc. for 11–15, and the pattern "ten-and-seven", "ten-and-eight", "ten-and-nine" for 17–19. For 16, however, they split into two groups: some use "six-ten" (for example Italian sedici and French seize), and some use "ten-and-six" (for example Spanish dieciséis).
Classical Latin uses the "one-ten" pattern for 11–17, but then switches to "two-off-twenty" and "one-off-twenty". For the sake of comparison, note that many of the Germanic languages use two special words derived from "one left over" and "two left over" for 11 and 12, then the pattern "three-ten", "four-ten",..., "nine-ten" for 13–19.

To have and to hold

The verbs derived from Latin habēre "to have", tenēre "to hold", and esse "to be" are used differently in the various Romance languages, to express possession, to construct perfect tenses, and to make existential statements. If we use T for tenēre, H for habēre, and E for esse, each language can be assigned a three-letter pattern (possessive predicate, perfect, existential).
For example:
Language    | Possessive predicate | Perfect               | Existential                      | Pattern
English     | I have               | I have done           | There is                         | HHE
Italian     | ho                   | ho fatto              | c'è                              | HHE
Friulian    | o ai                 | o ai fat              | a 'nd è, al è                    | HHE
Venetian    | go                   | go fato               | ghe xe, ghi n'é                  | HHE
Lombard     | a gh-u               | a u fai               | al gh'è, a gh'è                  | HHE
Piedmontese | i l'hai              | i l'hai fàit          | a-i é                            | HHE
Romanian    | am                   | am făcut              | este / e                         | HHE
Neapolitan  | tengo                | aggio fatto           | ce sta                           | TH–
Sardinian   | apo, apu             | apo fattu, apu fattu  | bi at / bi est, nc'at / nc'est   | HHH
Romansh     | hai                  | hai fatg              | igl ha                           | HHH
French      | j'ai                 | j'ai fait             | il y a                           | HHH
Catalan     | tinc                 | he fet                | hi ha                            | THH
Aragonese   | tiengo, he           | he feito              | bi ha                            | THH
Spanish     | tengo                | he hecho              | hay                              | THH
Galician    | teño                 | –                     | hai                              | T–H
Portuguese  | tenho                | tenho feito           |                                  | TTH
Portuguese  | tenho                | tenho feito           | tem                              | TTH

Ancient Galician-Portuguese used to employ the auxiliary H for permanent states, such as Eu hei um nome "I have a name", and T for non-permanent states Eu tenho um livro "I have a book", but this construction is no longer used in modern Galician and Portuguese. Informal Brazilian Portuguese uses the T verb even in the existential sense, e.g. Tem água no copo "There is water in the glass".
Languages that have not grammaticalised *tenēre have kept it with its original sense "hold", e.g. Italian tieni il libro, French tu tiens le livre, Romanian ține cartea, Friulian Tu tu tegnis il libri "You're holding the book". The meaning of "hold" is also retained to some extent in Spanish and Catalan.
Romansh uses, besides igl ha, the form i dat, calqued from German es gibt.

To have or to be

Some languages use their equivalent of 'have' as an auxiliary verb to form the compound forms of all verbs; others use 'be' for some verbs and 'have' for others.
In the latter type, the verbs which use 'be' as an auxiliary are unaccusative verbs, that is, intransitive verbs that often show motion not directly initiated by the subject or changes of state, such as 'fall', 'come', 'become'. All other verbs use 'have'. For example, in French, J'ai vu or Italian ho visto 'I have seen' vs. Je suis tombé, sono caduto 'I have fallen'. Note, however, the difference between French and Italian in the choice of auxiliary for the verb 'be' itself: Fr. J'ai été 'I have been' with 'have', but Italian sono stato with 'be'. In Southern Italian languages the principles governing auxiliaries can be quite complex, including even differences in persons of the subject. A similar distinction exists in the Germanic languages, which share a language area; German and the Scandinavian languages use 'have' and 'be', while modern English now uses 'have' only.
'Be' is also used for reflexive forms of verbs, as in French j'ai lavé 'I washed (something)' but je me suis lavé 'I washed myself', and Italian ho lavato 'I washed (something)' vs. mi sono lavato 'I washed myself'.
Tuscan uses si forms identical to the 3rd person reflexive in a usage interpreted as 'we' subject, triggering 'be' as auxiliary in compound constructions, with the subject pronoun noi 'we' optional. If the verb employed is one that otherwise selects 'have' as auxiliary, the past participle is unmarked: si è lavorato = abbiamo lavorato 'we worked'. If the verb is one that otherwise selects 'be', the past participle is marked plural: si è arrivati = siamo arrivati 'we arrived'.

Classification

Difficulties of classification

The comparative method used by linguists to build language family trees is based on the assumption that the member languages evolved from a single proto-language by a sequence of binary splits, separated by many centuries. With that hypothesis, and the glottochronological assumption that the degree of linguistic change is roughly proportional to elapsed time, the sequence of splits can be deduced by measuring the differences between the members.
However, the history of Romance languages, as we know it, makes the first assumption rather problematic. While the Roman Empire lasted, its educational policies and the natural mobility of its soldiers and administrative officials probably ensured some degree of linguistic homogeneity throughout its territory. Even if there were differences between the Vulgar Latin spoken in different regions, it is doubtful whether there were any sharp boundaries between the various dialects. On the other hand, after the Empire's collapse, the population of Latin speakers was separated—almost instantaneously, by the standards of historical linguistics—into a large number of politically independent states and feudal domains whose populations were largely bound to the land. These units then interacted, merged and split in various ways over the next fifteen centuries, possibly influenced by languages external to the family.
To sum it up, the history of Latin and Romance-speaking peoples can hardly be described by a binary branching pattern; therefore, one may argue that any attempt to fit the Romance languages into a tree structure is inherently flawed. In this regard, the genealogical structure of languages forms a typical linkage.
On the other hand, the tree structure may be meaningfully applied to any subfamilies of Romance whose members did diverge from a common ancestor by binary splits. That may be the case, for example, of the dialects of Spanish and Portuguese spoken in different countries, or the regional variants of spoken standard Italian.
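As a rough illustration of the distance-based reasoning described at the start of this section, the following Python sketch builds a binary tree by repeatedly merging the two closest groups. This is generic average-linkage clustering, swapped in here for illustration; it is not the comparative method itself. The feature data are only the three-letter patterns given for six languages in the "To have and to hold" table, and the function names and the choice of Hamming distance are assumptions made for this example.

```python
from itertools import combinations

# Toy data: (possessive, perfect, existential) patterns taken from the
# "To have and to hold" table. Three features are far too few for a real
# classification; this only illustrates the mechanics.
PATTERNS = {
    "Italian": "HHE",
    "Romanian": "HHE",
    "French": "HHH",
    "Catalan": "THH",
    "Spanish": "THH",
    "Portuguese": "TTH",
}


def hamming(a, b):
    """Number of positions at which two equal-length patterns differ."""
    return sum(x != y for x, y in zip(a, b))


def leaves(tree):
    """Flatten a nested (left, right) tuple tree into a list of language names."""
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)


def cluster_distance(t1, t2):
    """Average pairwise Hamming distance between two groups (average linkage)."""
    pairs = [(a, b) for a in leaves(t1) for b in leaves(t2)]
    return sum(hamming(PATTERNS[a], PATTERNS[b]) for a, b in pairs) / len(pairs)


def build_tree(names):
    """Merge the two closest groups until one binary tree remains."""
    clusters = list(names)
    while len(clusters) > 1:
        # Ties are resolved by iteration order, which is arbitrary.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = (clusters[i], clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0]


tree = build_tree(sorted(PATTERNS))
print(tree)
```

With only three features, several languages are tied (Italian with Romanian, Catalan with Spanish), and the order in which tied pairs are merged depends on iteration order alone. The procedure returns a binary tree regardless of whether the data support one, which is one way to see the objection raised above.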

The standard proposal

Nevertheless, by applying the comparative method, some linguists have concluded that the earliest split in the Romance family tree was between Sardinian and the remaining group, called Continental Romance. Among the many distinguishing features of Sardinian are its articles, derived from Latin ipse rather than ille, and its retention of the "hard" sounds of "c" and "g" before "e" and "i". This view is challenged in part by the existence of definite articles continuing ipse forms in some varieties of Catalan, best known as typical of Balearic dialects.
According to this view, the next split was between Romanian in the east, and the other languages in the west. One of the characteristic features of Romanian is its retention of three of Latin's seven noun cases. The third major split was more evenly divided, between the Italian branch, which comprises many languages spoken in the Italian Peninsula, and the Gallo-Iberian branch.

Another proposal

However, this is not the only view. Another common classification begins by splitting the Romance languages into two main branches, East and West. The East group includes Romanian, the languages of Corsica and Sardinia, and all languages of Italy south of a line through the cities of Rimini and La Spezia. Languages in this group are said to be more conservative, i.e. they retained more features of the original Latin.
The West group then split into a Gallo-Romance group, which became the Oïl languages, Gallo-Italian, Occitan, Franco-Provençal and Romansh, and an Iberian Romance group, which became Spanish and Portuguese.

The wave hypothesis

Linguists like Jean-Pierre Chambon claim that the various regional languages did not evolve in isolation from their neighbours; on the contrary, they see many changes propagating from the more central regions towards the periphery. These authors see the Romance family as a linkage rather than a tree-like family, and insist that the wave model is better suited than the tree model for representing the history of Romance.

Degree of separation from Latin

In a study by linguist Mario Pei, the degrees of phonological modification of vowels of the Romance languages with respect to the ancestral Latin were found to be as follows