Table of contents
1. Introduction
2. Main Part
2.1. Phraseology, Frequency and Typicality as Issues of Corpus Linguistics
2.1.1. Phraseology
2.1.2. Frequency and Typicality
2.2. Phraseology, Frequency and Typicality in Applied Linguistics and Language Teaching
2.3. Status Quo of Phraseology and Frequency in Contemporary Teaching Materials
2.4. Concept of Analysis at Hand
2.5. Day, Money and Way in the BNC Spoken and in the GEFL TC
2.5.1. Day
2.5.2. Money
2.5.3. Way
2.5.4. Summary of the Findings
2.6. Day, Money and Way in the English G 2000 Series
2.6.1. Day
2.6.2. Money
2.6.3. Way
2.6.4. Summary of the Findings
2.7. Day, Money and Way in Advanced Learners' Dictionaries
2.8. Conclusion
2.9. DDL-Exercises
2.9.1. The Benefits of DDL-Exercises for TEFL
2.9.2. Direct and Indirect Approach
2.9.3. Exercises Concerning Day, Money, Way
2.9.3.1. Day
2.9.3.2. Money
2.9.3.3. Way
3. Conclusion
4. Bibliography
5. Appendix
5.1. Sample Concordances from the BNC Spoken Corpus
5.2. Analysis Results of the BNC Spoken Sample Concordances
5.3. GEFL TC Concordances
5.4. Analysis Results of the GEFL TC Concordances
5.5. Table: Comparison of Corpora Findings
5.6. DDL-Exercises
5.6.1. Exercise 1: A Day
5.6.2. Exercise 2: One Day
5.6.3. Exercise 3: Money Collocating with Adjectives and Quantifiers
5.6.4. Exercise 4: Money and Verbs
5.6.5. Exercise 5: Way - Deduce the Meaning of the Nonsense Word
5.6.6. Exercise 6: Way - Noun and Adverb / Adverbial
1. Introduction
The use of corpora has expanded linguistic research possibilities, revolutionized theoretical concepts of language by applying empirical methods, and changed the face of applied linguistics. Although corpora are a tremendous asset to lexicography, translation studies, cultural studies and even to forensics, awareness of their benefits has not yet fully arrived in the field of language teaching. While insights from corpus linguistics have led to the development of a new and improved generation of dictionaries, most materials for teaching English as a foreign language (TEFL) have been largely unaffected in this respect, and are regrettably still based on old conventions and on the intuition of course book designers.
This unfortunate fact is the starting point for the term paper at hand, which investigates the high-frequency nouns day, money and way in two corpora in order to compare their authentic use by native speakers with their presentation in German teaching materials. Its focus is on phraseology and frequency, in order to find out whether these nouns are adequately represented or whether the teaching materials need to be amended.
The spoken part of the British National Corpus (BNC spoken) serves as the basis of this analysis, because German language teaching policy favours a communicative approach aiming at the development of communicative competence and fluency in spoken English. The BNC findings are juxtaposed with the results of an analysis of the German English as a Foreign Language Textbook Corpus (GEFL TC), a corpus comprising two school book series. Additionally, the nouns' introduction and presentation in the German English G 2000 textbook series are explored. Finally, as one approach to investigating the teaching materials for advanced learners, the respective entries are checked in three dictionaries aimed at this target group, namely the Longman Dictionary of Contemporary English, the Macmillan Dictionary for Advanced Learners and the Oxford Advanced Learner's Dictionary of Current English.
The paper will first provide the theoretical background for the analysis, and will explain its concept and methods. Subsequently, it will focus on the results of the analysis and will propose improvements to textbook design. Finally, it presents Data Driven Learning (DDL) as a corpus-based complement to course books and devises six exercises, based on the corpus findings from the BNC spoken, to exemplify it.
2. Main Part
2.1. Phraseology, Frequency and Typicality as Issues of Corpus Linguistics
Computer-driven investigations of large text compilations known as corpora enable linguists to handle large amounts of language material with ease and to apply reliable empirical methods to their research. Furthermore, they make it possible to view language from different angles, which changes how it is perceived and leads to a better understanding and even to new notions and concepts of language. Three of these angles are phraseology, frequency and typicality.
2.1.1. Phraseology
One sort of corpus-access software, so-called concordancing programs, can be used to observe the phraseology of words by investigating their collocates (cf. Hunston, 2002: 9, 13). Phraseology can be defined as “the tendency of words to occur in a preferred sequence in naturally-occurring language data” (Groom, 2006: 25; Hunston, 2002: 138). It comprises all aspects of preferred sequencing as well as the occurrence of fixed phrases (Hunston, 2002: 138). Collocates are words or phrases that are frequently used with other words or phrases in a way that sounds correct to native speakers of a language (Koprowski, 2005: 332), while the term collocation denotes the “statistical tendency of words to co-occur” (Hunston, 2002: 12).
Phraseology plays a “central role in the corpus linguistic study of language” (Mahlberg, 2006: 378) and has a major impact on the development of language theories, because it is crucial in the determination of linguistic meaning (cf. Groom, 2006: 25). As Sinclair explains, the sense of words is connected to a particular usage, either syntactic patterns, a close association of words or a grouping of words in a set phrase. Therefore, the meaning of words cannot be detected in isolation, but is determined by the respective environment (1995b: xvii). Meaning and phraseology are intertwined, and distinguishing between meanings is distinguishing between patterns of usage (cf. Hunston, 2002: 47). This holds particularly true for polysemous words, because their diverse meanings “do not reside in the words themselves, but in the sequences in which they participate” (Groom, 2006: 26). Each sense will tend to be associated most frequently with a different set of phraseological patterns (Hunston, 2002: 139).
Phraseology is also important for determining semantic prosody, a phenomenon that refers to words that are typically used in a particular environment, from which these words adopt connotations in addition to their usual meaning (Hunston, 2002: 141 et seq.). Because semantic prosody is a “subtle element of attitudinal, often pragmatic meaning”, it is hard to grasp, as it draws on a variety of factors in the context of the respective lexical item (Mahlberg, 2006: 387 with reference to Sinclair), which is part of phraseology.
The importance of phraseology for the semantic content of words also led to a revision of theoretical language concepts. Sinclair’s idiom principle approach regards phraseology as the core of language description and challenges current views about language by claiming that there is no distinction between phraseological patterns and meaning, as well as no distinction between lexis and grammar (cf. Hunston, 2002: 138 with reference to Sinclair). He furthermore argues that there are two principles that organize language, the idiom principle and the open choice principle. According to the idiom principle, each word is used in a common phraseology and its meaning is connected to whole phrases rather than to individual parts of them. Recipients understand phrases as a whole unit, rather than as a grammatical template with lexical items, and phraseologies are decoded and encoded as “single entities rather than a string of individual words” (cf. Hunston, 2002: 143 et seq.). The open choice principle implies that meaning is generated by the application of grammatical rules to single words (Hunston, 2002: 145). Language is said to be primarily interpreted in the light of the idiom principle, and recourse to the open choice principle is only taken if this primary interpretation fails (cf. Hunston, 2002: 143).
In line with Sinclair’s view of a unity of lexis and grammar, there is also an approach to a “pattern grammar”, which regards phraseology as the foundation of grammar and tries to create a grammar description on the basis of patterns of frequently occurring phrases (cf. Hunston, 2002: 104 et seq. with further reference; cf. Mahlberg, 2006: 378 et seq.). The term pattern is defined as “a phraseology frequently associated with (a sense of) a word, particularly in terms of prepositions, groups and clauses that follow the word” (Hunston & Francis, 2000: 3, cited according to Mahlberg, 2006: 378). Patterns distinguish different meanings of words, but also connect diverse words in the sense that given patterns often share aspects of meaning (Hunston, 2002: 105; Mahlberg, 2006: 379).
2.1.2. Frequency and Typicality
Corpus-access software can also analyse words in a corpus in terms of their frequency and arrange them in the form of frequency lists (cf. Hunston, 2002: 3; Sinclair, 1991b: 30 et seq.).
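To illustrate what such a frequency list looks like, the following minimal Python sketch counts word forms in a short text and orders them by frequency, then alphabetically. It is not the software used for this paper; the tokenisation rule, the function name frequency_list and the sample sentence are assumptions made purely for illustration.

```python
import re
from collections import Counter


def frequency_list(text):
    """Return (word, count) pairs, most frequent first, ties in alphabetical order."""
    # Simplifying assumption: a token is any run of letters or apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Invented sample text, not taken from the BNC or the GEFL TC.
sample = "One day the money ran out. The next day we found a way to earn money."

for word, count in frequency_list(sample)[:5]:
    print(f"{word:<8}{count}")
```

Real corpus tools additionally handle lemmatisation, multi-word units and much larger inputs, none of which this sketch attempts.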
First of all, frequency is helpful because it hints at the typical usage of words, whereas typicality denotes the most frequent meanings, collocates or phraseology of words (Hunston, 2002: 42). Typicality data is valuable because speakers often have intuitions about typicality, but these intuitions are not always in accordance with corpus evidence of frequency. They rather concern prototypicality, the usage which is commonly felt to be typical (cf. Hunston, 2002: 43 with reference to Barlow and Shortall).
Additionally, frequency is an important factor in combination with phraseology, because the most frequent words carry the main patterns in a language (Mahlberg, 2006: 389 with further reference) and have a “subtle range of meanings” (Sinclair, 1995b: xviii). Phraseology is prevalent in very frequent words, because they are often used in fairly fixed phrases (Hunston, 2002: 102 with further reference).
Furthermore, if it is assumed that the meaning of words can be viewed as their use, frequency can also be seen as part of the meaning (Mahlberg, 2006: 389).
2.2. Phraseology, Frequency and Typicality in Applied Linguistics and Language Teaching
Corpus insights concerning phraseology and frequency have had an immense influence on applied linguistics and revolutionized the design of reference materials like dictionaries and grammar books.
Dictionaries nowadays tend to define phrases rather than single words, include definition sentences to illustrate phraseology, and sometimes even introduce collocational information into definitions (cf. Hunston, 2002: 102). They also include frequency information, either explicitly in the respective entries or implicitly as a criterion for the organization of entries (cf. Hunston, 2002: 97). Examples of such corpus-based dictionaries are the Collins Cobuild English Language Dictionary (Sinclair, 1995a) and the PONS Großwörterbuch Englisch – Deutsch, Deutsch – Englisch (Cop & Agbaria et al., 2002).
Some grammar books include lexical information as one major part of the grammatical description, explicitly state lists of grammatical patterns sorted according to words of different word classes, and use grammar codings based on these patterns. Additionally, they are also based on frequency information (cf. Hunston, 2002: 104 et seq.). Representatives of such grammar books are the Collins Cobuild English Grammar (Sinclair et al., 1991a) and the Cobuild Grammar Pattern series (cf. Hunston, 2002: 104 with further reference).
Aside from their implications and effects on the design of reference material, phraseology, frequency and typicality are important issues in the realm of actual language teaching, too.
The study of phraseology is of great pedagogic value for the teaching of languages (cf. Koprowski, 2005: 322; Hunston, 2002: 197), because phrases can pose problems for language learners as well as support their understanding of the target language. This holds particularly true for the learning and teaching of English, as phrases are frequently used in this language (Hunston, 2002: 138 with further reference).
First of all, if Sinclair’s idiom principle depicts cognitive reality, learners need to be confronted with phraseology, rather than with a construction kit of single lexical items and rules for their combination.
Secondly, negotiation of ambiguous meanings is a common element of classroom discourse and of interactions between native and non-native speakers (Groom, 2006: 27). Phrases or sequences are problematic in this respect, because they are often connected with associations that can differ strikingly for native speakers and for learners of English as a foreign language (Groom, 2006: 26). On the other hand, exposure to certain sequences can also lead to sets of expectations as to what they typically mean, which are then consciously or unconsciously applied to new instances of this phraseology (Groom, 2006: 26 et seq.). Therefore, the study of phraseology helps learners to understand the meaning of phrases, raises their consciousness of the target language, and enables them to conform to native-speaker phraseological norms and expectations, which supports their communicative skills (cf. Römer, 2005: 282; Groom, 2006: 27; Hunston, 2002: 197; Mukherjee, 2004: 246).
Moreover, vocabulary teaching has to take account of semantic prosody, especially in teaching hermeneutics to advanced learners. In this respect, a phraseological approach is the only effective way of accomplishing it (cf. Hunston, 2002: 142).
In sum, TEFL methods have to include phrases and should replace the single word as a teaching unit with phraseology wherever possible (cf. Hunston, 2002: 139; Lamy & Mortensen, 2006: 323 with reference to Willis). On the other hand, one has to take care to include only useful lexical phrases and to exclude superficial or extremely rare ones (Koprowski, 2005: 324, 331), but this is a question of frequency and typicality and will be discussed later on.
Aside from its significance for communicative competence and vocabulary teaching, the study of phraseology also holds advantages for the teaching of grammar, and can enhance the traditional methods. Learners often have problems with abstract grammatical meta-language and rules, which may lead to unnecessary failure (Hadley, 2002: 3.2). By following the pattern grammar approach, surface distinctions can be used to focus on the association between meaning and form, and to characterize grammatical patterns without recourse to abstract terms and labels, which are often connected to delimitation problems and to controversial interpretations (cf. Mahlberg, 2006: 379, 381). Thereby, grammar teaching would become more illustrative by shifting the focus from “a grammar of ‘empty constructions’” to “a grammar of lexical items” (Römer, 2005: 286). Another argument that supports the application of the pattern grammar approach in TEFL is the fact that English is a lexical language, in which many grammatical concepts can be regarded as lexical patterns (cf. Hunston, 2002: 190 with reference to Willis).
Thus, many proponents of a pattern grammar argue for its introduction in language teaching to communicate grammatical insights and to help learners utilize the target language (“Pedagogical Grammar” Hadley, 2002: 3.0; “Corpus-Driven Communicative Didactic Lexical Grammar” Römer, 2005: 285; “Lexical Syllabus” by Willis cf. Hunston, 2002: 190 and Lamy & Mortensen, 2006: 3.2.2.).
Frequency and typicality do not rank behind phraseology in terms of their significance for language learning and teaching. Several authors call for frequency and typicality to constitute the centre of a syllabus, which teaches the most frequent words and items first (cf. Hunston, 2002: 194, 189 with further reference to Willis; Koprowski, 2005: 324; Römer, 2004: 161; Römer, 2005: 281, 287).
One reason given is that the most common units in a language are the ones that learners will most likely encounter outside the classroom and in contact with native speakers (cf. Koprowski, 2005: 324). Secondly, the most frequent words have a variety of usages, so that learners easily acquire flexibility in the language by learning them (cf. Hunston, 2002: 189). Therefore, fluency in using frequent words is of greater value than knowing a lot of words which are hardly ever used (cf. Sinclair, 1995b: xviii). Finally, the main uses of the most frequent words also cover the main grammatical patterns and should be used to explain grammar in class (cf. Hunston, 2002: 189).
In addition, if language teaching is to facilitate learners' communicative competence, it has to communicate knowledge about what is expected or typical in a language. Again, the most typical and central language features should be taught first, before exposing learners to less common and rather marginal features (Römer, 2005: 281). In order to give learners the opportunity to sound more natural or native-like, every item should be presented in its typical context, which encompasses collocation, phraseology and patterns (cf. Römer, 2005: 282 with further reference).
Summing up, if lexical items are arranged, ordered and presented in accordance with frequency, it is assured that the most important aspects of the target language are learnt first (cf. Mindt & Grabowski, 1995: 6), and that learners receive an optimum yield out of their learning efforts (cf. Koprowski, 2005: 323). Due to this, typicality in frequency terms should also be a guideline for choosing examples for teaching (cf. Hunston, 2002: 44). Nevertheless, in order to present an adequately detailed picture of the target language, certain infrequent but important aspects should be considered and included in a syllabus, since infrequent items sometimes have a high cultural value and may carry a lot of information which learners need to know in order to understand them semantically and culturally (cf. Hunston, 2002: 194; Römer, 2005: 286).
2.3. Status Quo of Phraseology and Frequency in Contemporary Teaching Materials
Despite their didactic significance, and in contrast to the attention they receive in reference materials, phraseology, frequency and typicality play only a subordinate role in contemporary teaching materials and in EFL course books. Concerning this matter, it has to be kept in mind that in many teaching contexts there is no distinction between course books and syllabus, making the books the main source for instruction and nearly the only language input for learners (Koprowski, 2005: 323 with further reference). This also applies to German TEFL, which is very textbook-based, especially with respect to the training of beginners and intermediate learners (Römer, 2004: 151, 163). The dependence on written course materials goes so far that the term “Abiturspeak” denotes the fact that advanced learners, who leave school with the German equivalent of British A-levels, are not sufficiently able to use natural spoken English, but rely on a spoken version of the written language (Mukherjee, 2004: 247).
In general, teaching materials have been largely unaffected by the insights of corpus linguistics (Römer, 2005: 277; Römer, 2006: 128; Mukherjee, 2004: 242 et seq.). Research has even attested discrepancies between course books and corpus evidence. For German EFL materials, Mindt stated that “English in German EFL textbooks is at variance with language used by native speakers” (1997: 42), and Römer documented “considerable inadequacies in pedagogical description”, which refer to a simplified illustration of lexical and grammatical items in textbooks, based on significant deviations of collocation and context patterns from natural spoken English (cf. 2004: 275, 282). Moreover, the notion of a pattern grammar has not yet reached EFL course books. The dichotomy of lexis and grammar is still maintained (Römer, 2004: 286).
Even if British mainstream course books nowadays routinely offer a mix of collocations, compounds, binomials, idioms, as well as fixed and semi-fixed phrases, the selection is done without reference to corpus data (especially frequency), and many of the presented items are of limited pedagogic value (cf. Koprowski, 2005: 322).
Unfortunately, frequency and typicality are still not criteria for the presentation of English in course books. The selection processes are often highly subjective and depend on the discretion and intuition of the designers (cf. Sinclair, 1991a: 30; Koprowski, 2005: 322). Instead of structuring a course around a set of typical and therefore useful lexical items, school book developers begin with a theme, topic or structure and then “intuit” items related to these basic concepts (Koprowski, 2005: 330). Thus, teaching materials do not tend to present the typical usage of English, but rather a prototypical one (cf. Hunston, 2002: 44 with reference to Barlow and Shortall; Römer, 2005: 279).
Another instance of disregard for corpus data is the fact that learners are largely confronted with invented sentences that have never occurred in real speech situations (Römer, 2004: 153; 2005: 279; Lamy & Mortensen, 2006: 3.2.1.). Thus, learners are exposed to an artificial, simplified language, which does not mirror the authentic use of English and possibly hinders learners' fluency in speech production (Römer, 2004: 153 et seq.; 2005: 278; 2006: 125 et seq. with further evidence).
2.4. Concept of Analysis at Hand
The didactic significance of phraseology, frequency and typicality in language teaching and the backwardness of German EFL school books in this regard are the starting points for this paper’s investigation. Supporting the recommendation to integrate corpus data into the design of teaching materials, it is based on approaches by Römer and Mahlberg.
The foundation of the examination at hand is a course book analysis by Römer, who developed a pedagogic corpus and examined whether German school books contained the same collocational patterns as natural spoken English (cf. Römer, 2004). She postulates that teaching materials should reflect corpus evidence, teach common language patterns, and adjust the proportions in which items co-occur in textbooks to those in spoken English (Römer, 2004: 153, 161; 2005: 276, 283). In addition, progression and sequencing should be in line with frequency information (Römer, 2005: 288). Moreover, pedagogic language should be adjusted to actual language use, and textbooks should present authentic samples of genuine English as examples (Römer, 2005: 276 et seq., 281, 290).
Mahlberg suggests complementing the comprehensive overview provided by the pattern grammar approach with information on the main patterns of the most frequent members of a word class, and particularly focuses on high-frequency nouns with special regard to collocating verbs (cf. Mahlberg, 2006: 381, 389).
This paper combines these two approaches and examines the phraseology of the three nouns day, money and way, which are among the most frequent nouns in English, and tries to determine whether they are adequately represented in teaching materials. In order to set a wide scope for the investigation, the paper looks for any kind of “multi-word lexical items”, which are defined as “vocabulary consisting of a sequence of two or more words, which semantically or syntactically forms a meaningful or inseparable unit”. These multi-word lexical items encompass collocations, compounds, phrasal verbs, binomials and fixed and semi-fixed phrases (Koprowski, 2005: 322 with further reference). Taking up Mahlberg's idea, special attention is paid to collocating verbs.
Due to the central importance of communicative competence and fluency in German TEFL (cf. Römer, 2005: 280; for German grammar schools: Kerncurriculum, 11; EPA English 5), the phraseology is primarily examined in the spoken part of the British National Corpus (BNC spoken). The findings are then juxtaposed with the results of an analysis of Römer's German English as a Foreign Language Textbook Corpus (GEFL TC), a pedagogic corpus based on two course book series widely used in German secondary schools, namely Green Line New and English G 2000 (cf. Römer, 2004: 155 et seq. for further details). The comparison will reveal whether the described discrepancies and inadequacies also apply to the presentation of the high-frequency nouns.
The investigation is based on concordances created with the aid of the Oxford Wordsmith Language Tool program, version 4.0. They were sorted to the left and to the right, in order to highlight phrases and patterns, and finally printed in the KWIC format (cf. Sinclair, 1991b: 32 et seq. for a description of it). The concordances can be found in the appendix, along with detailed lists of their phraseological content (cf. appendix 5.2.-5.5.).
In these lists, the results were sorted by the word class of the nouns' respective collocates. Frequency and alphabetical order determine the sequence of the entries within each word class column, and sub-entries illustrate special patterns. Verb entries cover ordinary verbs as well as phrasal verbs, and present a lemma (cf. Hunston, 2002: 17) including all inflected word-forms of a particular verb. Some overlap between the findings from the right-sorted and left-sorted concordances is inevitable, since patterns are sometimes part of both contexts of the nouns. A summary of all important collocates and patterns can be found in a table in the appendix (cf. appendix 5.6), which contrasts the results from the BNC spoken and from the GEFL TC.
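The concordancing procedure described above (KWIC lines for a node word, sorted to the left and to the right) can be sketched in a few lines of Python. This is only an illustration of the general technique and not a reproduction of the Wordsmith program; the function names, the three-word context span and the sample text are assumptions, and sentence boundaries are ignored for simplicity.

```python
import re


def kwic(text, node, span=3):
    """Collect (left context, node, right context) triples for every hit of `node`."""
    tokens = re.findall(r"[A-Za-z']+", text)
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == node:
            left = tokens[max(0, i - span):i]
            right = tokens[i + 1:i + 1 + span]
            lines.append((left, token, right))
    return lines


def sort_right(lines):
    """Sort by the words following the node (first word to the right first)."""
    return sorted(lines, key=lambda line: [w.lower() for w in line[2]])


def sort_left(lines):
    """Sort by the words preceding the node (word immediately to the left first)."""
    return sorted(lines, key=lambda line: [w.lower() for w in reversed(line[0])])


def format_kwic(lines, width=20):
    """Align the node word in a central column, as in a KWIC display."""
    return [f"{' '.join(left):>{width}}  {node}  {' '.join(right)}"
            for left, node, right in lines]


# Invented sample text, not taken from either corpus.
sample = ("At the end of the day we had a day off. "
          "The day after we spent a day in London.")

for row in format_kwic(sort_right(kwic(sample, "day"))):
    print(row)
```

Sorting to the right groups lines such as day after, day in and day off, whereas sorting to the left groups the lines by the word that precedes the noun, which is how recurring patterns become visible in a concordance.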
To keep an investigation of high-frequency nouns within the realms of possibility, the queries were limited to two random sets of 100 lines from the BNC spoken for each word. The concordances from the GEFL TC, by contrast, contained all occurrences of the nouns, since there were only 141 occurrences of day, 116 of money and 95 of way. This method has the drawback that the findings are not directly comparable, because samples are compared with total occurrences. Moreover, a normalization of the findings faces the problem of a great difference in size between the compared corpora. Instead, as a concession to feasibility, percentages are calculated in order to create a rough common denominator. They are rounded to two decimal places on the basis of the third digit after the decimal point.
Subsequent to the corpus analysis, the paper turns to the representation of the nouns in the English G 2000 series in more detail and investigates whether the introduction of the respective phraseologies and meanings mirrors the BNC sample findings.
Finally, as one approach to investigating the teaching materials for advanced learners, the entries of the high-frequency nouns are checked in three dictionaries. Dictionaries form the common ground for advanced learners, since these learners no longer use course books, but concentrate on a variety of texts written by native speakers, or use other authentic language sources. The paper takes a look at the Longman Dictionary of Contemporary English, the Macmillan Dictionary for Advanced Learners and the Oxford Advanced Learner's Dictionary of Current English.
On the basis of the outcome of the investigations, the paper will suggest possible improvements, if necessary. In conclusion, it will address the benefits of Data Driven Learning and devise corpus-based exercises for the high-frequency nouns, taking into account the results of the analysis.
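The percentage calculation can be illustrated with a short Python sketch. The totals (141, 116 and 95 GEFL TC occurrences, 100 lines per BNC sample) are taken from the text above; the raw hit counts in the usage lines are hypothetical, and the half-up rounding rule is an assumption about how "rounded according to the third digit" is meant.

```python
from decimal import Decimal, ROUND_HALF_UP

# Totals stated in the text: all GEFL TC occurrences, and 100 lines per BNC sample.
GEFL_TC_TOTALS = {"day": 141, "money": 116, "way": 95}
BNC_SAMPLE_SIZE = 100


def share(count, total):
    """Percentage of concordance lines showing a pattern, rounded to two decimals.

    Assumption: ties in the third decimal are rounded half up.
    """
    percentage = Decimal(100 * count) / Decimal(total)
    return percentage.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Hypothetical raw counts, used only to show the arithmetic.
print(share(13, GEFL_TC_TOTALS["day"]))  # 13 of 141 lines -> 9.22
print(share(1, GEFL_TC_TOTALS["day"]))   # 1 of 141 lines  -> 0.71
print(share(2, BNC_SAMPLE_SIZE))         # 2 of 100 lines  -> 2.00
```

Because every percentage is relative to the size of its own concordance, a single GEFL TC line already weighs 0.71% for day, while a single BNC sample line weighs exactly 1%; this is the rough common denominator referred to above.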
2.5. Day, Money and Way in the BNC Spoken and in the GEFL TC
In general, the following comments refer to a comparison of the concordances from the BNC spoken and the GEFL TC. A summary and juxtaposition of all important collocates and patterns can be found in a table in the appendix (5.6.); complete lists of all particular collocates follow the respective concordances.
The table was initially created on the basis of the GEFL TC findings, since there was only one concordance per noun, and collocations were ordered in accordance with their frequency. Then the findings of the BNC spoken samples were juxtaposed and novel collocations and patterns were integrated, while the distinction between the two respective samples was maintained. Finally, the data was mutually complemented.
Percentages in the table represent the collocates' frequency in each of the examined concordances, treating the BNC samples separately. BNC spoken percentages in the following comments are given as an average of both samples in order to put the individual findings into perspective.
2.5.1. Day
One striking observation about the noun day is that it frequently occurs as part of compounds. In both BNC spoken concordances, 18% of the lines contain a compound, but the samples differ in the rate of compounds that refer to holidays or special days like Christmas Day, New Years' Day holiday, Boxing Day, Nottinghamshire Day or Medieval's Day. On average, 3% of the lines contain such a compound. The GEFL TC features fewer compounds involving day (9.22%), but at the same time contains more compounds referring to holidays (4.26%), like Christmas Day, St. Patrick Day, Guy Fawkes' Day or Independence Day. Therefore, the tendency of day to be part of compounds is underrepresented in relation to the BNC spoken, while its use in the names of holidays is overrepresented. Day off can be found in both BNC samples among the “ordinary” compounds, averaging 2%, but has no equivalent in the GEFL TC. The BNC spoken furthermore offers a variety of nouns sporadically collocating with day, like day centre (1%), day and night (1%), day to day basis / issue (0.5% each), or day by day (0.5%), which do not exist in the GEFL TC.
With respect to prepositions as collocates of day, several discrepancies between the two corpora can be attested, as well as some rough correspondences. On + day is the most frequent pattern in the GEFL TC (9.22%), and it also encompasses the more complex patterns on + possessive pronoun + first day (4.26%) and on + possessive pronoun + last day (1.42%). These combinations cannot be found in the BNC spoken, which only contains the pattern on + day in 1% of the cases, always as the phrase on the day. Day in is also very frequent in the GEFL TC (5.68%), especially in the pattern day in + 'place / location' (4.97%). The collocation also occurs in the BNC spoken, but is infrequent (1%) and exists exclusively as the pattern day in + 'place / location'. Another overrepresented collocation in the GEFL TC is day on (3.55%), occurring most of the time in the pattern day on + 'place / location' (2.84%), for example day on the beach. The BNC spoken features day on in only 1% of the cases and solely as the pattern day on + 'weekday', like day on Sunday. In this respect, the GEFL TC frequently uses a pattern which has no match in the BNC spoken. The same applies to day for, which appears in 2.84% of all concordance lines in the GEFL TC, but in none of the BNC spoken. Additional overrepresented patterns in the school book corpus are day at (2.84% GEFL TC vs. 0.5% BNC spoken) and day before (1.42% GEFL TC vs. 0.5% BNC spoken); the latter also comprises the longer pattern the day before yesterday, which appears exclusively in the GEFL TC (0.71%).
Underrepresented collocations include of the day (2.13% GEFL TC vs. 8.5% BNC spoken) and at the end of the day (0.71% GEFL TC vs. 4% BNC spoken), while time of the day only exists in the BNC spoken (1%). Day of is a little more frequent in the BNC spoken (2.5%) than in the GEFL TC (2.13%), but includes the pattern day of + gerund (1%), which has no counterpart in the pedagogic corpus. A close match between both corpora is the day after with 1.5% of all occurrences in the BNC spoken and 1.42% in the GEFLTC, though the phrase the day after tomorrow only appears in the GEFL TC (0.71%).
The BNC spoken comprises prepositional collocations, which cannot be attested in the GEFLTC, for instance: in+ day (2%) with the phrase in the day (1.5%), during the day (2.5%) with its variations during the course of the day and during the heat of the day (0.5% each), and for a day (1%). On the other hand, the GEFL TC contains from day to day (1.42%) which has no equivalence in the British National Corpus samples.
A further mismatch between the corpora is the collocation of day with the conjunction when, which is rare in the BNC spoken (1.5%), but quite frequent in the GEFL TC (5.68%). In 3.55% of the GEFL TC lines, the collocate is part of an inserted relative clause, a pattern that does not occur in the BNC spoken sample lines.
A comparison of the adjectives collocating with day likewise reveals more discrepancies than accordances. One comparable collocational pattern is one day with 7.5% in the BNC spoken and 7.1% in the GEFL TC. Nevertheless, a closer look at the meanings of the phrase reveals that the agreement only refers to ´a single day´ with 0.5% in the BNC spoken and 0.71% in the GEFL TC. The meaning ´at an undetermined time in the past´ constitutes only 2.5% of the BNC samples, while 6.38% of the GEFL TC concordance lines feature this meaning. The BNC spoken additionally exhibits the meaning ´at an undetermined time in the future´ (3%) which is missing in the GEFL TC. Another pattern appearing in comparable percentages is this day, occurring in 2% of the BNC spoken and in 2.13% of the GEFL TC. However, it is often part of larger patterns like this one day (0.5% BNC spoken), this very day (0.5% BNC spoken), on this day (1.42% GEFL TC) or to this day (0.71% GEFL TC), which only occur in one of the two corpora.
A very strong disproportion can be seen in the usage of the phrase next day. Only 2% of the BNC spoken concordance lines contain this phrase, all as the phrase the next day. In return, 11.35% of the GEFL TC lines show next day, 9.38% as the phrase the next day. That day is likewise overrepresented in the textbook corpus (11.35% GEFL TC vs. 2% BNC spoken), which also applies to its ´on a particular day´ meaning (2.84% GEFL TC vs. 2% BNC spoken). It even encompasses the longer phrase later that day (2.13% GEFL TC) which has no match in the BNC spoken. Furthermore, nice day occurs rarely in the BNC spoken (1%) but very frequently in the textbook corpus (5.67% GEFL TC). In the textbook corpus, the collocation always appears as part of the phrase HAVE a nice day (5.67%). First day also appears disproportionately often in the GEFL TC (4.26% vs. 1.5% BNC spoken) and features other patterns than the BNC, namely on + possessive pronoun + first day (3.55%) and in the first day (0.71%), whereas the BNC exhibits the first day (1.5%). Finally, every day is also overrepresented in the GEFL TC (6.38% vs. 5% BNC spoken).
On the other hand, all day appears disproportionately rarely in the GEFL TC (4.96% vs. 8% BNC spoken) and lacks the patterns all day long (0.5% BNC spoken) and all night and all day (0.5% BNC spoken).
Several adjectival collocations only occur in one of the corpora. The BNC spoken contains the very frequent phrase the other day (8%), which mostly has the meaning ´a few days ago´ (7.5%), and the collocation full day (1%). On the other hand, the GEFL TC presents last day (3.55%), a great day (2.13%), a good day (2.13%), big day (1.42%), perfect day (1.42%) and windy day (1.42%).
As far as day in combination with articles is concerned, a day is highly underrepresented in the GEFL TC (2.13% vs. 12% BNC spoken). A closer look at the different meanings of a day, namely ´per day´ (1.42% GEFL TC vs. 5.5% BNC spoken) and ´one single day´ (0.71% GEFL TC vs. 4.5% BNC spoken), shows the same tendency. The BNC additionally features the pattern times a day (1%), which does not exist in the course book corpus.
Verb collocates of day in the BNC spoken contain the relatively rare patterns HAVE + day (1.5%) and HAVE GOT + day (1%), which both collocate with day off (1%). Contrary to this, the GEFL TC only exhibits the frequent HAVE + day (7.8%), which is always integrated in salutations like HAVE a nice day (5.67%), HAVE a great day (1.42%) and HAVE an enjoyable day (0.71%). The textbook corpus also frequently contains the pattern BE the day (3.55%), like in Is it the day? and …today would be the day…, which does not appear in the BNC. Finally, SPEND + day is nearly equally distributed in both corpora (1% BNC spoken, 0.71% GEFL TC).
Summing up, the collocations of day in both corpora exhibit striking discrepancies and only a few marginal accordances.
2.5.2. Money
Noun collocations of money in the BNC spoken contain Government money (1%) and value for money (1%), which have no counterpart in the GEFL TC. Reciprocally, the schoolbook corpus features pocket money (3.45%) and lunch money (0.86%) without a counterpart in the BNC spoken samples. This circumstance can be regarded as marginal, though, since the findings in the BNC samples are rare and the prevalence of pocket money is accounted for by the pedagogic character of the corpus.
Prepositions collocating with money in the BNC spoken are to, on, for, from, about, in and into. The pattern money to + verb (infinitive) is frequent (4.5%). It also occurs in the GEFL TC, although underrepresented (3.45%). Money on only occurs in the spoken part of the British National Corpus (2.5%), always in combination with verb collocates like SPEND money on (1%), COST money on (0.5%), HAVE money on (0.5%), and PUT money on (0.5%). With 18.1%, money for is drastically overrepresented in the GEFL TC in comparison to a 5% occurrence in the BNC spoken. Further overrepresentations can be attested for money from (6.03% GEFL TC vs. 2% BNC spoken) and for about money (1.72% GEFL TC vs. 0.5% BNC spoken). In return, money in is underrepresented in the GEFL TC (1.72% vs. 5% BNC spoken). Money into again appears only in the BNC spoken (1.5%).
Money frequently collocates with adjectives and quantifier phrases. Comparing the patterns in both corpora, it is obvious that discrepancies prevail, and that the GEFL TC lacks a lot of patterns occurring in the BNC spoken. A lot of money constitutes only 6% of the BNC samples, but 8.62% of the GEFL TC lines; hence it is overrepresented. The BNC also shows examples of variation like an awful lot of money (0.5%) and hell of a lot of money (0.5%), which are infrequent but absent from the other corpus altogether. Reciprocally, the school book corpus exhibits lots of money (1.72%), which has no match in the BNC spoken. Overrepresentations in the GEFL TC also apply to some money (7.76% vs. 5% BNC spoken), no money (5.17% vs. 1.5% BNC spoken) and enough money (4.31% vs. 0.5% BNC spoken). Simultaneously, underrepresentations are exemplified by any money (2.59% vs. 4.5% BNC spoken), amount of money (0.86% vs. 2.5% BNC spoken), that money (1.72% vs. 4.5% BNC spoken) and this money (0.86% vs. 2% BNC spoken).
A rough accordance can be seen in much money (3% BNC spoken vs. 3.45% GEFL TC). Even if all money seems to roughly correspond (1% BNC spoken vs. 1.72% GEFL TC), the actually occurring pattern in the textbook corpus is all the money (1.72%).
Again, the BNC samples contain several collocations which cannot be found in the GEFL TC, for instance more money (3.5%), a bit of money (1%), little money (1%), and the right money (1%). On the other hand, the GEFL TC features extra money in 4.31% and English money in 2.59% of the cases without a counterpart in the BNC spoken.
The adverbial collocation money back is more frequent in the GEFL TC (2.59%) than in the BNC spoken (0.5%). In the BNC samples, it only occurs in the pattern GIVE money BACK (0.5%). The GEFL TC exhibits this pattern as well, but more frequently (1.72%). In addition, it features HAVE money BACK (0.86%) which cannot be spotted in the BNC spoken.
As far as collocations of money with possessive pronouns are concerned, my money is more common in the GEFL TC (3.45%) than in the samples of the spoken part of the BNC (1.5%). In contrast, the distribution of your money is nearly balanced (3.45% GEFL TC vs. 3.5% BNC spoken). The BNC concordances additionally contain her money (1.5%) and their money (1%).
A great variety of verbs collocate with money in the BNC spoken samples, but again, the differences from the GEFL TC are more numerous than the similarities. The only equivalent patterns are GET money (8% BNC vs. 7.76% GEFL TC) and PUT money (1% BNC spoken vs. 0.86% GEFL TC).
HAVE money and HAVE GOT money appear in both corpora, but with a strong overrepresentation in the GEFL TC (10.34% vs. 7% BNC spoken and 12.07% vs. 6% BNC spoken, respectively). Only the BNC features the phrase HAVE money on it (0.5%), while the phrase HAVE money with + personal pronoun (0.86%) exclusively appears in the GEFL TC. GIVE money also occurs more often in the GEFL TC (10.34%) than in the BNC spoken (6%), as does the aforementioned pattern GIVE BACK money (1.72% vs. 1% BNC spoken). Overrepresentations also refer to MAKE money (5.16% GEFL TC vs. 1.5% BNC spoken), NEED money (4.31% GEFL TC vs. 1% BNC spoken) and COLLECT money (3.54% GEFL TC vs. 0.5% BNC spoken). PAY money, by contrast, is underrepresented in the GEFL TC (0.86% vs. 2.5% BNC spoken), money to PAY occurs almost equally often in both corpora (0.86% GEFL TC vs. 0.5% BNC), and the rare patterns PAY BACK money (0.86% GEFL TC) and PAY with money (0.86% GEFL TC) cannot be attested in the BNC concordances at all. The same applies to the phrase PUT TOGETHER money (0.86% GEFL TC).
Several patterns are only featured by the BNC spoken samples, for example TAKE money (2.5%), it takes money (0.5%), INVEST money (2%), WANT money (2%), TRANSFER money (1.5%), USE money (1.5%), BE there money (1%), FIND money (1%), KEEP money (1%), RAISE money (1%) and PUT money on the side (0.5%). Money in combination with GO contains patterns like where the money GO (1%), money GO out (0.5%) and KEEP the money GO (0.5%). The collocation of money with the definite article the exhibits nearly equal distributions (20% BNC vs. 22.41% GEFL TC).
In summary, the attested patterns of money diverge immensely.
2.5.3. Way
By virtue of its character as a polysemous word, way is by far the most complex word of the three high frequency nouns. In order to avoid going beyond the limits of this term paper, the description of its collocational patterns will be restricted to the most important aspects and to the most drastic discrepancies. For a complete juxtaposition, see the table in the appendix (5.6.).
The tendency of way to collocate with nouns in names is existent in both corpora, although it is more frequent in the GEFL TC (2.11%) than in the BNC spoken (0.5%), which contains Stoke Worstead Road Way as the only instance.
As far as prepositions collocating with way are concerned, only discrepancies between the corpora can be attested. On + way is far more frequent in the GEFL TC (27.37%) than in the BNC spoken (6.5%), as are the patterns on the way (18.95% GEFL TC vs. 3.5% BNC spoken) and on + possessive pronoun + way (9.47% GEFL TC vs. 3% BNC spoken). The same applies to way to (17.89% GEFL TC vs. 6.5% BNC spoken) and the appendant patterns way to + noun / noun phrase (12.63% GEFL TC vs. 2% BNC spoken) and way to + verb (infinitive) (5.36% GEFL TC vs. 3.5% BNC spoken). Other overrepresentations in the GEFL TC include by the way in the sense of ´adding something to what you are saying´ (3.16% GEFL TC vs. 2% BNC spoken), way up (2.11% GEFL TC vs. 0.5% BNC spoken), and out of the way (2.11% GEFL TC vs. 1.5% BNC spoken). Furthermore, the verbs collocating with out of the way differ. The BNC features GET + out of the way (1%) and TAKE + out of the way (0.5%), while the GEFL TC exhibits JUMP out of the way (2.11%).
Underrepresentations of prepositional patterns in the GEFL TC include way of (2.11% GEFL TC vs. 12.5% BNC spoken), especially in the pattern way of + verb (gerund) (2.11% GEFL TC vs. 20.5% BNC spoken). In + way also occurs less frequently (2.11% GEFL TC vs. 14.5% BNC spoken) and only contains the patterns in our own way (1.05%) and in this way (1.05%), which have no equivalent in the BNC spoken. In return, the BNC spoken shows the collocations in + way (9%), which include several examples of the larger pattern in a + adjective / adjectival phrase + way, as well as in the way (3.5%) and in some way (1%). None of these collocations can be found in the GEFL TC. Another instance of an underrepresented collocation is way through, which occurs in only 1.05% of all GEFL TC concordance lines. In the BNC samples, it is more frequent (2.5%) and always appears in the pattern the way through. Again, this pattern cannot be found in the GEFL TC.
About the way (1.5%) and under way (1%), collocating with the verbs BE (0.5%) and GET (0.5%), and way of + personal pronoun (0.5%) are other patterns which the GEFL TC lacks.
Collocations of way with adjectives show some similarities in both corpora, but in the majority of cases, differences prevail again. Rough accordances exist in all the way (3.5% BNC spoken vs. 3.16% GEFL TC) and which way (1% BNC spoken vs. 1.05% GEFL TC). This way seems to be another instance (6% BNC spoken vs. 5.26% GEFL TC), but a closer look at the appendant patterns reveals that this only holds true for GO this way (1.5% BNC spoken vs. 1.05% GEFL TC). Aside from GO this way, the BNC spoken also features other verbs that collocate with this way in the pattern verb + this way (2.5%), but the GEFL TC does not. The BNC concordances also exhibit this way with the infrequent meanings ´particular path/route´ (0.5%) and ´by using this method´ (0.5%). By contrast, the GEFL TC contains this way! as an interjection in the sense of ´in this direction´ (2.11%), and features this way with the meaning ´area´ (1.05%), exemplified by the sentence Remember, how last time you were round this way it hadn’t rained […].
One underrepresented collocation in the GEFL TC is that way (5.26% GEFL TC vs. 8.5% BNC spoken), especially as far as the ´path / route´ meaning (2.11% GEFL TC vs. 4% BNC spoken) and the patterns verb + that way (2.11% GEFL TC vs. 4% BNC spoken) are concerned. Others are other way (1.05% GEFL TC vs. 4% BNC spoken) and no way (1.05% GEFL TC vs. 4.5% BNC spoken), especially in reference to the meaning ´under no circumstance´ (1.05% GEFL TC vs. 3% BNC spoken).
Overrepresentations in the textbook corpus are a long way (4.24% vs. 2% BNC spoken) and wrong way (4.24% vs. 1.5% BNC spoken), which contains the pattern the wrong way (3.16% vs. 0.5% BNC spoken). On the other hand, the wrong way round is equally distributed in both corpora (1.05% GEFL TC vs. 1% BNC). Only way is also more frequent in the GEFL TC (2.11% vs. 1% BNC spoken), especially as the phrase the only way (2.11% vs. 0.5%), as is some way (2.11% vs. 1.5% BNC spoken). Patterns which only occur in the BNC spoken are any way (3.5%) and easy way (1%).
Discrepancies also prevail with respect to adverbs collocating with way. Way home belongs to the patterns which are overrepresented in the GEFL TC (7.73% vs. 1.5% BNC spoken), even if the extended phrase on the way home is equally distributed in both corpora (1.05% GEFL TC vs. 1% BNC spoken). Moreover, in 6.23% of its lines the course book corpus features the phrase on the way home from, which cannot be located in the BNC spoken at all. Way back is another instance of overrepresentation, occurring only in 2% of the BNC spoken samples, but in 6.32% of the GEFL TC lines. On the way back is also more numerous in the GEFL TC (3.16% vs. 1% BNC spoken), which additionally presents the patterns on + possessive pronoun + way back (2.11%) and the way back (1.05%) without a match in the BNC samples. In return, the BNC spoken shows the phrase all the way back (1%), which does not occur in the textbook corpus.
Underrepresented collocations in the GEFL TC comprise way round (1.05% vs. 4% BNC spoken) and way around (1.05% vs. 1.5% BNC spoken). The other way round (1.5% BNC spoken) does not appear in the GEFL TC, along with the right way round, all way round, all the way around and the other way around (0.5% BNC spoken each); solely the wrong way round appears in both corpora in comparable percentages (1% BNC spoken vs. 1.05% GEFL TC). The pattern FIND + possessive pronoun + way around is again overrepresented in the GEFL TC (1.05% vs. 0.5% BNC spoken).
Both corpora also contain adverbial collocations without a counterpart in the other. The BNC spoken features way out (1.5%), whereas the GEFL TC exhibits way here in the form of the pattern on + possessive pronoun + way here (2.11%).
Furthermore, the pattern way + personal pronoun + verb, as in […] that’s the way I see it […], exists in each corpus, although to a different extent (7% BNC spoken vs. 5.23% GEFL TC) and with different personal pronouns (see appendix 5.6. for details).
With respect to verbs collocating with way, disproportions of patterns can be attested once more. BE on + possessive pronoun + way (8.42% vs. 3% BNC spoken), FIND + way (6.32% vs. 1.5% BNC spoken), FIND + possessive pronoun + way (3.12% vs. 0.5% BNC spoken) and FIND a way (2.11% vs. 1% BNC spoken) belong to the overrepresented collocations in the GEFL TC. The schoolbook corpus also contains FIND some way (1.05%) which is missing in the BNC concordances. Other cases of overrepresentation are GO + way (5.26% GEFL TC vs. 2% BNC spoken), way to + verb (infinitive) (5.26% GEFL TC vs. 3.5% BNC spoken), and MAKE + possessive pronoun + way (2.11% GEFL TC vs. 0.5% BNC spoken). On the other hand, BE a way of + gerund is less frequent in the textbook corpus (1.05% vs. 3.5% BNC spoken).
Patterns exclusively existing in one corpus are GET out of the way (1.5% BNC spoken), BE on the way (0.5% BNC spoken), BE under way (0.5% BNC spoken), GET under way (0.5% BNC spoken), TAKE the easy way out (0.5% BNC spoken), GET out of my way (3.16% GEFL TC) and TELL the way (2.11% GEFL TC).
Last but not least, the co-occurrence of way with the definite article the is another case of overrepresentation in the GEFL TC (41.04% vs. 32.5% BNC spoken). Just regarding the stand-alone phrase the way and sorting out cases where it collocates with additional elements, two different meanings can be identified in the BNC spoken, namely ´manner´ (6%) and ´path/route´ (2%). These meanings are more common in the GEFL TC (7.37% and 7.37%), which additionally features the meaning ´method´ (1.05%).
Summing up, the collocational patterns of way also diverge immensely between the two corpora.
2.5.4. Summary of the Findings
The analysis has shown that both corpora differ immensely in the existence and distribution of collocational patterns with regard to all of the high frequency nouns. Even if the GEFL TC contains a lot of patterns which occur in the BNC concordances, certain collocations are over- or underrepresented. Furthermore, it features patterns which are nonexistent in the BNC spoken samples, while lacking others which are quite frequent in them. Moreover, the BNC spoken generally presents a greater variety of collocational patterns than the textbook corpus.
If the BNC spoken samples are representative of the whole BNC spoken (which is assumed for the time being), it can be stated that none of the three high frequency nouns is adequately represented in the GEFL TC. Thus, the appendant schoolbooks are at variance with authentic spoken English.
In order to adjust these textbooks to corpus evidence, overrepresentations have to be reduced to make room for a stronger emphasis on underrepresented patterns and for “zero-representations”, i.e. frequent collocations that do not occur in the textbooks.
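The categories used throughout section 2.5 (over-, under- and zero-representation, rough accordance) can be made explicit in a small Python sketch. The function name, the labels and in particular the tolerance of 0.5 percentage points for a rough accordance are assumptions made for illustration; the analysis above does not define a numerical threshold. The sample values are percentages quoted in section 2.5.

```python
def classify(bnc: float, gefl: float, tolerance: float = 0.5) -> str:
    """Label a pattern by comparing textbook and BNC spoken percentages.

    `tolerance` (in percentage points) is an assumed threshold for a
    "rough accordance"; the paper itself does not define one.
    """
    if bnc == 0 and gefl == 0:
        return "unattested"
    if gefl == 0:
        return "zero-representation"
    if bnc == 0:
        return "textbook only"
    if abs(gefl - bnc) <= tolerance:
        return "rough accordance"
    return "overrepresented" if gefl > bnc else "underrepresented"


# (BNC spoken average, GEFL TC) percentages quoted in section 2.5.
patterns = {
    "the other day": (8.0, 0.0),
    "next day": (2.0, 11.35),
    "a day": (12.0, 2.13),
    "the day after": (1.5, 1.42),
    "pocket money": (0.0, 3.45),
}

for pattern, (bnc, gefl) in patterns.items():
    print(f"{pattern}: {classify(bnc, gefl)}")
```

An absolute tolerance treats rare and frequent patterns alike; a ratio-based threshold would be an equally defensible choice and would classify some borderline cases differently.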
2.6. Day, Money and Way in the English G 2000 Series
The following section explores the representation of the high frequency nouns in the German schoolbook series English G 2000. Since it belongs to the books included in the GEFL TC, the previous analysis made clear that the representation of the nouns basically differs from their use in authentic English. Thus, the focus of investigation is now set on the introduction of the lexical items. Taking up Römer´s postulate to adjust sequencing and progression to frequency information (Römer 2005: 288), it is examined whether the introduction of the respective phraseologies and meanings mirrors the BNC sample findings. In addition, in order to reliably refer to frequency information, the entries of the Collins Cobuild English Dictionary were consulted as a second resource for comparisons. All entries in this corpus-based dictionary are organized in accordance with frequency and practical utility (cf. Sinclair, 1995a, xix), principles which can also be regarded as apt for organizing course books.
The investigated series comprises six volumes (D1-D6) designed for pupils from beginner to intermediate level. Each book consists of units with obligatory and optional content, exercises and a vocabulary part. In addition, it features an alphabetical dictionary section which contains the vocabulary of the current volume, as well as that of the preceding books.
2.6.1. Day
The basic meaning of day as ´a period of 24 hours´ is introduced in the third unit of the first volume (Schwarz, 1997: 53, 156). This accords with the Collins Cobuild dictionary, which describes this meaning in the first entry of the noun (Sinclair, 1995a: 358), and agrees with the significance of day as a high frequency noun. The phrase the next day is also introduced in the first volume of the series, and translated with the meaning ´on the following day´ (Schwarz, 1997: 157). This phrase is not mentioned in the Collins Cobuild dictionary, but it is quite frequent in the BNC spoken samples (4%), thus its early introduction seems reasonable. Unfortunately, the phrase does not occur in the unit itself; it only appears in a remark in the vocabulary section and therefore in isolation, without context (Schwarz, 1997: 157).
The third volume (D3) introduces the idiom to save the day (Schwarz, 1998b: 150), which is neither included in the Cobuild entry (which only presents win / lose the day, cf. Sinclair, 1995a: 359) nor attested in the BNC spoken samples. Thus, it seems to be an infrequent collocational pattern, whose relatively early presentation contrasts with the introduction of a day as ´per day´ in the subsequent volume (Schwarz, 1999: 138). Although this collocation is quite frequent according to the BNC spoken samples (5% BNC spoken), it is introduced after the infrequent idiom.
The dictionary entry of the volume D4 also contains the collocation these days with a ´nowadays´ meaning (Schwarz, 1999: 158), which does not appear in the BNC spoken samples and cannot be found in the Cobuild entries, which only feature one of these days (Sinclair, 1995a: 35). The fourth volume also presents the compound day trip (Schwarz, 1999: 158), which does not appear in the BNC spoken samples but has a separate entry in the Collins Cobuild dictionary (Sinclair, 1995a: 359), which makes it worth introducing.
The penultimate volume of the series contains the day before (Schwarz, 2000: 150), which is infrequent in the BNC spoken samples (0.5%) and cannot be attested in the Collins Cobuild entries. Nevertheless, its presentation at such an advanced stage of the course does not seem problematic. D5 also exhibits G´day (Schwarz, 2000: 152) as part of a unit about Australia. Even if this pattern is neither found in the BNC spoken samples nor in the Collins Cobuild dictionary, this does not pose a problem. On the one hand, both references are based on British English and do not contain Australian peculiarities, and on the other hand, G´day is a common Australian salutation whose introduction is apt for a down under unit.
Finally, the last volume (D6) exhibits day care centre (Schwarz, 2002: 126). Related collocations include day centre in the BNC spoken samples (1%) and day care as a separate entry in the Collins Cobuild dictionary (Sinclair, 1995a: 359). Despite the lack of a literal counterpart in those sources, the introduction of day care centre can be regarded as being in accordance with frequency, since the compound occurs at the end of an introductory course.
Summing up the previous observations, except for save the day, these days and a day, the existing vocabulary and the sequencing of its introduction can be said to agree with frequency information. Due to its commonness, a day should be introduced at an earlier stage of the course.
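The sequencing comparison applied in this section can be sketched as a simple pairwise check. The sketch below is an illustration under stated assumptions rather than the procedure actually used: the function name is hypothetical, volumes are encoded as integers (D1 as 1 up to D6 as 6), and only items for which section 2.6.1 reports a BNC spoken percentage or an explicit absence are included.

```python
from itertools import combinations

# (item, volume of introduction, BNC spoken percentage) as reported in
# section 2.6.1; volumes are encoded as integers (D1 -> 1, ..., D6 -> 6).
items = [
    ("the next day", 1, 4.0),
    ("save the day", 3, 0.0),
    ("a day ('per day')", 4, 5.0),
    ("the day before", 5, 0.5),
]


def sequencing_conflicts(entries):
    """Return pairs in which the rarer item is introduced in an earlier volume."""
    conflicts = []
    for (a, vol_a, freq_a), (b, vol_b, freq_b) in combinations(entries, 2):
        if vol_a < vol_b and freq_a < freq_b:
            conflicts.append((a, b))
        elif vol_b < vol_a and freq_b < freq_a:
            conflicts.append((b, a))
    return conflicts


for earlier, later in sequencing_conflicts(items):
    print(f"{earlier!r} precedes the more frequent {later!r}")
```

A strict check of this kind is deliberately crude: besides the late introduction of a day relative to save the day, it also flags the next day (D1, 4%) as preceding the slightly more frequent a day (D4, 5%), a difference that a reader may well consider negligible.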
2.6.2. Money
In accordance with its status as a high frequency noun, money is also introduced in the first volume of the school book series as ´means of payment´ (Schwarz 1997: 68, 176).
The third volume (D3) features to spend money (on) (Schwarz, 1998b: 139), a collocation which is frequently found in the BNC spoken samples (5.5% SPEND money, 1% SPEND money on). Due to this, the choice of this collocation accords with frequency, but it could have been introduced earlier. In addition, to spend money on only occurs in the vocabulary section of the book; the respective unit itself only presents to spend time on (Schwarz, 1998b: 110). D3 also exhibits pocket money, which does not exist in the BNC spoken samples, but can be spotted in the Cobuild dictionary entry (Sinclair, 1995a: 933). Thus, an introduction of the compound in the middle of the course seems unproblematic.
The final volume (D6) presents to get one’s money’s worth as ´to get good value for the spent money´ (Schwarz, 2002: 44, 136). This pattern does not occur in the BNC spoken samples, but it is included in the Collins Cobuild dictionary entry (Sinclair, 1995a: 933), which makes it a reasonable item for the course. But again, it could have been introduced earlier.
In conclusion, the overall choice of the collocational patterns of money is in line with frequency information, but the sequencing could be improved, especially with regard to the fact that an introduction of new collocations is totally neglected in three of the volumes (D2, D4, and D5).
2.6.3. Way
Contrary to expectations that the series would introduce way with the concrete ´path/route´ meaning, volume D1 features the wrong way round with an abstract ´the incorrect one of two possibilities´ meaning (Schwarz, 1997: 92, 176). Although abstract meanings of way are more frequent than concrete ones (cf. Sinclair, 1995a: 1646 et seq.), the presented pattern ranks in the middle of the overall meanings of way (entry 19.3. of a total of 36.14, cf. Sinclair 1995a: 1647 et seq.). In addition, the BNC spoken samples also provide evidence for the limited frequency of the wrong way round (0.5%). Thus, the approach to begin with an abstract meaning agrees with frequency information, but the selected item could have been a more common one, for example the ´means / method´ meaning which is the first entry in the Collins Cobuild dictionary (Sinclair, 1995a: 1646).
The concrete ´path /route´ meaning of way is introduced in the second volume along with the collocation (to) tell the way (to) (Schwarz, 1998a: 27, 156). Due to the commonness of this meaning (entry 18. of the Collins Cobuild dictionary, cf. Sinclair, 1995a: 1646) the introduction is reasonable. Although the pattern (to) tell the way (to) can be found neither in the BNC spoken samples nor as an entry in the Cobuild dictionary, the presentation of this phrase can be regarded as justified by the thematic proximity to the ´path /route´ meaning, and by its pragmatic utility for negotiating one’s way. The dictionary entry of D2 exhibits further phraseology in this respect, presenting the whole question Can you tell me the way to…? It also contains on my way (to), a pattern that occurs rather seldom in the BNC spoken samples (1%) and only ranks at entry 32.1. in the Cobuild dictionary (Sinclair, 1995a: 1648). The dictionary part furthermore exhibits the wrong way, translated as ´in the incorrect direction´ (Schwarz, 1998a: 188). This collocation can be attested in the BNC spoken samples (1.5%), and ´direction´ is a frequent meaning which can be found in entry 13. of the corpus-based dictionary (Sinclair, 1995a: 1646). Summing up, volume D2 contains frequent collocational patterns as well as infrequent ones, whose occurrence can be accepted.
Despite this accordance with frequency data, the D2 dictionary entry of way orders the meanings in the reverse order: it presents the concrete ´path /route´ meaning prior to the ´direction´ meaning (Schwarz, 1998a: 188).
The third volume (D3) presents this way as part of the pattern verb + this way (Schwarz, 1998b: 10, 136). This pattern frequently occurs in the BNC spoken samples (4.5%) and its ´means/method´ meaning is the most frequent one (first entry of way, cf. Sinclair, 1995a: 1646). The vocabulary section also juxtaposes it to this way with a ´direction´ meaning, which is less frequent (0.5% BNC spoken; 13th entry in the Cobuild dictionary, cf. Sinclair, 1995a: 1646). In addition, D3 exhibits to go a long way (Schwarz, 1998b: 50, 149), which does not exist in the BNC spoken samples, but is mentioned in entry 22.4 of the Collins Cobuild dictionary (cf. Sinclair, 1995a: 1647), which suggests a frequent pattern. With different ways of reading your text and in different ways, two further collocational variations of the ´means / method´ meaning are presented in D3 (Schwarz, 1998b: 70, 157), neither of which appears in the BNC spoken samples. Nonetheless, both belong to the most frequent meaning of the noun, and the pattern different way of + gerund is featured in the first example sentence of the Cobuild entry (Sinclair, 1995a: 1646). Due to this, the introduction of these patterns is justified.
Again, the dictionary part of the volume neglects frequency data, as it mentions the ´means / method´ meaning as the third semantic possibility of way (Schwarz, 1998b: 184). Thus, its dictionary entry turns frequency information upside down, beginning with the least frequent ´path / route´ meaning, continuing with ´direction´ as a meaning of average frequency, and concluding with the most frequent one, ´means / method´.
Volume D4 introduces the set phrase by the way as ´introduction to add something to what one is saying´ (Schwarz, 1999: 24, 135). This collocation can be attested in the BNC spoken samples (1.5%) and it is listed in entry 25. of the corpus-based dictionary (Sinclair, 1995a: 1647). It is not a very frequent collocation, but since it contains a new facet of meaning, its introduction at an intermediate stage of the course seems adequate. Additionally, its rank as the fourth possible meaning of way in the dictionary section of D4 (Schwarz, 1999: 176 et seq.) conforms to its frequency.
Further collocations featured in the fourth volume of the textbooks are to work your way back (Schwarz, 1999: 63, 145) and the way you say it (ibidem, 79, 151). The first one does not occur in the BNC samples, but work your way is listed as a quite frequent expression in the Cobuild dictionary, which is used to “suggest an idea of movement, progress or force as well as the action described by the verb” (Sinclair, 1995a: 1646, entry 11.). The second collocation has no direct match in the BNC spoken samples either, but it is part of the attested common pattern way + personal pronoun + verb (7% BNC spoken). In addition, it is another realization of the ´means/method´ meaning (cf. Sinclair, 1995a: 1646). Ergo, the introduction of both collocations is reasonable in terms of frequency, even if the pattern way + personal pronoun + verb could be introduced explicitly and earlier.
The penultimate volume of the textbook series presents in your own way within the dictionary section (Schwarz, 2000: 186). The BNC spoken samples do not contain this collocation, but the second entry in the Collins Cobuild dictionary refers to your way of doing something as “the manner or method of doing it which you use or think is suitable or correct” (Sinclair, 1995a: 1646). Therefore, even if the frequency of the actual collocation preposition + possessive pronoun + own way remains doubtful, an introduction of a pattern including the very common collocation your way with the abovementioned meaning is reasonable. Given the prominence of your way of doing something in the corpus-based Collins Cobuild dictionary (cf. second entry; Sinclair, 1995a: 1646), it should be introduced at an earlier stage of the course.
The last volume, D6, finally features no way! as a new lexical item (Schwarz, 2002: 35, 133). No way, as an “emphatic way of saying no”, is listed in entry 31.2. of the Cobuild dictionary (Sinclair, 1995a: 1648), and it also appears quite frequently in the BNC spoken samples (4.5%). Its introduction therefore agrees in principle with frequency data, and its late mention in the course corresponds to the rather subordinate importance ascribed to this phrase by the Cobuild dictionary. Surprisingly, the dictionary section of D6 does not incorporate this phrase into the entry of way (Schwarz, 2002: 168).
In conclusion, the series covers a range of frequent and infrequent phraseologies of way. Although it is to be commended that the textbooks begin by introducing way with an abstract meaning instead of the concrete ´path /route´ meaning, a different introductory sequence would agree better with frequency data. It should start with ´method´, then feature ´direction´ and ´path / route´, and finally present by the way. Furthermore, the entries in the dictionary part should be adjusted to this order and no way should be added as the last entry. On the other hand, an early introduction of the concrete meaning of way is also justifiable as a concession to pedagogic and didactic considerations, which favour concrete meanings over abstract ones on grounds drawn from cognitive psychology (also cf. Hunston, 2002: 44 for the pedagogical necessity of the psychologically prototypical).
2.6.4. Summary of the Findings
In respect of introduction and progression in accordance with frequency data, the three high frequency nouns strikingly differ in the English G 2000 textbooks.
While day essentially agrees with frequency information, money accords with it only as far as the overall choice of lexical items is concerned. Its sequencing is in need of improvement, since three volumes that do not deal with new money collocations leave enough space for the introduction of further combinations and patterns. Possible candidates are verb collocations like have money to burn (cf. Sinclair, 1995a: 933) or TRANSFER money (1.5% BNC spoken), or adjective and quantifier collocations (cf. BNC spoken results). Common compounds like hush money and blood money (cf. Sinclair, 1995a: 933) could also be considered, but the suitability of their concepts in a pedagogic context may be questioned, as may their utility in a course in English for beginners. As far as way is concerned, a complete reversal of the sequencing is strongly recommended, as already stated.
In conclusion, Römer´s call for an adjustment of sequencing and progression to frequency data can be confirmed for the analysed high frequency nouns in the English G 2000 textbook series.
2.7. Day, Money and Way in Advanced Learners´ Dictionaries
After investigating textbooks for beginners and intermediate learners, this paper takes a closer look at dictionaries for advanced learners. Since advanced students seldom use standardized course materials created by designers who share their mother tongue, but work on a variety of texts and on other sources produced by native speakers of English, phraseology and authenticity of language play an even more important role in advanced language teaching. Thus, dictionaries for advanced learners, being one of the main tools in higher language learning, face great expectations in this respect.
Three different dictionaries are investigated concerning their description of the three high frequency nouns, namely the Longman Dictionary of Contemporary English (Summers, 2003), the Macmillan English Dictionary for Advanced Learners (Rundell & Fox, 2002) and the Oxford Advanced Learners´ Dictionary of Current English (Hornby, 2003). To avoid getting lost in detail, the following remarks only summarize the results.
First of all, the described corpus revolution in reference materials (cf. Hunston, 2002: 102) has also affected all three advanced learner dictionaries. The Longman DCE draws on the Longman Corpus Network Database, which comprises 300 million words from various text types and spoken English (Bullon, 2003: x; Quirk, 2003: ix), the Macmillan EDAL refers to the World English Corpus with 200 million words from texts and speech (cf. Rundell & Fox, 2002: vi; Hoey, 2002: ix), and the OALD is based on the British National Corpus (cf. Oxford University Press, 2006).
Secondly, each dictionary is devoted to collocation and phraseology. A large percentage of entries in the Longman DCE contains phrases highlighted in bold, and coloured collocation boxes show collocations illustrated by examples directly from or based on the corpus (cf. Bullon, 2003: x). A “language notes” section even explicitly explains multi-lexical items like phrasal verbs, idioms and collocations (Summers, 2003: 974, 976 et seq., 986). The enclosed CD-ROM additionally features a corpus mode, which can display examples as concordance lines, and a phrase bank, which allows looking up collocations (cf. Bullon, 2003: x). The Macmillan EDAL sets up a core vocabulary of English consisting of 7,500 words, whose entries are provided with extra information on collocational behaviour and with a wide range of example sentences (cf. Rundell, 2002: x). The entries of the OALD also contain common collocations with example sentences, and highlight important ones in bold print (cf. Hornby, 2003: study pages B3). The “study pages” section of this dictionary even gives an introduction to the linguistic terms collocation, phrasal verb and idiom (ibidem, B3, B10, B12). Furthermore, the dictionary applies a pattern grammar approach by differentiating between different verb types and labelling them with codes, for example verbs followed by that-clauses, verbs followed by wh-clauses, verbs in combination with an infinitive phrase, verbs with gerund, and verbs plus direct speech. Each entry of a verb shows the different ways in which it can be used, and features the respective pattern codes along with example sentences (cf. Hornby, 2003: B8).
Moreover, all three dictionaries were designed on the basis of frequency data. The Longman DCE marks the 3,000 most frequent words with a red entry, and orders all entries in accordance with frequency information as far as possible, in order to give learners an indication of which meanings should be learned first (cf. Bullon, 2003: x). The dictionary generally distinguishes between spoken and written frequency, and some entries even show this distinction in frequency graphs (ibidem). The words belonging to the core vocabulary of English in the Macmillan EDAL are chosen on the basis of their frequency and importance, and are marked by a “star rating” indicating their frequency (cf. Rundell, 2002: x). Words with three stars are part of the 2,500 most common and basic words, words with two stars are very common, and those with one star are still fairly common lexical items (cf. Rundell & Fox, 2002: xii). The OALD does not explicitly state that its entries are ordered according to frequency information, but since it only presents common collocations, and since its entries for the three examined high frequency nouns resemble those of the other dictionaries, a general recourse to frequency data can be assumed.
Finally, all three advanced learners´ dictionaries exemplify meanings, collocations and phrases with instances of authentic language (cf. Bullon, 2003: x; Rundell, 2002: x; Hornby, 2003: B3, B10, B12). The CD-ROM enclosed with the Longman DCE furthermore surpasses the printed version in content. It holds an additional 80,000 examples, 150,000 extra collocates, and over one million unedited sentences from the Longman Corpus Network Database, which were automatically selected from the corpus (cf. Bullon, 2003: x).
Thus, measured by Römer´s desiderata, the three dictionaries are ideal language learning materials in terms of integration of corpus evidence, presentation of common language patterns, and use of authentic language samples.
2.8. Conclusion
The investigation of the three high frequency nouns painted a heterogeneous picture of their representation in TEFL materials.
The corpus analysis of the GEFL TC revealed immense discrepancies in the existence and distribution of collocational patterns and phraseology, and showed that the nouns are, on the whole, not adequately represented in the Green Line New and English G 2000 textbook series. A closer look at the English G 2000 course books revealed that the introduction and presentation of the nouns need to be adjusted to frequency data to a greater extent, although the degree of deviation from an ideal sequencing depends on the respective lexical item. Only the dictionaries for advanced learners depict all major phraseology of the high frequency nouns and meet all demands for integrating corpus-linguistic insights into the design of teaching materials. In this respect they are outstanding.
Notwithstanding the progress and achievements in the realm of reference books for learners, the design of other teaching materials, especially of textbooks for basic introductory courses, still has to be adjusted in terms of phraseology and frequency data.
2.9. DDL-Exercises
2.9.1. The Benefits of DDL-Exercises for TEFL
The investigation of the three high frequency nouns revealed the need to restructure the examined course books. Nevertheless, a syllabus consists of a variety of further lexical items, and can follow manifold didactic and pedagogic aims. Thus, a multitude of aspects have to be taken into consideration in the design process, and an appropriate balance has to be found. For this reason, until there is enough empirical corpus data to radically revise course books, one has to rely on careful and well-considered enhancements (cf. Römer, 2005: 275 et seq., 282).
But what are the practical consequences of this statement for everyday language teaching? Since teachers do not have the time to scientifically revise the school books they work with, they can only compensate for the drawbacks of this learner input with complementary corpus-based materials and teaching methods.
One such method is “Data Driven Learning” (DDL), developed by Tim Johns. In this approach, students act as “language detectives”: they work as linguistic researchers on authentic corpus examples in the form of concordance lines to discover facts about the target language (cf. Hunston, 2002: 170; Römer, 2006: 124). A similar concept is “Corpus Aided Discovery Learning” by Silvia Bernardini, which shifts away from learning by research towards learning by discovery, and uses corpora as a basis of more autonomous learning activities, which enable students to make “serendipitous” findings in the target language (cf. Hunston, 2002: 171; Römer, 2006: 124).
Regardless of how the methods are labelled, or what particular guidelines are given, the central idea of using corpora in language teaching has several advantages. In what follows, DDL is used as a generic term for all corpus-based teaching approaches, without forgetting that they differ in detail.
First of all, DDL is a departure from unnatural, simplified textbook English and a way to facilitate communicative competence (cf. Hadley, 2002: 4.1., 7.0; Römer, 2005: 280). On the one hand, corpora provide learners with authentic examples of genuine English, which can improve their fluency (Römer, 2005: 278), as well as help them to achieve a greater naturalness in the target language (Römer, 2004: 154 with further reference). On the other hand, they are a good preparation for contact with native speakers, since they confront students with the “messiness” of authentic utterances instead of the kind of well-formed sentences that textbooks consist of but that rarely occur in real life communication (Römer, 2005: 280; also cf. Lamy & Mortensen, 2006: 3.2.1.).
Secondly, the use of corpora can also meet the demands of inductive grammar teaching (cf. Zigésar & Zigésar, 1998: 196). By working on concordance lines, learners can infer typical usages of patterns and formulate grammatical rules by themselves (cf. Römer, 2005: 290). In addition, DDL can facilitate grammatical consciousness raising and enable students to get a feeling for the target language (Hadley, 2002: 4.0). Instead of teaching a particular feature, DDL can be used to draw learners´ attention to it by giving them corpus evidence, and asking them to hypothesize and draw their conclusions (cf. Hunston, 2002: 184 et seq.). Thereby, DDL also teaches methods of analysis and discovery which are beneficial for learners´ further independent studies (cf. Lamy & Mortensen, 2006: 3.2.2.).
Finally, data-driven activities are also advantageous in vocabulary teaching, especially for advanced learners. Corpora provide ample examples to illustrate different meanings of words which depend on the context, and also reveal all sorts of information about the keyword which is seldom addressed in standard textbooks, like common collocations, idiomatic and metaphorical use, style and register (Tribble & Jones, 1997: 40, 47). Thus, DDL helps to increase learners´ ability to deduce meaning from context (cf. Hunston, 2002: 170; Tribble & Jones, 1997: 40). What is more, DDL activities are an excellent way to make learners aware of collocations and phraseology (cf. Hunston, 2002: 186 with reference to Willis & Willis).
Aside from its didactic benefits, DDL also offers general pedagogical advantages. Depending on the degree of guidance and control by the teacher, data-driven exercises can increase intrinsic motivation due to the contact with authentic language and its independent exploration by learners (cf. Hunston, 2002: 170; Tribble & Jones, 1997: 38, 40). Encouraging students to reflect on and critically question the authority of dictionaries and grammars increases learners´ autonomy (Lamy & Mortensen, 2006: 3.1.2.). Since they are encouraged to trust their own sense, DDL can boost their self-confidence. Learners´ confidence in using the target language is also strengthened by the insights they gain into its real structures, and by attaining a greater naturalness due to the exposure to large amounts of authentic language (cf. Römer, 2004: 154; Römer, 2005: 276).
2.9.2. Direct and Indirect Approach
DDL exercises can be integrated into any lesson, if they are tied to a word or phrase which appears in a reading or listening text, or in another classroom activity (Hunston, 2002: 179). Basically, this can be accomplished in two different ways. A “direct approach” or “raw corpus approach” lets students work on corpora in order to conduct individually designed investigations. They can make observations of their own, test statements made in standard reference books, or come up with their own hypotheses and check them (cf. Hunston, 2002: 171 et seq.; Römer, 2006: 124). Although this method offers a maximum of motivation for learners, it often founders on the lack of access to computers and concordancing programs, as well as on the lack of time for individual consultations in ordinary teaching situations (Hadley, 2002: 4.2; Hunston, 2002: 171). Thus, the second possibility, an “indirect approach”, is more feasible within the framework of a public school system. Since mainstream TEFL is dominated by state-run institutions, the following exercises refer to this method in order to make a contribution to the popularization of DDL.
The “indirect approach” confronts learners with raw or filtered concordances in a printed form, which highlight a particular language phenomenon (cf. Römer, 2006: 124; Tribble & Jones, 1997: 38; Hunston, 2002: 170 et seq.). Teachers can prefabricate materials and worksheets, which may be photocopied and distributed to learners (Hunston, 2002: 171; Tribble & Jones, 1997: 37). These materials consist of selected concordance lines showing the respective language phenomenon and of carefully worded questions and tasks, which guide learners towards noticing relevant information (Hunston, 2002: 176; Tribble & Jones, 1997: 38).
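To illustrate what such printed concordance lines look like, the following minimal Python sketch produces a keyword-in-context (KWIC) display for the noun way. It is not the procedure used in this paper: the function name kwic, the window width of 30 characters and the sample sentences are assumptions made purely for illustration, and the sentences are invented rather than taken from the BNC.

```python
import re


def kwic(text, keyword, width=30):
    """Return keyword-in-context lines for each whole-word match of `keyword`."""
    # Collapse whitespace so that context windows contain no line breaks.
    flat = " ".join(text.split())
    # \b ensures that "way" does not match inside "always" or "ways".
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(flat):
        left = flat[max(0, match.start() - width):match.start()]
        right = flat[match.end():match.end() + width]
        # Right-align the left context so that all keywords share one column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}".rstrip())
    return lines


# Invented sample text (hypothetical, not corpus data).
SAMPLE = (
    "Can you tell me the way to the station? "
    "By the way, I did it this way last time. "
    "No way! We are always on our way home by six."
)

if __name__ == "__main__":
    for line in kwic(SAMPLE, "way"):
        print(line)
```

Because the keyword is aligned in one column, learners can scan the words to its left and right and notice recurring patterns such as by the way or this way. A teacher would normally select and filter such lines before printing them on a worksheet.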
[...]