
Articles de recherche

Certifying lecturers’ English language skills for teaching in English-medium instruction programs in higher education

La certification des compétences langagières des enseignants pour dispenser des cours en anglais dans l’enseignement supérieur
Slobodanka Dimova
p. 29-47


Disciplinary teaching in English is increasingly being introduced at traditionally non-Anglophone universities as part of university internationalization policies. The growing number of such programs has raised concerns about the quality of teaching, and many universities have therefore begun to implement English-language certification for non-Anglophone lecturers who teach in English. Lecturer certification policies vary across higher education institutions. Some institutions certify lecturers through dedicated training courses, delivered by external providers or in-house, the completion of which leads to a certificate. Locally developed oral and written tests are often based on the CEFR scale because it allows international recognition of test results. Drawing on research and on the experience of an institutional test of academic staff's oral English proficiency, this article examines the advantages and disadvantages of different certification models for lecturers and provides recommendations for designing fair and sustainable certification practices.


Full text

1. Introduction

English medium instruction (EMI) has been increasingly implemented in higher education (HE) as part of higher education institutions' (HEIs') strategies for internationalization. Although the purpose of these internationalization strategies varies across HEIs, their primary focus tends to rest on the recruitment of international students and lecturers and on access to international research published in English. Given the growth of EMI, university policy-makers, administrators, and management have started debating the quality assurance measures that must be implemented to ensure the branding, accreditation, and educational quality of their EMI courses and programs. The ongoing discussions regarding quality assurance measures, which are evident in conferences and symposia organized by large associations for international education, such as the European Association for International Education and the China Education Association for International Exchange, frequently refer to lecturers' ability to teach in EMI, scrutinizing primarily non-native English-speaking lecturers' English proficiency and pedagogical skills.

As part of these quality assurance endeavours, many HEIs have adopted certification methods for EMI lecturers. These certification methods vary greatly across universities, mostly depending on the expertise and the resources at their disposal. The methods also vary due to local needs and stakeholders' understanding of what EMI entails. Despite the increasing implementation of language and pedagogy requirements for teaching in EMI and the establishment of certification procedures, the field lacks rigorous and consistent research regarding the constructs of EMI teaching competence and language proficiency for EMI.

To facilitate decision-making among HEIs regarding the scope and the type of EMI lecturer certification, this article first discusses the research that has identified the need for certification and the different elements of the EMI framework. Then, the various EMI certification methods currently used across HEIs (certification through courses, through international tests of English, and through locally developed tests) are presented. The current conceptualizations of the language construct for EMI and the minimum requirements are also included. Finally, recommendations about how to approach developing language assessment tools for EMI lecturer certification are outlined.

2. The need for EMI lecturer certification

University managements' concerns about the quality of education in EMI have resulted from research reports that highlight the challenges that non-native English-speaking lecturers experience in the EMI classroom. Early research reports on EMI from European university contexts suggest that teaching in English requires additional effort and is time-consuming (Airey 2011; Crawford Camiciottoli 2004; Dafouz & Núñez 2009; Morell 2007; Tange 2010; Vinke 1995; Vinke et al. 1998). Findings indicate that some EMI lecturers lack the competence to deal with linguistically diverse student populations, adapt to new pedagogical approaches, and address the need for intercultural communicative competence (ICC) (Airey 2011; Klaassen 2008; Tange 2010; Vinke 1995; Westbrook & Henriksen 2011).

In terms of language, some studies reveal that EMI lecturers tend to cover less material in EMI lectures because their speech rate in the L2 is slower than in the L1 (Thøgersen & Airey 2011; Vinke 1995; Arkin & Osam 2015). The lecturers' speech rate may be influenced by insufficient overall language proficiency, a restricted vocabulary range, and inadequate style. Research has identified divergent levels of English proficiency across European countries, where southern Europeans tend to have lower academic English proficiency due to various socio-educational and linguistic factors (Campagna & Pulcini 2014; Dafouz & Camacho-Miñano 2016). Research that focuses on vocabulary range suggests that EMI lecturers experience communication difficulties because of their restricted general and academic vocabulary range, rather than their domain-specific terminology (Tange 2010; Dimova & Kling 2018). Regarding style, EMI lecturers tend to use dry and formal language, similar to that of written communication, and to lack nuanced expression and humor (Thøgersen & Airey 2011; Tange 2010; Wilkinson 2005).

More recently, similar research regarding lecturers' challenges with EMI has expanded to include higher education in the Asia-Pacific region and South America. Once again, EMI lecturers' linguistic and teaching practices are questioned because they affect students' content learning (Fenton-Smith et al. 2017; Martinez 2016). Lecturers are found to experience difficulties with providing explanations and answering questions (Vu & Burns 2014).

The research results that point to the challenges experienced by EMI lecturers mainly come from survey and interview-based studies in which native-speaker ideology is ubiquitous (Inbar-Lourie & Donitsa-Schmidt 2020; Pilkinton-Pihko 2013). Although many research reports end with recommendations for EMI lecturer certification and/or training, the implied certification and training norms tend to be aligned with similar programs at British, North American, and Australian universities. However, EMI does not represent a monolithic phenomenon because it is established in various contexts and governed by different policies. In order to understand what EMI lecturer certification should entail, it is important to understand the EMI framework.

3. EMI framework

In order to discuss lecturer certification for EMI, an understanding of what EMI represents is essential. EMI is implemented and interpreted in various ways based on local contextual variables and policies. According to the findings from the Transnational Alignment of English Competences of Academic Staff (TAEC) (TAEC Literature Database Report 2020), EMI can be based on the framework represented in Figure 1.

Figure 1. TAEC EMI framework

According to this framework, the process and the type of EMI implementation depend on the institutional, national, and international policies related to language and instruction, which are embedded in a particular context. For example, in the Nordic region, a non-Anglophone context, a number of policy initiatives promoted parallel language use to balance the use of English and the national language (Gregersen 2014, 2018). These initiatives resulted in the Nordic Declaration on Language Policy (Nordic Council of Ministers 2007, 2015, 2018). This policy was adopted at Danish universities, where disciplinary courses and programs are available in both Danish and English, and both languages can be used during instruction (Dimova & Kling 2015). Although EMI allows for the recruitment of international lecturers and students, which should be taken into consideration during curriculum design, EMI is not implemented to foster pedagogical changes at the university or to improve students' English proficiency levels. Danish universities have established courses in university pedagogy ("universitetspædagogikum"), the purpose of which is the pedagogical training of all lecturers, regardless of which language they use for teaching (Dimova & Kling 2018). Given that a large number of lecturers have high proficiency in English, instead of requiring language training for all lecturers, the university designed an oral English proficiency test, the Test of Oral English Proficiency for Academic Staff (TOEPAS), to screen EMI lecturers, offer formative feedback to all, and recommend further training to those who need it (Kling & Stæhr 2012).

In some other non-Anglophone contexts, EMI policies require improvement of both students' English language proficiency and lecturers' pedagogical approaches (Macaro 2018). Therefore, EMI certification may need to focus on ensuring lecturers' English language proficiency and their ability to integrate language and content in disciplinary courses.

EMI at international branches of British or North American universities is governed by normative language policies and educational cultures from the originating countries despite the branches' non-Anglophone geographical location (Dimova & Kling 2020; Eslami et al. 2020). In such cases, EMI lecturer certification may need to reflect different expectations than those at national universities in countries where English is used as a foreign language.

This variation in the establishment and the role of EMI in the local context should be taken into consideration when planning lecturer certification. In other words, the skills EMI lecturers need depend largely on what EMI entails, which differs across contexts. Pedagogical and linguistic practices formed in one context cannot be directly mapped onto another due to local constraints. For example, while classroom interaction is useful for content learning, it may not be easily viable in large courses in auditoria with over 200 students.

4. Certification methods: courses, commercial tests, and local tests

Based on survey data, O'Dowd (2018) found that a large percentage of HEIs from different countries (Spain, Austria, Italy, Sweden, the Netherlands, Germany, and France) had English language certification requirements for EMI lecturers. Currently, three EMI lecturer certification methods are applied: 1) certification through EMI-related course completion (rather than assessment), 2) certification through existing tests, and 3) certification through locally developed tests.

Many HEIs offer training courses for EMI lecturers, most of which focus on general communication skills, pedagogy, and language proficiency (O'Dowd 2018). Most of these training courses and programs are established and managed locally, while some are commercial and offered by external institutions. Although these courses are primarily established to support EMI lecturers, some are also used for lecturer certification. In other words, lecturers obtain certification through course completion because these training courses typically do not include formal assessment. Courses leading to EMI lecturer certification are especially promoted by commercial institutions that have identified EMI as a new market for services that used to target primarily international students in North America, Britain, or Australia.

A number of HEIs require scores from international commercial tests of English as evidence of lecturers' language ability to teach in EMI. For example, in the early 2000s, some HEIs in the Netherlands required that lecturers take one of the international tests to be certified to teach in EMI (Klaassen 2001; Klaassen & Bos 2010). Similarly, the Conference of Rectors of Universities (CRUE) included a list of commercial tests that can be used as part of the certification requirement in the linguistic policy for the internationalization of the Spanish university system (Halbach & Lázaro 2015; Fortanet-Gómez 2020). A survey of 50 universities in Spain suggested that 38 universities use these commercial tests as recommended in the language policy developed by CRUE (Bazo et al. 2017; Halbach & Lázaro 2015). Most of the required commercial tests were originally designed to measure general English proficiency (e.g., Aptis General, IELTS General) or academic English proficiency (e.g., IELTS Academic, TOEFL). The academic English proficiency tests were primarily designed and validated for use at HEIs in North America, Britain, or Australia, especially in relation to international student admission (Bridgeman, Cho & DiPietro 2016; Coleman, Starfield & Hagan 2003; Ginther & Yan 2018; Wait & Gressel 2009).

Some HEIs have initiated the development of local assessment methods that focus on oral English proficiency or on all language skills, including grammar. These local assessment methods are seldom discussed in research journals or at conferences. Consequently, the range of assessment methods and their constructs remains publicly unavailable.

Only a few assessments used for EMI lecturer certification are mentioned in the literature. The University of the Basque Country developed the Test of Performance for Teaching at University Level through the Medium of English (TOPTULTE), which includes a grammar section and performance-based written and oral sections (Ball & Lindsay 2013). A consortium of four Belgian universities (KU Leuven, University of Antwerp, Ghent University, and Vrije Universiteit Brussel) developed the Interuniversity Test of Academic English (ITACE), which consists of a computer-based part that includes several sections (grammar and vocabulary, reading, and listening) and a performance-based part that includes a writing and an oral section (Verguts & De Moor 2019).

The Copenhagen Business School designed an assessment tool for EMI lecturer certification, PLATE (Project in Language Assessment for Teaching in English), which was based on unannounced classroom observation (Kling & Hjulmand 2008). The rating scale was based on the Common European Framework of Reference (CEFR) and included criteria related to overall oral production, overall spoken interaction, sociolinguistic appropriateness, and overall presentation skills (op. cit.: 194). Similarly, the University of Freiburg developed an EMI lecturer certification procedure based on classroom observation, where lecturers' self-evaluations and evaluations from raters and students are used for certification decisions (Gundermann & Dubow 2018). The assessment measures linguistic and communicative competences. The rating criteria are therefore divided into five linguistic and five communicative criteria, each of which is rated on a 1 to 4 scale (with 1 being the highest). The score for each rating criterion is computed by averaging the rater's score and the mean of the students' scores. The final scores for linguistic competence and communicative competence are computed as averages of the scores for the five criteria each competence comprises.
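The two-step averaging described above can be sketched as follows. This is only an illustrative reading of the published procedure, not the University of Freiburg's actual implementation; the criterion example and all ratings are invented.

```python
# Illustrative sketch of the score aggregation described above.
# The ratings below are invented; this is not the actual
# University of Freiburg implementation.

def criterion_score(rater_score, student_scores):
    """Average the rater's score with the mean of the students' scores."""
    student_mean = sum(student_scores) / len(student_scores)
    return (rater_score + student_mean) / 2

def competence_score(criterion_scores):
    """Final competence score: the mean of the five criterion scores."""
    return sum(criterion_scores) / len(criterion_scores)

# One hypothetical linguistic criterion on the 1-4 scale (1 = highest):
# the rater gives 2; four students give 1, 2, 3, and 2.
fluency = criterion_score(2, [1, 2, 3, 2])  # (2 + 2.0) / 2 = 2.0

# Linguistic competence as the average of five criterion scores.
linguistic = competence_score([2.0, 1.5, 2.5, 2.0, 3.0])  # 2.2
```

On the 1-to-4 scale with 1 as the highest band, a lower aggregate therefore indicates a stronger performance.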

On the other hand, the Test of Oral English Proficiency for Academic Staff (TOEPAS), which has been used at the University of Copenhagen, Roskilde University, and the University of Nantes, is based on simulated lectures. During a test session, three EMI lecturers from the same discipline take turns giving a lecture while the others take the role of students and ask questions (Kling & Stæhr 2011; Kling & Dimova 2015). In the first version of TOEPAS, scores were assigned on a hybrid 1 to 5 scale with criteria related to fluency, pronunciation, vocabulary, grammar, and interaction (Kling & Stæhr 2011; Dimova 2017a; Dimova 2020b). The TOEPAS scale was subsequently revised, and scores are now assigned on a hybrid 10 to 60 scale (in increments of 10) that includes criteria related to audience awareness, fluency, intelligibility, organization and coherence, vocabulary, and grammar (Dimova 2020a).

The variation in the assessment tools used for EMI lecturer certification can be assumed to stem from diverging conceptualizations of the constructs that underlie these assessment methods. The following section addresses the debates over construct definitions for assessment tools used for EMI lecturer certification.

4.1. Construct conceptualizations

Several issues have permeated the debates regarding the construct underlying the assessments used for EMI lecturer certification. The most prevalent issue has been whether and to what degree pedagogy should be part of the construct, followed by discussions regarding the role of native speaker ideology in linguistic and pedagogical norm selection (Davies 2011; Dimova 2017b; Kirkpatrick 2006). In order to tackle these issues, an understanding of the target language use (TLU) domain (Bachman & Palmer 1996) is needed. The TLU represents the situation in which certain language abilities are needed to perform a task. Therefore, the TLU can also be viewed as the local context in which language is used (Dimova et al. 2020). The role of the language and the context, and perhaps their interaction, can be included in the definition of the construct represented in the assessment (Bachman 2007). Knoch and Macqueen (2019) propose four dimensions of construct definition: stated, theoretical, operationalized, and perceived. The stated construct represents the claim about the assessment focus that is applied in communication with different stakeholders involved in the assessment procedure. The purpose of this dimension is to convince the stakeholders about the relevance of the test, and it appeals to the assessment's "face validity" (op. cit.: 43). While the theoretical construct definition draws on current theories about language competence and language use, the operationalized construct is defined through the type of tasks included in the assessment. The perceived construct reflects stakeholders' understanding of the assessment results and their uses for decision-making. The emphasis tends to lie on assessments' face validity, both in terms of the stated construct, i.e., the stated relevance, and the perceived construct, i.e., stakeholder interpretations of test results and uses.
Given the lack of theoretical models representing language use in the EMI domain, the design of assessment instruments becomes a challenge, and a certain level of incongruence among the different construct dimensions may be expected. For instance, a number of EMI lecturer certification instruments draw on the CEFR as a theoretical base for the instrument design, which appeals to the relevance, transparency, and therefore, face validity, of the assessment, regardless of whether the operationalization of the CEFR in the EMI domain is viable (Dimova 2017a).

Needs analyses, which include the analysis of the domains of language use, the needs of stakeholders, the available resources, and the policies, provide essential information for the design of the operationalized constructs (Bachman & Palmer 1996; Dimova et al. 2020; Douglas 2020; Knoch & Macqueen 2019). Currently, needs analyses are primarily in the form of surveys, where EMI lecturers are asked about their perceptions of the needs and attitudes towards implementation of certification programs (Macaro et al. 2019). Needs analyses can help select an existing assessment (e.g., IELTS, TOEFL, TOEIC) because they make it possible to find the assessment method that most closely aligns with the local language needs. More importantly, needs analyses are at the core of local test development, especially if these tests are not associated with a language support program.

Despite the existence of locally developed assessment methods, reports on the needs analysis that guided the test design are rarely publicly available. The needs analysis protocol is available only for TOEPAS; it is published in the first Technical Manual developed by the University of Copenhagen (Kling & Stæhr 2012; Dimova et al. 2020). In the manual, the test developers describe the needs analysis, which included interviews with lecturers, deans, and study board leaders, as well as classroom observations and analyses of existing assessment methods and theoretical frameworks, in order to justify the selection of the certification method. Based on the observations and the interviews with lecturers, the test developers described the language domain and the teaching tasks in which lecturers are involved, while the interviews with management provided information about requirements, policies, and uses of certification results. Based on the needs analyses, the four dimensions of the construct underlying the first version of TOEPAS were outlined. In terms of the stated construct, TOEPAS "is a test of spoken production and interaction in English. More specifically, it assesses test takers' ability to lecture and interact with students in an academic context" (op. cit.: 12). While the theoretical construct is based on Bachman & Palmer's (1996) model of language ability, the operationalized construct is defined thus:

The test tasks are designed to elicit whether the test taker can handle a range of communicative tasks which are central to university teaching at graduate level, namely present highly complex content material; explain domain-specific terms and concepts; clarify, paraphrase and restate concepts and main points; present and explain an assignment; ask, understand and respond to student questions; deal with unclear questions and misunderstandings and negotiate meaning when necessary. (op. cit.: 12-13)

The TOEPAS construct description focuses on language abilities rather than teaching abilities. The argument supporting this construct definition is that assessment of pedagogy and topical knowledge may be superfluous if lecturers' previous training, teaching experience, and disciplinary expertise are taken into consideration (Dimova & Kling 2018). Moreover, pedagogical approaches vary across disciplinary and educational traditions, which means that identifying criteria for the assessment of teaching ability that are applicable across disciplines represents a challenging task for test developers. Therefore, Dimova & Kling (2018) argue that a weak language for specific purposes (LSP) model, in which the tasks elicit the pedagogical function of language use, is more relevant than a strong LSP model, in which classroom behavior (i.e., interactivity level, task variation, activities, instructional activities) is also assessed.

Another argument against assessing EMI lecturers' teaching abilities as part of their certification is that such a practice may cause inequality among lecturers because the teaching abilities of those who teach in their L1, e.g., native speakers of English, are rarely formally scrutinized. Instead of assessment, certification through participation in courses on pedagogy seems to be a more appropriate approach, regardless of what medium of instruction is used. Such courses are particularly important if the university establishes content and language integrated teaching approaches, or if the university is in the process of internationalizing the curricula.

The lack of models of language abilities for EMI to inform the theoretical constructs has led to the development of assessment methods that draw on existing communicative and language ability models. Although the tasks in locally developed tests may reflect the EMI context, task types tend to be similar to those found in international academic English tests (Lindsay & Ball 2010; Verguts & De Moor 2019), and assessment criteria tend to reflect native speaker norms (Kling & Stæhr 2012). For example, in the first version of TOEPAS, one of the criteria was based on a reference to "an educated native speaker".

Developers of assessment methods for certifying EMI lecturers have rightly acknowledged EMI as an English as a lingua franca (ELF) setting in their construct statements, but the operationalized construct continues to focus on grammatical and phonological accuracy and thus reflects native speaker norms (Gundermann & Dubow 2018). Given the fluidity of norms and standards in the communicative practices of an ELF setting such as EMI, capturing the characteristics of the language domain represented in the assessment becomes difficult, which explains the continued use of well-established native speaker norms (Dimova 2017b, 2020a).

Despite the absence of a comprehensive theoretical model of language ability for EMI, research on classroom communication in EMI settings has been growing, and the language characteristics that contribute to effective communication have been identified. Suviniitty (2012) found that students' comprehension of lecturers depended on the interactional features of their speech rather than their perceived proficiency levels. Björkman (2010, 2011) argued that language proficiency was not as important as pragmatic ability for successful communication in the EMI classroom, which represents an ELF setting. This means that high proficiency does not lead to effective communication if the lecturer lacks pragmatic strategies relevant to the multilingual, multicultural setting of EMI. For example, Björkman (2011) highlighted several pragmatic strategies (repetition, explicitness, cohesion, coherence, and topic organization) as essential for successful classroom communication. Moreover, Inbar-Lourie and Donitsa-Schmidt (2020) found that students' preference for EMI lecturers was not grounded in the notion of native speakerism. Instead, ideal EMI lecturers are those who are proficient in English, experts in their fields, familiar with the local language and culture, capable of using appropriate pedagogy, and able to promote internationalization.

In a study analyzing TOEPAS rater behavior, formative feedback, and EMI lecturer classroom practices, Dimova (2020a) found a tension between lecturers' reported practices and the norms promoted in the TOEPAS scale and the written feedback reports. While TOEPAS promoted the native-speaker norm, lecturers' practices aligned with those in Suviniitty's (2012) and Björkman's (2011) findings. Based on the results, it was recommended that references to the native speaker norm and the focus on grammatical and phonological accuracy be removed from the TOEPAS scale descriptors. The new version of the test, TOEPAS 2.0, includes descriptors that reflect previous findings about effective communication and the EMI lecturers' reports on their classroom practices. In other words, the descriptors emphasize lecturers' abilities to use pragmatic strategies, including summaries, emphasis, reiteration, coherence, and organization. Further research is needed to identify the transferable characteristics of language use that are applicable across different universities.

4.2. Characteristics of the different EMI certification models

When making decisions about certification methods for EMI lecturers, an important consideration is the technical characteristics of existing assessments and the desired characteristics of any assessment method that the HEI plans to develop. Although the technical characteristics of large-scale international tests are available thanks to the numerous studies investigating their reliability and validity for different uses, technical reports on locally developed assessment methods are rarely publicly available. HEIs publish only information about the assessment content (type of tasks), protocol (e.g., registration, length, results), and, perhaps, scale. The lack of information about the technical characteristics of certification assessments makes it difficult to compare existing assessment instruments and obscures their development process.

Three approaches are found among locally developed assessments: assessments based on several language skills, assessments based on classroom observation, and assessments based on simulated lectures. Assessments based on several language skills tend to have a structure similar to that of some large-scale international tests, with sections for each skill (reading, writing, listening, speaking): the receptive skills, grammar, and vocabulary are measured through discrete items (e.g., multiple-choice, fill-in-the-blank, and matching), while writing and speaking are performance-based (e.g., TOPTULTE, ITACE). If the development of these assessment methods follows the principles of good practice and rigorous analysis of test items, then they exhibit high reliability and practicality. Scoring reliability is achieved through consistent application of the assessment criteria, so given the discrete nature of many items and the controlled performance-based items, consistent scoring is easier to achieve. However, these assessments score lower on authenticity because the language they elicit depends on the lecturers' interaction with the given task, i.e., their interpretation of the task and the expected response (Bachman & Palmer 2010). These assessments also score lower on face validity because the relationship between the task and the TLU domain is not straightforward.

For instance, ITACE is reported to have followed the rigorous test development standards established by the Association of Language Testers in Europe (ALTE). Classical test theory (Lord & Novick 1968) was applied for item and test analysis, which was evaluated and validated by a panel of independent experts. Therefore, stakeholders can expect high consistency in test administration and scoring. In terms of authenticity, however, although the sentences, texts, and listening inputs used in the test items were taken from authentic contexts, the items may not be perceived to elicit the authentic language uses represented in the language domain, i.e., the EMI classroom. Stakeholders may not perceive the writing and speaking tasks, and especially the discrete items, as relevant tasks for EMI lecturers, which may affect the assessment method's face validity.
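To give a rough sense of the kind of item analysis classical test theory supports, the sketch below computes two standard CTT indices, item facility and point-biserial discrimination, for an invented multiple-choice item. It is not based on ITACE data or on the actual ALTE procedures; the response data are fabricated for illustration.

```python
# Classical-test-theory item statistics on invented data; this is not
# the actual ITACE analysis, only the kind of indices CTT produces.
from statistics import mean

def facility(item_responses):
    """Item facility (p-value): proportion of correct responses."""
    return mean(item_responses)

def discrimination(item_responses, total_scores):
    """Point-biserial correlation between an item and the total score."""
    n = len(item_responses)
    mi, mt = mean(item_responses), mean(total_scores)
    cov = sum((i - mi) * (t - mt)
              for i, t in zip(item_responses, total_scores)) / n
    sd_i = (sum((i - mi) ** 2 for i in item_responses) / n) ** 0.5
    sd_t = (sum((t - mt) ** 2 for t in total_scores) / n) ** 0.5
    return cov / (sd_i * sd_t)

# Five hypothetical test takers: item scores (1 = correct) and totals.
item = [1, 1, 0, 1, 0]
totals = [38, 35, 22, 30, 18]

print(facility(item))                          # 0.6
print(round(discrimination(item, totals), 2))  # 0.93
```

An item answered correctly by most strong candidates and missed by weak ones, as here, shows high positive discrimination; items with values near zero would typically be revised or discarded.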

Assessments based on actual classroom observations, on the other hand, fall at the other end of the continuum in terms of authenticity and face validity. Assessing the language that EMI lecturers actually use in their classrooms ranks high on both authenticity and face validity. Even though lecturers tend not to behave exactly as they normally would when teaching, because of their raised awareness of being assessed (Dimova & Kling 2018), they still communicate within the actual context. Such assessment procedures may seem very relevant to stakeholders and, therefore, have high face validity. On the other hand, maintaining consistent administration and scoring procedures represents a challenge. The contextual variables in which the assessment takes place vary greatly from one classroom to another. Some lecturers teach in large auditoria with over 200 students, some teach seminars with up to 10 students, and some teach in labs, clinics, or workshops. These contextual settings require different teaching behavior, language uses, and levels of interaction. Moreover, lecturers' interaction also depends on how well they know their students: a lecturer meeting students for the first time may not be as comfortable in the classroom as towards the end of the semester, when the lecturer may know the students and their reactions very well. This enormous variation in assessment administration settings poses difficulties for consistent use of the rating criteria because raters may not be able to interpret the variation in lecturers' performances, i.e., to decide whether a performance is influenced by language abilities or by contextual variables. Moreover, Kling and Hjulmand (2010) emphasized the difficulty of planning and administering the assessment due to various logistical challenges.
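One common way to monitor the scoring consistency discussed above is an inter-rater agreement index such as Cohen's kappa, which corrects raw agreement for agreement expected by chance. The sketch below uses invented ratings and is not tied to any of the assessments mentioned in this article.

```python
# Cohen's kappa for two raters on a small invented set of ratings;
# a standard agreement index, not data from any assessment named here.
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    ca, cb = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance from each rater's marginal distribution.
    expected = sum(ca[c] * cb[c] for c in set(ca) | set(cb)) / (n * n)
    return (observed - expected) / (1 - expected)

# Eight hypothetical performances rated on a 1-5 scale by two raters.
rater_a = [3, 4, 2, 5, 3, 4, 2, 3]
rater_b = [3, 4, 3, 5, 3, 4, 2, 2]
print(round(cohens_kappa(rater_a, rater_b), 2))  # 0.65
```

Values around 0.6 to 0.8 are conventionally read as substantial agreement; systematically low kappa across rater pairs would signal exactly the kind of inconsistent criterion use that varied classroom settings can produce.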

34For example, the EMI certification procedure at the University of Freiburg emphasizes the assessment's face validity by using “naturalistic assessment conditions” and claiming that “the elimination of interference factors with the aim to assure (more) objectivity, is detrimental to assessing teaching quality since it is the unforeseeable interaction between learners and teachers which makes teaching a challenge” (Dubow & Gundermann 2017: 117-118). However, unexpected variation is detrimental to scoring reliability. If the scores are influenced by the context, the validity of the results may be threatened because score variance then reflects not only ability levels but also other variables, i.e., construct-irrelevant variance (Messick 1996). It is therefore essential to know how the varying levels of challenge that lecturers experience in their classrooms are accounted for in the rating procedure, and how “teaching quality” is operationalized in the rating criteria. Basing decisions about whether EMI lecturers should be certified on classroom observation as an assessment method may thus be problematic. Classroom observation is, however, an excellent model for lecturer training, rather than assessment, where trainers observe EMI lecturers multiple times and discuss the teaching afterwards.

35Assessment methods based on simulated lectures represent a middle ground between highly controlled tasks and real-life classroom observations. A simulated administration protocol allows for elicitation of lecturing language while controlling for external influences that cause irrelevant variance in the speaking performance. Although authenticity and face validity may not be as high as in classroom observations, they are still high compared with controlled language production. As with classroom observations, maintaining consistent scoring, i.e., rater reliability, may present a challenge because the academic disciplines and topics of simulated lectures vary: variation in discipline-specific language and teaching approaches cannot be avoided. Unlike in classroom observations, though, the variation in simulated lectures is more predictable and can be more easily accounted for in the rating procedures and rating criteria.

36For example, TOEPAS is based on simulated lectures in which lecturers teach in their own fields, which introduces disciplinary and pedagogical variation because lecturers have different scientific and disciplinary backgrounds. Despite this variation, TOEPAS raters were trained to use the rating criteria consistently (Kling & Dimova 2015), and their ratings did not display bias against any particular department or field of study (Dimova & Kling 2018). Figure 2 presents a classification of the current methods based on reliability, practicality, authenticity, and face validity.

Figure 2. Current assessment methods

37Alongside research on EMI lecturer certification methods, discussions regarding minimum English language proficiency requirements have emerged. The following section presents current HEI practices regarding the implementation of minimum English language proficiency requirements for EMI certification.

4.3. Minimum proficiency requirements

38In order to promote international recognition and transparency of EMI lecturer certification, HEIs have either developed assessment methods based on the CEFR levels and descriptors (see Kling & Hjulmand 2010; Dubow & Gundermann 2017) or established requirements based on the CEFR. Whether standardization procedures have been applied to align local assessment scores with the CEFR, as recommended by the Council of Europe (Verhelst et al. 2009), remains unclear.

39Level C1 on the CEFR has been widely accepted as the minimum English proficiency needed for effective teaching in the EMI classroom, although this requirement has not been supported by empirical research. The reasons for establishing C1 as the minimum requirement for EMI lecturer certification are twofold: it is believed that lecturers should be one level above the students, for whom the minimum proficiency level is set at B2 (Klaassen & Bos 2010), and academic language descriptors are introduced at the C1 level of the CEFR. C1 as the minimum proficiency level for EMI lecturers is a requirement proposed by CRUE in the language policy for higher education in Spain (Bazo et al. 2017), by the Flemish government (Verguts & De Moor 2019), by the 3TU Federation, which comprises the three technical universities in the Netherlands (Delft University of Technology, University of Twente, and Eindhoven University of Technology), and by Maastricht University (Klaassen & Bos 2010).

40However, the requirements vary across universities, and a number of universities set lower English proficiency levels for EMI lecturer certification: a large number of HEIs have established requirements below C1, and a very small fraction at the C2 level (Halbach & Lázaro 2015; O’Dowd 2018). Given the lack of empirical evidence supporting decisions about minimum English language proficiency, the lower requirements are presumably grounded in HEIs’ pragmatic reasoning: if the requirement were set at C1, many lecturers would not qualify to teach in EMI.

41At the University of Copenhagen, the minimum oral English proficiency requirement was established empirically by setting a cut score on the TOEPAS scale (Kling & Dimova 2015). Expert judges set the cut score using the borderline group method (Zieky & Perie 2006). However, because stakeholders beyond the University of Copenhagen were unfamiliar with the TOEPAS scale, the minimum language requirements for EMI lecturer certification lacked international transparency (Dimova 2017a). Therefore, a three-day standard-setting event took place in which the TOEPAS scale was aligned with the CEFR, following the stages (familiarization, specification, standardization, and validation) proposed in the manual Relating Language Examinations to the Common European Framework of Reference for Languages: Learning, Teaching, Assessment (CEFR) (Dimova 2018; Figueras et al. 2005). When the TOEPAS scale was aligned with the CEFR, the empirically established TOEPAS cut score fell at B2+, rather than C1: EMI lecturers’ subject expertise, extensive disciplinary (domain-specific) vocabulary, and teaching experience compensated for their linguistic competence, so B2+ was sufficient for them to teach effectively.
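Procedurally, the borderline group method is straightforward: expert judges identify examinees whose performance they consider right at the pass/fail borderline, and the cut score is set at the median of that group's scores. A minimal sketch, with an invented scale and invented judge-nominated scores (not the actual TOEPAS data):

```python
# Borderline group method for standard setting (cf. Zieky & Perie 2006):
# judges nominate examinees whose performance sits at the pass/fail
# borderline; the cut score is the median of those examinees' scores.
# The scale and scores below are invented for illustration.
from statistics import median

# Scores, on a hypothetical 1-5 scale, of lecturers whom expert
# judges classified as borderline performers.
borderline_scores = [2.5, 3.0, 3.0, 3.5, 3.0, 2.5, 3.5]

cut_score = median(borderline_scores)
print(cut_score)  # → 3.0
```

The median is preferred over the mean here because it is robust to a judge nominating one clearly non-borderline examinee; in operational standard setting, the resulting value would then be reviewed against impact data before adoption.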

5. Conclusion

42EMI lecturer certification has increasingly been implemented across HEIs as part of their quality assurance, international transparency, and accreditation measures. The certification approaches and methods vary across HEIs depending on the availability of local resources and expertise and on diverging conceptualizations of the constructs underlying the assessment tools used for EMI lecturer certification. Although disagreements about best practices and critiques of existing certification models are found in the literature, these disagreements and criticisms are grounded in scholars’ pragmatic reasoning, intuitions, and perceptions, rather than in rigorous empirical research.

43Based on the basic principles of language test design and the limited research on EMI lecturer certification, several recommendations emerge. Given the lack of theoretical models of language competence in the EMI domain, needs analyses in the local domain in which the certification is used are essential. Assessment developers, or those in charge of deciding on the certification method, should first identify the stakeholders’ needs, the available resources, and the institutional, local, and national language and instructional policies. More importantly, they need to describe the target language use (TLU) domain so that they can select and adapt appropriate commercial assessments or assessments developed by peer institutions, or develop their own assessment methods reflecting local needs. Although the primary goal of certification is probably quality assurance and/or accreditation, considering the impact of the certification on lecturers and their practices is essential (Dimova 2017a, 2020a). Depending on how certifications are implemented, they may create exclusion, inequality, and loss of status among lecturers (Dimova 2017a) on the one hand, or offer support and guidance on the other (Dimova 2020c). Certification procedures that provide extensive feedback and training support to those who need it are more likely to be welcomed by lecturers than procedures that exclude them from participation in departmental activities.

44It has been recognized in the field that assessment methods for EMI lecturer certification should differ from those used for student university enrolment, but what the difference should be remains elusive. Hence the recurring argument in EMI-related discussions that pedagogy should be included in the assessment construct. Although including pedagogy may contribute to the face validity of the assessment, careful operationalization of the pedagogy construct is crucial because pedagogy varies across classroom settings, disciplines, and educational traditions. Moreover, assessing only non-native English-speaking lecturers for their pedagogical skills creates inequality and reinforces assumptions of native-speaker superiority.

45Despite the appeal of authenticity and face validity, which may facilitate the implementation of a certification method, the reliability and validity of the assessment must be prioritized regardless of the assessment method selected. Clear assessment criteria, relevant items or tasks, and continuous rater training are important for maintaining consistency in score assignment. Involving naïve raters (e.g., students) in the rating process enhances the assessment’s face validity, but it introduces varied interpretations of the assessment criteria shaped by factors such as subject difficulty, familiarity with the lecturer, grades for the subject, and the raters’ own proficiency levels (Barnwell 1989; Kang et al. 2019).

46Finally, adopting commercial test scores for EMI lecturer certification may seem an easier solution for HEIs because it requires no internal resources or investments. However, HEIs should be aware of these tests’ possible limitations in measuring lecturers’ English proficiency for the EMI domain, and should consider those limitations when interpreting test results and setting requirements. For example, if a lecturer’s overall TOEFL score is lowered by the writing section, the implications of that writing score for the EMI classroom should be considered.

47If HEIs decide to design and implement a local assessment tool for EMI lecturer certification, they should consider the resources available for its design, maintenance, development, and sustainability. Continuous analyses of the adequacy of the scale descriptors, rater training and performance, items, assessment uses, and assessment impacts are necessary to ensure the quality of the assessment instrument. An assessment team focused on the assessment’s maintenance and ongoing development is therefore needed. For instance, over the past decade, TOEPAS has undergone several revisions of its scale, rater training system, and the underlying infrastructure used for scoring, data storage, and analysis (Dimova et al. 2020). If an HEI lacks a long-term plan for assessment development and fails to establish a team dedicated to this purpose, the sustainability of the assessment will be jeopardized.

Kling, Joyce & Lars Stenius Stæhr. 2012. The Development of the Test of Oral English Proficiency for Academic Staff (TOEPAS).

Wilkinson, Robert. 2005. “The impact of language on teaching content: Views from the content teacher”. Paper presented at the Bi and Multilingual Universities–Challenges and Future Prospects Conference.



Airey, John. 2011. “Talking about teaching in English: Swedish university lecturers’ experiences of changing teaching language”. Ibérica 22, 35–54.

Arkın, Erkan & Necdet Osam. 2015. “English-medium higher education. A case study in a Turkish university context”. In Dimova, Slobodanka, Kristina Hultgren & Christian Jensen (Eds.), English-medium Instruction in European Higher Education. Berlin: De Gruyter Mouton, 177–199.

Bachman, Lyle. 2007. “What is the construct? The dialectic of abilities and contexts in defining constructs in language assessment”. Language Testing Reconsidered, 41–71.

Bachman, Lyle F. & Adrian S. Palmer. 1996. Language Testing in Practice: Designing and Developing Useful Language Tests (Vol. 1). Oxford: Oxford University Press.

Bachman, Lyle F. & Adrian S. Palmer. 2010. Language Assessment in Practice: Developing Language Assessments and Justifying Their Use in The Real World. Oxford: Oxford University Press.

Ball, Phil & Diana Lindsay. 2013. “Language demands and support for English-medium instruction in tertiary education. Learning from a specific context”. In Doiz, A., J. M. Sierra & D. Lasagabaster (Eds.), English-medium Instruction at Universities: Global Challenges. Bristol, UK: Multilingual Matters, 44–64.

Barnwell, David. 1989. “Naive native speakers and judgements of oral proficiency in Spanish”. Language Testing 6, 152–163.

Bazo, Plácido, Dolores González, Aurora Centellas, Emma Dafouz, Alberto Fernández & Victor Pavón. 2017. Linguistic Policy for the Internationalisation of the Spanish University System: A Framework Document. Madrid: CRUE.

Björkman, Beyza. 2010. “So you think you can ELF: English as a lingua franca as the medium of instruction”. Hermes–Journal of Language and Communication Studies 45, 77–96.

Björkman, Beyza. 2011. “Pragmatic strategies in English as an academic lingua franca: Ways of achieving communicative effectiveness?”. Journal of Pragmatics 43, 950–964.

Bridgeman, Brent, Yeonsuk Cho & Stephen DiPietro. 2016. “Predicting grades from an English language assessment: The importance of peeling the onion”. Language Testing 33/3, 307–318.

Campagna, Sandra & Virginia Pulcini. 2014. “English as a medium of instruction in Italian universities: Linguistic policies, pedagogical implications”. Textus 27, 173–190.

Coleman, David, Sue Starfield & Anne Hagan. 2003. The Attitudes of IELTS Stakeholders: Student and Staff Perceptions of IELTS in Australian, UK and Chinese Tertiary Institutions. International English Language Testing System (IELTS) Research Reports 2003 (Volume 5), 1–60.

Crawford Camiciottoli, Belinda. 2004. “Interactive discourse structuring in L2 guest lectures: Some insights from a comparative corpus-based study”. English for Specific Purposes 3, 39–54.

Dafouz, Emma & María-del-Mar Camacho-Miñano. 2016. “Exploring the impact of English-medium instruction on university student academic achievement: The case of accounting”. English for Specific Purposes 44, 57–67.

Dafouz, Emma & Begoña Núñez. 2009. “CLIL in higher education: Devising a new learning landscape”. In Dafouz, E. & M. Guerrini (Eds.), CLIL across Educational Levels: Experiences from Primary, Secondary and Tertiary Contexts. Madrid: Richmond Publishing, 101–112.

Davies, Alan. 2011. “Does language testing need the native speaker?”. Language Assessment Quarterly 8, 291308.

Dimova, Slobodanka. 2017a. “Life after oral English certification: The consequences of the Test of Oral English Proficiency for Academic Staff for EMI lecturers”. English for Specific Purposes 46, 45–58.

Dimova, Slobodanka. 2017b. “Pronunciation assessment in the context of World Englishes”. In Kang, O. and A. Ginther (Eds.), Assessment in Second Language Pronunciation. Oxon: Routledge, 49–66.

Dimova, Slobodanka. 2018. Linking the TOEPAS with the CEFR: Technical report. TAEC Erasmus+ project (2017–2020).

Dimova, Slobodanka. 2020a. “Language assessment of EMI content teachers: What norms.” In Kuteeva, M., K. Kaufhold & N. Hynninen (Eds.), Language Perceptions and Practices in Multilingual Universities. London: Palgrave Macmillan, 351–378.

Dimova, Slobodanka. 2020b. “Mundtlighed i et testperspektiv [Speech from a testing perspective]”. Sprogforum 70, 39–46.

Dimova, Slobodanka. 2020c. “The role of feedback in the design of a testing model for social justice”. Journal of Contemporary Philology 3, 21–34.

Dimova, Slobodanka & Joyce Kling. 2015. “Lecturers’ English proficiency and university language polices for quality assurance”. In Wilkinson, R. & M. L. Walsh (Eds.), Integrating Content and Language in Higher Education: From Theory to Practice. Selected papers from the 2013 ICLHE Conference. Frankfurt: Peter Lang, 50–65.

Dimova, Slobodanka & Joyce Kling. 2018. “Assessing English‐medium instruction lecturer language proficiency across disciplines”. TESOL Quarterly 52, 634–656.

Dimova, Slobodanka & Joyce Kling. 2020. “Current considerations on integrating content and language in multilingual universities”. In Dimova, S. & J. Kling (Eds.), Integrating Content and Language in Multilingual Universities. Cham: Springer, 1–12.

Dimova, Slobodanka, Xun Yan & April Ginther. 2020. Local Language Testing: Design, Implementation, and Development. Oxon: Routledge.

Douglas, Dan. 2020. Assessing Languages for Specific Purposes. Cambridge: Cambridge University Press.

Dubow, Gregg & Susanne Gundermann. 2017. “Certifying the linguistic and communicative competencies of teachers in English-medium instruction programmes”. Language Learning in Higher Education 7/2, 475–487.

Eslami, Zohreh R., Keith M. Graham & Hassan Bashir. 2020. “English Medium Instruction in Higher Education in Qatar: A multi-dimensional analysis using the ROAD-MAPPING framework”. In Dimova, S. & J. Kling (Eds.), Integrating Content and Language in Multilingual Universities. Cham: Springer, 115–129.

Fenton-Smith, Ben, Pamela Humphreys & Ian Walkinshaw. 2017. English Medium Instruction in Higher Education in Asia-Pacific. New York: Springer International Publishing.

Figueras, Neus, Brian North, Sauli Takala, Norman Verhelst & Piet Van Avermaet. 2005. “Relating examinations to the common European framework: A manual”. Language Testing 22, 261–279.

Fortanet-Gómez, Inmaculada. 2020. "The dimensions of EMI in the international classroom: Training teachers for the future university." In Fortanet-Gómez, I. (Ed.), Teacher Training for English-Medium Instruction in Higher Education. Hershey: IGI Global, 1–20.

Ginther, April & Xun Yan. 2018. “Interpreting the relationships between TOEFL iBT scores and GPA: Language proficiency, policy, and profiles”. Language Testing 35/2, 271–295.

Gregersen, Frans (Ed.). 2014. Hvor parallelt: Om parallellspråkighet på Nordens universitet. Copenhagen: Nordic Council of Ministers.

Gregersen, Frans. 2018. More Parallel, Please!: Best Practice of Parallel Language Use at Nordic Universities: 11 Recommendations. Copenhagen: Nordic Council of Ministers.

Gundermann, Susanne & Gregg Dubow. 2018. “Ensuring quality in EMI: Developing an assessment procedure at the University of Freiburg”. Bulletin VALS-ASLA 107, 113–125.

Halbach, Anna & Alberto Lázaro. 2015. The Accreditation of English Language Level in Spanish Universities. La Acreditación del nivel de lengua Inglesa en las Universidades Españolas. Actualización.

Inbar-Lourie, Ofra & Smadar Donitsa-Schmidt. 2020. “EMI Lecturers in international universities: Is a native/non-native English-speaking background relevant?”. International Journal of Bilingual Education and Bilingualism 23/3, 301–313.

Kang, Okim, Don Rubin & Alyssa Kermad. 2019. “The effect of training and rater differences on oral proficiency assessment”. Language Testing 36, 481–504.

Kirkpatrick, Andy. 2006. “Which model of English: Native-like, nativized or lingua franca?” In Rubdy, R. & M. Saraceni (Eds.), English in the World: Global Rules, Global Roles. London: Continuum, 71–83.

Klaassen, Renate. 2001. The International University Curriculum: Challenges in English-Medium Engineering Education (Doctoral dissertation). TU Delft: Delft University of Technology.

Klaassen, Renate. 2008. “Preparing lecturers for English-medium instruction”. In Wilkinson, R. & V. Zegars (Eds.), Realizing Content and Language Integration in Higher Education. Maastricht: Universitaire Pers Maastricht, 32–42.

Klaassen, Renate & Madeleine Bos. 2010. “English language screening for scientific staff at Delft University of Technology”. Journal of Language and Communication Studies 45, 61–75.

Kling, Joyce & Slobodanka Dimova. 2015. “The Test of Oral English for Academic Staff (TOEPAS): Validation of standards and scoring procedures”. In Knapp, A. & K. Aguado (Eds.), Fremdsprachen in Studium und Lehre–Chancen und Herausforderungen für den Wissenserwerb. Frankfurt/Main: Peter Lang, 247–268.

Kling, Joyce & Lise-Lotte Hjulmand. 2008. “PLATE–Project in language assessment for teaching in English”. In Wilkinson, R. & V. Zegars (Eds.), Realizing Content and Language Integration in Higher Education. Maastricht: Universitaire Pers Maastricht, 191–200.

Kling, Joyce & Lars Stenius Stæhr. 2011. “Assessment and assistance: Developing university lecturers’ skills through certification feedback”. In Cancino, R., L. Dam & K. Jæger (Eds.), Policies, Principles, Practices: New Directions in Foreign Language Education in the Era of Educational Globalization. Newcastle upon Tyne, UK: Cambridge Scholars Press, 213–245.

Knoch, Ute & Susy Macqueen. 2019. Assessing English for Professional Purposes. Oxon: Routledge.

Lord, Frederic M. & Melvin R. Novick. 1968. Statistical Theories of Mental Test Scores. Reading, MA: Addison-Wesley Publishing Company.

Macaro, Ernesto. 2018. English Medium Instruction. Oxford: Oxford University Press.

Macaro, Ernesto, Antonio Jiménez-Muñoz & David Lasagabaster. 2019. “The importance of certification of English medium instruction teachers in higher education in Spain”. Porta Linguarum 32, 103–118.

Martinez, Ron. 2016. “English as a Medium of Instruction (EMI) in Brazilian higher education: challenges and opportunities”. In Finardi, K. R. (Ed.), English in Brazil: Views, Policies and Programs. Londrina: SciELO-EDUEL, 191–228.

Messick, Samuel. 1996. “Validity and washback in language testing”. Language Testing 13/3, 241–256.

Morell, Teresa. 2007. “What enhances EFL students’ participation in lecture discourse? Student, lecturer and discourse perspectives”. Journal of English for Academic Purposes 6, 222–237.

O'Dowd, Robert. 2018. “The training and accreditation of teachers for English medium instruction: an overview of practice in European universities”. International Journal of Bilingual Education and Bilingualism 21/5, 553–563.

Pilkinton-Pihko, Diane. 2013. English-Medium Instruction: Seeking Assessment Criteria for Spoken Professional English (doctoral dissertation). Helsinki: University of Helsinki.

Suviniitty, Jaana. 2012. Lectures in English as a Lingua Franca: Interactional Features (doctoral dissertation). Helsinki: University of Helsinki.

Tange, Hanne. 2010. “Caught in the Tower of Babel: University lecturers’ experiences with internationalisation”. Language and Intercultural Communication 10, 137–149.

TAEC Literature Database Report (2020). The Transnational Alignment of English Language Competences for University Lecturers Literature Database. TAEC Erasmus+ project (2017–2020).

Thøgersen, Jakob & John Airey. 2011. “Lecturing undergraduate science in Danish and in English: A comparison of speaking rate and rhetorical style”. English for Specific Purposes 30, 209–221.

Verguts, Catherine & Tom De Moor. 2019. “The Policy-Imposed C1-Level of English in Flemish Universities: A Blessing for Students, a Challenge for Lecturers”. Paper presented at the BAAHE Conference. Ghent, Belgium.

Verhelst, Norman, Piet Van Avermaet, Sauli Takala, Neus Figueras & Brian North. 2009. Common European Framework of Reference for Languages: Learning, Teaching, Assessment. Cambridge: Cambridge University Press.

Vinke, Adriana A. 1995. English as the Medium of Instruction in Dutch Engineering Education (doctoral dissertation). TU Delft: Delft University of Technology.

Vinke, Adriana A, Joke Snippe & Wim Jochems. 1998. “English-medium content courses in Non-English Higher Education: A study of lecturer experiences and teaching behaviours”. Teaching in Higher Education 3, 383–394.

Vu, Nha & Anne Burns. 2014. “English as a medium of instruction: Challenges for Vietnamese tertiary lecturers”. Journal of Asia TEFL 11/3, 1–31.

Wait, Isaac & Justin Gressel. 2009. “Relationship between TOEFL score and academic success for international engineering students”. Journal of Engineering Education 98/4, 389–398.

Westbrook, Pete & Birgit Henriksen. 2011. “Bridging the linguistic and affective gaps”. In Cancino, R, L. Dam & K. Jæger (Eds.), Policies, Principles, Practices: New Directions in Foreign Language Education in the Era of Educational Globalization. Newcastle upon Tyne, UK: Cambridge Scholars Press, 188–212.

Zieky, Michael & Marianne Perie. 2006. A Primer on Setting Cut Scores on Tests of Educational Achievement. Princeton, NJ: ETS.


List of illustrations

Figure 1. TAEC EMI framework
Figure 2. Current assessment methods

How to cite this article

Print reference

Slobodanka Dimova, “Certifying lecturers’ English language skills for teaching in English-medium instruction programs in higher education”. ASp 79 | 2021, 29-47.

Electronic reference

Slobodanka Dimova, “Certifying lecturers’ English language skills for teaching in English-medium instruction programs in higher education”. ASp [Online], 79 | 2021, published online 01 March 2022, accessed 23 June 2024.


Slobodanka Dimova

Slobodanka Dimova is associate professor at the University of Copenhagen. Her research interests include language testing and EMI. Her work appears in Language Testing, TESOL Quarterly, English for Specific Purposes, and English for Academic Purposes. She co-edited English-Medium Instruction in European Higher Education (De Gruyter Mouton 2015), Integrating Content and Language in Multilingual Higher Education (Springer 2020), and co-authored Local Language Testing (Routledge 2020).


Copyright

The text only may be used under the CC BY-NC-ND 4.0 licence. All other elements (illustrations, imported files) are “All rights reserved”, unless otherwise stated.
