
Research articles

Data-driven learning and professional culture: Using textometry to analyze organizations in applied foreign language programs

Apprentissage par corpus et culture organisationnelle : une approche textométrique en langues étrangères appliquées
Mary C. Lavissière
p. 99-117

Abstract

In the era of massive textual data, Applied Foreign Languages (Langues Étrangères Appliquées, LEA) programs occupy a unique place in the university landscape. The combination of humanities courses and language programs applied to the business world enables students to understand the institutional processes to which discourse contributes. In addition, the emergence of textometric software such as Iramuteq (Ratinaud 2014) provides a method that extends discourse analysis to large quantities of organizational discourse. Our exploratory study indicates that textometry can be used to improve the analyses of organizational culture carried out by LEA students.


Full text

1 Introduction

1The question of the academic identity of Applied Foreign Languages (Langues Étrangères Appliquées, LEA) students in the larger landscape of students who complete French degree programs targeting both business classes and foreign languages was raised by Narcy-Combes (2003: 119) in the following terms:

  • 1 This is our translation of the following extract from Narcy-Combes (2003: 119) “Ensuite, la nécessi (...)

Next, the necessity for LEA students to construct their identity between French technical and business schools. This space seems to offer them a specific professional outlet on the condition of raising their awareness about the fact that linguistic competence in the Chomskyan definition (knowledge of the rules of how language functions) is not necessarily synonymous with competence in communication and that the latter supposes sociocultural and intercultural competences1.

  • 2 We adopt Biber et al.’s (2007) definition of the term “discourse”. These researchers distinguish be (...)

2In this article, we propose to use data-driven learning, specifically textometry, to develop the competences identified in the quote above. We argue that teaching LEA students how language creates organizational culture may enable them to distinguish themselves from other students with comparable foreign language and business skills. More specifically, we argue that using textometry in the classroom can help students ascertain the connection between language and organizational culture. Indeed, while students in other degree programs study languages and organizational theory, they rarely study how organizational culture is created by language. It is equally uncommon for linguistic concepts, such as discourse,2 to be explicitly taught to LEA students, even though LEA is a degree in foreign languages. For this reason, we argue that integrating textometry to raise LEA students’ awareness of the link between discourse and organizational culture can give them a distinctive combination of theoretical knowledge and applied skills.

  • 3 Henceforth “institutional logics”.

3Our article is organized as follows. First, we review one of the major theories of organizational culture, namely the institutional logics perspective3 (Thornton et al. 2012). We also present the linguistic and mathematical theory behind textometry, especially from the Geometric School of Data Analysis, whose findings directly contributed to Iramuteq (Ratinaud 2014). In addition, we review research in which a data-driven learning (DDL) approach is used to teach about culture. Then, we present a study in which we used DDL to teach first-year master’s students about organizational culture through discourse. The study includes the creation of a corpus by the students and an analysis of it with Iramuteq. Finally, we discuss the contributions of our results to the relevant literature and present our conclusions.

2 Literature review

4The following paragraphs provide an overview of research about organizational culture, textometry, and data-driven learning approaches that use discourse to teach about culture. First, we briefly review the frameworks currently used for analyzing culture in LEA classes. We also provide a short overview of the history and tenets of institutional logics as an approach to organizational culture. Second, we explain the theory behind textometry, specifically Iramuteq, and the algorithms it integrates. Third, we review the literature concerning data-driven learning (DDL) approaches that link discourse and culture.

2.1 Organizational culture and institutional logics

5First, the pivotal role of cultural knowledge in successful professional communication is well established. In addition to Hall’s (1959, 1966) seminal anthropological studies about how cultural factors impact human interaction, the impact of culture on organizations is now a preoccupation in management sciences and practice. In order to help managers lead international teams effectively, authors such as Hofstede et al. (2010), Trompenaars & Hampden-Turner (2004), and Meyer (2014) propose models of cultural factors linked to nationality. These models often make up the core content of courses on culture in LEA programs. However, they focus mainly on traits of national culture.

6For this reason, we adopt the theory of institutional logics to analyze organizational culture. This theory aims to construct models that represent organizations as part of a multi-level system that is both constructed by and constructs symbolic systems, such as language. In this framework, an organization is defined as an instrument for accomplishing a given objective (Scott 2008). However, Selznick (1957) argues that organizations are subject to institutionalization, a process of infusing their practices with meaning. Thornton et al. (2012: 10) further emphasize the importance of symbols in creating meaning by making the distinction between the material and the symbolic: “By material aspects of institutions, we refer to structures and practices; by symbolic aspects, we refer to ideation and meaning, recognizing that the symbolic and the material are intertwined and constitutive of one another.” Institutional logics thus offer a theory of how language participates in the construction of organizations through discourse. Linguistic structures such as sets of terms, collocations, genres, narratives, and lexical worlds (Reinert 1983) create meanings that are specific to organizations. In this way, understanding the trends in discourse produced by a given organization allows researchers and students to perceive deeper processes at work within it and to adapt their own discourses accordingly.

2.2 Textometry as exemplified by Iramuteq

7Textometric software such as Iramuteq can facilitate the representation and the interpretation of trends in organizational discourse. Words and collocations that appear frequently in organizational discourse can be seen both as practices of speaking and as symbolic patterns for conveying meaning. Textometric software programs help researchers and students to identify and analyze these patterns. They include algorithms developed by the school of Geometric Data Analysis (Le Roux & Rouanet 2005), which was founded in the 1960s by Jean-Paul Benzécri (Beaudouin 2016). As a mathematician with a deep interest in linguistics, Benzécri worked to create statistical methods that 1) facilitated an inductive and text-based approach to language rather than the top-down and abstract modeling proposed by the theory of Generative Linguistics; 2) were motivated by a philosophy that “the model must fit the data, not vice versa” (Greenacre 1984: 10); 3) helped researchers visualize multivariable data through graphs in order to aid their interpretation. Statistical methods from Geometric Data Analysis are suitable for studying organizational discourse when the aim is to observe and analyze trends that stem from the textual data produced by organizations or in organizational contexts.

  • 4 unités de contexte
  • 5 contexte-types

8While multiple software programs inspired by Geometric Data Analysis exist, Iramuteq integrates and adapts a characteristic clustering algorithm for large corpora. Iramuteq is open-source software that runs on several operating systems and has an active team of developers who interact with its community of users through message boards and mailing lists. The algorithm was initially proposed by Reinert (1983, 1990), who theorized that corpora bear the mark of the utterers’ statements about objects. He claims these objects are represented by lexical lemmas, which he refers to as “forms”. Statements and sentences are called “units of context”.4 The presence of two forms in a unit of context reflects the way a single utterer constructs a cognitive representation of the world. On the other hand, when forms repeatedly appear in the same units of context at a corpus level rather than only in an individual text, these repetitions are called “typical contexts”.5 They are collective cognitive representations that are specific to a corpus and they show “a general attitude of a subject in regard to the world” or a lexical world (Reinert 1990: 28).

9Reinert operationalizes his theory by using a descending hierarchical clustering (DHC) algorithm. This algorithm constructs statistically homogeneous classes of forms that appear in the same units of context. The details of the algorithm are described in his articles and in Schonhardt-Bailey et al. (2012). We summarize the method briefly here to show how Reinert’s linguistic theories are integrated into Iramuteq. First, the program creates a contingency table. The rows are filled with units of context, or text segments (TS) as they are called in Iramuteq; the columns are filled with forms. The length of the TS can be adjusted by the researcher and may correspond to the average length of a sentence in the discourse studied. The presence (1) or absence (0) of the forms in each TS is indicated at the intersection of each row and column of the contingency table. The algorithm then recursively creates classes, starting with one class and dividing it into smaller classes. It creates the first division by calculating the first factor of the Correspondence Analysis (Benzécri 1973) of the contingency table according to Pearson’s chi-squared test and the orthogonal hyperplane that maximizes interclass inertia between the two potential classes. Finally, to optimize the division, individual TS can be exchanged around the hyperplane. The algorithm continues until the predetermined number of iterations is reached. Once the classes of TS have been created, they are characterized by the forms that are overrepresented in their TS as compared to the rest of the corpus according to the chi-squared metric. Iramuteq also allows classes to be represented according to their distribution around the first two factors of a Correspondence Analysis computed on the classification results (classes × words). The researcher then interprets the classes and the factors, often assigning them a name.
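As a rough illustration of the first division summarized above, the following Python sketch builds a presence/absence table from a few invented text segments, orders the segments on the first factor of a Correspondence Analysis, and cuts them where the chi-squared between the two resulting classes is largest. It is a simplified sketch and not Iramuteq’s implementation: the segments, the function names, and the use of power iteration to obtain the first factor are assumptions made for this example, and the step in which individual TS are exchanged between the two classes is omitted.

```python
import math

def presence_table(segments):
    """Rows = text segments (TS), columns = forms; 1 if the form occurs in the TS."""
    forms = sorted({w for seg in segments for w in seg.split()})
    rows = [[1 if f in seg.split() else 0 for f in forms] for seg in segments]
    return forms, rows

def first_factor(rows, n_iter=200):
    """Scores of the TS on the first CA factor, up to a positive scale factor."""
    total = sum(map(sum, rows))
    r = [sum(row) / total for row in rows]          # row masses
    c = [sum(col) / total for col in zip(*rows)]    # column masses
    # Standardized residuals (p_ij - r_i * c_j) / sqrt(r_i * c_j)
    s = [[(x / total - ri * cj) / math.sqrt(ri * cj) for x, cj in zip(row, c)]
         for row, ri in zip(rows, r)]
    u = [float(i + 1) for i in range(len(rows))]    # arbitrary start vector (assumption)
    for _ in range(n_iter):                         # power iteration on S * S^T
        v = [sum(s[i][j] * u[i] for i in range(len(s))) for j in range(len(c))]
        u = [sum(sij * vj for sij, vj in zip(si, v)) for si in s]
        norm = math.sqrt(sum(x * x for x in u))
        u = [x / norm for x in u]
    return [ui / math.sqrt(ri) for ui, ri in zip(u, r)]

def chi2_two_classes(rows, group_a, group_b):
    """Pearson chi-squared of the 2 x forms table obtained by summing each group of TS."""
    a = [sum(rows[i][j] for i in group_a) for j in range(len(rows[0]))]
    b = [sum(rows[i][j] for i in group_b) for j in range(len(rows[0]))]
    na, nb = sum(a), sum(b)
    chi2 = 0.0
    for aj, bj in zip(a, b):
        for observed, row_total in ((aj, na), (bj, nb)):
            expected = row_total * (aj + bj) / (na + nb)
            chi2 += (observed - expected) ** 2 / expected
    return chi2

def split_once(rows):
    """One descending step: order the TS on the first factor, keep the best cut."""
    scores = first_factor(rows)
    order = sorted(range(len(rows)), key=lambda i: scores[i])
    cuts = [(order[:k], order[k:]) for k in range(1, len(order))]
    return max(cuts, key=lambda ab: chi2_two_classes(rows, *ab))

def characteristic_forms(forms, rows, group):
    """Forms over-represented in a class, ranked by a 2 x 2 chi-squared on TS counts."""
    inside, n, ranked = set(group), len(rows), []
    for j, form in enumerate(forms):
        a = sum(rows[i][j] for i in inside)             # TS in the class with the form
        c = sum(row[j] for row in rows) - a             # TS outside the class with the form
        b, d = len(inside) - a, (n - len(inside)) - c
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        if denom and a * d > b * c:                     # keep over-represented forms only
            ranked.append((n * (a * d - b * c) ** 2 / denom, form))
    return [form for _, form in sorted(ranked, key=lambda t: (-t[0], t[1]))]

# Invented, already lemmatized text segments (hypothetical data)
segments = [
    "safety regulation plant process",
    "plant process regulation standard",
    "safety standard plant regulation",
    "customer event innovation brand",
    "brand event customer news",
    "innovation news brand customer",
]
forms, rows = presence_table(segments)
class_a, class_b = split_once(rows)
for members in (class_a, class_b):
    print(sorted(members), characteristic_forms(forms, rows, members)[:2])
```

In this toy corpus the two groups of segments share no forms, so the first factor separates them cleanly; with real organizational discourse the classes overlap, which is why the full algorithm refines each division before splitting again.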

2.3 Data-driven learning

10When Iramuteq is integrated into a DDL approach (Aijmer 2009; Bennett 2010; Boulton 2012, 2016, 2017; Boulton & Wilhelm 2006; Johns 1986, 1988, 1991), it can be used to teach LEA students about how discourse constructs and is constructed by organizational culture. Few studies, however, have actually used DDL methods to teach about culture. The list of 489 empirical DDL studies used by Boulton & Vyatkina (2021) in their systematic literature review shows that only Kettemann (2011) uses a DDL method to teach about culture. He uses this approach in a study centered on teaching youth culture for a cultural studies program. The students used a teacher-composed corpus (emo) and Wordsmith Tools 4.0 (Scott 2004). They were presented with collocations of the pronoun “I” and the verb following it; a list of keywords from the emo corpus and one from the Bergen Corpus of London Teenage Language (colt) (Stenström et al. 2002); the list of the 50 most frequent keywords in the emo corpus; and the concordances for alone, lonely, and on my own. The teaching method was assessed through questionnaires distributed to the ten students who took the course. The questionnaires included short answer questions about the tasks and the students’ appreciation of the method. While his study was exploratory, Kettemann (2011) highlights an increase in the awareness of the language-culture connection.

11Kettemann (2011), however, does not use DDL to explore the link between discourse and organizational culture, nor does he use a method that allows for a systematic exploration of larger discourse units in his corpus. Finally, he provides his students with a corpus rather than having them construct their own corpora. In contrast, Charles (2012, 2014) shows that when students create their own corpora, they learn autonomy and tend to use the corpora beyond the classroom. Creating a corpus also leads students to consider how different types of discourse influence the representation of a culture. To fill these three gaps, we carried out an exploratory study in which students created their own corpora of organizational discourse and used Iramuteq to facilitate their analysis of the link between discourse and organizational culture. We describe this study in the following section.

3 Using textometry to teach organizational culture

12This section describes an exploratory study in which a task involving textometry was used to teach about the link between discourse and organizational culture. Like Kettemann (2011), we combined a deductive approach (theoretical classes) with an inductive one (textometric study). The task involved nine students in the first year of an LEA master’s program. Their specialization was international trade. They were enrolled in a “sandwich” degree program (alternance), and they worked in diverse sectors for their internships. The students worked three days outside the university and attended university courses during the other two days.

13The task took place during the first semester of the 2019–2020 academic year. The objectives of the English course included learning about organizational culture. As the students were placed as interns in companies during their degree program, instructors encouraged them to analyze the organizational culture of the companies in which they were interns. Several class sessions were devoted to cultural theory. Students were introduced to theories focused on national culture (Hall 1959, 1966; Hofstede et al. 2010; Meyer 2014), to organizational theory (Scott 2008; Thornton et al. 2012), and to textometry. Then a task-based approach (Ellis 2003; Kazeroni 1995; Kuteeva 2013; Malicka et al. 2019; Paris 2021) was used to encourage students to observe and analyze organizational culture using discourse from the companies in which they were working.

14The task included five steps. First, the students created a corpus of discourse related to the corporation where they worked as interns. Authentic texts from any source of professional discourse were eligible for the corpus: emails, internal regulations, social media communication, websites, annual reports, etc. Second, the students were trained to use Iramuteq during a three-hour class session. The students learned to obtain and interpret the results of basic frequency statistics (Loubère & Ratinaud 2014), the DHC (Reinert 1983, 1990), and the representation of the DHC classes when these were projected around the first two factors of the Correspondence Analysis (Benzécri 1973). Third, the students carried out analyses on their corpus using the software. Fourth, they presented and discussed their results in class. Fifth, they wrote a report on their companies’ culture as part of their final evaluation. They were asked to justify their choice of texts included in their corpus, to include their interpretations of the DHC analysis, and to name the classes produced by the DHC analysis. They were also asked to label the first two factors of the Correspondence Analysis around which the classes can be projected in Iramuteq.

15The nine reports presented by our students were then analyzed with a thematic method (Bardin 2013). We used two predetermined themes and an emergent one. The predetermined themes were “corpus building” and “organizational culture analyses”. Students were explicitly asked to include these themes in their final reports. The other theme, “limits of textometry according to students”, emerged from the reports. We present our results in the following section.

4 Results

16Like Kettemann (2011), we organize our results according to the three themes that we identified in the students’ final anonymized reports. The themes are 1) corpus building; 2) organizational culture analyses; 3) limits of textometry according to students. We also present graphs from the students’ reports. We chose this presentation because, as mentioned in our literature review, an important step in Geometric Data Analysis is the interpretation of the classes resulting from the DHC and the interpretation of the axes resulting from the Correspondence Analysis. Students’ names, gender pronouns, and companies have been altered or excluded for ethical purposes.

4.1 Corpus building

  • 6 While coding was not required, student 2 also coded the texts in her corpora for type of social net (...)
  • 7 Cleaning of the corpora was started in class and completed by the students autonomously. Certain ty (...)

17The students were asked to build a corpus of written discourse produced by their companies. They were free to choose which texts to include in their corpus, but were told that they would have to justify their choices in their final report. Most of the texts chosen were already in digital format. They included non-confidential internal documents and external communication, mainly via the companies’ websites and social media. For the latter, students mainly collected texts manually, though one student, student 2, created a short, computerized algorithm to collect all of the publications from her company’s social media accounts, including Twitter, Facebook, and LinkedIn6. We summarize the characteristics of the corpora in table 1 below.7

Table 1: Description of the corpora as provided by the students in their final reports


18As table 1 shows, the quality of the corpora and of the corpus descriptions varied from student to student. It should be noted that several students decided to collect a corpus of organizational communication in French because their companies had little text available in English. The students, however, were required to translate key terms into English in their final report.

19Six of the nine students showed that they had become aware of how textual genres may affect the representation an organization creates of itself. This includes an awareness of the difference between targeting audiences that are inside or outside of the organization:

The corpus does not contain internal texts. It is only external communication, so we might detect the image we want to create in the partners and clients’ minds. (student 2)

I have chosen to only present internal documents produced by the firm itself in order to examine what kind of message (the company profile, the company culture or the institutionalization rate, etc.) the firm [company] is trying to express. On one hand, the internal and organizational announcement will permit to analyse [sic] what the firm is trying to show about its values, and its culture to its employee (internal). And on the other hand, the other documents help us analyse [sic] what the firm is trying to express toward its customer and the society (brand image). (student 6)

20They also showed awareness of the different purposes of discourse on social media:

LinkedIn is a multi-purpose communication platform for the [company]’s brand: It is addressing to [sic] the business internal and external network of the brand, as the dealership network or the different collaborators of the firm; It also enables to get [sic] in touch with potential prospects, as active professional people are part of the target customers’ community. (student 5)

21One student showed that she was aware of the dual nature of the technical and the institutional components described by Selznick (1957):

I deliberately chose not to incorporate technical elements such as products’ technical specifications sheets because I wanted this analysis to focus on [company]’s brand image, strategy and culture only. Generally, the content includes general information about the Company (History, About, governance, strategy, objectives, products and services….), news (From September 2019 to January 2020 and previous important news that still have repercussions on today’s strategy – for example, the transition from [company A] to [company B] – , quotes, mottos and catchphrases – especially on social medias [sic] such as Facebook, Twitter and LinkedIn – . (student 8)

22Another student showed that norms and guidelines could constitute discourse that contains company values:

The first document I chose that made the most sense for me to study and center the research on IRaMuTeQ around was the frame of reference that I am currently working on and that is at the center of my work at [company]. It was important to me to study this because it is what I mainly do at my job and because it is a framework that covers trade as well as customer service, the 2 services that I will be working for in the future once this project is finished. (student 9)

23Student 1 and student 4, however, did not fully grasp that the representation of the organization that emerges from one source of discourse is highly contingent on the type of discourse selected for the textometric analysis:

My Corpus come [sic] from the website of my company [company] All of this [sic] pages are giving basic informations [sic] and example [sic] of what are our products and processes. Each of this [sic] pages are composed by a description of the process, the different materials that can be machined thanks to those processes and there is often a video to illustrate the process. I’ve also choose [sic] to include the page with the presentation of the company. I’ve choose [sic] this corpus because I think it really defined the company identity. What is it’s [sic] objectives, what do it propose, how it realize it? The whole company identity is inside. (student 1)

I thought that because they [texts] all come from [the company’s] official website they will anyway be linked and that it would be way more interesting to take them randomly and to see the results. (student 4)

24In sum, after the task, the majority of the students seemed more aware of how different genres create different representations of an organization.

4.2 Organizational culture analyses

25The students had mixed success in directly linking their interpretation of the DHC analysis and the graph of the Correspondence Analysis with concepts from institutional logics theory. Three students were able to explicitly link these concepts to the results of their textometric analyses; one student linked national culture concepts to his results. We give examples from their reports below.

26Student 2 coded her corpus for the source of discourse (social media versus website articles) and was able to observe differences in how the organization represented itself according to the source of the discourse. To illustrate this, she used the graph (figure 1b) that represented the DHC classes (figure 1a) around the first two factors of the Correspondence Analysis:

This graphic [sic] show [sic] that the communication is very different between social networks and articles. We can notice that the articles are more based on the citizen’s needs. And it’s normal because interviews are about the persons who made the projects proposals and the other part about the projects realized. Social medias communication is bases [sic] on participatory budget while articles semantic field is more based on the platform and its features. Also, the social networks are more linked to the news to be trendier. It may be because we have to adapt our readers. Articles are on our website; they are made for all persons who discover our company and platform. While the followers on social medias are persons who already knew the company, so can present our platform or the new features.

Figures 1a and 1b: DHC classes and Correspondence Analysis graph of discourse sources (student 2)


27Similarly, as shown in figure 2, student 6 analyzed her DHC classes in light of organizational theory, highlighting the regulative and normative aspects of her company:

After the analysis of both graphs, it seems that from an organisational, institutional and cultural point of view important information about the company could be draw [sic]. Firstly, it appears that the internal document [sic] shared by [company] are mostly communicating on [company] “Organisation” (3rd class) with 28,2% of the total words composing the corpus and put the emphasis on the word plant. “Plant” […] is directly links [sic] with the regulative and normative industrial logic (process, regulation, etc.) […]. Concerning the word “commit” […] it directly refers to the regulative basis of order highlighted before which is [company]’s commitment to follow the current law and regulation. It is visible in the everyday life of the employee (regular updating of the process, charts, concept sheets, etc.). Moreover, we can notify [sic] the engineering vocabulary type used in the document, this vocabulary is used in order to expose the organisational legitimation of [company] activity by using a [sic] specific language […].

Figure 2: Classes calculated by the DHC in student 6’s report


28As shown in figure 3, student 7 also emphasized concepts associated with institutional logics, such as “standards” and “methods”:

In conclusion, this corpus highlights the institutional aspect of [company x]. This is due to the importance of the standards, doctrines, and operating methods to be followed in the various processes. The State […] has control over […], which is why [company x] is subject to European regulations.

[Company x] is not just a large group […] all these regulations are very much ingrained in its strategy. This aspect restrains the group in its development, but it brings security and perenity.

Figure 3: Classes calculated by the DHC in student 7’s report


29The other students were less successful in directly linking their analyses to their company’s organizational culture as defined by institutional logics. While eight of the nine students proposed relevant names for the classes resulting from the DHC, only five of the nine were able to name the first two factors of the Correspondence Analysis and correctly interpret the graph of the classes projected around these factors. Some students misinterpreted the graphs. For example, when a class is far from the center of the graph of the Correspondence Analysis, the interpretation should be that the class is distinct from the other classes. Instead, student 8 interpreted this position as meaning that her company’s website did not emphasize its services:

[…] the purple class which represents services is a bit offset. It could mean that [company] does not focus on services. Actually, services are [sic] one of [company]’s priorities since last year. This paradox could be explained by the fact that marketing teams are busy promoting events and innovations rather than talking about services that are still under study.

Figure 4: Correspondence analysis associated with the DHC in student 8’s report


30The distance of the class from the center of the graph in figure 4 may mean that reference to services is concentrated in certain specific passages of the company’s texts rather than in its overall discourse. However, this does not necessarily mean that services are not emphasized. Another misinterpretation was found in student 3’s report. He interpreted the tree diagram from left to right (see figure 5), rather than according to the vertical lines indicating the relationship between the classes:

The fourth topic is the [sic] security, this place [sic] for a such important topic for a technologic company. Security comes at [sic] fourth. That shows that the company should communicate more about how they secure their data center. This could improve the sales of [product] for the companies and to be more known by people that [sic] doesn’t [sic] know about the [product].

Figure 5: Classes calculated by the DHC in student 3’s report


31Overall, the results linked to organizational culture analyses show the potential that textometry has for allowing students to explore the link between discourse and organizational culture. They also show, however, the challenging nature of both the theoretical concepts behind theories about organizational culture and the methodological concepts behind textometry.
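The reading of the Correspondence Analysis graph discussed in this section can be illustrated with a small numerical sketch. In a Correspondence Analysis, the origin represents the average vocabulary profile of the corpus, and the distance of a class from the origin is its chi-squared distance to that average profile. The Python sketch below computes, for each class, its mass (its share of the counts) and this distance. The table of counts, the class names, and the function name are invented for the example and do not come from any student’s report.

```python
import math

def mass_and_distance(table):
    """For each row (class): share of all counts, and chi-squared distance
    between the row profile and the average profile (the CA origin)."""
    total = sum(map(sum, table))
    col_mass = [sum(col) / total for col in zip(*table)]
    result = []
    for row in table:
        row_total = sum(row)
        profile = [x / row_total for x in row]
        d2 = sum((p - c) ** 2 / c for p, c in zip(profile, col_mass))
        result.append((row_total / total, math.sqrt(d2)))
    return result

# Hypothetical classes x forms counts (forms: event, launch, new, maintenance, repair)
classes = ["events", "innovation", "services"]
table = [
    [40, 30, 20, 0, 0],    # events
    [20, 30, 40, 0, 0],    # innovation
    [0, 0, 10, 40, 40],    # services
]
for name, (mass, dist) in zip(classes, mass_and_distance(table)):
    print(f"{name:<11} mass={mass:.2f} distance={dist:.2f}")
```

In this invented table the three classes have the same mass, yet the services class lies farthest from the origin because its vocabulary differs most from the average profile. Distance from the center therefore signals distinctiveness of vocabulary, not how much of the discourse is devoted to a topic.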

4.3 Limits of textometry according to the students

32While not required to do so, some students explicitly or implicitly evoked the limits of textometry as they saw them. These included discrepancies between the representation of the company produced by Iramuteq and their personal experience in the company. Student 1 found that the representation produced by the software did not correspond to his personal representation of the company:

One point I think is regrettable is that the corpus does not highlight the services offered by the company such as maintenance, training, sending spare parts in an emergency. These activities are an integral part of society [sic] since when a machine is sold, its life expectancy is around 20 to 30 years and all maintenance is carried out by our technicians. A substantial part of the turnover is therefore generated by the maintenance of the 300 machines that [company] has around the world. This activity is unfortunately not mentioned in the various graphs

33While he cites the corpus as the issue, he does not specify whether the issue is the method of corpus collection or whether there is a true problem with how the company represents itself on its website. Student 4 also evokes a surprising result:

To me this analysis shows the exact representation that [company] has about itself. However I find it quite surprising that there is [sic] no occurrences for « customer » which appears only six times in the whole corpus. It is surprising because the company has a clear strategy concerning customer […] The same observing [sic] needs to be done for the worldwide part as I would have believe [sic] European occurrences would be more present than Worldwide occurrences. Actually, the European activity is less mentioned though it is the biggest part of [company’s] activity. […] This shows the limit of this analysis: you can’t just took [sic] what’s on the graphs for granted. It is really necessary to at least know what is the corpus composed of to have enough perspectives [sic] for a perfect analysis.

34Student 4 seems more analytical than student 1 because he mentions the nature of the corpus. Neither of these students delves explicitly into the issue of sampling and statistics, but they take a critical position on the difference between the representation that a collection of discourse can create through statistical methods and their personal experience in the company. 8
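The two sampling questions raised by student 4's result (how the articles were selected, and whether near-synonyms of "customer" were counted) can be illustrated with a minimal Python sketch. It is not part of the course described here and does not use Iramuteq: the function names `draw_random_sample` and `count_term_family`, the seed value, and the example sentences are all hypothetical and serve only as an illustration.

```python
import random
import re
from collections import Counter


def draw_random_sample(items, k, seed=0):
    """Draw k distinct items at random; the fixed seed makes the draw reproducible."""
    rng = random.Random(seed)  # hypothetical seed, chosen only for repeatability
    return rng.sample(list(items), k)


def count_term_family(texts, variants):
    """Count every occurrence of a set of related word forms across the texts."""
    wanted = {v.lower() for v in variants}
    counts = Counter()
    for text in texts:
        # Simple tokenization: runs of letters, lowercased (an assumption,
        # not the lemmatization a textometric tool would apply).
        for token in re.findall(r"[^\W\d_]+", text.lower()):
            if token in wanted:
                counts[token] += 1
    return counts


# Hypothetical stand-ins for articles collected from a company website.
articles = [
    "Our customers and clients trust us.",
    "Each client is a customer.",
    "We ship spare parts worldwide.",
    "Training is offered across Europe.",
    "Maintenance contracts cover every machine.",
]

sample = draw_random_sample(articles, k=3, seed=42)
family = count_term_family(articles, ["customer", "customers", "client", "clients"])
```

Counting a family of related forms rather than a single word is one way to check whether a low frequency reflects the corpus or merely the choice of search term.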

35In contrast, student 2, who had experience with statistics, was candid in her critiques of the representations produced by Iramuteq, citing issues related to both the algorithms and the corpus:

There is [sic] a lot of variable than can make this analysis questionable. We have to keep in mind that the content has been written by only one person. It means that it reveals the point of view and the way to communicate of only one author. Also, this kind of analysis is very sensitive to overfitting, we can make a lot of theories, but those theories cannot be confirmed or infirmed. […] The corpus we used is only made of external communication, so it can only reveal something about the way the company communicate [sic]. It would have been very interesting to have more internal documents to be able to make a comparison between different way [sic] of communication. The purpose, the care and the recipients are not the same. By seeing common points between both type of communication our analysis could be validated or rejected. Data base may be too small to have something meaningful, no information on iramuteq are [sic] given to see if the size of the corpus is big enough, Iramuteq doesn’t detect if the corpus is good or not.

36The fact that students take a critical view is positive. However, it should be highlighted that the size of the corpus must be considered when evaluating how well it represents the company. Iramuteq only describes patterns from a given corpus. It does not judge the adequacy of the corpus, nor does it give any measure of statistical validity.
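Because the software does not report on corpus adequacy, students could compute a few descriptive figures themselves before interpreting the graphs. The following is a minimal sketch in Python, assuming a plain-text corpus and a naive letter-based tokenization; the function name `corpus_profile` and the example text are hypothetical, and these figures describe the corpus without establishing any statistical validity.

```python
import re
from collections import Counter


def corpus_profile(text: str) -> dict:
    """Return basic size indicators for a plain-text corpus."""
    # Naive tokenization: runs of letters, lowercased (an assumption;
    # Iramuteq applies its own segmentation and lemmatization).
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(tokens)
    n_tokens = len(tokens)
    n_types = len(counts)
    # Hapax legomena: forms that occur exactly once in the corpus.
    n_hapax = sum(1 for c in counts.values() if c == 1)
    ttr = n_types / n_tokens if n_tokens else 0.0
    return {
        "tokens": n_tokens,
        "types": n_types,
        "hapax": n_hapax,
        "type_token_ratio": ttr,
    }


# Hypothetical two-sentence corpus used only to show the output shape.
profile = corpus_profile("The company sells machines. The company trains customers.")
```

A very small token count or a very high share of hapax forms is a signal to treat the resulting classes with caution, not a formal test.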

37In sum, these results point to the need for more instruction about corpora and a more detailed course on Iramuteq’s parameters. Other methods may be used to explore corpora in more depth, but they also require time for training that was not available during the course described in the present article. However, our results do show that students have gained insights into the contingent nature of the representation that discourse may give of the organization that produced it. The students understood that this representation can change according to the type of discourse and its communicative purpose.

5 Discussion

38In this section, we discuss how we contribute to the current literature about DDL by using a DDL approach to teach organizational culture through textometry. We divide this discussion into three points.

39First, we used DDL to teach about the link between discourse and organizational culture. In contrast, Kettemann (2011) used this method to teach about discourse and culture in a cultural studies program. The culture studied in his article, EMO culture, was chosen for its motivational potential among students. In our study, students analyzed companies in which they were interns. Like Kettemann (2011), we observed a high level of motivation in our students during corpus construction as illustrated in the citations from student reports evoking “corpus building”. Most of our students made a serious effort to find various sources of organizational discourse to explore the culture of their organizations. In contrast to Kettemann (2011), the concepts we studied in the theoretical portion of the class were important to the students’ immediate professional achievements. In other words, the students’ choice of research object was more applied, and its professional stakes were immediately visible. It also meant that our students learned the specialized vocabulary and expressions used in their companies for each “lexical world” that existed in their corpora. The relevance of the specialized vocabulary was not a consideration in Kettemann’s (2011) study.

40Second, unlike Kettemann (2011), our students constructed their own corpora. This approach was crucial to their understanding of the link between discourse and organizational culture. Like Charles (2012, 2014), we observed that this method allowed our students to tailor their corpus to their needs, including learning specialized vocabulary. We also observed that our students became more autonomous. They were better equipped to analyze the link between discourse and organizational culture thanks to their corpus-building experience. This was observable in their comments in the “corpus building” results. It was also observable in the “limits” results, in which several students took a critical point of view on the link between the representation created from their corpus and their own experience in their organization. This is a specific contribution of our study with regard to Kettemann (2011): through corpus creation, our students learned that the representations that emerge from discourse are contingent. They change according to the communicative purpose of the discourse, that is, to whom the discourse is directed and with what communicative objective. We feel this knowledge was acquired by the majority of the class.

41Third, while Kettemann (2011) used keywords and concordances, in our DDL approach, we mainly applied statistical methods for exploring trends in larger discourse units. This approach was operationalized by Reinert’s algorithm (1983, 1990) in Iramuteq to create lexical worlds from each student’s corpus of organizational discourse. All of the students were able to interpret the classes resulting from the DHC in light of their knowledge of the corpus and the organization in which they were interns. However, despite their motivation, not all students understood the overall objective of projecting the classes around the first two factors of the Correspondence Analysis, nor how to interpret the graph representing this projection. Despite these limits, the majority of the students gained knowledge about the importance of discourse in creating organizational culture and the crucial, though complex, ability to analyze discourse to understand organizations.

42In sum, by furthering the research of Kettemann (2011) and Charles (2012, 2014), our study shows that LEA students better understand the link between discourse and organizations after constructing their own corpora of organizational discourse and analyzing them with textometric methods. As seen from the final reports, the students gained overall awareness of the multi-faceted nature of organizational discourse and of organizations themselves. While not all of the students explicitly applied concepts from institutional logics to the results of their textometric analyses, the majority made reference to the multitude of “lexical worlds” present in organizations and the diverse logics they represent. This is a key concept in the framework of institutional logics and is crucial to gaining “sociocultural and intercultural competences” (Narcy-Combes 2003: 119) for using language in organizational contexts.

6 Conclusion

43Our research puts forth evidence that using textometry in a DDL approach to teach organizational culture can provide a strategic advantage to LEA students. It provides a path to the professional profile that Narcy-Combes (2003) identified. Not only do LEA students study both business subjects and languages, but they can use tools such as textometry to grasp how discourse is a product and a source of organizational culture. This knowledge of the link between discourse and organizational culture gives LEA students a competitive advantage over students exiting other degree programs that include foreign languages and business classes.

44As a final note, we acknowledge the limits of our study. This study was carried out with a small group of master’s students. It was, therefore, feasible to help students who had encountered technical problems with the software. This type of help can be time-consuming for the instructor and unfeasible with a large group of students. Additionally, as the students were also interns, they had access to their companies’ internal documents. In this respect, the conditions were ideal for being able to create a corpus of authentic organizational discourse. Finally, Iramuteq was only employed to analyze the lexical content in the corpora generated by our students through classification and correspondence analysis. In the future, it would be interesting to combine this dual approach with other methods from the Geometric Data Analysis school or other corpus linguistics traditions. For example, combining keyword and concordance analyses with methods that explore the larger structures in discourse could show students how specialized terms interact with larger trends in a corpus to construct institutionalized representations.

Bibliography

Aijmer, Karin. 2009. Corpora and Language Teaching. Philadelphia: John Benjamins Publishing.

Bardin, Laurence. 2013. L’Analyse de Contenu. Paris: Presses Universitaires de France.

Beaudouin, Valérie. 2016. “Statistical analysis of textual data: Benzécri and the French school of data analysis”. Glottometrics 33, 56–72.

Bennett, Gena. 2010. Using Corpora in the Language Learning Classroom. Ann Arbor: University of Michigan Press.

Benzécri, Jean-Paul (dir.). 1973. L’Analyse des Données : Leçons sur l’Analyse Factorielle et la Reconnaissance des Formes et Travaux du Laboratoire de Statistique de l’Université de Paris VI. Paris: Dunod.

Biber, Douglas, Ulla Connor & Thomas Albin Upton. 2007. Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Philadelphia: John Benjamins Publishing.

Boulton, Alex. 2012. “Corpus consultation for ESP: a review of empirical research”. In Alex Boulton, Shirley Carter-Thomas & Elizabeth Rowley-Jolivet (Eds.), Corpus-Informed Research and Learning in ESP. Philadelphia: John Benjamins Publishing, 261–291.

Boulton, Alex. 2016. “Integrating corpus tools and techniques in ESP courses”. ASp 69, 113–37, 10.4000/asp.4826.

Boulton, Alex. 2017. “Data-Driven Learning and Language Pedagogy”. In Steven L. Thorne & Stephen May (Eds.), Language, Education and Technology, 181–92. Encyclopedia of Language and Education. Cham: Springer International Publishing. DOI: 10.1007/978-3-319-02237-6_15.

Boulton, Alex, & Stephan Wilhelm. 2006. “Habeant Corpus – They should have the body. Tools learners have the right to use”. ASp 49–50, 155–70, 10.4000/asp.661.

Boulton, Alex, & Nina Vyatkina. 2021. “Thirty years of data-driven learning: Taking stock and charting new directions over time”. Language Learning & Technology 25/3, 66–89.

Charles, Maggie. 2012. “‘Proper vocabulary and juicy collocations’: EAP students evaluate do-it-yourself corpus-building”. English for Specific Purposes 31/2, 93–102, 10.1016/j.esp.2011.12.003.

Charles, Maggie. 2014. “Getting the corpus habit: EAP students’ long-term use of personal corpora”. English for Specific Purposes 35, 30–40, 10.1016/j.esp.2013.11.004.

Ellis, Rod. 2003. Task-Based Language Learning and Teaching. Oxford Applied Linguistics. Oxford: Oxford University Press.

Greenacre, Michael. 1984. Theory and Applications of Correspondence Analysis. Orlando: Academic Press.

Hall, Edward T. 1959. The Silent Language. New York: Doubleday.

Hall, Edward T. 1966. The Hidden Dimension. New York: Doubleday.

Hofstede, Geert H., Gert Jan Hofstede & Michael Minkov. 2010. Cultures and Organizations: Software of the Mind. Maidenhead: McGraw-Hill.

Johns, Tim. 1986. “Micro-concord: A language learner’s research tool”. System 14/2, 151–62, 10.1016/0346-251X(86)90004-7.

Johns, Tim. 1988. Whence and Whither Classroom Concordancing? Computer Applications in Language Learning. Berlin: Mouton De Gruyter.

Johns, Tim. 1991. “Should you be persuaded – Two samples of data-driven learning materials”. Birmingham University English Language Research Journal 4, 1–16.

Kazeroni, Abdel. 1995. “Task-based language teaching”. ASp 7–10, 113–32, 10.4000/asp.3750.

Kettemann, Bernhard. 2011. “Tracing the emo side of life: Using a corpus of an alternative youth culture discourse to teach cultural studies”. In A. Frankenberg-Garcia, Lynne Flowerdew & Guy Aston (Eds.), New Trends in Corpora and Language Learning. London; New York: Continuum, 44–61.

Kuteeva, Maria. 2013. “Graduate Learners’ approaches to genre-analysis tasks: Variations across and within four disciplines”. English for Specific Purposes 32/2, 84–96, 10.1016/j.esp.2012.11.004.

Le Roux, Brigitte & Henry Rouanet. 2005. Geometric Data Analysis: From Correspondence Analysis to Structured Data Analysis. Dordrecht: Springer Netherlands.

Loubère, Lucie & Pierre Ratinaud. 2014. Documentation IRaMuTeQ 0.6 Alpha 3. http://www.iramuteq.org/documentation/fichiers/documentation_19_02_2014.pdf. Retrieved 13 July 2022.

Malicka, Aleksandra, Roger Gilabert Guerrero & John M. Norris. 2019. “From needs analysis to task design: Insights from an English for specific purposes context”. Language Teaching Research 23/1, 78–106, 10.1177/1362168817714278.

Meyer, Erin. 2014. The Culture Map: Breaking through the Invisible Boundaries of Global Business. New York: Public Affairs.

Narcy-Combes, Marie-Françoise. 2003. “La communication interculturelle en anglais des affaires: transfert ou conflit d’interprétation? Analyse d’une pratique d’enseignement en LEA”. ASp 39–40, 119–29, 10.4000/asp.1341.

Paris, Justine. 2021. “Multiplicité des approches à visée concrète, personnalisée et autonomisante en anglais de spécialité: Exemple en licence professionnelle droit du patrimoine”. ASp 79, 95–112, 10.4000/asp.7174.

Ratinaud, Pierre. 2014. IRaMuTeQ: Interface de R Pour Les Analyses Multidimensionnelles de Textes et de Questionnaires (version 0.7 alpha 2). Windows, GNU/Linux, Mac OS X. http://www.iramuteq.org.

Reinert, Max. 1983. “Une Méthode de classification descendante hiérarchique : Application à l’analyse lexicale par contexte”. Les Cahiers de l’Analyse des Données 8/2, 187–98.

Reinert, Max. 1990. “Alceste, une méthodologie d’analyse des données textuelles et une application: Aurelia De Gerard de Nerval”. Bulletin of Sociological Methodology/Bulletin de Méthodologie Sociologique 26/1, 24–54, 10.1177/075910639002600103.

Schonhardt-Bailey, Cheryl, Edward Yager & Saadi Lahlou. 2012. “Yes, Ronald Reagan’s rhetoric was unique – But statistically, how unique?”. Presidential Studies Quarterly 42/3, 482–513.

Scott, Mike. 2004. WordSmith Tools (version 4). Oxford: Oxford University Press.

Scott, W. Richard. 2008. Institutions and Organizations: Ideas and Interests. Los Angeles: Sage Publications.

Stenström, Anna-Brita, Gisle Andersen & Ingrid Kristine Hasund. 2002. Trends in Teenage Talk: Corpus Compilation, Analysis and Findings. Philadelphia: John Benjamins Publishing.

Selznick, Philip. 1957. Leadership in Administration. New York: Harper & Row.

Thornton, P. H., William Ocasio & Michael Lounsbury. 2012. The Institutional Logics Perspective: A New Approach to Culture, Structure, and Process. Oxford: Oxford University Press.

Trompenaars, Fons, & Charles Hampden-Turner. 2004. Managing People across Cultures. Chichester: Capstone.

Notes

1 This is our translation of the following extract from Narcy-Combes (2003: 119) “Ensuite, la nécessité pour les étudiants de LEA de construire leur identité, entre BTS et école de commerce. Ce créneau semble pouvoir leur offrir un débouché spécifique à condition de les sensibiliser aux faits que la compétence linguistique dans la définition chomskienne (connaissance des règles de fonctionnement du langage) n’est pas nécessairement synonyme de compétence de communication et que cette dernière suppose des compétences socioculturelles et interculturelles.”

2 We adopt Biber et al.’s (2007) definition of the term “discourse”. These researchers distinguish between three different meanings of the term: 1) usage of words in context; 2) studies of units of language that are larger than a sentence (an independent clause with subject and predicate); 3) studies of larger units of language to understand social interactions. In our approach, the term discourse always combines the second and third meanings because, as Reinert (1990) hypothesizes, the way sentences are combined into larger discourse units is highly indicative of the utterers’ subjective representation of the outside world.

3 Henceforth “institutional logics”.

4 unités de contexte

5 contexte-types

6 While coding was not required, student 2 also coded the texts in her corpora for type of social network (Twitter, Facebook, LinkedIn) and type of website article (Interview or other type). This coding is relatively simple to implement and it could be integrated into future projects such as the one described here.

7 Cleaning of the corpora was started in class and completed by the students autonomously. Certain types of corpus creation and cleaning can be time-consuming. These are important parameters to consider when planning to integrate student-created corpora into language teaching.

8 We hesitate to come to a strong conclusion about whether student 4’s sampling method was the reason for the low frequency of “customer” in their corpus or if there was indeed an issue with the company’s representation of itself as customer-oriented on its website. This student aimed for a random collection of articles from the company’s website, but a truly random collection should have been obtained using a random selection technique. This was not the case for student 4. Additionally, the student did not analyze all the possible synonyms for “customer” in their report.

List of illustrations

Title Table 1: Description of the corpora as provided by the students in their final reports
URL http://0-journals-openedition-org.catalogue.libraries.london.ac.uk/asp/docannexe/image/8142/img-1.png
File image/png, 36k
Title Figures 1a and 1b: DHC classes and Correspondence Analysis graph of discourse sources (student 2)
URL http://0-journals-openedition-org.catalogue.libraries.london.ac.uk/asp/docannexe/image/8142/img-2.png
File image/png, 155k
Title Figure 2: Classes calculated by the DHC in student 6’s report
URL http://0-journals-openedition-org.catalogue.libraries.london.ac.uk/asp/docannexe/image/8142/img-3.png
File image/png, 164k
Title Figure 3: Classes calculated by the DHC in student 7’s report
URL http://0-journals-openedition-org.catalogue.libraries.london.ac.uk/asp/docannexe/image/8142/img-4.png
File image/png, 161k
Title Figure 4: Correspondence analysis associated with the DHC in student 8’s report
URL http://0-journals-openedition-org.catalogue.libraries.london.ac.uk/asp/docannexe/image/8142/img-5.png
File image/png, 574k
Title Figure 5: Classes calculated by the DHC in student 3’s report
URL http://0-journals-openedition-org.catalogue.libraries.london.ac.uk/asp/docannexe/image/8142/img-6.png
File image/png, 407k

To cite this article

Print reference

Mary C. Lavissière, « Data-driven learning and professional culture: Using textometry to analyze organizations in applied foreign language programs », ASp, 82 | 2022, 99-117.

Electronic reference

Mary C. Lavissière, « Data-driven learning and professional culture: Using textometry to analyze organizations in applied foreign language programs », ASp [Online], 82 | 2022, published online 1 November 2023, accessed 8 November 2024. URL: http://0-journals-openedition-org.catalogue.libraries.london.ac.uk/asp/8142 ; DOI: https://0-doi-org.catalogue.libraries.london.ac.uk/10.4000/asp.8142

Author

Mary C. Lavissière

Mary C. Lavissière is an Associate Professor of Applied Foreign Languages at Nantes Université and a member of the Centre de recherche sur les identités, les nations et l’interculturalité (CRINI). Her research focuses on LSP, genre, morphosyntax, historical linguistics and linguistic methods applied to management sciences and language teaching. Her projects include the modernization of legal language, legal genres, textometry applied to management sciences and language teaching, and the linguistic behavior of women in the maritime industry. marycatherine.lavissiere@univ-nantes.fr

Copyright

CC-BY-NC-ND-4.0

Only the text may be used under the CC BY-NC-ND 4.0 license. All other elements (illustrations, imported attachments) are “All rights reserved” unless otherwise stated.
