The general matter this paper focuses on is the methodological challenge that the current status of images poses to semiotics, as the impact of computational techniques and digital practices on visual studies becomes a necessity to address: we refer here to the notion according to which the complex interaction between digital tools, the subjects who employ them, and the virtual environments they construct has resulted in what Ruggero Eugeni (2015) describes as the postmedial condition, a major paradigm shift in the relevance and functions of images for contemporary societies.
The main argument to be made here is that, in order to adequately address digital images from a semiotic perspective as part of the network of practices in which they are involved, it is essential to adopt a methodology of analysis that is up to date with the perspective of a material turn (Dondero, 2020). By taking into account the plane of immanence of objects as a critical space for moving away from the notion of textual closure distinctive of the generative tradition, the direction we want to follow is to adopt a dynamic, open and interpretative perspective that employs the notion of material substrate as an entry point for examining digital images in the domain of interactivity and visual practices, thereby managing to address a question we claim to be crucial here: “what do we do with images?”.
- 1 Dondero; Fontanille, 2012.
When thinking about the uses of digital images, the practices of scientific research are clearly a field that has already proven very fertile for current developments in visual semiotics1, which have grown less and less constrained to the analytic tools originally developed and tested in the narrow field of artistic images.
Therefore, for the applied part of this paper, we present the analysis of an image recognition application developed in the field of biodiversity research, in order to test what the role of semiotics might be in the delicate process that leads from raw visual data to knowledge production via computational technologies: when empirical evidence is produced through visual data, what tools do we have to assess its legitimacy? What can semiotics teach us about the ways in which we decide to trust huge data sets, and about how we make them usable?
Beginning with the postmedial condition as presented by Eugeni (2015) allows us to approach the topic in a very simple and direct way: computer technologies have emerged with the ability to generate, reshape and manipulate textual objects that were previously bound to specific and circumscribed contexts of use, but are now broken down into small molecules of information that leave them constantly subject to readaptation and repurposing; this increasing social pervasiveness of digital practices has blurred any clear distinction between “what is medial and what is not” (Eugeni, 2015, p. 28), thereby lifting the status of uniqueness from the confined spaces of the fruition and use of visual texts and opening up a whole new dimension of application possibilities for images, encoded in the form of computer language. While this concept is rather straightforward from a general point of view, in the domain of semiotics further steps are necessary to clarify the type of methodological shift we are addressing here, which calls into question the issue of materiality. Although the ephemeral nature of digital images leads us to think of them as immaterial, substrateless objects, we believe that it is precisely the complex and layered nature of their materiality that allows us to enrich our methodological perspective.
Regarding this prejudice about the immateriality of digital images, the work of Johanna Drucker (2020) has identified a fundamental juncture that goes back to Nelson Goodman’s distinction between autographic and allographic systems. In Goodman’s theory (1968), the concept of notation identifies the codified scheme that, in allographic systems, allows the authenticity of an infinitely reproducible cultural object to be recognized; it is opposed to the concept of inscription, which refers to the material trace that, in autographic systems, identifies the uniqueness of the original object: in the case of music, for example, the execution of a specific composition is regarded as authentic based on its adherence to a codified pattern of characters (notation) that can be read within the musical score, while any reproduction of a pictorial work (inscription) is registered as a copy. From this premise, it is understandably tempting to consider computer language as the expression of a highly functional notational system, in that it allows a cultural object of any kind (be it textual, visual, audio, etc.) to be reproduced indefinitely by translating it into a binary scheme that breaks it down into tiny particles of information. This is where the characterization of digital textualities as immaterial comes from, since they can be reproduced in the form of identical configurations in binary language: the digital image would thus seem like pure textuality, a kind of immaterial trace that conveys a given configuration of visual features without inscribing itself in a substrate, without the possibility of embodying itself in an object.
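As a purely illustrative aside, the notational reading of computer language can be sketched in a few lines of Python (the sample object is a hypothetical string; an image’s pixel bytes would behave identically): the object is broken down into a binary scheme and rebuilt exactly, which is what makes the temptation described above so understandable.

```python
# Sketch of computer language read as a notational system: a cultural
# object is broken down into tiny particles of information (bits) and
# can be reproduced indefinitely from that scheme.
obj = "Fulica atra"
bits = "".join(f"{byte:08b}" for byte in obj.encode("utf-8"))

# Any copy produced from the binary scheme is indistinguishable from the
# "original": this is what suggests the (misleading) idea of immateriality.
rebuilt = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)).decode("utf-8")
assert rebuilt == obj
```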
Assuming this point of view curiously entails no consequences within a semiotic paradigm centered on the plane of immanence of textuality, such as that of the dominant (generative, Greimasian) tradition of semiotics, since an object of analysis is examined within the boundaries of its internal dynamics, disembodied from the materiality that exposes it to the various external circumstances and variables determining its uses. It seems quite clear, however, that it is not helpful to limit our perspective to this consideration of digital immateriality, since it is precisely the pervasive mobility of digital objects that is predominantly expressed within the plane of immanence of practices: to embrace this perspective, it becomes necessary to move away from static, enclosed and ontological approaches to cultural products, which tend to observe them for “what they are,” and instead embrace a dynamic, open and interpretative outlook that considers digital texts on the basis of “what they do.”
Johanna Drucker’s groundbreaking suggestion in this regard is expressed in the idea that, when we deal with visual languages, there is no such thing as a truly allographic system, or rather, “a system may be allographic at a formal level, as a notation system, but never at an inscriptional level where an image is produced as a material trace” (Drucker, 2020, p. 43). To give a brief example applied to digital images: a professional photographer editing photos taken at a wedding would prefer to work on a RAW file containing a large amount of uncompressed data, since they need a wide range of settings to process each image at a very high level of detail; the bride and groom, on the other hand, would receive compressed JPG or PNG files, whose properties have been selected to achieve a satisfactory visual output at the price of discarding all the encoded information that is not required to produce good photographs to print or share on social networks. There is no doubt that a good portion of data is shared between the image on which the photographer works and the image that the bride and groom get, but the very fact that the same image goes through a number of different stages of compression makes it clear that the gap between the two encodings is not insignificant, because the kinds of visual practice into which they fit are different. This example is intended to show how problematic the concept of notation belonging to allographic systems is, since it leads us to think that there can be, for each visual configuration, one and only one notational scheme that encapsulates the whole identity of the image, while each trace of this image is for all intents and purposes an inscription, if we understand it as the result of a different and original selection of parameters pertinent to a given anticipated use.
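The gap between the two encodings can be made concrete with a deliberately schematic sketch (pure Python, no imaging library; the pixel values and quantization steps are invented for illustration): the same source data, passed through two different parameter selections, yields two materially different inscriptions.

```python
# Schematic sketch of two lossy encodings of the same source: heavier
# quantization discards detail that a print-quality file keeps, because
# the two inscriptions anticipate different uses.
source = [12, 13, 14, 200, 201, 202, 90, 91]  # invented pixel values

def lossy_encode(pixels, step):
    """Quantize pixel values with a given step: a stand-in for compression."""
    return [round(p / step) * step for p in pixels]

raw_like = lossy_encode(source, step=1)  # fine-grained, for editing
jpg_like = lossy_encode(source, step=8)  # coarse, for printing or sharing

assert raw_like == source   # nothing discarded at step 1
assert jpg_like != raw_like  # the two inscriptions differ
```

A good portion of the data survives in both versions, yet no single encoding exhausts the identity of the image: each is one selection of parameters among others.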
The digital image, embodied in the substrate through which it is visualized, refers back to the distributed complexity of its visualization: to the hardware and software supports, to the networks and servers, and to the encoding languages on which it depends, so that it is understood as the result of a network of instances and operations directly involved in the act of visualization.
Restoring the centrality of the enunciative significance of this complex act of visualization means acknowledging that these reticular processes are directly implicated in the sense-making of digital images, and thus require consideration in the analysis.
Clearly, these considerations would not be viable without a key landmark, namely Jacques Fontanille’s work on the generative trajectory of expression (2005; 2008): a tool that has made it possible to reclaim the principle of immanence of the generative tradition within a contemporary theoretical framework, one that opens up the boundaries of semiotic pertinence beyond the dimension of textuality without losing the essential methodological values in which the semiotic discipline is rooted, and that has contributed to the reassessment of the critical role of enunciation in contemporary studies.
- 2 Dondero, 2020; Paolucci, 2020.
Following the journey towards the emancipation of enunciation from the plane of immanence of texts, which goes through Fontanille (2008) to reach the most up-to-date theoretical declinations2, we would like to reaffirm that enunciation is the main issue in play here, as it is precisely the issue that, in the field of semiotics, grants access to the heterogeneity of dimensions in which each semiotic system is involved: the digital challenge is for semiotics a matter of complexity, which quite clearly cannot be solved within the boundaries of textual coherence and homogeneity that have characterized, for strategically legitimate reasons, the founding values of semiotics’ major tradition. Historically, the Greimasian tradition of semiotics dealt with the concept of enunciation in such a way as to keep faith with the Hjelmslevian principle of immanence, making sure that the dynamics of subjectivity in language would not allow the analysis of meaning to get lost in extralinguistic and extratextual circumstances, in which semiotic methodology was right not to get involved: this determined that, when it comes to enunciation, only the marks it leaves in the text would be considered, and that it would remain a “pure non-observable artifact” (Fontanille, 2008), always presupposed and absent. To raise the issue of enunciation today, in search of a more complete assessment of its pertinence, means instead to look at the meaning effects that subjectivity activates in discourse in a layered, rhizomatic way, through a series of instances distributed among different planes of existence, without thereby betraying the principle of immanence by seeking easy explanations in concrete extratextual circumstances.
If the focus on the textual plane of immanence only allows us to account for the realized mode of existence, that is, the objectified products of enunciation that remain in the text as marks of an always presupposed but absent gesture, expanding the perspective to the planes of immanence of objects and predicative scenes means making relevant to the analysis the heterogeneous apparatus of potential, virtual and actualized instances not directly identifiable in the enunciation through an analysis of features within differential configurations: the goal is not to confine the analysis of visual texts to plastic and figurative elements, but to move the image into an interpretative space of instances (not concrete subjects) that redirect the viewpoints on textuality in relation to the various planes from which we decide, in a combinatory way, to lead the observation.
- 3 Fontanille, 2005, p. 3: "la structure matérielle du support, à la manière dont elle offre au destin (...)
By revisiting Pratiques sémiotiques (2008), we intend to retrieve the notion that every text is inscribed in a substrate, an object composed of two faces: one, facing the textual plane, identifies it as a formal substrate and is configured as the space that provides the virtual conditions for the inscription of forms; the other, facing the plane of objects, identifies it as a material substrate and functions as a gateway device to the planes of practices and habits, as it provides, we could say, the conditions for the utilization of the enunciate embedded in a circumstance: the substrate acts as an interface between text and practice, as a material structure that “offers the sender a surface for inscription, and the recipient a surface for decipherment or action”.3
In seeking to expand the analysis outside the boundaries of textuality, the material, embodied, and sensory dimension that shifts the focus from the inscription (the text) to the substrate that embodies it (the object), and consequently to the practices of inscription, constitutes the critical space that enables access to the various dimensions in which semiotic systems are involved, as it directly problematizes the signifying independence of the enunciate. The key implication of the generative trajectory of expression developed by Fontanille is that each level is inherently integrated and integrative in relation to the others: each visual feature is what it is insofar as it is included in a text, which is inscribed in an object, which is used for a practice that is part of a network of habits. This mutual and inseparable integration makes it clear how the digital practices of the image reflect, in an indivisible way, the connections between the units of this chain of planes of expression.
The contemporary status of the digital image already seems in itself to make a rigidly textualist approach problematic: the fact that our daily engagement with images is mediated by a complex dynamic of interfaces, devices, algorithms and networks of images in constant interaction with the environments in which we are situated confronts us with the material properties of visual texts and the uses we make of them as much as with their formal features. It could be said that the question of how we perceive, interpret, and describe images must now, more than ever, be linked to the issue of what we do with images, by including within our analyses the substrates on which visualization occurs. We believe that the case of the digital image can be considered a critical junction, useful for demonstrating the heuristic value of a theory of enunciation that integrates the set of enunciating instances involved in the production of meaning outside the level of the enunciate.
Again, the work of Johanna Drucker (2020) provides a valuable resource for linking visual studies with specific theoretical directives of contemporary semiotics. In particular, it is interesting to observe how her Visualization and Interpretation eventually lands on an evenemential perspective, in the same way in which Claudio Paolucci’s coeval Persona presents an impersonal and evenemential theory of enunciation, built around the notion of event. We think it appropriate to argue here that what these two texts share from this perspective reveals precisely the kind of methodological shift that is required today to adjust the perspectives of semiotic analysis to the visual challenges of digital contemporaneity.
Paolucci notably elaborates on the historical roots of subjectivity in language, pointing out the distinction between two parallel traditions (Paolucci, 2021, pp. 53-59): a dominant one, of Aristotelian origin, which puts the subject that determines the predicate, in the form “S is P,” at the center of the proposition, and a lesser one, of Stoic origin, whereby it is instead the event expressed by the verb that is at the center of the proposition and determines the subject that takes part in the predication, meaning that “It is not the statement that ‘the tree is green’ (S is P) but rather the ‘greening’ of the tree as the spring event, which opens up positions for the subjects and the persons which come to occupy them” (Paolucci, 2021, p. 53). This second, minor tradition was later retraced through Lucien Tesnière’s valential grammar and Charles Sanders Peirce’s logic of relatives in order to develop a relational model of subjectivity in semiotic discourse, which we believe to be crucial here: as subjects, our strategic relationship with the world is not a one-dimensional and ontologically linear process; it is rather mediated through a layered network of instances, of relations that connect us each time with certain and different textual, material and praxical aspects of the things we interact with; we are caught up in a dynamic process of events that reshape our perspective on things in relation to the kind of action they are meant to accommodate, the specific kind of practice that leads them to produce specific meaning. Putting the relation above the entity in this way leads us to abandon the ontological distinctions and the “systemic and static” (Coquet, 1987) approach of the generative tradition, and instead lean toward a perspective whereby the image-object does not simply offer a two-dimensional configuration of plastic and figurative elements to an ideal and crystallized practice of observation.
In a three-dimensional and evenemential perspective, a complex network of visual features, arranged on different planes of immanence, lends itself to a multitude of acts that potentially redefine the instance of subject that interacts with it, whose role can no longer be limited to just that of an observer.
In a very different theoretical and disciplinary context, Drucker similarly argues that the performative and distributed materiality of digital textualities confronts us with the possibility to “shift from an entity-based to an event-based conception of media and demonstrate the radically constitutive, codependent relations of complexity we overlook when we mistake a web of contingencies for a static, fixed, object of intellectual thought” (2020, p. 76). Digital technologies, by producing immaterial objects in a seemingly paradoxical way, allow us to observe postmedial texts regardless of their “is-ness”: as products reshapeable in the form of various types of compression that expose different aspects of them, which assume their role according to the process that embodies them, the practice of use they accommodate, and their commensurability in relation to all the other textualities of the system in which they are integrated.
From these premises, the field in which we would like to operate is that of scientific knowledge production through digital images, a process we might describe as an act of visual enunciation that allows something to be said about a specific state of things, with respect to certain principles of validation, through the use of digitally processed visual data: it will be necessary to show how certain instances, with certain functions and responsibilities, have to converge in these enunciative acts, in order to balance each other in a delicate and complex process that would otherwise risk producing false evidence, and that certainly cannot be described by remaining within the structural dogmas of generative analysis.
The case of scientific images has been addressed in an absolutely enlightening way by Maria Giulia Dondero and Fontanille in Des images à problèmes (2012), a groundbreaking text that demonstrated the need to seek out a new direction for enunciation, whose classical categories borrowed from linguistics proved impractical in dealing with non-artistic texts. We would like to follow in the footsteps left by this essay, aiming to test semiotic suggestions that can combine with themes already addressed, from the potentially problematic standpoint of enunciation in digital materiality.
- 4 The EHT Collaboration et al. 2019/1-4 First M87 Event Horizon Telescope Results. In The Astrophysic (...)
To clarify what we mean, we can briefly take the case study of black holes as an example: the work of Dondero and Fontanille focuses primarily on the process of manifestation that leads from the visible to the visual, that is, on the modus operandi of sign production by which science produces its visualizations, establishing images in tensive structures that mediate between referential adherence and iconic rendition: the visual rendition of a black hole presented by Luminet (1979) gives semiotics some excellent insights in this respect, since it involves the need to account for the processes of visual synthesis that translate complex mathematical equations that are not visual in nature into images, and thus for the virtual instances responsible for these translation processes; however, if we take the more recent case of the image of M87, considered the “first photograph” of a black hole4 and obtained digitally with the help of machine learning algorithms, we believe that the challenge it poses is of a different nature.
What differs on a theoretical level between the challenges posed by these two black hole images is this: while Luminet’s image made it possible to create a visual text from abstract theoretical models, to make a prediction about the appearance of an invisible object from a fictitious viewpoint, the purpose of M87’s image was to produce empirical evidence about the appearance of the same type of object. Building on scarce measurement data, the visual processing of M87 was intended to verify through observation the adherence of observed reality to the theoretical model obtained by Luminet; but how is it possible to observe an object that is by definition unobservable? The Event Horizon Telescope5 project essentially delivered a synthetic, simulated image, but the complex articulation of a network of subject instances, algorithms, data selections, principles for the construction of training sets, and the coordinated work of different research groups made it possible to ensure, with sufficient and measurable certainty, that the resulting image shared with a possible state of affairs in the universe those characteristics that enable us to obtain scientific knowledge. The kind of enunciative configuration that guarantees the scientific validity of the two images is thus different: on the one hand, a network of instances responsible for a process of visual translation across various planes of immanence; on the other, a system of responsibilities to be coordinated so as to be sure that knowledge about the world has been produced in an acceptable way.
On the basis of the work already done on scientific images, we believe that the digital image, like indeed any type of image in the context of the specific characteristics determined by the nature of its materiality and its practical uses, speaks to us and allows us to do things through a configuration of instances that is unique, and that can be fascinating to reconstruct.
As already noted, the contemporary condition of images is directly intertwined with the digital devices and environments we commonly use, and raises fundamental questions about the nature of what can be defined as a postmedial visuality. In his book Capitale Algoritmico, Eugeni (2021, pp. 12-14) describes the specific nature of digital images, in relation to the visual practices established by the painting and printing traditions, from the perspective of a progressive integration between different economies, where by economy we mean a set of rules and procedures for the circulation of cultural objects. If, previously, an economy of the visual used to regulate the circulation of images in the form of objects such as paintings or photographs, with the development of devices such as the cinematograph and television this became intersected with an economy of light, since images could now be materially conveyed, following a process of conversion, as a type of light information employing electromagnetic energy; the integration of this new substrate implied the establishment of different actors and institutions, which regulate what can and cannot be transmitted, as well as new uses that are allowed or restricted. The fundamental shift for contemporaneity, however, occurred with the integration of the layered domain of the visual into the information economy, which took place with the development of computer technologies and introduced a multitude of new subjects responsible for the exchange, gathering, manipulation and use of large amounts of data, something we see growing constantly and posing more and more ethical and political questions. Ultimately, the image can now be translated into numerical values, and thus takes on a new material constitution.
With the advent of digital tools, images take on new properties in terms of usability and transmissibility that impose new statuses in addition to those previously stabilized with non-computer substrates; but one feature that has particularly altered the scenario for contemporary visual practices concerns a new scale of commensurability among the countless images that end up coexisting in the same digital environment. The existence of images in the form of data, that is, of numerical values, allows them to be compared with each other, to be clustered according to relationships based on similarity, to build collections, and to train computational systems to recognize or produce other images coherent with the visual features attributed to a specific cluster of occurrences. The possibilities unlocked by the gathering and employment of these enormous datasets have resulted in the emergence of a whole new paradigm of information regimes, given the need to regulate and limit the acquisition and circulation of data in a cultural, political and epistemological scenario in which decision-making processes and ways of generating knowledge are heavily influenced by the constant flow of information to be gathered, classified and used. It is within this scenario that the case study chosen for this paper takes place: the social network INaturalist.6
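Before turning to the case study, this commensurability can be sketched minimally (the feature vectors below are invented for illustration; real systems derive them from trained networks): once images exist as numerical values, any two of them can be compared, and clusters emerge from similarity alone.

```python
import math

# Schematic sketch of images made commensurable as data: each image is
# reduced to a numerical feature vector (values invented here), and any
# pair can be compared by cosine similarity to form clusters.
features = {
    "img_a": [0.9, 0.1, 0.2],
    "img_b": [0.8, 0.2, 0.1],
    "img_c": [0.1, 0.9, 0.8],
}

def cosine(u, v):
    """Similarity between two feature vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

# img_a is far more similar to img_b than to img_c: a minimal "cluster".
assert cosine(features["img_a"], features["img_b"]) > cosine(features["img_a"], features["img_c"])
```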
INaturalist is a project developed within the academic field of biodiversity research and is presented as a crowd-sourced species identification system, that is, a platform through which one can obtain identifications of animal and plant species through a recognition process that combines computer vision technologies and crowdsourcing practices: users access the platform via web or mobile devices, upload an image of the species to be identified, enter details about the location and time of the observation, and obtain a list of possible identifications through an automatic recognition process. Subsequently, the automatic identification goes through a manual vetting process by the community, which returns to the platform certified recognition data at an appropriate level of reliability (Research Grade).
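The two-stage process just described can be rendered as a simplified sketch; note that the threshold used here (at least two community identifications, more than two-thirds in agreement) is our simplifying assumption for illustration, not INaturalist’s exact policy.

```python
# Simplified sketch of automatic suggestion followed by community vetting.
# The agreement rule (>= 2 identifications, > 2/3 agreement) is an
# illustrative assumption, not the platform's actual criteria.
def vet_observation(auto_suggestion, community_ids):
    """Combine an automatic suggestion with community identifications."""
    if not community_ids:
        return auto_suggestion, "Needs ID"
    top = max(set(community_ids), key=community_ids.count)
    agreement = community_ids.count(top) / len(community_ids)
    if len(community_ids) >= 2 and agreement > 2 / 3:
        return top, "Research Grade"
    return top, "Needs ID"

# The algorithm proposes a species; three community members confirm it.
label, grade = vet_observation("Fulica atra", ["Fulica atra", "Fulica atra", "Fulica atra"])
assert grade == "Research Grade"
```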
The impact of platforms such as INaturalist on scientific research can easily be observed by consulting the Global Biodiversity Information Facility,7 a network funded by government organizations and international research institutes, dedicated to collecting and providing accessible and traceable datasets from a wide variety of sources according to certain information technology standards, so that they can be included in a vast and stratified network of taxonomic and distribution data on animal and plant species. Through this platform, it is possible to retrace the academic studies and publications that have made use of a particular dataset, and it can be observed that the data provided by applications such as INaturalist, which are characterized by a remarkably high number of occurrences and a wide geographic distribution, have proved useful for various types of surveys, such as studies on the distribution of invasive animal species, records of new species, rediscoveries of species that had gone unnoticed for decades, and habitat variations due to climate change.8
In approaching this case study from a semiotic point of view, we believe the centrality of the plane of immanence of practice is evident, since the process of sense-making through the features of visual textuality does not involve the figurative elements of the single image produced by the user (the realized component of enunciation), but considers them almost exclusively to the extent that they engage with a virtual audience of other images according to relations of similarity and difference, to produce a kind of differential knowledge and to do something with visuality through the combined action of a plurality of enunciating instances.
From an open and dynamic perspective, the way applications such as INaturalist function and are used in science provides an excellent exemplification of the capabilities of the encyclopedic model illustrated by Umberto Eco (1975; 1984), and demonstrating this assumption may prove useful in illustrating how an evenemential theory of enunciation, which describes the transitions between modes of existence that coexist in an enunciate, integrates with this specific type of visual practice to return an organic set of meaning outside the level of the text.
Following Eco (1983; 1984), we can conceptualize the encyclopedic semantic model as a system for processing the interconnected way in which various semiotic systems operate (Eco, 1983, p. 339), as opposed to an intensive, dictionary model of Aristotelian origin. Porphyry’s tree, an expression of this Aristotelian conception, takes the form of a tree structure that places distinct substances in a hierarchical construction by means of hyponymy/hyperonymy and difference relations, based on essential differences, which allow substances to be distinguished at a deep level and by their fundamental qualities, as opposed to specific differences, which instead distinguish between essences by accidental and superficial properties. The dictionary model adheres to this Aristotelian conception of substances, since each essence is virtually described in its entirety by a category of belonging that sets it apart from that which is fundamentally opposed to it. In the field of taxonomy, in which INaturalist’s cataloguing of different animal and plant species is situated, a similar model can be found in phylogenetic trees, that is, diagrams that allow animal and plant species to be distinguished on the basis of evolutionary filiation relationships, in which common ancestors are recognized and species are distinguished substantially, at least on the basis of differences in genetic heritage.
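The rigidity of such a dictionary-style model can be rendered as a toy sketch (the taxa and the binary “essential differences” are invented for illustration, with no taxonomic accuracy intended): every entity receives exactly one position by walking a fixed hierarchy of differences.

```python
# Toy sketch of a Porphyrian, dictionary-style classification: a rigid
# tree of "essential" differences in which each entity occupies one and
# only one position. The differences chosen here are illustrative only.
def classify(entity):
    """Walk a fixed hierarchy of binary differences to a single category."""
    if not entity["animate"]:
        return "mineral"
    if not entity["vertebrate"]:
        return "invertebrate"
    return "bird" if entity["feathers"] else "other vertebrate"

coot = {"animate": True, "vertebrate": True, "feathers": True}
assert classify(coot) == "bird"
```

Each branching presumes that one difference suffices to separate essences cleanly, which is precisely the presumption the next paragraph calls into question.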
However, this strong model of knowledge soon proves structurally unsustainable, since the differences between substances can hardly be elaborated in the form of belonging to distinct categories among the densest branches of Porphyry’s tree: what allows us to distinguish between substances always corresponds, evidently, to the kind of standpoint from which we ask things to differ, leaving us with the framework of a complex system of differences that cannot be included in a single dictionary of substances. The need for an encyclopedic model, inherently based on accidental differences and thus an expression of weak knowledge, comes from the fact that dictionary models come into crisis when trying to classify the irreducible complexity of substances: we are always dealing with oriented manifestations of things, interpretations (immediate objects) of things (dynamic objects) that remain abstract constructs, which we can only elaborate from a given point of view.
The encyclopedia, conceivable in this sense as “the whole of the already said,” “the recorded set of all interpretations” (Paolucci, 2021, p. 16), is an excellent framework for describing the kind of knowledge that computer vision technologies produce, since such knowledge is based on huge databases of superficial differences, such as the photographs that users upload to INaturalist. Machine learning technologies provide access to a catalog of accidental evidence that ideally tends to account for the complexity of substances by inductive means: the functionality of the system depends on its ability to recognize everything that can be said about an object as referring to the image to be described, from different and infinite points of view. Image recognition applications use visual datasets as collections of different viewpoints on things, sets of interpretations that, in their partiality, manage to function as heterogeneous fragments (immediate objects) of an unreachable signifying wholeness (the dynamic object), about which something can nonetheless be predicated: INaturalist does not possess any abstract and complete model of Fulica atra, a phenotype that would allow the curious observer to know what species the duck they spotted and photographed belongs to, but rather a vast set of very similar visual experiences, that is, photographs that highlight the same partial taxonomic attributes of the same species, which allow a recognition to be abduced with a relative and measurable degree of certainty.
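The abductive mechanism just described can be sketched in a few lines of code. What follows is a deliberately minimal illustration, not INaturalist’s actual algorithm: the species labels, feature sets, and overlap-based similarity measure are all invented for the example. The point is only that recognition emerges from ranked comparison with a corpus of recorded observations, and that it comes with a measurable degree of confidence rather than a categorical verdict.

```python
from collections import Counter

def recognize(query_features, observations, k=5):
    """Rank recorded observations by similarity to the query image's
    features and abduce the most likely species, together with a
    confidence score given by the share of agreeing neighbors."""
    # Similarity here is a naive overlap count between feature sets.
    ranked = sorted(
        observations,
        key=lambda obs: len(query_features & obs["features"]),
        reverse=True,
    )
    votes = Counter(obs["species"] for obs in ranked[:k])
    species, count = votes.most_common(1)[0]
    return species, count / k

# Toy "encyclopedia" of recorded interpretations (immediate objects):
db = [
    {"species": "Fulica atra", "features": {"white bill", "dark plumage"}},
    {"species": "Fulica atra", "features": {"white frontal shield", "dark plumage"}},
    {"species": "Anas platyrhynchos", "features": {"green head", "yellow bill"}},
]

print(recognize({"white bill", "dark plumage"}, db, k=3))
# → ('Fulica atra', 0.666...): a probable hypothesis, not a certainty.
```

No abstract model of the species exists anywhere in this sketch; the “knowledge” of the system is exhausted by its archive of partial observations and the confidence it can compute over them.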
- 9 Eco, 1983, p. 340: "sottomette le leggi della significazione alla determinazione continua dei conte (...)
The kind of knowledge generated through these systems is not closed and definitive, but tentative, accidental, “weak” in the Echian sense, since it “submits the laws of signification to the continuous determination of contexts and circumstances”9. In this sense, the quantitative models offered by computer technologies have proven to be sustainable and highly functional systems for processing the complexity of reality, filtered through visual texts encoded in computer language: if the visual complexity of things submits empirical knowledge to the organic and subjective limits and partiality of local viewpoints, contexts and circumstances, then a dynamic, distributed and performative approach is needed, one that takes all possible viewpoints into account at the same time, in order to reconnect the image with the virtualities that determine its meaningfulness, in an open and heterogeneous way.
It is now a matter of asking what the specific “weakness” of these systems depends on: which variables can potentially undermine the scientific validity of a dataset, and which allow us to guarantee it. This kind of process, we believe, expresses the specific interpretative approach that semiotics can offer to scientific research; if the Echian tradition of interpretative semiotics is now advocated by Paolucci through the question “How can we come to know the world through signs and languages?” (Paolucci, 2021, p. V), then in the context of using image databases and computer vision tools for science the question can be shifted as follows: how can we obtain valid and functional evidence about the world through collections of images, and through processes that can extract regularities and differences from them?
As the work of Sabina Leonelli (2016) has shown, the impact of big data has in many ways radically revolutionized the methods of scientific research: having large datasets at our disposal has provided an excellent opportunity to broaden the scope of what can be demonstrated empirically, but it has also brought with it epistemological problems regarding the nature of data. We are accustomed to attributing objective qualities to data, as if they spoke for themselves and could adequately represent reality as factual evidence, given that even at the etymological level “data” indicates something that precedes interpretative intervention, an “a priori” and unconstructed entity. However, despite the centrality of data in every contemporary social situation and the scientific faith placed in the systems built upon them, data cannot function apart from a human construction, an interpretative environment, or a research or application purpose that allows their use. In particular, Leonelli speaks of data journeys to describe “the material, social, and institutional circumstances by which data are packaged and transported across research situations, so as to function as evidence for a variety of knowledge claims” (Leonelli, 2016, p. 5), thereby depicting a process by which a dataset may be constructed to be representative of, and to meet, a specific need in specific instances, only to end up being used in different contexts for which it was not originally conceived, leading to possibly serious epistemological problems.
Far from being sets of evidence that return a state of affairs in a definitive way, data fall under the notion of a relational category, by which they are always and in every case defined by “who uses them, how, and for which purposes” (Leonelli, 2016, p. 78): as we have already mentioned in the case of the image of M87, in the production of knowledge through computational technologies we are always faced with a delicate tension to be balanced between the claim of objectivity and the need to make the data adhere to the specific type of classification we want to ascribe to them, to the type of features we want to abduce from their intrinsic complexity. Datasets are encyclopedic conglomerations that can be optimized to describe, quantitatively and probabilistically, certain attributes of a state of affairs, but their mobility among research projects means that they can be incorporated into practices for which they were not originally designed, and thus lead to erroneous conclusions when the variables leading from data to interpretation are not consistent with one another. If confidence in the soundness of logical-mathematical processes leads us to believe in the objectivity and autonomy of data organized in a computational system, an interpretative and probabilistic mentality should alert us to the extent to which these systems are, by necessity, dependent on a range of circumstances that constitute different views of substances; the nature of enunciation as an evenemential act ultimately consists in this exercise in interpretative complexity, for it reveals how at the root of every text resides a chaining of subjective instances, that is, factors directly implicated in the semiotic function but excluded from its formal surface.
In the case of INaturalist we are faced with a process of collecting and processing visual data that allows a computer vision system to be trained to recognize taxonomic features and provide classification hypotheses. When a user uploads an image to the platform, the recognition hypotheses provided to them rest on probabilistic principles built on the huge database of images on which the system is trained, and they take into account a wide variety of factors that are not directly visual in nature, such as correctly classified observations made at similar locations and times, which make it more likely that a given species is observable in a given context. The computer vision system, moreover, is trained on the very images that users provide to it, and is therefore optimized to work properly on images that reflect the visual and interpretative habits of observers; but it will also carry a number of biases determined by the unbalanced number of observations made in specific geographic areas, the accessibility of specific locations and varieties of environments, the optical limitations of the devices through which the images are obtained, and the degree of camouflage and caution toward humans of a given species.
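A toy sketch can make concrete how non-visual factors may reweight a visual hypothesis. Everything here is an assumption made for illustration: the species scores, the nearby-observation counts, and the weighting formula are invented and do not reflect INaturalist’s actual model. The sketch only shows the mechanism described above, namely how a contextual prior built from previous observations can overturn a purely visual ranking.

```python
def combined_score(visual_scores, context_counts):
    """Weight each candidate species' visual score by a contextual
    prior derived from how often it was correctly observed nearby,
    at a similar time of year (counts are hypothetical)."""
    total = sum(context_counts.values()) or 1
    return {
        sp: score * (1 + context_counts.get(sp, 0) / total)
        for sp, score in visual_scores.items()
    }

# Hypothetical vision-model outputs: the picture alone slightly
# favors the moorhen over the coot...
visual = {"Fulica atra": 0.48, "Gallinula chloropus": 0.52}
# ...but past observations in this area overwhelmingly report coots.
nearby = {"Fulica atra": 90, "Gallinula chloropus": 10}

ranked = sorted(combined_score(visual, nearby).items(),
                key=lambda kv: kv[1], reverse=True)
print(ranked[0][0])
# → Fulica atra: the contextual prior flips the visual ranking.
```

The design point is that the “evidence” produced by such a system is never a reading of the image alone: it is a negotiation between what the pixels suggest and what the archive of prior, situated observations makes probable.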
Far from being from the outset a representative catalog of the taxonomic complexity of the animal and plant world, INaturalist needs, in order to function, to make use of systems such as ranking the quality of observations by means of a Research Grade, a label that can be assigned through the active intervention of an expert who, on the basis of his or her experience, provides assurance that a classification is correct: only images with this label are actually used to constitute the open source database that can be accessed through the Global Biodiversity Information Facility, so as to ensure that the kind of bias present in the system trained locally for the functionality of the platform is not reflected in other research contexts that use it.
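This quality-filtering step can be rendered as a minimal sketch. The field names (`grade`, `research`) are hypothetical and do not reproduce INaturalist’s or GBIF’s actual data model; the sketch only illustrates the gatekeeping logic by which expert-confirmed observations are separated from the rest before the data travel to other research contexts.

```python
def exportable(observations):
    """Keep only observations whose identification an expert has
    confirmed, mirroring a 'Research Grade'-style quality filter
    applied before data are shared with external repositories."""
    return [o for o in observations if o.get("grade") == "research"]

# Illustrative records; only expert-confirmed ones survive the filter.
obs = [
    {"id": 1, "species": "Fulica atra", "grade": "research"},
    {"id": 2, "species": "Fulica atra", "grade": "needs_id"},
    {"id": 3, "species": "Eschscholzia californica", "grade": "research"},
]

print([o["id"] for o in exportable(obs)])
# → [1, 3]
```

Semiotically, the filter is where a human, institutional instance of enunciation intervenes in the data journey: the label does not change the image, only its authority to serve as evidence elsewhere.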
The functionality of computer vision systems such as INaturalist’s is the result of a series of instances that act to make the collection of visual data adhere to the type of practice in which they are to be activated, since we can envision the enunciative configuration of this process as a series of voices speaking within the system: the ensemble of users who produce the visual enunciates, the software developers who apply certain probabilistic principles in order to process the data numerically, the experience and skills of the experts who carry out manual recognition, the information technology standards set by the Global Biodiversity Information Facility to ensure the high quality of the required data, and, again, the values of objectivity that the scientific community requires in order to recognize the usability of a given piece of empirical evidence.
- 10 Eco, 1983, p. 340: "L’enciclopedia non fornisce un modello completo di razionalità (non rispecchia (...)
Big visual data present themselves to us as raw assemblages, tiny fragments of evidence that confront us with a weak and variously circumstantiated type of knowledge, but this imperfect and limited nature of automatic processes should come as no surprise; each type of interpretation, after all, has no other tool than the records it possesses, that is, the experiences, norms and habits of a community of speakers, with which to recognize a probable explanation according to certain criteria of reasonableness. Quoting Eco again, “The encyclopedia does not provide a complete model of rationality (it does not unambiguously reflect an ordered universe) but it does provide rules of reasonableness, that is, rules for negotiating at every step the conditions that allow us to use language to account, according to some provisional criterion of order, for a disordered world (or one whose criteria of order escape us)”.10
To sum up, we have argued in this paper that when an image enters the universe of data, by virtue of a computational notation that defines it on the material level, its conditions of signification must be reconsidered in relation to what the semiotics of texts has regarded as closed and independent: the morphological features of a visual text, once digitized, reveal how profoundly dynamic, varied, and mutable the processes that integrate textual objects are, as they unfold as an endless cycle of reuses and redefinitions of the same visual features, for different purposes, situations, and scopes of knowledge. The visual enunciates embedded in these processes are not autonomous and independent objects of analysis, but always caught in a network of users and applications, instances open to redefinitions and reuses that select in various manners the characteristics to be regarded as pertinent. The complexity of the data that can be processed thus dismantles the homogeneity of the semiotic object, and leads us to reconsider things not for what they are, but for their infinite modes of being, that is, for what they can mean to us, with respect to points of view that it is up to us to variously combine, and with respect to the possible demands for knowledge that we are capable of expressing.
The apparent immateriality of digital images should rather be conceptualized as an open and dynamic materiality, constantly renegotiating its modes of existence depending on the kind of process we intend to establish with a notation that can be translated into visual information: this description is particularly consistent with an evenemential conception of enunciation such as the one elaborated by Paolucci, since it underlines the need to understand each text within a practice that defines it as an object, grasped in a configuration of instances that allow it to be understood in a given semiotic function.
Visual features such as the size and shape of the petals that can be observed in a photograph of an Eschscholzia californica are of absolutely no relevance if the image database into which it is entered is not trained to recognize them as characteristic, but they become relevant the moment there is a need to acknowledge them as taxonomic features, that is, to make those features mean something by distinguishing them from what is different from a given point of view: computer vision technologies confront us with images that take on semiotic properties depending on the environment in which they are placed, that acquire new perspectives of meaning depending on the type of information to be obtained from them, and that take on different values depending on the degree of commensurability they establish with other images. If generative semiotics has developed methodologies that rely on an image’s signifying autonomy, it is clear that new methods are needed to account for the dimension of the image’s dependence on everything that transcends it as a text supported by internal values.
For this reason, we believe it is essential to emphasize the need for a material turn in visual semiotics, understood both as a shift in perspective concerning the proper role of enunciation in analysis and as a recognition of the need to develop methodologies able to trace the heterogeneous configuration of material and institutional subjectivities at work in practices involving the use of images, bringing to light the complex network of variables that can lead visual texts to produce meaning.
As we become aware of the specific interpretative relationship that the present day is establishing with reality through data science, it becomes important to raise questions about the kind of knowledge we are constructing, a knowledge that involves a multiplicity of instances pulsing behind the ways in which we use images in the form of huge encyclopedic archives: this can be a great opportunity for semiotics, both to free itself from the theories that still tie it to linguistic theories of the subject, and to develop tools that allow for evaluations and critical insights into the possible inappropriate uses of visual information, in terms of reasonableness and coherence with respect to their intended uses.