Why pictures?
Chapter Two: Why Nonhuman Pictures? With Trevor Paglen and Joanna Zylinska
Imaging and Imagining the Real

Five Provisional Statements

This short essay attempts to put in place some key points that aim to analyze and dissect practices and approaches to documentary photography in relation to an increasingly complex, interconnected and over-visualized world. It takes as a starting point how new positions and trends in documentary photography make use of what I would like to call expanded narratives to define their impact in the construction of mediated realities and their consumption.
In the following statements I draw on the input given by Joanna Zylinska and Trevor Paglen and argue that we ought to see photography as a catalyst for the construction of different ways of looking at the world. The focus is on contemporary documentary practices, analyzed in terms of the new forms of production and dissemination they use to dismantle established ideas about the linear relationship between reality and representation. I invite you to take a closer look at some of the options that are available to documentary practices to build a visual language that has the potential to lend legitimacy to our relationship with the world. Paradoxically, in a post-truth age, such a visual language allows for what can perhaps still be called an ‘authentic’ imagining practice.

We need to make a cognitive effort towards machine-mediated pictures of the real. Humans are no longer central to the act of taking a photograph. As a result of technical advances, various devices are now able to create images of ‘the real world’. But it is important to stress that humans remain at the center of the complex relationship between vision and representation. Machines are able to produce, read and distribute images autonomously. However, they are not yet able to form autonomous narratives regarding their own observations. What machines provide is an expansion of the possibilities of vision. Expanded vision in turn provides a different way to mediate social, economic, cultural and psychological processes. By making use of satellite imaging or medical body scans, for instance, humans can document both past and present; they are able to collect evidence of a ‘reality’ which is not yet empirical. As a consequence, our attention needs to shift from the medium to a critical reading of the possibilities and limitations of generating meaning. We need to analyze how technical apparatuses can contribute to the process of documentation while continuing with the critical debate about the way this kind of imagery intersects with non-fictional narratives.

We need to take advantage of what the internet has to offer. If we are to make sense of documentary approaches in the context of today’s networked image, we need to acknowledge the challenges of complex networks. In our universally connected world, with its non-linear access to information and its plurality of voices and sources, polarization and cyber-balkanization are ubiquitous phenomena. It becomes ever more important to analyze the way in which the visual is created, modified, post-produced, re-contextualized and distributed. The practices of editing, transforming and mixing offer possibilities for new narratives that – while allowing for factual documentation – shift the focus to a process of reproduction rather than production, where available content becomes the core of visual representation strategies. Emphasis will have to be on the process of narration. Through the use of different types of image – from vernacular photography to computer-generated imagery, memes and emojis, news footage, data visualizations and user-generated content – complex narratives can be created to present a comprehensive and aesthetically coherent process of documentation, as suggested, for example, by the internet-based investigative work of the Bellingcat collective.

We need to embrace interactivity. Navigating and experiencing the complexity of the visual in the context of networks is becoming an increasingly difficult cognitive, philosophical and psychological challenge. We are all fully immersed in a non-linear narrative as we simultaneously experience both the physical and the virtual worlds. Virtual reality, augmented reality, augmented photography, gamification and mixed reality offer different sensorial experiences and help to blur the boundaries between medium and content. These new paradigms enable new strategies for documentary engagement and awareness building. They also encourage the creation of complex narratives within the mix of visual languages and media. The way we consume narratives has shifted from a passive to a more and more active position: we select, we interact, we construct sense and meaning. The context of the networked image fully embraces the new relationship that occurs between the visual and the audience by fostering the creation of immersive experiences across different platforms. Take as an example sensorial forms of media immersion where audience members are actually able to play a part in the story. The choices of the audience will trigger narrative turns. We live in a world where we experience a multitude of identities and acts of positioning. Accordingly, documentary narration needs to be designed from a user-driven perspective and modelled on the example of social activism, which is able to produce new social frameworks and to challenge existing narratives by using a specific hashtag or meme.

We need a plurality of voices, but we also need to contextualize them. Visual representations are direct, fast, easy to produce and accessible to large audiences in the networked world. Many people welcomed the move towards democratization that this process seemed to entail. However, the ability to generate narratives, on the internet and in traditional media, is a double-edged sword. It potentially subverts the process of generating meaning. In our post-truth era we have witnessed the emergence of ‘alternative’ realities based on rumors, misinformation, disinformation and propaganda. While a plurality of narratives is a major asset of our democratized world, we also need to provide tools for reading, contextualizing and critically reflecting on the visual. We need to avoid falling into the trap of an all-too-easy remixing of visual narratives. As I write this text, Photoshop has just announced the release of a beta version of its new Neural Filters. These filters are based on machine learning and allow users to perform a multitude of image manipulations, from ‘skin smoothing’ to ‘smart portraits’ that perform GAN-based processing on various facial elements. Apps and software that allow this type of editing have been available for a while, and Photoshop’s new filters are far from perfect. But the fact that the market leader in photo-manipulation is now offering everyone – even my little sister! – the ability to produce realistic alterations to any type of image represents a major shift. It is becoming increasingly difficult to clearly identify the differences between human and non-human agency. This means we need to place more emphasis on how images are used and less on how they are created.

We need to take the time to process information. The way in which images are distributed via networks implies different approaches to their production, just as the time taken to react to and critically analyze social events, news stories and happenings differs. In an era of speed and acceleration, documentary approaches call for a counter-trend that favors complexity and long-termism. At the same time, the documentary photographer’s narrative needs to enter into a direct relationship with the multiplicity of visions that constitute it and with visual coherence at the level of language. Speed goes hand in hand with accessibility: the concepts of virality, real-time, live streaming and performativity permeate the way the world is narrated and, at the same time, are subordinated to technological mediation. Whether confronted with a trending hashtag on Twitter or fully immersed in what Hito Steyerl called “bubble vision” in a 2018 lecture at the University of Michigan, it is essential to acknowledge the implications of the speed with which these cultural objects permeate society. Production times for visual materials have changed; our reaction times need to change too.

This text has been edited for publication in Why Pictures? and was first published in: Salvatore Vitale, “Imaging and Imagining the Real. Five Provisional Statements”, in: Post-Photography (= Nummer 10), eds Wolfgang Brückle and Salvatore Vitale, Luzern 2021.

The Node
The work was created using photogrammetry techniques and visual filters constructed on the basis of GAN neural networks. It depicts a conversation between two bots against the backdrop of a journey through the city. The bots are an allegory of the GAN network – one is an image maker (Generator), the other recognizes falsehood and truth (Discriminator). The conversation revolves around the structure of the film itself as well as the role of poetry, the city and the infrastructure of the network. The film directly addresses the idea of the multidimensionality of cinematic experience in the city – simultaneous textual, audio, and visual messages originating both in nature and electronic devices. All these layers overlap and intermingle, creating a new form of image.
Chapter One: Why social pictures? With Nathan Jurgenson
Thank You for Watching My Art Online
Algorithms Without Vision

The change in the social functioning of photography is a fact, but the underlying image-distribution networks are not neutral. If we agree that software is part of the apparatus responsible for the production and circulation of images, we cannot forget that it does more than improve communication with our relatives and friends. It is also part of an extractivist logic that is crucial for the functioning of contemporary communication networks, extracting information about our behavior and emotions from the content we create – images included – but also from the digital traces we unknowingly leave behind.

Of course, the alliance between photography and surveillance systems, and even more so the classification practices used by various centers of power, are nothing new. And yet it would be a cliché to suggest that the algorithms monitoring the global circulation of images remind us of less noble uses of photography than communication. The point is rather that it is impossible to look at them that way, as they remain, to a large extent, black boxes. And yet we know that how algorithms see us matters, because they watch us more often than humans do. They are part of a more complex cascade of gazes, influencing, in turn, what we are shown.

Since image-recognition systems are non-transparent, what remains are reverse-engineering experiments. As a simple exercise, I uploaded Marta Ziółek’s opening images for this series to several web services. Google Cloud, which “derives insights from your images,” doesn’t know how to deal with these images at all. The braids turn out to be earrings. But that’s not the only problem, because the algorithms not only recognize people, items of clothing, and objects, but also classify emotions. The standard set used by the algorithm (joy, sorrow, anger, surprise) is insufficient – the listed emotions turn out to be “unlikely” or “very unlikely,” and only “surprise” from the top photo is “possible.” The algorithm is confused, unable to cope with the classification. And yet, given the growing importance of such automatic emotion-recognition systems, such confusion, not to mention possible errors, can have real consequences. By the way, it is this problem that was ridiculed by the researchers at Dovetail Labs, who created emojify.info, a website that allows you to have a “face duel” with the algorithm. It’s worth checking out, to see how clumsy the models can be when they assume that the deviation of the corner of the mouth or the position of the eyebrows can precisely define our mental state.

Such mistakes probably have a greater impact on our imagination than the helplessness of Google Cloud – especially since all sorts of biases are revealed. This is well illustrated by the experiment with PimEyes, an algorithm-based service that reportedly does a record-breaking job in finding similarities between uploaded images and photos on the web. Its business model is based on an image-control service – the idea is to search for images that resemble our own, and possibly allow us to intervene when they are used without our consent. The problem is that once Marta’s photos are posted to PimEyes, the screen is flooded with porn. Algorithms don’t understand context, and they associate a woman with parted lips and an outstretched hand with pornography – perhaps the only form of transgression known to software (we can spare ourselves jokes about the sexism of the IT industry). Unlike in Rob Wasiewicz’s work, there is no room for humor or irony here – no casseroles or giant women devouring subway cars. Moreover, in an attempt to better understand the logic behind such choices, I took a selfie while emulating Marta’s gesture – lips parted, hand outstretched. Less than a second after submitting it to PimEyes, the screen was covered with aptly chosen photos of myself. The only mistakes depicted guys similar to me, in public speaking situations. So much for biases. You know: a guy with a beard and glasses usually opens his mouth in order to say something into a microphone; a woman – to subordinate herself to male satisfaction.

Why does this matter? First of all, because just as subway cars run on rails (let us try to bring into this gloomy argument a bit of the humor that machines lack), the images circulating among us are, to a large extent, directed by similarly automated software. The limits of its imagination become our horizon. Secondly, as Vladan Joler and Matteo Pasquinelli write, such an algorithmic “undetection of the new” condemns us to look in everything for what already was: always finding well-recognized patterns, and thus repeating old mistakes. That is not very useful in times of crises that may have no precedent in history (not to mention that it sustains “good old” sexism).

In her latest, excellent book Atlas of AI, Kate Crawford presents her journey through the places that reveal the backstory – usually invisible to us – of the functioning of new, “smart” technologies. The author visits lithium mining sites, but also the archives of the government agencies that make mugshots of arrested individuals available to the cybercorporations that use these images to train facial-recognition algorithms. In a poignant account, reviewing photos of people at difficult moments in their lives, Crawford shows the effects of the lack of broader discussion of the issue, denouncing “the unswerving belief [of the tech sector] that everything is data and is there for the taking. It doesn’t matter where a photograph was taken or whether it reflects a moment of vulnerability or pain or if it represents a form of shaming the subject. It has become so normalized across the industry to take and use whatever is available that few stop to question the underlying politics.”

Crawford – probably known to photography enthusiasts from her collaboration with Trevor Paglen, whose projects touch upon, among other things, automated vision systems – writes about a paradigm shift different to that described by Nathan Jurgenson. It is the shift from image to infrastructure, where context once again ceases to matter – the stripped-down images are thrown into an immaterial machine which squeezes out the data that allows the system to function. It is the grim reverse of the process of socialization. From this perspective, it seems important for creators to regain control over their images. Without that it is difficult to speak of the true democratization of photography and of the growth of its social dimension. Even if, for most of us, these disturbing processes remain invisible – or, as I mentioned, precisely because of it.

I open my mouth

I open my mouth. I let my lower lip relax and drop. I relax my jaw, my cheeks. I close my eyes. I turn my eyeballs toward the back of my skull. I feel ripples all the way from my tailbone to the back of my head. My body is all in motion. I feel it from the base of my feet to the root of my tongue. My tongue droops, my hands reach out, opening my body, zooming in and negotiating space. I allow my eyelids to open. I see and feel through my skin. By way of my tongue, it emerges from my mouth. I feel a vibration down my body.

Today, our epidermis, the mask we wear, and the air we breathe, have become the new established boundaries. By revisiting the basic choreography of the mouth and the physiology of the female body, I mediate historical gestures, questioning the violation of bodily boundaries, the kinetic and the tangible in the image. My body is frozen in gesture, between one bodily movement and another.

Credits

  • Costume: Joanna Hawrot in collaboration with Rafał Domink, Photo: Karolina Zajączkowska
  • Series Curated by: Krzysztof Pijarski & Witek Orski

    Although from its inception photography promised to be a democratic medium, circumventing the hierarchies of skill, style, or culture, this potential, like that of a latent image, remained unrealised. The necessary knowledge of chemistry and optics, the prohibitive cost of equipment, the laboriousness of the process, and, finally, the skill required to operate the entire apparatus were the initial stumbling blocks. Over time, such obstacles became less of a hindrance, while the photographic gesture became more and more commonplace. One could argue that it was only after 2007, with the invention of the smartphone—a miniaturised, pocket-sized computer equipped with a phone and camera module—that photography became truly ubiquitous. The parallel development of photo-processing software, including advances in machine learning, colloquially known as artificial intelligence (AI), led to the fact that today everyone is not only capable of taking technically correct pictures, but actually does so on a daily basis.

    If photography really does have democratic potential, that potential does not necessarily lie in the photographic gesture itself, in its universality or ease. It should rather be sought in contemporary image-distribution networks. It has never been easier to reach thousands or even millions of other people with your message. This ecstasy of communication, however, is accompanied by ever-increasing anxiety, triggered by the awareness that this ease of participation in the global circulation of images is concomitant with ever more draconian attempts at controlling, curtailing, and censoring it. And, what’s more, with the knowledge that an increasing number of images are not only not made for people, but also not made by people. Images, in their multitude, are establishing apace an autonomous, global republic of their own.

    In Why Pictures?, we aim, in concert with contemporary theorists and practitioners, to explore this global republic of images in search of the democratic potential of photography. In the sphere of social media, where and how is a common cause established, and a community formed around it, through the sharing of images? When is a collective good felt to be at stake? Is the autonomous character of the republic of images analogous to that of the current modalities of capitalism? If so, could such autonomy, paradoxically, empower the agency of images? And, to take this further, can photography play the role of a universal language in a contemporary world increasingly dominated by particularisms? Can it be a common space for dispute, iconoclash? These, among others, are the questions we would like to ask.

    The Why Pictures? platform was designed by Kaja Kusztra.

    Programming by Stanisław Rojek.

    The series is co-organised by the Krakow Photomonth Festival; View. Foundation for Visual Culture; Jasna 10. The Warsaw Cultural Centre of Political Critique as a part of ‘Centrum Jasna,’ financed by the Municipality of Warsaw; and the Visual Narratives Laboratory at the Film School in Łódź.