UW News

May 20, 2025

“Ways of Knowing” Episode 1: Digital Humanities

English, philosophy and comparative literature aren’t typically subjects that come to mind when thinking about big datasets. But the intersection between literature and data analysis is exactly where Anna Preus works.


Ways of Knowing

The World According to Sound

Season 2, Episode 1

Digital Humanities

[poetry reading begins]

Chris Hoff: British modernist literature is dominated by a few names: William Yeats, T.S. Eliot, Ezra Pound.

[reading continues]

CH: The story goes that these were the most important, famous poets of their time. And they’ve continued to be revered through the decades. But there were a host of other writers at the time who were also famous, but whose work was never part of the canon.

[reading of Tagore begins]

Who are you, reader, reading my poems a hundred years hence?
I cannot send you one single flower from this wealth of the spring, one single streak of gold from yonder clouds.
Open your doors and look abroad.

CH: This is Rabindranath [ruh BIN druh noth] Tagore [tuh GORE].

[reading of Tagore continues]

From your blossoming garden gather fragrant memories of the vanished flowers of a hundred years before.
In the joy of your heart may you feel the living joy that sang one spring morning, sending its glad voice across a hundred years.

CH: And this is Sarojini [ser OH juh knee] Naidu [NAI doo].

[reading of Naidu begins]

When from my cheek I lift my veil
The roses turn with envy pale
and from their pierced hearts rich with pain
send for their fragrances like a wail

Or if perchance one perfumed tress
Be loosened to the wind’s caress
The honeyed hyacinths complain
and languish in a sweet distress

[reading of Naidu fades]

CH: Tagore actually won the Nobel Prize in Literature in 1913, the first Asian person to have ever done so. And Naidu was a political activist and poet, known throughout India by the nickname Gandhi gave her, “The Nightingale of India.” Yet their work is largely unknown in the West.

Anna Preus: I had gotten deep into a Ph.D. program focusing on the early 20th century and no one had ever mentioned her name to me.

CH: Anna Preus, professor of English and data science at the University of Washington.

AP: The received narratives about British modernism rely on a history of judgements by middle- and upper-class British and American men for the most part. You hear the same group of people talked about often: Ezra Pound, James Joyce, T.S. Eliot. We finally got a woman in there with Virginia Woolf in like the 70s and 80s, but she wasn’t always included. And yet there are all these other authors whose work has not been included in those received understandings of modernism.

CH: It’s a familiar idea: talented writers, thinkers and artists who have been marginalized based on their ethnicities and identities. The narrative is that they weren’t the most popular or revered in their time, and therefore weren’t included in the canon.

AP: That has resulted in the situation where they can sometimes be framed as less influential than they actually were, in my opinion, to not only literary cultures of Anglophone literature in South Asia or the Caribbean, but also to what we might refer to as British Modernism, capital B and capital M.

CH: But it turns out this narrative is not true. Writers like Naidu and Tagore were actually quite successful in their time. That fact becomes completely clear once you look at the data.

[reading of data begins]

CH: This is publishing data from colonial poets in the early 20th century in Great Britain. Anna pored over all this data for her research.

[reading of data continues]

CH: The number of times non-British authors were published was enormous, which convinced Anna that colonial authors like Naidu and Tagore were just as popular as contemporaries such as Eliot, Pound and Yeats. This means they were not excluded from the canon because they were unknown or marginal. It was a conscious decision not to include these influential writers and to prioritize white, male authors like Eliot and Pound. To prove that these writers were popular in their time, Anna wanted to find out how many times the most well-known Indian authors were published in Britain. But getting at this information was tricky.

Publishing data is notoriously patchy. If the records exist at all, they could be stored away in some archive or library, perhaps not even labeled or easily accessible to the public.

AP: A few decades ago, a few institutions started digitizing historical text en masse. The English Catalogue of Books happens to be one of those texts that Google digitized. So, what we had when we started working on this with a team was all these kinds of PDFs, just static PDFs, and then plain text files with an endless stream of garbled, plain text. So we had to try to split that up so we could get usable data on each of the books published each year that we could both search and analyze computationally.
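The splitting step Preus describes can be illustrated with a minimal Python sketch. It is not the team's actual pipeline. The one-entry-per-line layout ("Author: Title. Publisher, Year"), the `ENTRY_PATTERN` regular expression, the `parse_entries` helper and the sample text are all hypothetical stand-ins. The real catalogue's layout and OCR errors are not described in the episode and are likely far messier.

```python
import re

# Hypothetical entry layout, assumed only for this sketch:
#   "Author: Title. Publisher, Year"
ENTRY_PATTERN = re.compile(
    r"^\s*(?P<author>[^:]+?)\s*:\s*"   # author, up to the first colon
    r"(?P<title>.+?)\.\s+"             # title, up to a period plus whitespace
    r"(?P<publisher>[^,]+?)\s*,\s*"    # publisher, up to the comma
    r"(?P<year>\d{4})\s*$"             # four-digit year at end of line
)

def parse_entries(raw_text):
    """Split raw text into structured entries and lines needing manual review."""
    parsed, unparsed = [], []
    for line in raw_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        match = ENTRY_PATTERN.match(line)
        if match:
            parsed.append(match.groupdict())
        else:
            unparsed.append(line)  # keep unreadable lines for a human to check
    return parsed, unparsed

# Invented sample text; these are not real catalogue entries.
RAW = """Author A: Example Poems. Publisher X, 1912
  Author B :  Example Songs.   Publisher Y , 1912
~~ i11egible 0CR line ~~
"""

parsed, unparsed = parse_entries(RAW)
for entry in parsed:
    print(entry)
print("Needs review:", unparsed)
```

Lines that fail to match are set aside for review instead of being dropped. That choice reflects the manual correction Preus describes as taking years.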

CH: This was no small task.

AP: There are years of work in just trying to take a PDF and turn it into a usable, correct list of books published at the time. And it’s digital humanities in part because of that process, of trying to transform a historical text — this publishing catalog — into a spreadsheet where people could see, for example, what the most popular publisher of 1913 was. Getting from Point A to Point B has involved a lot of different steps for trying to transform historical text but has also involved larger scale collaboration.

CH: Once Anna got all that publishing data into a spreadsheet, she could start asking some deeper questions.

AP: For me, particularly, I am interested in how the British publishing industry was a key institution in British imperialism. I was very interested in getting at this data to see what works were being published, especially in relation to British imperial projects. That’s why I am interested in 1902 to 1922. That’s the peak period of British imperialism in terms of land conquest.

[data reading begins]

CH: There are lots of books published about South Asia, but not so many literary texts by South Asian authors. There ended up being more than Anna thought. She combed through the data and established about 2,000 English-language texts by authors like Naidu, Tagore and others. Some Asian authors were prolific.

AP: Totally. It’s definitely caused me to shift my perspective. When I was initially looking at the data and I was in library databases looking at the number of works published by Tagore and Yeats, I saw that Tagore had doubled the library record to Yeats in at least the information I pulled. Even to me, I was like that’s striking. I’ve heard so much about Yeats.

[instrumental music begins]

CH: She started noticing something else, another part of the story of why these writers had been excluded from the canon: Their work had been misclassified. Many works of poetry by these prominent South Asian writers were not published under the heading “poetry.” Instead, publishers labeled them as songs. Naidu’s poems, the readings we heard at the beginning of this piece, are from a book titled “The Sceptred Flute: Songs of India.”

AP: Each of these authors’ first publications were so heavily, generically, associated with song. I was like, “What’s going on?” These authors are quite different, their poetry is quite different. It’s poetry. It keeps being called song. This was true for other writers all across the British colonies. I had this sort of sense that these works were being labeled as songs to associate them with oral cultures, with forms of oral literature and potentially to represent them not as necessarily literary text, high-elevated poetry or verse.

CH: It was happening not only to authors from Britain’s distant colonies, but also to those from the British Isles and Ireland.

AP: I found a ton of poetry books labeled as songs. So England, when it looked back at its own poetic history, found the kind of origins of poetry in these collectively produced ballads. They wrote a lot about that kind of poetic development in the places they colonized in the British Isles: Wales, Ireland, Scotland. You see a lot of books of Welsh ballads, Scottish ballads, all talking about this early collective poetic culture. And so for me, that really seems like part of the reason authors like Naidu are not considered modernists is because from the beginning, they were associated with the old, and this earlier form of poetry, rather than the experimental, avant garde or boundary-pushing, or poetry of the now that you see some authors like T.S. Eliot sort of being marketed as.

CH: For Anna, relegating these colonial authors to a more primitive style of poetry was no accident. And to prove it, she connected the texts of these South Asian authors with larger data sets. She could have done all this work without the aid of data analysis, but collecting this amount of information and making sense of it could take up the entire lifetime of one scholar. Digital tools can cut through a lot of the grunt work.

AP: Absolutely. If you wanted to do that for 1912, you would be reading through a 400-page publishing catalog that has 70 entries per page, and writing down each time a publisher was mentioned and then adding them up. So you’re absolutely right. What this allows us to do is break up the text, so essentially we have a column in the spreadsheet that’s publisher and then count so we don’t have to do that by hand.
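By the figures Preus gives, a 400-page catalogue with 70 entries per page comes to roughly 400 × 70 = 28,000 entries for a single year. Once those entries sit in a spreadsheet with a publisher column, the tally she describes takes a few lines of code. The Python sketch below assumes a CSV export with a column named `publisher`. The column names, the `count_publishers` helper and the sample rows are invented for illustration and are not drawn from her dataset.

```python
import csv
import io
from collections import Counter

# Invented sample rows; these are NOT records from the English Catalogue of Books.
SAMPLE_CSV = """title,author,publisher,year
Example Poems,Author A,Publisher X,1912
Example Songs,Author B,Publisher Y,1912
Example Verses,Author C,Publisher X,1912
Example Ballads,Author D,Publisher X,1912
"""

def count_publishers(csv_text):
    """Tally how many rows name each publisher."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["publisher"].strip() for row in reader)

publisher_counts = count_publishers(SAMPLE_CSV)

# List publishers from most to least frequent.
for publisher, count in publisher_counts.most_common():
    print(f"{publisher}: {count}")
```

In practice the same publisher can appear under several spellings or abbreviations, so a real count would probably need a name-normalization pass first.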

CH: Instead, Anna and her team can get these results in a matter of days, which gets at a core aspect of her work: collaboration. The tools themselves are pretty easy to come by. As long as you have some coding skills and a laptop, everything else, from the texts to the processing programs, is basically free. But Anna has had staff members and both graduate and undergraduate colleagues helping her for years, all with a range of technical skills. This, she says, isn’t typical for humanities research, and it certainly doesn’t happen at many institutions.

In the end, though, this kind of humanistic work is not just about data. It’s about combining the work of studying primary sources, like a poem, with data analysis.

AP: The story comes from going back and forth between the books and the data. I don’t think the story comes through in the data alone. I don’t think it comes through in individual books alone. For me, it’s a constant back and forth of going to the data, looking at all these books, then seeing which of these books I can access, actually reading them, seeing how they’re marketed, what tropes are being called up, and then going back to the data and then seeing if that comes through more broadly. So it’s always this back-and-forth process.

CH: Anna Preus works in the more recent tradition of the digital humanities. It’s about applying the analysis of large sets of data in fields that aren’t typically known for using such methods, like English, philosophy or comparative literature. Using digital tools to study humanistic material allows access to the humanities that would otherwise not be possible.

CH: Here are five texts that’ll help you learn more about publishing culture and the digital humanities as a way of knowing.

“A World of Fiction: Digital Collections and the Future of Literary History,” by Katherine Bode

CH: This book uses the world’s largest collection of mass-digitized newspapers to understand how Anglophone fiction in the 19th century traveled around the world.

“Debates in the Digital Humanities,” edited by Matthew Gold and Lauren Klein

CH: A state of the digital humanities union, so to speak, at least as of 2023. This collection of essays highlights the major questions, problems, and practical knowledge of the field.

“New Digital Worlds: Postcolonial Digital Humanities in Theory, Praxis, and Pedagogy,” by Roopika Risam

CH: Risam’s book examines the role of colonial violence in the development of digital archives and the possibilities of postcolonial digital archives for resisting this violence.

“Postcolonial Writers in the Global Literary Marketplace,” by Sarah Brouillette

CH: A book about the relationship between postcolonial authors and the international marketplace where their work is published.

“In Another Country: Colonialism, Culture, and the English Novel in India,” by Priya Joshi

CH: Joshi explores how Indian writers of the English novel indigenized the once-imperial form and put it to their own uses.

CREDITS

SH: Ways of Knowing is a production of The World According to Sound. This season is about the different interpretative and analytical methods in the humanities. It was made in collaboration with the University of Washington and its College of Arts & Sciences. Thanks to Casey Miner and Ben Trefny for their voice work. Music provided by Ketsa, Serge Quadrado, Graffiti Mechanism, Oootini, and our friends, Matmos.

 

Preus, a University of Washington assistant professor of English and of data science, digitally streamlined the process of documenting the number of non-British poets published in early 20th-century Great Britain. The number was enormous, but these poets are still absent from the literary canon, a discrepancy that led Preus to believe their exclusion was a conscious decision. In this episode, Preus discusses her research and the infrastructure needed for similar digital humanities projects.

This is the first episode of Season 2 of “Ways of Knowing,” a podcast highlighting how studies of the humanities can reflect everyday life. Through a partnership between The World According to Sound and the University of Washington, each episode features a faculty member from the UW College of Arts & Sciences, who discusses the work that inspires them and suggests resources to learn more about the topic.
