Northwestern University

The Data Renaissance

New worlds are opening in the humanities and social sciences, thanks to the tools of data science

Let’s take a trip back in time, to a market in ancient Rome.

You’re an artisan planning a trip to the province of Gaul to sell your work in the up-and-coming town of Lutetia (present-day Paris). Seems like a pretty straightforward trip along those famous Roman roads, right?

Not exactly, assistant classics professor Taco Terpstra tells his students. Using ORBIS, an interactive online map of the ancient Roman world, Terpstra and his students model various options for that pre-modern business trip — and the results are surprising.

Want to maximize your profits by getting to Lutetia as cheaply as possible? Then you’ll be traveling by ship: west across the Mediterranean, through the Straits of Gibraltar, then north to the River Seine — a 46-day journey. (There’s a slightly shorter 30-day route, but it costs almost twice as much.)

Putting yourself in that Roman traveler’s shoes brings home the real-life constraints of the ancient world.

ORBIS — which is free and available to the public — is just one example of how modern computing power is making information more accessible. Numerous ancient sources refer to the cost and time it took to travel from point A to point B, but those pieces of information existed as isolated facts, with no real sense of how they related to each other.

“Bringing together all the references in the sources and feeding them into a computer model allows us to get an accurate idea of connectivity within the empire,” says Terpstra. Students in Terpstra’s course on the ancient Roman economy get a better idea of how commerce worked on a day-to-day basis, and Terpstra can reference those networks for his upcoming book on the economic history of Roman trade.
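For readers curious how a model can turn isolated travel figures into route comparisons, here is a minimal sketch. It is not how ORBIS is implemented; it simply assumes each leg of a journey carries a duration and a price, then uses a standard shortest-path search (Dijkstra's algorithm) to find either the fastest or the cheapest way through. The network, the `ROUTES` table and the `best_route` function are hypothetical, and every number is invented for illustration.

```python
import heapq

# Hypothetical toy network: each edge is (neighbor, days, cost).
# These figures are invented for illustration and are NOT ORBIS data.
ROUTES = {
    "Roma": [("Ostia", 1, 1.0)],
    "Ostia": [("Massilia", 5, 12.0), ("Gades", 12, 6.0)],
    "Massilia": [("Lutetia", 20, 30.0)],
    "Gades": [("Lutetia", 30, 10.0)],
    "Lutetia": [],
}


def best_route(graph, start, goal, key):
    """Return (total, path) minimizing either 'days' or 'cost' via Dijkstra."""
    index = {"days": 0, "cost": 1}[key]
    queue = [(0.0, start, [start])]  # (running total, current node, path so far)
    settled = set()
    while queue:
        total, node, path = heapq.heappop(queue)
        if node == goal:
            return total, path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, days, cost in graph[node]:
            if neighbor not in settled:
                weight = (days, cost)[index]
                heapq.heappush(queue, (total + weight, neighbor, path + [neighbor]))
    return None  # goal is unreachable from start


fastest = best_route(ROUTES, "Roma", "Lutetia", "days")
cheapest = best_route(ROUTES, "Roma", "Lutetia", "cost")
print("Fastest:", fastest)
print("Cheapest:", cheapest)
```

In this toy network the fastest and cheapest answers follow different paths, which is the same kind of trade-off Terpstra's students weigh when they plan the trip to Lutetia.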

We live in a very different world than our hypothetical Roman traveler. Today, all of us have a staggering amount of information at our fingertips: the “big data” that is constantly being collected, digitized and stored electronically. Big data has obvious applications in science- and technology-based fields, from the algorithms that power a Google search to the mapping of the human genome that’s revolutionizing medicine. What’s less well known is how this wave of increasingly accessible information is transforming the humanities and social sciences.

“Practically all humanities fields work with large bodies of data, whether they’re textual, archaeological or visual,” says Terpstra. “Bringing that information together provides us with tools that were not previously available.”

Throughout Weinberg College, faculty are integrating data science into their teaching and research as their students prepare for careers in a digitally dominated world. The technological divide that once separated engineering majors from English majors is rapidly dissolving, and today every arts and sciences graduate needs to be comfortable with computer modeling and statistical analysis. “We’re living in a data society,” says Weinberg College Dean Adrian Randolph. “We always have been, but it’s exploded to a point where it touches our lives in so many ways. What does it mean to collect so much data, and what do we do with it?”

Those are questions that the humanities and social sciences are designed to address. Scholars and students across the College are not only using data in innovative ways, they are looking at the deeper issues it raises. Does more data lead to better conclusions? Does it bring us closer to the truth?

Digitization and Research

There’s no question that digitization has been enormously helpful to anyone who studies and analyzes sources from the past. Digital records and databases that can be searched by keyword, for example, allow historians to make connections between previously disparate sets of information. When researching how the Latin American independence movements of the early 19th century were perceived in the United States, assistant professor Caitlin Fitz was able to scour early U.S. printed materials online from home — no small consideration for a parent of two young children.

“History is more of an art than a science,” Fitz says. “But digitization has opened new horizons to scholars ready to do the grunt work of counting, correlating, graphing and building databases.”

While Fitz’s research also involved visits to libraries and archives, having ready access to newspapers online allowed her to build a statistical case that U.S. citizens of the time were actually quite interested and engaged in developments in South America. “For example,” Fitz says, “I was curious if U.S. audiences knew that Spanish Americans were implementing antislavery measures during their independence wars. So over the course of a month or two, I searched 128 newspapers online. I was stunned to discover that 90 percent reported on Simón Bolívar’s 1816 refuge in Haiti or on his ensuing antislavery proclamations. I was even more stunned to see that slaveholding editors in the Deep South were often supportive! I would never have been able to establish all of that without searchable online databases.”

Through digitized federal census reports, Fitz also found more than 200 U.S. families who named their baby boys Bolivar. “That was a great way to track grassroots excitement for Spanish American independence, especially because it suggested that mothers were interested in events south of the border, not just fathers,” she says.

Fitz’s research resulted in her recently published book, Our Sister Republics: The United States in an Age of American Revolutions, and the experience made her appreciate the power of empirically driven research. “When fellow historians ask me how widespread public excitement for Latin American independence really was, or if they want to know how it varied from region to region and changed over time, I can point to specific data.”

This ability to search material that’s stored all over the globe has opened up all sorts of new publication and research opportunities for College faculty and students. The Advanced Papyrological Information System, for example, is an online database of texts written on papyrus, some with English translations. The tool allowed Terpstra to prepare a journal article about writing in Roman Egypt that otherwise would have been “a nightmare,” he says.

Thanks to projects like APIS, the scrolls of the past have entered the digital present. (Some of those papyrus texts also show how much human nature hasn’t changed, as in the letter a frustrated Egyptian father sent his son in the 12th century BC: “You did not hear any warning which I spoke to you formerly. Should you overturn yourself when you pilot a ship without me, should you sink into a watery grave… you will be in the water because of your own navigating.”)

Building Interactive Resources

Throughout the College, today’s students aren’t just using interactive resources, they’re building them, too.

For the English and humanities course Shakespeare’s Circuits: Local, Global, Digital, professors Wendy Wall and Will West guided students in creating a digital map that shows the spread of Shakespeare’s works through time, all around the world. On a practical level, such projects give students the kind of technological experience that looks good on a résumé — no small factor in an educational climate where parents worry about a college education paying off professionally. But it’s the “why” behind the map — rather than how it was put together — that really matters, says Wall.

“The map tells a story,” Wall says. “The dots make no sense without that larger story.” What were the reasons Romeo and Juliet and The Tempest were particularly popular in Latin America? When The Merchant of Venice was translated into Maori in 1945, what parallels could be found between the treatment of the Jewish character, Shylock, and the plight of the indigenous peoples of New Zealand? The map becomes the starting point for such discussions and research.

These kinds of courses aren’t easy to construct or teach, says English graduate student Casey Caldwell, the teaching assistant for Shakespeare’s Circuits.

“We had to put a lot of work not only into training the students and ourselves in using the software, but also — and far more importantly — into figuring out how to use it creatively and productively,” he says. The digital map was only one element of a rich, immersive experience: Students did both historical research and textual analysis; they attended live performances in Chicago and watched videos of culturally significant productions; they wrote papers and participated in both classroom and online discussions. In other words, they did the same kind of in-depth research and analytical thinking that Northwestern students have been doing for generations.

“The cognitive skills involved in this kind of class rely on a core ability to think critically,” says Caldwell. “That’s much more challenging and important than learning how to use a digital tool — or an analog one, for that matter. Personally, as a teacher, I think that if we hone that core ability in students to think for themselves, a lot of the so-called ‘up-skilling’ involved with any kind of technology will be something they can do with discernment and creativity.” 

Wall, who is also the director of the Kaplan Institute for the Humanities, says today’s students have the false impression that all information is available through a Google search. “It’s important to slow down and learn how to use sources responsibly,” she says. Like other professors throughout the College, she wants students to think about where the facts they use come from. “Data is an end point,” she says. “What went into making those numbers or facts? What counts as evidence? How do you measure it?”

Asking those questions not only challenges students — it challenges the way courses are constructed. When it comes to teaching, “It’s misleading and cheapening to sell digital tools as making things easier,” says Caldwell. “The use of digital data can make preparing for and teaching a class twice as hard, not twice as easy. But I think it’s a good kind of harder. This is the right kind of challenge for a teacher to face.”

Big Data

Just hearing the words “big data” can make you feel tired. There’s so much of it already, and it just keeps coming. Now more than ever, what’s needed are people who can sort through that tidal wave of information, pulling out what’s relevant and showing how it can be applied to real life. That’s exactly the kind of training students receive in the College’s Mathematical Methods in the Social Sciences program, a model for how to integrate “hard” numbers with “soft” sciences.

“More data in principle is always better,” says program director Jeff Ely, the Charles E. and Emma H. Morrison Professor of Economics. The MMSS program teaches students how to apply mathematics, statistics and computer modeling to research in social science fields such as political science, economics and sociology — a very in-demand skill set. But Ely believes all Weinberg College students need to develop a certain familiarity and fluency with numbers. “Whenever you have large data, there are guaranteed to be statistical accidents,” he says. “One of the things we need to do — and this is true of all the arts and sciences — is think about how we pick the new tools that are being developed at the frontiers of research, and bring them into the classroom for undergraduates.”
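Ely's point about statistical accidents can be illustrated with a small simulation. The sketch below is an illustration, not anything drawn from the MMSS curriculum: it generates an outcome and many unrelated variables from pure random noise, then counts how many of those variables appear correlated with the outcome anyway. The function names, the sample sizes and the 0.361 cutoff (roughly the conventional 5 percent significance threshold for 30 observations) are all assumptions chosen for the example.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def count_chance_hits(n_variables=200, n_obs=30, threshold=0.361, seed=0):
    """Count pure-noise variables whose |correlation| with a noise outcome
    exceeds the threshold. Every such hit is an accident by construction."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    outcome = [rng.gauss(0, 1) for _ in range(n_obs)]
    hits = 0
    for _ in range(n_variables):
        noise = [rng.gauss(0, 1) for _ in range(n_obs)]
        if abs(pearson(noise, outcome)) > threshold:
            hits += 1
    return hits


hits = count_chance_hits()
print(f"{hits} of 200 unrelated variables look 'significant' by chance")
```

Because nothing in the simulated data is actually related to anything else, every apparent finding is spurious. The more variables a dataset contains, the more such accidents it tends to produce, which is why fluency with numbers matters as data grows.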

Those who know how to find the stories behind the numbers can have a real impact on public policy, as some MMSS students learned in their research on crime statistics. Recent graduate Andrew Zessar ’16 was part of a team, led by adjunct lecturer Mark Iris, that studied the relationship between low-level drug crime and subsequent violent crime for the New York City Police Department. Among the team’s findings: drug crime and violence aren’t as closely linked as legislators may have led us to believe. “Drugs are associated with violence simply because they are exchanged on the black market, and there’s no authority to regulate them,” Zessar says. “If candy were sold on the black market, it could easily be associated with violence, too.” Findings like this could play a crucial role in the current national debate over sentencing guidelines. “If we follow the data, we can educate future leaders to stop locking up harmless people,” Zessar says.

Psychology professor Dan Mroczek is also optimistic about the potential for big data to improve people’s lives. Mroczek, who holds a joint appointment in the Feinberg School of Medicine, is working with Feinberg neurologists to improve recoveries for stroke victims by studying tens of thousands of data points from hundreds of patients. “I think we’re on the edge of something really exciting and important,” he says. “If we find ways to tame all that data, it could bear some remarkable scientific fruit.” Digitization and “smart” technology are also opening up intriguing new avenues for psychological research: cellphone texts can be analyzed for their emotional content, and depression levels can be tracked alongside readings from a Fitbit and other wearable biosensors. The field, Mroczek says, is just starting to wrestle with all those possibilities: “How do you use that data? How do you best analyze it? That’s where people are floundering.”

Will our daily lives be transformed into a never-ending research study? Think of all the personal information that can be automatically collected by your phone or fitness device: your heartbeat, your temperature, the number of steps you’ve taken, your emotional state, even the words you use most often in phone conversations (which can be recorded and later searched). Does your wireless provider have the right to track all that information? Who owns it? “There may be some very interesting court battles over this,” Mroczek predicts.

Though data is now being collected in intimidating amounts and in new ways, the questions professors ask their students at the College haven’t changed. The “why” behind big data is just as important as the “what.” “If you do all your research in the form of keyword searches, you can easily lose your sense of context,” says history professor Caitlin Fitz. “At some point, you have to go to the archives, hold an old newspaper in your hands and remember how big it is, how many articles were on the page, how they’re interspersed with advertisements for runaway slaves or medical treatments or lottery tickets. You have to keep coming back to that richer and more holistic picture. Otherwise the numbers won’t mean much.”

