The Just Security Podcast

The Dangers of Using AI to Ban Books

October 27, 2023 Just Security Episode 45

Across the United States, book bans, and attempted book bans, have hit a record high. Driven in part by newly passed state laws, public schools have seen a thirty-three percent increase in banned books. 

The vague and subjective language used in these laws leaves school boards struggling to figure out exactly what content is prohibited. Some school boards, like the Mason City School District in Iowa, have turned to ChatGPT and artificial intelligence to comply with these new state laws.

But the inconsistency and limitations of AI technology have led to overinclusive results that disproportionately flag content about the experiences of women and marginalized communities, and raise concerns about free speech and censorship.

Joining the show to discuss AI and its effect on book bans is Emile Ayoub.

Emile is counsel in the Brennan Center’s Liberty and National Security Program where he focuses on the impact of technology on civil rights and liberties.

Show Notes: 

  • Emile Ayoub (@eayoubg)
  • Paras Shah (@pshah518)
  • Emile and Faiza Patel’s (@FaizaPatelBCJ) Just Security article on using AI to comply with book bans
  • Just Security’s Artificial Intelligence coverage
  • Just Security’s content moderation coverage
  • Music: “The Parade” by “Hey Pluto!” from Uppbeat: https://uppbeat.io/t/hey-pluto/the-parade (License code: 36B6ODD7Y6ODZ3BX)
  • Music: “Tunnel” by Danijel Zambo from Uppbeat: https://uppbeat.io/t/danijel-zambo/tunnel (License code: SBF0UK70L6NH9R3G)

Paras Shah: Across the United States, book bans and attempted book bans have hit a record high. Driven in part by newly passed state laws, public schools have seen a thirty-three percent increase in banned books. 

The vague and subjective language used in these laws leaves school boards struggling to figure out exactly what content is prohibited. Some school boards, like the Mason City School District in Iowa, have turned to ChatGPT and artificial intelligence to comply with these new state laws.

But the inconsistency and limitations of AI technology have led to overinclusive results that disproportionately flag content about the experiences of women and marginalized communities and raise concerns about free speech and censorship.

This is the Just Security Podcast. I’m your host, Paras Shah. 

Joining the show to discuss AI and its effect on book bans is Emile Ayoub. Emile is counsel in the Brennan Center’s Liberty and National Security Program, where he focuses on the impact of technology on civil rights and liberties.

Hi, Emile, welcome to the show. Thanks so much for joining us today. 

Emile Ayoub: Hi, Paras. Thank you for having me.

Paras: So today we're talking about two trends that have been all over the news: artificial intelligence and book bans. To understand what happens when those two big trends meet, can you tell us about a book ban law in Iowa?

Emile: Yeah, so earlier this year, Iowa’s Republican-led legislature approved a law that seeks to limit discussion of gender identity and sexuality in schools. The law bars school libraries from carrying books that are not, by the law’s definition, age appropriate. And that covers books that contain descriptions or visual depictions of a sex act. In other words, public school districts across the state are required to remove any books that contain depictions of sexual activity for grades K-12.

Paras: I imagine that book bans like this, with such broad definitions, are difficult to comply with and might take a lot of time and resources. How do school districts typically handle these types of laws?

Emile: Yeah, so you know, unfortunately, these types of bans have become increasingly common all around the country. It's a trend that's been driven by both political pressure from state lawmakers and a growing number of organized groups, mostly conservative groups, pushing for book removals on what they deem to be divisive topics. In practice, these topics have tended to include subjects like racism, slavery, gender identity, and sexuality. 

So over the last few years, we've seen states like Florida, Texas, Missouri, Utah, and South Carolina pass laws like Iowa’s that seek to limit discussion of gender identity and sexuality in schools. And yes, to your point, these bans put a huge burden on teachers, librarians, and school administrators. They have to spend countless hours reviewing their library catalogs for books that may be in violation. And if they fail to comply with these laws, schools could face penalties, including having their teachers’ licenses revoked.

Now, given these risks, it's not surprising that school administrators often apply overly broad interpretations of their state's book ban law. As you might imagine, there's no penalty under the law for banning too many books. So for example, in Urbandale, Iowa, administrators sought to remove 374 books from school shelves for fear of facing penalties. That list included classics like The Catcher in the Rye by J.D. Salinger and Catch-22 by Joseph Heller. After public pressure, the school district later changed course and narrowed that list down to 64. But even that list includes books that we typically think of as must-read classics, like Brave New World by Aldous Huxley and The Handmaid's Tale by Margaret Atwood.

Paras: One school district in Mason City, Iowa decided to do something different. How did it respond to this law?  

Emile: It did what a lot of people might be doing these days when faced with a difficult task, and it turned to ChatGPT. The official in charge argued that it was simply not feasible for her staff to read every single book in the district's catalog. So instead, the school district compiled a list of commonly challenged books and then for each book on that list asked ChatGPT, “Does this book contain a description or visual depiction of a sex act?” If ChatGPT answered yes, that book would be removed from school libraries. 
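To make that workflow concrete, here is a minimal sketch of the kind of screening script a district could run. It assumes the OpenAI Python SDK; the book titles, model choice, and prompt wording are illustrative, not a reconstruction of Mason City's actual process.

```python
# A minimal sketch of screening books with a large language model.
# Assumes the OpenAI Python SDK (pip install openai) with an API key
# in the OPENAI_API_KEY environment variable. Titles and model are
# illustrative; this is not Mason City's actual script.
from openai import OpenAI

client = OpenAI()

commonly_challenged = [
    "The Kite Runner",
    "Friday Night Lights",
    "The Handmaid's Tale",
]

flagged = []
for title in commonly_challenged:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": f"Does the book '{title}' contain a description "
                       "or visual depiction of a sex act? Answer yes or no.",
        }],
    )
    answer = response.choices[0].message.content.strip().lower()
    if answer.startswith("yes"):
        flagged.append(title)  # books the district would pull from shelves

# Note: as the episode discusses, answers can vary from run to run.
print(flagged)
```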

Paras: How successful was ChatGPT at this?

Emile: Not entirely successful. ChatGPT identified 19 books from that list as depicting sexual acts. But three of those books, The Absolutely True Diary of a Part-Time Indian by Sherman Alexie, Friday Night Lights by Buzz Bissinger, and The Kite Runner by Khaled Hosseini, didn't depict sexual acts at all. These mistakes were caught when administrators later reviewed ChatGPT’s results, and in the case of Friday Night Lights, after the author publicly expressed his outrage about his book being falsely flagged. That reversal demonstrated once again the importance of human evaluation of the output of these AI tools.

Paras: Okay, so ChatGPT wasn't perfect here. But in other areas, it's been more successful, like identifying and removing child sexual abuse material. What are the risks here of using AI to comply with book bans?

Emile: The accuracy of AI tools in detecting certain types of online content depends a lot on the type of content they're trained to address. You're right that the tools trained to detect child sexual abuse material are mostly successful. That's in large part because that material is removed regardless of context and tone. AI tools screen content uploaded online against a database of known images of child sexual abuse material, and they'll detect and remove content that's similar to the images in that database.
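As a rough illustration of that database-matching technique: production systems rely on proprietary tools like Microsoft's PhotoDNA, but the same basic idea, comparing a perceptual hash of an upload against hashes of known images, can be sketched with the open-source imagehash library. The file names and distance threshold here are illustrative assumptions.

```python
# Illustrative perceptual-hash matching, the general technique behind
# screening uploads against a database of known images. Real systems use
# proprietary hashes like PhotoDNA; imagehash is an open-source stand-in
# (pip install imagehash pillow). File names below are hypothetical.
from PIL import Image
import imagehash

# Perceptual hashes of known prohibited images (a hypothetical database).
known_hashes = [imagehash.phash(Image.open(p))
                for p in ["known1.png", "known2.png"]]

def matches_database(upload_path: str, max_distance: int = 5) -> bool:
    """Flag an upload whose hash is within a small Hamming distance of
    any known hash, so near-duplicates and re-encodes are caught too."""
    upload_hash = imagehash.phash(Image.open(upload_path))
    return any(upload_hash - known <= max_distance for known in known_hashes)

print(matches_database("upload.jpg"))
```

Because the match is against specific known images rather than a judgment about meaning, this kind of tool needs no sensitivity to context or tone, which is exactly why it fares better than tools that must interpret content.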

But as we've seen over the years, in content moderation, when AI tools have been used to detect and remove content that requires sensitivity to context, tone and nuance, they've been less reliable. So for content that may contain adult nudity and sexual activity, AI tools have struggled to understand the context of posts and have a tough time distinguishing problematic content from posts that, say, depict healthcare-related content like breastfeeding or mastectomies. 

Paras: How common are these types of mistakes in identifying adult nudity and sexual content? 

Emile: You know, we've seen this happen over the past few years in the social media content moderation space. In one of the first cases that Meta’s oversight board took on, the company's AI tools had flagged and removed an image of uncovered female nipples showing breast cancer symptoms and corresponding descriptions of those symptoms. Even though Meta’s policy on adult nudity and sexual activity allowed users to post uncovered nipples in the context of health-related situations like post-mastectomy recovery or breast cancer awareness, the company's machine learning tool failed to understand the context of the post and wrongly removed it.

Now, as the oversight board acknowledged, one of the risks of relying on inaccurate AI tools to enforce these kinds of policies is that they'll likely have a disproportionate impact on women. And we've also seen that these AI tools tend to amplify biases. A recent study of AI tools used to flag sexual content found they were more likely to label photos of women as sexually suggestive, especially if the photo depicted pregnant bellies, exercise, or nipples. The former co-head of Google's ethical AI research group suggested this might be due to the bias of the primarily heterosexual male staff who trained the tools. Even educational images released by the U.S. National Cancer Institute that demonstrate how to do a clinical breast exam were flagged in the study's tests as sexually explicit.

Now, while research is still ongoing, early analyses suggest that generative AI tools, like ChatGPT, similarly reflect biases, produce inconsistent and unreliable responses, and are highly sensitive to user prompts. 

Paras: Where else could we see AI being used in this space?

Emile: Well, we can expect that some states will continue to expand book banning laws. So it's not just schools that may be tempted to use ChatGPT to comply with the bans. In Texas, actually, a new law requires all book vendors who supply school districts to rate each book on their shelves on whether it contains depictions or descriptions of sexual conduct. In effect, that means nearly 300 booksellers will have to review and rate every single book on their shelves, including titles previously sold to school districts. What's more, the law gives vendors until April 1 to comply. Now, that's an incredibly labor-intensive, almost impossible task, even without the April 1 deadline. The payroll costs alone could bankrupt a business. So these bookstores might be tempted to use AI tools like ChatGPT, just as Mason City did, to save costs and comply with the law.

But as we have seen, the risk is that these tools could tag books depicting or describing educational or health-related subjects like breastfeeding, breast cancer, or pregnancy as sexually explicit. Even if a tool were to misclassify only 1% of book titles, that could add up to hundreds of books being inappropriately flagged as sexually explicit and removed from bookshelves.
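The arithmetic behind that estimate is simple; the catalog size below is a purely illustrative assumption, not a figure from the Texas law.

```python
# Back-of-the-envelope arithmetic behind the "hundreds of books" point.
# The catalog size is an illustrative assumption.
titles_in_catalog = 50_000   # hypothetical bookseller inventory
error_rate = 0.01            # a 1% misclassification rate

misflagged = int(titles_in_catalog * error_rate)
print(misflagged)  # 500 titles wrongly labeled sexually explicit
```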

Paras: Emile, I want to take a step back and talk about the broader implications here. When we zoom out, what are the problems that arise with using AI to enforce these book bans? 

Emile: Yeah, you know, one of the broader concerns about using AI in this context is that it may allow decision makers to escape accountability and hide behind a veneer of objectivity and neutrality when making rights-affecting decisions like this. The school official in Mason City, in fact, reportedly turned to ChatGPT, in part, because she believed it could be an objective way to identify books with sexual content.

Time and again, we've seen that AI tools amplify the biases of their training data and of the humans who design them. They struggle to understand context and nuance and can produce inconsistent results. And another problem is that these tools are black boxes. The public has little to no visibility into how they're making decisions and what information they're using to make those decisions. As more states seek to limit discussion of race, gender identity, and sexuality in schools, decision makers may rely on generative AI tools like ChatGPT, with all their limitations, biases, and sensitivity to prompt design, to make overinclusive lists of inappropriate books under the guise of objectivity.

Broad book bans like those in Iowa and Texas are a threat to free speech. They threaten students' rights to access information, and run the risk of chilling speech on topics such as sexuality, gender identity, teen pregnancy, and sexual health. Using generative AI tools to comply with those bans only increases that risk and makes those laws more dangerous.

Paras: As you noted, book bans are likely to only increase and generative AI tools like ChatGPT are here to stay. So what can we expect next? What should we be looking for in this debate? 

Emile: Unfortunately, the book banning movement across the country does not seem to be slowing down. And the school district officials, librarians, and bookstore owners who have to figure out how to comply with these bans will continue to be tempted to turn to AI tools for labor-intensive tasks. Congress, the White House, and state legislatures are racing to establish frameworks that safeguard against the risks posed by rapidly advancing AI technology. But it's important for people to understand that AI tools are already causing harm. And in the case of book bans, even well-meaning use of these tools could make those laws more dangerous.

Paras: What legislation and regulation have we seen in this space so far? 

Emile: There's been a lot of activity in Congress, but no comprehensive legislation has really come through. Senate and House committees have held hearings on AI throughout the year and many bills have been announced. Senate Majority Leader Chuck Schumer has been leading a bipartisan effort to develop a policy response, and has already held and plans to continue holding hearings to hear from industry leaders, researchers and academics. And I might add that those hearings should also consider the work of organizations that have been protecting civil rights and civil liberties in this space. 

And last year, the White House put out a blueprint for an AI bill of rights that sets out broad principles for safe and effective use of AI. And the White House is expected to put out an executive order on AI soon. Now, several themes have emerged from all this activity. While it's unclear how any regulation will look in the end, it's going to be important that any legislation focuses on principles that safeguard our civil rights and civil liberties — principles like ensuring these systems are safe and effective, protecting against these systems’ algorithmic discrimination, protecting data privacy, providing transparency, notice, and explanation, and providing human alternatives to these systems.

Paras: Is there anything else you'd like to add? 

Emile: Yeah, I'll just take a step back for a second to say that, as society grapples with the opportunities and risks of AI, and how to regulate and safeguard against those risks, it's important to consider the various ways AI tools are already impacting our society, particularly historically marginalized communities. And it's important that policymakers put civil rights and civil liberties front and center as they work on those safeguards, including by working with groups that focus on protecting these rights and civil liberties. These risks aren't theoretical. AI tools are already affecting people's access to economic opportunities and our civil rights and civil liberties. Time and again, we've seen harms from inaccurate and biased AI used in policing, employment, housing and public accommodations. The kinds of harms that can arise from the use of AI tools for book bans are just one example of that. 

Paras: Emile, this has been such a great conversation, and there's a lot to follow in this space. We'll be tracking it at Just Security. Thanks so much for joining the show. 

Emile: Thanks so much for having me. 

Paras: This episode was hosted by me, Paras Shah. It was edited and produced by Tiffany Chang, Michelle Eigenheer, and Clara Apt. Our theme song is “The Parade” by Hey Pluto.

Special thanks to Emile Ayoub. You can read Just Security’s coverage of artificial intelligence and content moderation, including Emile’s analysis, on our website. If you enjoyed this episode, please give us a five star rating on Apple Podcasts or wherever you listen.