Mapping the Discourse on AI Safety & Ethics

Değer Turan, Colleen McKenzie, Oliver Klingefjord

Using our prototype discourse visualization tool Talk to the City, we mapped Twitter conversations about the impacts of AI, and identified six distinct perspectives:

1. Ethical approaches to tackling current harms
2. AI is Inscrutable and Difficult to Control
3. Calls for Independent & Democratic Oversight
4. Dangers of Rapid Development
5. Human Interests are Difficult to Formalize
6. Optimistic AI Futures

Each points to specific risks and courses of action, but we found significant overlap between the groups, as well as calls from all points of view for more collaboration.

We plan to continue the project of making these conversations more visible, and to create contexts that support richer, multi-threaded approaches to discussing problems of AI safety – and we hope others in the space will join us in doing so.

Background

In the past few weeks, conversations about the future of artificial intelligence and its societal effects have grown rapidly, in response to new product announcements and calls for safe adoption of new technology (such as the Future of Life Institute open letter and many opinion pieces). These discussions span a variety of concerns – AI ethics, AI safety, long-term risks, and societal goals – in a conversation of increasing importance not just for the people working on these issues directly, but also for the global population whose lives they will affect. But for effective collective action, we believe we need to bring visibility to each perspective and its interactions with the others, to capture the nuance of each viewpoint and prevent majority views from obscuring the full extent of discussion.

Last week, we started to analyze and visualize this conversation with a collective deliberation tool we're building called Talk to the City (more detail in our launch announcement). At AI Objectives Institute (AOI), a nonprofit research incubator, we're building tools for collective sensemaking and scalable coordination, with the goal of using AI safely to help institutions and societies build resilience.

We used Talk to the City to map this space of discourse: capturing conversations happening on Twitter and other platforms, and grouping tweets by topic, to help us create clusters of similar viewpoints on what can and should happen in the face of rapidly advancing technology. Our objective was to distill what each cluster believes to be the most important facets of this conversation, and which approaches they believe will result in the best outcomes.

Early Prototype, Preliminary Results

This experiment in mapping Twitter conversations is far from an in-depth summary: it is a first attempt at collecting data and categorizing relevant views, using a prototype tool we're still iterating on. We're doing this now because we expect AI safety discourse will continue to grow in scale and complexity. At AOI, we think that tools for collective discourse need to capture the nuance and diversity of views in large-scale conversations – and we believe our ongoing work towards that goal will benefit from feedback on early results. In parallel, we're integrating content from outbound Twitter links and from collective deliberation platforms like Pol.is, with the eventual goal of capturing input from a wide variety of platforms with relevant discussion. Ultimately, we aim to turn Talk to the City into a platform for more effective deliberation and policy development, by giving policymakers better visibility into continuous discourse at unprecedented depth and scale – but this is our first test of the platform as an early prototype.

If you notice important perspectives we've missed, please point them out to us! If you find specific tweets you don't see reflected in our analysis, retweet them and mention @AIObjectives, and we'll investigate. We'll be updating this project continually, to incorporate more of the conversation and reflect how it morphs over time. If you're interested in the thinking behind tools like Talk to the City, check out our whitepaper. If you think we've made mistakes and want to help us improve, get in touch at hello@objective.is. Our goal is to offer ever-evolving prototypes that help create visibility into collective discourse at scale, and to use that visibility to improve the tools we make.


How it works

Prototype at talktothecity.com/twitter

Talk to the City offers a simple UI for visualizing collective discourse at scale. It works with a variety of underlying data structures, from collections of longform responses to structured data like tweets.

Here's how we created the Twitter mapping:

  1. We started with a seed population of 94 Twitter users who participate frequently in conversations about the intersection of AI and society.

  2. We added all users followed by that seed population, for a total of 35,416 unique Twitter users, and collected all original tweets (not retweets) they had posted in the last three weeks (March 9 - April 6).

  3. We used GPT-4 to classify tweets as either relevant or not, by asking whether each tweet was about "AI ethics, AI safety, AI alignment, or the responsible use of AI". 3,981 tweets from 245 users were marked relevant. (A minimal sketch of this step appears after this list.)

  4. Talk to the City clusters tweets by their text content (using UMAP for dimensionality reduction and HDBSCAN for clustering), and uses GPT-4 to label each cluster. Many of the LLM-generated labels were similar to each other, so we manually relabeled just over half of them for clarity. (A sketch of this clustering step also appears after this list.)

  5. For the purposes of grouping opinions on the societal impacts of AI, we excluded clusters that were:

  • Descriptions of object-level news about AI capabilities (the "Capabilities of current AI" and "Reasoning about current capabilities" clusters)

  • Meta-level reflections on the discourse itself (the "External discourse" cluster)
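
To illustrate step 3, here is a minimal sketch of a GPT-4 relevance filter of the kind described above. The prompt wording, the `is_relevant` helper, and the use of the `openai` Python client are illustrative assumptions; only the classification question itself comes from our description of the pipeline.

```python
# Hypothetical sketch of the GPT-4 relevance filter described in step 3.
# Prompt phrasing and helper names are illustrative, not our exact pipeline.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

RELEVANCE_PROMPT = (
    "Answer 'yes' or 'no': is the following tweet about AI ethics, AI safety, "
    "AI alignment, or the responsible use of AI?"
)

def is_relevant(tweet_text: str) -> bool:
    """Ask GPT-4 whether a single tweet is relevant to the conversation."""
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=0,  # deterministic yes/no answers
        messages=[
            {"role": "system", "content": RELEVANCE_PROMPT},
            {"role": "user", "content": tweet_text},
        ],
    )
    return response.choices[0].message.content.strip().lower().startswith("yes")

# Example: keep only relevant tweets from a list of tweet dicts.
tweets = [{"text": "New jailbreak shows how easily safety filters can be bypassed."}]
relevant = [t for t in tweets if is_relevant(t["text"])]
```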


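And here is a minimal sketch of the clustering in step 4, assuming tweets are first embedded with a sentence-transformers model. The write-up above specifies only UMAP for dimensionality reduction, HDBSCAN for clustering, and GPT-4 for labeling; the embedding model and all parameter values below are illustrative assumptions, not the exact settings used for our map.

```python
# Illustrative sketch of step 4: embed tweet text, reduce dimensionality with
# UMAP, and cluster with HDBSCAN. The embedding model and parameter values are
# assumptions for illustration only.
import hdbscan
import umap
from sentence_transformers import SentenceTransformer

def cluster_tweets(texts: list[str]) -> list[int]:
    # Embed each tweet into a dense vector.
    embedder = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = embedder.encode(texts)

    # Project embeddings into a low-dimensional space where density-based
    # clustering behaves well.
    reduced = umap.UMAP(n_components=5, n_neighbors=15, metric="cosine").fit_transform(embeddings)

    # Group nearby tweets; points HDBSCAN can't confidently place are labeled -1.
    return hdbscan.HDBSCAN(min_cluster_size=20).fit_predict(reduced).tolist()

# Each resulting cluster can then be labeled by sending a sample of its tweets
# to GPT-4 and asking for a short descriptive title, as in the sketch above.
```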
For more details about this analysis, see the Methods & Caveats section below.

Six Perspectives on AI Ethics & Safety

Our visualization groups tweets by topic of discussion. While many of these clusters capture participants' judgments of the topics under discussion, our categorization below combines and revises some of them to summarize the main themes surfaced by the Talk to the City platform.

The six perspectives we've seen emerge are as follows:

1. Ethical approaches to tackling current harms

Talk to the City clusters: AI Ethics; LLM safety & technical investigations; Red teaming AI

Concerns about the impact of advanced algorithms on society are not future problems: present-day issues such as misinformation, algorithmic bias, and disproportionate impacts of job automation already have clear negative impacts.


Members of this group call for action on these present questions rather than a focus on hypothetical problems of future, more powerful systems – and in some cases, they call out the focus on far-future concerns as harmful to work on present issues:


We also include here people concerned with unaddressed security vulnerabilities in current systems:


The views of this group are closely related to movements like FAT ML and the ACM FAccT conference, and institutes such as DAIR, whose statement in response to the FLI letter offers an explanation of one of the perspectives in this cluster:

Tl;dr: The harms from so-called AI are real and present and follow from the acts of people and corporations deploying automated systems. Regulatory efforts should focus on transparency, accountability and preventing exploitative labor practices.

2. AI is Inscrutable and Difficult to Control

Talk to the City clusters: Difficulty of aligning superintelligence; Why future alignment is hard; parts of Reflections on optimism & pessimism, Long-term concerns, Predictions about superintelligence, and LLM safety & technical investigations

A more future-facing group of perspectives comes from people pointing out how difficult it is to wrap our minds around what current and future AI systems can do, and the challenge this poses for creating systems that serve our best interests.

As a concise summary:

The difficulty of determining how an AI will behave is not an entirely foreign concept when deception is involved:


But current and future systems are fundamentally different from the human minds we are familiar with…


…which makes them unprecedentedly hard to predict, and possibly volatile in destructive ways:

Members of this group call not only for recognition of the difficulty of these problems, but also for improving interpretability of the systems in question:

3. Calls for Independent & Democratic Oversight

Talk to the City cluster: AI Governance & Regulation

Regardless of the type and time-scale of their concerns, many people believe the most pressing priority for AI safety is the establishment of independent regulation and oversight of the development of these new technologies.


Calls for such regulation point to the need for transparency in how these systems are built and how they behave, and most suggest that such oversight must be democratic.


In parallel to interest in democratic processes for AI governance, some people point out that new technology can be built in ways that work in favor of democratic processes, or in ways that favor authoritarian structures:


Another common theme among these perspectives is the need for new modes of governance that keep pace with cutting-edge technologies, ensuring that reflective deliberation about their use keeps up with the development of new capabilities.

4. Dangers of Rapid Development

Talk to the City clusters: Pausing training of powerful AI systems; parts of LLM safety & technical investigations

Another specific concern raised by many voices – many of them adjacent to the previous cluster, in our graph – is the speed of development of new technology, compared to how quickly we are coming to understand its effects.


In particular, people in this group point to the effects of race dynamics on all other issues faced by the field. Many signatories of the FLI letter cited this incentive for speed, and the need to counter it, as motivating factors:


Others felt that the FLI letter's call for a six-month pause in development did not go far enough:


Members of this group had a variety of views on how best to handle development races between countries, but were largely in agreement about the importance of achieving as much international cooperation as possible toward slowing down the development of advanced AI.

5. Human Interests are Difficult to Formalize

Talk to the City clusters: Why future alignment is hard; parts of Comparing machine & human intelligence

Separate from the difficulty of ensuring present and future AI systems work towards human interests is the question of what, exactly, those human interests are. Human morality is complex, full of nuance, and subjective – making it difficult to describe our values and goals clearly enough for use in formal systems.


People in this group point out that many of our existing systems are misaligned, and that current approaches to the alignment of new technology will have similar issues:


And in some sense, specifying our intent even for simple software systems has been a problem throughout the history of the field:


As an aside, the current Twitter dataset includes fewer voices from this perspective than we expect to hear in the future, as more people from relevant non-technical fields (e.g. ethics, psychology, and moral philosophy) join the conversation.

6. Optimistic AI Futures

Talk to the City cluster: AI optimism

Most distinct from the other groups in this discourse are the people suggesting reasons to be optimistic about the integration of cutting-edge AI – from expectations that safety will be a solvable problem, to excitement about amazing new experiences it could bring about for humanity.

Optimistic discussions of AI safety point to similar problems with past technologies which we now generally recognize as safe:


Others expect that as AI improves, it will pose fewer risks to human interests, and that advances in AI safety will correlate with technological progress:


Still others make less specific claims about outcomes, but reflect on how humanity's development of tools has brought us abundance in the past – and, in some cases, suggest that our uncertainty about the future means that unimaginably positive outcomes are also possible:

Conclusions & Further Questions

The complexity, impact, and global scale of this discussion result in a diversity of approaches and motivations. Our goal here was to group perspectives by what they consider the most important aspects of the discussion, and which actions they believe should be taken in response. As these perspectives come into contact with one another, they sometimes generate interesting syntheses and novel ideas – but they also lead to detailed arguments about the best paths forward.

One common theme we noticed from people across these categories was an interest in more coordination between these different approaches to the problem of AI safety. While the groups outlined above have different priorities and suggestions for how to ensure safe adoption of AI, many participants note that these differences are outweighed by their common goals.


These conversations are difficult to capture in sufficient nuance and depth, especially on platforms like Twitter that require concise, discrete statements of beliefs. We plan to create contexts that support richer, multi-threaded approaches to discussing problems of AI safety, and hope others in the space will join us in doing so. This initial report is intended as a jumping-off point for further analysis: the clusters we've identified will inevitably morph over time, as groups deliberate with and learn from each other. Other approaches to analyzing these conversations may even move away from clustering algorithms entirely, towards more granular analysis of the questions under consideration.

With a platform that can articulate the overlap and divergence of different viewpoints, and source agreements and disagreements within groups efficiently and comprehensively, we believe we can develop policy and deliberate more effectively. We hope this initial report offers a starting point for more reflection on the landscape of ideas in this discourse as we continue working to clarify large-scale conversations, with the aim of capturing as much nuance and variety of perspective as possible. Look for our ongoing work towards that end on our Substack and Twitter.


For comments or suggestions on this project, talk to us at @AIObjectives. If you want to help us build more tools for collective deliberation, get in touch at hello@objective.is. For deeper discussion of the ideas and theory of impact behind all AOI's projects, including Talk to the City, check out our whitepaper. We welcome collaboration with anyone who wants to get involved with the work we're doing – reach out if you want to join us on our mission to use these rapidly advancing technologies to create the best possible futures.

Methods & Caveats

Our approach let us quickly capture discussions unfolding among large groups, but it has a number of drawbacks:

  • The time-limited nature of the data means we don't capture the clearest, most precise, or most accurate articulations of people's views. Many of the people making comments have previously explained their views in much greater nuance and depth elsewhere – often in longform newsletters or articles not quoted anywhere on Twitter – and make only brief allusions to them in tweets. The data here capture the most common and recently relevant aspects of the interlocutors' views, but more targeted surveys are necessary to capture those views in high fidelity.

  • Our tweet-by-tweet analysis makes it difficult to capture more complex dialogue between multiple parties: brief replies that were clear in the context of a broader discussion are hard to parse, and we don't evaluate threads as a whole even when they outline a single perspective. This influenced our approach of categorizing tweets by their core goals, values, and assessments – as opposed to their evaluations of other perspectives.

  • Using LLMs to gauge relevance is imperfect. For this project, we were most concerned with missing tweets relevant to the conversation, and the classifier performed better than expected on that front – in a manual review, only 13 of the tweets it discarded were deemed relevant – but it did include a number of tweets that were unrelated to the discussion or low in content (e.g. single-emoji responses).

  • Our choice of which voices to include in this discussion is only as diverse as the follow graphs of our initial users. We hope that as this conversation unfolds, it will expand to include more people and ideas relevant to questions of alignment – including ethicists, economists, complex systems theorists, psychologists, and others with insight to bring.

Many thanks to Alex Gray, Emma Bluemke, Gaia Dempsey, and Anna Yelizarova for their input and feedback on this piece.
