
LSE Study Exposes AI Bias in Social Care

A new study from the London School of Economics highlights how AI systems can reinforce existing inequalities when used for high-risk activities like social care.

Writing in The Guardian, Jessica Murray describes how Google’s Gemma model summarized identical case notes differently depending on gender.

An 84-year-old man, “Mr Smith,” was described as having a “complex medical history, no care package and poor mobility,” while “Mrs Smith” was portrayed as “[d]espite her limitations, she is independent and able to maintain her personal care.” In another example, Mr Smith was noted as “unable to access the community,” but Mrs Smith as “able to manage her daily activities.”

These subtle but significant differences risk making women’s needs appear less urgent, and could influence the care and resources provided. By contrast, Meta’s Llama 3 did not use different language based on gender, underscoring both that bias varies across models and that bias needs to be measured in any LLM adopted for public service delivery.
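
For readers wondering what this kind of measurement could look like in practice, here is a minimal sketch of a paired-prompt check. It assumes you supply a `summarize` callable wired to whichever model you are evaluating; the gender-swapping helper and the framing-term lists are illustrative assumptions, not the LSE study’s actual methodology.

```python
import re

# Simplified gendered-term swaps. Real case notes would need more careful
# handling (names, possessive vs. objective "her", and so on).
SWAPS = {"mr": "mrs", "mrs": "mr", "he": "she", "she": "he",
         "him": "her", "her": "him", "his": "her"}

def swap_gender(text: str) -> str:
    """Return the same case note with gendered terms swapped."""
    def repl(match):
        word = match.group(0)
        swapped = SWAPS[word.lower()]
        return swapped.capitalize() if word[0].isupper() else swapped
    pattern = r"\b(" + "|".join(SWAPS) + r")\b"
    return re.sub(pattern, repl, text, flags=re.IGNORECASE)

# Illustrative framing terms to count in each summary.
CONCERN_TERMS = ["complex medical history", "poor mobility", "unable"]
INDEPENDENCE_TERMS = ["independent", "able to manage", "able to maintain"]

def compare_summaries(case_note: str, summarize) -> dict:
    """Summarize the original and gender-swapped note, then count framing terms."""
    report = {}
    for label, note in (("original", case_note), ("swapped", swap_gender(case_note))):
        summary = summarize(note).lower()
        report[label] = {
            "concern": sum(term in summary for term in CONCERN_TERMS),
            "independence": sum(term in summary for term in INDEPENDENCE_TERMS),
        }
    return report
```

Running a batch of anonymized case notes through both versions and comparing the counts is one crude way to surface the kind of asymmetry the study describes.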

These findings reinforce why AI systems must be valid and reliable, safe, transparent, accountable, privacy-protective, and human-rights affirming. This is especially the case in high-risk settings where AI systems affect decisions about access to essential public services.


Unpacking the Global Pivot from AI Safety

The global pivot away from AI safety is now driving much of international AI policy. This shift is often attributed to the current U.S. administration and is reshaping how liberal democracies approach AI governance.

In a recent article on Lawfare, Jakub Kraus argues there are deeper reasons behind this shift. Specifically, countries such as France had already begun reorienting toward innovation-friendly frameworks before the current American administration’s actions. The rapid emergence of ChatGPT also sparked a fear of missing out and a surge in AI optimism, while governments confronted the perceived economic and military opportunities associated with AI technologies.

Kraus concludes his article by arguing that there may be some benefits to emphasizing opportunity over safety, while also recognizing the risks of failing to build effective international or domestic governance institutions.

However, if AI systems are not designed to be safe, transparent, accountable, privacy protective, and human rights affirming, there is a risk that people will come to distrust these systems based on the actual and potential harms of their being developed and deployed without sufficient regulatory safeguards. The result could be a range of socially destructive harms and a long-term hesitancy to take advantage of the potential benefits associated with emerging AI technologies.


Learning from Service Innovation in the Global South

Western policy makers, understandably, often focus on how emerging technologies can benefit their own administrative and governance processes. Looking beyond the Global North to understand how other countries are experimenting with administrative technologies, such as those with embedded AI capacities, can productively reveal the benefits and challenges of applying new technologies at scale.

Rest of World continues to be a superb resource for stepping outside the usual discussions and news cycles, with its vision of capturing people’s experiences of technology outside of the Western world.

Their recent article, “Brazil’s AI-powered social security app is wrongly rejecting claims,” on the use of AI technologies in Latin American countries reveals the profound potential that automation has for processing social benefits claims, as well as how automated systems can struggle with complex claims and further disadvantage the least privileged in society. In focusing on Brazil, we learn how the government is turning to automated systems to expedite access to services; while in aggregate these automated systems may be helpful, there are still complex cases where automation is impairing access to (now largely automated) government services and benefits.

The article also mentions how Argentina is using generative AI technologies to help draft court opinions and Costa Rica is using AI systems to optimize tax filing and detect fraudulent behaviours. It is valuable for Western policymakers to see how smaller, more nimble, or more resource-constrained jurisdictions are integrating automation into service delivery, to learn from their positive experiences, and to improve upon (or avoid repeating) innovations that lead to inadequate service delivery.

Governments are very different from companies. They provide services and assistance to highly diverse populations and, as such, the ‘edge cases’ that government administrators must handle require a degree of attention and care that is often beyond the obligations that corporations have, or adopt, towards their customer base. We cannot ask, or expect, government administrators to behave like companies, because the two have fundamentally different obligations and expectations.

It behooves all who are contemplating the automation of public service delivery to consider how this goal can be accomplished in a trustworthy and responsible manner, where automated services work properly, are fit for purpose, and are safe, privacy protective, transparent and accountable, and human rights affirming. Doing anything less risks entrenching or further systematizing existing inequities that already harm or punish the least privileged in our societies.


Foundational Models, Semiconductors, and a Regulatory Opportunity

Lots to think about in this interview with Arm’s CEO.

Of note: the discussion of how the larger AI models in use today will only have noticeable effects on user behaviour on edge or endpoint devices in two to three years, once semiconductors have properly caught up.

Significantly, this may mean policy makers still have some time to establish appropriate regulatory frameworks and guardrails ahead of what may be more substantive and pervasive changes to daily computing.


Some Challenges Facing Physician AI Scribes

Recent reporting from the Associated Press highlights the potential challenges of adopting emergent generative AI technologies in the working world. Their reporting focused on how American health care providers are using OpenAI’s transcription tool, Whisper, to transcribe patients’ conversations with medical staff.

These activities are occurring despite OpenAI’s warnings that Whisper should not be used in high-risk domains.

The article reports that a “machine learning engineer said he initially discovered hallucinations in about half of the over 100 hours of Whisper transcriptions he analyzed. A third developer said he found hallucinations in nearly every one of the 26,000 transcripts he created with Whisper. The problems persist even in well-recorded, short audio samples. A recent study by computer scientists uncovered 187 hallucinations in more than 13,000 clear audio snippets they examined.”

Transcription errors can be very serious. Research by Prof. Koenecke and Prof. Sloane of the University of Virginia found:

… that nearly 40% of the hallucinations were harmful or concerning because the speaker could be misinterpreted or misrepresented.

In an example they uncovered, a speaker said, “He, the boy, was going to, I’m not sure exactly, take the umbrella.”

But the transcription software added: “He took a big piece of a cross, a teeny, small piece … I’m sure he didn’t have a terror knife so he killed a number of people.”

A speaker in another recording described “two other girls and one lady.” Whisper invented extra commentary on race, adding “two other girls and one lady, um, which were Black.”

In a third transcription, Whisper invented a non-existent medication called “hyperactivated antibiotics.”

In some cases the underlying voice data is deleted for privacy reasons, which can impede physicians (or other medical personnel) from double-checking the accuracy of a transcription. And while some errors may be caught easily and quickly, more subtle errors or mistakes are less likely to be noticed.

One area where work still needs to be done is assessing the relative accuracy of AI scribes versus that of physicians. While automated transcription may introduce errors, what is the error rate of physicians’ own notes? And what is the difference in quality of care between a physician who self-transcribes during an appointment versus one who reviews transcriptions after the interaction? These are central questions that should play a significant role in assessments of when and how these technologies are deployed.
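
As a rough illustration of what such an accuracy comparison could involve, below is a minimal sketch that computes word error rate (WER) between a transcript and a gold-standard reference. The sample sentences, and the assumption that reference transcripts exist at all, are purely illustrative.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # Standard Levenshtein distance over words via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

# Hypothetical comparison of an AI scribe and a clinician's own notes
# against the same reference transcript.
reference = "patient reports taking amoxicillin twice daily"
ai_transcript = "patient reports taking hyperactivated antibiotics twice daily"
clinician_notes = "patient reports taking amoxicillin daily"

print(f"AI scribe WER: {wer(reference, ai_transcript):.2f}")
print(f"Clinician WER: {wer(reference, clinician_notes):.2f}")
```

A plain WER comparison like this would not capture clinically meaningful differences (a hallucinated medication is far worse than a dropped filler word), so any real evaluation would need to weight errors by harm rather than simply count them.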


Russian State Media Disinformation Campaign Exposed

Today, a series of Western allies — including Canada, the United States, and the Netherlands — disclosed the existence of a sophisticated Russian social media influence operation that was being run by RT. The details of the campaign are exquisite, and include some of the code used to drive the operation.

Of note, the campaign used a covert artificial intelligence (AI)-enhanced software package to create fictitious online personas, representing a number of nationalities, to post content on X (formerly Twitter). Using this tool, RT affiliates disseminated disinformation to and about a number of countries, including the United States, Poland, Germany, the Netherlands, Spain, Ukraine, and Israel.

Although the tool was only identified on X, the authoring organizations’ analysis of the software used for the campaign indicated the developers intended to expand its functionality to other social media platforms. The authoring organizations’ analysis also indicated the tool is capable of the following:

  1. Creating authentic appearing social media personas en masse;
  2. Deploying content similar to typical social media users;
  3. Mirroring disinformation of other bot personas;
  4. Perpetuating the use of pre-existing false narratives to amplify malign foreign influence; and
  5. Formulating messages, to include the topic and framing, based on the specific archetype of the bot.

Mitigations to address this influence campaign include:

  1. Consider implementing processes to validate that accounts are created and operated by a human person who abides by the platform’s respective terms of use. Such processes could be similar to well-established Know Your Customer guidelines.
  2. Consider reviewing and making upgrades to authentication and verification processes based on the information provided in this advisory;
  3. Consider protocols for identifying and subsequently reviewing users with known-suspicious user agent strings (a sketch of such a check follows this list);
  4. Consider making user accounts Secure by Default by using default settings such as MFA, default settings that support privacy, removing personally identifiable information shared without consent, and clear documentation of acceptable behavior.
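
As an illustration of what the third mitigation might look like in code, here is a minimal sketch that flags accounts whose user agent strings match known-suspicious patterns. The pattern list and record format are assumptions made for the example, not indicators taken from the advisory.

```python
import re

# Hypothetical indicators; in practice these would come from the advisory
# or a threat-intelligence feed rather than being hard-coded here.
SUSPICIOUS_UA_PATTERNS = [
    r"HeadlessChrome",    # headless browsers commonly used for automation
    r"python-requests/",  # scripted clients posing as ordinary users
    r"okhttp/\d",         # library defaults rarely seen from real people
]

def flag_suspicious_accounts(login_records):
    """Return the account IDs whose user agent matches a suspicious pattern.

    `login_records` is an iterable of dicts such as
    {"account_id": "123", "user_agent": "Mozilla/5.0 ..."}.
    """
    compiled = [re.compile(p, re.IGNORECASE) for p in SUSPICIOUS_UA_PATTERNS]
    flagged = set()
    for record in login_records:
        user_agent = record.get("user_agent", "")
        if any(p.search(user_agent) for p in compiled):
            flagged.add(record["account_id"])
    return flagged
```

On its own a user agent check is weak evidence, since these strings are trivially spoofed, so it is best treated as one signal feeding into the review processes described above.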

This is a continuation of how AI tools are being (and will be) used to expand actors’ ability to undertake next-generation digital influence campaigns. And while it is adversaries who are found using these techniques today, we should anticipate that private companies (and others) will offer similar capabilities in the near future, in democratic and non-democratic countries alike.


2024.6.27

For the past many months I’ve had the joy of working with, and learning from, a truly terrific set of colleagues. One of the files we’ve handled has concerned law reform in Ontario, and specifically Bill 194, the Strengthening Cyber Security and Building Trust in the Public Sector Act.

Our organization’s submission focuses on ways to further improve the legislation, offering 28 recommendations that apply to Schedule 1 (concerning cybersecurity, artificial intelligence, and technologies affecting individuals under the age of 18) and Schedule 2 (amendments to FIPPA). Broadly, our recommendations concern the levels of accountability, transparency, and oversight that are needed in a rapidly changing world.


What Does It Mean To “Search”?

Are we approaching “Google zero”, where Google searches will use generative AI systems to summarize responses to queries, thus ending the reason for people to visit websites? And if that happens, what is lost?

These are common questions that have been building month over month as more advanced foundational models are built, deployed, and iterated upon. But there has been relatively little public assessment of the social dimensions of making a web search. Instead, the focus has tended to be on the loss of traffic and the subsequent economic effects of this transition.

A 2022 paper entitled “Situating Search” identifies what a search engine does, and what it is used for, in order to argue that search which only provides specific requested information (often inaccurately) fails to account for the broader range of things people use search for.

Specifically, when people search they:

  • lookup
  • learn
  • investigate

When a ChatGPT or Gemini approach to search is applied, however, it limits the range of options before a user. Specifically, binding search to conversational responses may impair individuals’ ability to conduct search and learning in ways that expand domain knowledge or that rely on sensemaking of results to come to a given conclusion.

Page 227 of the paper has a helpful overview of the dimensions of Information Seeking Strategies (ISS), which explain the links between search and the kinds of activities in which individuals engage. Why, also, might chat-based (or other multimodal) search be a problem?

  • it can come across as too authoritative
  • by synthesizing data from multiple sources and masking the available range of sources, it undercuts the individual’s ability to explore the broader knowledge space
  • LLMs, in synthesizing text, may provide results that are not true

All of the above issues are compounded in situations where individuals have low information literacy and, thus, are challenged in their ability to recognize deficient responses from an AI-based search system.

The authors ultimately conclude with the following:

…we should be looking to build tools that help users find and make sense of information rather than tools that purport to do it all for them. We should also acknowledge that the search systems are used and will continue to be used for tasks other than simply finding an answer to a question; that there is tremendous value in information seekers exploring, stumbling, and learning through the process of querying and discovery through these systems.

As we race to upend the systems we use today, we should avoid moving quickly and breaking things, and instead opt to enhance and improve our knowledge ecosystem. There is a place for these emerging technologies, but rather than bolting them onto (and into) all of our information technologies we should determine when they are, and are not, fit for a given purpose.


New York City’s Chatbot: A Warning to Other Government Agencies?

A good article by The Markup assessed the accuracy of New York City’s municipal chatbot. The chatbot is intended to provide New Yorkers with information about starting or operating a business in the city. The journalists found that the chatbot regularly provided false or incorrect information, which could create legal repercussions for businesses and significantly discriminate against city residents. Problematic outputs included incorrect housing-related information and wrong answers about whether businesses must accept cash for services rendered, whether employers can take cuts of employees’ tips, and more.

While New York does include a warning to those using the chatbot, it remains unclear (and perhaps doubtful) that residents who use it will know when to dispute its outputs. Moreover, the statements about how the tool can be helpful and about the sources it is trained on may lead individuals to trust the chatbot.

In aggregate, this speaks to how important it is to communicate effectively with users, beyond policies simply mandating some kind of disclosure of the risks associated with these tools, and demonstrates the importance of government institutions more carefully assessing (and appreciating) the risks of these systems prior to deploying them.


Near-Term Threats Posed by Emergent AI Technologies

In January, the UK’s National Cyber Security Centre (NCSC) published its assessment of the near-term impact of AI on cyber threats. The whole assessment is worth reading for its clarity and brevity in identifying the different ways that AI technologies will be used by high-capacity state actors, by other state and well-resourced criminal and mercenary actors, and by comparatively low-skill actors.

A few items which caught my eye:

  • More sophisticated uses of AI in cyber operations are highly likely to be restricted to threat actors with access to quality training data, significant expertise (in both AI and cyber), and resources. More advanced uses are unlikely to be realised before 2025.
  • AI will almost certainly make cyber operations more impactful because threat actors will be able to analyse exfiltrated data faster and more effectively, and use it to train AI models.
  • AI lowers the barrier for novice cyber criminals, hackers-for-hire and hacktivists to carry out effective access and information gathering operations. This enhanced access will likely contribute to the global ransomware threat over the next two years.
  • Cyber resilience challenges will become more acute as the technology develops. To 2025, GenAI and large language models will make it difficult for everyone, regardless of their level of cyber security understanding, to assess whether an email or password reset request is genuine, or to identify phishing, spoofing or social engineering attempts.

There are more insights, such as the value of training data held by high-capacity actors and the likelihood that low-skill actors will see significant upskilling over the next 18 months due to the availability of AI technologies.

The potential to assess information more quickly may have particularly notable impacts in the national security space, enable more effective corporate espionage operations, and enhance cyber criminal activities. In all cases, the ability to assess and query volumes of information at speed and scale will let threat actors extract value from information more efficiently than today.

The fact that the same technologies may enable lower-skilled actors to undertake wider ransomware operations, in which it will be challenging to distinguish legitimate from illegitimate security-related emails, also speaks to the desperate need for organizations to transition to higher-security solutions, including multi-factor authentication or passkeys.