Categories
Writing

Some Challenges Facing Physician AI Scribes

Recent reporting from the Associated Press highlights the potential challenges in adopting emergent generative AI technologies into the working world. Their reporting focused on how American health care providers are using OpenAI’s transcription tool, Whisper, to transcribe patients’ conversations with medical staff.

These activities are occurring despite OpenAI’s warnings that Whisper should not be used in high-risk domains.

The article reports that a “machine learning engineer said he initially discovered hallucinations in about half of the over 100 hours of Whisper transcriptions he analyzed. A third developer said he found hallucinations in nearly every one of the 26,000 transcripts he created with Whisper. The problems persist even in well-recorded, short audio samples. A recent study by computer scientists uncovered 187 hallucinations in more than 13,000 clear audio snippets they examined.”

Transcription errors can be very serious. Research by Prof. Koenecke of Cornell University and Prof. Sloane of the University of Virginia found:

… that nearly 40% of the hallucinations were harmful or concerning because the speaker could be misinterpreted or misrepresented.

In an example they uncovered, a speaker said, “He, the boy, was going to, I’m not sure exactly, take the umbrella.”

But the transcription software added: “He took a big piece of a cross, a teeny, small piece … I’m sure he didn’t have a terror knife so he killed a number of people.”

A speaker in another recording described “two other girls and one lady.” Whisper invented extra commentary on race, adding “two other girls and one lady, um, which were Black.”

In a third transcription, Whisper invented a non-existent medication called “hyperactivated antibiotics.”

In some cases, voice data is deleted for privacy reasons, which can prevent physicians (or other medical personnel) from double-checking the accuracy of a transcription. While some errors may be caught easily and quickly, more subtle mistakes are less likely to be noticed.

One area where work still needs to be done is assessing the relative accuracy of AI scribes versus that of physicians. Automated transcription may introduce errors, but what is the error rate of physicians’ own notes? And what is the difference in quality of care between a physician who is self-transcribing during a meeting and one who reviews a transcription after the interaction? These are central questions that should play a significant role in assessments of when and how these technologies are deployed.

Categories
Aside Links

What Your Klout Score Really Means

Something that hit me while I was reading this (other than how much I dislike Klout) is that companies are increasingly using the ‘service’ to discriminate between preferred and non-preferred customers. I can see a service like Klout developing in the future that is widely used by marketers, insurance agencies, and other groups interested in actuarial sales/risk analysis to mine social media information in order to assign scores that invisibly affect individuals’ daily behaviours and routines.

Hopefully things won’t be so invisible that consumer protection laws can’t be activated to curb such behaviours. Even more hopefully, let’s pray that those laws won’t still have the dulled teeth they have today when Klout on steroids is truly birthed.

Categories
Writing

Don’t Risk Model for Aged, Wealthy, Americans

Data security and communicative privacy matter. The boons of the contemporary computer era have led people across the world to use common services for security, for data processing, and for communications generally, despite users’ radically different risk profiles. Few users are savvy enough to engage in code-level audits, fewer to ascertain the validity of improperly issued security certificates, and likely even fewer to guarantee that programs’ and operating systems’ updates are from the actual developers. These are problems – important problems – that need to be directly addressed by developers.

It’s always been morally wrong to be cavalier about your software’s security profile, and to just discount the potential vulnerabilities or bugs linked to your tools. Things aren’t getting better, however, because state actors are becoming more and more sophisticated in how they target and monitor their citizens’ and residents’ communications. Consequently, the blasé attitude towards security, which has (largely) prioritized successful engineering over successful security in depth, is a larger and larger problem. This attitude, especially when it comes to anti-circumvention and encryption software, is leading to individual users ending up seriously hurt, imprisoned, or dead.

Security is important. Speech is important. And ensuring that secure, private speech is possible is an increasingly critical issue for parties throughout the world. Developers, companies, and individuals ought to take the severity of the consequences of their actions to heart, or risk having very real blood on their hands.