AI and Snake Oil
We’ve put together a guide to some of the most common pitfalls you’ll face when reading about AI online – in whitepapers, product pages, and journalism.
Choosing an Artificial Intelligence solution when you’re not comfortable talking technology is like browsing a Parisian bookstore when you failed your GCSE French.
You can tell when the cover looks good, and you recognise some of the words (even if you're not quite sure what they mean when you put them together), but when you flip a book open you're overwhelmed enough that you end up leaving with a stack of recommendations from a shop assistant who has deduced what you're looking for from your broken French.
Now imagine you have been given a book of ‘common phrases’ that is full of mistranslations.
Understanding literature on AI is fraught with the 'gotcha's and 'well, technically's that we're all so familiar with – the kind that lure us into purchases that might not make sense for our teams.
I’m afraid it’s turtles all the way down
A common way to avoid answering tough questions about the outstanding challenges facing an AI solution or implementation is to deflect the question onto another technology. 'But how do we protect the integrity of the data?' 'We'll use the blockchain.'
It's not that technologies shouldn't be used in combination to solve problems. It's that, often, those making such statements are implying "that problem isn't really ours to solve." The trick is to pry into how these technologies will slot together – has the product creator put any thought into it? If you're hiring them for their expertise, are they also experts in these complementary technologies? Could they answer questions on how that other technology actually solves the challenges of the one they're selling?
100% success rate (for the one person I asked)
It is extremely common to see the results of AI implementations reported with enticing lines like "98% accurate". The only problem is that, without qualification, such a sentence is meaningless. If I built a machine that always output "False" and then fed it the statements "The sky is neon green", "Water is dry" and "Pigs fly", my machine would be correct 100% of the time. It is "100% accurate".
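To make that concrete, here is a minimal sketch of that always-"False" machine (the statements and labels are invented for illustration). It learns nothing from its input, yet reports a perfect score on a test set where every statement happens to be false:

```python
def always_false_model(statement: str) -> bool:
    """A 'classifier' that ignores its input and predicts False for everything."""
    return False

# Hypothetical test set: every statement happens to be false, so every label is False.
test_set = [
    ("The sky is neon green.", False),
    ("Water is dry.", False),
    ("Pigs fly.", False),
]

correct = sum(always_false_model(text) == label for text, label in test_set)
print(f"Accuracy: {correct / len(test_set):.0%}")  # prints "Accuracy: 100%"
```

The headline number is true, but it only tells you something if you also know what the model was tested against.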
It's not that such claims are useless to make. If the figure comes from a benchmark dataset that other AI models have also been tested on, that "98% accurate" places its effectiveness in relation to other models tested on the same data. The dataset might also be very representative of the data you expect for your own use case. For example, if a model was 98% effective at predicting ground tremors based on real-time data and unsupervised learning over a period of three years, you might be more confident that the 98% has meaning for you than if the assessment dataset had been collected over a period of two weeks.
The trick is drilling down into the meaning behind the statistic.
It works!* (*Limited time only)
The effectiveness of some AI solutions comes with a limited-time warranty. Assessments of accuracy might be extremely positive based on the data collected now, but, depending on the data involved, the solution may not be effective a year from now.
This is a problem that frequently faces AI dealing with natural language, because the way that we communicate (online and in person) is constantly evolving. AI solutions for detecting misinformation, for example, can be extremely effective at detecting misinformation campaigns that occurred in the past (because they were active in the period that the labelled data was collected from). But then the language of misinformation changes. The campaigns are new, the actors behind them are new, and the techniques are new. The AI algorithm needs to be updated with new, more relevant, training data.
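One hedged way to check whether a model has this kind of expiry date is to evaluate it separately on older and newer labelled data, rather than on a single shuffled split. The sketch below assumes a placeholder `model.predict` interface and examples tagged with a collection date; none of this reflects any particular vendor's API.

```python
from datetime import date

def accuracy(model, examples):
    """Accuracy of `model` over (text, label, collected_on) examples."""
    return sum(model.predict(text) == label for text, label, _ in examples) / len(examples)

def accuracy_by_period(model, examples, cutoff: date):
    """Evaluate separately on data collected before and after `cutoff`,
    rather than on one shuffled split that hides drift over time."""
    older = [e for e in examples if e[2] < cutoff]
    newer = [e for e in examples if e[2] >= cutoff]
    return {
        "before_cutoff": accuracy(model, older),
        "after_cutoff": accuracy(model, newer),
    }

# A large gap between the two figures suggests the labelled training data no
# longer reflects how the language (or the campaigns behind it) looks today.
```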
That doesn't mean these AI models are not good purchases – just that you need to know whether your purchase has an expiry date, and decide whether you can afford the maintenance of keeping it relevant.
We found it like this
We're quick to use language that hides the human behind the AI model. Headlines read "AI has discovered…" when what they mean is "a research scientist has discovered, through the application of an AI model…". The problem with such attribution, beyond crediting tools with accomplishments over the people who wield them, is that it also disguises some of the fallibility of AI.
We talk about these conclusions as if they are beyond reproach – after all, they could only have been reached logically if AI was involved – but baked into every AI model are the circumstances under which it was created. The easiest illustration is the failure of facial recognition algorithms on minority groups. The datasets used, selected by humans, contained overwhelmingly white faces, leading the AI to favour Euro-centric features when distinguishing between people – something none of the humans on the team picked up on. The AI was not learning to recognise human faces. It was learning to recognise the differences between pictures curated for it by a group of people.
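One practical way to surface that kind of baked-in bias is to ask for performance broken down per group rather than a single headline figure. A minimal sketch, assuming a placeholder `model.predict` interface and hypothetical group labels attached to each example:

```python
from collections import defaultdict

def accuracy_by_group(model, examples):
    """Accuracy per (hypothetical) demographic group for (image, label, group)
    examples, instead of a single aggregate figure that can hide poor
    performance on under-represented groups."""
    totals, correct = defaultdict(int), defaultdict(int)
    for image, label, group in examples:
        totals[group] += 1
        correct[group] += int(model.predict(image) == label)
    return {group: correct[group] / totals[group] for group in totals}

# If one group scores far below the headline number, the human-curated
# training data is the first place to look.
```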
Remembering where AI comes from reminds us to be sceptical of the results, and to question the methods employed in their creation and application.
To conclude
Often these little 'gotcha's are not intentional traps sprung by organisations selling AI products. Some of these turns of phrase and omissions have been normalised over the years to the point where we expect to read them, and AI vendors expect to use them.
The trick is to be aware of what’s actually being said versus taking everything at face value. Armed with knowledge of some of these pitfalls, you will at least know a few soft spots to prod before you commit to onboarding an AI solution.
If you want to read more of our AI write-ups, follow us on social media and keep an eye on our website, which we update frequently with new blogs.