PITCH

# Audio Samples for AI-assisted Tagging of Deepfake Audio Calls using Challenge-Response

Each row presents three samples in order: first the imposter's own voice, second the verified ground truth, and third a deepfake that the imposter creates from a recording of the target other than the ground truth. The last column is a short caption.

## Audio Samples of Regular Speech

No Challenge

Deepfake Sounds Genuine

## Audio Samples of Top-11 Valid Machine-Detectable Challenges

Captions are prospective explanations and not machine predictions.

Static Mouth

Audible distortions at 'formalities'

Cup Mouth

Non-compliance and Distortions

Whisper

Non-compliance

Speak Softly

Non-compliance

High Pitch

High Non-Compliance

Foreign Words

Vibrating Voice Distortions (also seen with 'suspends linguistic chain ya ne')

Sing

Non-compliance towards the end

Emotions

Sounds flatter in comparison to imposter

Crosstalk

Non-compliance and Distortions

## Audio Samples of the 9 Weaker Tasks

Speak Loudly

Non-Compliance (Deepfake still whispers)

Read Quickly

Deepfake Sounds Genuine

Read Slowly

Mild Distortions

## Video Samples of Selected Challenges

High-Pitch

Cross-talk (with self-played audio on a phone)

Whisper