From Disinformation to Deep Fakes: How Threat Actors Manipulate Reality


Deep fakes are expected to become a more prominent attack vector. Here’s how to identify them.

What Are Deep Fakes?

A deep fake is a fabricated image, video or audio clip that maliciously replaces real media in order to manipulate information. Creating images, video and audio of high enough quality to be used in deep fakes requires AI and ML. This use of AI and ML sets deep fakes apart from other types of information manipulation, which rely on less extreme techniques such as misrepresenting information, isolating parts of it or editing it deceptively. Etay Maor, Senior Director of Security Strategy at Cato Networks, adds: “To add complications, the recent advancements and accessibility to AI generated text, such as GPT3, have already been used in combination with deepfakes (as a proof of concept) to create interactive, human looking conversation bots.”

What Do Deep Fakes Look Like?

Deep fakes range from simple to highly advanced. Three of the most prominent types are:

Face Swap

Face swapping is the act of replacing one person’s face in a video or image with another’s. It requires dedicated software, but not necessarily advanced technology: today one can even find mobile apps that support face swapping. Face swapping in mobile apps is usually limited to simple use cases, like swapping between the user’s photos and actors’ faces in movie scenes.

More advanced face swapping does exist, but it requires more model training and code and, as a result, GPUs, which are expensive and resource-intensive. An example of a more advanced face swapping deep fake can be seen in this video, in which the presenter’s face is swapped with Tom Cruise’s:

This Tom Cruise face swap required two hours of training on a GPU, as well as days of professional video editing in post-processing. This might sound like a lot, but it was considered a simpler swap than others, because the presenter had a similar haircut to Cruise and could impersonate his voice, which meant less training and post-processing were required.

Puppet Master (Lip Sync)

The ‘Puppet Master’ deep fake is a technique in which the image of a person’s mouth movements is manipulated, making it seem like the person is saying something they never actually said. Unlike face swapping, which trains a model on the new, swapped face, ‘Puppet Master’ trains a model on the face in the original image, and specifically on its mouth movements.

Here’s what it looks like:

[embedded content]

[warning – explicit language]

The technology behind ‘Puppet Master’ is based on synthesizing the mask, i.e., the original image, and placing it on top of the model of the person who is impersonating them and lip syncing their words.

Audio

The third prominent type of deep fake is audio-based. Audio deep fakes are audio files that take a real person’s voice and make it sound like they are saying something they never said. Audio deep fakes are created by taking audio files, annotating the sounds, training an ML model on the annotations to associate sounds with text, and then generating a new audio file.

Here’s what it sounds like:

[embedded content]

Deep Fakes vs. Cheap Fakes

Not all modified images or audio are deep fakes. While deep fakes are media synthesized or modified using AI, cheap fakes are media synthesized or modified using low-tech methods, and they are easy to spot. They often contain distortions and have clearly been manipulated. Here’s what a cheap fake looks like:

The Cyber Risk of Deep Fakes

Deep fakes have become more realistic and more accessible, and they are faster to create than ever before. This makes them a powerful tool for weaponization. As a result, they pose a risk to businesses and to countries. They can be used for cyber crime, social engineering and fraud, by nation-state threat actors for foreign influence operations, and more.

For example, a deep fake was used to mimic a CEO’s voice and convince an executive to wire $243,000 to a scam account. Etay Maor of Cato Networks says: “Business email compromise and phishing attacks are becoming harder and harder to detect based on simple analysis of the language used. There is a need for a holistic approach, such as the one offered by a single vendor SASE solution, that can detect the attack at different multiple choke points and not rely on isolated point products that are doomed to fail.” In another case, a deep fake was presented as evidence in a child custody case.

Deep fakes can also be used to spread disinformation, i.e., the distribution of false information to influence public opinion or obscure the truth. For example, deep fakes could be used to impersonate world leaders and ignite an attack, or to impersonate a CEO and manipulate a company’s stock price. In other cases, deep fakes enable plausible deniability: people can dismiss any piece of media by claiming it is a deep fake, which erodes social trust.

Finally, deep fakes can be used for defamation, i.e., damaging someone’s good reputation, for example by creating revenge porn.

How to Detect Deep Fakes

There are two main types of methods for accurately detecting deep fakes:

Low-level detection methods
High-level detection methods

Low-level Detection Methods

Low-level detection methods rely on ML models that are trained to identify artifacts, such as pixelation, that were introduced by the deep fake generation process. These artifacts may be imperceptible to the human eye, but the models, which are trained on both real images and deep fake images, are able to detect them.
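To make the idea concrete, here is a deliberately tiny Python sketch of a low-level detector. It is not how production detectors work (those are typically deep neural networks trained on large datasets). It assumes, purely for illustration, that generation artifacts show up as extra high-frequency energy, and it "trains" by picking a threshold between labeled real and fake samples. The function names and the synthetic 8x8 "images" are hypothetical.

```python
from statistics import mean

def high_pass_energy(image):
    """Mean absolute Laplacian response over the interior pixels of a
    grayscale image given as a list of rows of numbers."""
    height, width = len(image), len(image[0])
    total, count = 0.0, 0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            laplacian = (4 * image[y][x]
                         - image[y - 1][x] - image[y + 1][x]
                         - image[y][x - 1] - image[y][x + 1])
            total += abs(laplacian)
            count += 1
    return total / count if count else 0.0

def fit_threshold(real_images, fake_images):
    """'Train' on labeled examples: place the threshold midway between
    the mean feature value of each class."""
    real_mean = mean(high_pass_energy(img) for img in real_images)
    fake_mean = mean(high_pass_energy(img) for img in fake_images)
    return (real_mean + fake_mean) / 2, fake_mean > real_mean

def looks_fake(image, threshold, fake_is_higher):
    """Classify a new image using the learned threshold."""
    energy = high_pass_energy(image)
    return energy > threshold if fake_is_higher else energy < threshold

# Synthetic stand-ins (assumption): a smooth gradient plays the "real"
# image, and the same gradient with a checkerboard artifact plays the "fake".
smooth = [[x + y for x in range(8)] for y in range(8)]
checker = [[x + y + (3 if (x + y) % 2 else 0) for x in range(8)]
           for y in range(8)]

threshold, fake_is_higher = fit_threshold([smooth], [checker])
print(looks_fake(checker, threshold, fake_is_higher))  # True
print(looks_fake(smooth, threshold, fake_is_higher))   # False
```

A single hand-picked feature like this would be easy to evade. The point is only the workflow described above: extract a signal the eye may miss, then learn from labeled real and fake examples where to draw the line.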

High-level Detection Methods

High-level detection methods use models that can identify semantically meaningful features. These include unnatural blinking, head poses or mannerisms, as well as phoneme-viseme mismatches, in which the shape of the mouth (the viseme) does not match the sound being spoken (the phoneme).
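As a rough illustration of a phoneme-viseme check, the Python sketch below relies on one well-known constraint: the sounds M, B and P cannot be produced without closing the lips. It assumes that some upstream tooling, not shown here, has already aligned each video frame with a phoneme label and a normalized mouth-openness score between 0 (closed) and 1 (wide open). The function name, the input format and the 0.2 threshold are all hypothetical.

```python
# Phonemes that require the lips to close fully.
CLOSED_LIP_PHONEMES = {"M", "B", "P"}

def mismatch_ratio(frames, open_threshold=0.2):
    """Return the share of closed-lip phoneme frames in which the mouth
    is visibly open.

    frames: list of (phoneme, mouth_openness) pairs, one per video frame.
    open_threshold: openness above which the mouth counts as open
    (an illustrative value, not a calibrated one).
    """
    checked = [openness for phoneme, openness in frames
               if phoneme in CLOSED_LIP_PHONEMES]
    if not checked:
        return 0.0  # nothing to check in this clip
    mismatches = sum(1 for openness in checked if openness > open_threshold)
    return mismatches / len(checked)

# Hypothetical clip: the "B" frame shows an open mouth, which is suspicious.
clip = [("M", 0.05), ("AA", 0.8), ("B", 0.6), ("P", 0.1)]
print(round(mismatch_ratio(clip), 2))  # 0.33
```

A high ratio does not prove manipulation, since landmark tracking and audio alignment are noisy, but it is the kind of semantic inconsistency that high-level detectors look for.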

Today, these detection methods are considered accurate. However, as deep fake technology improves and becomes more sophisticated, they are expected to become less effective and will need to be updated and improved. In addition to these techniques, each one of us can help detect deep fakes by verifying the media source of videos and images we receive.

To learn more about different types of cybersecurity attacks and how to prevent them, Cato Networks’ Cyber Security Masterclass series is available for your viewing.
