Artificial Intelligence towards automated fake news

The dark side of automation: the evolution of the news from generated to viral fake news

Automated fake news (we could also call it automated hoaxes) is a reality. “Fake news”, as it is called today, is certainly not a recent invention. However, it becomes far more relevant when the mass media play an active role in its dissemination.

In the history of hoaxes that went “viral” thanks to the media, there have been some particularly sensational cases. These cases foreshadowed one of the major concerns about the mass media: their ability to shape public opinion thanks to their enormous reach.

The Great Moon Hoax

The Great Moon Hoax is considered one of the most sensational pranks of all time perpetrated by mass media.

On August 21st, 1835, a small notice appeared on the second page of the New York Sun, apparently taken from a supplement to the Edinburgh Courant. The text announced amazing astronomical discoveries supposedly made by the already distinguished Sir John Herschel using innovative techniques.

First notice (source: Hoax.org)

The following week, an article with the headline below appeared on the front page:

GREAT ASTRONOMICAL DISCOVERIES  
LATELY MADE  
BY SIR JOHN HERSCHEL, L.L.D. F.R.S. &c.  
At the Cape of Good Hope.

It was the first of a series of six articles [1] describing how John Herschel, observing from the Cape of Good Hope (he really had been there), had discovered new forms of life on the moon, including blue unicorns, winged men, and bouncing globes. The series was also translated into Italian the following year and published as Delle scoperte fatte nella luna del dottor Giovanni Herschel.

Lithograph of the Ruby Colosseum from the New York Sun, August 28, 1835

The purported author of the reports was a certain Dr. Andrew Grant, allegedly Herschel’s traveling companion.

Despite the suspicions of the New York Herald, the real author publicly admitted everything only much later, in 1840. It was Richard Adams Locke, one of the two editors of the Sun.

At that time the media was quite divided on the issue, but the coverage they provided did contribute to the general excitement that these “discoveries” created.

Before 1830, such a prank would not even have been conceivable. It was the first sensational demonstration of the power of the mass media that arose after the introduction of the steam-powered press.

War of the Worlds

A century after Locke’s prank, it is no longer only about the press. Thanks to radio, the mass media can now reach almost every American home.

Historical context

In October 1938, the international situation outside the United States seems headed toward the inevitable.

While the Civil War rages in Spain, the war between China and Japan is still going on. Nazi Germany has rearmed and has already annexed Austria and the Sudetenland without firing a shot.

Chamberlain’s illusions of peace are strongly contested by Churchill, who already fears the next claims on Gdańsk.

Martians in New Jersey!

It is precisely in this context that war also lands on American shores, with the invasion of New Jersey by … the Martians!

On the night of October 30th, CBS is broadcasting live music from the imaginary orchestra of Ramon Raquello.

The music, however, is interrupted after a few minutes by an “Intercontinental Radio News reporter” with an… unusual announcement. It seems that astronomers have just spotted a huge blue flame erupting from the surface of Mars.

 

Headlines of the time

It’s just the beginning: soon a meteorite will be “sighted” falling in New Jersey. According to the radio reports, Martians will emerge aboard huge three-legged vehicles. The reports will follow one another frantically, with fictitious interviews describing the catastrophic effects of the invasion.

This is obviously a colossal prank, inspired by “The War of the Worlds”, a novel written by H. G. Wells in 1898.

The whole apparatus was conceived and directed by Orson Welles, and half of the American population was thrown into total chaos… or was it?

In this case, there were actually two pieces of “fake news”. The first was the “live” story of the alien invasion, which did lead some Americans to believe it and rush to the scene, armed and ready to fight the threat. The second was the myth, still solid today, of nationwide panic, with millions of terrified Americans.

Once again the mass media played a crucial role. The press hastened to spread the news of the hoax-generated panic, but this time it was heavily supported by an even more powerful and pervasive medium: the radio.

The Sokal affair

The world of pranks did not spare academic publications, either.

In 1994 Alan Sokal sent Social Text [2] an article titled Transgressing the Boundaries: Towards a Transformative Hermeneutics of Quantum Gravity. The journal (which at the time did not practice peer review) published the article in the spring of 1996.

The peculiarity of the article was that it was composed entirely of meaningless sentences. The only requirements were that the words should sound good together and that the content should align with the journal’s ideology.

Later on, Sokal declared in Lingua Franca that the article’s main purpose was to demonstrate the decline of intellectual standards in the humanities journals of the time [3].

Another interesting case happened in 2005, when three students from MIT replicated Sokal’s experiment by developing SCIgen, a rudimentary program that automatically generated “scientific” pseudo-articles.

The purpose here was to bring to public attention the proliferation of unscrupulous “predatory” journals. Many of them, in fact, would accept any kind of article for a substantial fee without even reading it.

Their website lists a number of other randomly generated papers that have been published.

Internet and the proliferation of sounding boards

Today, with the Internet, the concept of fake news has grown to a totally different scale. While the Internet itself does not reach more people than the radio does, it has radically lowered the entry threshold for publication. In the pre-Internet era, the masses accessed the mass media almost exclusively as passive consumers. Those who had the prerogative to broadcast content were very few.

Today anyone can publish their own ideas. Above all, the crucial point is that anyone can act as a sounding board, and the speed with which news or a hoax can spread has increased dramatically.

People like sensationalized news. It gets clicks, and more often than not readers aren’t even interested in knowing whether what they are reading is true. This traffic generates revenue for the creators of fake news, which encourages them to produce even more.

The era of Artificial Intelligence and Big Data: automated disinformation and fake news

So far, the fake news discussed was mostly the result of pranks or mistakes. At most (as in Sokal’s case), it was intended as a denunciation. Then came the disinformation campaigns, in which the “fake” takes on the meaning of intentional harm toward someone.

The diffusion

“When What Happened, the latest book by Hillary Clinton, debuted on Amazon, the reception was incredible … so incredible that Amazon did not believe it and erased 900 of the 1600 reviews for being suspiciously false, written by people who said they loved or hated the book, but had neither purchased nor read it.” (quoted from Could AI Be the Future of Fake News and Product Reviews? – Scientific American).

It is known that news on social media does not spread along a linear path, but follows a “hub” distribution. In other words, some users are particularly “influential”, often with hundreds of thousands of followers. Posts by this type of user have a far higher chance of spreading.

“Hub” diffusion model
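The hub effect can be illustrated with a small simulation. The sketch below is only a toy under stated assumptions: the network is grown by preferential attachment (new users tend to link to already popular ones), links are treated as two-way, and every reached user passes the post on to each contact with a fixed probability. The function names, the network size, and the 10% sharing probability are all hypothetical choices made for illustration, not measurements from real social media.

```python
import random


def build_hub_network(num_users, links_per_user, rng):
    """Grow a network by preferential attachment: each new user links to
    `links_per_user` existing users, picked proportionally to their degree."""
    neighbors = {user: set() for user in range(num_users)}
    endpoints = []  # every user appears here once per link they have
    core = links_per_user + 1
    for a in range(core):  # small fully connected starting core
        for b in range(a + 1, core):
            neighbors[a].add(b)
            neighbors[b].add(a)
            endpoints += [a, b]
    for new_user in range(core, num_users):
        chosen = set()
        while len(chosen) < links_per_user:
            chosen.add(rng.choice(endpoints))  # popular users are picked more often
        for old_user in sorted(chosen):
            neighbors[new_user].add(old_user)
            neighbors[old_user].add(new_user)
            endpoints += [new_user, old_user]
    return neighbors


def cascade_size(neighbors, first_poster, share_prob, rng):
    """Count the users reached when each newly reached user gets one chance
    to pass the post to each contact, with probability `share_prob`."""
    reached = {first_poster}
    frontier = [first_poster]
    while frontier:
        next_frontier = []
        for user in frontier:
            for contact in sorted(neighbors[user]):
                if contact not in reached and rng.random() < share_prob:
                    reached.add(contact)
                    next_frontier.append(contact)
        frontier = next_frontier
    return len(reached)


def average_reach(neighbors, first_poster, share_prob, rng, trials=200):
    """Average cascade size over several independent runs."""
    total = sum(cascade_size(neighbors, first_poster, share_prob, rng)
                for _ in range(trials))
    return total / trials


rng = random.Random(42)  # fixed seed so the run is repeatable
network = build_hub_network(num_users=300, links_per_user=2, rng=rng)
hub = max(network, key=lambda user: len(network[user]))
ordinary = min(network, key=lambda user: len(network[user]))
hub_reach = average_reach(network, hub, share_prob=0.1, rng=rng)
ordinary_reach = average_reach(network, ordinary, share_prob=0.1, rng=rng)
print(f"hub: {len(network[hub])} links, average reach {hub_reach:.1f}")
print(f"ordinary user: {len(network[ordinary])} links, average reach {ordinary_reach:.1f}")
```

Running the script prints the average number of users reached when the same post starts from the best connected account and from one of the least connected ones, which is the asymmetry that makes hubs attractive targets.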

These “sounding boards” have become targets for the so-called “social bots”. According to Chengcheng Shao of Indiana University, such bots apparently make up most of the accounts that spread automated fake news.

Automated fake news is rapidly becoming a large-scale business. It is no longer just about a student writing from his garage for a few hundred dollars a day.

Now the game is moving toward full-blown disinformation campaigns. The aim is often to influence government elections, or perhaps even the performance of the stock market.

Given the volumes, automation has now become a requirement. Indeed, it is likely that the mere falsification of text-based news will soon no longer be sufficient. The technology is almost ready to falsify stories by also creating images, video, and audio completely from scratch.

Art or counterfeiting? The synthesis of images

Nowadays, the availability of big data has made it possible to develop Artificial Intelligence algorithms of unprecedented power. Along the way, we moved from analysis algorithms to generative algorithms almost without realizing it.

Automatic image synthesis has already started gaining traction. Software like Pix2Pix, for example, can create photo-realistic images from sketches through adversarial neural networks [4]. New technologies with increasingly surprising results emerge almost every day. StackGAN, for example, chains two stages of adversarial networks: the first produces a low-resolution image, and the second refines it into a high-resolution image that is practically photo-realistic.

StackGAN architecture

Art or counterfeiting? The synthesis of videos

As if that weren’t enough, image synthesis is now moving towards its most natural evolution: video synthesis.

Last August, musician Françoise Hardy, in all the beauty of her 20s, appeared in a YouTube video (below). The singer responded to the pressing questions of Chuck Todd (off camera) about why Trump had sent his press secretary Sean Spicer to lie publicly, speaking of “alternative facts”.

What’s wrong? Well, first of all, Françoise Hardy is now 73 [5], and the voice, for those who had not already recognized it, is that of Kellyanne Conway. The surprising thing is that this is not just a video of Françoise Hardy dubbed with Conway’s voice, but an entirely generated video. It is, in fact, a work by Mario Klingemann, who was able to “automatically” recreate the face of Françoise Hardy lip-syncing to the voice of Kellyanne Conway. This was done (again) through adversarial neural networks, trained on old video clips of the singer.

Another technology worth mentioning is DeepStereo, which, starting from different images of a place, can “predict” and reconstruct the scene as if it had been photographed from other points of view.

Art or counterfeiting? Speech synthesis

Speech synthesis is an already widespread field, but the technology used commercially (such as Siri) is still made up of recorded vocal fragments, then chained together. Being inherently limited to the set of recorded phrases, it sounds “true” only under certain circumstances, and is therefore not yet suitable for a realistic “counterfeiting”.

“Deep audio”, that is, audio generated entirely by neural networks, is another story. The idea is to have the network learn the characteristics of the voices it is fed during training, so that it can produce faithful reconstructions in any context.

Because it works directly on raw audio files, the technology is not yet commercially available, but it exists nevertheless. The problem with raw audio is that it requires very dense sampling (typically 16,000 samples per second or more). Moreover, structure in this type of file tends to emerge at several different time scales (see below).
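To get a feel for the numbers involved, here is a minimal sketch of the arithmetic, taking the sampling rate mentioned above as roughly 16,000 samples per second. The mu-law step is an assumption added for illustration: it is a standard telephony technique for squeezing each amplitude into 256 levels, and the helper names are hypothetical.

```python
import math

SAMPLE_RATE_HZ = 16_000  # samples per second, the order of magnitude quoted in the text


def samples_needed(duration_seconds, sample_rate_hz=SAMPLE_RATE_HZ):
    """Number of amplitude values needed to describe a clip of this length."""
    return round(duration_seconds * sample_rate_hz)


def mu_law_encode(amplitude, mu=255):
    """Map an amplitude in [-1, 1] to an integer level in 0..mu.

    Mu-law companding (assumed here, not taken from the article) spends more
    levels on quiet sounds, where the ear is more sensitive.
    """
    compressed = math.copysign(
        math.log1p(mu * abs(amplitude)) / math.log1p(mu), amplitude
    )
    return int((compressed + 1) / 2 * mu + 0.5)


# Different time scales inside the same signal, expressed in samples.
print("10 ms of sound detail:", samples_needed(0.01), "samples")
print("1 second of speech:", samples_needed(1), "samples")
print("3 second sentence:", samples_needed(3), "samples")
print("silence maps to level", mu_law_encode(0.0))
```

Even a three-second sentence corresponds to tens of thousands of values, each of which a generative model has to predict one after the other.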

 

Wave animation

For example, Google’s DeepMind has developed WaveNet, a convolutional neural network that is fed human voices in the form of raw waveform files during training. At each step, WaveNet picks the value of the next sample from the probability distribution computed by the network, given the samples generated so far.
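The sample-by-sample generation described above can be sketched as a simple loop. This is not WaveNet itself: the function `toy_next_sample_distribution` is a hypothetical stand-in for the trained network, and the choice of 256 amplitude levels is an assumption. Only the overall pattern (compute a distribution from the history, draw one value, append it, repeat) is what the sketch is meant to show.

```python
import random

NUM_LEVELS = 256  # assumed number of quantized amplitude levels


def toy_next_sample_distribution(history):
    """Hypothetical stand-in for the trained network.

    Returns one probability per amplitude level, given the samples so far.
    This toy version simply favors values close to the previous sample.
    """
    previous = history[-1] if history else NUM_LEVELS // 2
    weights = [1.0 / (1 + abs(level - previous)) ** 2 for level in range(NUM_LEVELS)]
    total = sum(weights)
    return [weight / total for weight in weights]


def generate_samples(num_samples, rng):
    """Autoregressive generation: each new sample is drawn from a
    distribution that depends on everything generated before it."""
    samples = []
    for _ in range(num_samples):
        probabilities = toy_next_sample_distribution(samples)
        next_value = rng.choices(range(NUM_LEVELS), weights=probabilities, k=1)[0]
        samples.append(next_value)
    return samples


waveform = generate_samples(20, random.Random(7))
print(waveform)
```

Because every sample depends on the previous ones, generation is inherently sequential, which is one reason producing even a second of audio this way is computationally expensive.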

Animation of Wavenet architecture

In the end, the network is even able to reproduce “filler” sounds such as breaths and stuttering (listen below for an example).

The results are much better than those of other technologies already used by Google and, in the case of Mandarin, come quite close to human performance.

 

Below, a comparison between audio samples of the voice generation technologies used by Google:

Parametric Technology

Concatenative Technology

Wavenet

Conclusions

Progress in image and video generation has been very fast so far. Given this rate of growth, it is not difficult to imagine a near future in which it will be possible to forge photographs and videos of scenes that never happened. It will likely also be possible to put completely fabricated words into people’s mouths. The world of fake news and disinformation is about to enter a radiant new era where the only limit is imagination, and the difference between “fake” and reality will be increasingly blurred.

Notes

1. As a way out, it was reported that the observations had ended because of a fire that reduced the magical telescope and the entire laboratory (!) to ashes.

2. Social Text, a journal of post-modern cultural studies.

3. Subsequently, the editors of the journal, while regretting having published the piece, wrote an editorial rebutting Sokal’s conclusions about the journal’s alleged lack of rigor. They claimed to have noticed how “bizarre” the piece was, but to have considered it an honest attempt by a physics professor to venture into the field of postmodern philosophy.

4. This will be the topic of a follow-up article.

5. When she was 20, Todd was not born yet, and Trump himself was obviously not yet president.

Links

The Great Moon Hoax

The Great Moon Hoax – History.com

The moon hoax; or, a discovery of the moon with a vast population of human beings – Internet Archive (scan from the Smithsonian Libraries)

Sokal Affair

Sokal affair – Wikipedia

Editorial response to Sokal hoax by editors of Social Text (pdf)

A Physicist Experiments With Cultural Studies – Lingua Franca

SCIpher

How three MIT students fooled the world of scientific journals

Rooter: A Methodology for the Typical Unification of Access Points and Redundancy

SCIpher – A Scholarly Message Encoder

Fact checking and fake detection

Factmata

FactCheck.org – A Project of The Annenberg Public Policy Center

Could AI fight fake news? Yes-with the help of humans. – IBM iX

Fake News

How Fake News Goes Viral—Here’s the Math – Scientific American

Could AI Be the Future of Fake News and Product Reviews?

Fake news ‘as a service’ booming among cybercrooks • The Register

BBC – Future – Lies, propaganda and fake news: A challenge for our age

How Data And Information Literacy Could End Fake News

Are Facebook and Twitter encouraging fake news? | Daily Mail Online

First Evidence That Social Bots Play a Major Role in Spreading Fake News – MIT Technology Review

Advertisers Try to Avoid the Web’s Dark Side, From Fake News to Extremist Videos – WSJ

A future of AI-generated fake news photos, hands off machine-learning boffins – and more • The Register

Inside the Macedonian fake-news complex

Break Your Own News – Breaking News Generator

Fake news: you ain’t seen nothing yet – Creation stories

WaveNet: A Generative Model for Raw Audio
