Face2Face is still a work in progress, with a team of researchers from Stanford, the Max Planck Institute, and the University of Erlangen-Nuremberg constantly improving the program. The working prototype allows anyone to use a standard RGB webcam to capture video footage of their face, either saying words or making facial expressions. Pairing that footage with video footage of a celebrity, president, or public figure in the Face2Face program will make it look like the person is saying pretty much anything you want them to.
The system pairs two monocular video streams (each captured with a single camera, such as a webcam) to make the finished video dub. Face2Face tracks the facial expressions in both the source video and the target video so that it can warp the mouth movement in the final video into something believable. The system then combines the two feeds using a “dense photometric consistency measure” and re-renders the source expression onto the target face in the original video setting. It’s a little scary: even as an unfinished prototype, Face2Face produces edited video that is pretty convincing.
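To give a rough feel for what a photometric consistency measure does, here is a toy sketch in Python. It is not the Face2Face implementation. The names `photometric_error`, `fit_expression`, `source_render`, and `target_render` are hypothetical, the "renderers" just produce tiny 2x2 grayscale images, and a single "mouth open" weight stands in for a full set of expression parameters. The idea it illustrates is only this: pick the expression whose rendered face best matches the pixels in the source frame, then render that same expression on the target face.

```python
def photometric_error(rendered, observed, mask):
    """Mean squared intensity difference over the pixels where mask is True."""
    total = 0.0
    count = 0
    for r_row, o_row, m_row in zip(rendered, observed, mask):
        for r, o, m in zip(r_row, o_row, m_row):
            if m:
                total += (r - o) ** 2
                count += 1
    return total / count if count else 0.0


def fit_expression(render, observed, mask, candidates):
    """Return the candidate weight whose render is closest to the observed frame."""
    return min(candidates, key=lambda w: photometric_error(render(w), observed, mask))


def source_render(w):
    # Hypothetical source face: the bottom row (the "mouth") brightens as w grows.
    return [[0.5, 0.5], [0.2 + 0.6 * w, 0.2 + 0.6 * w]]


def target_render(w):
    # Hypothetical target face: different base appearance, same expression control.
    return [[0.8, 0.8], [0.1 + 0.6 * w, 0.1 + 0.6 * w]]


# Pretend this frame came from the webcam; it was produced with w = 0.75.
source_frame = source_render(0.75)
face_mask = [[True, True], [True, True]]
candidates = [i / 4 for i in range(5)]  # 0.0, 0.25, 0.5, 0.75, 1.0

# Step 1: track the source expression by minimizing the photometric error.
tracked_w = fit_expression(source_render, source_frame, face_mask, candidates)

# Step 2: re-render the target face with the tracked source expression.
reenacted_frame = target_render(tracked_w)
```

A real system would optimize many parameters at once (face shape, pose, lighting, expression) over every frame, rather than trying five candidate values for one weight.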
The challenge of making a convincing video may have a lot to do with how popular the target star is: if someone we’re used to seeing on TV all the time suddenly makes facial expressions that aren’t part of their usual repertoire, viewers’ brains will automatically raise some red flags. Nonetheless, savvy video hackers could theoretically make very believable video and audio dubs of perfectly legitimate speeches or broadcast appearances. Some people worry that this kind of system threatens video as a medium by making it inherently less reliable or believable, but the truth is that hoaxes have been around for ages in every medium.
There’s no telling what this software might eventually be used for, but we’re definitely looking forward to all the hilarious speech remixes that will undoubtedly hit YouTube once it goes mainstream.