Story: Researchers discovered that people find AI-generated (white) faces to be more convincing and realistic compared to actual photographs of real individuals. 124 participants were asked to report their confidence level and to identify the cues they relied on while attempting to distinguish AI from human faces. Further, the researchers report various visual attributes that distinguish AI from human faces but were misinterpreted by participants.
Key Findings:
White AI faces were judged as human (65.9%) significantly more often than white human faces (51.1%)
Dunning-Kruger effect: People who made the most errors in detecting AI faces were the most confident
AI faces were significantly more average (less distinctive), familiar, and attractive, and less memorable than human faces
Among the most important visual attributes that led people to judge AI faces as real were facial proportions, familiarity, and memorability
Among the most important visual attributes that helped people detect AI faces were congruent lighting, facial symmetry, and facial attractiveness
Pixit's Two Cents: The progress made in generating AI faces is astonishing. While the study focused on white faces, we anticipate further studies and results encompassing faces of other races. From the photo above, which of the two images is AI-generated?
Story: Stability AI, known for its Stable Diffusion text-to-image model, has announced its entry into video generation with the introduction of Stable Video Diffusion. The new AI model generates videos by animating existing images. It is currently in a "research preview" phase and is one of the very few open-source video models (Meta and Google recently launched their own).
Key Findings:
Innovative Technology: Stable Video Diffusion is among the few open-source video-generating AI models. It comes in two variants, SVD (lower quality, up to 14 frames) and SVD-XT (higher quality, up to 24 frames), both generating 576×1024-pixel videos from still images.
Training and Potential Issues: The models were trained on a large video dataset whose exact origin is not known, which raises concerns about copyright and ethical usage.
Current Limitations and Future Potential: While offering high-quality outputs, the model has limitations, such as an inability to generate videos without motion or to render text legibly. Future expansions may include more versatile models and a text-to-video tool that no longer requires images as input.
Pixit's Two Cents: At Pixit we also make heavy use of technology created at Stability AI, so we are intrigued to see them make ever more progress in AI image and video generation. This development might directly impact features in our products once it becomes clear where the training data actually comes from.
Story: DeepMind has beta-launched SynthID, a tool designed for watermarking and identifying AI-generated content, including images and audio. This tool enables the embedding of a digital watermark directly into AI creations, which remains imperceptible to humans but detectable for identification purposes.
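DeepMind has not published SynthID's internals, but the core idea of a watermark that is invisible to humans yet machine-detectable can be illustrated with a deliberately simple toy: hiding bits in the least significant bit of pixel values. This sketch is our own illustration of the general concept, not SynthID's actual method (SynthID is far more robust, surviving crops, filters, and compression).

```python
import numpy as np

def embed_watermark(image: np.ndarray, bits: list[int]) -> np.ndarray:
    """Toy watermark: write each bit into the least significant bit
    of one pixel. Changes pixel values by at most 1, so the mark is
    imperceptible to the eye. NOT how SynthID works; illustration only."""
    marked = image.copy()
    flat = marked.reshape(-1)          # view into the copy
    for i, b in enumerate(bits):
        flat[i] = (flat[i] & 0xFE) | b  # clear LSB, then set it to b
    return marked

def extract_watermark(image: np.ndarray, n_bits: int) -> list[int]:
    """Read the hidden bits back out of the least significant bits."""
    return [int(p) & 1 for p in image.reshape(-1)[:n_bits]]

# Demo: mark an 8x8 grayscale image and recover the payload.
img = np.random.randint(0, 256, (8, 8), dtype=np.uint8)
payload = [1, 0, 1, 1, 0, 0, 1, 0]
marked = embed_watermark(img, payload)
assert extract_watermark(marked, len(payload)) == payload
# The image is visually unchanged: no pixel moved by more than 1.
assert np.max(np.abs(marked.astype(int) - img.astype(int))) <= 1
```

Real systems like SynthID embed the signal in a way that survives resizing, re-encoding, and editing, which a naive LSB scheme does not; the toy only shows why "imperceptible but detectable" is possible at all.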
Pixit's Two Cents: Our team at Pixit sees this type of AI content verification as addressing a growing concern about the authenticity of AI-generated material. As AI-generated content continues to increase in quality and quantity, tools like SynthID are crucial for maintaining transparency and trust.
Fake AI products to help find real gifts: Google's AI-powered Search Generative Experience (SGE) introduces new shopping features to assist users in finding unique products and gift ideas. This includes AI-generated suggestions for fashion items and gifts, alongside a virtual try-on tool for men's clothing, enhancing the online shopping experience.
Google DeepMind announced Mirasol3B, a new AI model that can analyse long videos. The multimodal approach consists of an autoregressive component for the time-synchronized modalities (audio and video) and a separate autoregressive component for modalities that are not necessarily time-aligned (e.g., text).
Meta launches AI video editing tools for Social Platforms: Meta has launched two new AI-based video editing features, Emu Video and Emu Edit, for use on Instagram and Facebook. Emu Video generates short videos from prompts, while Emu Edit allows text-prompted video editing.
Presidential candidates in Argentina used AI-generated photos to craft persuasive narratives against their rivals: Sergio Massa and Javier Milei both deployed AI-generated images in their campaigns for Argentina's general election, such as an image showing Sergio Massa in military garb.