Pixit Pulse: The Weekly Generative AI Wave

AI News - Week #50

Written by Pix | Dec 11, 2023 11:02:51 AM

Stability AI launches lightning-fast SDXL Turbo

Story: Stability AI has launched Stable Diffusion XL Turbo (SDXL Turbo), an AI text-to-image model that can generate images in real time. Using a technique called Adversarial Diffusion Distillation, the model can produce an image in a single step (models usually need ~20-50 steps), significantly reducing the time required for image generation.

Key Findings:

  • Adversarial Diffusion Distillation (ADD): ADD puts the diffusion model on par with Generative Adversarial Networks (GANs) by enabling single-step image generation.

  • Performance Superiority: Compared to other models (e.g., StyleGAN-T++, OpenMUSE, SDXL 1.0 Base), SDXL Turbo demonstrated superior performance with respect to image quality.

  • Impressive Inference Speed: SDXL Turbo is super efficient, generating, for example, a 512x512 image in just 207ms on an A100. This time includes (a) prompt encoding, (b) a single denoising step, and (c) decoding.

  • Explore SDXL Turbo with Clipdrop: You can test the capabilities of the model via Clipdrop.
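To put the speed figures above into perspective, here is a small back-of-the-envelope sketch. It only uses the numbers quoted in this story (207ms per 512x512 image on an A100, 1 step versus roughly 20-50 steps). The function names are our own and purely illustrative, and the step ratio is an upper bound on the speed-up, not a measured one, since prompt encoding and decoding take time regardless of the number of denoising steps.

```python
# Back-of-the-envelope numbers for the SDXL Turbo figures quoted above.
# Function names are hypothetical helpers, not part of any library.

LATENCY_MS = 207          # one 512x512 image on an A100, end to end (from the story)
TURBO_STEPS = 1           # SDXL Turbo denoising steps (from the story)
TYPICAL_STEPS = (20, 50)  # usual step range for diffusion models (from the story)


def images_per_second(latency_ms: float) -> float:
    """Sequential throughput implied by a per-image latency in milliseconds."""
    return 1000.0 / latency_ms


def step_reduction(typical_steps: int, turbo_steps: int = TURBO_STEPS) -> float:
    """How many times fewer denoising steps are needed (not the wall-clock speed-up)."""
    return typical_steps / turbo_steps


if __name__ == "__main__":
    print(f"Throughput: {images_per_second(LATENCY_MS):.2f} images per second")
    low, high = TYPICAL_STEPS
    print(f"Denoising steps reduced by {step_reduction(low):.0f}x to {step_reduction(high):.0f}x")
```

At just under five images per second, generation keeps pace with typing, which is what makes the real-time claim plausible.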

Pixit's Two Cents: SDXL Turbo’s ability to produce image outputs within milliseconds while maintaining, if not improving, image quality is a game-changer. We’re looking forward to playing around and incorporating the model in our own use cases.

Google DeepMind launches Gemini

Story: After a long period of anticipation, Google launched Gemini, its competitor to OpenAI’s GPT-4. Gemini will be released in three versions: Ultra, Pro and Nano. Google claims that its Ultra model outperforms GPT-4 on many industry-standard LLM benchmarks, including general knowledge and most reasoning tasks.

Key Findings:

  • Better than humans: Gemini Ultra stands out as the first model to surpass human experts in Massive Multitask Language Understanding (MMLU), a benchmark for evaluating AI knowledge and problem-solving skills.

  • A pack of three: Each of the three model variants comes with a different focus. Gemini Ultra is ideal for highly complex tasks, Pro for a wide range of tasks, and Nano for efficient on-device applications, such as on Google Pixel phones.

  • Multimodal Any-Any: Gemini surprises with its multimodal capabilities, which even surpass those of GPT-4, since it also understands video. It comes as an any-to-any model, meaning that it can take in any form of media and generate any form.

Pixit's Two Cents: Currently, no week goes by without a mention of Google DeepMind’s work. Gemini surpassing human-level performance in MMLU is a significant milestone. For now, however, only the Nano and Pro models are available, whereas the most interesting one (Ultra) will become publicly available in early 2024. On closer inspection, as Google acknowledges in its blog, the very compelling demonstration videos were made by giving Gemini a few hints.

Bosch creates synthetic images using AI

Story: Bosch, one of the forerunners in the use of AI, is pioneering its application in industrial settings. The company is using generative AI to create images of defective products for quality control. This approach is expected to substantially improve the learning process of its quality control AI, enhancing efficiency and reliability in industrial manufacturing and significantly cutting costs.

Key Findings:

  • Synthetic Training Images: Bosch's generative AI is aimed at producing data to train its quality control AI, making it more reliable.

  • Small sample needed: The company claims to need only a double-digit number of training samples to reliably produce the synthetic data (they needed about 15,000 images) used to train the quality control AI.

  • Cost cutting & productivity gains: The company anticipates significant cost savings and productivity gains from this AI innovation, potentially amounting to millions annually per plant.

  • Saving Millions: Bosch plans to roll out this technology across all its 230 plants, potentially saving over 300 million euros.
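As a quick sanity check on the two savings figures above, the sketch below divides the quoted total by the number of plants. It assumes the savings are spread evenly across all 230 plants, which the story does not state, and the function name is our own.

```python
# Rough consistency check of the Bosch figures quoted above.
# Assumption (not from the story): savings are spread evenly across plants.

TOTAL_SAVINGS_EUR = 300_000_000  # "over 300 million euros" (lower bound from the story)
NUM_PLANTS = 230                 # plants in the planned rollout (from the story)


def average_saving_per_plant(total_eur: float, plants: int) -> float:
    """Average saving per plant under an even-split assumption."""
    if plants <= 0:
        raise ValueError("plants must be a positive number")
    return total_eur / plants


if __name__ == "__main__":
    per_plant = average_saving_per_plant(TOTAL_SAVINGS_EUR, NUM_PLANTS)
    print(f"Average saving per plant: about {per_plant / 1e6:.1f} million euros")
```

Under that assumption the average comes out at roughly 1.3 million euros per plant, which is in line with the "millions annually per plant" estimate.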

Pixit's Two Cents: Bosch's initiative to integrate generative AI into its quality control processes is a great step forward for generative image AI in industrial applications, after it has so far mostly been used in the creative space. It also underlines that generative AI is capable of producing data precise enough to serve as a cost-saving tool in large-scale industrial operations. It is also interesting to see one specialised AI producing data for another specialised AI.

Small Bites, Big Stories: