Made with ❤️ by Pixit

ByteDance Publishes SOTA Text-to-Video Framework MagicVideo-V2

ai waves

Story: ByteDance introduced MagicVideo-V2, a Text-to-Video framework that, in a large-scale user evaluation, outperformed leading Text-to-Video systems such as Runway, Pika 1.0, Morph, Moon Valley and Stable Video Diffusion.

Key Findings:

  • Multi-Stage Framework: MagicVideo-V2 integrates multiple modules (Text-to-Image, Image-to-Video, Video-to-Video, and Video Frame Interpolation) into a comprehensive text-to-video generation system.

  • High-Aesthetic Videos: The system is specifically designed to produce videos that are not only high in resolution but also possess a high aesthetic quality.

  • Enhanced Fidelity and Smoothness: A key focus of MagicVideo-V2 is on improving the fidelity and smoothness of the generated videos, making them more appealing and realistic.

  • Significant Advancement: The framework outperforms existing text-to-video systems, as demonstrated through extensive human evaluations.
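
To make the multi-stage idea concrete, here is a minimal sketch of how four modules could be chained so that each stage consumes the output of the previous one. Everything in it is an assumption for illustration: the stage names, the `make_stage` and `run_pipeline` helpers, and the dictionary that gets passed along are hypothetical placeholders, not ByteDance's code or API. Each stage only records that it ran; no actual image or video model is involved.

```python
from typing import Callable, Dict, List

# Hypothetical artifact passed between stages (prompt plus a log of stages run).
Artifact = Dict[str, object]

# Assumed stage order, taken from the module list in the bullet above.
STAGE_ORDER: List[str] = [
    "text_to_image",
    "image_to_video",
    "video_to_video",
    "frame_interpolation",
]


def make_stage(name: str) -> Callable[[Artifact], Artifact]:
    """Build a placeholder stage that only records its own name."""

    def stage(artifact: Artifact) -> Artifact:
        # Return a new artifact so the previous stage's result stays untouched.
        return {**artifact, "stages": [*artifact["stages"], name]}

    return stage


def run_pipeline(prompt: str) -> Artifact:
    """Feed the prompt through every stage in order."""
    artifact: Artifact = {"prompt": prompt, "stages": []}
    for name in STAGE_ORDER:
        artifact = make_stage(name)(artifact)
    return artifact


if __name__ == "__main__":
    print(run_pipeline("a corgi surfing at sunset")["stages"])
```

The point of the sketch is the ordering: a still image is produced first, animated second, refined third, and smoothed with interpolated frames last.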

Pixit's Two Cents: It is stunning to see how fast Text-to-Video research is moving forward. We believe there is still some way to go, but for fast-paced video as well as very short videos and GIFs, the tech might already be there.


Rabbit Released r1 - Automating your Apps

robot in front of hardware gadget

Story: AI startup Rabbit released r1. It's an AI-powered hardware gadget that works like a voice assistant that can use your apps for you. Its so-called Rabbit OS is based on a new "Large Action Model". r1 has a training mode, which you can use to teach the device how to do something.


Key Findings:

  • Design and Hardware: It's a small, square device with a 2.88-inch touchscreen, analog scroll wheel, two microphones, a speaker, and a 360-degree rotational camera.

  • Interaction and Functionality: The Rabbit R1 operates using a "Push-to-Talk" button, activating the Rabbit OS to perform various tasks, such as booking services or suggesting recipes.

  • Large Action Model (LAM): This innovative feature allows the Rabbit R1 to interact with interfaces directly, enabling it to learn and replicate tasks.

  • Privacy and Practicality: The rotating camera acts as a privacy shutter and can identify objects and people. However, there are still questions about its battery life and how easy it is for users to train the device.

  • Hype and High Demand: The Rabbit R1, priced at $199, is available for preorder. Rabbit has sold out two batches of 10,000 r1 units over two days.

Pixit's Two Cents: We are probably as excited as everyone else about this little device. People are curious because it is something completely new, which gives human imagination and creativity room to come up with very exciting uses.


Generative AI by iStock Powered by NVIDIA Picasso

robot as a judge in front of a computer

Story: Getty Images launches Generative AI by iStock. It offers AI-driven image generation tools, enabling users to create high-quality images. The service includes inpainting and outpainting features. What makes it special is the legal protection and usage rights for generated images.


Key Findings:

  • Based on Getty Images: The service is trained on the company's creative library of licensed, proprietary data, which sets it apart from open- or closed-source alternatives.

  • NVIDIA Picasso Integration: The service is built on NVIDIA Picasso, a custom AI model foundry, enabling high-quality, text-to-image generation with legal protection and usage rights.

  • Advanced Inpainting and Outpainting Features: These features are available via APIs, allowing users to add or replace elements in images and expand images to fit various aspect ratios.

  • Photo-Quality Image Generation: Users can create images of up to 4K resolution from text prompts. The underlying model is trained on Getty Images' extensive library.
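
As a rough illustration of what "expanding an image to fit an aspect ratio" means in numbers, the sketch below computes how large the new canvas has to be and how much border an outpainting model would need to fill on each side. This is plain arithmetic under our own assumptions (centered image, integer pixel sizes); the function names `outpaint_canvas` and `outpaint_padding` are hypothetical and are not part of the iStock or NVIDIA Picasso APIs.

```python
from typing import Tuple


def outpaint_canvas(width: int, height: int, ratio_w: int, ratio_h: int) -> Tuple[int, int]:
    """Smallest canvas of the form (k * ratio_w, k * ratio_h) that contains the image."""
    # Integer ceiling division avoids floating-point rounding surprises.
    k_w = (width + ratio_w - 1) // ratio_w
    k_h = (height + ratio_h - 1) // ratio_h
    k = max(k_w, k_h)
    return k * ratio_w, k * ratio_h


def outpaint_padding(width: int, height: int, ratio_w: int, ratio_h: int) -> Tuple[int, int, int, int]:
    """Pixels to generate on the (left, top, right, bottom), keeping the image centered."""
    canvas_w, canvas_h = outpaint_canvas(width, height, ratio_w, ratio_h)
    extra_w = canvas_w - width
    extra_h = canvas_h - height
    left, top = extra_w // 2, extra_h // 2
    return left, top, extra_w - left, extra_h - top


if __name__ == "__main__":
    # A square 1024x1024 image expanded to a 16:9 canvas.
    print(outpaint_canvas(1024, 1024, 16, 9))   # (1824, 1026)
    print(outpaint_padding(1024, 1024, 16, 9))  # (400, 1, 400, 1)
```

In this example, a square image needs roughly 400 new pixels on the left and right to become widescreen; that border is the region an outpainting feature would have to invent.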

Pixit's Two Cents: The new service might not offer the best quality on the market (which is not to say the quality is bad). However, the fact that it is trained on a vast library of properly licensed images might make it a much safer choice for many users in a commercial setting.


Small Bites, Big Stories:


Tags:
Pix
Post by Pix
Jan 15, 2024 9:32:08 AM