Made with ❤️ by Pixit

StabilityAI Releases Early Preview of Stable Diffusion 3


Story: StabilityAI announced Stable Diffusion 3 in early preview. The new text-to-image model is supposed to improve performance on multi-subject prompts, image quality, and spelling abilities. The company says that the suite of models will range from 800M to 8B parameters, without specifying how many models there will be in total.

Key Findings:

  • Technology: The technical report will be published soon, but it seems that the text-to-image model combines Diffusion Transformers (DiTs) with flow matching. In a Diffusion Transformer, the U-Net backbone is replaced by a transformer architecture in the style of the Vision Transformer (ViT): the image is split into patches, and the resulting sequence of patches is processed by transformer blocks. Sora uses a similar technique.

  • Open Source: The weights of the model will be available, as with previous models, allowing users to run and fine-tune the models locally.

  • Variety of models: Multiple model sizes will be available upon the official release, so that the model can run locally on a variety of devices, from smartphones to servers.

  • Waitlist: You can sign up to join the waitlist here.
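To make the patch idea from the Technology point more concrete, here is a minimal sketch of the "patchify" step in NumPy. It is not StabilityAI's implementation: the function name `patchify`, the 32x32 input, and the patch size of 8 are assumptions chosen for illustration, and real Diffusion Transformers typically apply this step to a latent representation rather than to raw pixels.

```python
import numpy as np


def patchify(image: np.ndarray, patch_size: int) -> np.ndarray:
    """Split an (H, W, C) array into flattened, non-overlapping patches.

    Returns an array of shape (num_patches, patch_size * patch_size * C),
    i.e. one row ("token") per patch. Hypothetical helper for illustration.
    """
    height, width, channels = image.shape
    if height % patch_size or width % patch_size:
        raise ValueError("height and width must be divisible by patch_size")
    rows, cols = height // patch_size, width // patch_size
    # (rows, p, cols, p, C) -> (rows, cols, p, p, C) -> (rows * cols, p * p * C)
    patches = image.reshape(rows, patch_size, cols, patch_size, channels)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(rows * cols, patch_size * patch_size * channels)


# Toy input: a 32x32 "image" with 3 channels (sizes are assumptions).
image = np.arange(32 * 32 * 3, dtype=np.float32).reshape(32, 32, 3)
tokens = patchify(image, patch_size=8)
print(tokens.shape)  # (16, 192): a 4x4 grid of patches, each 8 * 8 * 3 values
```

Each row of `tokens` would then be linearly projected and fed to the transformer blocks as one element of the input sequence.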

Pixit’s Two Cents: At Pixit, we’re leveraging StabilityAI’s models a lot, which is why we already signed up for the waitlist and are eager to test the new model. Let’s see whether the new model really is better at spelling and writing!


Google’s Imagen 2 Ensuring Too Much Diversity


Story: Google recently faced a challenging situation with its text-to-image model (Imagen 2), which incorporated diversity elements into images in contexts where they were inappropriate. This incident has brought to light the complexities and potential pitfalls in the development and deployment of AI technologies. As a consequence, Google has paused Gemini’s ability to generate images of people.


Key Findings:

  • Inaccurate Historical Depictions: The AI model generated images with incongruous representations of historical figures (such as multi-cultural Founding Fathers).

  • Impact on AI Industry: Such incidents can have broader implications for the AI industry, emphasizing the need for robust testing and sensitivity in AI systems.

  • Re-release: Google aims to re-release an improved version of the model soon, focusing on more accurate and sensitive AI outputs.

Pixit’s Two Cents: The recent incident with Google’s AI highlights the critical importance of responsible AI development, especially in areas like diversity representation. It serves as a reminder for companies like Pixit, which specialize in AI and image generation, to prioritize accuracy and sensitivity in AI implementations.


Gemma: Google Joins the Game in Open AI Models


Story: Google introduces Gemma, a new family of state-of-the-art open models. Built on the foundation laid by Gemini, Gemma models are designed to empower developers and researchers to create AI applications responsibly. With a focus on accessibility, these models come in two sizes, Gemma 2B and Gemma 7B, and are complemented by tools to encourage innovation, collaboration, and responsible usage.


Key Findings:

  • Broad Accessibility and Support: Gemma models are available worldwide, offering pre-trained and instruction-tuned variants, with toolchains for JAX, PyTorch, and TensorFlow. Integration with popular tools and platforms makes it easy to start developing with Gemma.

  • Responsible AI Development: A new Responsible Generative AI Toolkit accompanies Gemma, guiding the creation of safer AI applications. Google emphasizes safety and responsible AI with extensive evaluations and fine-tuning processes.

  • Optimized Performance: Gemma models deliver best-in-class performance for their size, capable of running on various devices and optimized for multiple AI hardware platforms, ensuring efficient and accessible AI development.

  • Commercial Use and Research Support: The terms of use allow for commercial applications, and Google offers free credits for researchers and developers, highlighting its support for the broader AI community.
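As a rough feel for what "running on various devices" means in practice, the following back-of-the-envelope sketch estimates the memory needed just to hold the model weights. The parameter counts are taken at face value from the model names (2B and 7B), which is an assumption, and the helper `weight_memory_gib` and the bytes-per-parameter table are hypothetical. Activations, caches, and framework overhead are ignored.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gib(num_params: float, dtype: str = "float16") -> float:
    """Approximate memory for the weights alone, in GiB (hypothetical helper)."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


# Nominal sizes from the model names; actual parameter counts may differ.
for name, params in [("Gemma 2B", 2e9), ("Gemma 7B", 7e9)]:
    sizes = ", ".join(
        f"{dtype}: {weight_memory_gib(params, dtype):.1f} GiB"
        for dtype in BYTES_PER_PARAM
    )
    print(f"{name} -> {sizes}")
```

Under these assumptions, the smaller model at reduced precision fits comfortably in the memory of a typical laptop, while the larger model in full precision calls for a high-memory GPU or a server.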

Pixit’s Two Cents: At Pixit, we are glad to hear that Google has finally joined many other big tech companies in open-sourcing (some of) their models. The new Gemma models also give some very interesting insights into how the bigger brother Gemini was built. We are very interested to see what the open-source community is going to unveil in the weeks to come. Thanks, Google, and keep on open-sourcing.



Post by Pix
Feb 26, 2024 9:38:23 AM