LivePortrait Animates Images with High-Precision Video-Driven Techniques
Story: Researchers from Kuaishou Technology, the University of Science and Technology of China, and Fudan University have introduced LivePortrait, a video-driven approach to animating portrait images. The method combines precise control with fast inference, enabling efficient and realistic portrait animation.
Key Findings:
- Stitching-based Animation: LivePortrait builds on an implicit-keypoint framework and adds a lightweight stitching module, so the animated face can be pasted back into the original image without visible seams, keeping the animation smooth and coherent.
- Retargeting Control: The method incorporates retargeting control, allowing users to adjust the motion and expressions of the animated portrait, such as eye and lip movement, which adds flexibility and customization.
- Fast Inference Speed: LivePortrait achieves real-time inference speed, processing up to 30 frames per second on a single GPU, making it suitable for interactive applications.
- High-Quality Results: The animated portraits generated by LivePortrait stay faithful to both the input image and the driving video, preserving the identity and style of the subject.
- Diverse Applications: The technique has potential applications in fields such as virtual avatars, video conferencing, and entertainment.
Pixit's Two Cents: By combining video-driven techniques with retargeting control, this approach opens up new possibilities for creating engaging and personalized animated content. As the demand for virtual avatars and immersive experiences grows, LivePortrait could become a valuable tool for developers and content creators who want to produce lifelike animations efficiently. On our RTX 4090, we could create stunning avatars in milliseconds!
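To put the frame rate in perspective, here is a minimal sketch of the arithmetic behind "real-time": at 30 frames per second, each frame has to be finished in roughly 33 ms. The function names and the 20 ms latency are our own illustrative assumptions, not part of LivePortrait or a measured result.

```python
def frame_budget_ms(fps: float) -> float:
    """Return the time available per frame, in milliseconds, at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def sustains_rate(per_frame_ms: float, fps: float) -> bool:
    """Check whether a per-frame latency fits within the budget for the target rate."""
    return per_frame_ms <= frame_budget_ms(fps)


if __name__ == "__main__":
    # 30 fps is the figure quoted in the story above.
    print(f"{frame_budget_ms(30):.1f} ms per frame at 30 fps")
    # 20.0 ms is a hypothetical latency, used only to show the check.
    print(sustains_rate(20.0, 30))
```

Any pipeline whose per-frame latency stays under that budget can keep up with a 30 fps driving video.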
Odyssey Pioneers Hollywood-Grade Visual AI for Immersive Storytelling
Story: Odyssey, a startup founded by Oliver Cameron, a former Cruise executive, is developing a "Hollywood-grade" visual AI system that gives professional storytellers advanced tools for creating high-quality, immersive visual content and experiences. The company wants to change how AI is used in storytelling and rejects the idea of replacing human creativity with algorithms optimized for clicks.
Key Findings:
- Pioneering Hollywood-Grade Visual AI: Odyssey is at the forefront of developing visual AI that enables the generation and precise direction of stunning video content, including beautiful scenery, characters, lighting, and motion.
- Empowering Professional Storytellers: Instead of replacing humans, Odyssey's visual AI is designed to be placed in the hands of professional storytellers, providing them with game-changing efficiencies and creativity while maintaining full control over every element in the scene.
- Experienced Team: Odyssey's team comprises AI researchers, computer graphics experts, and Hollywood artists who have delivered state-of-the-art AI and simulation systems at companies such as Cruise, Waymo, Wayve, Tesla, Microsoft, Meta, and NVIDIA.
- Rejecting Low-Quality AI Content: Odyssey aims to hold AI to higher standards, rejecting the proliferation of low-quality AI-generated content that lacks the spark and story of human storytelling.
- Hiring Top Talent: The company is actively hiring for various technical roles, including generative 3D, motion, AI simulation, graphics, 3D reconstruction, ML infrastructure, data engineering, and software engineering.
Pixit's Two Cents: By focusing on empowering professional storytellers rather than replacing them, the company is charting a path that leverages the strengths of both human creativity and advanced AI technology. This approach has the potential to turn the way we create and experience visual content upside down, from movies and television to video games and virtual reality. We are excited to see how this new venture develops and whether it can live up to its promises.
Microsoft's MInference Slashes LLM Processing Time by 90%
Story: Microsoft Research has introduced MInference, a technique that speeds up prompt processing (the pre-filling stage) for long-context large language models (LLMs). It cuts processing time by up to 90% while maintaining output quality, making it easier and more efficient to work with LLMs on resource-constrained devices.
Key Findings:
- Efficient Prompt Inference: MInference speeds up prompt processing by applying dynamic sparse attention, computing only the most relevant parts of the attention matrix instead of attending to every token pair.
- Reduced Processing Time: With MInference, the processing time for very long prompts, on the order of a million tokens, can be reduced by up to 90%, enabling faster and more responsive LLM applications.
- Maintained Output Quality: Despite the significant reduction in processing time, MInference maintains the quality of the generated outputs, so the LLMs' performance remains high.
- Resource-Constrained Devices: MInference makes it possible to run long-context LLMs on resource-constrained devices, such as smartphones and edge devices, expanding the potential applications of these models.
- Open-Source Availability: Microsoft has made the MInference code and models publicly available on GitHub, encouraging researchers and developers to build upon and extend this work.
Pixit's Two Cents: By drastically reducing processing time while maintaining output quality, this technique opens up new possibilities for deploying LLMs in real-world applications, particularly on resource-constrained hardware such as phones and other edge devices. As LLMs continue to grow in size and complexity, efficient inference methods like MInference will become increasingly important for keeping them practical and accessible.
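For a sense of scale, a reduction of "up to 90%" is the same as a speedup of up to 10x. The short sketch below shows the conversion; the function names and the 300-second baseline are hypothetical values chosen for illustration, not figures from Microsoft.

```python
def speedup_from_reduction(reduction: float) -> float:
    """Convert a fractional time reduction (0.9 means 90%) into a speedup factor."""
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


def reduced_time(baseline_s: float, reduction: float) -> float:
    """Return the remaining processing time after applying the reduction."""
    return baseline_s * (1.0 - reduction)


if __name__ == "__main__":
    # 0.9 is the "up to 90%" figure from the story above.
    print(f"{speedup_from_reduction(0.9):.1f}x speedup")
    # 300 s is a hypothetical baseline for a long prompt, not a measured value.
    print(f"{reduced_time(300.0, 0.9):.0f} s remaining")
```

Read this way, a prompt that previously took five minutes to process would take about half a minute in the best case.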
Small Bites, Big Stories:
- AWS Empowers Developers with Generative AI Tools and Certifications: Amazon Web Services (AWS) announces a suite of new generative AI tools, including AWS App Studio for low-code app development, Amazon Q Developer integration in SageMaker Studio, and enhanced capabilities for Amazon Bedrock.
- Waymo Issues Software Recall After Robotaxi Collides with Telephone Pole in Phoenix: Waymo announces a voluntary software recall for 672 of its driverless vehicles following an incident in which an autonomous vehicle collided with a telephone pole in Phoenix.
- Cloudflare Introduces One-Click Solution to Block AI Bots and Protect Website Content: Cloudflare launches a new feature that allows customers, including those on the free tier, to block all AI bots with a single click, helping content creators maintain control over their work and combat unauthorized scraping by AI companies.
- SenseTime Unveils SenseNova 5.5, China's First Real-Time Multimodal AI Model: SenseTime introduces SenseNova 5.5, featuring SenseNova 5o, China's first real-time multimodal model, with capabilities comparable to GPT-4o. The company is also launching a cost-efficient large model for edge-side deployment, bringing costs down to as low as RMB 9.90 per device per year.
- Groq's LLM Inference Engine Outperforms Cloud Providers on Anyscale's Benchmark: Groq's LLM engine demonstrates remarkable speed on Anyscale's LLMPerf Leaderboard, processing around 1256.54 tokens per second and surpassing Nvidia GPU capabilities in performance tests.
- Odyssey Develops Hollywood-Grade Visual AI for Immersive Storytelling: Odyssey, a startup founded by Oliver Cameron, a former Cruise executive, is pioneering a "Hollywood-grade" visual AI system to generate high-quality visual content and experiences, aiming to empower professional storytellers with advanced AI tools.
- Anthropic Introduces Prompt Playground for Crafting High-Quality AI Interactions: Anthropic adds a new playground feature to its AI platform that lets users experiment with and evaluate prompts for their AI models, helping them write better prompts and improve model performance.
- Google AI Reconstructs Human Brain in Unprecedented Detail: Google researchers collaborate with Harvard neuroscientists to create a highly detailed 3D map of a small volume of human brain tissue using advanced AI tools, requiring 1.4 petabytes of data and offering new insights into brain structure and function.
- OpenAI Partners with Los Alamos National Laboratory to Advance Bioscience Research: OpenAI and Los Alamos National Laboratory announce a research partnership to develop safety evaluations for assessing the biological capabilities and risks of multimodal AI models in laboratory settings, aiming to advance bioscientific research responsibly.
- Samsung Unveils AI-Powered Wearables and Foldables: Samsung introduces a new line of AI-powered wearables, including the Galaxy Ring, Galaxy Watch Ultra, and Galaxy Watch7, alongside the Galaxy Z Fold6 and Galaxy Z Flip6 foldable smartphones, showcasing advancements in personalized health monitoring and AI-driven features.
- Google DeepMind Develops AI-Powered Office Assistant Robot: Google DeepMind unveils an AI-powered robot that serves as a tour guide and informal office helper, using the latest version of Google's Gemini large language model to navigate its surroundings and interact with humans.
- China Leads World in GenAI Usage, US Tops in Full Implementation: SAS and Coleman Parkes Research surveyed 1,600 global decision-makers and found that China has the highest generative AI usage at 83%, while the US leads in full implementation at 24%. Leaders cite lack of understanding, strategy, data, and regulatory readiness as key challenges.
Jul 15, 2024 9:46:42 AM