Apple's machine learning researchers have introduced Keyframer, an application that animates simple drawings based on text descriptions. The trio behind the work, Tiffany Tseng, Ruijia Cheng, and Jeffrey Nichols, recently presented their findings in a paper published on the arXiv preprint server. Keyframer is built on the large language model (LLM) GPT-4, extending the model beyond its conventional text-generation uses. The researchers found that, given a basic drawing in Scalable Vector Graphics (SVG) format and a textual prompt describing the desired animation, GPT-4 could generate animations that faithfully execute the provided instructions.
As an illustrative example, the application can take an image of a rocket on a launch pad accompanied by a text prompt like "Make the rocket launch into the sky with a bunch of fire blowing out beneath it." Keyframer then animates the rocket to match the prompt. The researchers explain that the LLM first plans the steps the animation requires and then generates Cascading Style Sheets (CSS) animation code. Because the output is standard CSS, the animation is easily portable across devices, and users can edit the code by hand to add or remove effects.
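To make that workflow concrete, here is a minimal sketch of the kind of input and output involved. The SVG shapes, element ids, and keyframe names below are hypothetical illustrations, not code from the paper; they show roughly what GPT-4 might produce for the rocket prompt.

```html
<!-- Hypothetical input: a simplified SVG rocket on a launch pad
     (shapes and ids are illustrative, not taken from the paper) -->
<svg width="200" height="300" viewBox="0 0 200 300">
  <rect id="pad" x="60" y="270" width="80" height="10" fill="gray"/>
  <g id="rocket">
    <rect x="90" y="190" width="20" height="80" fill="silver"/>
    <polygon points="90,190 100,160 110,190" fill="red"/>
    <polygon id="flame" points="92,270 100,295 108,270" fill="orange" opacity="0"/>
  </g>
</svg>

<style>
/* Plausible CSS animation output for the prompt "Make the rocket
   launch into the sky with a bunch of fire blowing out beneath it" */
#rocket { animation: launch 3s ease-in forwards; }
#flame  { animation: ignite 3s ease-in forwards; }

@keyframes launch {
  0%   { transform: translateY(0); }      /* hold on the pad while the engine ignites */
  20%  { transform: translateY(0); }
  100% { transform: translateY(-400px); } /* rise out of the frame */
}
@keyframes ignite {
  0%   { opacity: 0; }                    /* flame hidden at rest */
  20%  { opacity: 1; }                    /* fire appears before liftoff */
  100% { opacity: 1; }
}
</style>
```

Because the result is ordinary CSS applied to the SVG's existing elements, it can be dropped into any web view and tweaked by hand, which is the portability the researchers describe.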
Keyframer also supports iterative animation: users can refine a project by issuing follow-up prompts that build on the generated code. In their paper, the researchers propose that Keyframer has the potential to reshape the animation landscape; if Apple integrates it across its hardware platforms, it could reduce the need for dedicated animation software. The tool holds promise both for professionals creating commercials and for non-professionals who want to produce polished animations with minimal effort, making animation creation far more accessible.
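As a hypothetical illustration of that iterative loop, a follow-up prompt such as "make the flame flicker as the rocket rises" might lead the model to extend the CSS from the earlier sketch. The selector and keyframe names here are assumptions carried over from the example above, not output from the paper.

```css
/* Hypothetical refinement: a second animation layered onto the flame.
   The color flicker runs alongside the original ignite animation. */
#flame {
  animation: ignite 3s ease-in forwards,
             flicker 0.15s linear infinite alternate;
}
@keyframes flicker {
  from { fill: orange; }
  to   { fill: gold; }
}
```

In Keyframer, such refinements are driven by natural-language prompts rather than hand edits, though the generated code remains available for manual adjustment.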