
Salesforce AI Research BannerGen: An Open-Source Library for Multi-Modality Banner Generation

Dec 14, 2023

Effective graphic design is the backbone of a successful marketing campaign. It acts as a communication bridge between designers and their audience by capturing users’ attention, highlighting essential details, and enhancing the campaign’s visual appeal. However, current design workflows are time-consuming and rely on layer-by-layer assembly, which requires expertise and does not scale easily.

To address this issue, researchers at Salesforce have introduced BannerGen, an open-source library that streamlines the design process using generative AI. The library consists of three parallel multimodal banner generation methods: LayoutDETR, LayoutInstructPix2Pix, and Framed Template RetrieveAdapter. Each has been trained on a large corpus of designed graphical data. All three are open-sourced in BannerGen’s GitHub repository and can be imported as Python modules, making it easy for developers to experiment with each method. BannerGen also ships with licensed fonts and carefully crafted templates, allowing developers to build high-quality designs.

The user uploads the image they want to turn into a banner. The image is then cropped around its main elements to produce multiple sub-images. Users can also specify the type of banner they want and the text to include. The sub-images are then integrated into the selected template, and the final design is produced as an HTML file and a PNG file.
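The cropping step can be illustrated with a small sketch. This is not BannerGen's code: the `Box` type, the `crop_box` function, and the idea of a single focal point supplied by the caller are all assumptions made for illustration. It computes the largest crop of a requested aspect ratio that fits inside the image while staying as close as possible to the focal point.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Pixel crop box (hypothetical helper type, not part of BannerGen)."""
    left: int
    top: int
    right: int
    bottom: int


def clamp(value: int, low: int, high: int) -> int:
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(value, high))


def crop_box(width: int, height: int, focus_x: int, focus_y: int,
             aspect: float) -> Box:
    """Largest box with the given width/height ratio that fits in the image,
    centered on the focal point where possible (assumed cropping strategy)."""
    if width / height >= aspect:
        # Image is wider than the target ratio: keep full height, trim width.
        crop_h = height
        crop_w = round(height * aspect)
    else:
        # Image is taller than the target ratio: keep full width, trim height.
        crop_w = width
        crop_h = round(width / aspect)
    # Center on the focal point, then shift back inside the image bounds.
    left = clamp(focus_x - crop_w // 2, 0, width - crop_w)
    top = clamp(focus_y - crop_h // 2, 0, height - crop_h)
    return Box(left, top, left + crop_w, top + crop_h)


def sub_image_boxes(width: int, height: int, focus_x: int, focus_y: int,
                    aspects: list[float]) -> list[Box]:
    """One crop box per requested aspect ratio."""
    return [crop_box(width, height, focus_x, focus_y, a) for a in aspects]
```

Calling `sub_image_boxes` with several aspect ratios yields one candidate crop per banner shape; a real system would presumably locate the main elements automatically rather than take a focal point as input.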

The first method, LayoutDETR, builds on the DETR architecture. The researchers modified the DETR decoder to handle multimodal foreground inputs, and they integrated the VAEGAN framework so that generated designs align with patterns found in real-world designs. This architecture allows BannerGen to better understand background and foreground elements, which leads to better results.

The second method, LayoutInstructPix2Pix, builds on InstructPix2Pix, an image-to-image editing technique powered by diffusion models. The model has been fine-tuned to convert background images into images with superimposed text.

The third method, Framed Template RetrieveAdapter, is used to enhance the diversity of generated designs. It consists of three components: the retriever, which finds the best-suited frame on the basis of the metrics; the adapter, which customizes input images and texts to fit the frame; and the renderer, which produces the design in HTML/CSS by integrating the background layer with the user’s inputs.
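The three-stage flow described above can be sketched in a few lines of Python. This is an illustration only and does not use BannerGen's actual modules: the `Frame` class, the `retrieve`, `adapt`, and `render` functions, the aspect-ratio matching metric, and the truncation rule are all hypothetical stand-ins chosen to show how a retriever, an adapter, and a renderer might hand results to one another.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class Frame:
    """Hypothetical banner frame: pixel size plus a text-length budget."""
    name: str
    width: int
    height: int
    max_chars: int


def retrieve(frames: list[Frame], image_w: int, image_h: int) -> Frame:
    """Retriever: pick the frame whose aspect ratio is closest to the image's.
    (Aspect-ratio distance is an assumed metric, used only for this sketch.)"""
    target = image_w / image_h
    return min(frames, key=lambda f: abs(f.width / f.height - target))


def adapt(text: str, frame: Frame) -> str:
    """Adapter: shorten the text so it fits the frame's character budget."""
    if len(text) <= frame.max_chars:
        return text
    return text[: frame.max_chars - 1].rstrip() + "…"


def render(frame: Frame, image_url: str, text: str) -> str:
    """Renderer: emit HTML/CSS that layers the text over the background image."""
    return (
        f'<div class="banner" style="position:relative;'
        f'width:{frame.width}px;height:{frame.height}px">'
        f'<img src="{escape(image_url)}" alt="" '
        f'style="width:100%;height:100%;object-fit:cover">'
        f'<p style="position:absolute;left:5%;bottom:5%;margin:0">'
        f"{escape(text)}</p>"
        "</div>"
    )


def make_banner(frames: list[Frame], image_url: str, image_w: int,
                image_h: int, text: str) -> str:
    """Run the three stages in order: retrieve, adapt, render."""
    frame = retrieve(frames, image_w, image_h)
    return render(frame, image_url, adapt(text, frame))
```

Keeping the stages as separate functions mirrors the division of labor in the description: the frame library, the fitting rules, and the output format can each change without touching the other two.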

In conclusion, BannerGen is a versatile framework that lets users create customized banners with generative AI. Its architecture has been designed to learn from real layouts and to understand background and foreground elements. The final design is generated as an HTML file and a PNG file, which allows for easy manual adjustments and can be embedded into any media for immediate use. BannerGen aims to make graphic design less time-consuming and to help users generate high-quality, professional-grade designs.

The post Salesforce AI Research BannerGen: An Open-Source Library for Multi-Modality Banner Generation appeared first on MarkTechPost.


[Source: AI Techpark]
