StableYolo: Optimizing Image Generation for Large Language Models

Harel Berger, Aidan Dakhama, Zishuo Ding, Karine Even-Mendoza, David Kelly, Hector D. Menendez, Rebecca Moussa, Federica Sarro

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review


Abstract

The quality of AI-based image generation is bounded by system parameters and by the way users phrase their prompts. Both prompt engineering and model configuration remain open research challenges, and both require substantial manual effort to produce good-quality images. We tackle this problem by applying evolutionary computation to Stable Diffusion, tuning prompts and model parameters simultaneously, and we guide the search process using Yolo. Our experiments show that our system, dubbed StableYolo, significantly improves image quality (by 52% on average over the baseline), helps identify relevant words for prompts, reduces the number of GPU inference steps per image (from 100 to 45 on average), and keeps prompts short (≈ 7 keywords).
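As a rough illustration of the idea described in the abstract (not the authors' implementation), the sketch below generates an image from a candidate prompt and parameter set with Stable Diffusion and scores it with a YOLO detector, using the mean detection confidence as a quality proxy that an evolutionary search could maximize. The model checkpoints, libraries (diffusers, ultralytics), and the exact fitness definition are assumptions made for this example.

```python
# Minimal sketch of a YOLO-guided fitness function for prompt/parameter search.
# Assumptions: diffusers + ultralytics, publicly available checkpoints, and
# mean detection confidence as the image-quality proxy.
import torch
from diffusers import StableDiffusionPipeline
from ultralytics import YOLO

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
detector = YOLO("yolov8n.pt")  # any pretrained YOLO detection model

def fitness(prompt: str, guidance_scale: float, num_inference_steps: int) -> float:
    """Higher is better: mean YOLO confidence over objects detected in the image."""
    image = pipe(
        prompt,
        guidance_scale=guidance_scale,
        num_inference_steps=num_inference_steps,
    ).images[0]
    result = detector(image)[0]
    confidences = result.boxes.conf.tolist()
    return sum(confidences) / len(confidences) if confidences else 0.0

# An evolutionary algorithm would mutate/recombine (prompt, guidance_scale,
# num_inference_steps) candidates and select those with the highest fitness.
```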
Original language: English
Title of host publication: Symposium on Search-Based Software Engineering (SSBSE) 2023
Publisher: Springer
Publication status: Published - 8 Dec 2023

Keywords

  • LLM
  • SBSE
  • Image Generation
  • Stable Diffusion
  • Yolo
  • search-based software engineering
  • large language models
  • Generative AI
