Prompt Wrangling: On Replication and Generalization in Large Language Models for PCG Levels

Document Type

Conference Proceeding

Publication Date

5-21-2024

Abstract

The ChatGPT4PCG competition calls for participants to submit inputs to ChatGPT, or prompts, that guide its output toward instructions for generating levels as sequences of Tetris-like block drops. Prompts submitted to the competition are used to query ChatGPT to generate levels that resemble letters of the English alphabet. Levels are evaluated based on their similarity to the target letter and their physical stability in the game engine. This provides a quantitative evaluation setting for prompt-based procedural content generation (PCG), an approach that has been gaining popularity in PCG, as in other areas of generative AI. This paper focuses on replicating and generalizing the competition results. The replication experiments first test whether the number of responses gathered from ChatGPT is sufficient to account for the stochasticity of its output. We re-run the competition with the original prompt submissions and the original scripts, but on our own machines, about six months after the competition organizers ran it, and with varying sample sizes. We find that the results largely replicate, except that two of the 15 submissions do much better in our replication, for reasons we can only partly determine. When it comes to generalization, we notice that the top-performing prompt has instructions for all 26 target levels hardcoded, which is at odds with the goal of PCG via machine learning (PCGML) of generating new, previously unseen content from examples. We perform experiments in a more restricted few-shot prompting scenario, and find that generalization remains a challenge for current approaches.

Identifier

85199032754 (Scopus)

ISBN

9798400709555

Publication Title

ACM International Conference Proceeding Series

External Full Text Location

https://doi.org/10.1145/3649921.3659853

