Location

Library Room 2509

Date and Time

Time
10:00 AM to 10:50 AM

Abstract

Large language models (LLMs) are increasingly deployed in multi-agent systems (MAS), yet their search efficiency and strategies for solving novel problems remain unclear. We compared the performance of solo ChatGPT-4o agents versus two-agent teams on narrative situation puzzles in a 2 (Group) × 2 (Time) experiment (N = 82). Solo agents outperformed teams, discovering significantly more correctly identified plot elements across all puzzles, independent of time limits. However, LLM teams exhibited a more divergent thinking process while solving the puzzles, as measured by the entropy of the questions they asked; question quality was comparable between groups. These findings suggest a coordination cost: the increased exploratory diversity of MAS failed to align with the task's demand for the efficient use of both divergent and convergent reasoning.