AI Is More Creative Than the Average Person, Study Finds

A comprehensive new analysis comparing human and artificial intelligence has unveiled a fascinating paradox, suggesting that while advanced AI can now brainstorm with greater originality than the typical person, it remains leagues behind the most imaginative human minds. The research, a collaborative effort from the Université de Montréal, Concordia University, and the University of Toronto, represents the most extensive direct comparison of creative output between humans and state-of-the-art AI. The findings from this large-scale experiment provide a crucial reality check, simultaneously confirming AI’s impressive progress in divergent thinking while also highlighting the profound and perhaps unbridgeable gaps that still separate machine-generated ideas from the pinnacle of human ingenuity. This nuanced picture challenges both the unbridled optimism and the deep-seated fears surrounding AI’s role in creative fields.

Measuring Creativity: The Divergent Association Task

The Experimental Setup

The core of this extensive investigation was the Divergent Association Task (DAT), a standardized test designed to quantify one of the fundamental components of creativity: divergent thinking. The task itself is deceptively simple, requiring participants to generate a list of ten words that are as conceptually unrelated to one another as possible. The assessment does not reward poetic language or clever wordplay but rather measures the “semantic distance” between the words—a computational metric that calculates how far apart concepts are in a mental map of language. A set of words like “cat, dog, pet” would receive a low score due to its tight conceptual clustering, whereas a list such as “galaxy, fork, freedom, algae” would score highly for demonstrating an ability to leap across disparate cognitive categories. This methodology provides an objective, quantifiable framework for comparing the raw ideation abilities of different entities, removing the subjectivity that often complicates creativity assessments.
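The scoring idea can be sketched in a few lines. This is a minimal illustration, not the study's actual pipeline: the real DAT scores words against large pretrained word embeddings, whereas the tiny 3-dimensional vectors below are invented stand-ins chosen only to show why a clustered list scores low and a spread-out list scores high.

```python
from itertools import combinations
from math import sqrt

# Toy 3-d embeddings standing in for the high-dimensional pretrained
# vectors a real DAT scorer would use. All values are illustrative.
EMBEDDINGS = {
    "cat":     [0.90, 0.80, 0.10],
    "dog":     [0.80, 0.90, 0.20],
    "pet":     [0.85, 0.85, 0.15],
    "galaxy":  [0.10, 0.20, 0.90],
    "fork":    [0.70, 0.10, 0.40],
    "freedom": [0.20, 0.90, 0.60],
}

def cosine_distance(u, v):
    """1 - cosine similarity: small for near-synonyms, large for unrelated words."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def dat_score(words):
    """Mean pairwise semantic distance across the word list, scaled to 0-100."""
    vecs = [EMBEDDINGS[w] for w in words]
    dists = [cosine_distance(u, v) for u, v in combinations(vecs, 2)]
    return 100.0 * sum(dists) / len(dists)

print(dat_score(["cat", "dog", "pet"]))          # tightly clustered -> low score
print(dat_score(["galaxy", "fork", "freedom"]))  # conceptually spread -> higher score
```

Because the metric is just an average of pairwise distances, it rewards exactly the behavior the article describes: lists whose members sit far apart in the embedding space, regardless of how poetic the individual words are.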

To create a robust comparison, the researchers pitted a formidable roster of leading AI models, including the highly advanced GPT-4, Google’s Gemini, and Anthropic’s Claude, against an enormous human dataset. This human benchmark was composed of over 100,000 individual responses, creating a vast and statistically significant sample of human creative performance on this specific task. By leveraging such a massive dataset, the study was able to establish a clear baseline for average human performance, as well as identify the characteristics of top-tier human creativity. The direct head-to-head competition between the latest AI and this wide spectrum of human participants allowed for an unprecedented analysis of where machines excel, where they falter, and how their “thought” processes fundamentally differ from our own when it comes to generating novel ideas. The scale of the experiment ensures that the findings are not anecdotal but represent a broad and reliable snapshot of current capabilities.

A Tale of Two Tiers

The initial, headline-grabbing result from the study was nothing short of stunning: GPT-4, on average, outperformed the entire human sample in the Divergent Association Task. The model’s mean score was statistically higher than the average score produced by the more than 100,000 human participants. This outcome confirms that for baseline brainstorming, AI has achieved a level of performance that surpasses that of a typical person. Other models, such as Gemini Pro, performed at a level statistically on par with the human average, solidifying the idea that AI has effectively “raised the floor” for this type of creative ideation. The implication is that these systems can now serve as powerful tools for generating a wide variety of initial ideas, potentially helping individuals overcome creative blocks or explore conceptual spaces they might not have considered on their own. This capability represents a significant milestone in the development of creative AI.

However, a deeper dive into the data revealed a critical and illuminating counter-narrative. When the researchers segmented the results by performance tiers, the hierarchy of creativity completely inverted. The most creative 50% of human participants achieved higher scores than every single AI model tested in the study. This performance gap became a chasm when focusing on the elite performers; the top 10% of humans significantly outclassed the AI, demonstrating a capacity for conceptual leaps that the models could not replicate. This crucial detail reframes the entire discussion, indicating that while AI has successfully mimicked and even surpassed average-level divergent thinking, it has not yet managed to break through the ceiling of exceptional human imagination. It suggests that peak creativity involves cognitive processes or a type of worldly understanding that current AI architectures have yet to grasp, leaving the highest echelons of originality firmly in the human domain for the foreseeable future.

Unpacking the Differences Between Human and AI Cognition

The “Ocean Problem” and Probabilistic Thinking

Further analysis exposed fundamental, almost alien, distinctions in the cognitive patterns of AI compared to humans. The researchers identified a peculiar tendency in the models, which they dubbed the “Ocean Problem,” characterized by a surprising degree of repetition and formulaic responses, even when explicitly prompted for maximum variety. For instance, GPT-4 exhibited an odd fixation on certain words, using “microscope” in an astonishing 70% of its responses and “elephant” in 60%. An even more recent version, GPT-4-turbo, was more repetitive still, using the word “ocean” in over 90% of its attempts at the task. This behavior stands in stark contrast to the immense diversity seen in the human data. The most frequently submitted words by people were “car” and “dog,” yet each of these appeared in only about 1% of the total human responses, showcasing a far wider and more organic distribution of ideas across the population.
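Repetition rates like these are straightforward to measure: tally how many responses each word appears in and divide by the number of responses. The sketch below does this for a handful of hypothetical model responses (the word lists are invented for illustration, not data from the study).

```python
from collections import Counter

# Hypothetical DAT responses from repeated runs of a single model.
responses = [
    ["ocean", "microscope", "elephant", "galaxy", "fork"],
    ["ocean", "telescope", "tornado", "violin", "cactus"],
    ["ocean", "volcano", "microscope", "elephant", "quasar"],
    ["ocean", "satellite", "jellyfish", "microscope", "rust"],
]

# Count each word once per response, so the tally is the number
# of responses containing that word.
word_counts = Counter(w for r in responses for w in set(r))

n = len(responses)
for word, count in word_counts.most_common(3):
    print(f"{word}: appears in {100 * count / n:.0f}% of responses")
```

Run over a model's outputs, a tally like this surfaces the statistical "gravity wells" the researchers describe: a human population's most common words sit near 1%, while a fixated model's favorites can approach 100%.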

This stark difference suggests that what appears as creativity in AI may often be a sophisticated probabilistic artifact rather than genuine ideation. The models, trained on vast datasets of text, seem to be returning to statistically prominent but conceptually distant “corners” of their training data. These words are “random” in a mathematical sense but are not generated through the same associative, experiential, and contextual processes that drive human thought. While a human might draw on a lifetime of unique memories and sensory inputs to make a creative leap, the AI is navigating a pre-existing map of statistical relationships. This reliance on its training data appears to create strange attractors, or conceptual gravity wells, that cause it to repeatedly offer the same “creative” words, betraying a mechanical and non-human approach to the creative process that lacks the personal and unpredictable spark of human imagination.

Tunable Creativity and Complex Tasks

The research team discovered that the creative output of the AI was not a fixed quality but a highly malleable one, directly influenced by its operational parameters. A key setting known as “temperature” controls the level of randomness or risk in the AI’s word selection. A low temperature produces predictable, safe, and often generic responses, while a high temperature encourages more unconventional and diverse—though potentially nonsensical—outputs. By “cranking up” this temperature setting, the researchers were able to significantly boost the AI’s performance on the DAT. At its highest temperature, GPT-4’s score improved to the point where it could outperform approximately 72% of the human participants. This finding, along with the discovery that specific prompting strategies could also improve scores, underscores that AI’s creativity is not an innate, emergent property but rather a tunable output heavily dependent on precise user guidance and technical calibration.
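Under the hood, temperature works by rescaling the model's raw next-token scores before they are turned into sampling probabilities. The sketch below shows that mechanism in isolation; the three candidate scores are hypothetical, and real models apply this over tens of thousands of tokens.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores into sampling probabilities.

    Low temperature sharpens the distribution (safe, predictable picks);
    high temperature flattens it (riskier, more varied picks).
    """
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next words.
logits = [4.0, 2.0, 1.0]

cold = softmax_with_temperature(logits, 0.2)  # near-greedy: top word dominates
hot = softmax_with_temperature(logits, 2.0)   # flatter: mass spread across words

print(cold)
print(hot)
```

This is why "cranking up" the temperature raises DAT scores: a flatter distribution makes the model sample lower-probability, and therefore less conventional, words, at the cost of occasional nonsense.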

The limitations of current AI became even more pronounced when the investigation moved beyond simple word lists to more complex creative writing assignments, such as composing haikus, movie plot summaries, and short works of flash fiction. In these tasks, human writers consistently achieved higher scores on a metric called “Divergent Semantic Integration,” which assesses the ability to skillfully weave diverse concepts into a unified and compelling narrative. Visual analysis revealed that human writing occupied a completely different “region of meaning” from the machine-generated text, indicating a deeper level of thematic and conceptual integration. A particularly telling result emerged from the haiku task; unlike with longer stories, simply increasing the AI’s “temperature” did not improve the quality of its poetry. This implies that short, highly constrained artistic forms demand a level of intentionality, nuance, and structural awareness that current statistical prediction models are fundamentally unable to replicate, highlighting a critical boundary in AI’s creative abilities.

A New Era of Human-AI Collaboration

Ultimately, the study’s findings paint a picture not of an impending machine takeover in the creative arts but of a future defined by collaboration. The research provides a valuable reality check, demonstrating that today’s AI is not a replacement for human artistry but a powerful tool that can augment it. Its proven excellence at baseline ideation positions it as a potent instrument to help creators overcome initial hurdles like writer’s block or a lack of starting points. By effectively handling the “average” part of the creative process, generating varied but not necessarily profound ideas, AI allows human artists, writers, and musicians to concentrate on achieving the peak levels of originality, emotional depth, and narrative coherence that machines, for now, cannot touch. The robot may suggest “ocean” and “microscope,” but the distinctly human task remains to imbue those concepts with meaning and weave them into something that resonates with the human experience.
