Beijing, China — In a groundbreaking study, a team of Chinese scientists has confirmed that multimodal large language models (LLMs) can spontaneously develop human-like object concept representations. This discovery bridges a crucial gap between artificial intelligence (AI) and human cognition, paving the way for more intuitive AI systems.
“The ability to conceptualize objects in nature has long been regarded as the core of human intelligence,” said He Huiguang, a researcher at the Institute of Automation under the Chinese Academy of Sciences (CAS) and the study’s corresponding author.
When humans encounter objects like dogs, cars, or apples, they don’t just recognize physical features like size, color, and shape. They also understand the objects’ functions, emotional values, and cultural significance. This multidimensional understanding forms the cornerstone of human cognition.
The research team from the Institute of Automation and the CAS Center for Excellence in Brain Science and Intelligence Technology combined behavioral experiments with neuroimaging analyses. They designed an innovative approach that integrates computational modeling with brain science to explore how LLMs represent object concepts.
Their findings revealed that 66 dimensions extracted from the LLMs’ behavioral data correlate strongly with neural activity patterns in the human brain, particularly in regions selective for object categorization. When comparing multiple models, multimodal LLMs showed greater consistency with human choice patterns.
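To illustrate the kind of comparison described here, the sketch below uses representational similarity analysis (RSA), a standard way to correlate model-derived object representations with neural activity patterns. Everything in it is a hedged assumption: the data are synthetic, the object and voxel counts are made up, and only the 66-dimension figure comes from the article; the study’s actual method may differ.

```python
import numpy as np

# Hedged sketch, not the study's pipeline: we compare a model-derived
# object embedding with neural activity patterns by correlating their
# representational dissimilarity matrices (RDMs). All data are synthetic.

rng = np.random.default_rng(0)

n_objects = 20    # hypothetical number of object concepts
model_dims = 66   # the article reports 66 behavioral dimensions
n_voxels = 100    # hypothetical number of brain-recording features

model_embedding = rng.random((n_objects, model_dims))
neural_patterns = rng.random((n_objects, n_voxels))

def rdm(x):
    """RDM: 1 minus the Pearson correlation between every pair of rows
    (each row is one object's representation)."""
    return 1.0 - np.corrcoef(x)

def rsa_score(a, b):
    """Correlate the upper triangles of two RDMs (the standard RSA statistic)."""
    iu = np.triu_indices(a.shape[0], k=1)
    return np.corrcoef(a[iu], b[iu])[0, 1]

score = rsa_score(rdm(model_embedding), rdm(neural_patterns))
print(f"model-brain representational similarity: {score:.3f}")
```

With random data the similarity hovers near zero; the study’s point is that dimensions derived from real LLM behavior yield a much stronger alignment with recordings from object-selective brain regions.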
“Our study shows that humans tend to combine visual features and semantic information when making decisions,” He explained. “In contrast, LLMs rely more on semantic labels and abstract concepts.”
This research not only opens a new path for the cognitive science of AI but also offers a theoretical framework for building AI systems with human-like cognitive structures. The implications could be vast, shaping how future AI interacts with the world and understands complex human concepts.
Reference: “Multimodal LLMs can develop human-like object concepts: study,” CGTN (cgtn.com)