Can AI learn to understand our internal mental states? A primer on concepts.
Theory of Mind (ToM) and its relationship to trust
Can future technologies learn to understand the internal intentions and motivations of humans? Can they build trust with humans? Does technology have the potential to develop a version of Theory of Mind (ToM)?
Theory of Mind [1] is a well-established construct in psychology: the human ability to attribute internal thoughts and feelings to others. On this view, that ability is the foundation on which humans build the capacity to trust one another. Active research in AI is exploring whether analogous, human-like capabilities can be built into machines, with the aim of improving the efficacy and impact of human-AI interactions.
In human-human interactions, Theory of Mind helps explain how children develop the interpersonal skills needed to navigate complex social situations, and successfully navigating those situations is part of how the ability to trust is established. Traditional tests of Theory of Mind, such as the 'false-belief task' [2], have been joined by tests like the 'representational change' task [3] and the 'penny-hiding game' [4], which broaden accessibility across populations.
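To make the logic of the false-belief task concrete, here is a toy sketch in Python following the classic Sally-Anne setup. The variable names and dictionary encoding are purely illustrative; the real task is administered verbally to children, not programmatically.

```python
# Toy encoding of the Sally-Anne false-belief task's logic (illustrative only).

world = {"marble": "basket"}          # ground truth at the start
sally_belief = {"marble": "basket"}   # Sally sees the marble placed here

# Sally leaves the room; Anne moves the marble. The world changes,
# but Sally's belief does not update, because she did not observe the move.
world["marble"] = "box"

# The false-belief question: where will Sally look for the marble?
# Passing requires answering from Sally's belief, not from the world state.
answer_from_belief = sally_belief["marble"]   # "basket" -> pass
answer_from_world = world["marble"]           # "box"    -> fail

print(f"Sally will look in the {answer_from_belief}, "
      f"even though the marble is in the {answer_from_world}.")
```

Passing the task means keeping two representations apart: the state of the world and another agent's (now false) belief about it.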
Within the field of AI, analogous research has produced early versions of an Artificial Theory of Mind (AToM) [5]. A prominent example is Google DeepMind's ToMnet [6], which combines three neural networks (a character net, a mental state net, and a prediction net) so that an observer AI can learn to infer the internal 'mental' states of other AI agents and predict their behavior.
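The three-network decomposition can be sketched in a few lines. The PyTorch code below is an illustrative skeleton only: the character net, mental state net, and prediction net follow Rabinowitz et al. [6], but all layer sizes, encoders, and the toy tensors are simplifying assumptions, not the paper's actual configuration.

```python
# A minimal, illustrative sketch of ToMnet's three-network structure.
import torch
import torch.nn as nn

class CharacterNet(nn.Module):
    """Embeds an observed agent's past-episode trajectories into a character vector."""
    def __init__(self, obs_dim, char_dim):
        super().__init__()
        self.rnn = nn.GRU(obs_dim, char_dim, batch_first=True)

    def forward(self, past_traj):           # (batch, steps, obs_dim)
        _, h = self.rnn(past_traj)
        return h.squeeze(0)                 # (batch, char_dim)

class MentalStateNet(nn.Module):
    """Embeds the current episode so far into a 'mental state' vector."""
    def __init__(self, obs_dim, mental_dim):
        super().__init__()
        self.rnn = nn.GRU(obs_dim, mental_dim, batch_first=True)

    def forward(self, current_traj):
        _, h = self.rnn(current_traj)
        return h.squeeze(0)                 # (batch, mental_dim)

class PredictionNet(nn.Module):
    """Predicts the observed agent's next action from both embeddings."""
    def __init__(self, char_dim, mental_dim, n_actions):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(char_dim + mental_dim, 64),
            nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, char_vec, mental_vec):
        return self.head(torch.cat([char_vec, mental_vec], dim=-1))  # action logits

# Example: one observer step over toy trajectories (random stand-in data).
obs_dim, char_dim, mental_dim, n_actions = 8, 16, 16, 5
character = CharacterNet(obs_dim, char_dim)
mental = MentalStateNet(obs_dim, mental_dim)
predictor = PredictionNet(char_dim, mental_dim, n_actions)

past = torch.randn(1, 20, obs_dim)       # past episodes of the observed agent
current = torch.randn(1, 7, obs_dim)     # the current episode so far
logits = predictor(character(past), mental(current))
print(logits.shape)                      # torch.Size([1, 5]): next-action prediction
```

The design choice worth noticing is the split between a slow, trait-like summary of who an agent is (character net) and a fast, episode-specific summary of what it currently believes (mental state net); the prediction net then uses both to anticipate behavior.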
The promise of an artificial Theory of Mind is that AI could better understand humans, increasing the opportunities for trust in human-AI interactions. But open technical and ethical questions remain. Technically, how would researchers know that an AI system has developed a genuine Theory of Mind rather than an advanced form of statistical pattern matching? Ethically, is it appropriate for AI to model the internal mental states of humans at all, given that such models create opportunities for manipulation at scale?
As with any technology, there is potential for both benefit and harm. As Theory of Mind research advances within AI, we have the opportunity to be more aware, more deliberate, and more vigilant in shaping it. By fostering open communication and prioritizing ethical considerations, we can work to ensure that Artificial Theory of Mind becomes a tool for building trust and understanding, not manipulation.
1. Weiskopf, D. (n.d.). The theory-theory of concepts. Internet Encyclopedia of Philosophy. https://iep.utm.edu/theory-theory-of-concepts/
2. Schidelko, L. P., Huemer, M., Schröder, L. M., Lueb, A. S., Perner, J., & Rakoczy, H. (2021). Why do children who solve false belief tasks begin to find true belief control tasks difficult? A test of pragmatic performance factors in theory of mind tasks. Frontiers in Psychology. https://www.frontiersin.org/journals/psychology/articles/10.3389/fpsyg.2021.797246/full
3. Korovkin, S., Vladimirov, I., Chistopolskaya, A., & Savinova, A. (2018). How working memory provides representational change during insight problem solving. Frontiers in Psychology. https://www.frontiersin.org/journals/psychology/articles/10.3389/fpsyg.2018.01864/full
4. Keren, N. (n.d.). The role of inhibitory control and social cognition in a naturalistic theory of mind test. Royal College of Psychiatrists. https://www.rcpsych.ac.uk/docs/default-source/members/divisions/london/london-essay-prizes/london-prev-prize-winners-2012-the-role-of-inhibitory-control-and-social-cognition-in-a-naturalistic-theory-of-mind-test.pdf?sfvrsn=7c2541a5_4
5. Williams, J., Fiore, S. M., & Jentsch, F. (2022). Supporting artificial social intelligence with theory of mind. Frontiers in Artificial Intelligence. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8919046
6. Rabinowitz, N. C., Perbet, F., Song, H. F., Zhang, C., Eslami, S. M. A., & Botvinick, M. (2018). Machine theory of mind. arXiv. https://arxiv.org/abs/1802.07740