Bridging Human Deception to AI Exploitation: A Systematic Review of Psychological Strategies in LLM Model Manipulation

The rapid advancement of large language models (LLMs) has introduced unprecedented capabilities in human-AI interaction, yet it has also created new opportunities for exploitation and manipulation. This systematic literature review investigates the psychological tactics behind the exploitation of LLMs, establishing connections between human deception and AI manipulation. The study integrates prior research on how adversarial actors manipulate LLMs, identifies gaps in current knowledge, and proposes directions for future work to address these threats. The review organizes the literature along core dimensions: deception and manipulation in LLMs, vulnerabilities that enable the circumvention of safeguards, attacks based on psychological manipulation, and ethical implications, while also examining the cognitive and behavioral aspects of LLM interactions. The findings indicate that large language models are vulnerable to a wide range of adversarial approaches, many of which resemble conventional human deception techniques, underscoring the need for resilient detection and assessment strategies. The results also highlight the importance of interdisciplinary methods that integrate cognitive psychology, computer science, and ethics to address the growing challenge of LLM misuse. In conclusion, this analysis advances understanding of the psychological mechanisms underlying LLM manipulation and offers practical recommendations for improving model security and robustness in real-world deployments.