Motivational Interviewing (MI) is a widely used talk therapy approach employed by clinicians to guide clients toward healthy behaviour change. Evaluating MI sessions and training MI counsellors relies on behavioural coding, the classification of counsellor and client utterances into predefined categories. Recent advances in Large Language Models (LLMs) now make it possible to automate not only behavioural coding but also the delivery of MI itself. This dissertation introduces AutoMISC, which performs utterance-level parsing and behavioural coding under the Motivational Interviewing Skill Code, the original annotation scheme for MI. Using GPT-4.1, AutoMISC achieves an overall accuracy of 70% and a macro F1 score of 0.42 for counsellor speech (19 categories) and 0.41 for client speech (17 categories) against expert-aligned annotations (n = 821 utterances). Additional validation showed that the codes predict session-level counselling quality in a widely used MI transcript dataset at 87% accuracy, and align with existing annotations in another dataset at 71% accuracy. We also demonstrate how the codes can be used to visualize the trajectory of client motivation over a session alongside counsellor codes.
We apply AutoMISC to the transcripts of a brief smoking cessation intervention experiment in which tobacco smokers conversed with a fully generative MI counsellor chatbot developed in collaboration with experienced MI clinician-scientists. Two versions were tested: (1) a single prompted LLM (106 participants), and (2) a two-stage approach that decouples technique selection from utterance generation (93 participants). Participant-reported confidence in quitting smoking was measured before the conversation and one week later. Both versions yielded an average increase in confidence of 1.7 on a 0-10 scale (p < 0.001 for both). The first version scored well on participant-reported perceived empathy, higher than typical human counsellors, while the second scored lower.
AutoMISC’s analysis of the transcripts provided insights beyond the participant-reported outcomes. Both versions adhered to MI standards in 99% of utterances. We found that the slope of the client’s motivation trajectory correlates with the change in confidence (Spearman’s r = 0.28, p < 0.005 for Version 1; r = 0.20, p = 0.051 for Version 2). This work demonstrates the potential synergy between automated MI delivery and automated MI behavioural coding.
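The abstract reports both overall accuracy and macro F1 for behavioural coding. The two can diverge considerably when some categories are rare, because macro F1 weights every category equally regardless of how often it occurs. The following is a minimal sketch of that general effect, not the evaluation code used in this work: the label names and the toy gold and predicted sequences are invented for illustration, and the functions `accuracy` and `macro_f1` are hypothetical helpers written in plain Python.

```python
def accuracy(gold, pred):
    """Fraction of utterances whose predicted code matches the gold code."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-category F1 over all categories seen."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the category
        # is never predicted correctly.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data (illustrative only): one frequent category and two rare ones.
gold = ["question"] * 7 + ["reflection"] * 2 + ["affirm"]
pred = ["question"] * 7 + ["question", "reflection"] + ["question"]

print(f"accuracy = {accuracy(gold, pred):.2f}")
print(f"macro F1 = {macro_f1(gold, pred):.2f}")
```

In this toy case the frequent category is coded well, so accuracy is high, while the missed rare category pulls the macro F1 down. Whether the same mechanism explains the gap between the reported 70% accuracy and the macro F1 scores is an assumption that would need the per-category results to confirm.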