- Evaluating Generative AI in Mental Health: Systematic Review of Capabilities and Limitations
Background: The global shortage of mental health professionals, exacerbated by increasing mental health needs post-COVID-19, has driven interest in leveraging large language models (LLMs) like ChatGPT to address these challenges through applications such as clinical note generation, personalized treatment planning, and therapeutic support. Objective: This systematic review aims to evaluate the current capabilities of generative AI (genAI) models in the context of mental health applications. Methods: A comprehensive search across five databases yielded 1,046 references, of which eight studies met the inclusion criteria. These criteria required original research with experimental designs (e.g., Turing tests, socio-cognitive tasks, trials, or qualitative methods), a focus on genAI models, and explicit measurement of socio-cognitive abilities (e.g., empathy, emotional awareness), mental health outcomes, and user experience (e.g., perceived trust, empathy). Results: The studies, published between 2023 and 2024, primarily evaluated models like ChatGPT 3.5 and 4.0, Bard, and Claude in tasks such as psychoeducation, diagnosis, emotional awareness, and clinical interventions. Most studies employed zero-shot prompting and human evaluators to assess the AI responses, using standardized rating scales or qualitative analysis. However, these methods were often insufficient to fully capture the complexity of genAI capabilities. The reliance on single-shot evaluation techniques, limited comparisons, and task-based assessments isolated from a specific context may oversimplify genAI’s abilities and overlook the nuances of human-AI interaction, especially in areas requiring contextual reasoning or cultural sensitivity. The findings suggest that while genAI models demonstrate strengths in psychoeducation and emotional awareness, their diagnostic accuracy, cultural competence, and ability to engage users emotionally remain limited. 
Users frequently reported concerns about trustworthiness, accuracy, and the lack of emotional engagement. Conclusions: Future research could use more sophisticated evaluation methods, such as few-shot and chain-of-thought prompting, to fully uncover genAI’s potential. Future studies should also focus on longitudinal research, broader comparisons with human benchmarks, and exploring how AI can be better integrated into mental health care with improved socio-cognitive and ethical decision-making capabilities.
- Prescriptive Predictors of Mindfulness Ecological Momentary Intervention for Social Anxiety Disorder: Machine Learning Analysis of Randomized Controlled Trial Data
Background: Shame and stigma often prevent individuals with social anxiety disorder (SAD) from seeking and attending costly and time-intensive psychotherapies, highlighting the importance of brief, low-cost, and scalable treatments. Creating prescriptive outcome prediction models is thus crucial for identifying which clients with SAD might gain the most from a unique scalable treatment option. Nevertheless, widely used classical regression methods might not optimally capture complex nonlinear associations and interactions. Objective: Precision medicine approaches were thus harnessed to examine prescriptive predictors of optimization to a 14-day fully self-guided mindfulness ecological momentary intervention (MEMI) over a self-monitoring app (SM). Methods: This study involved 191 participants who had probable SAD. Participants were randomly assigned to MEMI (n=96) or SM (n=95). They completed self-reports of symptoms, risk factors, treatment, and sociodemographics at baseline, posttreatment, and 1-month follow-up (1MFU). Machine learning (ML) models with 17 predictors of optimization to MEMI over SM, defined as a higher probability of SAD remission from MEMI at posttreatment and 1MFU, were evaluated. The Social Phobia Diagnostic Questionnaire, structurally equivalent to the Diagnostic and Statistical Manual SAD criteria, was used to define remission. These ML models included random forest and support vector machines (radial basis function kernel) and 10-fold nested cross-validation that separated model training, minimal tuning in inner folds, and model testing in outer folds. Results: ML models outperformed logistic regression. The multivariable ML models using the 10 most important predictors achieved good performance, with the area under the receiver operating characteristic curve (AU-ROC) values ranging from .71 to .72 at posttreatment and 1MFU. 
These prerandomization and early-stage prescriptive predictors consistently identified which participants had the highest probability of optimization of MEMI over SM after 14 days and 6 weeks from baseline. Significant predictors included 4 strengths (higher trait mindfulness, lower SAD severity, presence of university education, no current psychotropic medication use), 2 weaknesses (higher generalized anxiety severity and clinician-diagnosed depression or anxiety disorder), and 1 sociodemographic variable (Chinese ethnicity). Emotion dysregulation and current psychotherapy predicted remission with inconsistent signs across time points. Conclusions: The AU-ROC values indicated moderately meaningful effect sizes in identifying prescriptive predictors within multivariable models for clients with SAD. Focusing on the identified notable client strengths, weaknesses, and Chinese ethnicity may enhance our ability to predict future responses to scalable treatments. Estimating the likelihood of SAD remission with a “prescriptive predictor calculator” for each client may help clinicians and policymakers allocate scarce treatment resources effectively. Clients with high remission probability may benefit from receiving the MEMI as a vigilant waitlist strategy before intensive therapist-led psychotherapy. These efforts may aid in creating actionable treatment selection tools to optimize care for clients with SAD in routine health care settings that use stratified care principles. Clinical Trial: OSF Registries 10.17605/OSF.IO/M3KXZ; https://osf.io/m3kxz
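The nested cross-validation mentioned in the abstract above keeps hyperparameter tuning (inner folds) separate from model testing (outer folds). The following is a minimal sketch of how such splits could be generated in plain Python. It is not the study's code: the function names, the interleaved fold assignment, and the inner fold count of 5 are assumptions made for illustration (the abstract states only that the outer scheme was 10-fold).

```python
from typing import Iterator, List, Tuple


def k_fold_indices(indices: List[int], k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k interleaved folds (hypothetical helper)."""
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def nested_cv_splits(n_samples: int, outer_k: int = 10, inner_k: int = 5):
    """Yield (outer_train, outer_test, inner_splits) for nested cross-validation.

    The inner splits are drawn only from outer_train, so the outer test fold
    never influences tuning. inner_k=5 is an assumed value, not from the source.
    """
    all_indices = list(range(n_samples))
    for outer_train, outer_test in k_fold_indices(all_indices, outer_k):
        inner_splits = list(k_fold_indices(outer_train, inner_k))
        yield outer_train, outer_test, inner_splits


if __name__ == "__main__":
    # 191 matches the sample size reported in the abstract.
    for fold_no, (train, test, inner) in enumerate(nested_cv_splits(191), start=1):
        print(fold_no, len(train), len(test), len(inner))
```

In practice the indices would usually be shuffled and stratified by outcome before splitting; that step is omitted here to keep the sketch short.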
- Digital Integrated Interventions for Comorbid Depression and Substance Use Disorder: Narrative Review and Content Analysis
Background: Integrated digital interventions for the treatment of comorbid depression and substance use disorder have been developed, and evidence of their effectiveness is mixed. Objective: This study aimed to explore potential reasons for mixed findings in the literature on integrated digital treatments. We described the methodologies and core characteristics of these interventions, identified the presence of evidence-based treatment strategies, examined patterns across digital modalities, and highlighted areas of overlap as well as critical gaps in the existing evidence base. Methods: In June 2024, a literature search was conducted in Google Scholar to identify digital integrated interventions for comorbid major depressive disorder and substance use disorder. Articles were included if they described interventions targeting both conditions simultaneously; were grounded in cognitive behavioral therapy, motivational interviewing, or motivational enhancement therapy; and were delivered at least in part via digital modalities. In total, 14 studies meeting these criteria were coded using an open-coding approach to identify intervention characteristics and treatment strategies (n=25). Statistical analyses summarized descriptive statistics to capture the frequency and overlap of these strategies. Results: Studies included a range of digital modalities: internet (n=6, 43%), computer (n=3, 21%), smartphone (n=2, 14%), and supportive text messaging interventions (n=3, 21%). Half (n=7, 50%) of the studies included participants with mild to moderate depression symptom severity and hazardous substance use. Only 36% (n=5) of the studies required participants to meet full diagnostic criteria for major depressive disorder for inclusion and 21% (n=3) required a substance use disorder diagnosis. Most interventions targeted adults (n=11, 79%), with few targeting young or emerging adults (n=4, 29%), and only 36% (n=5) reported detailed demographic data. 
Treatment duration averaged 10.3 (SD 6.8) weeks. Internet-based interventions offered the widest range of treatment strategies (mean 11.7), while supportive text messaging used the fewest (mean 4.6). Common treatment strategies included self-monitoring (n=11, 79%), psychoeducation (n=10, 71%), and coping skills (n=9, 64%). Interventions often combined therapeutic strategies, with psychoeducation frequently paired with self-monitoring (n=9, 64%), assessment (n=7, 50%), coping skills (n=7, 50%), decisional balance (n=7, 50%), feedback (n=7, 50%), and goal setting (n=7, 50%). Conclusions: Among integrated digital interventions for comorbid depression and substance use, there was noteworthy variability in methodology, inclusion criteria, digital modalities, and embedded treatment strategies. Without standardized methods, comparison of the clinical outcomes across studies is challenging. These results emphasize the critical need for future research to adopt standardized approaches to facilitate more accurate comparisons and a clearer understanding of intervention efficacy.
- The Prevalence and Incidence of Suicidal Thoughts and Behavior in a Smartphone-Delivered Treatment Trial for Body Dysmorphic Disorder: Cohort Study
Background: People with past suicidal thoughts and behavior (STB) are often excluded from digital mental health intervention (DMHI) treatment trials. This may perpetuate barriers to care and reduce treatment generalizability, especially in populations with elevated rates of STB, such as people with body dysmorphic disorder (BDD). We conducted a randomized controlled trial (RCT, N = 80) of a smartphone-based cognitive behavioral therapy (CBT) for BDD that allowed for most forms of past STB except for past-month active suicidal ideation. Objective: This study had two objectives: to (1) characterize the sample’s lifetime prevalence of STB, and (2) estimate and predict STB incidence during the trial. Methods: We completed secondary analyses on data from an RCT of smartphone-delivered CBT for BDD. The primary outcomes consisted of STB severity and suicide attempt assessed at baseline with the Columbia-Suicide Severity Rating Scale (C-SSRS) and weekly during the trial via one item from the Quick Inventory of Depressive Symptomatology-Self Report (QIDS-SR item #12; 1,043 observations). We computed descriptive statistics (M, %) and ran a series of bi- and multivariate linear regressions predicting STB incidence during the three-month trial. Results: At baseline, 40% of participants reported lifetime history of active suicidal thoughts and 10% reported lifetime suicide attempts. During the three-month trial, 42.5% reported thinking about death and/or suicide via weekly assessment. No participants reported frequent/acute suicidal thoughts, plans, or attempts. Lifetime suicide attempt (OR = 11.0, p < .01) and lifetime severity of suicidal thoughts (OR = 1.76, p < .01) were significant bivariate predictors of death-/suicide-related thought incidence reported during the trial.
Multivariate models including STB risk factor covariates (e.g., age, sexual orientation) modestly improved prediction of death-/suicide-related thoughts (e.g., PPV = .91, NPV = .75, AUC = .83). Conclusions: Although some participants may think about death and/or suicide during a DMHI trial, it may be safe and feasible to include participants with most forms of past STB. Among other procedures, researchers should carefully select eligibility criteria, use frequent, ongoing, low-burden, and valid monitoring procedures, and implement risk mitigation protocols tailored to the presenting problem. Clinical Trial: ClinicalTrials.gov NCT04034693; https://www.clinicaltrials.gov/study/NCT04034693
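For readers less familiar with the predictive values reported in the abstract above, here is a small sketch of how positive predictive value (PPV) and negative predictive value (NPV) are derived from a confusion matrix. The function name and the counts in the usage example are hypothetical and are not taken from the trial.

```python
def predictive_values(tp: int, fp: int, tn: int, fn: int) -> tuple:
    """Return (PPV, NPV) from confusion-matrix counts.

    PPV = TP / (TP + FP): share of predicted positives that are true positives.
    NPV = TN / (TN + FN): share of predicted negatives that are true negatives.
    """
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("PPV or NPV is undefined when a predicted class is empty")
    return tp / (tp + fp), tn / (tn + fn)


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the arithmetic.
    ppv, npv = predictive_values(tp=18, fp=2, tn=45, fn=15)
    print(f"PPV={ppv:.2f}, NPV={npv:.2f}")
```

Both quantities depend on how common the outcome is in the sample, so values from one trial may not carry over to populations with a different base rate.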
- Game Design, Effectiveness, and Implementation of Serious Games Promoting Aspects of Mental Health Literacy Among Children and Adolescents: Systematic Review
Background: The effects of traditional health-promoting and preventive interventions in mental health and mental health literacy are often attenuated by low adherence and user engagement. Gamified approaches such as serious games (SGs) may be useful to reach and engage youth for mental health prevention and promotion. Objective: This study aims to systematically review the literature on SGs designed to promote aspects of mental health literacy among adolescents aged 10 to 14 years, focusing on game design characteristics and the evaluation of user engagement, as well as efficacy, effectiveness, and implementation-related factors. Methods: We searched PubMed, Scopus, and PsycINFO for original studies, intervention development studies, and study protocols that described the development, characteristics, and evaluation of SG interventions promoting aspects of mental health literacy among adolescents aged 10 to 14 years. We included SGs developed for both universal and selected prevention. Using the co.LAB framework, which considers aspects of learning design, game mechanics, and game design, we coded the design elements of the SGs described in the studies. We coded the characteristics of the evaluation studies; indicators of efficacy, effectiveness, and user engagement; and factors potentially fostering or hindering the reach, efficacy and effectiveness, organizational adoption, implementation, and maintenance of the SGs. Results: We retrieved 1454 records through database searches and other sources. Of these, 36 (2.48%) studies describing 17 distinct SGs were included in the review. Most of the SGs (14/17, 82%) were targeted to a universal population of youth, with learning objectives mainly focusing on how to obtain and maintain good mental health and on enhancing help-seeking efficacy. All SGs were single-player games, and many (7/17, 41%) were embedded within a wider pedagogical scenario. 
Diverse game mechanics and game elements (eg, minigames and quizzes) were used to foster user engagement. Most of the SGs (12/17, 71%) featured an overarching storyline resembling real-world scenarios, fictional scenarios, or a combination of both. The evaluation studies provided evidence for the short-term efficacy and effectiveness of SGs in improving aspects of mental health literacy as well as their feasibility. However, the evidence was mostly based on small samples, and user adherence was sometimes low. Conclusions: The results of this review may inform the future development and implementation of SGs for adolescents. Intervention co-design, the involvement of facilitators (eg, teachers), and the use of diverse game mechanics and customization to meet the needs of diverse users are examples of elements that may promote intervention success. Although there is promising evidence for the efficacy and effectiveness of SGs for promoting mental health literacy in youth, there is a need for more rigorously planned studies, including randomized controlled trials and real-world evaluations, that involve follow-up measures and the assessment of in-game performance alongside self-reports.
- Impact of Conversational and Animation Features of a Mental Health App Virtual Agent on Depressive Symptoms and User Experience Among College Students: Randomized Controlled Trial
Background: Numerous mental health applications (MHealth apps) purport to alleviate depressive symptoms. Strong evidence suggests that brief cognitive behavioral therapy (bCBT)-based MHealth apps can decrease depressive symptoms, yet there is limited research elucidating the specific features that may augment their therapeutic benefits. One potential design feature that may influence effectiveness and user experience is the inclusion of virtual agents that can mimic realistic, human face-to-face interactions. Objective: The goal of the current experiment was to determine the effect of conversational and animation features of a virtual agent within a bCBT-based MHealth app on depressive symptoms and user experience in college students with and without depressive symptoms. Methods: College students (N=209) completed a two-week intervention in which they engaged with a bCBT-based MHealth app with a customizable therapeutic virtual agent that varied in conversational and animation features. A 2 (Time: Baseline vs. Two-Week Follow-Up) x 2 (Conversational vs. Non-Conversational Agent) x 2 (Animated vs. Non-Animated Agent) randomized controlled trial was utilized to assess mental health symptoms (PHQ-8, PSS-10, and RRS questionnaires) and user experience (MAUQ questionnaire) in college students with and without current depressive symptoms. MHealth app usability and qualitative questions regarding users’ perceptions of their therapeutic virtual agent interactions and customization process were assessed at follow-up. Results: Mixed ANOVA results demonstrated a significant decrease in symptoms of depression (P = .002; M = 5.50±4.86 at follow-up vs. M = 6.35±4.71 at baseline), stress (P = .005; M = 15.91±7.67 at follow-up vs. M = 17.02±6.81 at baseline), and rumination (P = .028; M = 40.42±12.96 at follow-up vs. M = 41.92±13.61 at baseline); however, no significant effect of conversation or animation was observed.
Findings also indicate a significant increase in user experience in animated conditions. This significant increase in animated conditions is also reflected in the user’s ease of use and satisfaction (F(1, 201) = 102.60, P < .001), system information arrangement (F(1, 201) = 123.12, P < .001), and usefulness of the application (F(1, 201) = 3667.62, P < .001). Conclusions: The current experiment provides support for bCBT-based MHealth apps featuring customizable, humanlike therapeutic virtual agents and their ability to significantly reduce negative symptomology over a brief timeframe. The app intervention reduced mental health symptoms, regardless of whether the agent included conversational or animation features, but animation features enhanced the user experience. These effects were observed in both users with and without depressive symptoms. Clinical Trial: Open Science Framework B2HX5; https://doi.org/10.17605/OSF.IO/B2HX5
- Health Care Professionals' Engagement With Digital Mental Health Interventions in the United Kingdom and China: Mixed Methods Study on Engagement Factors and Design Implications
Background: Mental health issues like occupational stress and burnout, compounded by the after-effects of COVID-19, have affected healthcare professionals (HCPs) around the world. Digital mental health interventions (DMHIs) can be accessible and effective in supporting well-being among HCPs. However, low rates of engagement with DMHIs are frequently reported, limiting their potential effectiveness. More evidence is needed to reveal the factors that impact HCPs’ decision to adopt and engage with DMHIs. Objective: This study aims to explore HCPs’ motivation to engage with DMHIs and identify key factors affecting their engagement, including cultural factors that shape DMHI perception and engagement among HCPs. Methods: We used a mixed methods approach, with a cross-sectional survey (N=438) and semi-structured interviews (N=25) with HCPs from the UK and China. Participants were recruited from one major public hospital in each country. Results: Our results demonstrated a generally low engagement rate with DMHIs among HCPs from the two countries. Several key factors that affect DMHI engagement were identified, including belonging to underrepresented cultural and ethnic groups, limited mental health knowledge, low perceived need, lack of time, needs for relevance and personal-based support, and cultural elements like self-stigma. The results support recommendations for DMHIs for HCPs. Conclusions: Although DMHIs can be an ideal alternative mental health support for HCPs, engagement rates among HCPs in China and the UK are still low due to multiple factors and barriers. More research is needed to develop and evaluate tailored DMHIs with unique designs and content that HCPs from various cultural backgrounds can engage with.
- Psychotherapy Access Barriers and Interest in Digital Mental Health Interventions Among Adults With Treatment Needs: Survey Study
Background: Digital mental health interventions (DMHIs) are a promising approach to reducing the public health burden of mental illness. DMHIs are efficacious, can provide evidence-based treatment with few resources, and are highly scalable relative to one-on-one face-to-face psychotherapy. There is potential for DMHIs to substantially reduce unmet treatment needs by circumventing structural barriers to treatment access (eg, cost, geography, and time). However, epidemiological research on perceived barriers to mental health care use demonstrates that attitudinal barriers, such as the lack of perceived need for treatment, are the most common self-reported reasons for not accessing care. Thus, the most important barriers to accessing traditional psychotherapy may also be barriers to accessing DMHIs. Objective: This study aimed to explore whether attitudinal barriers to traditional psychotherapy access might also serve as barriers to DMHI uptake. We explored the relationships between individuals’ structural versus attitudinal barriers to accessing psychotherapy and their indicators of potential use of internet-delivered guided self-help (GSH). Methods: We collected survey data from 971 US adults who were recruited online via Prolific and screened for the presence of psychological distress. Participants provided information about demographic characteristics, current symptoms, and the use of psychotherapy in the past year. Those without past-year psychotherapy use (640/971, 65.9%) answered questions about perceived barriers to psychotherapy access, selecting all contributing barriers to not using psychotherapy and a primary barrier. Participants also read detailed information about a GSH intervention. Primary outcomes were participants’ self-reported interest in the GSH intervention and self-reported likelihood of using the intervention if offered to them. 
Results: Individuals who had used psychotherapy in the past year reported greater interest in GSH than those who had not (odds ratio [OR] 2.38, 95% CI 1.86-3.06; P<.001) and greater self-reported likelihood of using GSH (OR 2.25, 95% CI 1.71-2.96; P<.001). Attitudinal primary barriers (eg, lack of perceived need; 336/640, 52.5%) were more common than structural primary barriers (eg, money or insurance; 244/640, 38.1%). Relative to endorsing a structural primary barrier, endorsing an attitudinal primary barrier was associated with lower interest in GSH (OR 0.44, 95% CI 0.32-0.6; across all 3 barrier types, P<.001) and lower self-reported likelihood of using GSH (OR 0.61, 95% CI 0.43-0.87; P=.045). We found no statistically significant differences in primary study outcomes by race or ethnicity or by income, but income had a statistically significant relationship with primary barrier type (ORs 0.27-3.71; P=.045). Conclusions: Our findings suggest that attitudinal barriers to traditional psychotherapy use may also serve as barriers to DMHI use, suggesting that disregarding the role of attitudinal barriers may limit the reach of DMHIs. Future research should seek to further understand the relationship between general treatment-seeking attitudes and attitudes about DMHIs to inform the design and marketing of DMHIs.
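The odds ratios and 95% CIs in the abstract above can be read with the help of a small worked example. The sketch below computes an unadjusted odds ratio and a Wald-type 95% CI from a 2x2 table. It is illustrative only: the study's estimates presumably come from regression models, and the function name and the counts used here are hypothetical.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> tuple:
    """Return (odds ratio, lower, upper) for a 2x2 table.

    a: group 1 with outcome      b: group 1 without outcome
    c: group 2 with outcome      d: group 2 without outcome
    The interval is a Wald CI computed on the log-odds scale.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this estimator")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, not data from the survey.
    estimate, lower, upper = odds_ratio_ci(a=40, b=60, c=20, d=80)
    print(f"OR {estimate:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

An interval that excludes 1 corresponds to a difference in odds that is statistically significant at roughly the 5% level, which is how the reported intervals (for example, 0.32-0.6 for interest in GSH) can be interpreted.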
- Mental Health Professionals’ Technology Usage and Attitudes Toward Digital Health for Psychosis: Comparative Cross-Sectional Survey Study
Background: Digital health technologies (DHTs) for psychosis have been developed and tested rapidly in recent years, and the COVID-19 pandemic accelerated the transition to telehealth and digital health. However, research examining mental health professionals’ views on the use of DHTs for people with psychosis is limited. Given the increased accessibility and availability of DHTs for psychosis, an up-to-date understanding of staff perception of DHTs for psychosis is warranted. Objective: In this study, we investigated: i) staff technology usage and their perception of service users’ technology usage; ii) staff views towards the use of DHTs for psychosis; and iii) the differences in staff technology usage and views towards the use of DHTs in clinical practice before and after the COVID-19 pandemic. Methods: Two cross-sectional surveys were deployed before and after the pandemic. Both surveys were distributed to mental healthcare staff who had experience of working with individuals experiencing psychosis in the UK. Results: A total of 155 and 352 participants completed the Phase 1 and Phase 2 surveys respectively. Staff reported high levels of technology ownership and usage. In general, staff expressed positive views regarding the use of DHTs for psychosis; however, barriers and concerns, including affordability, digital literacy, and potential negative effects on service users’ mental health, were also reported. There was no change in staff use of digital technology in clinical practice pre-post pandemic. Conclusions: Staff expressed optimism about the potential implementation of DHTs in practice, though they also noted some concerns regarding safety and access. Although the COVID-19 pandemic accelerated the adoption of digital tools for healthcare, the sustainability of this shift from traditional to digital healthcare has been less than optimal. 
To promote the implementation of DHTs, systematic evaluation of adverse effects of using DHTs and dissemination of evidence are needed to address concerns staff expressed regarding safety. Organisational support and training should be offered to staff to help address barriers and increase confidence in recommending and utilising DHTs with service users.
- Effectiveness of General Practitioner Referral Versus Self-Referral Pathways to Guided Internet-Delivered Cognitive Behavioral Therapy for Depression, Panic Disorder, and Social Anxiety Disorder: Naturalistic Study
Background: Therapist-guided internet-delivered cognitive behavioral therapy (guided ICBT) appears to be efficacious for depression, panic disorder (PD) and social anxiety disorder (SAD) in routine care clinical settings. However, implementation of guided ICBT in specialist mental health services is limited, partly due to low referral rates from general practitioners (GPs), which may stem from lack of awareness, limited knowledge of its effectiveness, or negative attitudes toward the treatment format. In response, self-referral systems were introduced in mental health care about a decade ago to improve access to care. Yet, little is known about how referral pathways may affect treatment outcomes in guided ICBT. Objective: This study aims to compare the overall treatment effectiveness of GP referral and self-referral to guided ICBT for patients with depression, PD or SAD in a specialized routine care clinic. This study also explores whether the treatment effectiveness varies between referral pathways and the respective diagnoses. Methods: This naturalistic open effectiveness study compares treatment outcomes from pre-treatment to post-treatment and from pre-treatment to 6-month follow-up across two referral pathways. All patients underwent module-based guided ICBT lasting up to 14 weeks. The modules covered psychoeducation, working with negative or automatic thoughts, exposure training, and relapse prevention. Patients received weekly therapist guidance through asynchronous messaging, with therapists spending an average of 10–30 minutes per patient per week. Patients self-reported symptoms before, during, immediately after, and 6 months after treatment. Level and change in symptom severity were measured across all diagnoses. Results: In total, 460 patients met the inclusion criteria: 305 GP-referred (GP) and 155 self-referred (Self). Across the total sample, about 60% were female, the mean age was 32 years, and the average duration of disorder was 10 years.
We found no significant differences in pre-treatment symptom levels between referral pathways, across the diagnoses. Estimated effect sizes based on linear mixed modelling showed large improvements from pre- to post-treatment and from pre-treatment to follow-up across all diagnoses, with statistically significant differences between referral pathways overall (GP: 0.97-1.22; Self: 1.34-1.58; P: <.001 to .002). Estimates for the diagnoses separately were as follows: depression (GP: 0.86-1.26; Self: 1.97-2.07; P: <.001 to .018), PD (GP: 1.32-1.60; Self: 1.64-2.08; P: .065 to .016), and SAD (GP: 0.80-0.99; Self: 0.99-1.19; P: .178 to .222). Conclusions: Self-referral to guided ICBT for depression and panic disorder appears to yield better treatment outcomes than GP referral. We found no difference in outcome between referral pathways for SAD. This study underscores the potential of self-referral pathways to enhance access to evidence-based psychological treatment, improve treatment outcomes, and promote sustained engagement in specialist mental health services. Future studies should examine the effect of the self-referral pathway when it is implemented on a larger scale.