The FBI's Behavioral Analysis Unit has held since 2015 that all perpetrators of mass violence are best categorized as instrumental offenders. This dissertation rejects that position. One hundred pre-attack communications were analyzed with LIWC2015 along four summary dimensions (Analytic, Clout, Authentic, Tone); K-means clustering separated the corpus into two psychometrically distinct types, Affective (n=53) and Instrumental (n=47), with linear discriminant analysis correctly classifying 100% of cases. A binary logistic regression on four refined attack-behavior variables (lethality, demographic-based targeting, perpetrator's home among attack locations, perpetrator killed by another person) recovered cluster membership at McFadden R² = 0.24, with the strongest single predictor (perpetrator killed) carrying an odds ratio of 14.11. The contribution is the first within-groups typology of mass murderers with statistically validated predictive ties to specific attack characteristics, intended as input to threat assessment and risk management for discovered pre-attack communications.
01 / Theoretical Background
Existing typologies of mass violence fall into three categories: those that distinguish mass murderers from the general population, those that distinguish them from other violent offenders, and those that distinguish kinds of mass murderers from each other. The first two have produced workable likelihood-of-attack screens (Knoll & Meloy, 2014; Langman, 2009; Verlinden, Hersen, & Thomas, 2000). The third has not. Theoretical proposals from Dietz (1986), Holmes and Holmes (1994), and Kelleher (1997) have not been empirically replicated, and recent attempts at empirical within-groups typology (Langman, 2009; Ioannou, Hammond, & Simpson, 2015) sort attackers ex post facto without predicting how an attack will unfold. The literature has missed a within-groups distinction with statistically validated ties to specific attack behaviors.
In threat assessment, the standard frame separates threat (the probability an event will occur) from risk (the potential impacts if it does). A burglary of an unoccupied home is low-threat and low-risk. A school shooting is low-threat and high-risk: low probability for any given school, catastrophic impact when it happens. An armed individual with credible communicated intent at an accessible location is high-threat and high-risk. The dissertation does not address the threat side of this matrix (whether an attack will occur). It addresses the risk side (what kind of attack is more likely, given that a pre-attack communication has been discovered), by treating that communication as a measurable psychometric signal.
| Threat (probability of event) | Low Risk (limited impact) | High Risk (severe impact) |
|---|---|---|
| High Threat | High-probability events with minimal destructive impact. Procedural deterrence is the standard response. | Example: An armed individual with credible communicated intent attacking an accessible location. Mandates immediate tactical intervention. |
| Low Threat | Example: Burglary of an unoccupied home. Low probability for any given home, limited impact. | Example: A school shooting in the abstract. Dire consequences for any individual school, but low base rate. |
Separating attackers into affective and instrumental types allows linguistic signals from a discovered pre-attack communication to inform what kind of attack is more likely, rather than waiting on the event itself to disclose its character.
02 / Methodology
The methodological premise: as many as 67% of perpetrators of mass violence create meaningful pre-attack communications (Hempel, Meloy, & Richards, 1999; Meloy, Hempel, Mohandie, Shiva, & Gray, 2001; Langman, 2013; Silver, Horgan, & Gill, 2018). Most perpetrators do not survive their attack, but the communications do. Computer-mediated linguistic analysis on those communications is the only route to standardized, posthumous psychological characterization of this population, sidestepping the field's central methodological constraint.
This dissertation also rejects the FBI's definition of mass murder (four or more killed without a significant cooling-off period) because that definition embeds a success component external to the perpetrator's psychology. Geddy Kramer, who at a FedEx facility in 2014 fired on six coworkers but killed only himself, was prepared and committed to a mass attack; his victims survived because of emergency response speed, hospital proximity, and his choice of less-lethal ammunition. The FBI definition excludes him. The definition used here, "any perpetrator who presents as prepared and motivated to murder four or more individuals, with insubstantial time passing between violent acts," includes him and treats lack of success as a typology factor rather than a disqualifier.
One hundred pre-attack communications were collected from individuals who carried out or attempted mass violence between 1966 and 2019. Sources included SchoolShooters.info (curated by Peter Langman), the Stanford Mass Shootings of America database, the Mother Jones US mass shootings dataset, and the Gun Violence Archive. LIWC2015 (Pennebaker, Booth, Boyd, & Francis, 2015) generated standardized percentile scores along four summary variables: Analytic, Clout, Authentic, and Tone. K-means clustering and binary logistic regression were chosen to suit the small-sample, mixed-variable structure.
100 Communications Analyzed · 53 Affective Type · 47 Instrumental Type · McFadden R² = 0.24
LIWC2015 Summary Variables
| Variable | High Score | Low Score |
|---|---|---|
| Analytic | Formal, logical, hierarchical thinking | Personal, narrative, present-focused |
| Clout | Confident, expert, authoritative | Tentative, humble, anxious |
| Authentic | Honest, personal, disclosing | Guarded, distanced, evasive |
| Tone | Positive, upbeat emotional valence | Anxious, sad, or hostile |
03 / Cluster Analysis
K-means cluster analysis of the four LIWC summary variables produced two cleanly separated groups. Linear discriminant analysis correctly classified 100% of the 100 cases under cross-validation, with Clout (r = .88), Authentic (r = -.80), and Analytic (r = .64) contributing most strongly to the separation between clusters and Tone (r = .48) contributing moderately.
Cluster 1, the Affective Type (n = 53), shows the lower Analytic, lower Clout, lower Tone, and higher Authentic profile. Their writing reads personal, present-tense, narrative, tentative, often saturated with hostility and despair. Cluster 2, the Instrumental Type (n = 47), shows the inverse profile: higher Analytic, higher Clout, higher Tone, lower Authentic. Their writing reads formal, logical, hierarchical, confident, guarded; closer to ideological treatise than to personal grievance. The split aligns with the theoretical distinction in the violence literature between emotionally-driven and goal-driven offending (Dodge, 1991; Glenn & Raine, 2009; McEllistrem, 2004; Meloy, 1988), but applies it within the mass-murder population for the first time.
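The clustering step can be illustrated with a minimal sketch of K-means (Lloyd's algorithm) in plain Python. This is not the dissertation's code or data: the first and fourth rows are the Murray and Breivik profiles reported in this summary, the other four rows are invented for illustration, and the farthest-point seeding and the absence of any standardization step are assumptions made to keep the example deterministic and short.

```python
import math

def farthest_point_seeds(points, k):
    """Deterministic seeding: start at points[0], then repeatedly add the
    point farthest from the seeds chosen so far (an assumption of this sketch)."""
    seeds = [points[0]]
    while len(seeds) < k:
        seeds.append(max(points, key=lambda p: min(math.dist(p, s) for s in seeds)))
    return [list(s) for s in seeds]

def kmeans(points, k=2, max_iter=100):
    """Plain Lloyd's algorithm: assign each point to its nearest centroid,
    recompute centroids as cluster means, stop when nothing changes."""
    centroids = farthest_point_seeds(points, k)
    labels = []
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids

# Columns: Analytic, Clout, Authentic, Tone (LIWC percentiles).
profiles = [
    (18.81, 46.47, 89.35, 1.00),   # Murray exemplar (from this summary)
    (25.0, 40.0, 80.0, 5.0),       # hypothetical
    (30.0, 35.0, 75.0, 10.0),      # hypothetical
    (83.47, 67.32, 26.71, 23.40),  # Breivik exemplar (from this summary)
    (78.0, 70.0, 30.0, 30.0),      # hypothetical
    (90.0, 60.0, 20.0, 25.0),      # hypothetical
]

labels, centroids = kmeans(profiles, k=2)
print(labels)
print([[round(v, 2) for v in c] for c in centroids])
```

On this toy input, the first three rows and the last three rows fall into separate clusters, and the centroid of the first cluster has the lower Analytic and higher Authentic means, mirroring the direction of the Affective profile described above.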
[Figure: LIWC summary-variable profiles (Analytic, Tone, Clout, Authentic) for the Affective Type (n=53) and the Instrumental Type (n=47), shown as relative profiles and on the absolute LIWC percentile scale.]
04 / Behavioral Predictions
Can the words an attacker writes months before an event predict the tactical specifics of the attack itself? To answer this, we formulated six hypotheses mapping the linguistic clusters to physical outcomes, testing whether language alone could act as a leading indicator of how an attack would unfold.
H1: The texts will cluster into two distinct types. (Confirmed via K-means analysis.)
H2: The cluster an attacker belongs to will predict the number of victims killed. (Supported: Instrumental attackers are more lethal. Mann-Whitney U: p = .019.)
H3: Cluster membership predicts whether an attacker will target a location they have a personal connection to. (Supported: Affective attackers have roughly 2.7 times the odds of striking a location they know, such as a school or workplace. Fisher's exact: p = .032.)
H4: Cluster membership predicts whether specific victims are targeted. (Not supported for the initial yes/no targeting variable. Fisher's exact: p = .075. The refined demographic-based targeting variable was a significant predictor of the Instrumental Type in the logistic regression.)
H5: Cluster membership predicts how the attack ends. (Supported: Instrumental attackers are substantially more likely to be killed by another person, while Affective attacks more often end in suicide or apprehension. Fisher's exact: p = .006.)
H6: A unified logistic regression model can tie cluster membership to lethality, targeting, location, and how the attack ends. (Supported: χ²(4) = 32.52, p < .001, McFadden R² = 0.24.)
[Figure: key odds ratios, Affective vs Instrumental, on a scale of 0 to 10. Perpetrator Suicide or Apprehension (p = .006); Personal Connection to Attack Location (p = .032).]
| Hypothesis | Test | Statistic | Result |
|---|---|---|---|
| H2: Victims Killed | Mann-Whitney U | U=908, z=-2.35 | p = .019 |
| H3: Location Connection | Fisher's Exact | OR = 0.37 (inv = 2.7) | p = .032 |
| H4: Initial Targeting | Fisher's Exact | OR = 2.09 | p = .075 (n.s.) |
| H5: Perpetrator Killed | Fisher's Exact | OR = 7.65 | p = .006 |
AFFECTIVE TYPE (N=53) VS INSTRUMENTAL TYPE (N=47) · ALPHA = .05
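As a plausibility check on the H2 row, the z statistic can be approximately reproduced from the reported U and the group sizes using the standard normal approximation to the Mann-Whitney U distribution. The sketch below assumes no tie correction and a two-sided test; the function names are illustrative, not taken from the dissertation.

```python
import math

def mann_whitney_z(u, n1, n2):
    """Normal approximation to the Mann-Whitney U statistic (no tie correction)."""
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean_u) / sd_u

def two_sided_p(z):
    """Two-sided p-value for a standard normal z score."""
    return math.erfc(abs(z) / math.sqrt(2))

# Reported values: U = 908, Affective n = 53, Instrumental n = 47.
z = mann_whitney_z(908, 53, 47)
p = two_sided_p(z)
print(f"z = {z:.2f}, p = {p:.3f}")
```

The uncorrected approximation gives a z of about -2.33 and a p of about .020, close to the reported z = -2.35 and p = .019. A tie correction would be one possible explanation for the small gap, though that is an assumption rather than something stated in the source.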
05 / Logistic Regression
The first binary logistic regression used the raw H2-H5 predictors (lethality, connection, targeting, perpetrator killed) and was significant (χ²(4) = 30.58, p < .001, McFadden R² = 0.22), but two of those variables were too coarse to do their work. "Targeting" was coded as yes/no without distinguishing between specific-individual targeting and demographic targeting; "connection" was coded without distinguishing between the perpetrator's home and a less intimate site like their school or workplace. A second iteration refined both: targeting became demographic-based targeting specifically, and connection became perpetrator's home among the attack locations specifically. The refined model improved fit to McFadden R² = 0.24 (χ²(4) = 32.52, p < .001) and produced the odds ratios below. Each of the four predictors is significant when controlling for the others.
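The two reported McFadden R² values can be cross-checked against the reported likelihood-ratio χ² statistics. The sketch below assumes an intercept-only null model fitted to all 100 cases with the 53/47 cluster split; under that assumption the null log-likelihood is fixed by the group sizes, and R² follows from χ² = 2(LL_model − LL_null). Function names are illustrative.

```python
import math

def null_log_likelihood(n_a, n_b):
    """Log-likelihood of an intercept-only logistic model for a two-group outcome."""
    n = n_a + n_b
    return n_a * math.log(n_a / n) + n_b * math.log(n_b / n)

def mcfadden_r2_from_chi2(chi2, ll_null):
    """McFadden R^2 = 1 - LL_model / LL_null, with LL_model recovered from the
    likelihood-ratio statistic chi2 = 2 * (LL_model - LL_null)."""
    ll_model = ll_null + chi2 / 2
    return 1 - ll_model / ll_null

ll0 = null_log_likelihood(53, 47)              # assumed 53 Affective / 47 Instrumental
r2_first = mcfadden_r2_from_chi2(30.58, ll0)   # first model
r2_refined = mcfadden_r2_from_chi2(32.52, ll0) # refined model
print(round(r2_first, 2), round(r2_refined, 2))
```

Under these assumptions the χ² values of 30.58 and 32.52 correspond to McFadden R² of roughly 0.22 and 0.24, matching the figures reported for the first and refined models.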
| Predictor | Odds Ratio | % Change in Odds | Predicts |
|---|---|---|---|
| Perpetrator Killed | 14.11 | +1,311% | Instrumental Type |
| Home as Attack Location | 5.0 (inverse) | +400% | Affective Type |
| Demographic Targeting | 4.83 | +383% | Instrumental Type |
| Per Victim Killed | 1.14 | +14% | Instrumental Type |

χ²(4) = 32.52 · p < .001 · McFadden R² = 0.24
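The percent-change figures are a direct transformation of the odds ratios: an odds ratio of OR corresponds to a (OR − 1) × 100 percent change in odds, and an odds ratio below 1 can be inverted to express the effect in the opposite direction. A short sketch with the values reported in this summary (the helper names are illustrative):

```python
def percent_change_in_odds(odds_ratio):
    """Convert an odds ratio to the percent change in odds it implies."""
    return (odds_ratio - 1) * 100

def invert(odds_ratio):
    """Express an odds ratio in the opposite direction (e.g. 0.37 -> about 2.7)."""
    return 1 / odds_ratio

reported = {
    "Perpetrator Killed": 14.11,
    "Home as Attack Location (inverse)": 5.0,
    "Demographic Targeting": 4.83,
    "Per Victim Killed": 1.14,
}

for name, odds_ratio in reported.items():
    print(f"{name}: OR {odds_ratio} -> {percent_change_in_odds(odds_ratio):+.0f}%")

# The H3 odds ratio of 0.37 inverts to roughly 2.7.
print(round(invert(0.37), 1))
```

Note that these are changes in odds, not in probability: an inverse odds ratio of 2.7 means the odds are 2.7 times as high (a 170% increase in odds), which is not the same as an event being 270% more likely.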
06 / Type Exemplars
The difference between the two clusters is starkest in the extreme cases. Matthew Murray's pre-attack writings sit at the far end of the Affective Type: highly personal, emotionally raw, and focused on his own perceived mistreatment. Anders Breivik sits at the far end of the Instrumental Type. His 1,500-page manifesto discloses little personal emotion and reads instead like a detached political treatise.
Affective Type
Matthew Murray
Arvada and Colorado Springs, 2007. Four killed, five injured. Murray attacked locations where he had personal history. His communications were saturated with first-person emotionality, immediate grievances, and a stated indifference to his own survival.
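| LIWC Profile | Percentile |
|---|---|
| Analytic | 18.81 |
| Tone | 1.00 |
| Clout | 46.47 |
| Authentic | 89.35 |

| Attack Outcomes | Value |
|---|---|
| Victims Killed | 4 |
| Victims Injured | 5 |
| Location Connection | Yes |
| Attack Conclusion | Suicide |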
Instrumental Type
Anders Breivik
Utøya Island and Oslo, 2011. Seventy-seven killed, 319 injured. Breivik produced a 1,500-page manifesto that reads as a calculated scholarly treatise, structured around rigid ideological argument rather than personal grievance. He was ultimately apprehended alive.
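| LIWC Profile | Percentile |
|---|---|
| Analytic | 83.47 |
| Tone | 23.40 |
| Clout | 67.32 |
| Authentic | 26.71 |

| Attack Outcomes | Value |
|---|---|
| Victims Killed | 77 |
| Victims Injured | 319 |
| Location Connection | No |
| Attack Conclusion | Apprehended |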
07 / Practical Constraints
The intended use of this typology is narrow and worth stating precisely. It does not predict whether an attack will occur. It predicts what kind of attack is more likely, conditional on a pre-attack communication being discovered and submitted for analysis. The work belongs to risk management (the impact side of the threat-assessment equation), not threat detection (the probability side).
Institutional Implementation Scenario
A university threat assessment team is presented with what appears to be a pre-attack communication composed by a student. Running the text through the LIWC summary variables and applying the cluster model classifies the author as Affective. The team then knows the author is more likely to attack a location with personal connection (potentially the campus itself), more likely to commit suicide at the conclusion, and likely to be in an acute emotional state. Response shifts accordingly: target hardening of the connected location, mental-health-led de-escalation rather than counter-terror posture.
The same model applied to an Instrumental classification points elsewhere: lower likelihood of personal connection to attack location, higher likelihood of demographic-based targeting, substantially higher likelihood that the attack continues until the perpetrator is apprehended or killed by another person. The two response postures are not interchangeable.
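The classification step in this scenario can be sketched as a nearest-profile lookup over the four LIWC summary scores. This is an illustration only, not the dissertation's model: the actual cluster centers are not given in this summary, so the two exemplar profiles (Murray and Breivik) stand in as hypothetical reference points, the input scores are invented, and LIWC2015 itself is not called.

```python
import math

# Order of scores: Analytic, Clout, Authentic, Tone (LIWC percentiles).
# Stand-in reference points: the two exemplars reported in this summary,
# NOT the fitted cluster centers, which are an unknown here.
REFERENCE_PROFILES = {
    "Affective": (18.81, 46.47, 89.35, 1.00),
    "Instrumental": (83.47, 67.32, 26.71, 23.40),
}

def nearest_profile(scores):
    """Return the label of the reference profile closest in Euclidean distance."""
    return min(
        REFERENCE_PROFILES,
        key=lambda label: math.dist(scores, REFERENCE_PROFILES[label]),
    )

# Hypothetical LIWC output for a newly discovered communication.
new_text_scores = (30.0, 40.0, 75.0, 8.0)
print(nearest_profile(new_text_scores))
```

Because the exemplars are extremes rather than typical cases, a real deployment would need the fitted cluster centers (or the discriminant function) and out-of-sample validation before any classification informed a response posture.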
The limitations should be stated plainly. A sample of n = 100 is small for a field with low base rates; power analysis suggests significance may be underestimated for small and moderate effects. Stephen Paddock (Las Vegas, 2017) and other perpetrators who left no pre-attack communications are necessarily excluded. The within-sample classification accuracy of 100% is a property of the sample; out-of-sample validation on new pre-attack communications is the appropriate next step. Earlier work on increased communicated power drives in mass-murder manifestos (Leffew, 2017) was the seed for this dissertation's variable selection.