TL;DR
- Propensity scoring ranks accounts by likelihood to pay now
- AI models use behaviour signals, not just credit scores
- Contacting high-propensity accounts first improves recovery economics
- Low-propensity accounts get different treatment, not more pressure
- The model must be validated and monitored under SR 11-7
A collections floor on a Monday morning. The 30-day bucket report has come through. Two thousand accounts. Every one of them gets the same outbound call, queued in the same order, worked by the same team through the same script.
Account 847 was going to pay on Wednesday regardless. The call interrupts her at work, she does not answer, and the attempt is logged as a failed contact. Account 1,203 needs a payment arrangement conversation. He answers, but the agent is working from a standard payment demand script and the call ends without a commitment. Account 1,891 is in acute financial distress. The call produces a formal complaint that takes three hours of supervisor time to resolve.
Three accounts. Three completely different situations. One contact strategy applied uniformly to all of them. The outcome is predictable: one wasted attempt, one missed conversion, one complaint generated.
Propensity scoring is the mechanism that separates those three accounts before any contact is generated. It assigns each account a score that reflects its likelihood of payment in response to outreach, and routes each one to the treatment most likely to produce a useful outcome. This post covers what propensity scoring is, what signals drive it, how it changes collections economics, and what US banks need to know about SR 11-7 compliance for models of this type.
What Propensity Scoring Is
A propensity score is a probability estimate: the likelihood that a specific borrower will make a payment in response to a specific type of contact within a defined time window. It is generated from behavioural signals on the account and updates as those signals change, giving collections operations a current picture of each borrower’s responsiveness rather than a static assessment made at origination.
Three distinct propensity scores serve different routing decisions in collections operations:
Propensity to pay: the likelihood that the borrower makes a payment within a defined number of days following outbound contact. This is the primary prioritisation score, used to determine which accounts to contact first in each treatment cycle.
Propensity to respond: the likelihood that the borrower answers a call or replies to a digital message. This score drives channel selection and timing optimisation. A borrower with a high pay propensity but a low response propensity on voice calls should receive digital-first outreach.
Propensity to self-cure: the likelihood that the borrower pays without any outbound contact. This is the most immediately valuable score in early bucket collections. Accounts with high self-cure probability will resolve on their own. Contacting them early consumes capacity without producing any incremental recovery and risks generating unnecessary friction with a borrower who was already going to pay.
These three scores serve three different decisions. Collapsing them into a single output produces a composite that serves none of those decisions cleanly.
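As a minimal sketch of why the scores are kept apart, the three decisions can each read a different field. Everything here is illustrative: the class and function names, the 0.80 hold threshold, and the split of respond propensity into voice and digital are assumptions, not part of any particular platform.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PropensityScores:
    """Separate probability estimates for one account, each in [0, 1]."""
    pay: float              # P(payment within the window, given outbound contact)
    respond_voice: float    # P(borrower answers an outbound call)
    respond_digital: float  # P(borrower replies to SMS or email)
    self_cure: float        # P(payment within the window with no contact)


# Hypothetical cut-off; a real value would come from the collections policy.
SELF_CURE_HOLD = 0.80


def should_hold_contact(s: PropensityScores) -> bool:
    """Decision 1: suppress early outreach for likely self-cure accounts."""
    return s.self_cure >= SELF_CURE_HOLD


def preferred_channel(s: PropensityScores) -> str:
    """Decision 2: choose the channel the borrower is more likely to respond on."""
    return "digital" if s.respond_digital >= s.respond_voice else "voice"


def contact_priority(accounts: dict[str, PropensityScores]) -> list[str]:
    """Decision 3: order accounts not on hold by pay propensity, highest first."""
    queue = [acct for acct, s in accounts.items() if not should_hold_contact(s)]
    return sorted(queue, key=lambda acct: accounts[acct].pay, reverse=True)
```

A single blended number could not answer all three questions: an account with high pay propensity and high self-cure propensity should be held, while one with high pay propensity and low self-cure propensity should be at the front of the queue.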

What Signals Drive a Propensity Model
Traditional collections prioritisation relies on days-past-due, outstanding balance, and origination credit score. All three are static snapshots of a borrower’s position at a single point in time. None of them tell you how this borrower is behaving right now, how they have responded to contact in the past, or whether they are on a recovery trajectory or a deteriorating one.
AI propensity models draw on four categories of signal that update continuously as the account generates new data.
Payment Behaviour Signals
The payment history on the account is the strongest predictor of near-term payment behaviour. Key signals include:
- Recency, frequency, and amount of payments over the last three to six cycles
- Payment trajectory: is the pattern improving, stable, or deteriorating over recent cycles
- History of self-cure: has this borrower paid late without contact before, and at what point in the delinquency cycle
- Partial payment patterns: a borrower making consistent partial payments is in a different situation from one who has made no payment at all
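One way these payment signals might be turned into model features is sketched below. The feature names and the "later half minus earlier half" definition of trajectory are assumptions chosen for illustration; a production model would define and validate its own.

```python
from __future__ import annotations

from statistics import mean


def payment_features(paid: list[float], due: list[float]) -> dict[str, float]:
    """Summarise recent billing cycles, ordered oldest to newest.

    paid[i] is the amount received in cycle i, due[i] the amount billed.
    """
    if len(paid) != len(due) or len(paid) < 2:
        raise ValueError("need at least two cycles, with paid and due the same length")

    # Share of each instalment actually paid, capped at 1.0 for overpayments.
    ratios = [min(p / d, 1.0) if d > 0 else 1.0 for p, d in zip(paid, due)]

    # Recency: number of most recent cycles with no payment at all.
    cycles_since_payment = next(
        (i for i, p in enumerate(reversed(paid)) if p > 0), len(paid)
    )

    # Trajectory: mean paid ratio in the later half minus the earlier half.
    # Positive means improving, negative means deteriorating.
    half = len(ratios) // 2
    trajectory = mean(ratios[half:]) - mean(ratios[:half])

    return {
        "payment_frequency": sum(p > 0 for p in paid) / len(paid),
        "mean_paid_ratio": mean(ratios),
        "partial_payment_share": sum(0 < r < 1 for r in ratios) / len(ratios),
        "cycles_since_payment": float(cycles_since_payment),
        "trajectory": trajectory,
    }
```

For a borrower who paid in full twice, then half, then nothing for three cycles, the trajectory is negative and cycles_since_payment is 3, which is the kind of deteriorating pattern the list above describes.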
Contact Response Signals
How a borrower has responded to past contact attempts is a direct predictor of how they will respond to the next one:
- Historical answer rate on outbound voice calls
- Historical response rate on SMS, email, and digital channels
- Time-of-day and day-of-week patterns in past successful contacts
- Promise-to-pay history: how often the borrower made a commitment and whether they fulfilled it
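The contact response signals reduce to a handful of rates and timing patterns. The sketch below is one hedged way to compute them; the smoothing toward a prior (so that an account with two attempts is not scored as 0% or 100%) is an assumption added for illustration, as are the function names and default values.

```python
from __future__ import annotations

from collections import Counter


def smoothed_rate(successes: int, attempts: int,
                  prior_rate: float = 0.5, prior_weight: float = 2.0) -> float:
    """Rate of success pulled toward prior_rate when attempts are few.

    Works for answer rates (calls answered / calls placed) and for
    promise-to-pay fulfilment (promises kept / promises made).
    """
    if attempts < 0 or not 0 <= successes <= attempts:
        raise ValueError("successes must be between 0 and attempts")
    return (successes + prior_rate * prior_weight) / (attempts + prior_weight)


def best_contact_hour(successful_hours: list[int]) -> int | None:
    """Most frequent hour of day (0-23) among past successful contacts."""
    if not successful_hours:
        return None
    return Counter(successful_hours).most_common(1)[0][0]
```

With the defaults, an account with no contact history gets a rate of 0.5, and one with 8 answered calls out of 10 gets 0.75 rather than 0.8, reflecting the limited evidence.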
Account Context Signals
The structure of the account itself informs the score:
- Product type and original credit terms
- Current balance relative to original credit limit or loan amount
- Time elapsed since the last payment
- Whether a prior payment arrangement existed and how it concluded
External Signals
Where available through bureau refresh or open banking data:
- Changes in the borrower’s overall credit position across all accounts
- Employment or income indicators
- Geographic economic data relevant to the borrower’s location, particularly useful for portfolios concentrated in specific regions or sectors

How Propensity Scoring Changes Collections Economics
The economic case for propensity scoring operates through three mechanisms: contact efficiency, treatment differentiation, and capacity allocation. Each produces measurable improvements in recovery economics independently. Together they compound.
Contact Efficiency
A collections operation without propensity scoring treats all accounts in a delinquency bucket as equivalent contact candidates. Self-cure accounts, mid-propensity accounts, and low-propensity accounts all enter the same contact queue and receive the same outreach volume.
Consider a portfolio of 20,000 early bucket accounts where 15% carry a high self-cure probability. That is 3,000 accounts per cycle that will pay without contact. Without propensity scoring, those 3,000 accounts receive outbound contact that produces no incremental recovery. The contact attempts consume channel capacity and, where voice is used, agent queue time, and they generate unnecessary touchpoints with borrowers who were already going to pay. Removing those accounts from the early-stage contact queue does not reduce recovery. It redirects the same resources toward accounts where outreach actually changes what happens.
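The arithmetic in that example can be written out as a small calculation. The portfolio size and self-cure share come from the example above; the figure of three contact attempts per account per cycle is a hypothetical added only to show how freed capacity would be counted.

```python
from __future__ import annotations


def contact_savings(portfolio_size: int, self_cure_share: float,
                    attempts_per_account: int) -> dict[str, int]:
    """Count accounts held back from outreach and the attempts that frees up."""
    held = round(portfolio_size * self_cure_share)
    return {
        "accounts_held": held,
        "accounts_contacted": portfolio_size - held,
        "attempts_freed": held * attempts_per_account,
    }


# attempts_per_account=3 is an illustrative assumption, not a benchmark.
result = contact_savings(20_000, 0.15, attempts_per_account=3)
# result == {"accounts_held": 3000, "accounts_contacted": 17000, "attempts_freed": 9000}
```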
Treatment Differentiation
Propensity scoring allows each score band to receive a treatment calibrated to its actual situation rather than a uniform collections pressure applied across the portfolio.
High-propensity accounts are close to paying. They need a low-friction reminder with a clear payment path: a digital message with a direct payment link, minimal conversation required. Applying a full structured collections call to this segment wastes the call and risks irritating a borrower who needed only a nudge.
Mid-propensity accounts are reachable but need a reason to commit. These accounts benefit from structured outreach with a clear payment arrangement offer: a human conversation that identifies the barrier and proposes a concrete solution.
Low-propensity accounts are typically experiencing genuine financial difficulty. The right treatment here is not increased contact pressure. It is a financial hardship assessment, extended arrangement options, and where appropriate, early referral to a specialist team. Applying standard collections pressure to a low-propensity account generates complaints, opt-outs, and in some cases debt review applications, outcomes that are more expensive to manage than the original delinquency.
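The band-to-treatment mapping described above can be expressed as a small lookup. This is a sketch under stated assumptions: the band edges of 0.25 and 0.60 and the treatment labels are hypothetical placeholders, and real cut-offs would be set in the collections policy and tested on the bank's own portfolio.

```python
from bisect import bisect_right

# Hypothetical band edges on the pay-propensity score (low | mid | high).
BAND_EDGES = [0.25, 0.60]

# One treatment per band, in the same low-to-high order as the bands.
TREATMENTS = [
    "hardship_assessment",                 # low propensity
    "agent_call_with_arrangement_offer",   # mid propensity
    "digital_reminder_with_payment_link",  # high propensity
]


def treatment_for(pay_propensity: float) -> str:
    """Map a pay-propensity score in [0, 1] to its band's treatment."""
    if not 0.0 <= pay_propensity <= 1.0:
        raise ValueError("pay_propensity must be a probability in [0, 1]")
    # bisect_right counts how many band edges are <= the score, so a score
    # exactly on an edge falls into the higher band.
    return TREATMENTS[bisect_right(BAND_EDGES, pay_propensity)]
```

Keeping the edges and treatments in data rather than in branching logic means a policy change is a one-line edit that can be reviewed and versioned alongside the model documentation.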
Capacity Allocation
Agent time is the most expensive resource in a collections operation. Propensity scoring directs that capacity toward the accounts where a human conversation changes the outcome: mid-propensity accounts where a payment arrangement converts to a commitment, and low-propensity accounts where a hardship conversation prevents escalation. High-propensity accounts are handled through digital channels at a fraction of the cost per contact. The net effect is the same or better recovery at a lower cost-to-collect across the portfolio.
SR 11-7 Compliance for Propensity Models
A propensity model used to prioritise accounts and determine contact strategies at scale meets the definition of a model under SR 11-7, the supervisory guidance on model risk management issued by the Federal Reserve and published by the OCC as Bulletin 2011-12. That applies whether the model was built by the bank’s internal data science team or sourced from a third-party vendor or collections platform.
The bank carries the validation responsibility either way. A vendor’s assertion that their model has been validated does not substitute for the bank’s own independent validation under SR 11-7. The bank must understand what the model does, what features it uses, what its limitations are, and how it performs on the bank’s own portfolio.
Before a propensity model is deployed in production, the documentation expected under SR 11-7 should cover:
- Model purpose and the specific decisions it informs
- Full feature list with data sources and refresh frequencies
- Development methodology and algorithm selection rationale
- Backtesting results on a holdout sample from the bank’s own portfolio
- Champion-challenger test design as the mechanism for ongoing performance validation
- Documented limitations and known conditions under which model performance degrades
Ongoing monitoring after deployment requires tracking score distribution stability, rank-order stability (high-score accounts continuing to recover at higher rates than low-score accounts), and feature drift in the input signals the model relies on. A full independent revalidation at a defined interval, typically annually or when a material change in portfolio composition occurs, closes the monitoring loop.
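Two of those monitoring checks can be sketched in a few lines. The population stability index (PSI) is one commonly used measure of score distribution shift, and the rank-order check compares observed payment rates across score bands. SR 11-7 does not prescribe either calculation; the bin counts, the epsilon floor, and the function names below are assumptions for illustration.

```python
from __future__ import annotations

import math


def psi(baseline: list[float], current: list[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population stability index between two sets of scores in [0, 1]."""
    def shares(scores: list[float]) -> list[float]:
        counts = [0] * bins
        for s in scores:
            counts[min(int(s * bins), bins - 1)] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(scores), eps) for c in counts]

    base, cur = shares(baseline), shares(current)
    return sum((c - b) * math.log(c / b) for b, c in zip(base, cur))


def rank_order_holds(scores: list[float], paid: list[bool], bands: int = 5) -> bool:
    """True if the observed payment rate never falls as the score band rises."""
    totals, hits = [0] * bands, [0] * bands
    for s, p in zip(scores, paid):
        band = min(int(s * bands), bands - 1)
        totals[band] += 1
        hits[band] += int(p)
    # Empty bands are skipped rather than treated as a zero payment rate.
    rates = [h / t for h, t in zip(hits, totals) if t > 0]
    return all(lo <= hi for lo, hi in zip(rates, rates[1:]))
```

A PSI of zero means the two distributions are identical across bins. Practitioners often quote rule-of-thumb PSI thresholds for triggering a review, but the thresholds a bank uses should be the ones documented in its own monitoring plan.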
Fair Lending
Propensity models must be tested for disparate impact before deployment. A model that produces systematically lower scores for accounts held by consumers in protected demographic groups creates ECOA and Reg B exposure, even if the model uses no demographic features directly. Collections decisions fall within fair lending scope in the same way origination decisions do. Disparate impact testing is not a deployment gate that is cleared once. It is an ongoing obligation repeated at each model revalidation.
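As a rough illustration of what a first-pass disparate impact screen might compute, the sketch below compares the rate at which each group receives a favourable treatment against a reference group. It is a screening statistic only, not a legal test under ECOA or Reg B. The group labels, the definition of "favourable" (for example, routing to a low-friction digital reminder), and the function name are all hypothetical, and how group membership is obtained for testing is outside the scope of this sketch.

```python
from __future__ import annotations

from collections import defaultdict


def favourable_rate_ratios(groups: list[str], favourable: list[bool],
                           reference: str) -> dict[str, float]:
    """Each group's favourable-treatment rate divided by the reference group's."""
    totals: dict[str, int] = defaultdict(int)
    hits: dict[str, int] = defaultdict(int)
    for g, f in zip(groups, favourable):
        totals[g] += 1
        hits[g] += int(f)

    if totals.get(reference, 0) == 0 or hits[reference] == 0:
        raise ValueError("reference group needs at least one favourable outcome")

    ref_rate = hits[reference] / totals[reference]
    return {g: (hits[g] / totals[g]) / ref_rate for g in totals}
```

A ratio well below 1.0 for any group is a signal to investigate which features drive the gap, and the same calculation would be rerun at each revalidation.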

What Good Propensity Scoring Looks Like in Practice
Five markers distinguish a well-designed propensity scoring programme from one that produces a score without producing better outcomes:
Separate scores for separate decisions. Pay propensity, respond propensity, and self-cure propensity serve different routing decisions and should be generated and used separately. A single composite score that blends all three produces outputs that serve none of those decisions with precision.
Behavioural features that update in real time. A model trained on origination-era data and never refreshed is a static snapshot with a propensity label on it. The features driving the score must reflect the borrower’s current behaviour, updated on a cycle that matches the contact frequency of the collections operation.
Explicit treatment routing per score band. Each score band must map to a defined treatment strategy in the collections policy. A score that produces a priority ranking without a corresponding treatment definition does not change what the collections operation does. The score is only as useful as the treatment it drives.
Continuous monitoring with documented thresholds. Score distribution, rank-order stability, and feature drift are tracked on a defined cadence. Thresholds for triggering a model review or retraining are documented before they are needed, not determined after performance has already degraded.
Fair lending testing at deployment and revalidation. Disparate impact analysis is conducted before the model goes live and repeated at each revalidation cycle. The absence of demographic features in the model does not remove the obligation to test for differential outcomes across demographic groups.
The Score Is Only as Useful as the Treatment It Drives
A propensity score sitting in a database with no defined treatment routing is a number. It becomes a collections tool when it determines which accounts receive a digital nudge, which receive a payment arrangement conversation, which receive a hardship assessment, and which receive no contact at all because they will pay anyway.
The banks improving recovery rates consistently with AI collections are not the ones with the most sophisticated models. They are the ones where the score connects directly to a differentiated treatment, the treatment is calibrated to the borrower’s actual situation, and the model is monitored closely enough to catch drift before it affects portfolio performance.
That connection between score and treatment, built on behavioural signals that update continuously, is what separates propensity scoring from the prioritisation approaches that have been standard in US collections for the past two decades.
iTuring’s AI collections platform generates separate pay, respond, and self-cure propensity scores per account, updated continuously from live behavioural signals, with treatment routing logic and SR 11-7 monitoring infrastructure built in.


