Machine Learning for Actionable Behavioural Clustering at WorldRemit

Image by Drew Graham via Unsplash (copyright-free)

In this blog post, I present the Machine Learning approach we use to cluster WorldRemit’s clients by behavioural pattern, with the ultimate goal of obtaining actionable behavioural groups. We have successfully applied this approach to cluster clients into groups with different levels of loyalty, as explained below.

Measuring Customer Behaviour with Time-To-Event

To cluster our clients by behavioural pattern, we start by identifying events that can be used to define those patterns, and then use Machine Learning to predict the time it takes for the defining event to occur. This time-to-event forecast can then be used as a measure of behaviour. In the specific case of Loyalty, the defining event is “churning” (the act of giving up our services), and the degree of Loyalty can be “measured” by the estimated time-to-churn.

Time-to-event can be predicted from a given set of predictive factors (commonly known as features in machine learning), and Machine Learning models can be used to regress time-to-event on these features. Because the models learn to predict time-to-event from labelled training datasets, this is a supervised learning problem.

In some cases the defining event does not occur within the predefined observation period. These are known as censored cases, and there are special mechanisms for incorporating censoring into the learning process. Machine learning has borrowed these ideas from the field known as “survival analysis”.
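To make the idea of censoring concrete, here is a minimal sketch of how a time-to-churn label could be derived for one customer. The function name, the 90-day inactivity rule used to decide that a customer has churned, and the dates are all assumptions made for illustration; the post does not state how churn is defined operationally.

```python
from datetime import date


def time_to_event_label(first_seen, last_seen, observation_end, churn_gap_days=90):
    """Return (duration_days, event_observed) for one customer.

    Assumption for this sketch: a customer counts as churned once they have
    been inactive for at least `churn_gap_days` when observation ends.
    """
    inactive_days = (observation_end - last_seen).days
    if inactive_days >= churn_gap_days:
        # Churn observed: the customer lifetime ended at the last transaction.
        return (last_seen - first_seen).days, True
    # Right-censored: we only know the lifetime is at least this long.
    return (observation_end - first_seen).days, False


observation_end = date(2020, 6, 30)
churned = time_to_event_label(date(2019, 1, 1), date(2019, 3, 1), observation_end)
active = time_to_event_label(date(2019, 1, 1), date(2020, 6, 1), observation_end)
print(churned)  # (59, True)
print(active)   # (546, False)
```

The second customer is a censored case: the 546 days are a lower bound on the time-to-churn, not the time-to-churn itself, which is why a survival-aware loss is needed rather than plain regression on the duration.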

Currently, we use Multi-Output Gradient Boosted Trees to predict time-to-churn as a measure of Loyalty. One advantage of Gradient Boosted Trees is that their predictions are “straightforwardly explainable”: the trees can be used to rank predictive factors by importance, and we can further explain why a client is assigned a given behavioural measure.

Clustering and Labelling Customers According to Their Measured Behaviour

Once a measure of behaviour is obtained, one way to cluster clients is to treat each client’s predicted time-to-event probability distribution as a “customer profile” and use unsupervised approaches to group these profiles into coherent clusters.
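As an illustration of what such a profile might look like numerically, the sketch below turns per-interval churn probabilities (hazards) into a cumulative survival curve. This discrete-time construction is an assumption: the post does not say how the model's outputs are converted into distributions, and the function names here are hypothetical.

```python
def survival_curve(hazards):
    """Cumulative survival S(t_k) = product over j <= k of (1 - h_j).

    `hazards` holds, for each time interval, the probability of churning in
    that interval given that the customer was still active at its start.
    """
    curve, alive = [], 1.0
    for h in hazards:
        if not 0.0 <= h <= 1.0:
            raise ValueError("each hazard must be a probability in [0, 1]")
        alive *= 1.0 - h
        curve.append(alive)
    return curve


def expected_intervals_survived(curve):
    """Expected number of intervals survived within the horizon: sum of S(t_k)."""
    return sum(curve)


profile = survival_curve([0.5, 0.5])
print(profile)                               # [0.5, 0.25]
print(expected_intervals_survived(profile))  # 0.75
```

Each customer then becomes a fixed-length vector (one survival value per interval), which is a convenient input for a distance-based clustering method.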

For Loyalty, we run K-means clustering on the cumulative “survival” distributions obtained from the Multi-Output Gradient Boosted Trees. The resulting customer clusters, grouped by degree of loyalty, are actionable in the sense that they can be used, for example, to implement marketing retention strategies. Extending the explainability further, we can also explain why a client is labelled as 4-degree loyal (very loyal in relative terms).
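The clustering step can be sketched with a plain implementation of Lloyd's K-means algorithm applied to survival curves. Everything below is illustrative: the toy curves are made up, and ordering clusters by the area under their centroid curve to obtain a loyalty degree is an assumption about how labels such as “4-degree loyal” might be assigned, not a description of the production pipeline.

```python
import random


def kmeans(profiles, k, iterations=50, seed=0):
    """Plain Lloyd's k-means on equal-length survival curves (lists of floats)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(profiles, k)]
    assignment = [0] * len(profiles)
    for _ in range(iterations):
        # Assign every profile to its nearest centroid (squared Euclidean distance).
        assignment = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in profiles
        ]
        # Move each centroid to the mean of its members; keep it if the cluster is empty.
        for c in range(k):
            members = [p for p, a in zip(profiles, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


def loyalty_degrees(assignment, centroids):
    """Relabel clusters so that degree 0 has the smallest area under its centroid curve."""
    order = sorted(range(len(centroids)), key=lambda c: sum(centroids[c]))
    rank = {cluster: degree for degree, cluster in enumerate(order)}
    return [rank[a] for a in assignment]


# Toy survival curves: two slow churners followed by two fast churners.
profiles = [
    [0.90, 0.80, 0.70],
    [0.95, 0.85, 0.75],
    [0.40, 0.20, 0.10],
    [0.50, 0.20, 0.05],
]
assignment, centroids = kmeans(profiles, k=2)
degrees = loyalty_degrees(assignment, centroids)
print(degrees)  # [1, 1, 0, 0]
```

With five clusters instead of two, the same relabelling would yield degrees 0 to 4, with 4 being the most loyal group in relative terms.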

Aggregated Behavioural Insights

It is possible to drill into a subset of the loyalty clusters and visualise the distribution of predictive factors such as:

  • Paid count
  • Tenure
  • Days since last transaction
  • Transaction Count
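A simple numerical counterpart to such visualisations is a per-cluster summary of each factor. The sketch below computes the median of a few features for every loyalty degree; the field names and the sample records are hypothetical and only illustrate the shape of the computation.

```python
from collections import defaultdict
from statistics import median


def summarise_by_cluster(customers, features):
    """Median of each feature per loyalty degree.

    `customers` is an iterable of dicts holding a 'loyalty_degree' key
    plus one numeric value per feature name.
    """
    grouped = defaultdict(list)
    for row in customers:
        grouped[row["loyalty_degree"]].append(row)
    return {
        degree: {f: median(r[f] for r in rows) for f in features}
        for degree, rows in sorted(grouped.items())
    }


# Made-up records, for illustration only.
customers = [
    {"loyalty_degree": 0, "tenure_days": 30, "transaction_count": 1, "days_since_last_transaction": 200},
    {"loyalty_degree": 0, "tenure_days": 50, "transaction_count": 3, "days_since_last_transaction": 180},
    {"loyalty_degree": 4, "tenure_days": 900, "transaction_count": 40, "days_since_last_transaction": 5},
    {"loyalty_degree": 4, "tenure_days": 700, "transaction_count": 60, "days_since_last_transaction": 15},
]
summary = summarise_by_cluster(
    customers, ["tenure_days", "transaction_count", "days_since_last_transaction"]
)
for degree, stats in summary.items():
    print(degree, stats)
```

Comparing these summaries across degrees gives a quick check that the clusters differ in the way the loyalty labels suggest.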


In the approach presented here, customer behaviours are “explained” by identifying behaviour-defining events. The corresponding time-to-event distributions, predicted with supervised machine learning, serve as behavioural profiles that are ultimately used to cluster clients into actionable groups. This approach has been successfully applied to cluster clients into 5 groups with distinct levels of loyalty.


Fernandez del Rio, Ana & Chen, Pei & Periáñez, África. (2019). Profiling Players with Engagement Predictions.

Wang, Ping & Li, Yan & Reddy, Chandan. (2017). Machine Learning for Survival Analysis: A Survey. ACM Computing Surveys. 51. 10.1145/3214306.

P. Liu, B. Fu and S. X. Yang, "HitBoost: Survival Analysis via a Multi-Output Gradient Boosting Decision Tree Method," in IEEE Access, vol. 7, pp. 56785-56795, 2019.

L.J.P. van der Maaten and G.E. Hinton. "Visualizing High-Dimensional Data Using t-SNE". Journal of Machine Learning Research 9(Nov):2579-2605, 2008