1) 19% of existing customers become repeat customers, purchasing a second or third car from the same dealership.
2) The document analyzes purchase history data to determine which subsequent car models repeat customers are most likely to purchase after their initial car.
3) Several predictive models are proposed, including decision trees, to more accurately predict a repeat customer's next vehicle based on additional customer profile data like age, income, gender, and occupation. Better predicting customer preferences could help improve marketing strategies.
2. Repeat Customer Proportion
Type of Customer                 No. of Cars   No. of Customers   %
Non repeat Customer              468,558       468,558            81%
Customer who purchased 2 cars    177,447       88,723             15%
Customer who purchased 3 cars    75,965        25,321             4%
19% of existing customers become repeat customers!
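The shares can be recomputed from the customer counts on this slide. The snippet below is a minimal sketch of that arithmetic; the dictionary keys are just the chart legend labels, and the unrounded results may differ slightly from the whole-number percentages shown on the slide.

```python
# Customer counts as reported on the slide (number of customers, not cars).
customers = {
    "Non repeat customer": 468_558,
    "Bought 2nd car": 88_723,
    "Bought 3rd car": 25_321,
}

total = sum(customers.values())

# Share of each segment in the whole customer base.
shares = {segment: count / total for segment, count in customers.items()}

# Repeat customers are everyone who bought a 2nd or 3rd car.
repeat_share = 1 - shares["Non repeat customer"]

for segment, share in shares.items():
    print(f"{segment}: {share:.1%}")
print(f"Repeat customers: {repeat_share:.1%}")
```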
3. Count Analysis Method:-
From History of Sales
S.N.   Customer ID   No. of Contracts   MM-YYYY   Product
1      1000102580    3                  03-2003   CAMRY
2      1000102580    3                  07-2005   LANDCRUISER WAGON
3      1000102580    3                  03-2008   FORTUNER
Observing Preference of Buying Subsequent Car
Previous Car          Subsequent Car
CAMRY                 LANDCRUISER WAGON
CAMRY                 FORTUNER
LANDCRUISER WAGON     FORTUNER
Counting the number of purchases with the same first car and the same subsequent car, across all repeat customers.
Previous \ Subsequent   FORTUNER   LANDCRUISER WAGON
CAMRY                   1          1
LANDCRUISER WAGON       1
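The counting step can be sketched in a few lines of Python. This is an illustration under assumptions, not the deck's actual implementation: the tuple layout and the helper names `purchase_key` and `count_transitions` are hypothetical, and the only data used is the single example customer from the slide. Following the slide, every earlier car is paired with every later car of the same customer.

```python
from collections import Counter
from itertools import combinations

# Example purchase history from the slide: (customer_id, "MM-YYYY", product).
history = [
    ("1000102580", "03-2003", "CAMRY"),
    ("1000102580", "07-2005", "LANDCRUISER WAGON"),
    ("1000102580", "03-2008", "FORTUNER"),
]

def purchase_key(mm_yyyy):
    """Turn 'MM-YYYY' into a sortable (year, month) tuple."""
    month, year = mm_yyyy.split("-")
    return int(year), int(month)

def count_transitions(rows):
    """Count (previous car, subsequent car) pairs over all customers."""
    by_customer = {}
    for customer_id, date, product in rows:
        by_customer.setdefault(customer_id, []).append((purchase_key(date), product))

    counts = Counter()
    for purchases in by_customer.values():
        ordered = [product for _, product in sorted(purchases)]
        # Pair each earlier purchase with each later purchase.
        for previous, subsequent in combinations(ordered, 2):
            counts[(previous, subsequent)] += 1
    return counts

counts = count_transitions(history)
for (previous, subsequent), n in counts.items():
    print(f"{previous} -> {subsequent}: {n}")
```

Dividing each count by its row total would give a percentage distribution of subsequent cars for each first car.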
4. How did first car owners choose their next car(s)?
First Car \ Subsequent Car   CAMRY   PRADO   LANDCRUISER WAGON   COROLLA   FORTUNER   Yaris Sedan
YARIS (H/B)                  16%     2%      1%                  28%       8%         17%
COROLLA                      22%     2%      2%                  32%       6%         12%
CAMRY                        31%     3%      1%                  20%       6%         8%
FORTUNER                     13%     7%      2%                  21%       16%        14%
Aurion                       20%     2%      2%                  20%       3%         10%
INNOVA                       13%     2%      1%                  22%       10%        14%
* Only selected products are shown due to space constraints.
Only active products are used for subsequent car preference, for marketing relevance.
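A top-N prediction can be read directly off a row of this table by sorting the subsequent cars by share. The sketch below assumes the percentages map to the columns in the order listed on the slide; the `preference` dictionary and the `top_k` helper are illustrative names. Because the slide shows only selected products, a product outside these six columns cannot appear in the result.

```python
# Share (%) of subsequent cars for each first car, as listed on the slide.
COLUMNS = ["CAMRY", "PRADO", "LANDCRUISER WAGON", "COROLLA", "FORTUNER", "Yaris Sedan"]
ROWS = {
    "YARIS (H/B)": [16, 2, 1, 28, 8, 17],
    "COROLLA":     [22, 2, 2, 32, 6, 12],
    "CAMRY":       [31, 3, 1, 20, 6, 8],
    "FORTUNER":    [13, 7, 2, 21, 16, 14],
    "Aurion":      [20, 2, 2, 20, 3, 10],
    "INNOVA":      [13, 2, 1, 22, 10, 14],
}
preference = {first: dict(zip(COLUMNS, values)) for first, values in ROWS.items()}

def top_k(first_car, k=3):
    """Return the k most frequent subsequent cars for a given first car."""
    row = preference[first_car]
    return sorted(row.items(), key=lambda item: item[1], reverse=True)[:k]

for car, share in top_k("COROLLA"):
    print(f"{car}: {share}%")
```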
5. Top 3 prediction with distribution:-
[Bar chart. First car: COROLLA. Predicted subsequent cars: COROLLA, CAMRY, Yaris Sedan. Axis: 0% to 30%.]
6. Top 5 prediction with distribution:-
[Bar chart. First car: CAMRY. Predicted subsequent cars: CAMRY, COROLLA, Yaris Sedan, HILUX DOUBLE CAB(IMV3), FORTUNER. Axis: 0% to 35%.]
7. Top 3 prediction with distribution:-
[Bar chart. First car: YARIS (H/B). Predicted subsequent cars: COROLLA, Yaris Sedan, CAMRY. Axis: 0% to 30%.]
8. Top 3 prediction with distribution:-
[Bar chart. First car: YARIS (H/B). Predicted subsequent cars: COROLLA, Yaris Sedan, CAMRY. Axis: 0% to 30%.]
9. How can we improve this knowledge?
Decision Tree Model: a non-linear prediction model for multi-category classification that segments the data hierarchically by recursive partitioning.
A tree structure of rules over the input variables is used to classify or predict the target variable.
Additional internal information we hold about repeat customers can be used as additional input variables.
Buying preference generally depends on multiple factors (additional data we have on our customers):
Marital Status, Nationality, Work Place, Job, Guest Age Range, Guest Income Range, Guest Gender, Government/Non-Government
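To make recursive partitioning concrete, here is a minimal sketch of a decision tree over categorical inputs. It is an assumption-laden illustration, not the tool or settings used for this analysis: it splits on Gini impurity, the six training rows are invented toy data, and the field names `first_car` and `income_range` are hypothetical stand-ins for the customer variables listed above.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, features):
    """Pick the feature whose partition has the lowest weighted Gini, if any improves."""
    best_feature, best_score = None, gini(labels)
    for f in features:
        groups = {}
        for row, y in zip(rows, labels):
            groups.setdefault(row[f], []).append(y)
        score = sum(len(g) / len(labels) * gini(g) for g in groups.values())
        if score < best_score - 1e-12:
            best_feature, best_score = f, score
    return best_feature

def build_tree(rows, labels, features, max_depth=3):
    """Recursively partition the data; each node stores its majority class."""
    majority = Counter(labels).most_common(1)[0][0]
    feature = best_split(rows, labels, features) if max_depth > 0 else None
    if feature is None:
        return {"predict": majority}
    remaining = [f for f in features if f != feature]
    children = {}
    for value in {row[feature] for row in rows}:
        sub_rows = [r for r in rows if r[feature] == value]
        sub_labels = [y for r, y in zip(rows, labels) if r[feature] == value]
        children[value] = build_tree(sub_rows, sub_labels, remaining, max_depth - 1)
    return {"feature": feature, "children": children, "predict": majority}

def predict(tree, row):
    """Follow the rules down the tree; fall back to the node majority on unseen values."""
    while "feature" in tree:
        child = tree["children"].get(row[tree["feature"]])
        if child is None:
            break
        tree = child
    return tree["predict"]

# Invented toy data: (customer profile, next car actually bought).
rows = [
    {"first_car": "CAMRY", "income_range": "high"},
    {"first_car": "CAMRY", "income_range": "low"},
    {"first_car": "COROLLA", "income_range": "high"},
    {"first_car": "COROLLA", "income_range": "low"},
    {"first_car": "CAMRY", "income_range": "high"},
    {"first_car": "COROLLA", "income_range": "low"},
]
labels = ["FORTUNER", "CAMRY", "CAMRY", "COROLLA", "FORTUNER", "COROLLA"]

tree = build_tree(rows, labels, ["first_car", "income_range"])
print(predict(tree, {"first_car": "CAMRY", "income_range": "high"}))
```

In this toy data the first car alone leaves each segment mixed, and adding the income range separates the next-car choices, which is the kind of improvement the extra customer variables are meant to provide.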
10. Data Preparation
1. Exploring the significance of variables
2. Data Partition:- 55% Training, 45% Validation.
Chi-squared plot of variables by their chi-squared statistics.
Variable worth plot: worth of variables by their worth in predicting the target variable.
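As a rough illustration of how a chi-squared statistic scores a categorical variable against the target, the sketch below computes the Pearson chi-squared statistic from a contingency table. The function name and the toy values are assumptions for illustration; the slide does not say how its plots were produced.

```python
from collections import Counter

def chi_squared(feature_values, target_values):
    """Pearson chi-squared statistic for two categorical columns of equal length."""
    n = len(feature_values)
    observed = Counter(zip(feature_values, target_values))
    feature_totals = Counter(feature_values)
    target_totals = Counter(target_values)

    stat = 0.0
    for f, f_total in feature_totals.items():
        for t, t_total in target_totals.items():
            # Expected count if the variable and the target were independent.
            expected = f_total * t_total / n
            stat += (observed[(f, t)] - expected) ** 2 / expected
    return stat

# Invented toy columns: one variable that tracks the target, one that does not.
next_car = ["COROLLA", "COROLLA", "CAMRY", "CAMRY"]
related = ["low", "low", "high", "high"]
unrelated = ["M", "F", "M", "F"]

print(chi_squared(related, next_car))
print(chi_squared(unrelated, next_car))
```

A larger statistic means the variable's categories are more strongly associated with the target, so variables can be ranked by this value before modelling.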
12. Model Interpretation: Misclassification Rate
A lower misclassification rate indicates a better model.
Model performance is good: the training and validation results are close, which suggests the model is not overfitting.
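The misclassification rate is the fraction of records whose predicted class differs from the actual class, computed separately on the training and validation partitions. The sketch below shows a 55%/45% partition and the rate calculation; the function names, the fixed seed, and the toy labels are assumptions for illustration only.

```python
import random

def partition(records, train_fraction=0.55, seed=0):
    """Shuffle a copy of the records and split it into training and validation sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def misclassification_rate(actual, predicted):
    """Fraction of records where the predicted class differs from the actual class."""
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)

# 100 placeholder record ids split 55 / 45.
train, validation = partition(range(100))
print(len(train), len(validation))

# Invented toy labels: one of four predictions is wrong.
actual = ["CAMRY", "COROLLA", "CAMRY", "COROLLA"]
predicted = ["CAMRY", "COROLLA", "COROLLA", "COROLLA"]
print(misclassification_rate(actual, predicted))
```

If the rate on the validation set is much higher than on the training set, the model has likely fitted noise in the training data.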