The document analyzes real estate data using k-means clustering to identify promising areas for investment. It recommends Cluster 3, which has the highest mean rental yield (9%) and a mean rental share of 23%, lower than Cluster 1's 26%, indicating potential for rental increases. Specific areas highlighted for investment include counties in Michigan, cities and counties in Texas, and counties in Ohio, Illinois, and Missouri.
2. RECOMMENDATIONS
1. Cluster 3 appears to be the best segment for investment.
2. The primary reason is its high rental yield combined with a relatively low rental share, which leaves room for rents to rise.
3. Drilling further into Cluster 3 on the rental yield, rental share and population parameters, we can shortlist the areas below.
State      Place
Michigan   Genesee, Macomb, Ingham (counties)
Texas      Corpus Christi, Fort Worth, Arlington (cities)
           Nueces, Bell, Bexar, Tarrant (counties)
Ohio       Montgomery County
Illinois   St. Clair County
Missouri   Jackson County
3. 1. THE CHART BELOW SHOWS HOW RENTAL YIELD AND RENTAL SHARE COMPARE ACROSS THE SELECTED AREAS.
2. THE AREAS ARE ORDERED BY DECREASING RENTAL YIELD, AND A LINEAR TREND LINE FOR RENTAL YIELD IS SHOWN.
[Chart: "Cluster chart: Rental Yield & Rental Share". Y-axis from 0 to 25. Series: Rent Yield, Rent Share, Linear (Rent Yield). Areas, left to right: Genesee County, Corpus Christi city, Nueces County, Fort Worth city, Macomb County, Bell County, Bexar County, St. Clair County, Montgomery County, Ingham County, Jackson County, Tarrant County, Arlington city.]
7. OBJECTIVE & APPROACH
• Goal:
Recommend a good place / zip code in which to buy property for investment purposes.
• K-means clustering:
This algorithm creates clusters by minimizing the distance between points and their cluster centroids. It is effective for large datasets.
The SAS FASTCLUS procedure (PROC FASTCLUS) was used for this method.
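The centroid-distance idea above can be sketched in a few lines. This is a plain-Python illustration of the textbook (Lloyd-style) k-means loop, not the PROC FASTCLUS implementation used in the analysis; the function name `kmeans`, the starting centroids and the four data points are all made up for the example.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Lloyd-style k-means: assign each point to its nearest centroid, then
    move each centroid to the mean of its assigned points, until stable."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the closest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: recompute each centroid as the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # no centroid moved, so the assignment is stable
        centroids = new_centroids
    return labels, centroids

# Made-up (rental yield %, rental share %) pairs, for illustration only.
points = [(5.0, 26.0), (5.5, 25.0), (9.0, 23.0), (9.5, 22.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[2]])
print(labels)
print(centroids)
```

In practice the variables would normally be put on a comparable scale first, since a raw population count would otherwise dominate the distance calculation.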
8. ANALYSIS STEPS
• We can use cluster analysis on the given dataset to segment the data points by critical factors such as rental yield, rental share of income, place type and size of the place.
• With this approach we can split the data into high, medium and low returns on investment.
• The goal of clustering is to find similarities and differences within the data by creating homogeneous groups in which within-group similarities are maximized and between-group similarities are minimized.
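One common way to quantify "homogeneous groups" is to split the total scatter of a variable into a within-group part and a between-group part. The sketch below does this for a single variable; the two groups and their values are invented for illustration and are not taken from the dataset.

```python
def sum_of_squares(values, center):
    """Sum of squared deviations of values from a given center."""
    return sum((v - center) ** 2 for v in values)

# Made-up rental yields (%) already split into two groups, illustration only.
groups = {"low": [4.0, 5.0, 6.0], "high": [8.0, 9.0, 10.0]}

all_values = [v for g in groups.values() for v in g]
grand_mean = sum(all_values) / len(all_values)

# Within-group scatter: how far members sit from their own group mean.
within = sum(sum_of_squares(g, sum(g) / len(g)) for g in groups.values())
# Between-group scatter: how far group means sit from the overall mean.
between = sum(
    len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
)
total = sum_of_squares(all_values, grand_mean)

print(within, between, total)
```

A good segmentation has a small within-group value relative to the between-group value; the two always add up to the total scatter.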
10. CLUSTER 1 PROFILE
Variable       Mean      Pop mean   Std dev   Z score
Rental share   26%       21%        4%        1.25
Population     3597926   260474     2144874   1.6
Rental yield   5%        6%         2%        0.5
1. 19 data points fall in this cluster.
2. Rental share has the highest z-score, and this differentiates the cluster.
3. Because rental share has a high z-score, we can conclude that this cluster comprises low-income groups and has less scope for yield on investment.
4. This is further reflected in the rental yield z-score and the population means.
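The z-scores in the Cluster 1 table appear to follow the usual form (cluster mean minus overall mean, divided by the standard deviation). That formula is an assumption inferred from the reported numbers, not something the slides state; the helper name `z_score` is hypothetical. A quick check against the Cluster 1 values:

```python
def z_score(cluster_mean, overall_mean, std_dev):
    """Distance of a cluster mean from the overall mean, in standard deviations."""
    return (cluster_mean - overall_mean) / std_dev

# Cluster 1 values taken from the profile table (percentages as whole numbers).
rental_share_z = z_score(26, 21, 4)
population_z = z_score(3597926, 260474, 2144874)
rental_yield_z = z_score(5, 6, 2)

print(rental_share_z)          # 1.25
print(round(population_z, 1))  # 1.6
print(rental_yield_z)          # -0.5
```

Under this assumption the rental yield z-score comes out negative, which suggests the table reports magnitudes rather than signed values.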
11. CLUSTER 2 PROFILE
Variable       Mean      Pop mean   Std dev   Z score
Rental share   20%       21%        4%        1.25
Rental yield   5%        6%         1%        1
Population     219706    260474     241058    0.17
1. 1176 data points fall in this cluster. This is roughly 73% of the total data, making cluster 2 the biggest cluster.
2. Rental yield is marginally higher than in cluster 1.
3. In cluster 2 as well, rental share has the highest z-score.
4. Cluster 2 is not an ideal investment option.
12. CLUSTER 3 PROFILE
Variable       Mean      Pop mean   Std dev   Z score
Rental yield   9%        6%         2%        1.5
Population     222090    260474     317575    0.75
Rental share   23%       21%        4%        0.5
1. 403 data points fall in this cluster. This is 25% of the total data.
2. Rental yield has the highest z-score (1.5), and this differentiates the cluster.
3. The rental share z-score suggests that this cluster has the potential to pay more rent, as its rental share is relatively low: 23%, compared with 26% in cluster 1.
4. The cluster's mean population is also reasonably close to the overall population mean, which supports an investment decision.
5. Cluster 3 has all the ingredients for an ideal investment option.
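The cluster shares quoted in the three profiles can be checked from the reported counts. The sketch assumes that the three clusters together account for the whole dataset, which the slides imply but do not state.

```python
# Data points per cluster, as reported in the three profiles.
cluster_sizes = {1: 19, 2: 1176, 3: 403}

# Assumption: these three clusters cover the entire dataset.
total = sum(cluster_sizes.values())
shares = {c: round(100 * n / total, 1) for c, n in cluster_sizes.items()}

print(total)   # 1598
print(shares)  # {1: 1.2, 2: 73.6, 3: 25.2}
```

These figures are consistent with the "roughly 73%" and "25%" given for clusters 2 and 3.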
13. RECOMMENDATIONS
• Cluster 3 appears to be the ideal investment option.
• The reason is its high rental yield and relatively low rental share, with good potential for rents to rise.
• On further analysis of the cluster 3 data based on rental yield, population size and propensity to pay more rent, we can shortlist the areas below.
State      Place
Michigan   Genesee, Macomb, Ingham (counties)
Texas      Corpus Christi, Fort Worth, Arlington (cities)
           Nueces, Bell, Bexar, Tarrant (counties)
Ohio       Montgomery County
Illinois   St. Clair County
Missouri   Jackson County
14. APPENDIX – SAS CODE
• SAS code location in WPS:
Y:USERSUSER169ProgrammesClusteringReal Estate_clustering.sas