This document describes VacAdvisor, a tool that recommends vacation options based on a user's specified budget. It clusters over 720 US cities using data on flight costs, hotel rates, daily expenses, and location attributes. The conceptual framework uses clustering algorithms such as k-means to group similar cities. Validation compares several cluster counts and several algorithms, with k-means giving the best results on metrics such as the within-cluster sum of squares (WSS) and the adjusted Rand index (ARI). The goal is to match users to the vacation spots that best fit their preferences and budgets.
7. Features
Flight cost from New York to 720 cities in the US
Average cost of hotels for 3143 counties
Average daily expenses including car
Location of city [east, west, central, south, north]
City specifics such as beaches, museums, and national parks
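To cluster on the features listed above, each city has to become a numeric vector. The following is only a minimal sketch of one way that could be done, not the encoding used in VacAdvisor: the field names, the one-hot region flags, the min-max scaling, and all dollar values are assumptions made up for illustration.

```python
# Hypothetical encoding of the listed features; names and values are illustrative only.
REGIONS = ["east", "west", "central", "south", "north"]
ATTRIBUTES = ["beaches", "museums", "national_parks"]

def encode(city):
    """Return a flat vector: 3 costs, 5 region flags, 3 attribute flags."""
    costs = [city["flight_cost"], city["hotel_cost"], city["daily_expenses"]]
    region_flags = [1.0 if city["region"] == r else 0.0 for r in REGIONS]
    attr_flags = [1.0 if a in city["attributes"] else 0.0 for a in ATTRIBUTES]
    return costs + region_flags + attr_flags

def min_max_scale(rows):
    """Rescale every column to [0, 1] so dollar amounts do not dominate distances.

    A column whose values are all equal carries no information and is set to 0.
    """
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in rows
    ]

# Made-up values, not data from the slides.
cities = [
    {"name": "Seattle", "flight_cost": 320.0, "hotel_cost": 180.0,
     "daily_expenses": 95.0, "region": "west",
     "attributes": {"museums", "national_parks"}},
    {"name": "Charlotte", "flight_cost": 210.0, "hotel_cost": 130.0,
     "daily_expenses": 80.0, "region": "south", "attributes": {"museums"}},
    {"name": "Detroit", "flight_cost": 190.0, "hotel_cost": 120.0,
     "daily_expenses": 75.0, "region": "north", "attributes": {"museums"}},
]
vectors = min_max_scale([encode(c) for c in cities])
```

Scaling matters here because a clustering algorithm based on Euclidean distance would otherwise be driven almost entirely by the largest-valued column.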
Fred N. Kiwanuka Fellow Insight Data Science VacAdvisor
12. Cluster Validation
Table: (Cluster Validation)
Number of Clusters   WSS (10^5)   City Similarity
2                    9.98         Seattle: Detroit, Charlotte, South Bend
3                    9.73         Seattle: Boston, Phoenix, Detroit
4                    9.87         Seattle: Charlotte, South Bend, Minneapolis
5                    9.15         Seattle: Detroit, Charlotte, South Bend
7                    9.43         Seattle: Detroit, Charlotte, South Bend
10                   9.63         Seattle: Sacramento, San Jose, Columbus
12                   9.52         Seattle: Sacramento, San Jose, Columbus
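A table like the one above can be produced by running k-means for each candidate cluster count and recording the within-cluster sum of squares (WSS). The sketch below is a plain-Python version of Lloyd's algorithm under stated assumptions: the slides do not say which implementation was used, and the function names, the number of restarts, and the synthetic 2-D points are all hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns the final list of centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iter):
        # Assignment step: index of the nearest centroid for every point.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_centroids.append(centroids[j])  # empty cluster: keep old centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids

def wss(points, centroids):
    """Sum over points of the squared distance to the nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)

def best_wss(points, k, restarts=10):
    """Lowest WSS over several random initialisations (k-means is init-sensitive)."""
    return min(wss(points, kmeans(points, k, seed=s)) for s in range(restarts))

# Synthetic stand-in data: three blobs in 2-D, not the VacAdvisor city features.
rng = random.Random(42)
centers = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
          for cx, cy in centers for _ in range(20)]

wss_by_k = {k: best_wss(points, k) for k in range(1, 6)}
for k, value in wss_by_k.items():
    print(k, round(value, 2))
```

With the global optimum, WSS can only fall as the cluster count grows, so a non-monotone column usually reflects different random initialisations or local optima rather than a better fit at a smaller cluster count.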
13. Cluster Initialization and Validation
Table: (Cluster Initialization and Validation)
Alg          Time (s)   Homogeneity   Completeness   V-measure   ARI     AMI     Silhouette
k-means      0.03       0.971         0.971          0.971       0.988   0.970   0.389
VQ           0.04       1.000         1.000          1.000       1.000   1.000   0.388
After PCA    0.00       1.000         1.000          1.000       1.000   1.000   0.388
Mean Shift   0.24       1.000         0.970          0.972       0.980   0.972   0.386
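Most columns in the table above compare a predicted clustering against reference labels. As a rough illustration of what those scores measure, the sketch below computes homogeneity, completeness, V-measure, and the adjusted Rand index from their standard definitions in plain Python; AMI and silhouette are left out for brevity. The function names and the tiny label lists are hypothetical, and nothing here is taken from the VacAdvisor code.

```python
from collections import Counter
from math import comb, log

def adjusted_rand_index(truth, pred):
    """ARI from the contingency table of two labelings."""
    total = comb(len(truth), 2)
    if total == 0:
        return 1.0  # fewer than two samples: treat as perfect agreement
    sum_cells = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(truth).values())
    sum_cols = sum(comb(c, 2) for c in Counter(pred).values())
    expected = sum_rows * sum_cols / total
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings are one single cluster
    return (sum_cells - expected) / (max_index - expected)

def entropy(labels):
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum(c / n * log(c / n) for c in Counter(labels).values())

def conditional_entropy(a, b):
    """Conditional entropy H(a | b) in nats."""
    n = len(a)
    b_counts = Counter(b)
    return -sum(c / n * log(c / b_counts[bv])
                for (_, bv), c in Counter(zip(a, b)).items())

def homogeneity_completeness_v(truth, pred):
    """Homogeneity, completeness, and their harmonic mean (V-measure)."""
    h_truth, h_pred = entropy(truth), entropy(pred)
    homo = 1.0 if h_truth == 0 else 1.0 - conditional_entropy(truth, pred) / h_truth
    compl = 1.0 if h_pred == 0 else 1.0 - conditional_entropy(pred, truth) / h_pred
    v = 0.0 if homo + compl == 0 else 2 * homo * compl / (homo + compl)
    return homo, compl, v

# Hypothetical labels: the prediction merges reference classes 1 and 2.
truth = [0, 0, 1, 2]
pred = [0, 0, 1, 1]
ari = adjusted_rand_index(truth, pred)
homo, compl, v = homogeneity_completeness_v(truth, pred)
print(round(ari, 3), round(homo, 3), round(compl, 3), round(v, 3))
```

Merging two reference classes lowers homogeneity but leaves completeness at 1, which is the same pattern in reverse as the Mean Shift row, where homogeneity is 1.000 and completeness is lower.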