The document analyzes Wi-Fi usage data from the downtown mall in Charlottesville. Some key findings include:
1) The number of Wi-Fi clients is highest from April to August and correlates with weather, exhibiting strong weekly seasonality peaking on Fridays.
2) Usage data differs from client data, with usage highest from October to November. Daily usage peaks from 10am-6pm, indicating that local users are the biggest data consumers.
3) Parking ticket data, which tracks downtown activity, shows a meaningful correlation to Wi-Fi data usage at a 4-hour level, suggesting tickets can partially explain visitor data usage patterns.
2018 Charlottesville Open Data Challenge - Team DSB
2. 2
OBSERVATIONS IN NUMBER OF CLIENTS DATA
High variance from April to June likely reflects special events (holidays, festivals, or other events on the downtown mall),
beautiful weather drawing visitors to the downtown mall, and/or surprise inclement weather forcing visitors
indoors and onto Wi-Fi
No observable increasing or decreasing trend in overall time series; the slope of the plotted trendline is not
statistically significant
Monticello Wine Trail Festival
Tom Tom Founders Festival
Pride Festival
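The trendline check on this slide can be sketched in code. The slides do not say which significance test was used, so the permutation test below is a swapped-in technique chosen because it needs only the standard library. Both series are hypothetical, and the function names are illustrative. Note that a permutation test assumes the observations are exchangeable, which autocorrelated daily data only approximates.

```python
import random


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def slope_permutation_pvalue(ys, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for the slope of ys against its time index."""
    rng = random.Random(seed)
    xs = list(range(len(ys)))
    observed = abs(ols_slope(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(ols_slope(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical daily client counts: noise around a flat level (assumption).
rng = random.Random(42)
flat_series = [100 + rng.gauss(0, 15) for _ in range(120)]
# Hypothetical series with a built-in upward trend, for contrast (assumption).
trending_series = [100 + 0.5 * day + rng.gauss(0, 15) for day in range(120)]

p_flat = slope_permutation_pvalue(flat_series)
p_trend = slope_permutation_pvalue(trending_series)
print(f"flat series p-value: {p_flat:.3f}")
print(f"trending series p-value: {p_trend:.3f}")
```

A large p-value for the real clients series would be consistent with the slide's finding of no significant trend.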
3. 3
NUMBER OF CLIENTS & WEATHER DATA
Monthly trend in number of clients reveals
correlation with weather data. Number of
clients rises and falls with temperature
April-August: High
Sept-Oct: Medium
Nov-Mar: Low
Precipitation, observed at a daily level, does
not seem to have a consistent effect on the
number of clients. More granular, hourly data
may be more predictive
The number of clients is highest from
April to August, a time when most
UVa students are out of town. This suggests that UVa
students are not a significant percentage of
Wi-Fi clients at the downtown mall
4. 4
STRONG WEEKLY SEASONALITY OBSERVED IN
THE CLIENTS DATA
Number of clients exhibits strong weekly seasonality: it increases steadily through the week starting on
Sunday, peaks on Friday, and settles down at the end of the week
Fridays are the most popular days on the downtown mall, particularly from April to September, during
Fridays After Five
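A day-of-week profile like the one described on this slide could be computed as follows. This is a minimal sketch: the four weeks of client counts are made up, and `mean_by_weekday` is an illustrative helper, not part of the original analysis.

```python
from collections import defaultdict
from datetime import date, timedelta

DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def mean_by_weekday(daily_counts):
    """Average a {date: count} mapping by day of week."""
    totals = defaultdict(float)
    days_seen = defaultdict(int)
    for day, count in daily_counts.items():
        name = DAY_NAMES[day.weekday()]
        totals[name] += count
        days_seen[name] += 1
    return {name: totals[name] / days_seen[name]
            for name in DAY_NAMES if days_seen[name]}


# Hypothetical data: four weeks starting Monday 2017-05-01, with a made-up
# Mon..Sun pattern that rises to Friday, plus a small week-over-week drift.
start = date(2017, 5, 1)
weekly_pattern = [90, 100, 110, 120, 160, 140, 80]
daily_counts = {
    start + timedelta(days=i): weekly_pattern[i % 7] + i // 7
    for i in range(28)
}

profile = mean_by_weekday(daily_counts)
peak_day = max(profile, key=profile.get)
for name, value in profile.items():
    print(f"{name}: {value:.1f}")
print("peak day:", peak_day)
```

Comparing the profile for April to September against the rest of the year would be one way to check the Fridays After Five effect.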
5. 5
SESSIONS DATA CLOSELY FOLLOWS CLIENTS
DATA
Number of sessions is highly correlated with number of clients
The histogram of sessions per client follows a near-normal distribution, suggesting there are no additional factors affecting
the number of sessions beyond those captured in the number of clients
Note: The data for number of sessions is missing for the months of Jan and half of Feb.
Therefore the # sessions values in Jan & Feb are low.
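The relationship between sessions and clients can be checked with a Pearson correlation and a sessions-per-client ratio. The sketch below uses made-up daily values and assumes days with missing sessions data (such as January and early February) have been excluded first.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


# Hypothetical daily totals (assumption, not taken from the Wi-Fi data set).
clients = [120, 95, 140, 180, 160, 110, 130]
sessions = [1450, 1100, 1720, 2150, 1900, 1300, 1600]

r = pearson_r(clients, sessions)
sessions_per_client = [s / c for s, c in zip(sessions, clients)]

print(f"correlation: {r:.3f}")
print("sessions per client:", [round(v, 2) for v in sessions_per_client])
```

A ratio that stays within a narrow band from day to day is what a tight, roughly symmetric histogram of sessions per client would look like in the raw numbers.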
6. 6
OBSERVATIONS IN USAGE DATA
Usage data is inconsistent with clients data. Usage is highest in Oct-Nov while clients are highest in Apr-Aug,
indicating that the drivers of usage differ from drivers of clients
No global trend observed in usage data
Downloads are roughly 85% of total data usage, with uploads comprising the remainder. This ratio shifts slightly
towards uploads on Friday, Saturday, and Sunday
7. 7
NO WEEKLY SEASONALITY IN USAGE
The number of clients is highest on Fridays and Saturdays, but data usage does not peak on those days. Thus,
weekend visitors drive up the number of clients but are light consumers of Wi-Fi data
Therefore, clients can be broken down into two segments:
Segment 1: Weekend visitors, large in number but light users of data
Segment 2: Likely local residents/businesses, small in number but heavy users of data
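One simple way to probe the two-segment reading is to divide daily usage by daily clients. The numbers below are hypothetical, and a lower per-client average on busy days is consistent with the segmentation without proving it.

```python
# Hypothetical daily totals (made-up numbers, not from the Wi-Fi data set).
daily_totals = {
    "Mon": {"clients": 90, "usage_mb": 5400.0},
    "Wed": {"clients": 110, "usage_mb": 5600.0},
    "Fri": {"clients": 160, "usage_mb": 5500.0},
    "Sat": {"clients": 140, "usage_mb": 5200.0},
}


def usage_per_client(totals):
    """Return average data usage per client (MB) for each day."""
    return {day: row["usage_mb"] / row["clients"] for day, row in totals.items()}


per_client = usage_per_client(daily_totals)
for day, mb in per_client.items():
    print(f"{day}: {mb:.1f} MB per client")
```

With per-client rather than per-day records, the same idea could be applied directly to each client's total usage.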
8. 8
DAILY SEASONALITY IN USAGE DATA
Total usage follows a daily seasonality peaking between 10am-6pm EST (9am-5pm with daylight savings) each
day. Since these are non-peak hours for visitors, it reinforces the hypothesis that local residents and/or
businesses (Segment 2) are the biggest consumers of Wi-Fi data
Note: The time on the x-axis is UTC time zone
9. 9
PARKING TICKET DATA ACTS AS A PROXY
FOR DOWNTOWN MALL ACTIVITY
Heatmap of Parking Tickets Issued 2017
Parking tickets are issued Mon-Fri
Data set is publicly available through
City of Charlottesville Open Data Portal
[Figure: Parking Tickets by Hour of Day and Day of Week. X-axis: hour of day (1 to 20). Y-axis: tickets (0 to 500). Series: Mon, Tue, Wed, Thu, Fri.]
10. 10
COMPARISON OF WEEKLY SEASONALITY IS
INCONCLUSIVE
On a daily level, parking tickets
track more closely with data
usage than with sessions or
clients, but still the relationship
is weak
Note: Weekends excluded because very few parking tickets are issued on weekends
[Figure: Average Parking Tickets Versus Wi-Fi Clients, Sessions, and Usage. X-axis: Mon to Fri. Axes: Data Usage (MB), 0 to 4.5; Tickets, Clients, Sessions, 0 to 180. Series: Tickets, Clients, Sessions (x10^-1), Data Usage.]
11. 11
PARKING TICKETS SHOW A MEANINGFUL
CORRELATION TO DATA USAGE AT 4-HOUR
GRANULARITY
Note: Weekends excluded because very few parking tickets are issued on weekends
[Figure: 4-Hour Data Usage vs Parking Tickets. X-axis: Parking Tickets (0 to 100). Y-axis: Data Usage (B), 0 to 7,000,000. Fitted line: y = 9982.5x + 533600, R² = 0.0297.]
[Figure: Log-Log Transform, 4-Hour Data Usage vs Parking Tickets. X-axis: LN(Parking Tickets + 1), 0 to 5. Y-axis: LN(Data Usage), 0 to 18. Fitted line: y = 0.3945x + 11.469, R² = 0.0362.]
Parking tickets partially explain visitors to the downtown mall, and therefore data usage
If client and session data were available with 4-hour granularity, we could more rigorously test this claim
and tease out the relationship between tickets and data usage versus tickets and clients
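The 4-hour aggregation and the log-log fit reported on this slide can be sketched as below. The timestamps and (tickets, bytes) pairs are hypothetical, and `four_hour_bin` and `fit_line` are illustrative helpers rather than the team's code. For reference, in a simple linear regression an R² near 0.03 corresponds to roughly 3% of the variance in the response being explained by the predictor.

```python
import math
from collections import Counter
from datetime import datetime


def four_hour_bin(ts):
    """Floor a timestamp to the start of its 4-hour window."""
    return ts.replace(hour=ts.hour - ts.hour % 4, minute=0, second=0, microsecond=0)


def fit_line(xs, ys):
    """Ordinary least squares fit; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical ticket timestamps, counted per 4-hour window (assumption).
ticket_times = [
    datetime(2017, 5, 5, 9, 15),
    datetime(2017, 5, 5, 10, 40),
    datetime(2017, 5, 5, 14, 37),
]
tickets_per_window = Counter(four_hour_bin(ts) for ts in ticket_times)

# Hypothetical (tickets, bytes used) pairs, one per 4-hour window (assumption).
windows = [
    (0, 400_000), (3, 650_000), (12, 500_000), (25, 1_200_000),
    (40, 700_000), (60, 2_500_000), (85, 900_000),
]
# Same transform as the slide: LN(tickets + 1) against LN(data usage).
log_tickets = [math.log(tickets + 1) for tickets, _ in windows]
log_usage = [math.log(usage) for _, usage in windows]

slope, intercept, r_squared = fit_line(log_tickets, log_usage)
print(f"y = {slope:.4f}x + {intercept:.3f}, R^2 = {r_squared:.4f}")
```

The `+ 1` inside the logarithm keeps windows with zero tickets in the fit, since LN(0) is undefined.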