An Aalto University student project for the course "Simulation".
A simple example of how analytics and statistics can be used in sports, especially to predict match results in floorball or other team sports.
3. Salibandyliiga in a nutshell
- 14 teams
- A twofold regular season (182 games)
- The top 8 teams continue to the playoffs
- The weakest team is relegated from the league
- Teams placing 12th and 13th face elimination rounds
This project simulates the results of the regular season, based on game results from 1990-2015.
4. Research question
"What are the final standings and which team will win the Salibandyliiga regular season 2015-2016?"
5. Assumptions
- The number of goals scored by a team in a single match follows a truncated Poisson distribution, based on data from 1990-2015
- There is a statistically significant difference between goals scored at home and goals scored away (F-test, t-test)

f(x; λ) = Pr(X = x) = λ^x · e^(-λ) / x!,  where x ∈ {1, 2, 3, …, 25}
Figure 1: Realized goal distribution vs. truncated Poisson distribution (goals 1-26 on the x-axis, shares 0-20 % on the y-axis)
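The truncated Poisson assumption can be sketched in Python. This is a minimal illustration, not the project's actual code: renormalizing over the truncated support {1, ..., 25} is one common convention for truncation, and the goal average 5.33 is the home-team figure from Figure 2.

```python
import math

def truncated_poisson_pmf(lam, support=range(1, 26)):
    """Poisson pmf restricted to x in {1, ..., 25} and renormalized
    so the probabilities again sum to one."""
    raw = {x: lam ** x * math.exp(-lam) / math.factorial(x) for x in support}
    total = sum(raw.values())
    return {x: p / total for x, p in raw.items()}

pmf = truncated_poisson_pmf(5.33)  # home-team goal average from Figure 2
```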
Figure 2: Avg. number of goals for the home team (5.33) and the away team (4.90)
7. Step 1: Averages for goal distributions
λ(n vs. m) = x̄(n vs. m)
The goal average λ for each match pair is the historical average of goals scored by team n against team m, derived from the data.
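A minimal sketch of how such a match-pair average might be derived; the historical scores below are made up for illustration:

```python
def goal_average(goals_scored):
    """Average goals scored by team n against team m over past meetings."""
    return sum(goals_scored) / len(goals_scored)

# Hypothetical historical goal counts of team n against team m:
lam_n_vs_m = goal_average([4, 6, 5, 7, 5])  # -> 5.4
```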
8. Step 2: How is a single match simulated?
Un ~ U(0,1), Um ~ U(0,1)
Figure: pdf of the goal distributions for team n vs. team m (goals 0-25 on the x-axis, probabilities 0-0.2 on the y-axis)
9. Step 2: How is a single match simulated?
Un ~ U(0,1), Um ~ U(0,1)
Figure: cdf of the goal distributions for team n vs. team m (goals 0-25 on the x-axis, cumulative probabilities 0-1 on the y-axis)
10. Step 2: How is a single match simulated?
Un = 0.65, Um = 0.79
Figure: cdf of the goal distributions for team n vs. team m
12. Step 2: How is a single match simulated?
Un = 0.65, Um = 0.79
Match result: 5 - 9
Figure: cdf of the goal distributions for team n vs. team m
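Steps 8-12 describe inverse-transform sampling: a uniform draw U is mapped through the cdf to a goal count. A sketch under that reading, with a toy pmf standing in for the fitted goal distributions (the slides' uniform draws 0.65 and 0.79 are reused):

```python
def sample_goals(pmf, u):
    """Invert the cdf: return the smallest goal count x whose
    cumulative probability reaches the uniform draw u."""
    cum = 0.0
    for x in sorted(pmf):
        cum += pmf[x]
        if u <= cum:
            return x
    return max(pmf)  # guard against floating-point round-off

# Toy pmf (not a fitted distribution) and the slides' uniform draws:
pmf_toy = {1: 0.3, 2: 0.3, 3: 0.15, 4: 0.25}
goals_n = sample_goals(pmf_toy, 0.65)  # cdf 0.3, 0.6, 0.75, 1.0 -> 3
goals_m = sample_goals(pmf_toy, 0.79)  # -> 4
```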
13. Step 3: Match outcome probabilities
- Each match pair is simulated 1000 times to obtain probabilities for the different match outcomes, e.g.:
  Team n wins 68% of the time
  The match ends in a draw 10% of the time
  Team m wins 22% of the time
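The repeated simulation of step 3 can be sketched as follows; the goal samplers here are toy stand-ins for the fitted truncated Poisson draws:

```python
import random

def outcome_probabilities(sample_n, sample_m, runs=1000):
    """Estimate P(n wins), P(draw), P(m wins) by Monte Carlo."""
    wins_n = draws = wins_m = 0
    for _ in range(runs):
        gn, gm = sample_n(), sample_m()
        if gn > gm:
            wins_n += 1
        elif gn == gm:
            draws += 1
        else:
            wins_m += 1
    return wins_n / runs, draws / runs, wins_m / runs

random.seed(1)
# Toy goal samplers standing in for the fitted distributions:
p_win, p_draw, p_loss = outcome_probabilities(
    lambda: random.randint(1, 8),   # team n
    lambda: random.randint(1, 6))   # team m
```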
14. Step 4: Simulation of one season
- The outcome of each match is determined with a LOOKUP function and a uniformly distributed random number U ~ U(0,1)
- Cumulative match outcome probabilities:
  Team n wins: 0.68
  Draw: 0.78
  Team m wins: 1.0
16. Step 4: Simulation of one season
- The outcome of each match is determined with a LOOKUP function and a uniformly distributed random number U ~ U(0,1)
- Cumulative match outcome probabilities:
  Team n wins: 0.68
  Draw: 0.78
  Team m wins: 1.0
- Example draw: U = 0.82 → Team m wins!
- 182 matches per season
- Win = +2 points, draw = +1 point
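The LOOKUP step can be mimicked with a small function; the cumulative values and the example draw 0.82 are taken from the slides (win = +2 points, draw = +1):

```python
def match_points(u, cum_win=0.68, cum_draw=0.78):
    """Map a uniform draw u to (points for team n, points for team m)
    via the cumulative outcome probabilities."""
    if u <= cum_win:
        return 2, 0   # team n wins
    if u <= cum_draw:
        return 1, 1   # draw
    return 0, 2       # team m wins

result = match_points(0.82)  # -> (0, 2): team m wins, as on the slide
```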
17. Step 5: Outcomes of the model
- After 1000 simulation runs of a single season, the model reports:
  Points per team per season
  Average, max and min points per team over the 1000 runs
  Variance, standard deviation and standard error of the mean of points per team over the 1000 runs
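The summary statistics listed above can be computed with Python's standard library; the points list below is fabricated for illustration, not output from the model:

```python
import statistics

def season_summary(points_per_run):
    """Average, max, min, variance, standard deviation and standard
    error of the mean of one team's points over all simulation runs."""
    n = len(points_per_run)
    sd = statistics.stdev(points_per_run)
    return {
        "mean": statistics.mean(points_per_run),
        "max": max(points_per_run),
        "min": min(points_per_run),
        "variance": statistics.variance(points_per_run),
        "stdev": sd,
        "sem": sd / n ** 0.5,
    }

summary = season_summary([38, 41, 35, 44, 40])  # toy data
```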
19. Iteration 1
- Data: past 3 seasons
- Assumption: home advantage does not exist
- Problem: older data carries too much weight in the results
Figure: Probability of placing 1st in the regular season, iteration 1 (pie chart; teams Classic, Happee, SPV, SSV, Oilers; shares 14%, 49%, 26%, 9%, 2%)
Figure: Team performance, avg. points per team, iteration 1
20. Iteration 2
- Data: past 1? seasons
- Assumption: home advantage does exist
- Problem: Poisson distribution fitted with too small a sample size
Figure: Probability of placing 1st in the regular season, iteration 2 (pie chart; teams Happee, SPV, Oilers, SSV, Classic; shares 61%, 10%, 2%, 3%, 26%)
Figure: Team performance, avg. points per team, iterations 1 & 2
21. Iteration 3
- Data: 3 seasons with weights (0.6 / 0.25 / 0.15)
- Assumption: home advantage does exist
- Problem: are the weights accurate?
Figure: Probability of placing 1st in the regular season, iteration 3 (pie chart; teams Happee, SPV, Oilers, SSV, Classic; shares 54%, 15%, 3%, 7%, 21%)
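Iteration 3's season weighting can be sketched as a weighted average using the slide's weights (0.6 / 0.25 / 0.15), most recent season first; the per-season goal averages below are made up:

```python
def weighted_goal_average(season_avgs, weights=(0.6, 0.25, 0.15)):
    """Weighted goal average over recent seasons, most recent first."""
    return sum(a * w for a, w in zip(season_avgs, weights))

# Hypothetical per-season goal averages for one match pair:
lam = weighted_goal_average([5.0, 4.0, 6.0])  # ≈ 4.9
```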
26. Limitations
- There is no perfect data set available
  - Teams and their relative strengths change between seasons
- Salibandyliiga is probably not the best-suited sports league for this model
  - Only 182 games per season (vs. 1230 games per season in the NHL)
  - Different teams each season
→ Trade-off: timeliness of data vs. sample size