5. Sat 4/27
UT Startup Gym!
Release event
2013/02/23 5 UT Startup Gym
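6. Data used in this session
• Database: MySQL
– RDBMS: Relational Database Management System
  • A system for storing and retrieving data with a language called SQL
– The world's most widely used open-source database
• Data contents: a fictional SNS
– Users: about 500,000 records
– Company info: about 150,000 records
– Friend relationships: about 10 million records
MySQL – Wikipedia http://ja.wikipedia.org/wiki/MySQL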
7. Table design
user
id          VARCHAR(128)
name        VARCHAR(256)
gender_id   INT UNSIGNED
lang        VARCHAR(10)
created_at  DATETIME
...
PRIMARY KEY (id)

friend
user_id     VARCHAR(128)
friend_id   VARCHAR(128)
nakayoshi   INT
PRIMARY KEY (user_id, friend_id)
UNIQUE KEY (friend_id, user_id)

company
id          BIGINT(20) UNSIGNED
name        VARCHAR(256)
PRIMARY KEY (id)
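The table design can be tried out locally. The following is a minimal sketch that recreates the three tables in an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for MySQL here (an assumption made for portability), so the column types are simplified, and the sample rows are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL, so types are simplified
# (INT UNSIGNED becomes INTEGER, UNIQUE KEY becomes UNIQUE).
# The columns elided with "..." on the slide are left out.
SCHEMA = """
CREATE TABLE user (
    id         VARCHAR(128),
    name       VARCHAR(256),
    gender_id  INTEGER,
    lang       VARCHAR(10),
    created_at DATETIME,
    PRIMARY KEY (id)
);
CREATE TABLE friend (
    user_id   VARCHAR(128),
    friend_id VARCHAR(128),
    nakayoshi INTEGER,
    PRIMARY KEY (user_id, friend_id),
    UNIQUE (friend_id, user_id)
);
CREATE TABLE company (
    id   INTEGER,
    name VARCHAR(256),
    PRIMARY KEY (id)
);
"""


def create_database():
    """Return an in-memory database containing the three empty tables."""
    connection = sqlite3.connect(":memory:")
    connection.executescript(SCHEMA)
    return connection


conn = create_database()

# Hypothetical sample rows, not taken from the real data set.
conn.execute("INSERT INTO user (id, name) VALUES ('u1', 'Taro Tanaka')")
conn.execute("INSERT INTO user (id, name) VALUES ('u2', 'Hanako Suzuki')")
conn.execute("INSERT INTO friend VALUES ('u1', 'u2', 3)")

# The composite primary key rejects a second (user_id, friend_id) pair.
try:
    conn.execute("INSERT INTO friend VALUES ('u1', 'u2', 5)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicate_rejected)
```

The composite key on friend means each directed pair can be stored only once, which is what the rejected second insert demonstrates.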
8. Let's query the database
• Plain SELECT statements
– SELECT * FROM user LIMIT 10;
– SELECT * FROM user;
• Specifying conditions
– SELECT * FROM user WHERE id = '109092915251428393573';
– SELECT * FROM user WHERE name = '飯塚修平';
– SELECT id, name FROM user WHERE created_at >
  DATE_SUB(NOW(), INTERVAL 10 MINUTE);
• Ranking of the most common full names
– SELECT user.name, COUNT(id) FROM user GROUP BY user.name
  ORDER BY COUNT(id) DESC LIMIT 10;
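To see what the name ranking returns, the sketch below runs the same GROUP BY query against a tiny in-memory SQLite table from Python. SQLite in place of MySQL, the reduced two-column user table, and the three sample users are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: only the two columns the queries need are modeled.
conn.execute("CREATE TABLE user (id VARCHAR(128) PRIMARY KEY, name VARCHAR(256))")

# Hypothetical users; two of them share a full name.
conn.executemany(
    "INSERT INTO user (id, name) VALUES (?, ?)",
    [("1", "Taro Tanaka"), ("2", "Taro Tanaka"), ("3", "Hanako Suzuki")],
)

# Same query as on the slide: count rows per name, most common first.
ranking = conn.execute(
    "SELECT user.name, COUNT(id) FROM user "
    "GROUP BY user.name ORDER BY COUNT(id) DESC LIMIT 10"
).fetchall()
print(ranking)

# A condition with a bound parameter instead of a quoted literal.
row = conn.execute("SELECT id, name FROM user WHERE id = ?", ("3",)).fetchone()
print(row)
```

Passing values as bound parameters (the `?` placeholders) rather than pasting them into the SQL string is the usual way to avoid quoting mistakes when queries are issued from application code.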
9. Let's query the database
• Ranking of companies by number of employees
– SELECT company.id, company.name, COUNT(user_employment.id) FROM
  user_employment LEFT JOIN company ON user_employment.company_id =
  company.id GROUP BY company.id ORDER BY COUNT(user_employment.id)
  DESC LIMIT 10;
• List of Google employees
– SELECT user.id, user.name FROM user LEFT JOIN user_employment ON
  user.id = user_employment.user_id LEFT JOIN company ON
  user_employment.company_id = company.id WHERE company.name =
  'Google';
• Mutual friends
– SELECT * FROM friend AS f1 LEFT JOIN friend AS f2 ON f1.friend_id =
  f2.user_id WHERE f1.user_id = '109092915251428393573' AND f2.friend_id =
  '113100517422007103669';
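The mutual-friends query is a self-join: f1 holds the rows "A is friends with X" and f2 the rows "X is friends with B". The sketch below, run against in-memory SQLite from Python, assumes each friendship is stored in both directions, and the user names are hypothetical. It uses a plain JOIN, since the WHERE condition on f2 discards unmatched rows anyway and so gives the same result as the LEFT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE friend (user_id VARCHAR(128), friend_id VARCHAR(128), "
    "nakayoshi INTEGER, PRIMARY KEY (user_id, friend_id))"
)


def add_friendship(a, b):
    """Store a friendship in both directions (an assumption about the data)."""
    conn.execute("INSERT INTO friend VALUES (?, ?, 0)", (a, b))
    conn.execute("INSERT INTO friend VALUES (?, ?, 0)", (b, a))


# Hypothetical users: alice and bob share exactly one friend, carol.
add_friendship("alice", "carol")
add_friendship("alice", "dave")
add_friendship("bob", "carol")
add_friendship("bob", "erin")


def mutual_friends(a, b):
    """Self-join: rows f1 (a -> x) joined with rows f2 (x -> b)."""
    rows = conn.execute(
        "SELECT f1.friend_id FROM friend AS f1 "
        "JOIN friend AS f2 ON f1.friend_id = f2.user_id "
        "WHERE f1.user_id = ? AND f2.friend_id = ? "
        "ORDER BY f1.friend_id",
        (a, b),
    ).fetchall()
    return [r[0] for r in rows]


print(mutual_friends("alice", "bob"))
```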
10. SQL is hard...
So why use a database in the first place?
13. Indexes
With a plain search:
SELECT name FROM user WHERE id = 1468;
"I'll keep looking until I find the row with id = 1468!"
id    name
1     Shuhei Iitsuka
2     Kazuya Kawakami
...   ...
1468  Taro Tanaka
...   ...
Search time: O(n).
That's slow.
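The O(n) cost can be made concrete with a short Python sketch that scans rows from the top and counts how many it examines. The 2,000-row table and the placeholder names are assumptions for illustration.

```python
def linear_search(rows, target_id):
    """Scan rows from the top; return (name, number of rows examined)."""
    examined = 0
    for row_id, name in rows:
        examined += 1
        if row_id == target_id:
            return name, examined
    return None, examined


# Hypothetical table: ids 1..2000 with placeholder names.
rows = [(i, f"user{i}") for i in range(1, 2001)]

name, examined = linear_search(rows, 1468)
print(name, examined)  # the row with id 1468 is the 1468th row examined
```

The number of rows examined grows in proportion to the position of the target, and a missing id forces a scan of the whole table.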
14. Indexes
With a binary-tree index in place:
SELECT name FROM user WHERE id = 1468;
"I'll find it with a binary tree search!"
          1000
     500       1500
  250   750 1250   1750
  ...
Search time: O(log(n)).
That's fast.
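The logarithmic behavior can be sketched in Python with a binary search over sorted ids that counts its steps. This illustrates the idea on the slide rather than how MySQL is implemented (InnoDB indexes are B-trees, though the lookup cost is likewise logarithmic). The 2,000-element list is an assumption.

```python
def binary_search(sorted_ids, target_id):
    """Return (position or None, number of halving steps taken)."""
    lo, hi = 0, len(sorted_ids) - 1
    steps = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        steps += 1
        if sorted_ids[mid] == target_id:
            return mid, steps
        if sorted_ids[mid] < target_id:
            lo = mid + 1   # target is in the upper half
        else:
            hi = mid - 1   # target is in the lower half
    return None, steps


# Hypothetical index: the sorted ids 1..2000.
sorted_ids = list(range(1, 2001))

position, steps = binary_search(sorted_ids, 1468)
print(position, steps)
```

For 2,000 ids the loop never needs more than 11 steps, compared with up to 2,000 row checks for a scan from the top.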
15. We see why using a database is a good idea,
but what should we do for our own service?
23. In practice
• Attach tables one at a time with LEFT JOIN
• Ranking of the most frequently used attacks
– SELECT ba.attack_id, SUM(ba.num) FROM battle_attack ba
  GROUP BY ba.attack_id;
– We want the attack names too... → LEFT JOIN attack
– SELECT a.attack_name, SUM(ba.num) FROM battle_attack ba
  LEFT JOIN attack a ON ba.attack_id = a.attack_id GROUP BY
  ba.attack_id;
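The two-step approach (aggregate by ID first, then join in the names) can be tried with the sketch below, which runs against in-memory SQLite from Python. The battle_id column, the attack names, and the counts are hypothetical. The ORDER BY is an addition that turns the totals into an actual ranking.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: only the columns used in the queries are modeled,
# plus a hypothetical battle_id to tell battles apart.
conn.executescript("""
CREATE TABLE attack (attack_id INTEGER PRIMARY KEY, attack_name VARCHAR(64));
CREATE TABLE battle_attack (battle_id INTEGER, attack_id INTEGER, num INTEGER);
""")
conn.executemany(
    "INSERT INTO attack VALUES (?, ?)",
    [(1, "Punch"), (2, "Kick")],
)
conn.executemany(
    "INSERT INTO battle_attack VALUES (?, ?, ?)",
    [(100, 1, 3), (100, 2, 1), (101, 1, 4), (101, 2, 2)],
)

# Step 1: totals per attack ID only.
totals_by_id = conn.execute(
    "SELECT ba.attack_id, SUM(ba.num) FROM battle_attack ba "
    "GROUP BY ba.attack_id ORDER BY ba.attack_id"
).fetchall()
print(totals_by_id)

# Step 2: LEFT JOIN attack to show names instead of IDs.
ranking = conn.execute(
    "SELECT a.attack_name, SUM(ba.num) AS total "
    "FROM battle_attack ba "
    "LEFT JOIN attack a ON ba.attack_id = a.attack_id "
    "GROUP BY ba.attack_id "
    "ORDER BY total DESC"
).fetchall()
print(ranking)
```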
24. Applying this to the earlier example
• List of Google employees
• First, suppose Google's ID is some known value
– SELECT user_id
  FROM user_employment
  WHERE company_id = (Google's ID);
• We want to filter by company name, so LEFT JOIN company
– SELECT user_id
  FROM user_employment ue
  LEFT JOIN company c ON ue.company_id = c.id
  WHERE c.name = 'Google';
• We also want the employees' names, so LEFT JOIN user
– SELECT u.id, u.name
  FROM user_employment ue
  LEFT JOIN company c ON ue.company_id = c.id
  LEFT JOIN user u ON ue.user_id = u.id
  WHERE c.name = 'Google';
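The three steps can be run one after another to check that each added JOIN keeps the same set of employees. The sketch below uses in-memory SQLite from Python. The user_employment columns are inferred from the queries, and the sample users, the second company, and the ID value 1 for Google are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: minimal versions of the three tables used by the queries.
conn.executescript("""
CREATE TABLE user (id VARCHAR(128) PRIMARY KEY, name VARCHAR(256));
CREATE TABLE company (id INTEGER PRIMARY KEY, name VARCHAR(256));
CREATE TABLE user_employment (user_id VARCHAR(128), company_id INTEGER);
""")
conn.executemany(
    "INSERT INTO user VALUES (?, ?)",
    [("u1", "Taro Tanaka"), ("u2", "Hanako Suzuki"), ("u3", "Jiro Sato")],
)
conn.executemany(
    "INSERT INTO company VALUES (?, ?)",
    [(1, "Google"), (2, "Example Corp")],
)
conn.executemany(
    "INSERT INTO user_employment VALUES (?, ?)",
    [("u1", 1), ("u2", 2), ("u3", 1)],
)

# Step 1: filter by a known company ID (1 is Google in this sample data).
step1 = conn.execute(
    "SELECT user_id FROM user_employment "
    "WHERE company_id = ? ORDER BY user_id",
    (1,),
).fetchall()

# Step 2: LEFT JOIN company so the filter can use the company name.
step2 = conn.execute(
    "SELECT user_id FROM user_employment ue "
    "LEFT JOIN company c ON ue.company_id = c.id "
    "WHERE c.name = 'Google' ORDER BY user_id"
).fetchall()

# Step 3: LEFT JOIN user to output each employee's name as well.
step3 = conn.execute(
    "SELECT u.id, u.name FROM user_employment ue "
    "LEFT JOIN company c ON ue.company_id = c.id "
    "LEFT JOIN user u ON ue.user_id = u.id "
    "WHERE c.name = 'Google' ORDER BY u.id"
).fetchall()

print(step1, step2, step3)
```

Building the query one JOIN at a time makes it easy to spot the step at which rows go missing or get duplicated.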