The document discusses distances between data points and similarity measures in data analysis. It introduces distance as a quantitative measure of how different two data points are, with smaller distances indicating greater similarity. Distances are useful for tasks such as clustering, anomaly detection, recognition, and measuring approximation error. The most common measure, Euclidean distance, is explained for vectors of any dimension via the geometric notion of a norm. Caution is advised when computing distances between data whose features have differing scales.
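As a concrete illustration of the Euclidean (2-norm) distance and the scale caveat mentioned above, here is a minimal NumPy sketch; the function names and the toy data matrix are illustrative assumptions, not taken from the document.

```python
import numpy as np

def euclidean_distance(x, y):
    """Euclidean distance between two vectors of any dimension: the 2-norm of x - y."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.linalg.norm(x - y)

def standardized_distance(X, i, j):
    """Distance between rows i and j of X after per-feature standardization.

    Features with very different scales can dominate a raw Euclidean distance,
    so rescaling each feature to zero mean and unit variance first is a common precaution.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    return np.linalg.norm(Z[i] - Z[j])

if __name__ == "__main__":
    X = np.array([[1.0, 1000.0], [2.0, 1100.0], [1.5, 5000.0]])  # toy data, second feature has a much larger scale
    print(euclidean_distance(X[0], X[1]))   # dominated by the large-scale feature
    print(standardized_distance(X, 0, 1))   # both features contribute comparably
```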
The document discusses the Optuna hyperparameter optimization framework, highlighting features such as define-by-run search-space construction, pruning, and distributed optimization. It provides examples of successful applications in competitions and introduces hyperparameter tuning for LightGBM. Additionally, it outlines the installation procedure, the key components of Optuna, and LightGBMTuner for automated optimization.
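To make "define-by-run" concrete, below is a minimal sketch of an Optuna study on a toy quadratic objective; the objective function and parameter names are my own illustration, not from the document. The search space is declared dynamically inside the objective via `trial.suggest_*` calls rather than up front.

```python
import optuna

def objective(trial):
    # Define-by-run: the search space is constructed while the objective executes.
    x = trial.suggest_float("x", -10.0, 10.0)
    y = trial.suggest_float("y", -10.0, 10.0)
    return (x - 2.0) ** 2 + (y + 3.0) ** 2  # toy quadratic to minimize

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=100)
print(study.best_params, study.best_value)
```

For the LightGBM case mentioned in the summary, Optuna also ships an integration module (`optuna.integration.lightgbm`) whose LightGBMTuner automates tuning of LightGBM's main hyperparameters; its interface is designed to mirror `lightgbm.train`.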
The document discusses the rights of data subjects under the EU GDPR, particularly regarding automated decision-making and profiling. It outlines conditions under which such decisions can be made, emphasizing the need for measures that protect the data subjects' rights and freedoms. Additionally, it includes references to various machine learning and artificial intelligence interpretability frameworks and studies.
The document outlines strategies for enhancing research efficiency, emphasizing the importance of effective literature review, management skills, and collaborative efforts among researchers. It discusses two main methods for skill enhancement: learning from peers and leveraging online resources, while highlighting the challenges and advantages of each approach. Additionally, it provides insights into the dynamics of various research labs, communication practices, and the value of sharing knowledge across institutions.
This material mainly explains BERT; compared with BERT, XLNet and RoBERTa are not covered in as much detail.
Also note that my own figures read from top to bottom, while the figures borrowed from the papers read from bottom to top.
If you find any mistakes, please let me know and I will fix them.
(In particular, I am a little worried that I may have misread the English of the RoBERTa paper. Sorry for the excuse.)
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
XLNet: Generalized Autoregressive Pretraining for Language Understanding
RoBERTa: A Robustly Optimized BERT Pretraining Approach
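Not part of the original slides: as a quick illustration of what BERT's bidirectional masked-language-model pre-training (the first paper above) enables, here is a hedged sketch using the Hugging Face transformers library; the model name and example sentence are assumptions for demonstration only.

```python
from transformers import pipeline

# Fill-mask uses BERT's masked-language-model head: the prediction for the
# [MASK] token conditions on context from both the left and the right.
unmasker = pipeline("fill-mask", model="bert-base-uncased")
for candidate in unmasker("BERT is a [MASK] language model."):
    print(candidate["token_str"], round(candidate["score"], 3))
```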