2017 TensorFlow Dev Summit (Sequence Models and the RNN API)
Presented on February 22, 2017 at Maru180, for the 2017 TensorFlow Dev Summit Extended Seoul hosted by GDG Seoul.
A recap of the Sequence Models and the RNN API session.
29. Feeding Sequence Data
SequenceExample proto to store sequences
Efficient storage of multiple sequences
Per-time-step variable feature counts
Efficient parser op:
tf.parse_single_sequence_example
Coming soon: first-class citizen in TensorFlow Serving
https://www.tensorflow.org/api_docs/python/tf/parse_single_sequence_example
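As a minimal sketch (not taken from the slides; the feature names "length", "tokens", and "labels" are illustrative), parsing one serialized SequenceExample with the API above might look like this:

import tensorflow as tf

# Hypothetical serialized SequenceExample, e.g. read from a TFRecord file.
serialized = tf.placeholder(tf.string, shape=[])

# Context features hold per-sequence values; sequence features hold one value
# per time step, so their length can differ from example to example.
context_parsed, sequence_parsed = tf.parse_single_sequence_example(
    serialized,
    context_features={
        "length": tf.FixedLenFeature([], dtype=tf.int64),
    },
    sequence_features={
        "tokens": tf.FixedLenSequenceFeature([], dtype=tf.int64),
        "labels": tf.FixedLenSequenceFeature([], dtype=tf.int64),
    })

length = context_parsed["length"]     # scalar
tokens = sequence_parsed["tokens"]    # shape [time_steps]
labels = sequence_parsed["labels"]    # shape [time_steps]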
31. Batching Sequence Data: Static Padding
Pad each input sequence yourself, then use a FIFOQueue:
tf.train.batch()
https://www.tensorflow.org/api_docs/python/tf/train/batch
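A sketch of the static-padding pattern under the API above; MAX_LEN and the placeholder are illustrative stand-ins for the parsed tensors, and tf.train.start_queue_runners must be called at run time to fill the queue:

import tensorflow as tf

MAX_LEN = 20  # fixed maximum length chosen ahead of time (illustrative)

# Stand-in for a variable-length token sequence from the parsing step above.
tokens = tf.placeholder(tf.int64, shape=[None])
length = tf.size(tokens)

# Pad every example yourself to the same static length; with fully defined
# shapes, tf.train.batch can stack them using a plain FIFOQueue.
pad_amount = MAX_LEN - tf.shape(tokens)[0]
padded = tf.pad(tokens, [[0, pad_amount]])
padded.set_shape([MAX_LEN])

token_batch, length_batch = tf.train.batch(
    [padded, length],
    batch_size=32,
    num_threads=4,
    capacity=1000)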
32. Batching Sequence Data: Dynamic Padding
Use a PaddingFIFOQueue:
tf.train.batch(..., dynamic_pad=True)
https://www.tensorflow.org/api_docs/python/tf/train/batch
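The same batching call with dynamic padding enabled, again with an illustrative placeholder standing in for the parsed sequence:

import tensorflow as tf

# Stand-in for a variable-length token sequence from the parsing step above.
tokens = tf.placeholder(tf.int64, shape=[None])
length = tf.size(tokens)

# dynamic_pad=True switches tf.train.batch to a PaddingFIFOQueue: each batch
# is padded only up to the longest sequence it contains, so no global maximum
# length has to be chosen in advance.
token_batch, length_batch = tf.train.batch(
    [tokens, length],
    batch_size=32,
    dynamic_pad=True,
    capacity=1000)
# token_batch has shape [32, longest_sequence_in_this_batch]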
33. Batching Sequence Data: Bucketing
Use N + 1 queues with conditional enqueueing:
tf.contrib.training.bucket_by_sequence_length(..., dynamic_pad=True)
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/api_docs/python/functions_and_classes/shard8/tf.contrib.training.bucket_by_sequence_length.md
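A sketch of bucketing with the contrib call above; the bucket boundaries are illustrative, and the (lengths, outputs) return order follows the linked contrib docs, so it is worth checking against your TF version:

import tensorflow as tf

# Stand-ins for variable-length tokens and per-step labels.
tokens = tf.placeholder(tf.int64, shape=[None])
labels = tf.placeholder(tf.int64, shape=[None])
seq_len = tf.shape(tokens)[0]  # int32 scalar length of this example

# Examples are routed into length buckets (N bucket queues plus the input
# queue), so sequences of similar length end up in the same padded batch.
length_batch, (token_batch, label_batch) = \
    tf.contrib.training.bucket_by_sequence_length(
        input_length=seq_len,
        tensors=[tokens, labels],
        batch_size=32,
        bucket_boundaries=[10, 20, 40, 80],  # illustrative boundaries
        dynamic_pad=True,
        capacity=64)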
34. Batching Sequence Data: Truncated BPTT via State Saver
Use Barrier + Queues; you must call save_state at each training step:
tf.contrib.training.batch_sequences_with_states()
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/api_docs/python/contrib.training.md
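A rough sketch of the state-saver pattern; the keyword names follow the linked contrib docs, all sizes and tensors are illustrative, and in a real model the RNN's final state would replace new_state:

import tensorflow as tf

NUM_UNROLL = 20   # truncated-BPTT window (illustrative)
STATE_SIZE = 128  # illustrative RNN state size

# Stand-ins for per-example tensors produced by the input pipeline.
example_key = tf.placeholder(tf.string, shape=[])
tokens = tf.placeholder(tf.float32, shape=[None, 10])  # [time, features]
length = tf.placeholder(tf.int32, shape=[])

# Each sequence is split into NUM_UNROLL-step chunks; the named state is
# carried between chunks of the same sequence.
batch = tf.contrib.training.batch_sequences_with_states(
    input_key=example_key,
    input_sequences={"tokens": tokens},
    input_context={},
    input_length=length,
    initial_states={"state": tf.zeros([STATE_SIZE], tf.float32)},
    num_unroll=NUM_UNROLL,
    batch_size=32)

chunk = batch.sequences["tokens"]   # [batch, NUM_UNROLL, 10]
prev_state = batch.state("state")   # state carried over from the previous chunk
new_state = prev_state              # stand-in for the RNN's final state
save_op = batch.save_state("state", new_state)  # must run every training step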
53. Types of Fusion
XLA-fused time steps
Manually fused time steps
Manually fused loops
Fusion tradeoffs:
Flexibility vs. speed
Works everywhere vs. fast on XLA targets (GPU, Android, ...)
54. XLA (Accelerated Linear Algebra) is a domain-specific compiler for linear algebra that optimizes TensorFlow computations. The results are improvements in speed, memory usage, and portability on server and mobile platforms. Initially, most users will not see large benefits from XLA, but are welcome to experiment by using XLA via just-in-time (JIT) compilation or ahead-of-time (AOT) compilation. Developers targeting new hardware accelerators are especially encouraged to try out XLA.
https://www.tensorflow.org/versions/master/experimental/xla/
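For reference, JIT compilation can be enabled session-wide in TF 1.x roughly as follows (the toy graph is illustrative):

import tensorflow as tf

# Turn on just-in-time XLA compilation for the whole session; TensorFlow will
# then fuse eligible subgraphs into XLA-compiled kernels.
config = tf.ConfigProto()
config.graph_options.optimizer_options.global_jit_level = (
    tf.OptimizerOptions.ON_1)

with tf.Session(config=config) as sess:
    x = tf.random_normal([1024, 1024])
    y = tf.matmul(x, x) + x  # candidate cluster for XLA fusion
    sess.run(y)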
61. Dynamic Decoder
New object-oriented API
Under active development
Base decoder library for the open-source Neural Machine Translation tutorial (coming soon)
tf.contrib.seq2seq
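A minimal sketch of the object-oriented decoding API as it appeared in early tf.contrib.seq2seq; all sizes and inputs are illustrative, and since the API was under active development the exact return values are worth checking against your TF version:

import tensorflow as tf

BATCH, MAX_TIME, EMB, UNITS = 32, 20, 64, 128  # illustrative sizes

# Stand-in for embedded decoder inputs (e.g. target tokens during training).
decoder_inputs = tf.random_normal([BATCH, MAX_TIME, EMB])
lengths = tf.fill([BATCH], MAX_TIME)

cell = tf.contrib.rnn.LSTMCell(UNITS)
helper = tf.contrib.seq2seq.TrainingHelper(decoder_inputs, lengths)
decoder = tf.contrib.seq2seq.BasicDecoder(
    cell, helper, initial_state=cell.zero_state(BATCH, tf.float32))

# dynamic_decode drives the decoder step by step until every sequence is done;
# the first element of the returned tuple holds the decoder outputs.
decoded = tf.contrib.seq2seq.dynamic_decode(decoder)
rnn_outputs = decoded[0].rnn_output  # [BATCH, <= MAX_TIME, UNITS]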