You and Your Research
LLMs Perspective
Dr Mohamed Elawady
Department of Computer and Information Sciences
University of Strathclyde
4th ML/AI Workshop
14th Sep 2023
Agenda
• Introduction: LLMs
• History of LLMs
• LLMs + Chatbots
• LLMs + Research
2
https://www.reddit.com/r/ChatGPTMemes/comments/102mvys/yours_sincerely_chatgpt/?rdt=43569
"I visualise a time when we will be to robots what dogs are to humans, and I'm rooting for the machines."
Claude Shannon (1916-2001)
Introduction: LLMs
Large Language Model (LLM): Natural Language Processing (NLP) + Deep Learning (DL)
• Basic: Input (text), Output (text)
• How: self-supervised and semi-supervised training over massive text datasets (terabytes in size); chat-tuned variants are further aligned with reinforcement learning from human feedback (RLHF). (See the sketch below.)
3
https://lifearchitect.ai/models/
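As a concrete illustration of the text-in, text-out interface above, here is a minimal sketch using the Hugging Face transformers library, with the small GPT-2 model standing in for a full-scale LLM (the model choice and prompt are illustrative, not part of the original slides):

# Minimal text-in/text-out demo with a small causal language model.
# GPT-2 is used only because it is small; larger LLMs expose the same interface,
# learned by self-supervised next-token prediction over large text corpora.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "Large language models are"
result = generator(prompt, max_new_tokens=30, num_return_sequences=1)
print(result[0]["generated_text"])

Larger models behave the same way; only the scale of parameters, data and compute changes.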
History of LLMs
4
Zhao, Wayne Xin, et al. "A survey of large language models." arXiv preprint arXiv:2303.18223 (2023).
• What's behind
  • Transformers (attention sketch below)
  • Massive data
  • GPUs
• Popular
  • OpenAI GPT-3/4
  • Google Bard
  • Meta LLaMA
  • Google T5
  • BLOOM
• Coming soon!
  • DeepMind Gemini
  • OpenAI GPT-5
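To make the "Transformers" ingredient concrete, here is a minimal NumPy sketch of scaled dot-product attention, the core operation inside every Transformer block (the shapes and random values are purely illustrative):

# Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                           # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                                         # weighted mix of the values

# Toy example: 3 tokens, 4-dimensional embeddings
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 4)

Stacking many such attention layers, training them on massive text corpora and running them on GPUs is, in outline, what the models listed above have in common.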
LLMs + Chatbots
• GPT-3.5/4 + ChatGPT (OpenAI)
• LaMDA + Bard (Google)
• GPT-4 + Bing (Microsoft)
• GPT-4 + YouChat (You.com)
• Claude + Claude AI (Anthropic)
• GPT-4 + ChatSonic (Writesonic)
5
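Each pairing above is a chat front end wrapped around an underlying LLM. A minimal sketch of such a call, using the OpenAI Python client as it existed around the time of this talk (the pre-1.0 ChatCompletion interface); the model name, prompt and environment variable are assumptions for illustration:

# Minimal chat call: the chatbot UI essentially wraps a request like this
# around the underlying LLM.
import os
import openai  # openai<1.0 interface, as current at the time of this talk

openai.api_key = os.environ["OPENAI_API_KEY"]  # assumes an API key is set in the environment

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",  # placeholder model name
    messages=[
        {"role": "system", "content": "You are a helpful research assistant."},
        {"role": "user", "content": "Summarise what a large language model is in one sentence."},
    ],
)
print(response["choices"][0]["message"]["content"])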
LLMs + Research
• Sentence-BERT / T5 / GPT-3 + Elicit
• SciBERT + Scite Assistant
• GPT-4 + Consensus
6
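Research assistants of this kind typically embed a question and candidate paper abstracts and rank them by similarity. A minimal sketch of that retrieval step with the sentence-transformers (Sentence-BERT) library; the model name and example abstracts are illustrative assumptions:

# Semantic search over abstracts with Sentence-BERT-style embeddings,
# roughly the retrieval step behind literature assistants such as Elicit.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # small illustrative model

query = "How are large language models trained?"
abstracts = [
    "We train a 175B-parameter language model with self-supervised learning on web text.",
    "We study symmetry detection in natural images using wavelet features.",
]

query_emb = model.encode(query, convert_to_tensor=True)
abstract_embs = model.encode(abstracts, convert_to_tensor=True)

scores = util.cos_sim(query_emb, abstract_embs)[0]  # cosine similarity to each abstract
best = int(scores.argmax())
print(abstracts[best])

A generative model (e.g. GPT-3/4) can then summarise or answer questions from the top-ranked abstracts.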
More Resources
• LLM Introduction: Learn Language Models, GitHub Gist:
  https://gist.github.com/rain-1/eebd5e5eb2784feecf450324e3341c8d
• Awesome-LLM: a curated list of Large Language Model resources, GitHub:
  https://github.com/Hannibal046/Awesome-LLM
• Demos on the Hugging Face platform (signup required; see the local sketch below)
  • Text-to-Text Generation: https://huggingface.co/google/flan-t5-base
  • Text Summarization: https://huggingface.co/facebook/bart-large-cnn
  • Text Generation: https://huggingface.co/bigscience/bloom
7
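The Hugging Face demos above can also be run locally in a few lines. A minimal sketch of the summarization demo with facebook/bart-large-cnn (the input text and generation parameters are illustrative):

# Local version of the Hugging Face summarization demo linked above.
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

text = (
    "Large language models combine natural language processing with deep learning. "
    "They are trained on terabytes of text with self-supervised objectives and can "
    "generate, summarise and answer questions about text."
)
summary = summarizer(text, max_length=40, min_length=10, do_sample=False)
print(summary[0]["summary_text"])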
References
• (GPT-3) Brown, Tom, et al. "Language models are few-shot learners." Advances in Neural Information Processing Systems 33 (2020): 1877-1901.
• (GPT-4) OpenAI. "GPT-4 Technical Report." arXiv preprint arXiv:2303.08774 (2023).
• (LaMDA) Thoppilan, Romal, et al. "LaMDA: Language models for dialog applications." arXiv preprint arXiv:2201.08239 (2022).
• (SciBERT) Beltagy, Iz, Kyle Lo, and Arman Cohan. "SciBERT: A pretrained language model for scientific text." arXiv preprint arXiv:1903.10676 (2019).
• (Sentence-BERT) Reimers, Nils, and Iryna Gurevych. "Sentence-BERT: Sentence embeddings using Siamese BERT-networks." arXiv preprint arXiv:1908.10084 (2019).
• (T5) Raffel, Colin, et al. "Exploring the limits of transfer learning with a unified text-to-text transformer." The Journal of Machine Learning Research 21.1 (2020): 5485-5551.
• (LLaMA) Touvron, Hugo, et al. "LLaMA: Open and efficient foundation language models." arXiv preprint arXiv:2302.13971 (2023).
• (BLOOM) Scao, Teven Le, et al. "BLOOM: A 176B-parameter open-access multilingual language model." arXiv preprint arXiv:2211.05100 (2022).
• (PaLM) Chowdhery, Aakanksha, et al. "PaLM: Scaling language modeling with pathways." arXiv preprint arXiv:2204.02311 (2022).
• (Chinchilla) Hoffmann, Jordan, et al. "Training compute-optimal large language models." arXiv preprint arXiv:2203.15556 (2022).
8