Difference between revisions of "AI tutorials"
KevinYager (talk | contribs) (→LLM) | KevinYager (talk | contribs) (→Diffusion)
(14 intermediate revisions by the same user not shown)
==General==
* [https://mlu-explain.github.io/ MLU-EXPLAIN]
* [https://arxiv.org/abs/2503.02113 Deep Learning is Not So Mysterious or Different]
* [https://theaidigest.org/ AI Digest]: Interactive AI explainers
* [https://academy.openai.com/ OpenAI Academy]
* [https://goyalpramod.github.io/blogs/evolution_of_LLMs/ Evolution of LLMs]
==Loss Functions==
* [https://gombru.github.io/2018/05/23/cross_entropy_loss/ Understanding Categorical Cross-Entropy Loss, Binary Cross-Entropy Loss, Softmax Loss, Logistic Loss, Focal Loss and all those confusing names]

==Diffusion==
* 2024-06: [https://arxiv.org/abs/2406.08929 Step-by-Step Diffusion: An Elementary Tutorial]
* 2025-04: [https://sander.ai/2025/04/15/latents.html Generative modelling in latent space]
==Transformer==
* Low-level:
** [https://peterbloem.nl/ Peter Bloem]: [https://peterbloem.nl/blog/transformers Transformers from scratch]
** [https://www.brandonrohrer.com/blog.html Brandon Rohrer]: [https://e2eml.school/transformers.html Transformers from Scratch]
* Wolfram: [https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/ What Is ChatGPT Doing … and Why Does It Work?]
* [https://jalammar.github.io/illustrated-transformer/ The Illustrated Transformer]
* [https://towardsdatascience.com/transformers-explained-visually-not-just-how-but-why-they-work-so-well-d840bd61a9d3 Transformers Explained Visually — Not Just How, but Why They Work So Well]
* [https://erdem.pl/2021/05/understanding-positional-encoding-in-transformers#positional-encoding-visualization Understanding Positional Encoding in Transformers]

===Visualizations===
* [https://bbycroft.net/llm LLM Visualization]
==LLM==
* [https://transformer-circuits.pub/2022/toy_model/index.html Toy Models of Superposition]
* [https://www.techrxiv.org/doi/full/10.36227/techrxiv.23589741.v1 A Survey on Large Language Models: Applications, Challenges, Limitations, and Practical Usage]
* [https://aman.ai/ Aman AI]: [https://aman.ai/primers/ai/LLM/ Overview of Large Language Models]
===Video===
* Andrej Karpathy:
** [https://www.youtube.com/watch?v=EWvNQjAaOHw How I use LLMs]
** [https://www.youtube.com/watch?v=7xTGNNLPyMI Deep Dive into LLMs like ChatGPT]

===Prompt Engineering===
* 2025-04: Lee Boonstra (Google): [https://www.kaggle.com/whitepaper-prompt-engineering Prompt Engineering]
==Other Visualizations==
* [https://distill.pub/2019/visual-exploration-gaussian-processes/ A Visual Exploration of Gaussian Processes]
* [https://pytorch.org/blog/inside-the-matrix/ Inside the Matrix: Visualizing Matrix Multiplication, Attention and Beyond]
* [https://sohl-dickstein.github.io/2024/02/12/fractal.html Neural network training makes beautiful fractals]
* [https://github.com/apple/embedding-atlas?tab=readme-ov-file Embedding Atlas] ([https://apple.github.io/embedding-atlas/ demo])
Latest revision as of 11:08, 15 August 2025
Contents
General
- MLU-EXPLAIN
- Deep Learning is Not So Mysterious or Different
- AI Digest: Interactive AI explainers
- OpenAI Academy
- Evolution of LLMs
Loss Functions
Diffusion
- 2024-06: Step-by-Step Diffusion: An Elementary Tutorial
- 2025-04: Generative modelling in latent space
Transformer
- Low-level:
- Wolfram: What Is ChatGPT Doing … and Why Does It Work?
- The Illustrated Transformer
- Transformers Explained Visually — Not Just How, but Why They Work So Well
- Understanding Positional Encoding in Transformers
Visualizations
LLM
- Toy Models of Superposition
- A Survey on Large Language Models: Applications, Challenges, Limitations, and Practical Usage
- Aman AI: Overview of Large Language Models
- Awesome-LLM: Curated list of LLM projects
- The Big Book of Large Language Models (Damien Benveniste)
- 2025-01: Foundations of Large Language Models
Video
- Andrej Karpathy:
Prompt Engineering
- 2025-04: Lee Boonstra (Google): Prompt Engineering

