Notebooks — Overview
(1) Activation Functions
(2) Artificial Neural Networks (Basic Architecture)
(3) Attention & Multi-Head Attention
(4) Backpropagation
(5) Backpropagation (Generalization)
(6) Backpropagation Through Time (BPTT)
(7) Bias & Variance (Machine Learning)
(8) Bias-Variance Decomposition
(9) Building a GPT-Style LLM from Scratch
(10) Building a Word Tokenizer from Scratch
(11) Byte-Pair Encoding Tokenization
(12) Curse of Dimensionality
(13) Data Batching for Training LLMs
(14) Data Normalization — Motivation & Overview
(15) Data Preparation for Training LLMs — An Overview
(16) Decision Trees
(17) Decision Trees — CART (Classification and Regression Trees)
(18) Dropout
(19) Gradient Descent with Momentum
(20) Gradient Descent — The (Very) Basics
(21) Handwritten Digit Recognition with Artificial Neural Networks (ANNs)
(22) Implementing an ANN from Scratch (NumPy only)
(23) Language Models
(24) Linear Regression
(25) Linear Regression — Assumptions & Caveats
(26) LoRA Fine-Tuning — A Basic Example
(27) Logistic Regression — Basics
(28) Logistic Regression — The Math
(29) Logit Distillation
(30) Machine Translation with Transformers
(31) Masking in Sequence Models
(32) Mixture of Experts (MoE)
(33) Model Fine-Tuning for LLMs — An Overview
(34) Multinomial Naive Bayes (Basics)
(35) NumPy — Basic Tutorial
(36) Part-of-Speech (POS) Tagging (Basics)
(37) Porter Stemmer
(38) Positional Encodings — Overview
(39) RNN-based Language Models
(40) Recurrent Neural Networks — An Introduction
(41) Resource-Efficient LLMs — An Overview
(42) Retrieval-Augmented Generation (RAG) — A (Very) Basic Example
(43) Retrieval-Augmented Generation (RAG) — Basics
(44) Rotary Position Embeddings (RoPE)
(45) Sinusoidal Positional Encodings (Original Transformer)
(46) Stemming & Lemmatization
(47) Subword Tokenization (WordPiece)
(48) Text Classification with Recurrent Neural Networks (RNNs)
(49) Text Normalization
(50) Text Tokenization
(51) The AdaGrad Optimizer
(52) The Adam Optimizer
(53) The Linear Layer
(54) The Math Behind Linear Regression
(55) The RMSProp Optimizer
(56) The Softmax Function
(57) Token Indexing with Vocabularies
(58) Training Word2Vec from Scratch
(59) Transformers — Basic Architecture
(60) Using Pretrained LLMs Locally — A Starter Guide
(61) Vector Space Model
(62) Word & Text Embeddings — An Overview
(63) Working with Batches for Sequence Tasks
(64) Working with the OpenAI API — An Introduction
(This list of notebooks is auto-generated.)