Roadmap

Your journey to a first break in AI

Step-by-step learning path: set up a Quarto blog, run models locally, learn inference and training, build an AI product, and ship a capstone project.

Your learning path, step by step. The cohort runs 1 May 2026 to 30 June 2026 (two months). Use your AI-based IDE and the community to complete each step. This roadmap is a work in progress; new steps are added as the cohort grows.

Important: Before you begin

Make sure you’ve completed the Checklist — set up your accounts and join the Discord so you can ask questions and share progress as you go.

Step 1

First use of AI for coding

Set up a Quarto blog and host it on GitHub with an about-me page, blog posts, “Today I learned,” and other pages.


Tasks:

  1. Set up the project locally, link it to a GitHub repo, and configure GitHub Pages for deployment.
  2. Use your AI-based IDE to complete this setup.

You will learn:

  • GitHub basics refresher — branches, commits, PRs, merge conflicts
  • Setting up a personal and blogging website with Quarto
  • How AI coding tools and SWE agents work in practice


Step 2

Run a model locally

Run Qwen3 0.6B locally in pure C — trace every operation from tokenization to token output.


You will learn:

  • Basics of inference: decoding, KV cache, sampling
  • Chat templates and system prompts (ChatML format, special tokens)
  • Tokenization — why subword units, vocabulary size trade-offs
  • Temperature, top-p, and how they control output randomness
  • GGUF vs SafeTensors — model file formats and quantization
  • Transformer architecture: self-attention, MHA, positional encoding, decoder-only design
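Temperature and top-p are easiest to understand by implementing them once. Below is a minimal sketch in plain Python (the function name and defaults are illustrative, not from any library): scale the logits by temperature, softmax them, keep the smallest set of tokens whose cumulative probability reaches `top_p`, and sample from that nucleus.

```python
import math
import random

def sample(logits, temperature=0.8, top_p=0.9, rng=random.Random(0)):
    """Sample a token id from raw logits with temperature and top-p filtering."""
    # Temperature scales the logits: lower -> sharper (more deterministic),
    # higher -> flatter (more random).
    scaled = [l / temperature for l in logits]
    # Softmax (subtract the max for numerical stability).
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Top-p (nucleus): keep the highest-probability tokens until their
    # cumulative probability reaches top_p; everything else is discarded.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in ranked:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    # Renormalise over the nucleus and draw one token.
    z = sum(probs[i] for i in kept)
    r = rng.random() * z
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]
```

At temperature near zero the nucleus collapses to the single highest-logit token, which is why "temperature 0" behaves like greedy decoding.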

Guides:

  • Run Qwen3 0.6B in pure C — inference from first principles (tokenization, chat templates, attention, KV cache).
  • GGUF vs SafeTensors — model weight formats, security, quantization, and why we start with pure C.

Learning resources — follow in this order:

  1. KV Cache explained (video, ~10 min) — start here. Visual walkthrough of why the KV cache exists, how it grows with context length, and what practical models do to keep it small. Directly answers why GPU memory is the bottleneck in inference.
  2. The Illustrated Transformer — Jay Alammar — the most referenced visual guide to transformer architecture. Covers embeddings, self-attention, multi-head attention, and positional encoding with animated diagrams. Read this before the paper — it will make the paper feel simple.
  3. Attention Is All You Need (2017 paper) — the original transformer paper. After reading the Alammar blog, this is approachable. The key contributions: removal of recurrence, self-attention as a core operator, multi-head attention, positional encoding. Shorter than you expect (~15 pages).
  4. Agentic models are the future — Junyang Lin — former Qwen team lead on why reasoning/thinking models are a stepping stone, and why agentic models (RL-trained, tool-using, multi-step) are where the field is heading. Read this last — it gives you a mental map of where everything you are learning is going.


Step 3

Inference deep dive

Go beyond running a model — understand how inference works under the hood and how to serve models at scale.


You will learn:

  • Inference engines and runtimes (vLLM, TGI, llama.cpp server)
  • Batching, continuous batching, and throughput vs. latency trade-offs
  • Quantization (GGUF, GPTQ, AWQ) and when to use each format
  • Speculative decoding — how draft models speed up large model inference
  • Structured output, function calling, and tool use
  • Serving and API design for inference endpoints
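The core idea of continuous batching can be shown with a toy scheduler (everything here is illustrative; real engines like vLLM schedule at the token level with paged KV memory). Each "request" is just a number of decode steps; when a sequence finishes, a waiting request immediately takes its slot instead of waiting for the whole batch to drain.

```python
from collections import deque

def run_batch(requests, batch_size=2):
    """Toy continuous-batching loop over (request_id, steps_needed) pairs.
    Returns the step at which each request finished."""
    waiting = deque(requests)   # requests not yet admitted to the batch
    active = {}                 # request_id -> decode steps remaining
    finish_step = {}            # request_id -> step at which it completed
    step = 0
    while waiting or active:
        # Fill free slots from the queue (this is the "continuous" part):
        # admission happens whenever a slot opens, not per full batch.
        while waiting and len(active) < batch_size:
            rid, steps = waiting.popleft()
            active[rid] = steps
        step += 1
        # One decode step for every active sequence.
        for rid in list(active):
            active[rid] -= 1
            if active[rid] == 0:
                finish_step[rid] = step
                del active[rid]
    return finish_step
```

With static batching, request "c" below would wait for the whole first batch to finish; with continuous batching it starts as soon as "b" completes, which is exactly the throughput win vLLM-style engines exploit.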


Step 4

Training fundamentals

Build the foundations to train and fine-tune models from scratch — from a single GPU to distributed multi-node setups.


You will learn:

  • PyTorch fundamentals: tensors, autograd, training loops
  • Modelling: architectures (transformers, attention, MLP), building blocks from scratch
  • Data pipelines: datasets, dataloaders, preprocessing, tokenization at scale
  • Fine-tuning: LoRA, QLoRA, full fine-tune, adapters — when to use each
  • Distributed training: DDP, FSDP, multi-GPU and multi-node setups
  • Experiment tracking and evaluation (Weights & Biases, validation loss curves)
  • Parallelism strategies: tensor parallelism, pipeline parallelism, expert parallelism
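The "when to use each" question for LoRA mostly comes down to parameter counts. A quick sketch (sizes are illustrative, roughly a 7B-class attention projection): full fine-tuning updates the entire d_in x d_out weight matrix, while LoRA trains only two low-rank factors A (r x d_in) and B (d_out x r).

```python
def lora_params(d_in, d_out, r):
    """Trainable parameter counts for one weight matrix:
    full fine-tune vs a rank-r LoRA decomposition."""
    full = d_in * d_out          # every entry of W is trainable
    lora = r * d_in + d_out * r  # only the factors A and B are trainable
    return full, lora

# One 4096x4096 projection at rank 8:
full, lora = lora_params(4096, 4096, 8)
# -> full = 16_777_216, lora = 65_536 (~0.4% of the full count)
```

That two-orders-of-magnitude reduction in trainable (and optimizer-state) parameters is why LoRA and QLoRA fit on a single consumer GPU, while full fine-tunes generally need the distributed setups listed above.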

Projects you’ll be ready for:

  • nanoGPT speedrun — train GPT-2 scale to a target validation loss as fast as possible
  • Megatron / Picotron — read and understand production-grade distributed training code


Step 5 — coming soon

Build an AI product

Ship an AI-powered product end to end — from idea to deployed, monitored application.


You will learn:

  • Product thinking: problem → solution → users — scoping a project that can be shipped
  • Building with APIs, RAG, agents, and tool use
  • Frontend/backend integration for AI features
  • Deployment, monitoring, and iteration — keeping it running after launch
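The retrieval half of RAG fits in a few lines. Here is a toy sketch using bag-of-words vectors and cosine similarity (real products swap in a learned embedding model; the retrieval logic is the same):

```python
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=1):
    """Rank documents by similarity to the query and return the top k:
    the 'R' in RAG. The retrieved text is then pasted into the prompt."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]
```

Everything downstream (prompt assembly, the model call, citation of sources) builds on this retrieve-then-generate loop.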


Step 6 — coming soon

Capstone project or open-source contribution

Prove what you’ve learned. Pick one: ship a capstone project or make a meaningful contribution to an open-source AI project.


Options:

  • Capstone: End-to-end project combining inference, training, or product skills — deployed, documented, and added to your public portfolio
  • Open-source contribution: Submit a PR to an AI repo (model, library, dataset, docs) — get reviewed, merged, and credited
  • Present your work to the cohort; get peer feedback

Why it matters: A shipped project or merged PR is the strongest signal on your profile when applying for your first AI role.

More steps can be added as the roadmap grows. Suggest new modules via CONTRIBUTING.md or a pull request.