Practical Computer Vision: Zero to Hero Over 10 Topics

Course
49 Lessons

Build AI Systems: Sports Analytics with YOLOv8, AI Interior Design with Stable Diffusion, Influencer Advert with FLUX, and Invoice Parsing with Llama 3


Welcome

Image Classification with Hugging Face Transformers

Link to Code

Introduction to Image Classification and Data Preparation

Modeling and Training

Evaluation and Testing with Gradio

Model Deployment

Link to Code

Running Model with ONNX Runtime

Creating API with FastAPI

Ball Object Detection with Ultralytics YOLOv8

Link to Code

Understanding YOLOv8 Format

Loading Dataset and Training YOLOv8 Model

Running Inference on a Single Image

Running Inference on a Full Video

Player Detection with Grounding DINO on Hugging Face

Link to Code

Understanding Grounding DINO Model and Zero-Shot Object Detection

Prompting the Grounding DINO Model to Get Bounding Boxes

Carrying Out Inference on a Full Video

Player Tracking Throughout the Video with DeepSORT

Link to Code

Understanding DeepSORT and Implementing Player Tracking on Full Video

Filtering Out Non-Players

Field Projection with Homography

Link to Code

Drawing the Plane

Understanding Homography and Using It to Project the Pitch on a Plane

Key Point Detection with Ultralytics YOLO Pose Estimation Model

Link to Code

Preparing the Keypoint Detection Dataset and Training the YOLO-Pose Model

Running Inference on Full Video

Assignment

Interior Designer (Empty House Filling) with Stable Diffusion (Img2Img)

Link to Code

Understanding Stable Diffusion

Simple img2img Pipeline with RunwayML's Stable Diffusion 1.5

Interior Designer with Stable Diffusion Inpainting Model

Understanding the Inpainting Pipeline

Obtaining Depth Maps with Zero-Shot (Depth Anything) Model

Getting Segmentation Mask and Map with Grounding DINO and Segment Anything Model

Generating Interior

Influencer Generator with FLUX

Link to Code

Generating Influencer Images

Generating Images with Lower VRAM

Influencer with Product (Advert) Generation using Insert Anything Framework

Link to Code

Understanding Insert Anything Framework

Understanding LoRA Adapters

Integrating FLUX Fill Base Model, FLUX Redux, and Insert Anything Framework

OCR Data Extraction and Parsing of Invoices and Receipts

Link to Code

Data Extraction with PaddleOCR

Dataset Preparation

Low Rank Adaptation (LoRA)

Modeling and Training Llama 3.2 - 3B with Unsloth

Running Inference and Understanding Inference Parameters

Saving and Loading Model to Hugging Face Hub