Neuralearn dotAI/Practical Computer Vision: Zero to Hero Over 10 Topics

  • $49

Practical Computer Vision: Zero to Hero Over 10 Topics

  • Course
  • 49 Lessons

Build AI Systems: Sports Analytics with YOLOv8, AI Interior Design with Stable Diffusion, Influencer Advert with FLUX & Invoice Parsing with Llama 3

Contents

Welcome

Image Classification with Hugging Face Transformers

Link to Code
Introduction to Image Classification and Data Preparation
Modeling and Training
Evaluation and Testing with Gradio

Model Deployment

Link to Code
Running Model with ONNX Runtime
Creating API with FastAPI

Ball Object Detection with Ultralytics YOLOv8

Link to Code
Understanding YOLOv8 Format
Loading Dataset and Training YOLOv8 Model
Running Inference on a Single Image
Running Inference on a Full Video

Player Detection with Grounding DINO on Hugging Face

Link to Code
Understanding Grounding DINO Model and Zero-Shot Object Detection
Prompting the DINO Model to Get Bounding Boxes
Carrying Out Inference on a Full Video

Player Tracking Throughout the Video with DeepSORT

Link to Code
Understanding DeepSORT and Implementing Player Tracking on Full Video
Filtering Out Non-Players

Field Projection with Homography

Link to Code
Drawing the Plane
Understanding Homography and Using It to Project the Pitch on a Plane

Key Point Detection with Ultralytics YOLO Pose Estimation Model

Link to Code
Preparing the Keypoint Detection Dataset and Training the YOLO-Pose Model
Running Inference on Full Video
Assignment

Interior Designer (Empty House Filling) with Stable Diffusion (Img2Img)

Link to Code
Understanding Stable Diffusion
Simple img2img Pipeline with RunwayML's Stable Diffusion 1.5

Interior Designer with Stable Diffusion Inpainting Model

Understanding the Inpainting Pipeline
Obtaining Depth Maps with Zero-Shot (Depth Anything) Model
Getting Segmentation Mask and Map with Grounding DINO and Segment Anything Model
Generating Interior

Influencer Generator with FLUX

Link to Code
Generating Influencer Images
Generating Images with Lower VRAM

Influencer with Product (Advert) Generation using InsertAnything Framework

Link to Code
Understanding InsertAnything Framework
Understanding LoRA Adapters
Integrating FLUX Fill Base Model, FLUX Redux, and Insert Anything Framework

OCR Data Extraction and Parsing of Invoices and Receipts

Link to Code
Data Extraction with PaddleOCR
Dataset Preparation
Low Rank Adaptation (LoRA)
Modeling and Training Llama 3.2 - 3B with Unsloth
Modeling and Training Llama 3.2 - 3B with Unsloth
Running Inference and Understanding Inference Parameters
Saving and Loading Model to Hugging Face Hub