Int8
Quantization Inference
How Int8 Quantized
Convolution Works
Sparseml Yolov5 Documentation
Amir GitHub
Use of Char Command in MATLAB
Awq0
Onnx vs Ultralytics
Porfelwirting Qshen with Awsar
Dyad Model
Tensorrt Dla
Int8 Quantization
Int8
Dynamic Model Quantization
Int8
Quantization
LLM Int4
Regression Quantileneural Network Matlab
Vision Language Model Quantization
Hawq Practical and Theory
Model Quantization
Meaning Quantaization Ai
19:55
Faster and Lighter Model Inference with ONNX Runtime from Cloud to Client
Aug 3, 2022
Microsoft
markdefalco
0:44
Quantization: What Everyone Gets Wrong (Accuracy Myths)
65 views
2 weeks ago
YouTube
Code & Capital
0:16
FP32 FP16 FP8 TENSOR INT #chatgpt #llm #google #tech #ytshorts #yt #youtube #youtubeshorts
61 views
1 month ago
YouTube
Amit_Chopra_assruc
0:41
Google magic bullet - TurboQuant #ai #gpu #google #chips #cuda #quantization
1.3K views
1 month ago
YouTube
Neural AI Flair
13:42
From 15GB to 4.7GB: Quantizing AI Models Locally
7.7K views
1 month ago
YouTube
NeuralNine
1:08:05
Tikhomirov M.M. - Training of large language models - 8. Inference, quantization
218 views
2 weeks ago
YouTube
teach-in
15:14
Why Inference is hard..
232 views
3 weeks ago
YouTube
Caleb Writes Code
6:29
Inference Optimization: Making AI Faster & Cheaper (Latency, Throughput & GPUs)
56 views
1 month ago
YouTube
wecite
7:29
Model Quantization Explained 8 bit, 4 bit & Inference Optimization #genai #aigenerated
32 views
2 months ago
YouTube
SmartSkale
2:36
I added KV caching and INT8 KV quantization to our transformer inference, improving throughput by 35x. All of this was done from scratch in Rust + CUDA, on top of a homemade ML framework. On a 4-token prompt with 252 generated tokens: Original: 0.76 tok/s; KV cache fp32: 27.21 tok/s; KV cache int8 (quantized): 27.29 tok/s. Try it out yourself here: https://t.co/kFS9Z0fs4h In practice: KV caching gave us about a 35x end-to-end speedup; INT8 KV cache kept roughly the same speed as fp32 but cut KV cac
48.8K views
3 weeks ago
x.com
Reese Chong
0:23
This is the clearest explanation of how LLM quantization works: It lets engineers compress a model by 4x and run it 2x faster without quality loss. I stumbled on this piece by Sam Rose and honestly wish I had it when I first tried to understand quantization. He breaks it down from absolute zero - bits, floats, how weights are stored - all the way to actually benchmarking quantized models. With interactive demos you can play with. Here's the core idea in 60 seconds: → An LLM is billions of numbers (we
7K views
2 weeks ago
x.com
Sukh Sroay
Z-Image-Turbo INT8 — AI Playground & API - deAPI.ai
3 weeks ago
deapi.ai
Model Precision and Deployment Choices in Object Detection | Mohammad Zaid posted on the topic | LinkedIn
4 views
3 weeks ago
linkedin.com
Edge ML Development for i.MX Processors
Aug 21, 2024
nxp.com
15:47
Discrete Math - 1.4.2 Quantifiers
226.9K views
Feb 25, 2020
YouTube
Kimberly Brehm
5:23
Rules of Inference for Quantified Statements (Part 1)
169.5K views
Jan 19, 2021
YouTube
Neso Academy
41:59
Sampling Theorem Quantization and Binary Coding
7.1K views
Apr 11, 2021
YouTube
Engineering with Bingabr
15:08
PREDICATE LOGIC and QUANTIFIER NEGATION - DISCRETE MATHEMATICS
551.3K views
Jul 17, 2017
YouTube
TrevTutor
12:35
The Mathematics of Quantum Computers | Infinite Series
715.7K views
Feb 16, 2017
YouTube
PBS Infinite Series
0:51
Quantization explained
504 views
3 months ago
YouTube
Chip Talks AI
9:57
What is LLM Quantization ?
3.2K views
Mar 19, 2025
YouTube
New Machina
2:11
NVIDIA Tesla T4 Introduction to Inference
3.7K views
Apr 18, 2019
YouTube
Boston Limited
1:16:40
Lecture 30: Quantized Training
3.3K views
Oct 7, 2024
YouTube
GPU MODE
5:22
1.2.2 Quantifying Information
26.3K views
Jul 12, 2019
YouTube
MIT OpenCourseWare
53:01
22. Sampling and Quantization
30.8K views
Mar 15, 2013
YouTube
MIT OpenCourseWare
12:30
41. Adaptive Quantization
28.1K views
Apr 24, 2018
YouTube
itechnica
12:10
Optimize Your AI - Quantization Explained
465.1K views
Dec 28, 2024
YouTube
Matt Williams
1:36
What Is Quantization? | Decoding LLM File Names
1.3K views
4 months ago
YouTube
Anaconda, Inc.
1:01
Towards Unified INT8 Training for Convolutional Neural Network
803 views
Jul 17, 2020
YouTube
ComputerVisionFoundation Videos
1:03:51
DSP Lecture 23: Introduction to quantization
39.8K views
Nov 24, 2014
YouTube
Rich Radke