Saturday, January 3, 2026

NVIDIA vs AMD GPU Architecture

Core Architecture

Aspect	NVIDIA	AMD
Main unit	SM (Streaming Multiprocessor)	CU (Compute Unit)
Core	CUDA Core	Stream Processor
Execution	Warp (32 threads)	Wavefront (64 threads on GCN/CDNA; 32 or 64 on RDNA)
Focus	Compute + AI	Gaming + Compute

NVIDIA pushes harder on compute and AI, while AMD tends to lead on price/performance for gaming.

AI / Matrix Acceleration

NVIDIA

Tensor Cores (RTX, A100, H100)
Supported formats (exact support varies by GPU generation):
- FP16
- BF16
- INT8 / INT4
Very strong for both AI training and inference
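
To make the low-precision formats above more concrete, here is a minimal Python sketch of symmetric INT8 quantization, one common way to prepare values for integer inference. The function names `quantize_int8` and `dequantize` and the sample weights are hypothetical, and production toolchains calibrate scales far more carefully. The sketch only shows the basic trade of precision for smaller, cheaper integers.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero input, where any scale works.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the INT8 codes."""
    return [q * scale for q in quantized]


weights = [0.82, -0.41, 0.05, -1.27, 0.0]  # made-up sample values
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
# The rounding error per value is at most half of one quantization step.
worst_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, worst_error)
```

Each value now fits in one byte instead of two (FP16) or four (FP32), at the cost of a rounding error bounded by half the scale.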

AMD

Matrix Cores (CDNA) and AI Accelerators (RDNA 3)
Newer, with more limited software support
AI performance still generally trails NVIDIA

AI research and production → NVIDIA leads

Software Ecosystem

Software	NVIDIA	AMD
Low-level	CUDA	ROCm (HIP) / OpenCL
AI libraries	cuDNN, TensorRT	MIOpen
ML frameworks	Native support	More limited (via ROCm on supported GPUs)
Tools	Nsight	Radeon GPU Profiler

The CUDA ecosystem is NVIDIA's biggest advantage.

Memory & Bandwidth

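Aspect	NVIDIA	AMD
VRAM	GDDR6 / HBM	GDDR6 / HBM
Cache	L1 + L2	L1 + L2 + Infinity Cache
Bandwidth	Very high (HBM on datacenter GPUs)	Efficient (Infinity Cache)

In gaming, Infinity Cache lets AMD serve many memory requests on-chip, so it gets more out of a given amount of VRAM bandwidth.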
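
The bandwidth comparison can be put into rough numbers. The sketch below computes peak DRAM bandwidth from bus width and per-pin data rate, then applies a deliberately simplified model of how a large on-chip cache stretches that bandwidth. The function names, the 256-bit / 16 Gbps configuration, the 2000 GB/s cache figure, and the 50% hit rate are all assumptions chosen for illustration, not measurements of any specific GPU.

```python
def peak_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Peak DRAM bandwidth in GB/s: bytes per transfer times transfers per second."""
    return bus_width_bits / 8 * data_rate_gbps


def effective_bandwidth_gbs(dram_gbs, cache_gbs, hit_rate):
    """Simplified model: only cache misses reach DRAM, so DRAM can sustain
    dram_gbs / (1 - hit_rate) of total traffic, capped by the cache's own speed."""
    if hit_rate >= 1.0:
        return cache_gbs
    return min(cache_gbs, dram_gbs / (1.0 - hit_rate))


dram = peak_bandwidth_gbs(bus_width_bits=256, data_rate_gbps=16)
print(dram)                                       # 512.0 GB/s
print(effective_bandwidth_gbs(dram, 2000, 0.5))   # 1024.0 GB/s under this model
```

Under these assumptions, a cache that absorbs half of all requests makes a 256-bit bus behave like one with twice the bandwidth, which is the efficiency argument behind Infinity Cache.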

Ray Tracing & Graphics

Feature	NVIDIA	AMD
Ray tracing	RT Cores (3rd gen on RTX 40 series)	Ray Accelerators
Upscaling	DLSS (AI-based)	FSR (software-based)
Drivers	Stable	Updates sometimes slow

Modern gaming → NVIDIA leads on ray tracing and DLSS

Gaming GPUs vs AI GPUs

GPUs for Gaming

Characteristics:

High clock speeds
Focus on rasterization and ray tracing
Shader optimization
More affordable pricing

Examples:

NVIDIA RTX 4060 / 4070
AMD RX 6700 XT / 7800 XT

AMD often costs less while still delivering high FPS

GPUs for AI

Characteristics:

Many compute cores
Tensor acceleration
Large VRAM (24 GB+)
Mixed-precision support

Examples:

NVIDIA RTX 4090 (consumer AI)
NVIDIA A100 / H100 (datacenter)
AMD MI250 (datacenter)

Serious AI training → NVIDIA dominates

Mobile GPUs (A Brief Aside)

Mobile GPU	Focus
Adreno (Qualcomm)	Mobile gaming
Mali (Arm)	Efficiency
Apple GPU	AI + Graphics
NVIDIA Tegra	Gaming / Embedded AI

Quick Summary

Choose NVIDIA for:

AI / Machine Learning
Ray tracing and DLSS
A mature software ecosystem

Choose AMD for:

Pure gaming
A limited budget
Open-source friendliness

What Is GPU Architecture?

A GPU (Graphics Processing Unit) is a processor designed for parallel processing: it runs thousands of small operations at the same time.
Originally built for graphics, GPUs are now widely used for AI, ML, data science, games, and other heavy computation.

CPU vs GPU (Quick Overview)

CPU	GPU
Few cores (4–32)	Thousands of small cores
Strong single-thread performance	Strong parallel performance
Complex control logic	High throughput
Low latency	High bandwidth

GPUs suit workloads that apply the same operation to many data elements (matrices, pixels, vectors).

Main Components of GPU Architecture

Streaming Multiprocessor (SM) / Compute Unit (CU)

The GPU's main building block
Contains many small cores (ALUs)
NVIDIA → SM
AMD → CU

A single GPU can have dozens to more than a hundred SMs

GPU Cores (CUDA Core / Stream Processor)

Simple cores
Focused on arithmetic operations (add, multiply)
Far less complex than a CPU core

Example: a high-end RTX GPU can have 10,000+ CUDA cores (the RTX 4090 has 16,384)

Warp / Wavefront (Execution Model)

GPUs execute threads in groups
NVIDIA: Warp = 32 threads
AMD: Wavefront = 64 threads (RDNA GPUs also support 32)

All threads in a warp execute the same instruction at the same time

Branch divergence (threads in one warp taking different if-else paths) → the paths run one after another, so performance drops
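
The cost of divergence can be sketched with a small Python model. It counts one pass for a warp whose threads all agree on a branch and two passes for a warp with mixed outcomes. The function name `count_passes` and the 128-thread workload are hypothetical, and the model is a simplification: on real hardware the cost depends on how long each path is.

```python
def count_passes(takes_branch, warp_size=32):
    """Count instruction passes when each warp runs an if/else.

    A warp whose threads all agree executes one path. A warp with mixed
    outcomes executes both paths one after the other, masking off the
    threads that did not choose the current path.
    """
    passes = 0
    for start in range(0, len(takes_branch), warp_size):
        warp = takes_branch[start:start + warp_size]
        passes += 2 if (any(warp) and not all(warp)) else 1
    return passes


threads = 128
# Odd/even split: every warp contains both outcomes.
interleaved = [tid % 2 == 0 for tid in range(threads)]
# Split on 32-thread boundaries: each warp is uniform.
aligned = [(tid // 32) % 2 == 0 for tid in range(threads)]

print(count_passes(interleaved))                # 8: four warps, two passes each
print(count_passes(aligned))                    # 4: four warps, one pass each
print(count_passes(aligned, warp_size=64))      # 4: two wavefronts, two passes each
```

The last line shows why group size matters: a branch pattern that is uniform per 32-thread warp still diverges inside a 64-thread wavefront.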

GPU Memory Hierarchy

Global Memory (VRAM)

Largest
Slowest
Accessible by all threads

Shared Memory

Very fast
Shared by the threads of a block running on one SM
Well suited to data that is reused often
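
To illustrate why reuse matters, the sketch below multiplies two matrices twice in plain Python: once reading every operand from "global" memory, and once staging small tiles first, the way a GPU kernel would stage them in shared memory. It only counts simulated loads. The function names, the 8×8 size, and the tile size of 4 are assumptions for illustration, and the tiled version assumes the matrix size is a multiple of the tile size.

```python
def matmul_naive(a, b, n):
    """Every multiply reads one element of A and one of B from 'global' memory."""
    loads = 0
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
                loads += 2
    return c, loads


def matmul_tiled(a, b, n, tile):
    """Each block stages a tile of A and B once, then reuses it for tile*tile outputs."""
    assert n % tile == 0, "sketch assumes n is a multiple of tile"
    loads = 0
    c = [[0.0] * n for _ in range(n)]
    for bi in range(0, n, tile):
        for bj in range(0, n, tile):
            for bk in range(0, n, tile):
                # Stage one tile of each input; these lists stand in for shared memory.
                a_tile = [row[bk:bk + tile] for row in a[bi:bi + tile]]
                b_tile = [row[bj:bj + tile] for row in b[bk:bk + tile]]
                loads += 2 * tile * tile
                for i in range(tile):
                    for j in range(tile):
                        for k in range(tile):
                            c[bi + i][bj + j] += a_tile[i][k] * b_tile[k][j]
    return c, loads


n, tile = 8, 4
a = [[float(i + j) for j in range(n)] for i in range(n)]
b = [[float(i * j % 3) for j in range(n)] for i in range(n)]
c_naive, naive_loads = matmul_naive(a, b, n)
c_tiled, tiled_loads = matmul_tiled(a, b, n, tile)
print(naive_loads, tiled_loads)  # 1024 256
```

Both versions produce the same result, but the tiled one issues a quarter of the global loads here, a reduction equal to the tile size.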

Registers

Fastest
Private to each thread
Limited in number
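
Because registers are limited, a kernel that uses many registers per thread leaves room for fewer threads on each SM. The sketch below estimates that effect. The defaults of 65,536 registers and 2,048 threads per SM are assumed illustrative limits, not the specification of any particular GPU, and the function name `resident_threads` is hypothetical. Real hardware also allocates registers in coarser chunks, which this ignores.

```python
def resident_threads(regs_per_thread, regs_per_sm=65536,
                     max_threads_per_sm=2048, warp_size=32):
    """Estimate how many threads fit on one SM when registers are the only limit."""
    by_registers = regs_per_sm // regs_per_thread
    # Threads are scheduled in whole warps, so round down to a warp multiple.
    by_registers -= by_registers % warp_size
    return min(by_registers, max_threads_per_sm)


for regs in (32, 64, 128, 255):
    print(regs, resident_threads(regs))
```

Under these assumptions, doubling register use from 32 to 64 per thread halves the number of resident threads, which reduces the GPU's ability to hide memory latency by switching between warps.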

Fastest to slowest: Register → Shared → L2 Cache → Global (VRAM)

Cache System

L1 cache (close to each SM)
L2 cache (shared across SMs)
Reduces traffic to VRAM and makes better use of memory bandwidth

Execution Model (How a GPU Works)

The CPU launches a kernel on the GPU
The kernel's work is organized into:
- Grid
- Block
- Thread
The GPU runs thousands of threads in parallel

Grid
 └── Block
      └── Thread
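
The grid, block, and thread hierarchy can be imitated in plain Python. In the sketch below, a hypothetical `launch` helper calls a kernel function once per (block, thread) pair, and the kernel derives its global index the same way CUDA code does with `blockIdx.x * blockDim.x + threadIdx.x`. The loop runs sequentially, whereas a real GPU runs these threads in parallel. The vector size and block size are arbitrary choices.

```python
def launch(kernel, grid_dim, block_dim, *args):
    """Run a kernel once per (block, thread) pair; a real GPU runs these in parallel."""
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, *args)


def vector_add(block_idx, block_dim, thread_idx, a, b, out):
    # Global index of this thread across the whole grid.
    i = block_idx * block_dim + thread_idx
    # The grid may overshoot the data, so out-of-range threads do nothing.
    if i < len(out):
        out[i] = a[i] + b[i]


n = 1000
a = list(range(n))
b = [2 * x for x in a]
out = [0] * n
block_dim = 256
grid_dim = (n + block_dim - 1) // block_dim  # ceiling division: 4 blocks
launch(vector_add, grid_dim, block_dim, a, b, out)
print(grid_dim, out[:5])  # 4 [0, 3, 6, 9, 12]
```

Four blocks of 256 threads give 1,024 threads for 1,000 elements, which is why the bounds check inside the kernel is needed.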

GPU Programming Models

There are several ways to program a GPU:

CUDA (NVIDIA)
OpenCL (Cross-platform)
Vulkan Compute
Metal (Apple)
DirectCompute (Windows)

GPUs for AI & ML

GPUs are a very good fit for:

Matrix multiplication
Tensor operations
Neural network training

Dedicated features:

Tensor Cores (NVIDIA)
Matrix Cores (AMD)
Mixed precision (FP16, BF16)
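
A short experiment shows why mixed precision keeps a wider accumulator instead of doing everything in FP16. The sketch uses the standard library's `struct` half-precision format to round values to FP16, then adds a small number 10,000 times in two ways. The helper name `to_fp16` and the step value are assumptions for illustration. Hardware matrix units apply the same idea to dot products, not to a scalar loop like this one.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


step = to_fp16(0.001)
fp16_sum = 0.0
wide_sum = 0.0
for _ in range(10_000):
    # Pure FP16: the running total is rounded to half precision every step.
    fp16_sum = to_fp16(fp16_sum + step)
    # Mixed precision: FP16 inputs, but a wider accumulator.
    wide_sum += step

print(fp16_sum, wide_sum)
```

The pure FP16 total stalls far below the true total of about 10, because once the sum grows large enough the small step is rounded away. The wide accumulator stays close to 10 while still taking FP16 inputs.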

Challenges of GPU Architecture

Memory bottlenecks
Branch divergence
Difficult debugging
High power consumption

Brief Summary

GPU = parallel monster
Many small cores
Warp-based execution
The memory hierarchy matters a great deal
Ideal for AI, graphics, and scientific computing
