ai-x1-pro ~/llamacpp>./llama-bench --hf-repo unsloth/gemma-4-E4B-it-GGUF
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 47072 MiB):
  Device 0: AMD Radeon 890M Graphics, gfx1150 (0x1150), VMM: no, Wave Size: 32, VRAM: 47072 MiB
| model                          |       size |     params | backend    | ngl |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | --------------: | -------------------: |
| gemma4 E4B Q4_K - Medium       |   4.62 GiB |     7.52 B | ROCm       |  99 |           pp512 |        568.54 ± 1.96 |
| gemma4 E4B Q4_K - Medium       |   4.62 GiB |     7.52 B | ROCm       |  99 |           tg128 |         21.16 ± 0.03 |

build: 69c28f1 (1)