About
The engineering layer of AI.
fp4 is a technical publication covering the layer between research papers and production systems. The layer where GPU memory hierarchies determine whether your model fits. Where the choice of attention kernel can cut latency by 40%. Where tensor-parallelism configurations make or break throughput.
We write for senior ML engineers, research engineers, and infrastructure leads who already know what RAG is. Every article cites the paper, derives the math, and shows the code. No beginner content. No marketing. No padding.
The name references FP4, the E2M1 floating-point format (one sign bit, two exponent bits, one mantissa bit) that sits at the precision frontier of LLM quantization. The logo's four cells encode the bit pattern S·E·E·M. Restraint at the hardware level, restraint in the prose.
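For readers who want the bit pattern spelled out, here is a minimal sketch of decoding one S·E·E·M nibble. It assumes the conventions of the OCP Microscaling (MX) E2M1 element format: an exponent bias of 1, subnormals when the exponent field is zero, and no encodings reserved for infinity or NaN. The function name `decode_e2m1` is ours, for illustration only.

```python
def decode_e2m1(nibble: int) -> float:
    """Decode a 4-bit E2M1 code, laid out as S E E M, into a Python float.

    Assumes exponent bias 1 and no inf/NaN encodings (MX-style E2M1).
    """
    if not 0 <= nibble <= 0xF:
        raise ValueError("expected a 4-bit value in the range 0..15")

    sign = -1.0 if (nibble >> 3) & 0b1 else 1.0   # S: top bit
    exponent = (nibble >> 1) & 0b11               # E E: middle two bits
    mantissa = nibble & 0b1                       # M: low bit

    if exponent == 0:
        # Subnormal: no implicit leading 1, so the value is 0 or 0.5.
        return sign * mantissa * 0.5

    # Normal: implicit leading 1, one fractional bit worth 0.5.
    return sign * (1.0 + 0.5 * mantissa) * 2.0 ** (exponent - 1)


# The eight non-negative codes, in order of their bit patterns.
positive_values = [decode_e2m1(code) for code in range(8)]
```

Under these assumptions the format represents only eight magnitudes, which is why FP4 quantization schemes typically pair each block of values with a shared scale factor.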
Reach us at hello@fp4.dev.