AI Meet Marlin: An FP16xINT4 LLM Inference Kernel That Can Achieve Near-Ideal ~4x Speedups up to Medium Batch Sizes of 16-32 Tokens
The Future Reality TV Star-Senate Candidate Claims He Intentionally Got Caught Insider Trading on Kalshi to Make a Point