This repo contains the official code of our ICLR'25 paper: ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation. We introduce ViDiT-Q, a quantization ...
Running the example script llm-compressor/examples/quantization_w4a4_fp4/llama3_example.py results in a runtime error. Full traceback is included below.
Abstract: Quantization has emerged as one of the most prevalent approaches to compress and accelerate neural networks. Recently, data-free quantization has been widely studied as a practical and ...
Abstract: Quantization has enabled the widespread implementation of deep learning algorithms on resource-constrained Internet of Things (IoT) devices, which compresses neural networks by reducing the ...
The quantization of classical theories that admit more than one Hamiltonian description is considered. This is done from a geometrical viewpoint, both at the quantization level (geometric quantization ...