unary_element_wise_operation.hpp File Reference
Namespaces | |
| namespace | ck_tile |
| namespace | ck_tile::element_wise |
Macros | |
| #define | CONSTEXPR_LOOKUP_TABLE_FOR_BF16 1 |
| #define | CONSTEXPR_LOOKUP_TABLE_FOR_FP8 0 |
| #define | CONSTEXPR_LOOKUP_TABLE_FOR_BF8 0 |
Functions | |
| template<typename T, std::size_t N, typename F, std::size_t... Is> | |
| constexpr std::array< T, N > | ck_tile::element_wise::make_lookup_table_impl (F &&func, std::index_sequence< Is... >) |
| template<typename T, std::size_t N, typename F> | |
| constexpr std::array< T, N > | ck_tile::element_wise::make_lookup_table (F &&func) |
| CK_TILE_DEVICE fp16x4_t | ck_tile::element_wise::i4_to_half4 (int q) |
| Fast conversion of four packed int4 values to fp16x4_t, based on the paper "Who Says Elephants Can't Run: Bringing Large Scale MoE Models into Cloud Scale Production". | |
| CK_TILE_DEVICE fp16x4_t | ck_tile::element_wise::i4_to_half4_scale (int q, const fp16x2_t &scale) |
| This function dequantizes 4 int4 values into 4 fp16 values and applies scaling. | |
| CK_TILE_DEVICE bf16x4_t | ck_tile::element_wise::i4_to_bhalf4 (int q) |
| This function converts four 4-bit integers into four bf16 values. | |
| CK_TILE_DEVICE fp8x8_t | ck_tile::element_wise::amd_assembly_i4_to_fp8x8 (int a) |
| This function converts 8 packed 4-bit integers into 8 fp8 values. | |
| CK_TILE_DEVICE float | ck_tile::element_wise::amd_assembly_fp8_to_fp32 (uint32_t src) |
| CK_TILE_DEVICE float | ck_tile::element_wise::amd_assembly_bf8_to_fp32 (uint32_t src) |
| CK_TILE_DEVICE bf8x8_t | ck_tile::element_wise::amd_assembly_i4_to_bf8x8 (uint32_t a) |
| This function converts 8 packed 4-bit integers into 8 bf8 values. | |
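The conversion functions listed above turn packed 4-bit integers into floating-point values. This page does not describe how they do it. One well-known approach is to OR each 4-bit field into the mantissa of a floating-point constant and then subtract that constant. The host-side sketch below illustrates the idea for fp16 in plain C++; it is not the library's implementation. The helper names (`half_bits_to_float`, `nibble_to_float_via_magic`, `unpack4_reference`), the unsigned interpretation of each nibble, and the low-to-high element order are all assumptions made for illustration.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Decode an IEEE 754 binary16 bit pattern into a float.
// Handles normal numbers only, which is all this sketch needs.
inline float half_bits_to_float(std::uint16_t h)
{
    const int sign     = (h >> 15) & 0x1;
    const int exponent = (h >> 10) & 0x1f;
    const int mantissa = h & 0x3ff;
    const float value  = std::ldexp(1.0f + mantissa / 1024.0f, exponent - 15);
    return sign ? -value : value;
}

// Hypothetical helper: convert one unsigned 4-bit value to a float.
// 0x6400 is the fp16 bit pattern of 1024.0, where one mantissa step equals 1.0,
// so OR-ing a nibble into the low mantissa bits yields exactly 1024 + nibble.
inline float nibble_to_float_via_magic(std::uint32_t nibble)
{
    const std::uint16_t magic = 0x6400;
    const std::uint16_t bits  = static_cast<std::uint16_t>(magic | (nibble & 0xfu));
    return half_bits_to_float(bits) - 1024.0f;
}

// Hypothetical reference: unpack the four lowest nibbles of q.
// Assumption: nibble i (counting from the least significant) maps to element i.
inline std::array<float, 4> unpack4_reference(std::uint32_t q)
{
    std::array<float, 4> out{};
    for (std::size_t i = 0; i < 4; ++i)
    {
        out[i] = nibble_to_float_via_magic((q >> (4 * i)) & 0xfu);
    }
    return out;
}
```

A device implementation would typically apply the mask, OR, and subtract to several packed values at once rather than looping over nibbles. A scalar reference like this one can be useful for checking such a routine once its actual element order and signedness are known.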
Macro Definition Documentation
◆ CONSTEXPR_LOOKUP_TABLE_FOR_BF16
| #define CONSTEXPR_LOOKUP_TABLE_FOR_BF16 1 |
◆ CONSTEXPR_LOOKUP_TABLE_FOR_BF8
| #define CONSTEXPR_LOOKUP_TABLE_FOR_BF8 0 |
◆ CONSTEXPR_LOOKUP_TABLE_FOR_FP8
| #define CONSTEXPR_LOOKUP_TABLE_FOR_FP8 0 |
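The three macros above look like per-type compile-time switches for lookup tables, although this page does not state their exact effect. The signatures of `make_lookup_table` and `make_lookup_table_impl` suggest the usual index-sequence pattern for building a `std::array` at compile time. The following is a minimal sketch of that pattern under those assumptions, placed in a hypothetical `sketch` namespace; the function bodies and the `Square` functor are illustrative and are not taken from the header.

```cpp
#include <array>
#include <cstddef>
#include <utility>

namespace sketch {

// Expand the index pack and call func once per index, in order.
template <typename T, std::size_t N, typename F, std::size_t... Is>
constexpr std::array<T, N> make_lookup_table_impl(F&& func, std::index_sequence<Is...>)
{
    return {{static_cast<T>(func(Is))...}};
}

// Build an N-entry table whose i-th entry is func(i).
template <typename T, std::size_t N, typename F>
constexpr std::array<T, N> make_lookup_table(F&& func)
{
    return make_lookup_table_impl<T, N>(std::forward<F>(func),
                                        std::make_index_sequence<N>{});
}

// Hypothetical generator used only to demonstrate the helper.
struct Square
{
    constexpr int operator()(std::size_t i) const { return static_cast<int>(i * i); }
};

// The table is evaluated entirely at compile time.
constexpr auto squares = make_lookup_table<int, 8>(Square{});
static_assert(squares[3] == 9, "entry 3 should hold 3 * 3");

} // namespace sketch
```

Using a functor instead of a lambda keeps the sketch valid in C++14, where lambdas cannot yet be `constexpr`.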