FDOT (8-bit floating-point to half-precision, by element)

8-bit floating-point dot product to half-precision (vector, by element)

This instruction computes the fused sum-of-products of a group of two 8-bit floating-point values held in each 16-bit element of the first source vector and a group of two 8-bit floating-point values in an indexed 16-bit element of the second source vector. The half-precision sum-of-products are scaled by 2^(-UInt(FPMR.LSCALE[3:0])), before being destructively added without intermediate rounding to the corresponding half-precision elements of the destination vector.

The 8-bit floating-point groups within the second source vector are specified using an immediate index.

The 8-bit floating-point encoding format for the elements of the first source vector is selected by FPMR.F8S1. The 8-bit floating-point encoding format for the elements of the second source vector is selected by FPMR.F8S2.

0 0 0 1 1 1 1 0 1 0 0 0 0 0

FDOT <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.2B[<index>]

if !IsFeatureImplemented(FEAT_FP8DOT2) then UNDEFINED;
constant integer n = UInt(Rn);
constant integer d = UInt(Rd);
constant integer m = UInt('0':Rm);
constant integer i = UInt(H:L:M);
constant integer datasize = if Q == '1' then 128 else 64;
constant integer esize = 16;
constant integer elements = datasize DIV esize;

<Vd> Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<Ta> Is an arrangement specifier, encoded in the "Q" field:
    Q    <Ta>
    0    4H
    1    8H

<Vn> Is the name of the first SIMD&FP source register, encoded in the "Rn" field.

<Tb> Is an arrangement specifier, encoded in the "Q" field:
    Q    <Tb>
    0    8B
    1    16B

<Vm> Is the name of the second SIMD&FP source register, in the range V0 to V15, encoded in the "Rm" field.

<index> Is the immediate index of a group of two 8-bit elements, in the range 0 to 7, encoded in the "H:L:M" fields.
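
The decode step and the assembler symbols above can be sketched in Python as follows. This is an illustrative sketch, not Arm's pseudocode: the function name `decode_fdot_fp8_by_element` and the returned dictionary are hypothetical, and the field values (Q, Rn, Rd, Rm, H, L, M) are assumed to be already extracted as integers, because the bit positions of the fields are not given in this section.

```python
def decode_fdot_fp8_by_element(Q, Rn, Rd, Rm, H, L, M):
    """Mirror the decode pseudocode for already-extracted field values.

    Hypothetical helper: fields are passed as plain integers
    (Rn, Rd: 5 bits; Rm: 4 bits; Q, H, L, M: 1 bit each).
    """
    if not (0 <= Rm <= 15):
        raise ValueError("Rm is a 4-bit field, so <Vm> is limited to V0..V15")
    n = Rn
    d = Rd
    m = Rm                              # UInt('0':Rm): top bit forced to 0
    i = (H << 2) | (L << 1) | M         # UInt(H:L:M), range 0..7
    datasize = 128 if Q == 1 else 64
    esize = 16
    elements = datasize // esize        # 4 or 8 half-precision results
    return {
        "n": n, "d": d, "m": m, "index": i,
        "datasize": datasize, "elements": elements,
        "Ta": f"{elements}H",           # 4H or 8H
        "Tb": f"{datasize // 8}B",      # 8B or 16B
    }


fields = decode_fdot_fp8_by_element(Q=1, Rn=2, Rd=0, Rm=15, H=1, L=0, M=1)
print(fields["Ta"], fields["Tb"], fields["index"])
```

Note that the index always selects one of the eight 16-bit groups of the full 128-bit second source register, regardless of Q.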

CheckFPMREnabled();
CheckFPAdvSIMDEnabled64();
constant bits(datasize) operand1 = V[n, datasize];
constant bits(128) operand2 = V[m, 128];
constant bits(datasize) operand3 = V[d, datasize];
bits(datasize) result;
for e = 0 to elements-1
    constant bits(esize) op1 = Elem[operand1, e, esize];
    constant bits(esize) op2 = Elem[operand2, i, esize];
    constant bits(esize) sum = Elem[operand3, e, esize];
    Elem[result, e, esize] = FP8DotAddFP(sum, op1, op2, FPCR, FPMR);
V[d, datasize] = result;
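
The element loop above can be approximated with a small Python reference model. This is a rough sketch under several assumptions, not a reproduction of FP8DotAddFP: all function names are hypothetical, the format is passed as the string "E5M2" or "E4M3" instead of being read from FPMR.F8S1 and FPMR.F8S2, only finite inputs are handled, and FPCR rounding modes, exception flags and overflow behaviour are ignored. Exact rational arithmetic stands in for "without intermediate rounding"; the final conversion goes through a Python float before being packed to half precision, so rare double-rounding differences from hardware are possible.

```python
from fractions import Fraction
import struct


def fp8_to_fraction(byte, fmt):
    """Decode one finite 8-bit float; fmt is "E5M2" or "E4M3" (assumed names)."""
    if fmt == "E5M2":
        exp_bits, man_bits, bias = 5, 2, 15
    elif fmt == "E4M3":
        exp_bits, man_bits, bias = 4, 3, 7
    else:
        raise ValueError("unknown FP8 format")
    sign = -1 if byte & 0x80 else 1
    exp = (byte >> man_bits) & ((1 << exp_bits) - 1)
    man = byte & ((1 << man_bits) - 1)
    all_ones = (1 << exp_bits) - 1
    # Special encodings (infinities, NaNs) are outside this sketch.
    if exp == all_ones and (fmt == "E5M2" or man == (1 << man_bits) - 1):
        raise ValueError("infinity/NaN encodings are not modelled")
    frac = Fraction(man, 1 << man_bits)
    if exp == 0:                                   # subnormal or zero
        return sign * frac * Fraction(2) ** (1 - bias)
    return sign * (1 + frac) * Fraction(2) ** (exp - bias)


def half_bits_to_fraction(bits):
    """Exact value of a finite half-precision bit pattern."""
    (value,) = struct.unpack("<e", struct.pack("<H", bits))
    return Fraction(value)


def fraction_to_half_bits(value):
    """Round to half precision; raises OverflowError if out of range."""
    (bits,) = struct.unpack("<H", struct.pack("<e", float(value)))
    return bits


def fdot_fp8_by_element(vd, vn, vm, index, lscale=0, fmt1="E4M3", fmt2="E4M3"):
    """vd, vn: lists of 4 or 8 16-bit elements; vm: list of 8 16-bit elements."""
    b_lo = fp8_to_fraction(vm[index] & 0xFF, fmt2)
    b_hi = fp8_to_fraction(vm[index] >> 8, fmt2)
    scale = Fraction(1, 1 << (lscale & 0xF))       # 2^(-UInt(LSCALE[3:0]))
    result = []
    for acc, op1 in zip(vd, vn):
        a_lo = fp8_to_fraction(op1 & 0xFF, fmt1)
        a_hi = fp8_to_fraction(op1 >> 8, fmt1)
        dot = a_lo * b_lo + a_hi * b_hi            # exact sum-of-products
        result.append(fraction_to_half_bits(half_bits_to_fraction(acc) + dot * scale))
    return result


# Each Vn element holds E4M3 (2.0, 1.0); the indexed Vm element holds (2.0, 2.0).
vm = [0] * 8
vm[5] = 0x4040
out = fdot_fp8_by_element([0x3C00] * 4, [0x4038] * 4, vm, index=5, lscale=1)
print([hex(x) for x in out])
```

In the usage example each dot product is 1.0*2.0 + 2.0*2.0 = 6.0, which is halved by lscale=1 and added to the accumulator value 1.0, giving 4.0 in every lane.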