MOVAZ (tile to vector, four registers)

Move and zero four ZA tile slices to vector registers

The instruction operates on four consecutive horizontal or vertical slices within a named ZA tile of the specified element size. The tile slices are zeroed after moving their contents to the destination vectors.

The consecutive slice numbers within the tile are selected starting from the sum of the slice index register and immediate offset, modulo the number of such elements in a vector. The immediate offset is a multiple of 4 in the range 0 to the number of elements in a 128-bit vector segment minus 4.

This instruction is unpredicated.

It has encodings from 4 classes: 8-bit, 16-bit, 32-bit and 64-bit

8-bit

1 1 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 1 0 0 0 0

MOVAZ { <Zd1>.B-<Zd4>.B }, ZA0<HV>.B[<Ws>, <offs1>:<offs4>]

    if !IsFeatureImplemented(FEAT_SME2p1) then UNDEFINED;
    constant integer s = UInt('011':Rs);
    constant integer nreg = 4;
    constant integer esize = 8;
    constant integer d = UInt(Zd:'00');
    constant integer n = 0;
    constant integer offset = UInt(off2:'00');
    constant boolean vertical = V == '1';

16-bit

1 1 0 0 0 0 0 0 0 1 0 0 0 1 1 0 0 0 1 1 0 0 0 0

MOVAZ { <Zd1>.H-<Zd4>.H }, <ZAn><HV>.H[<Ws>, <offs1>:<offs4>]

    if !IsFeatureImplemented(FEAT_SME2p1) then UNDEFINED;
    constant integer s = UInt('011':Rs);
    constant integer nreg = 4;
    constant integer esize = 16;
    constant integer d = UInt(Zd:'00');
    constant integer n = UInt(ZAn);
    constant integer offset = UInt(o1:'00');
    constant boolean vertical = V == '1';

32-bit

1 1 0 0 0 0 0 0 1 0 0 0 0 1 1 0 0 0 1 1 0 0 0 0

MOVAZ { <Zd1>.S-<Zd4>.S }, <ZAn><HV>.S[<Ws>, <offs1>:<offs4>]

    if !IsFeatureImplemented(FEAT_SME2p1) then UNDEFINED;
    constant integer s = UInt('011':Rs);
    constant integer nreg = 4;
    constant integer esize = 32;
    constant integer d = UInt(Zd:'00');
    constant integer n = UInt(ZAn);
    constant integer offset = 0;
    constant boolean vertical = V == '1';

64-bit

1 1 0 0 0 0 0 0 1 1 0 0 0 1 1 0 0 0 1 1 0 0 0

MOVAZ { <Zd1>.D-<Zd4>.D }, <ZAn><HV>.D[<Ws>, <offs1>:<offs4>]

    if !IsFeatureImplemented(FEAT_SME2p1) then UNDEFINED;
    if MaxImplementedSVL() < 256 then UNDEFINED;
    constant integer s = UInt('011':Rs);
    constant integer nreg = 4;
    constant integer esize = 64;
    constant integer d = UInt(Zd:'00');
    constant integer n = UInt(ZAn);
    constant integer offset = 0;
    constant boolean vertical = V == '1';

<Zd1>
    Is the name of the first scalable vector register of the destination multi-vector group, encoded as "Zd" times 4.
<Zd4>
    Is the name of the fourth scalable vector register of the destination multi-vector group, encoded as "Zd" times 4 plus 3.
<ZAn>
    For the 16-bit variant: is the name of the ZA tile ZA0-ZA1 to be accessed, encoded in the "ZAn" field.
<ZAn>
    For the 32-bit variant: is the name of the ZA tile ZA0-ZA3 to be accessed, encoded in the "ZAn" field.
<ZAn>
    For the 64-bit variant: is the name of the ZA tile ZA0-ZA7 to be accessed, encoded in the "ZAn" field.
<HV>
    Is the horizontal or vertical slice indicator, encoded in the "V" field:

    V    <HV>
    0    H
    1    V
<Ws>
    Is the 32-bit name of the slice index register W12-W15, encoded in the "Rs" field.
<offs1>
    For the 8-bit variant: is the first slice index offset, encoded as "off2" field times 4.
<offs1>
    For the 16-bit variant: is the first slice index offset, encoded as "o1" field times 4.
<offs1>
    For the 32-bit and 64-bit variants: is the first slice index offset, with implicit value 0.
<offs4>
    For the 8-bit variant: is the fourth slice index offset, encoded as "off2" field times 4 plus 3.
<offs4>
    For the 16-bit variant: is the fourth slice index offset, encoded as "o1" field times 4 plus 3.
<offs4>
    For the 32-bit and 64-bit variants: is the fourth slice index offset, with implicit value 3.
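
The operand encodings described above can be illustrated with a small Python sketch. It is not part of the architecture definition: the function name decode_movaz4_operands and its parameters are hypothetical, the single off argument stands in for either the "off2" or the "o1" field, and no range checking of the field values is attempted.

```python
def decode_movaz4_operands(esize, zd, rs, v, zan=0, off=0):
    """Hypothetical helper: render the assembler operands for given field values."""
    suffix = {8: "B", 16: "H", 32: "S", 64: "D"}[esize]
    # Only the 8-bit and 16-bit variants carry an offset field (off2 / o1);
    # the 32-bit and 64-bit variants use the implicit offsets 0:3.
    offs1 = off * 4 if esize in (8, 16) else 0
    # The 8-bit variant always names ZA0; the others take the tile from ZAn.
    tile = 0 if esize == 8 else zan
    first_z = zd * 4   # UInt(Zd:'00'), so the group starts at Z0, Z4, ...
    ws = 12 + rs       # UInt('011':Rs) selects W12-W15
    hv = "V" if v == 1 else "H"
    return (f"MOVAZ {{ Z{first_z}.{suffix}-Z{first_z + 3}.{suffix} }}, "
            f"ZA{tile}{hv}.{suffix}[W{ws}, {offs1}:{offs1 + 3}]")
```

For example, under these assumptions decode_movaz4_operands(8, zd=1, rs=2, v=1, off=3) yields the string MOVAZ { Z4.B-Z7.B }, ZA0V.B[W14, 12:15].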
CheckStreamingSVEAndZAEnabled();
constant integer VL = CurrentVL;
if nreg == 4 && esize == 64 && VL < 256 then UNDEFINED;
constant integer slices = VL DIV esize;
constant bits(32) index = X[s, 32];
constant integer slice = ((UInt(index) - (UInt(index) MOD nreg)) + offset) MOD slices;

for r = 0 to nreg-1
    constant bits(VL) result = ZAslice[n, esize, vertical, slice + r, VL];
    ZAslice[n, esize, vertical, slice + r, VL] = Zeros(VL);
    Z[d + r, VL] = result;
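
The slice selection and move-and-zero behaviour in the pseudocode above can be modelled with a short Python sketch. It is a simplified illustration under stated assumptions, not a reference implementation: the tile is represented as a square list of lists of integers (one row per horizontal slice), the vector registers as a dictionary keyed by register number, and the name movaz4_model is hypothetical. Note that, as in the pseudocode, the slice index is rounded down to a multiple of 4 before the offset is added.

```python
NREG = 4  # this instruction always moves four slices


def movaz4_model(tile, zregs, index, offset, d, vertical):
    """Hypothetical model: move four slices of `tile` into zregs[d..d+3], then zero them.

    tile   -- square list of lists; len(tile) plays the role of VL DIV esize
    zregs  -- dict mapping vector register number to a list of elements
    index  -- value of the slice index register (W12-W15)
    offset -- immediate offset, a multiple of 4
    Returns the number of the first slice that was accessed.
    """
    slices = len(tile)
    if slices < NREG:
        # Mirrors the UNDEFINED case for 64-bit elements when VL < 256.
        raise ValueError("tile has fewer than four slices")
    # slice = ((index - index MOD nreg) + offset) MOD slices
    first = ((index - index % NREG) + offset) % slices
    for r in range(NREG):
        k = first + r
        if vertical:
            data = [row[k] for row in tile]   # read column k
            for row in tile:
                row[k] = 0                    # zero the column
        else:
            data = list(tile[k])              # read row k
            tile[k] = [0] * slices            # zero the row
        zregs[d + r] = data
    return first
```

With an 8 x 8 tile, an index register value of 6 and an offset of 4, the model selects slices 0 to 3, because the index is first rounded down to 4 and the sum 8 wraps modulo 8.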