LD1D (scalar plus scalar, tile slice)
Contiguous load of doublewords to 64-bit element ZA tile slice
The slice number within the tile is selected by the sum of the slice index register and immediate offset, modulo the number of 64-bit elements in a vector. The immediate offset is in the range 0 to 1. The memory address is generated by a 64-bit scalar base and an optional 64-bit scalar offset which is multiplied by 8 and added to the base address. Inactive elements will not cause a read from Device memory or signal a fault, and are set to zero in the destination vector.
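The slice selection and address arithmetic described above can be sketched in a few lines. This is an illustrative model only: the function names `slice_number` and `element_address` are hypothetical, the vector length is passed in as an assumed parameter `vl_bits`, and address arithmetic is simplified to plain 64-bit wraparound.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def slice_number(index: int, offset: int, vl_bits: int) -> int:
    """Slice selected in the tile: (Ws + offs) modulo the number of
    64-bit elements in a vector of vl_bits bits."""
    dim = vl_bits // 64  # 64-bit elements per vector
    return ((index & MASK32) + offset) % dim


def element_address(base: int, xm: int, e: int) -> int:
    """Address of element e: the scalar offset advances by one per
    element and is multiplied by 8 (LSL #3) before being added to base.
    Assumption: simple 64-bit wraparound stands in for AddressAdd."""
    moffs = (xm + e) & MASK64
    return (base + moffs * 8) & MASK64
```

For example, with a 256-bit vector length there are four 64-bit elements per vector, so a slice index of 3 with an immediate offset of 1 wraps around to slice 0.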
Green
True
True
True
SM_1_only
Fixed encoding bits: 1 1 1 0 0 0 0 0 1 1 0 0
LD1D { <ZAt><HV>.D[<Ws>, <offs>] }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #3}]
if !IsFeatureImplemented(FEAT_SME) then UNDEFINED;
constant integer n = UInt(Rn);
constant integer m = UInt(Rm);
constant integer g = UInt('0':Pg);
constant integer s = UInt('011':Rs);
constant integer t = UInt(ZAt);
constant integer offset = UInt(o1);
constant integer esize = 64;
constant boolean vertical = V == '1';
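As a rough illustration of the decode step above, the sketch below maps already-extracted field values to the operand numbers used by the operation. The function name `decode_operands` and the dictionary layout are hypothetical; field extraction from the 32-bit instruction word is deliberately left out.

```python
def decode_operands(Rn: int, Rm: int, Pg: int, Rs: int,
                    ZAt: int, o1: int, V: int) -> dict:
    """Mirror the decode pseudocode using plain integers for the fields."""
    return {
        "n": Rn,                  # base register number (31 selects SP)
        "m": Rm,                  # offset register number
        "g": Pg,                  # '0':Pg, so only P0-P7 are reachable
        "s": 0b01100 | Rs,        # '011':Rs, so only W12-W15 are reachable
        "t": ZAt,                 # tile number ZA0-ZA7
        "offset": o1,             # slice index offset, 0 or 1
        "esize": 64,              # element size in bits
        "vertical": V == 1,       # V == '1' selects a vertical slice
    }
```

With `Rs` equal to 0, 1, 2, or 3 this yields slice index registers W12, W13, W14, or W15 respectively.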
<ZAt>
Is the name of the ZA tile ZA0-ZA7 to be accessed, encoded in the "ZAt" field.
<HV>
Is the horizontal or vertical slice indicator, encoded in the "V" field: H (horizontal) when V is 0, V (vertical) when V is 1.
<Ws>
Is the 32-bit name of the slice index register W12-W15, encoded in the "Rs" field.
<offs>
Is the slice index offset, in the range 0 to 1, encoded in the "o1" field.
<Pg>
Is the name of the governing scalable predicate register P0-P7, encoded in the "Pg" field.
<Xn|SP>
Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field.
<Xm>
Is the optional 64-bit name of the general-purpose offset register, defaulting to XZR, encoded in the "Rm" field.
CheckStreamingSVEAndZAEnabled();
constant integer VL = CurrentVL;
constant integer PL = VL DIV 8;
constant integer dim = VL DIV esize;
bits(64) base;
bits(64) addr;
constant bits(PL) mask = P[g, PL];
bits(64) moffs = X[m, 64];
constant bits(32) index = X[s, 32];
constant integer slice = (UInt(index) + offset) MOD dim;
bits(VL) result;
constant integer mbytes = esize DIV 8;
constant boolean contiguous = TRUE;
constant boolean nontemporal = FALSE;
constant boolean tagchecked = TRUE;
constant AccessDescriptor accdesc = CreateAccDescSME(MemOp_LOAD, nontemporal, contiguous,
                                                     tagchecked);
if n == 31 then
    if (AnyActiveElement(mask, esize) ||
          ConstrainUnpredictableBool(Unpredictable_CHECKSPNONEACTIVE)) then
        CheckSPAlignment();
    base = SP[];
else
    base = X[n, 64];
for e = 0 to dim - 1
    addr = AddressAdd(base, UInt(moffs) * mbytes, accdesc);
    if ActivePredicateElement(mask, e, esize) then
        Elem[result, e, esize] = Mem[addr, mbytes, accdesc];
    else
        Elem[result, e, esize] = Zeros(esize);
    moffs = moffs + 1;
ZAslice[t, esize, vertical, slice, VL] = result;
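The operation above can be approximated by a small behavioural model. This is a sketch under several assumptions, not the architectural definition: the function name `ld1d_tile_slice` is hypothetical, the tile is a `dim` x `dim` list of integers, memory is a little-endian byte string addressed from 0, the predicate is reduced to one boolean per 64-bit element, and stack pointer alignment checks, faults, and memory attributes are ignored.

```python
def ld1d_tile_slice(za_tile, mem, pred, base, xm, ws, offs, vertical):
    """Load one slice of a 64-bit element tile, zeroing inactive elements.

    za_tile  : list of dim rows, each a list of dim ints (updated in place)
    mem      : bytes-like memory image (assumed little-endian)
    pred     : list of dim booleans, one per 64-bit element (simplified)
    base, xm : scalar base address and scalar offset (in elements)
    ws, offs : slice index register value and immediate offset (0 or 1)
    vertical : True for a vertical slice, False for a horizontal one
    """
    dim = len(za_tile)
    slice_idx = ((ws & 0xFFFFFFFF) + offs) % dim

    result = []
    for e in range(dim):
        if pred[e]:
            addr = base + (xm + e) * 8  # offset scaled by 8 bytes
            result.append(int.from_bytes(mem[addr:addr + 8], "little"))
        else:
            result.append(0)  # inactive element: no read, set to zero

    # Write the assembled vector into the selected row or column.
    for e, value in enumerate(result):
        if vertical:
            za_tile[e][slice_idx] = value
        else:
            za_tile[slice_idx][e] = value
    return za_tile
```

In this model a horizontal slice is a row of the tile and a vertical slice is a column, so the same loaded vector lands in `za_tile[slice_idx]` or in position `slice_idx` of every row depending on `vertical`.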