¡CLDEMOTECLDEMOTEcldemoteCache Line Demotecldemote#8*https://www.felixcloutier.com/x86/cldemote VGATHERPF0QPD VGATHERPF0QPD vgatherpf0qpdmSparse Prefetch Packed Double-Precision Floating-Point Data Values with Signed Quadword Indices Using T0 Hint vgatherpf0qpdMLYhttps://www.felixcloutier.com/x86/vgatherpf0dps:vgatherpf0qps:vgatherpf0dpd:vgatherpf0qpdCMPPDCMPPDcmppd5Compare Packed Double-Precision Floating-Point ValuescmppdCMPPD3cmppdCMPPD3/'https://www.felixcloutier.com/x86/cmppd VPERM2I128 VPERM2I128 vperm2i128Permute 128-Bit Integer Values vperm2i1284! vperm2i12842!,https://www.felixcloutier.com/x86/vperm2i128 VRSQRT28PD VRSQRT28PD vrsqrt28pd€Approximation to the Reciprocal Square Root of Packed Double-Precision Floating-Point Values with Less Than 2^-28 Relative Error vrsqrt28pdAM vrsqrt28pdM vrsqrt28pdAM vrsqrt28pdM vrsqrt28pdRM vrsqrt28pdRM,https://www.felixcloutier.com/x86/vrsqrt28pdBSWAPBSWAPbswap Byte Swapbswap3bswap3'https://www.felixcloutier.com/x86/bswapJPJPjpJump if parity (PF == 1)jpJPS3NjpJPS3OKANDNDKANDNDkandnd$Bitwise Logical AND NOT 32-bit MaskskandndI=https://www.felixcloutier.com/x86/kandnw:kandnb:kandnq:kandndKTESTDKTESTDktestd#Bit Test 32-bit Masks and Set FlagsktestdI=https://www.felixcloutier.com/x86/ktestw:ktestb:ktestq:ktestdKUNPCKWDKUNPCKWDkunpckwd"Unpack and Interleave 16-bit MaskskunpckwdI<https://www.felixcloutier.com/x86/kunpckbw:kunpckwd:kunpckdqPMINSBPMINSBpminsb&Minimum of Packed Signed Byte Integerspminsb3pminsb3//https://www.felixcloutier.com/x86/pminsb:pminswCQOCQOcqoConvert Quadword to Octawordcqto3-https://www.felixcloutier.com/x86/cwd:cdq:cqoUNPCKLPDUNPCKLPDunpcklpdGUnpack and Interleave Low Packed Double-Precision Floating-Point ValuesunpcklpdUNPCKLPD3unpcklpdUNPCKLPD3/*https://www.felixcloutier.com/x86/unpcklpdDECDECdecDecrement by 1decbDECB3 decwDECW3 declDECL3decqDECQ3decbDECB3#decwDECW3$declDECL3'decqDECQ3+%https://www.felixcloutier.com/x86/dec VFNMSUB231PH VFNMSUB231PH vfnmsub231phOFused Negative Multiply-Subtract of Packed Half-Precision Floating-Point Values vfnmsub231ph<K vfnmsub231phK vfnmsub231ph>K vfnmsub231phK vfnmsub231ph@R vfnmsub231phR vfnmsub231ph<K vfnmsub231phK vfnmsub231ph>K vfnmsub231phK vfnmsub231ph@R vfnmsub231phR vfnmsub231phQR vfnmsub231phQRlhttps://www.felixcloutier.com/x86/vfmsub132ph:vfnmsub132ph:vfmsub213ph:vfnmsub213ph:vfmsub231ph:vfnmsub231ph VINSERTI128 VINSERTI128 vinserti128Insert Packed Integer Values vinserti1284! vinserti1284/!ahttps://www.felixcloutier.com/x86/vinserti128:vinserti32x4:vinserti64x2:vinserti32x8:vinserti64x4VPERMI2QVPERMI2Qvpermi2q?Full Permute of Quadwords From Two Tables Overwriting the Index vpermi2q=Hvpermi2qHvpermi2q?Hvpermi2qHvpermi2qAHvpermi2qHvpermi2q=Hvpermi2qHvpermi2q?Hvpermi2qHvpermi2qAHvpermi2qHPhttps://www.felixcloutier.com/x86/vpermi2w:vpermi2d:vpermi2q:vpermi2ps:vpermi2pd VPMACSSWD VPMACSSWD vpmacsswdKPacked Multiply Accumulate with Saturation Signed Word to Signed Doubleword vpmacsswd" vpmacsswd/" VFMSUB132PH VFMSUB132PH vfmsub132phFFused Multiply-Subtract of Packed Half-Precision Floating-Point Values vfmsub132ph<K vfmsub132phK vfmsub132ph>K vfmsub132phK vfmsub132ph@R vfmsub132phR vfmsub132ph<K vfmsub132phK vfmsub132ph>K vfmsub132phK vfmsub132ph@R vfmsub132phR vfmsub132phQR vfmsub132phQRlhttps://www.felixcloutier.com/x86/vfmsub132ph:vfnmsub132ph:vfmsub213ph:vfnmsub213ph:vfmsub231ph:vfnmsub231phVSTMXCSRVSTMXCSRvstmxcsrStore MXCSR Register Statevstmxcsr4' VSUBSSVSUBSSvsubss6Subtract Scalar Single-Precision Floating-Point ValuesvsubssHvsubss'Hvsubss4 vsubssHvsubss4' vsubss'HvsubssQHvsubssQHLFENCELFENCElfence Load FencelfenceLFENCE3(https://www.felixcloutier.com/x86/lfenceBLENDPSBLENDPSblendps4 Blend Packed Single Precision Floating-Point Valuesblendps3blendps3/)https://www.felixcloutier.com/x86/blendpsPFRCPPFRCPpfrcp.Packed Floating-Point Reciprocal ApproximationpfrcpPFRCP3pfrcpPFRCP3+VSCATTERPF1DPDVSCATTERPF1DPDvscatterpf1dpd„Sparse Prefetch Packed Double-Precision Floating-Point Data Values with Signed Doubleword Indices Using T1 Hint with Intent to Writevscatterpf1dpdGL]https://www.felixcloutier.com/x86/vscatterpf1dps:vscatterpf1qps:vscatterpf1dpd:vscatterpf1qpdVHSUBPSVHSUBPSvhsubps$Packed Single-FP Horizontal Subtractvhsubps4 vhsubps4/ vhsubps4 vhsubps42 VPMOVSDWVPMOVSDWvpmovsdwKDown Convert Packed Doubleword Values to Word Values with Signed Saturation vpmovsdwHvpmovsdw,HvpmovsdwHvpmovsdw0HvpmovsdwHvpmovsdw3HvpmovsdwHvpmovsdwHvpmovsdwHvpmovsdw+Hvpmovsdw/Hvpmovsdw2H<https://www.felixcloutier.com/x86/vpmovdw:vpmovsdw:vpmovusdwSHUFPDSHUFPDshufpd5Shuffle Packed Double-Precision Floating-Point ValuesshufpdSHUFPD3shufpdSHUFPD3/(https://www.felixcloutier.com/x86/shufpd VFMSUBADDPD VFMSUBADDPD vfmsubaddpdXFused Multiply-Alternating Subtract/Add of Packed Double-Precision Floating-Point Values vfmsubaddpd$ vfmsubaddpd/$ vfmsubaddpd/$ vfmsubaddpd$ vfmsubaddpd2$ vfmsubaddpd2$JLEJLEjle+Jump if less or equal (ZF == 1 or SF != OF)jleJLE3NjleJLE3OVPERMT2BVPERMT2Bvpermt2b9Full Permute of Bytes From Two Tables Overwriting a Table vpermt2bTvpermt2b/Tvpermt2bTvpermt2b2Tvpermt2bTvpermt2b5Tvpermt2bTvpermt2b/Tvpermt2bTvpermt2b2Tvpermt2bTvpermt2b5T*https://www.felixcloutier.com/x86/vpermt2b VCVTUQQ2PD VCVTUQQ2PD vcvtuqq2pdZConvert Packed Unsigned Quadword Integers to Packed Double-Precision Floating-Point Values vcvtuqq2pd=J vcvtuqq2pd?J vcvtuqq2pdAJ vcvtuqq2pdJ vcvtuqq2pdJ vcvtuqq2pdJ vcvtuqq2pd=J vcvtuqq2pdJ vcvtuqq2pd?J vcvtuqq2pdJ vcvtuqq2pdAJ vcvtuqq2pdJ vcvtuqq2pdQJ vcvtuqq2pdQJ,https://www.felixcloutier.com/x86/vcvtuqq2pdSETNBSETNBsetnbSet byte if not below (CF == 0)setnbSETCC3 setnbSETCC3# VFCMADDCPH VFCMADDCPH vfcmaddcphSFused Conjugate Multiply-Add of Complex Packed Half-Precision Floating-Point Values vfcmaddcph9K vfcmaddcphK vfcmaddcph:K vfcmaddcphK vfcmaddcph;R vfcmaddcphR vfcmaddcph9K vfcmaddcphK vfcmaddcph:K vfcmaddcphK vfcmaddcph;R vfcmaddcphR vfcmaddcphQR vfcmaddcphQR6https://www.felixcloutier.com/x86/vfcmaddcph:vfmaddcphVPMULLWVPMULLWvpmullw9Multiply Packed Signed Word Integers and Store Low ResultvpmullwIvpmullw/IvpmullwIvpmullw2IvpmullwIvpmullw5Ivpmullw4 vpmullwIvpmullw4/ vpmullw/Ivpmullw4!vpmullwIvpmullw42!vpmullw2IvpmullwIvpmullw5I SERIALIZE SERIALIZE serializeSerialize Instruction Execution serializeF+https://www.felixcloutier.com/x86/serialize VPCMPISTRM VPCMPISTRM vpcmpistrm3Packed Compare Implicit Length Strings, Return Mask vpcmpistrm4  vpcmpistrm4/ VCMPPHVCMPPHvcmpph3Compare Packed Half-Precision Floating-Point Valuesvcmpph<Kvcmpph<KvcmpphKvcmpphKvcmpph>Kvcmpph>KvcmpphKvcmpphKvcmpph@Rvcmpph@RvcmpphRvcmpphRvcmpphRRvcmpphRR(https://www.felixcloutier.com/x86/vcmpph VCVTPD2UDQ VCVTPD2UDQ vcvtpd2udq\Convert Packed Double-Precision Floating-Point Values to Packed Unsigned Doubleword Integers vcvtpd2udqx=H vcvtpd2udqy?H vcvtpd2udqAH vcvtpd2udqxH vcvtpd2udqyH vcvtpd2udqH vcvtpd2udqx=H vcvtpd2udqy?H vcvtpd2udqxH vcvtpd2udqyH vcvtpd2udqAH vcvtpd2udqH vcvtpd2udqQH vcvtpd2udqQH,https://www.felixcloutier.com/x86/vcvtpd2udqKANDBKANDBkandbBitwise Logical AND 8-bit MaskskandbJ9https://www.felixcloutier.com/x86/kandw:kandb:kandq:kanddVPMAXSDVPMAXSDvpmaxsd,Maximum of Packed Signed Doubleword Integersvpmaxsd9HvpmaxsdHvpmaxsd:HvpmaxsdHvpmaxsd;HvpmaxsdHvpmaxsd9Hvpmaxsd4 vpmaxsdHvpmaxsd4/ vpmaxsd:Hvpmaxsd4!vpmaxsdHvpmaxsd42!vpmaxsd;HvpmaxsdHCLFLUSHCLFLUSHclflushFlush Cache Lineclflush#9)https://www.felixcloutier.com/x86/clflushVPSRLVQVPSRLVQvpsrlvq1Variable Shift Packed Quadword Data Right Logicalvpsrlvq=HvpsrlvqHvpsrlvq?HvpsrlvqHvpsrlvqAHvpsrlvqHvpsrlvq=Hvpsrlvq4!vpsrlvqHvpsrlvq4/!vpsrlvq?Hvpsrlvq4!vpsrlvqHvpsrlvq42!vpsrlvqAHvpsrlvqH9https://www.felixcloutier.com/x86/vpsrlvw:vpsrlvd:vpsrlvqVFMSUBADD132PSVFMSUBADD132PSvfmsubadd132psXFused Multiply-Alternating Subtract/Add of Packed Single-Precision Floating-Point Valuesvfmsubadd132ps9Hvfmsubadd132psHvfmsubadd132ps:Hvfmsubadd132psHvfmsubadd132ps;Hvfmsubadd132psHvfmsubadd132ps9Hvfmsubadd132ps4#vfmsubadd132psHvfmsubadd132ps4/#vfmsubadd132ps:Hvfmsubadd132ps4#vfmsubadd132psHvfmsubadd132ps42#vfmsubadd132ps;Hvfmsubadd132psHvfmsubadd132psQHvfmsubadd132psQHNhttps://www.felixcloutier.com/x86/vfmsubadd132ps:vfmsubadd213ps:vfmsubadd231psKORTESTWKORTESTWkortestwOR 16-bit Masks and Set FlagskortestwHEhttps://www.felixcloutier.com/x86/kortestw:kortestb:kortestq:kortestdMOVZXMOVZXmovzxMove with Zero-Extend movzbwMOVBWZX3  movzbwMOVBWZX3 #movzblMOVBLZX3 movzwlMOVWLZX3 movzblMOVBLZX3#movzwlMOVWLZX3$movzbqMOVBQZX3 movzwqMOVWQZX3 movzbqMOVBQZX3#movzwqMOVWQZX3$'https://www.felixcloutier.com/x86/movzxVBROADCASTI64X4VBROADCASTI64X4vbroadcasti64x4 Broadcast Four Quadword Elementsvbroadcasti64x42Hvbroadcasti64x42HVMOVNTDQVMOVNTDQvmovntdq-Store Double Quadword Using Non-Temporal Hintvmovntdq4/ vmovntdq/Hvmovntdq42 vmovntdq2Hvmovntdq5H VPHADDUWQ VPHADDUWQ vphadduwq/Packed Horizontal Add Unsigned Word to Quadword vphadduwq" vphadduwq/"BLCMSKBLCMSKblcmskMask From Lowest Clear Bitblcmsk6blcmsk'6blcmsk6blcmsk+6 CVTTSD2SI CVTTSD2SI cvttsd2siJConvert with Truncation Scalar Double-Precision FP Value to Signed Integer cvttsd2si CVTTSD2SL3 cvttsd2si CVTTSD2SL3+ cvttsd2si CVTTSD2SQ3 cvttsd2si CVTTSD2SQ3++https://www.felixcloutier.com/x86/cvttsd2siPSRLDPSRLDpsrld*Shift Packed Doubleword Data Right LogicalpsrldPSRLL3 psrldPSRLL3 psrldPSRLL3+ psrldPSRLL3psrldPSRLL3psrldPSRLL3/3https://www.felixcloutier.com/x86/psrlw:psrld:psrlqSTDSTDstdSet Direction FlagstdSTD3%https://www.felixcloutier.com/x86/stdPACKUSWBPACKUSWBpackuswb.Pack Words into Bytes with Unsigned SaturationpackuswbPACKUSWB3 packuswbPACKUSWB3+ packuswbPACKUSWB3packuswbPACKUSWB3/*https://www.felixcloutier.com/x86/packuswb VFNMSUB132SH VFNMSUB132SH vfnmsub132shOFused Negative Multiply-Subtract of Scalar Half-Precision Floating-Point Values vfnmsub132shR vfnmsub132sh$R vfnmsub132shR vfnmsub132sh$R vfnmsub132shQR vfnmsub132shQRlhttps://www.felixcloutier.com/x86/vfmsub132sh:vfnmsub132sh:vfmsub213sh:vfnmsub213sh:vfmsub231sh:vfnmsub231sh VPERMI2PS VPERMI2PS vpermi2ps\Full Permute of Single-Precision Floating-Point Values From Two Tables Overwriting the Index  vpermi2ps9H vpermi2psH vpermi2ps:H vpermi2psH vpermi2ps;H vpermi2psH vpermi2ps9H vpermi2psH vpermi2ps:H vpermi2psH vpermi2ps;H vpermi2psHPhttps://www.felixcloutier.com/x86/vpermi2w:vpermi2d:vpermi2q:vpermi2ps:vpermi2pdVPSRLDVPSRLDvpsrld*Shift Packed Doubleword Data Right Logicalvpsrld9Hvpsrld:Hvpsrld;HvpsrldHvpsrldHvpsrld/HvpsrldHvpsrldHvpsrld/HvpsrldHvpsrldHvpsrld/Hvpsrld9Hvpsrld4 vpsrldHvpsrld4 vpsrldHvpsrld4/ vpsrld/Hvpsrld:Hvpsrld4!vpsrldHvpsrld4!vpsrldHvpsrld4/!vpsrld/Hvpsrld;HvpsrldHvpsrldHvpsrld/HPHADDWPHADDWphaddw#Packed Horizontal Add Word Integersphaddw3phaddw3+phaddw3phaddw3//https://www.felixcloutier.com/x86/phaddw:phaddd VPTESTNMB VPTESTNMB vptestnmb7Logical NAND of Packed Byte Integer Values and Set Mask  vptestnmbI vptestnmbI vptestnmb/I vptestnmb/I vptestnmbI vptestnmbI vptestnmb2I vptestnmb2I vptestnmbI vptestnmbI vptestnmb5I vptestnmb5IIhttps://www.felixcloutier.com/x86/vptestnmb:vptestnmw:vptestnmd:vptestnmqADDSUBPDADDSUBPDaddsubpdPacked Double-FP Add/Subtractaddsubpd3addsubpd3/*https://www.felixcloutier.com/x86/addsubpdANDPDANDPDandpdDBitwise Logical AND of Packed Double-Precision Floating-Point ValuesandpdANDPD3andpdANDPD3/'https://www.felixcloutier.com/x86/andpdVMINPHVMINPHvminph:Return Minimum Packed Half-Precision Floating-Point Valuesvminph<KvminphKvminph>KvminphKvminph@RvminphRvminph<KvminphKvminph>KvminphKvminph@RvminphRvminphRRvminphRR(https://www.felixcloutier.com/x86/vminph VPCMPESTRI VPCMPESTRI vpcmpestri4Packed Compare Explicit Length Strings, Return Index vpcmpestriq4  vpcmpestriq4/  VINSERTF64X2 VINSERTF64X2 vinsertf64x2@Insert 128 Bits of Packed Double-Precision Floating-Point Values vinsertf64x2J vinsertf64x2/J vinsertf64x2J vinsertf64x2/J vinsertf64x2J vinsertf64x2/J vinsertf64x2J vinsertf64x2/JMOVHLPSMOVHLPSmovhlps>Move Packed Single-Precision Floating-Point Values High to LowmovhlpsMOVHLPS3)https://www.felixcloutier.com/x86/movhlpsAESENCAESENCaesenc+Perform One Round of an AES Encryption FlowaesencAESENC'aesencAESENC/'(https://www.felixcloutier.com/x86/aesencPMULLDPMULLDpmulld?Multiply Packed Signed Doubleword Integers and Store Low Resultpmulld3pmulld3//https://www.felixcloutier.com/x86/pmulld:pmullqVPCOMUWVPCOMUWvpcomuw%Compare Packed Unsigned Word Integersvpcomuw"vpcomuw/"SARXSARXsarx.Arithmetic Shift Right Without Affecting Flagssarxl5sarxl'5sarxq5sarxq+50https://www.felixcloutier.com/x86/sarx:shlx:shrx VGATHERPF0DPD VGATHERPF0DPD vgatherpf0dpdoSparse Prefetch Packed Double-Precision Floating-Point Data Values with Signed Doubleword Indices Using T0 Hint vgatherpf0dpdGLYhttps://www.felixcloutier.com/x86/vgatherpf0dps:vgatherpf0qps:vgatherpf0dpd:vgatherpf0qpdVPMOVQ2MVPMOVQ2Mvpmovq2m7Move Signs of Packed Quadword Integers to Mask Registervpmovq2mJvpmovq2mJvpmovq2mJEhttps://www.felixcloutier.com/x86/vpmovb2m:vpmovw2m:vpmovd2m:vpmovq2m VEXPANDPD VEXPANDPD vexpandpdKLoad Sparse Packed Double-Precision Floating-Point Values from Dense Memory  vexpandpdK vexpandpdH vexpandpdH vexpandpd/K vexpandpd2H vexpandpd5H vexpandpdK vexpandpd/K vexpandpdH vexpandpd2H vexpandpdH vexpandpd5H+https://www.felixcloutier.com/x86/vexpandpdMOVDQ2QMOVDQ2Qmovdq2q1Move Quadword from XMM to MMX Technology Registermovdq2q3)https://www.felixcloutier.com/x86/movdq2qPTESTPTESTptestPacked Logical Compareptest3ptest3/'https://www.felixcloutier.com/x86/ptestSETNCSETNCsetncSet byte if not carry (CF == 0)setncSETCC3 setncSETCC3# VPGATHERDQ VPGATHERDQ vpgatherdq=Gather Packed Quadword Values Using Signed Doubleword Indices vpgatherdqBH vpgatherdqBH vpgatherdqFH vpgatherdqB! vpgatherdqB!7https://www.felixcloutier.com/x86/vpgatherdq:vpgatherqq VPMOVUSDW VPMOVUSDW vpmovusdwMDown Convert Packed Doubleword Values to Word Values with Unsigned Saturation  vpmovusdwH vpmovusdw,H vpmovusdwH vpmovusdw0H vpmovusdwH vpmovusdw3H vpmovusdwH vpmovusdwH vpmovusdwH vpmovusdw+H vpmovusdw/H vpmovusdw2H<https://www.felixcloutier.com/x86/vpmovdw:vpmovsdw:vpmovusdwVPROLVDVPROLVDvprolvd&Variable Rotate Packed Doubleword Left vprolvd9HvprolvdHvprolvd:HvprolvdHvprolvd;HvprolvdHvprolvd9HvprolvdHvprolvd:HvprolvdHvprolvd;HvprolvdH?https://www.felixcloutier.com/x86/vprold:vprolvd:vprolq:vprolvq VGATHERPF1QPD VGATHERPF1QPD vgatherpf1qpdmSparse Prefetch Packed Double-Precision Floating-Point Data Values with Signed Quadword Indices Using T1 Hint vgatherpf1qpdMLYhttps://www.felixcloutier.com/x86/vgatherpf1dps:vgatherpf1qps:vgatherpf1dpd:vgatherpf1qpdANDNPDANDNPDandnpdHBitwise Logical AND NOT of Packed Double-Precision Floating-Point ValuesandnpdANDNPD3andnpdANDNPD3/(https://www.felixcloutier.com/x86/andnpd VGATHERDPD VGATHERDPD vgatherdpdTGather Packed Double-Precision Floating-Point Values Using Signed Doubleword Indices vgatherdpdBH vgatherdpdBH vgatherdpdFH vgatherdpdB! vgatherdpdB!7https://www.felixcloutier.com/x86/vgatherdps:vgatherdpdPADDUSWPADDUSWpaddusw:Add Packed Unsigned Word Integers with Unsigned SaturationpadduswPADDUSW3 padduswPADDUSW3+ padduswPADDUSW3padduswPADDUSW3/1https://www.felixcloutier.com/x86/paddusb:padduswVSQRTPHVSQRTPHvsqrtphCCompute Square Roots of Packed Half-Precision Floating-Point Valuesvsqrtph<Kvsqrtph>Kvsqrtph@RvsqrtphKvsqrtphKvsqrtphRvsqrtph<KvsqrtphKvsqrtph>KvsqrtphKvsqrtph@RvsqrtphRvsqrtphQRvsqrtphQR)https://www.felixcloutier.com/x86/vsqrtphVSQRTSHVSQRTSHvsqrtshACompute Square Root of Scalar Half-Precision Floating-Point ValuevsqrtshRvsqrtsh$RvsqrtshRvsqrtsh$RvsqrtshQRvsqrtshQR)https://www.felixcloutier.com/x86/vsqrtshCVTSS2SICVTSS2SIcvtss2si9Convert Scalar Single-Precision FP Value to Dword Integercvtss2siCVTSS2SL3cvtss2siCVTSS2SL3'cvtss2siCVTSS2SQ3cvtss2siCVTSS2SQ3'*https://www.felixcloutier.com/x86/cvtss2siVBROADCASTF32X4VBROADCASTF32X4vbroadcastf32x47Broadcast Four Single-Precision Floating-Point Elementsvbroadcastf32x4/Hvbroadcastf32x4/Hvbroadcastf32x4/Hvbroadcastf32x4/H VSCALEFSS VSCALEFSS vscalefss_Scale Scalar Single-Precision Floating-Point Value With a Single-Precision Floating-Point Value vscalefssH vscalefss'H vscalefssH vscalefss'H vscalefssQH vscalefssQH+https://www.felixcloutier.com/x86/vscalefss VPCONFLICTD VPCONFLICTD vpconflictdWDetect Conflicts Within a Vector of Packed Doubleword Values into Dense Memory/Register  vpconflictd9N vpconflictd:N vpconflictd;N vpconflictdN vpconflictdN vpconflictdN vpconflictd9N vpconflictdN vpconflictd:N vpconflictdN vpconflictd;N vpconflictdN9https://www.felixcloutier.com/x86/vpconflictd:vpconflictqPSUBUSWPSUBUSWpsubusw?Subtract Packed Unsigned Word Integers with Unsigned SaturationpsubuswPSUBUSW3 psubuswPSUBUSW3+ psubuswPSUBUSW3psubuswPSUBUSW3/1https://www.felixcloutier.com/x86/psubusb:psubuswORPDORPDorpd<Bitwise Logical OR of Double-Precision Floating-Point ValuesorpdORPD3orpdORPD3/&https://www.felixcloutier.com/x86/orpd VCOMPRESSPS VCOMPRESSPS vcompresspsUStore Sparse Packed Single-Precision Floating-Point Values into Dense Memory/Register  vcompresspsH vcompressps0H vcompresspsH vcompressps3H vcompresspsH vcompressps6H vcompresspsH vcompresspsH vcompresspsH vcompressps/H vcompressps2H vcompressps5H-https://www.felixcloutier.com/x86/vcompresspsVRANGEPDVRANGEPDvrangepdXRange Restriction Calculation For Packed Pairs of Double-Precision Floating-Point Valuesvrangepd=JvrangepdJvrangepd?JvrangepdJvrangepdAJvrangepdJvrangepd=JvrangepdJvrangepd?JvrangepdJvrangepdAJvrangepdJvrangepdRJvrangepdRJ*https://www.felixcloutier.com/x86/vrangepdPMULUDQPMULUDQpmuludq,Multiply Packed Unsigned Doubleword IntegerspmuludqPMULULQ3pmuludqPMULULQ3+pmuludqPMULULQ3pmuludqPMULULQ3/)https://www.felixcloutier.com/x86/pmuludqRDFSBASERDFSBASErdfsbaseReaD FS segment BASErdfsbase=rdfsbase=3https://www.felixcloutier.com/x86/rdfsbase:rdgsbaseVCMPSDVCMPSDvcmpsd5Compare Scalar Double-Precision Floating-Point ValuesvcmpsdHvcmpsdHvcmpsd+Hvcmpsd+Hvcmpsd vcmpsd+ vcmpsdRHvcmpsdRH VSHA512MSG2 VSHA512MSG2 vsha512msg2FPerform a Final Calculation for the Next Four SHA512 Message Quadwords vsha512msg2)VPHSUBBWVPHSUBBWvphsubbw5Packed Horizontal Subtract Signed Byte to Signed Wordvphsubbw"vphsubbw/"CMOVNAECMOVNAEcmovnae$Move if not above or equal (CF == 1)cmovnaew3  cmovnaew3 $cmovnael3cmovnael3'cmovnaeq3cmovnaeq3+MOVLPDMOVLPDmovlpd5Move Low Packed Double-Precision Floating-Point ValuemovlpdMOVLPD3+movlpdMOVLPD3+(https://www.felixcloutier.com/x86/movlpdKSHIFTRQKSHIFTRQkshiftrqShift Right 64-bit MaskskshiftrqIEhttps://www.felixcloutier.com/x86/kshiftrw:kshiftrb:kshiftrq:kshiftrd VPERMT2PD VPERMT2PD vpermt2pdZFull Permute of Double-Precision Floating-Point Values From Two Tables Overwriting a Table  vpermt2pd=H vpermt2pdH vpermt2pd?H vpermt2pdH vpermt2pdAH vpermt2pdH vpermt2pd=H vpermt2pdH vpermt2pd?H vpermt2pdH vpermt2pdAH vpermt2pdHPhttps://www.felixcloutier.com/x86/vpermt2w:vpermt2d:vpermt2q:vpermt2ps:vpermt2pd VCVTUQQ2PH VCVTUQQ2PH vcvtuqq2phXConvert Packed Unsigned Quadword Integers to Packed Half-Precision Floating-Point Values vcvtuqq2phx=K vcvtuqq2phy?K vcvtuqq2phzAR vcvtuqq2phxK vcvtuqq2phyK vcvtuqq2phzR vcvtuqq2phx=K vcvtuqq2phy?K vcvtuqq2phzAR vcvtuqq2phxK vcvtuqq2phyK vcvtuqq2phzR vcvtuqq2phzQR vcvtuqq2phzQR,https://www.felixcloutier.com/x86/vcvtuqq2phVPSHAWVPSHAWvpshawPacked Shift Arithmetic Wordsvpshaw"vpshaw/"vpshaw/" VCVTPH2UQQ VCVTPH2UQQ vcvtph2uqq^Convert Packed Half Precision Floating-Point Values to Packed Unsigned Quadword Integer Values vcvtph2uqq*K vcvtph2uqq.K vcvtph2uqq<R vcvtph2uqqK vcvtph2uqqK vcvtph2uqqR vcvtph2uqq*K vcvtph2uqqK vcvtph2uqq.K vcvtph2uqqK vcvtph2uqq<R vcvtph2uqqR vcvtph2uqqQR vcvtph2uqqQR,https://www.felixcloutier.com/x86/vcvtph2uqqVPSHUFHWVPSHUFHWvpshufhwShuffle Packed High WordsvpshufhwIvpshufhwIvpshufhwIvpshufhw/Ivpshufhw2Ivpshufhw5Ivpshufhw4 vpshufhwIvpshufhw4/ vpshufhw/Ivpshufhw4!vpshufhwIvpshufhw42!vpshufhw2IvpshufhwIvpshufhw5IMINSDMINSDminsd;Return Minimum Scalar Double-Precision Floating-Point ValueminsdMINSD3minsdMINSD3+'https://www.felixcloutier.com/x86/minsd VINSERTI64X2 VINSERTI64X2 vinserti64x21Insert 128 Bits of Packed Quadword Integer Values vinserti64x2J vinserti64x2/J vinserti64x2J vinserti64x2/J vinserti64x2J vinserti64x2/J vinserti64x2J vinserti64x2/J VPHADDUBD VPHADDUBD vphaddubd1Packed Horizontal Add Unsigned Byte to Doubleword vphaddubd" vphaddubd/"VROUNDPSVROUNDPSvroundps3Round Packed Single Precision Floating-Point Valuesvroundps4 vroundps4/ vroundps4 vroundps42 JNBEJNBEjnbe0Jump if not below or equal (CF == 0 and ZF == 0)jnbeJHI3NjnbeJHI3OVPMULLDVPMULLDvpmulld?Multiply Packed Signed Doubleword Integers and Store Low Resultvpmulld9HvpmulldHvpmulld:HvpmulldHvpmulld;HvpmulldHvpmulld9Hvpmulld4 vpmulldHvpmulld4/ vpmulld:Hvpmulld4!vpmulldHvpmulld42!vpmulld;HvpmulldHVPSADBWVPSADBWvpsadbw#Compute Sum of Absolute Differences vpsadbw4 vpsadbwIvpsadbw4/ vpsadbw/Ivpsadbw4!vpsadbwIvpsadbw42!vpsadbw2IvpsadbwIvpsadbw5I VCVTSS2USI VCVTSS2USI vcvtss2usiSConvert Scalar Single-Precision Floating-Point Value to Unsigned Doubleword Integer vcvtss2usiH vcvtss2usi'H vcvtss2usiH vcvtss2usi'H vcvtss2usiQH vcvtss2usiQH,https://www.felixcloutier.com/x86/vcvtss2usiVBROADCASTI32X2VBROADCASTI32X2vbroadcasti32x2!Broadcast Two Doubleword Elements vbroadcasti32x2Jvbroadcasti32x2Jvbroadcasti32x2Jvbroadcasti32x2+Jvbroadcasti32x2+Jvbroadcasti32x2+Jvbroadcasti32x2Jvbroadcasti32x2+Jvbroadcasti32x2Jvbroadcasti32x2+Jvbroadcasti32x2Jvbroadcasti32x2+J VBLENDMPD VBLENDMPD vblendmpdLBlend Packed Double-Precision Floating-Point Vectors Using an OpMask Control  vblendmpd=H vblendmpdH vblendmpd?H vblendmpdH vblendmpdAH vblendmpdH vblendmpd=H vblendmpdH vblendmpd?H vblendmpdH vblendmpdAH vblendmpdH5https://www.felixcloutier.com/x86/vblendmpd:vblendmpsVPCMPBVPCMPBvpcmpb!Compare Packed Signed Byte Values vpcmpbIvpcmpbIvpcmpb/Ivpcmpb/IvpcmpbIvpcmpbIvpcmpb2Ivpcmpb2IvpcmpbIvpcmpbIvpcmpb5Ivpcmpb5I0https://www.felixcloutier.com/x86/vpcmpb:vpcmpubVALIGNQVALIGNQvalignqAlign Quadword Vectors valignq=HvalignqHvalignq?HvalignqHvalignqAHvalignqHvalignq=HvalignqHvalignq?HvalignqHvalignqAHvalignqH1https://www.felixcloutier.com/x86/valignd:valignq VFMADDSUBPD VFMADDSUBPD vfmaddsubpdXFused Multiply-Alternating Add/Subtract of Packed Double-Precision Floating-Point Values vfmaddsubpd$ vfmaddsubpd/$ vfmaddsubpd/$ vfmaddsubpd$ vfmaddsubpd2$ vfmaddsubpd2$PF2IWPF2IWpf2iw0Packed Floating-Point to Integer Word Conversionpf2iwPF2IW3pf2iwPF2IW3+MOVLPSMOVLPSmovlps6Move Low Packed Single-Precision Floating-Point ValuesmovlpsMOVLPS3+movlpsMOVLPS3+(https://www.felixcloutier.com/x86/movlpsJZJZjzJump if zero (ZF == 1)jzJEQ3NjzJEQ3OMOVSSMOVSSmovss2Move Scalar Single-Precision Floating-Point ValuesmovssMOVSS3movssMOVSS3'movssMOVSS3''https://www.felixcloutier.com/x86/movss VUNPCKHPD VUNPCKHPD vunpckhpdHUnpack and Interleave High Packed Double-Precision Floating-Point Values vunpckhpd=H vunpckhpdH vunpckhpd?H vunpckhpdH vunpckhpdAH vunpckhpdH vunpckhpd=H vunpckhpd4  vunpckhpdH vunpckhpd4/  vunpckhpd?H vunpckhpd4  vunpckhpdH vunpckhpd42  vunpckhpdAH vunpckhpdHVPCMPUBVPCMPUBvpcmpub#Compare Packed Unsigned Byte Values vpcmpubIvpcmpubIvpcmpub/Ivpcmpub/IvpcmpubIvpcmpubIvpcmpub2Ivpcmpub2IvpcmpubIvpcmpubIvpcmpub5Ivpcmpub5I0https://www.felixcloutier.com/x86/vpcmpb:vpcmpubPSUBSBPSUBSBpsubsb;Subtract Packed Signed Byte Integers with Signed SaturationpsubsbPSUBSB3 psubsbPSUBSB3+ psubsbPSUBSB3psubsbPSUBSB3//https://www.felixcloutier.com/x86/psubsb:psubsw VFMADD213SS VFMADD213SS vfmadd213ssCFused Multiply-Add of Scalar Single-Precision Floating-Point Values vfmadd213ssH vfmadd213ss'H vfmadd213ss4# vfmadd213ssH vfmadd213ss4'# vfmadd213ss'H vfmadd213ssQH vfmadd213ssQHEhttps://www.felixcloutier.com/x86/vfmadd132ss:vfmadd213ss:vfmadd231ss PMADDUBSW PMADDUBSW pmaddubsw9Multiply and Add Packed Signed and Unsigned Byte Integers pmaddubsw3 pmaddubsw3+ pmaddubsw3 pmaddubsw3/+https://www.felixcloutier.com/x86/pmaddubsw VMASKMOVPS VMASKMOVPS vmaskmovps>Conditional Move Packed Single-Precision Floating-Point Values vmaskmovps4/  vmaskmovps42  vmaskmovps4/  vmaskmovps42 KTESTWKTESTWktestw#Bit Test 16-bit Masks and Set FlagsktestwJ=https://www.felixcloutier.com/x86/ktestw:ktestb:ktestq:ktestd VREDUCEPH VREDUCEPH vreducephOPerform Reduction Transformation on Packed Half-Precision Floating-Point Values vreduceph<K vreduceph>K vreduceph@R vreducephK vreducephK vreducephR vreduceph<K vreducephK vreduceph>K vreducephK vreduceph@R vreducephR vreducephRR vreducephRR+https://www.felixcloutier.com/x86/vreducephVPSLLVDVPSLLVDvpsllvd2Variable Shift Packed Doubleword Data Left Logicalvpsllvd9HvpsllvdHvpsllvd:HvpsllvdHvpsllvd;HvpsllvdHvpsllvd9Hvpsllvd4!vpsllvdHvpsllvd4/!vpsllvd:Hvpsllvd4!vpsllvdHvpsllvd42!vpsllvd;HvpsllvdH9https://www.felixcloutier.com/x86/vpsllvw:vpsllvd:vpsllvqVFMULCPHVFMULCPHvfmulcphKFused Fused Multiply of Complex Packed Half-Precision Floating-Point Valuesvfmulcph9KvfmulcphKvfmulcph:KvfmulcphKvfmulcph;RvfmulcphRvfmulcph9KvfmulcphKvfmulcph:KvfmulcphKvfmulcph;RvfmulcphRvfmulcphQRvfmulcphQR4https://www.felixcloutier.com/x86/vfcmulcph:vfmulcph GF2P8MULB GF2P8MULB gf2p8mulbGalois Field Multiply Bytes gf2p8mulb gf2p8mulb/+https://www.felixcloutier.com/x86/gf2p8mulbVMOVUPSVMOVUPSvmovups<Move Unaligned Packed Single-Precision Floating-Point Valuesvmovups0HvmovupsHvmovups3HvmovupsHvmovups6HvmovupsHvmovups/Hvmovups2Hvmovups5Hvmovups4 vmovupsHvmovups4/ vmovups/Hvmovups4 vmovupsHvmovups42 vmovups2HvmovupsHvmovups5Hvmovups4/ vmovups/Hvmovups42 vmovups2Hvmovups5H VRNDSCALESS VRNDSCALESS vrndscaless]Round Scalar Single-Precision Floating-Point Value To Include A Given Number Of Fraction Bits vrndscalessH vrndscaless'H vrndscalessH vrndscaless'H vrndscalessRH vrndscalessRH-https://www.felixcloutier.com/x86/vrndscalessSHA1MSG1SHA1MSG1sha1msg1NPerform an Intermediate Calculation for the Next Four SHA1 Message Doublewordssha1msg1(sha1msg1/(*https://www.felixcloutier.com/x86/sha1msg1 VPSCATTERQD VPSCATTERQD vpscatterqd=Scatter Packed Doubleword Values with Signed Quadword Indices vpscatterqdEH vpscatterqdIH vpscatterqdMHQhttps://www.felixcloutier.com/x86/vpscatterdd:vpscatterdq:vpscatterqd:vpscatterqqLZCNTLZCNTlzcnt%Count the Number of Leading Zero Bitslzcntw  3lzcntw $3lzcntl33lzcntl3'3lzcntq33lzcntq3+3'https://www.felixcloutier.com/x86/lzcntVPMULHUWVPMULHUWvpmulhuw<Multiply Packed Unsigned Word Integers and Store High ResultvpmulhuwIvpmulhuw/IvpmulhuwIvpmulhuw2IvpmulhuwIvpmulhuw5Ivpmulhuw4 vpmulhuwIvpmulhuw4/ vpmulhuw/Ivpmulhuw4!vpmulhuwIvpmulhuw42!vpmulhuw2IvpmulhuwIvpmulhuw5IRCPPSRCPPSrcppsPCompute Approximate Reciprocals of Packed Single-Precision Floating-Point ValuesrcppsRCPPS3rcppsRCPPS3/'https://www.felixcloutier.com/x86/rcppsPSIGNBPSIGNBpsignbPacked Sign of Byte Integerspsignb3psignb3+psignb3psignb3/6https://www.felixcloutier.com/x86/psignb:psignw:psigndVLDDQUVLDDQUvlddquLoad Unaligned Integer 128 Bitsvlddqu4/ vlddqu42 VPMOVSQBVPMOVSQBvpmovsqbIDown Convert Packed Quadword Values to Byte Values with Signed Saturation vpmovsqbHvpmovsqb%HvpmovsqbHvpmovsqb(HvpmovsqbHvpmovsqb,HvpmovsqbHvpmovsqbHvpmovsqbHvpmovsqb$Hvpmovsqb'Hvpmovsqb+H<https://www.felixcloutier.com/x86/vpmovqb:vpmovsqb:vpmovusqbCVTSI2SDCVTSI2SDcvtsi2sd9Convert Dword Integer to Scalar Double-Precision FP Value cvtsi2sdlCVTSL2SD3 cvtsi2sdqCVTSQ2SD3 cvtsi2sdlCVTSL2SD3' cvtsi2sdqCVTSQ2SD3+*https://www.felixcloutier.com/x86/cvtsi2sdJNBJNBjnbJump if not below (CF == 0)jnbJCC3NjnbJCC3OCOMISDCOMISDcomisdLCompare Scalar Ordered Double-Precision Floating-Point Values and Set EFLAGScomisdCOMISD3comisdCOMISD3+(https://www.felixcloutier.com/x86/comisd VGETMANTPH VGETMANTPH vgetmantphMExtract Normalized Mantissas from Packed Half-Precision Floating-Point Values vgetmantph<K vgetmantph>K vgetmantph@R vgetmantphK vgetmantphK vgetmantphR vgetmantph<K vgetmantphK vgetmantph>K vgetmantphK vgetmantph@R vgetmantphR vgetmantphRR vgetmantphRR,https://www.felixcloutier.com/x86/vgetmantphVGF2P8AFFINEINVQBVGF2P8AFFINEINVQBvgf2p8affineinvqb0Galois Field (2^8) Affine Inverse Transformationvgf2p8affineinvqb=Kvgf2p8affineinvqbKvgf2p8affineinvqb?Kvgf2p8affineinvqbKvgf2p8affineinvqbAHvgf2p8affineinvqbHvgf2p8affineinvqb=Kvgf2p8affineinvqb vgf2p8affineinvqbKvgf2p8affineinvqb/ vgf2p8affineinvqb?Kvgf2p8affineinvqb vgf2p8affineinvqbKvgf2p8affineinvqb2 vgf2p8affineinvqbAHvgf2p8affineinvqbHPMAXUDPMAXUDpmaxud.Maximum of Packed Unsigned Doubleword Integerspmaxud3pmaxud3//https://www.felixcloutier.com/x86/pmaxud:pmaxuq VCVTPD2PS VCVTPD2PS vcvtpd2psNConvert Packed Double-Precision FP Values to Packed Single-Precision FP Values vcvtpd2psx=H vcvtpd2psy?H vcvtpd2psAH vcvtpd2psxH vcvtpd2psyH vcvtpd2psH vcvtpd2psx=H vcvtpd2psy?H vcvtpd2psx4  vcvtpd2psxH vcvtpd2psy4  vcvtpd2psyH vcvtpd2psx4/  vcvtpd2psy42  vcvtpd2psAH vcvtpd2psH vcvtpd2psQH vcvtpd2psQH VFNMSUB132SD VFNMSUB132SD vfnmsub132sdQFused Negative Multiply-Subtract of Scalar Double-Precision Floating-Point Values vfnmsub132sdH vfnmsub132sd+H vfnmsub132sd4# vfnmsub132sdH vfnmsub132sd4+# vfnmsub132sd+H vfnmsub132sdQH vfnmsub132sdQHHhttps://www.felixcloutier.com/x86/vfnmsub132sd:vfnmsub213sd:vfnmsub231sdVPANDDVPANDDvpandd1Bitwise Logical AND of Packed Doubleword Integers vpandd9HvpanddHvpandd:HvpanddHvpandd;HvpanddHvpandd9HvpanddHvpandd:HvpanddHvpandd;HvpanddHVPCOMUDVPCOMUDvpcomud+Compare Packed Unsigned Doubleword Integersvpcomud"vpcomud/"VPCOMUQVPCOMUQvpcomuq)Compare Packed Unsigned Quadword Integersvpcomuq"vpcomuq/"VMAXSSVMAXSSvmaxss;Return Maximum Scalar Single-Precision Floating-Point ValuevmaxssHvmaxss'Hvmaxss4 vmaxssHvmaxss4' vmaxss'HvmaxssRHvmaxssRHVCOMISDVCOMISDvcomisdLCompare Scalar Ordered Double-Precision Floating-Point Values and Set EFLAGSvcomisd4 vcomisdHvcomisd4+ vcomisd+HvcomisdRHVPSHRDVWVPSHRDVWvpshrdvw=Concatenate and Variable Shift Packed Word Data Right Logical vpshrdvwKvpshrdvw/KvpshrdvwKvpshrdvw2KvpshrdvwUvpshrdvw5UvpshrdvwKvpshrdvw/KvpshrdvwKvpshrdvw2KvpshrdvwUvpshrdvw5U VFMSUB132SH VFMSUB132SH vfmsub132shFFused Multiply-Subtract of Scalar Half-Precision Floating-Point Values vfmsub132shR vfmsub132sh$R vfmsub132shR vfmsub132sh$R vfmsub132shQR vfmsub132shQRlhttps://www.felixcloutier.com/x86/vfmsub132sh:vfnmsub132sh:vfmsub213sh:vfnmsub213sh:vfmsub231sh:vfnmsub231sh VPUNPCKHDQ VPUNPCKHDQ vpunpckhdq;Unpack and Interleave High-Order Doublewords into Quadwords vpunpckhdq9H vpunpckhdqH vpunpckhdq:H vpunpckhdqH vpunpckhdq;H vpunpckhdqH vpunpckhdq9H vpunpckhdq4  vpunpckhdqH vpunpckhdq4/  vpunpckhdq:H vpunpckhdq4! vpunpckhdqH vpunpckhdq42! vpunpckhdq;H vpunpckhdqH VCVTPH2QQ VCVTPH2QQ vcvtph2qq\Convert Packed Half Precision Floating-Point Values to Packed Singed Quadword Integer Values vcvtph2qq*K vcvtph2qq.K vcvtph2qq<R vcvtph2qqK vcvtph2qqK vcvtph2qqR vcvtph2qq*K vcvtph2qqK vcvtph2qq.K vcvtph2qqK vcvtph2qq<R vcvtph2qqR vcvtph2qqQR vcvtph2qqQR+https://www.felixcloutier.com/x86/vcvtph2qqVPMINUQVPMINUQvpminuq,Minimum of Packed Unsigned Quadword Integers vpminuq=HvpminuqHvpminuq?HvpminuqHvpminuqAHvpminuqHvpminuq=HvpminuqHvpminuq?HvpminuqHvpminuqAHvpminuqHXORPSXORPSxorps>Bitwise Logical XOR for Single-Precision Floating-Point ValuesxorpsXORPS3xorpsXORPS3/'https://www.felixcloutier.com/x86/xorpsVPERMPSVPERMPSvpermps0Permute Single-Precision Floating-Point Elements vpermps:HvpermpsHvpermps;HvpermpsHvpermps:Hvpermps4!vpermpsHvpermps42!vpermps;HvpermpsH)https://www.felixcloutier.com/x86/vpermpsVPERMI2WVPERMI2Wvpermi2w;Full Permute of Words From Two Tables Overwriting the Index vpermi2wIvpermi2w/Ivpermi2wIvpermi2w2Ivpermi2wIvpermi2w5Ivpermi2wIvpermi2w/Ivpermi2wIvpermi2w2Ivpermi2wIvpermi2w5IPhttps://www.felixcloutier.com/x86/vpermi2w:vpermi2d:vpermi2q:vpermi2ps:vpermi2pdPHSUBWPHSUBWphsubw(Packed Horizontal Subtract Word Integersphsubw3phsubw3+phsubw3phsubw3//https://www.felixcloutier.com/x86/phsubw:phsubd VFMSUBADDPS VFMSUBADDPS vfmsubaddpsXFused Multiply-Alternating Subtract/Add of Packed Single-Precision Floating-Point Values vfmsubaddps$ vfmsubaddps/$ vfmsubaddps/$ vfmsubaddps$ vfmsubaddps2$ vfmsubaddps2$VPADDUSBVPADDUSBvpaddusb:Add Packed Unsigned Byte Integers with Unsigned SaturationvpaddusbIvpaddusb/IvpaddusbIvpaddusb2IvpaddusbIvpaddusb5Ivpaddusb4 vpaddusbIvpaddusb4/ vpaddusb/Ivpaddusb4!vpaddusbIvpaddusb42!vpaddusb2IvpaddusbIvpaddusb5IVBROADCASTI64X2VBROADCASTI64X2vbroadcasti64x2Broadcast Two Quadword Elementsvbroadcasti64x2/Jvbroadcasti64x2/Jvbroadcasti64x2/Jvbroadcasti64x2/JCLCCLCclcClear Carry FlagclcCLC3%https://www.felixcloutier.com/x86/clcVRCPSSVRCPSSvrcpssOCompute Approximate Reciprocal of Scalar Single-Precision Floating-Point Valuesvrcpss4 vrcpss4' VPABSWVPABSWvpabsw&Packed Absolute Value of Word IntegersvpabswIvpabswIvpabswIvpabsw/Ivpabsw2Ivpabsw5Ivpabsw4 vpabswIvpabsw4/ vpabsw/Ivpabsw4!vpabswIvpabsw42!vpabsw2IvpabswIvpabsw5IVPDPBUUDVPDPBUUDvpdpbuudJPacked Dot Product of Unsigned-by-Unsinged Byte subvectors into DoublewordvpdpbuudXvpdpbuud/XvpdpbuudXvpdpbuud2XVTESTPDVTESTPDvtestpd/Packed Double-Precision Floating-Point Bit Testvtestpd4 vtestpd4/ vtestpd4 vtestpd42 1https://www.felixcloutier.com/x86/vtestpd:vtestpsVCMPPDVCMPPDvcmppd5Compare Packed Double-Precision Floating-Point Valuesvcmppd=Hvcmppd=HvcmppdHvcmppdHvcmppd?Hvcmppd?HvcmppdHvcmppdHvcmppdAHvcmppdAHvcmppdHvcmppdHvcmppd4 vcmppd4/ vcmppd4 vcmppd42 vcmppdRHvcmppdRHSETNPSETNPsetnp Set byte if not parity (PF == 0)setnpSETPC3 setnpSETPC3#MONITORMONITORmonitorMonitor a Linear Address RangemonitorD)https://www.felixcloutier.com/x86/monitorCMOVACMOVAcmova#Move if above (CF == 0 and ZF == 0)cmovaw3  cmovaw3 $cmoval3cmoval3'cmovaq3cmovaq3+VFMSUBADD213PHVFMSUBADD213PHvfmsubadd213phVFused Multiply-Alternating Subtract/Add of Packed Half-Precision Floating-Point Valuesvfmsubadd213ph<Kvfmsubadd213phKvfmsubadd213ph>Kvfmsubadd213phKvfmsubadd213ph@Rvfmsubadd213phRvfmsubadd213ph<Kvfmsubadd213phKvfmsubadd213ph>Kvfmsubadd213phKvfmsubadd213ph@Rvfmsubadd213phRvfmsubadd213phQRvfmsubadd213phQRNhttps://www.felixcloutier.com/x86/vfmsubadd132ph:vfmsubadd213ph:vfmsubadd231ph VFNMSUB213SD VFNMSUB213SD vfnmsub213sdQFused Negative Multiply-Subtract of Scalar Double-Precision Floating-Point Values vfnmsub213sdH vfnmsub213sd+H vfnmsub213sd4# vfnmsub213sdH vfnmsub213sd4+# vfnmsub213sd+H vfnmsub213sdQH vfnmsub213sdQHHhttps://www.felixcloutier.com/x86/vfnmsub132sd:vfnmsub213sd:vfnmsub231sdCVTPI2PSCVTPI2PScvtpi2psBConvert Packed Dword Integers to Packed Single-Precision FP Valuescvtpi2psCVTPL2PS3cvtpi2psCVTPL2PS3+*https://www.felixcloutier.com/x86/cvtpi2ps VCVTTPH2QQ VCVTTPH2QQ vcvttph2qqlConvert with Truncation Packed Half Precision Floating-Point Values to Packed Singed Quadword Integer Values vcvttph2qq*K vcvttph2qq.K vcvttph2qq<R vcvttph2qqK vcvttph2qqK vcvttph2qqR vcvttph2qq*K vcvttph2qqK vcvttph2qq.K vcvttph2qqK vcvttph2qq<R vcvttph2qqR vcvttph2qqRR vcvttph2qqRR,https://www.felixcloutier.com/x86/vcvttph2qqVMULPSVMULPSvmulps6Multiply Packed Single-Precision Floating-Point Valuesvmulps9HvmulpsHvmulps:HvmulpsHvmulps;HvmulpsHvmulps9Hvmulps4 vmulpsHvmulps4/ vmulps:Hvmulps4 vmulpsHvmulps42 vmulps;HvmulpsHvmulpsQHvmulpsQHKORBKORBkorbBitwise Logical OR 8-bit MaskskorbJ5https://www.felixcloutier.com/x86/korw:korb:korq:kordCMOVNPCMOVNPcmovnpMove if not parity (PF == 0)cmovnpw3  cmovnpw3 $cmovnpl3cmovnpl3'cmovnpq3cmovnpq3+VMAXSHVMAXSHvmaxsh9Return Maximum Scalar Half-Precision Floating-Point ValuevmaxshRvmaxsh$RvmaxshRvmaxsh$RvmaxshRRvmaxshRR(https://www.felixcloutier.com/x86/vmaxsh VCVTPS2QQ VCVTPS2QQ vcvtps2qq^Convert Packed Single Precision Floating-Point Values to Packed Singed Quadword Integer Values vcvtps2qq8J vcvtps2qq9J vcvtps2qq:J vcvtps2qqJ vcvtps2qqJ vcvtps2qqJ vcvtps2qq8J vcvtps2qqJ vcvtps2qq9J vcvtps2qqJ vcvtps2qq:J vcvtps2qqJ vcvtps2qqQJ vcvtps2qqQJ+https://www.felixcloutier.com/x86/vcvtps2qqMOVSHDUPMOVSHDUPmovshdup(Move Packed Single-FP High and Duplicatemovshdup3movshdup3/*https://www.felixcloutier.com/x86/movshdup VGATHERPF0DPS VGATHERPF0DPS vgatherpf0dpsoSparse Prefetch Packed Single-Precision Floating-Point Data Values with Signed Doubleword Indices Using T0 Hint vgatherpf0dpsKLYhttps://www.felixcloutier.com/x86/vgatherpf0dps:vgatherpf0qps:vgatherpf0dpd:vgatherpf0qpd VGETEXPPS VGETEXPPS vgetexppslExtract Exponents of Packed Single-Precision Floating-Point Values as Single-Precision Floating-Point Values vgetexpps9H vgetexpps:H vgetexpps;H vgetexppsH vgetexppsH vgetexppsH vgetexpps9H vgetexppsH vgetexpps:H vgetexppsH vgetexpps;H vgetexppsH vgetexppsRH vgetexppsRH+https://www.felixcloutier.com/x86/vgetexppsVSCATTERPF1QPDVSCATTERPF1QPDvscatterpf1qpd‚Sparse Prefetch Packed Double-Precision Floating-Point Data Values with Signed Quadword Indices Using T1 Hint with Intent to Writevscatterpf1qpdML]https://www.felixcloutier.com/x86/vscatterpf1dps:vscatterpf1qps:vscatterpf1dpd:vscatterpf1qpdPHADDSWPHADDSWphaddswAPacked Horizontal Add Signed Word Integers with Signed Saturationphaddsw3phaddsw3+phaddsw3phaddsw3/)https://www.felixcloutier.com/x86/phaddsw VPERMT2PS VPERMT2PS vpermt2psZFull Permute of Single-Precision Floating-Point Values From Two Tables Overwriting a Table  vpermt2ps9H vpermt2psH vpermt2ps:H vpermt2psH vpermt2ps;H vpermt2psH vpermt2ps9H vpermt2psH vpermt2ps:H vpermt2psH vpermt2ps;H vpermt2psHPhttps://www.felixcloutier.com/x86/vpermt2w:vpermt2d:vpermt2q:vpermt2ps:vpermt2pdRCPSSRCPSSrcpssOCompute Approximate Reciprocal of Scalar Single-Precision Floating-Point ValuesrcpssRCPSS3rcpssRCPSS3''https://www.felixcloutier.com/x86/rcpss VFNMADD132SD VFNMADD132SD vfnmadd132sdLFused Negative Multiply-Add of Scalar Double-Precision Floating-Point Values vfnmadd132sdH vfnmadd132sd+H vfnmadd132sd4# vfnmadd132sdH vfnmadd132sd4+# vfnmadd132sd+H vfnmadd132sdQH vfnmadd132sdQHHhttps://www.felixcloutier.com/x86/vfnmadd132sd:vfnmadd213sd:vfnmadd231sdIDIVIDIVidiv Signed DivideidivbIDIVB3 idivwIDIVW3 idivlIDIVL3idivqIDIVQ3idivbIDIVB3#idivwIDIVW3$idivlIDIVL3'idivqIDIVQ3+&https://www.felixcloutier.com/x86/idivPAVGBPAVGBpavgbAverage Packed Byte IntegerspavgbPAVGB3 pavgbPAVGB3+ pavgbPAVGB3pavgbPAVGB3/-https://www.felixcloutier.com/x86/pavgb:pavgwJRCXZJRCXZjrcxzJump if RCX register is 0jrcxzJCXZQ3N PREFETCHWT1 PREFETCHWT1 prefetchwt1APrefetch Vector Data Into Caches with Intent to Write and T1 Hint prefetchwt1#C-https://www.felixcloutier.com/x86/prefetchwt1UD2UD2ud2Undefined Instructionud23 VPERM2F128 VPERM2F128 vperm2f128Permute Floating-Point Values vperm2f1284  vperm2f12842 ,https://www.felixcloutier.com/x86/vperm2f128VPCMPWVPCMPWvpcmpw!Compare Packed Signed Word Values vpcmpwIvpcmpwIvpcmpw/Ivpcmpw/IvpcmpwIvpcmpwIvpcmpw2Ivpcmpw2IvpcmpwIvpcmpwIvpcmpw5Ivpcmpw5I0https://www.felixcloutier.com/x86/vpcmpw:vpcmpuwJNSJNSjnsJump if not sign (SF == 0)jnsJPL3NjnsJPL3OVPSUBDVPSUBDvpsubd#Subtract Packed Doubleword Integersvpsubd9HvpsubdHvpsubd:HvpsubdHvpsubd;HvpsubdHvpsubd9Hvpsubd4 vpsubdHvpsubd4/ vpsubd:Hvpsubd4!vpsubdHvpsubd42!vpsubd;HvpsubdHVFMADDSUB213PSVFMADDSUB213PSvfmaddsub213psXFused Multiply-Alternating Add/Subtract of Packed Single-Precision Floating-Point Valuesvfmaddsub213ps9Hvfmaddsub213psHvfmaddsub213ps:Hvfmaddsub213psHvfmaddsub213ps;Hvfmaddsub213psHvfmaddsub213ps9Hvfmaddsub213ps4#vfmaddsub213psHvfmaddsub213ps4/#vfmaddsub213ps:Hvfmaddsub213ps4#vfmaddsub213psHvfmaddsub213ps42#vfmaddsub213ps;Hvfmaddsub213psHvfmaddsub213psQHvfmaddsub213psQHNhttps://www.felixcloutier.com/x86/vfmaddsub132ps:vfmaddsub213ps:vfmaddsub231ps VRSQRT14SS VRSQRT14SS vrsqrt14ssaCompute Approximate Reciprocal of a Square Root of a Scalar Single-Precision Floating-Point Value vrsqrt14ssH vrsqrt14ss'H vrsqrt14ssH vrsqrt14ss'H,https://www.felixcloutier.com/x86/vrsqrt14ss VCVTTSD2USI VCVTTSD2USI vcvttsd2usiXConvert with Truncation Scalar Double-Precision Floating-Point Value to Unsigned Integer vcvttsd2usiH vcvttsd2usi+H vcvttsd2usiH vcvttsd2usi+H vcvttsd2usiRH vcvttsd2usiRH-https://www.felixcloutier.com/x86/vcvttsd2usiVCVTW2PHVCVTW2PHvcvtw2phKConvert Packed Word Integers to Packed Half-Precision Floating-Point Valuesvcvtw2ph<Kvcvtw2ph>Kvcvtw2ph@Rvcvtw2phKvcvtw2phKvcvtw2phRvcvtw2ph<Kvcvtw2phKvcvtw2ph>Kvcvtw2phKvcvtw2ph@Rvcvtw2phRvcvtw2phQRvcvtw2phQR*https://www.felixcloutier.com/x86/vcvtw2phPSRLDQPSRLDQpsrldq*Shift Packed Double Quadword Right LogicalpsrldqPSRLO3(https://www.felixcloutier.com/x86/psrldqSQRTSDSQRTSDsqrtsdCCompute Square Root of Scalar Double-Precision Floating-Point ValuesqrtsdSQRTSD3sqrtsdSQRTSD3+(https://www.felixcloutier.com/x86/sqrtsdVPDPBSUDVPDPBSUDvpdpbsudHPacked Dot Product of Signed-by-Unsinged Byte subvectors into DoublewordvpdpbsudXvpdpbsud/XvpdpbsudXvpdpbsud2XVPTESTMDVPTESTMDvptestmd<Logical AND of Packed Doubleword Integer Values and Set Mask vptestmd9Hvptestmd9HvptestmdHvptestmdHvptestmd:Hvptestmd:HvptestmdHvptestmdHvptestmd;Hvptestmd;HvptestmdHvptestmdHEhttps://www.felixcloutier.com/x86/vptestmb:vptestmw:vptestmd:vptestmqVPSLLWVPSLLWvpsllw#Shift Packed Word Data Left LogicalvpsllwIvpsllwIvpsllw/IvpsllwIvpsllwIvpsllw/IvpsllwIvpsllwIvpsllw/Ivpsllw/Ivpsllw2Ivpsllw5Ivpsllw4 vpsllwIvpsllw4 vpsllwIvpsllw4/ vpsllw/Ivpsllw/Ivpsllw4!vpsllwIvpsllw4!vpsllwIvpsllw4/!vpsllw/Ivpsllw2IvpsllwIvpsllwIvpsllw/Ivpsllw5IVPHADDWDVPHADDWDvphaddwd6Packed Horizontal Add Signed Word to Signed Doublewordvphaddwd"vphaddwd/"VPSUBWVPSUBWvpsubwSubtract Packed Word IntegersvpsubwIvpsubw/IvpsubwIvpsubw2IvpsubwIvpsubw5Ivpsubw4 vpsubwIvpsubw4/ vpsubw/Ivpsubw4!vpsubwIvpsubw42!vpsubw2IvpsubwIvpsubw5IPSIGNWPSIGNWpsignwPacked Sign of Word Integerspsignw3psignw3+psignw3psignw3/6https://www.felixcloutier.com/x86/psignb:psignw:psigndCVTSD2SSCVTSD2SScvtsd2ssLConvert Scalar Double-Precision FP Value to Scalar Single-Precision FP Valuecvtsd2ssCVTSD2SS3cvtsd2ssCVTSD2SS3+*https://www.felixcloutier.com/x86/cvtsd2ssVPALIGNRVPALIGNRvpalignrPacked Align RightvpalignrIvpalignr/IvpalignrIvpalignr2IvpalignrIvpalignr5Ivpalignr4 vpalignrIvpalignr4/ vpalignr/Ivpalignr4!vpalignrIvpalignr42!vpalignr2IvpalignrIvpalignr5IVPSLLQVPSLLQvpsllq'Shift Packed Quadword Data Left Logicalvpsllq=Hvpsllq?HvpsllqAHvpsllqHvpsllqHvpsllq/HvpsllqHvpsllqHvpsllq/HvpsllqHvpsllqHvpsllq/Hvpsllq=Hvpsllq4 vpsllqHvpsllq4 vpsllqHvpsllq4/ vpsllq/Hvpsllq?Hvpsllq4!vpsllqHvpsllq4!vpsllqHvpsllq4/!vpsllq/HvpsllqAHvpsllqHvpsllqHvpsllq/HVPTESTVPTESTvptestPacked Logical Comparevptest4 vptest4/ vptest4 vptest42  VSCALEFSH VSCALEFSH vscalefsh[Scale Scalar Half-Precision Floating-Point Value With a Half-Precision Floating-Point Value vscalefshR vscalefsh$R vscalefshR vscalefsh$R vscalefshQR vscalefshQR+https://www.felixcloutier.com/x86/vscalefshRDTSCRDTSCrdtscRead Time-Stamp CounterrdtscRDTSC3'https://www.felixcloutier.com/x86/rdtscPMOVSXDQPMOVSXDQpmovsxdqHMove Packed Doubleword Integers to Quadword Integers with Sign Extensionpmovsxdq3pmovsxdq3+SQRTPSSQRTPSsqrtpsECompute Square Roots of Packed Single-Precision Floating-Point ValuessqrtpsSQRTPS3sqrtpsSQRTPS3/(https://www.felixcloutier.com/x86/sqrtpsPSLLWPSLLWpsllw#Shift Packed Word Data Left LogicalpsllwPSLLW3 psllwPSLLW3 psllwPSLLW3+ psllwPSLLW3psllwPSLLW3psllwPSLLW3/3https://www.felixcloutier.com/x86/psllw:pslld:psllq VCVTPS2PD VCVTPS2PD vcvtps2pdNConvert Packed Single-Precision FP Values to Packed Double-Precision FP Values vcvtps2pd8H vcvtps2pd9K vcvtps2pd:H vcvtps2pdH vcvtps2pdK vcvtps2pdH vcvtps2pd8H vcvtps2pd4  vcvtps2pdH vcvtps2pd4+  vcvtps2pd9K vcvtps2pd4  vcvtps2pdK vcvtps2pd4/  vcvtps2pd:H vcvtps2pdH vcvtps2pdRH vcvtps2pdRH VCVTTPH2DQ VCVTTPH2DQ vcvttph2dqPConvert with Truncation Packed Half-Precision FP Values to Packed Dword Integers vcvttph2dq.K vcvttph2dq<K vcvttph2dq>R vcvttph2dqK vcvttph2dqK vcvttph2dqR vcvttph2dq.K vcvttph2dqK vcvttph2dq<K vcvttph2dqK vcvttph2dq>R vcvttph2dqR vcvttph2dqRR vcvttph2dqRR,https://www.felixcloutier.com/x86/vcvttph2dq VPCOMPRESSB VPCOMPRESSB vpcompressbBStore Sparse Packed Byte Integer Values into Dense Memory/Register  vpcompressb0K vpcompressbK vpcompressb3K vpcompressbK vpcompressb6U vpcompressbU vpcompressbK vpcompressbK vpcompressbU vpcompressb/K vpcompressb2K vpcompressb5U8https://www.felixcloutier.com/x86/vpcompressb:vcompresswVPMOVQBVPMOVQBvpmovqbBDown Convert Packed Quadword Values to Byte Values with Truncation vpmovqbHvpmovqb%HvpmovqbHvpmovqb(HvpmovqbHvpmovqb,HvpmovqbHvpmovqbHvpmovqbHvpmovqb$Hvpmovqb'Hvpmovqb+H<https://www.felixcloutier.com/x86/vpmovqb:vpmovsqb:vpmovusqbVPROLQVPROLQvprolqRotate Packed Quadword Left vprolq=Hvprolq?HvprolqAHvprolqHvprolqHvprolqHvprolq=HvprolqHvprolq?HvprolqHvprolqAHvprolqH?https://www.felixcloutier.com/x86/vprold:vprolvd:vprolq:vprolvqVPSUBBVPSUBBvpsubbSubtract Packed Byte IntegersvpsubbIvpsubb/IvpsubbIvpsubb2IvpsubbIvpsubb5Ivpsubb4 vpsubbIvpsubb4/ vpsubb/Ivpsubb4!vpsubbIvpsubb42!vpsubb2IvpsubbIvpsubb5I VSCALEFPD VSCALEFPD vscalefpd_Scale Packed Double-Precision Floating-Point Values With Double-Precision Floating-Point Values vscalefpd=H vscalefpdH vscalefpd?H vscalefpdH vscalefpdAH vscalefpdH vscalefpd=H vscalefpdH vscalefpd?H vscalefpdH vscalefpdAH vscalefpdH vscalefpdQH vscalefpdQH+https://www.felixcloutier.com/x86/vscalefpdPMAXSDPMAXSDpmaxsd,Maximum of Packed Signed Doubleword Integerspmaxsd3pmaxsd3/=https://www.felixcloutier.com/x86/pmaxsb:pmaxsw:pmaxsd:pmaxsq VPMOVUSQD VPMOVUSQD vpmovusqdQDown Convert Packed Quadword Values to Doubleword Values with Unsigned Saturation  vpmovusqdH vpmovusqd,H vpmovusqdH vpmovusqd0H vpmovusqdH vpmovusqd3H vpmovusqdH vpmovusqdH vpmovusqdH vpmovusqd+H vpmovusqd/H vpmovusqd2H<https://www.felixcloutier.com/x86/vpmovqd:vpmovsqd:vpmovusqd VINSERTF128 VINSERTF128 vinsertf128#Insert Packed Floating-Point Values vinsertf1284  vinsertf1284/ ahttps://www.felixcloutier.com/x86/vinsertf128:vinsertf32x4:vinsertf64x2:vinsertf32x8:vinsertf64x4VPROTWVPROTWvprotwPacked Rotate Wordsvprotw"vprotw"vprotw/"vprotw/"vprotw/"VXORPSVXORPSvxorps>Bitwise Logical XOR for Single-Precision Floating-Point Valuesvxorps9JvxorpsJvxorps:JvxorpsJvxorps;JvxorpsJvxorps9Jvxorps4 vxorpsJvxorps4/ vxorps:Jvxorps4 vxorpsJvxorps42 vxorps;JvxorpsJ VPMULHRSW VPMULHRSW vpmulhrswOPacked Multiply Signed Word Integers and Store High Result with Round and Scale vpmulhrswI vpmulhrsw/I vpmulhrswI vpmulhrsw2I vpmulhrswI vpmulhrsw5I vpmulhrsw4  vpmulhrswI vpmulhrsw4/  vpmulhrsw/I vpmulhrsw4! vpmulhrswI vpmulhrsw42! vpmulhrsw2I vpmulhrswI vpmulhrsw5IPI2FWPI2FWpi2fw0Packed Integer to Floating-Point Word Conversionpi2fwPI2FW3pi2fwPI2FW3+VUCOMISSVUCOMISSvucomissNUnordered Compare Scalar Single-Precision Floating-Point Values and Set EFLAGSvucomiss4 vucomissHvucomiss4' vucomiss'HvucomissRHEXTRQEXTRQextrq Extract Fieldextrq3extrqSHA1MSG2SHA1MSG2sha1msg2FPerform a Final Calculation for the Next Four SHA1 Message Doublewordssha1msg2(sha1msg2/(*https://www.felixcloutier.com/x86/sha1msg2 VCVTPH2PD VCVTPH2PD vcvtph2pdLConvert Packed Half-Precision FP Values to Packed Double-Precision FP Values vcvtph2pd*K vcvtph2pd.K vcvtph2pd<R vcvtph2pdK vcvtph2pdK vcvtph2pdR vcvtph2pd*K vcvtph2pdK vcvtph2pd.K vcvtph2pdK vcvtph2pd<R vcvtph2pdR vcvtph2pdRR vcvtph2pdRR+https://www.felixcloutier.com/x86/vcvtph2pdVPERMBVPERMBvpermbPermute Byte Integers vpermbTvpermb/TvpermbTvpermb2TvpermbTvpermb5TvpermbTvpermb/TvpermbTvpermb2TvpermbTvpermb5T(https://www.felixcloutier.com/x86/vpermbSHLXSHLXshlx*Logical Shift Left Without Affecting Flagsshlxl5shlxl'5shlxq5shlxq+50https://www.felixcloutier.com/x86/sarx:shlx:shrxVPHSUBDVPHSUBDvphsubd.Packed Horizontal Subtract Doubleword Integersvphsubd4 vphsubd4/ vphsubd4!vphsubd42!VHSUBPDVHSUBPDvhsubpd$Packed Double-FP Horizontal Subtractvhsubpd4 vhsubpd4/ vhsubpd4 vhsubpd42 UCOMISDUCOMISDucomisdNUnordered Compare Scalar Double-Precision Floating-Point Values and Set EFLAGSucomisdUCOMISD3ucomisdUCOMISD3+)https://www.felixcloutier.com/x86/ucomisdVPLZCNTDVPLZCNTDvplzcntdBCount the Number of Leading Zero Bits for Packed Doubleword Values vplzcntd9Nvplzcntd:Nvplzcntd;NvplzcntdNvplzcntdNvplzcntdNvplzcntd9NvplzcntdNvplzcntd:NvplzcntdNvplzcntd;NvplzcntdN3https://www.felixcloutier.com/x86/vplzcntd:vplzcntq EXTRACTPS EXTRACTPS extractps4Extract Packed Single Precision Floating-Point Value extractps3 extractps3'+https://www.felixcloutier.com/x86/extractpsANDPSANDPSandpsDBitwise Logical AND of Packed Single-Precision Floating-Point ValuesandpsANDPS3andpsANDPS3/'https://www.felixcloutier.com/x86/andps VPSCATTERQQ VPSCATTERQQ vpscatterqq;Scatter Packed Quadword Values with Signed Quadword Indices vpscatterqqEH vpscatterqqIH vpscatterqqMHQhttps://www.felixcloutier.com/x86/vpscatterdd:vpscatterdq:vpscatterqd:vpscatterqq VCVTNEOPH2PS VCVTNEOPH2PS vcvtneoph2ps9Convert Odd Elements of Packed FP16 Values to FP32 Values vcvtneoph2ps/Z vcvtneoph2ps2Z VFNMSUB213PS VFNMSUB213PS vfnmsub213psQFused Negative Multiply-Subtract of Packed Single-Precision Floating-Point Values vfnmsub213ps9H vfnmsub213psH vfnmsub213ps:H vfnmsub213psH vfnmsub213ps;H vfnmsub213psH vfnmsub213ps9H vfnmsub213ps4# vfnmsub213psH vfnmsub213ps4/# vfnmsub213ps:H vfnmsub213ps4# vfnmsub213psH vfnmsub213ps42# vfnmsub213ps;H vfnmsub213psH vfnmsub213psQH vfnmsub213psQHHhttps://www.felixcloutier.com/x86/vfnmsub132ps:vfnmsub213ps:vfnmsub231psKORTESTBKORTESTBkortestbOR 8-bit Masks and Set FlagskortestbJEhttps://www.felixcloutier.com/x86/kortestw:kortestb:kortestq:kortestdSHRDSHRDshrd$Integer Double Precision Shift Right shrdw3  shrdw3  shrdl3shrdl3shrdq3shrdq3shrdw3$ shrdw3$ shrdl3'shrdl3'shrdq3+shrdq3+&https://www.felixcloutier.com/x86/shrd CMPNBEXADD CMPNBEXADD cmpnbexadd'Compare for Not Below or Equals and Add cmpnbexadd' cmpnbexadd+PSHUFDPSHUFDpshufdShuffle Packed DoublewordspshufdPSHUFL3pshufdPSHUFL3/(https://www.felixcloutier.com/x86/pshufdSARSARsarArithmetic Shift RightsarbSARB3 sarbSARB3 sarbSARB3 sarwSARW3 sarwSARW3 sarwSARW3 sarlSARL3sarlSARL3sarlSARL3sarqSARQ3sarqSARQ3sarqSARQ3sarbSARB3#sarbSARB3#sarbSARB3#sarwSARW3$sarwSARW3$sarwSARW3$sarlSARL3'sarlSARL3'sarlSARL3'sarqSARQ3+sarqSARQ3+sarqSARQ3+1https://www.felixcloutier.com/x86/sal:sar:shl:shrPF2IDPF2IDpf2id5Packed Floating-Point to Integer Doubleword Conversonpf2id3pf2id3+MINPDMINPDminpd<Return Minimum Packed Double-Precision Floating-Point ValuesminpdMINPD3minpdMINPD3/'https://www.felixcloutier.com/x86/minpd VINSERTI32X4 VINSERTI32X4 vinserti32x43Insert 128 Bits of Packed Doubleword Integer Values vinserti32x4H vinserti32x4/H vinserti32x4H vinserti32x4/H vinserti32x4H vinserti32x4/H vinserti32x4H vinserti32x4/HVPDPWSUDVPDPWSUDvpdpwsudHPacked Dot Product of Signed-by-Unsigned Word subvectors into DoublewordvpdpwsudYvpdpwsud/YvpdpwsudYvpdpwsud2Y VPERMIL2PD VPERMIL2PD vpermil2pd:Permute Two-Source Double-Precision Floating-Point Vectors vpermil2pd" vpermil2pd/" vpermil2pd/" vpermil2pd" vpermil2pd2" vpermil2pd2"PMOVZXWDPMOVZXWDpmovzxwdDMove Packed Word Integers to Doubleword Integers with Zero Extensionpmovzxwd3pmovzxwd3+ VCVTSH2SD VCVTSH2SD vcvtsh2sdJConvert Scalar Half-Precision FP Value to Scalar Double-Precision FP Value vcvtsh2sdR vcvtsh2sd$R vcvtsh2sdR vcvtsh2sd$R vcvtsh2sdRR vcvtsh2sdRR+https://www.felixcloutier.com/x86/vcvtsh2sdVRCPPSVRCPPSvrcppsPCompute Approximate Reciprocals of Packed Single-Precision Floating-Point Valuesvrcpps4 vrcpps4/ vrcpps4 vrcpps42 UMONITORUMONITORumonitor(User mode Monitor a Linear Address RangeumonitorG*https://www.felixcloutier.com/x86/umonitor VFNMSUB213PH VFNMSUB213PH vfnmsub213phOFused Negative Multiply-Subtract of Packed Half-Precision Floating-Point Values vfnmsub213ph<K vfnmsub213phK vfnmsub213ph>K vfnmsub213phK vfnmsub213ph@R vfnmsub213phR vfnmsub213ph<K vfnmsub213phK vfnmsub213ph>K vfnmsub213phK vfnmsub213ph@R vfnmsub213phR vfnmsub213phQR vfnmsub213phQRlhttps://www.felixcloutier.com/x86/vfmsub132ph:vfnmsub132ph:vfmsub213ph:vfnmsub213ph:vfmsub231ph:vfnmsub231phKNOTBKNOTBknotbNOT 8-bit Mask RegisterknotbJ9https://www.felixcloutier.com/x86/knotw:knotb:knotq:knotdJAJAja#Jump if above (CF == 0 and ZF == 0)jaJHI3NjaJHI3OVPANDNVPANDNvpandnPacked Bitwise Logical AND NOTvpandn4 vpandn4/ vpandn4!vpandn42! VPTERNLOGD VPTERNLOGD vpternlogd6Bitwise Ternary Logical Operation on Doubleword Values  vpternlogd9H vpternlogdH vpternlogd:H vpternlogdH vpternlogd;H vpternlogdH vpternlogd9H vpternlogdH vpternlogd:H vpternlogdH vpternlogd;H vpternlogdH7https://www.felixcloutier.com/x86/vpternlogd:vpternlogqCMOVSCMOVScmovsMove if sign (SF == 1)cmovsw3  cmovsw3 $cmovsl3cmovsl3'cmovsq3cmovsq3+PFCMPGTPFCMPGTpfcmpgt.Packed Floating-Point Compare for Greater ThanpfcmpgtPFCMPGT3pfcmpgtPFCMPGT3+VAESIMCVAESIMCvaesimc+Perform the AES InvMixColumn Transformationvaesimc vaesimc/  VPCONFLICTQ VPCONFLICTQ vpconflictqUDetect Conflicts Within a Vector of Packed Quadword Values into Dense Memory/Register  vpconflictq=N vpconflictq?N vpconflictqAN vpconflictqN vpconflictqN vpconflictqN vpconflictq=N vpconflictqN vpconflictq?N vpconflictqN vpconflictqAN vpconflictqN9https://www.felixcloutier.com/x86/vpconflictd:vpconflictq VCVTNEPS2BF16 VCVTNEPS2BF16 vcvtneps2bf16YConvert with Nearest-Even rounding a Single-Precision FP vector into a BFloat16 FP vectorvcvtneps2bf16x9Kvcvtneps2bf16y:K vcvtneps2bf16;Qvcvtneps2bf16xKvcvtneps2bf16yK vcvtneps2bf16Qvcvtneps2bf16x9Kvcvtneps2bf16y:Kvcvtneps2bf16xKvcvtneps2bf16xZvcvtneps2bf16yKvcvtneps2bf16yZvcvtneps2bf16x/Zvcvtneps2bf16y2Z vcvtneps2bf16;Q vcvtneps2bf16Q/https://www.felixcloutier.com/x86/vcvtneps2bf16VPXORVPXORvpxor#Packed Bitwise Logical Exclusive ORvpxor4 vpxor4/ vpxor4!vpxor42!VPMOVM2DVPMOVM2Dvpmovm2d:Expand Bits of Mask Register to Packed Doubleword Integersvpmovm2dJvpmovm2dJvpmovm2dJEhttps://www.felixcloutier.com/x86/vpmovm2b:vpmovm2w:vpmovm2d:vpmovm2qDIVSSDIVSSdivss4Divide Scalar Single-Precision Floating-Point ValuesdivssDIVSS3divssDIVSS3''https://www.felixcloutier.com/x86/divssVBLENDPSVBLENDPSvblendps4 Blend Packed Single Precision Floating-Point Valuesvblendps4 vblendps4/ vblendps4 vblendps42  VPUNPCKLQDQ VPUNPCKLQDQ vpunpcklqdq?Unpack and Interleave Low-Order Quadwords into Double Quadwords vpunpcklqdq=H vpunpcklqdqH vpunpcklqdq?H vpunpcklqdqH vpunpcklqdqAH vpunpcklqdqH vpunpcklqdq=H vpunpcklqdq4  vpunpcklqdqH vpunpcklqdq4/  vpunpcklqdq?H vpunpcklqdq4! vpunpcklqdqH vpunpcklqdq42! vpunpcklqdqAH vpunpcklqdqHTDPBSSDTDPBSSDtdpbssdMTile Dot Product of Signed bytes by Signed bytes with Doubleword accumulationtdpbssdTTTAhttps://www.felixcloutier.com/x86/tdpbssd:tdpbsud:tdpbusd:tdpbuud VFNMADD231PD VFNMADD231PD vfnmadd231pdLFused Negative Multiply-Add of Packed Double-Precision Floating-Point Values vfnmadd231pd=H vfnmadd231pdH vfnmadd231pd?H vfnmadd231pdH vfnmadd231pdAH vfnmadd231pdH vfnmadd231pd=H vfnmadd231pd4# vfnmadd231pdH vfnmadd231pd4/# vfnmadd231pd?H vfnmadd231pd4# vfnmadd231pdH vfnmadd231pd42# vfnmadd231pdAH vfnmadd231pdH vfnmadd231pdQH vfnmadd231pdQHHhttps://www.felixcloutier.com/x86/vfnmadd132pd:vfnmadd213pd:vfnmadd231pdCLZEROCLZEROclzeroZero-out 64-bit Cache Lineclzero?PACKSSWBPACKSSWBpacksswb,Pack Words into Bytes with Signed SaturationpacksswbPACKSSWB3 packsswbPACKSSWB3+ packsswbPACKSSWB3packsswbPACKSSWB3/3https://www.felixcloutier.com/x86/packsswb:packssdwPSLLQPSLLQpsllq'Shift Packed Quadword Data Left LogicalpsllqPSLLQ3 psllqPSLLQ3 psllqPSLLQ3+ psllqPSLLQ3psllqPSLLQ3psllqPSLLQ3/3https://www.felixcloutier.com/x86/psllw:pslld:psllq VFNMSUB231PD VFNMSUB231PD vfnmsub231pdQFused Negative Multiply-Subtract of Packed Double-Precision Floating-Point Values vfnmsub231pd=H vfnmsub231pdH vfnmsub231pd?H vfnmsub231pdH vfnmsub231pdAH vfnmsub231pdH vfnmsub231pd=H vfnmsub231pd4# vfnmsub231pdH vfnmsub231pd4/# vfnmsub231pd?H vfnmsub231pd4# vfnmsub231pdH vfnmsub231pd42# vfnmsub231pdAH vfnmsub231pdH vfnmsub231pdQH vfnmsub231pdQHHhttps://www.felixcloutier.com/x86/vfnmsub132pd:vfnmsub213pd:vfnmsub231pdVLDMXCSRVLDMXCSRvldmxcsrLoad MXCSR Registervldmxcsr4'  VRSQRT14PD VRSQRT14PD vrsqrt14pd`Compute Approximate Reciprocals of Square Roots of Packed Double-Precision Floating-Point Values  vrsqrt14pd=H vrsqrt14pd?H vrsqrt14pdAH vrsqrt14pdH vrsqrt14pdH vrsqrt14pdH vrsqrt14pd=H vrsqrt14pdH vrsqrt14pd?H vrsqrt14pdH vrsqrt14pdAH vrsqrt14pdH,https://www.felixcloutier.com/x86/vrsqrt14pdVMOVAPDVMOVAPDvmovapd:Move Aligned Packed Double-Precision Floating-Point Valuesvmovapd0HvmovapdHvmovapd3HvmovapdHvmovapd6HvmovapdHvmovapd/Hvmovapd2Hvmovapd5Hvmovapd4 vmovapdHvmovapd4/ vmovapd/Hvmovapd4 vmovapdHvmovapd42 vmovapd2HvmovapdHvmovapd5Hvmovapd4/ vmovapd/Hvmovapd42 vmovapd2Hvmovapd5H VSM4RNDS4 VSM4RNDS4 vsm4rnds4&Performs Four Rounds of SM4 Encryption vsm4rnds4 vsm4rnds4/ vsm4rnds4 vsm4rnds42VROUNDSDVROUNDSDvroundsd3Round Scalar Double Precision Floating-Point Valuesvroundsd4 vroundsd4+ PMINUWPMINUWpminuw(Minimum of Packed Unsigned Word Integerspminuw3pminuw3//https://www.felixcloutier.com/x86/pminub:pminuwBTCBTCbtcBit Test and Complement btcwBTCW3 btcwBTCW  btclBTCL3btclBTCLbtcqBTCQ3btcqBTCQbtcwBTCW3$btcwBTCW$ btclBTCL3'btclBTCL'btcqBTCQ3+btcqBTCQ+%https://www.felixcloutier.com/x86/btcPMADDWDPMADDWDpmaddwd,Multiply and Add Packed Signed Word IntegerspmaddwdPMADDWL3 pmaddwdPMADDWL3+ pmaddwdPMADDWL3pmaddwdPMADDWL3/)https://www.felixcloutier.com/x86/pmaddwd PUNPCKHDQ PUNPCKHDQ punpckhdq;Unpack and Interleave High-Order Doublewords into Quadwords punpckhdq PUNPCKHLQ3  punpckhdq PUNPCKHLQ3+  punpckhdq PUNPCKHLQ3 punpckhdq PUNPCKHLQ3/Jhttps://www.felixcloutier.com/x86/punpckhbw:punpckhwd:punpckhdq:punpckhqdq CVTTPS2PI CVTTPS2PI cvttps2piRConvert with Truncation Packed Single-Precision FP Values to Packed Dword Integers cvttps2pi CVTTPS2PL3 cvttps2pi CVTTPS2PL3++https://www.felixcloutier.com/x86/cvttps2piMINSSMINSSminss;Return Minimum Scalar Single-Precision Floating-Point ValueminssMINSS3minssMINSS3''https://www.felixcloutier.com/x86/minssCMPZXADDCMPZXADDcmpzxaddCompare for Zero and Addcmpzxadd'cmpzxadd+ VEXTRACTI32X8 VEXTRACTI32X8 vextracti32x84Extract 256 Bits of Packed Doubleword Integer Values vextracti32x8J vextracti32x83J vextracti32x8J vextracti32x82JVPBROADCASTMB2QVPBROADCASTMB2Qvpbroadcastmb2q=Broadcast Low Byte of Mask Register to Packed Quadword Valuesvpbroadcastmb2qNvpbroadcastmb2qNvpbroadcastmb2qNPABSWPABSWpabsw&Packed Absolute Value of Word Integerspabsw3pabsw3+pabsw3pabsw3/9https://www.felixcloutier.com/x86/pabsb:pabsw:pabsd:pabsq VPBROADCASTB VPBROADCASTB vpbroadcastbBroadcast Byte Integer vpbroadcastbI vpbroadcastbI vpbroadcastbI vpbroadcastbI vpbroadcastbI vpbroadcastbI vpbroadcastb#I vpbroadcastb#I vpbroadcastb#I vpbroadcastbI vpbroadcastb4! vpbroadcastbI vpbroadcastb4#! vpbroadcastb#I vpbroadcastbI vpbroadcastb4! vpbroadcastbI vpbroadcastb4#! vpbroadcastb#I vpbroadcastbI vpbroadcastbI vpbroadcastb#IUhttps://www.felixcloutier.com/x86/vpbroadcastb:vpbroadcastw:vpbroadcastd:vpbroadcastqVPERMT2WVPERMT2Wvpermt2w9Full Permute of Words From Two Tables Overwriting a Table vpermt2wIvpermt2w/Ivpermt2wIvpermt2w2Ivpermt2wIvpermt2w5Ivpermt2wIvpermt2w/Ivpermt2wIvpermt2w2Ivpermt2wIvpermt2w5IPhttps://www.felixcloutier.com/x86/vpermt2w:vpermt2d:vpermt2q:vpermt2ps:vpermt2pdCVTSS2SDCVTSS2SDcvtss2sdLConvert Scalar Single-Precision FP Value to Scalar Double-Precision FP Valuecvtss2sdCVTSS2SD3cvtss2sdCVTSS2SD3'*https://www.felixcloutier.com/x86/cvtss2sdRDTSCPRDTSCPrdtscp(Read Time-Stamp Counter and Processor IDrdtscp(https://www.felixcloutier.com/x86/rdtscp VAESENCLAST VAESENCLAST vaesenclast,Perform Last Round of an AES Encryption Flow  vaesenclast  vaesenclastK vaesenclast/  vaesenclast/K vaesenclast vaesenclastK vaesenclast2 vaesenclast2K vaesenclastH vaesenclast5H VCVTTPS2UQQ VCVTTPS2UQQ vcvttps2uqqpConvert with Truncation Packed Single Precision Floating-Point Values to Packed Unsigned Quadword Integer Values vcvttps2uqq8J vcvttps2uqq9J vcvttps2uqq:J vcvttps2uqqJ vcvttps2uqqJ vcvttps2uqqJ vcvttps2uqq8J vcvttps2uqqJ vcvttps2uqq9J vcvttps2uqqJ vcvttps2uqq:J vcvttps2uqqJ vcvttps2uqqRJ vcvttps2uqqRJ-https://www.felixcloutier.com/x86/vcvttps2uqq VPHADDUDQ VPHADDUDQ vphaddudq5Packed Horizontal Add Unsigned Doubleword to Quadword vphaddudq" vphaddudq/"VMAXPDVMAXPDvmaxpd<Return Maximum Packed Double-Precision Floating-Point Valuesvmaxpd=HvmaxpdHvmaxpd?HvmaxpdHvmaxpdAHvmaxpdHvmaxpd=Hvmaxpd4 vmaxpdHvmaxpd4/ vmaxpd?Hvmaxpd4 vmaxpdHvmaxpd42 vmaxpdAHvmaxpdHvmaxpdRHvmaxpdRH VPMOVUSDB VPMOVUSDB vpmovusdbMDown Convert Packed Doubleword Values to Byte Values with Unsigned Saturation  vpmovusdbH vpmovusdb(H vpmovusdbH vpmovusdb,H vpmovusdbH vpmovusdb0H vpmovusdbH vpmovusdbH vpmovusdbH vpmovusdb'H vpmovusdb+H vpmovusdb/H<https://www.felixcloutier.com/x86/vpmovdb:vpmovsdb:vpmovusdbCVTSI2SSCVTSI2SScvtsi2ss9Convert Dword Integer to Scalar Single-Precision FP Value cvtsi2sslCVTSL2SS3 cvtsi2ssqCVTSQ2SS3 cvtsi2sslCVTSL2SS3' cvtsi2ssqCVTSQ2SS3+*https://www.felixcloutier.com/x86/cvtsi2ss VPUNPCKLWD VPUNPCKLWD vpunpcklwd6Unpack and Interleave Low-Order Words into Doublewords vpunpcklwdI vpunpcklwd/I vpunpcklwdI vpunpcklwd2I vpunpcklwdI vpunpcklwd5I vpunpcklwd4  vpunpcklwdI vpunpcklwd4/  vpunpcklwd/I vpunpcklwd4! vpunpcklwdI vpunpcklwd42! vpunpcklwd2I vpunpcklwdI vpunpcklwd5IFEMMSFEMMSfemmsFast Exit Multimedia Statefemms3VMAXSDVMAXSDvmaxsd;Return Maximum Scalar Double-Precision Floating-Point ValuevmaxsdHvmaxsd+Hvmaxsd4 vmaxsdHvmaxsd4+ vmaxsd+HvmaxsdRHvmaxsdRH VCVTTPH2UQQ VCVTTPH2UQQ vcvttph2uqqnConvert with Truncation Packed Half Precision Floating-Point Values to Packed Unsigned Quadword Integer Values vcvttph2uqq*K vcvttph2uqq.K vcvttph2uqq<R vcvttph2uqqK vcvttph2uqqK vcvttph2uqqR vcvttph2uqq*K vcvttph2uqqK vcvttph2uqq.K vcvttph2uqqK vcvttph2uqq<R vcvttph2uqqR vcvttph2uqqRR vcvttph2uqqRR-https://www.felixcloutier.com/x86/vcvttph2uqqMWAITXMWAITXmwaitxMonitor Wait with TimeoutmwaitxEKMOVBKMOVBkmovbMove 8-bit MaskkmovbJkmovbJkmovb#JkmovbJkmovb#J9https://www.felixcloutier.com/x86/kmovw:kmovb:kmovq:kmovdDIVPDDIVPDdivpd4Divide Packed Double-Precision Floating-Point ValuesdivpdDIVPD3divpdDIVPD3/'https://www.felixcloutier.com/x86/divpdSETAESETAEsetae$Set byte if above or equal (CF == 0)setaeSETCC3 setaeSETCC3#SETSSETSsetsSet byte if sign (SF == 1)setsSETMI3 setsSETMI3#VBROADCASTI32X4VBROADCASTI32X4vbroadcasti32x4"Broadcast Four Doubleword Elementsvbroadcasti32x4/Hvbroadcasti32x4/Hvbroadcasti32x4/Hvbroadcasti32x4/H VCVTTPS2DQ VCVTTPS2DQ vcvttps2dqRConvert with Truncation Packed Single-Precision FP Values to Packed Dword Integers vcvttps2dq9H vcvttps2dq:H vcvttps2dq;H vcvttps2dqH vcvttps2dqH vcvttps2dqH vcvttps2dq9H vcvttps2dq4  vcvttps2dqH vcvttps2dq4/  vcvttps2dq:H vcvttps2dq4  vcvttps2dqH vcvttps2dq42  vcvttps2dq;H vcvttps2dqH vcvttps2dqRH vcvttps2dqRH VDPBF16PS VDPBF16PS vdpbf16psLPacked Dot Product of BFloat16 FP subvectors into Single-Precision FP values  vdpbf16ps9K vdpbf16psK vdpbf16ps:K vdpbf16psK vdpbf16ps;Q vdpbf16psQ vdpbf16ps9K vdpbf16psK vdpbf16ps:K vdpbf16psK vdpbf16ps;Q vdpbf16psQ+https://www.felixcloutier.com/x86/vdpbf16ps VFNMSUBSD VFNMSUBSD vfnmsubsdQFused Negative Multiply-Subtract of Scalar Double-Precision Floating-Point Values vfnmsubsd$ vfnmsubsd+$ vfnmsubsd+$ VGETMANTSH VGETMANTSH vgetmantshKExtract Normalized Mantissa from Scalar Half-Precision Floating-Point Value vgetmantshR vgetmantsh$R vgetmantshR vgetmantsh$R vgetmantshRR vgetmantshRR,https://www.felixcloutier.com/x86/vgetmantsh VMASKMOVPD VMASKMOVPD vmaskmovpd>Conditional Move Packed Double-Precision Floating-Point Values vmaskmovpd4/  vmaskmovpd42  vmaskmovpd4/  vmaskmovpd42 PMULHUWPMULHUWpmulhuw<Multiply Packed Unsigned Word Integers and Store High ResultpmulhuwPMULHUW3 pmulhuwPMULHUW3+ pmulhuwPMULHUW3pmulhuwPMULHUW3/)https://www.felixcloutier.com/x86/pmulhuwROUNDSSROUNDSSroundss3Round Scalar Single Precision Floating-Point Valuesroundss3roundss3')https://www.felixcloutier.com/x86/roundss TCMMRLFP16PS TCMMRLFP16PS tcmmrlfp16ps^Tile Complex Matrix Multiply ReaL part of FP16 tiles with Packed Single-precision accumulation tcmmrlfp16psTTT VPMADDUBSW VPMADDUBSW vpmaddubsw9Multiply and Add Packed Signed and Unsigned Byte Integers vpmaddubswI vpmaddubsw/I vpmaddubswI vpmaddubsw2I vpmaddubswI vpmaddubsw5I vpmaddubsw4  vpmaddubswI vpmaddubsw4/  vpmaddubsw/I vpmaddubsw4! vpmaddubswI vpmaddubsw42! vpmaddubsw2I vpmaddubswI vpmaddubsw5IVPSHLDQVPSHLDQvpshldq7Concatenate and Shift Packed Quadword Data Left Logical vpshldq=KvpshldqKvpshldq?KvpshldqKvpshldqAUvpshldqUvpshldq=KvpshldqKvpshldq?KvpshldqKvpshldqAUvpshldqU VPSCATTERDQ VPSCATTERDQ vpscatterdq=Scatter Packed Quadword Values with Signed Doubleword Indices vpscatterdqCH vpscatterdqCH vpscatterdqGHQhttps://www.felixcloutier.com/x86/vpscatterdd:vpscatterdq:vpscatterqd:vpscatterqqVPROLDVPROLDvproldRotate Packed Doubleword Left vprold9Hvprold:Hvprold;HvproldHvproldHvproldHvprold9HvproldHvprold:HvproldHvprold;HvproldH?https://www.felixcloutier.com/x86/vprold:vprolvd:vprolq:vprolvq VCVTPH2UDQ VCVTPH2UDQ vcvtph2udq`Convert Packed Half-Precision Floating-Point Values to Packed Unsigned Doubleword Integer Values vcvtph2udq.K vcvtph2udq<K vcvtph2udq>R vcvtph2udqK vcvtph2udqK vcvtph2udqR vcvtph2udq.K vcvtph2udqK vcvtph2udq<K vcvtph2udqK vcvtph2udq>R vcvtph2udqR vcvtph2udqQR vcvtph2udqQR,https://www.felixcloutier.com/x86/vcvtph2udqVPSRLVWVPSRLVWvpsrlvw-Variable Shift Packed Word Data Right Logical vpsrlvwIvpsrlvw/IvpsrlvwIvpsrlvw2IvpsrlvwIvpsrlvw5IvpsrlvwIvpsrlvw/IvpsrlvwIvpsrlvw2IvpsrlvwIvpsrlvw5I9https://www.felixcloutier.com/x86/vpsrlvw:vpsrlvd:vpsrlvqVRCP14PDVRCP14PDvrcp14pdPCompute Approximate Reciprocals of Packed Double-Precision Floating-Point Values vrcp14pd=Hvrcp14pd?Hvrcp14pdAHvrcp14pdHvrcp14pdHvrcp14pdHvrcp14pd=Hvrcp14pdHvrcp14pd?Hvrcp14pdHvrcp14pdAHvrcp14pdH*https://www.felixcloutier.com/x86/vrcp14pd VPUNPCKHWD VPUNPCKHWD vpunpckhwd7Unpack and Interleave High-Order Words into Doublewords vpunpckhwdI vpunpckhwd/I vpunpckhwdI vpunpckhwd2I vpunpckhwdI vpunpckhwd5I vpunpckhwd4  vpunpckhwdI vpunpckhwd4/  vpunpckhwd/I vpunpckhwd4! vpunpckhwdI vpunpckhwd42! vpunpckhwd2I vpunpckhwdI vpunpckhwd5IPAVGUSBPAVGUSBpavgusbAverage Packed Byte Integerspavgusb3pavgusb3+CVTSD2SICVTSD2SIcvtsd2si3Convert Scalar Double-Precision FP Value to Integercvtsd2siCVTSD2SL3cvtsd2siCVTSD2SL3+cvtsd2siCVTSD2SQ3cvtsd2siCVTSD2SQ3+*https://www.felixcloutier.com/x86/cvtsd2siVPMACSWDVPMACSWDvpmacswd;Packed Multiply Accumulate Signed Word to Signed Doublewordvpmacswd"vpmacswd/"PACKUSDWPACKUSDWpackusdw4Pack Doublewords into Words with Unsigned Saturationpackusdw3packusdw3/*https://www.felixcloutier.com/x86/packusdwVPROLVQVPROLVQvprolvq$Variable Rotate Packed Quadword Left vprolvq=HvprolvqHvprolvq?HvprolvqHvprolvqAHvprolvqHvprolvq=HvprolvqHvprolvq?HvprolvqHvprolvqAHvprolvqH?https://www.felixcloutier.com/x86/vprold:vprolvd:vprolq:vprolvqVMINSSVMINSSvminss;Return Minimum Scalar Single-Precision Floating-Point ValuevminssHvminss'Hvminss4 vminssHvminss4' vminss'HvminssRHvminssRHVPDPWUUDVPDPWUUDvpdpwuudJPacked Dot Product of Unsigned-by-Unsigned Word subvectors into DoublewordvpdpwuudYvpdpwuud/YvpdpwuudYvpdpwuud2YVSCATTERPF0DPSVSCATTERPF0DPSvscatterpf0dps„Sparse Prefetch Packed Single-Precision Floating-Point Data Values with Signed Doubleword Indices Using T0 Hint with Intent to Writevscatterpf0dpsKL]https://www.felixcloutier.com/x86/vscatterpf0dps:vscatterpf0qps:vscatterpf0dpd:vscatterpf0qpd PUNPCKLWD PUNPCKLWD punpcklwd6Unpack and Interleave Low-Order Words into Doublewords punpcklwd PUNPCKLWL3  punpcklwd PUNPCKLWL3'  punpcklwd PUNPCKLWL3 punpcklwd PUNPCKLWL3/Jhttps://www.felixcloutier.com/x86/punpcklbw:punpcklwd:punpckldq:punpcklqdqSUBSUBsubSubtractsubbSUBB3subbSUBB3 subbSUBB3  subbSUBB3 #subwSUBW3 subwSUBW3 subwSUBW3 subwSUBW3  subwSUBW3 $sublSUBL3sublSUBL3sublSUBL3sublSUBL3sublSUBL3'subqSUBQ3subqSUBQ3subqSUBQ3subqSUBQ3subqSUBQ3+subbSUBB3#subbSUBB3# subwSUBW3$subwSUBW3$subwSUBW3$ sublSUBL3'sublSUBL3'sublSUBL3'subqSUBQ3+subqSUBQ3+subqSUBQ3+%https://www.felixcloutier.com/x86/sub PREFETCHT1 PREFETCHT1 prefetcht1'Prefetch Data Into Caches using T1 Hint prefetcht1 PREFETCHT13# LDMXCSRLDMXCSRldmxcsrLoad MXCSR RegisterldmxcsrLDMXCSR3')https://www.felixcloutier.com/x86/ldmxcsr VPMOVZXWD VPMOVZXWD vpmovzxwdDMove Packed Word Integers to Doubleword Integers with Zero Extension vpmovzxwdH vpmovzxwdH vpmovzxwdH vpmovzxwd+H vpmovzxwd/H vpmovzxwd2H vpmovzxwd4  vpmovzxwdH vpmovzxwd4+  vpmovzxwd+H vpmovzxwd4! vpmovzxwdH vpmovzxwd4/! vpmovzxwd/H vpmovzxwdH vpmovzxwd2H VFMSUB132PS VFMSUB132PS vfmsub132psHFused Multiply-Subtract of Packed Single-Precision Floating-Point Values vfmsub132ps9H vfmsub132psH vfmsub132ps:H vfmsub132psH vfmsub132ps;H vfmsub132psH vfmsub132ps9H vfmsub132ps4# vfmsub132psH vfmsub132ps4/# vfmsub132ps:H vfmsub132ps4# vfmsub132psH vfmsub132ps42# vfmsub132ps;H vfmsub132psH vfmsub132psQH vfmsub132psQHEhttps://www.felixcloutier.com/x86/vfmsub132ps:vfmsub213ps:vfmsub231ps VMOVDQA64 VMOVDQA64 vmovdqa64Move Aligned Quadword Values vmovdqa640H vmovdqa64H vmovdqa643H vmovdqa64H vmovdqa646H vmovdqa64H vmovdqa64/H vmovdqa642H vmovdqa645H vmovdqa64H vmovdqa64/H vmovdqa64H vmovdqa642H vmovdqa64H vmovdqa645H vmovdqa64/H vmovdqa642H vmovdqa645H<https://www.felixcloutier.com/x86/movdqa:vmovdqa32:vmovdqa64 VCOMPRESSPD VCOMPRESSPD vcompresspdUStore Sparse Packed Double-Precision Floating-Point Values into Dense Memory/Register  vcompresspdH vcompresspd0H vcompresspdH vcompresspd3H vcompresspdH vcompresspd6H vcompresspdH vcompresspdH vcompresspdH vcompresspd/H vcompresspd2H vcompresspd5H-https://www.felixcloutier.com/x86/vcompresspdKNOTWKNOTWknotwNOT 16-bit Mask RegisterknotwH9https://www.felixcloutier.com/x86/knotw:knotb:knotq:knotdCMOVNACMOVNAcmovna&Move if not above (CF == 1 or ZF == 1)cmovnaw3  cmovnaw3 $cmovnal3cmovnal3'cmovnaq3cmovnaq3+CMPSXADDCMPSXADDcmpsxaddCompare for Sign and Addcmpsxadd'cmpsxadd+CVTDQ2PDCVTDQ2PDcvtdq2pdBConvert Packed Dword Integers to Packed Double-Precision FP Valuescvtdq2pd3cvtdq2pd3+*https://www.felixcloutier.com/x86/cvtdq2pdMOVHPSMOVHPSmovhps7Move High Packed Single-Precision Floating-Point ValuesmovhpsMOVHPS3+movhpsMOVHPS3+(https://www.felixcloutier.com/x86/movhpsHADDPSHADDPShaddpsPacked Single-FP Horizontal Addhaddps3haddps3/(https://www.felixcloutier.com/x86/haddpsMULPSMULPSmulps6Multiply Packed Single-Precision Floating-Point ValuesmulpsMULPS3mulpsMULPS3/'https://www.felixcloutier.com/x86/mulpsSETNESETNEsetneSet byte if not equal (ZF == 0)setneSETNE3 setneSETNE3#VBROADCASTF32X2VBROADCASTF32X2vbroadcastf32x26Broadcast Two Single-Precision Floating-Point Elementsvbroadcastf32x2Jvbroadcastf32x2Jvbroadcastf32x2+Jvbroadcastf32x2+Jvbroadcastf32x2Jvbroadcastf32x2+Jvbroadcastf32x2Jvbroadcastf32x2+JPI2FDPI2FDpi2fd6Packed Integer to Floating-Point Doubleword Conversionpi2fd3pi2fd3+VFMSUBADD231PDVFMSUBADD231PDvfmsubadd231pdXFused Multiply-Alternating Subtract/Add of Packed Double-Precision Floating-Point Valuesvfmsubadd231pd=Hvfmsubadd231pdHvfmsubadd231pd?Hvfmsubadd231pdHvfmsubadd231pdAHvfmsubadd231pdHvfmsubadd231pd=Hvfmsubadd231pd4#vfmsubadd231pdHvfmsubadd231pd4/#vfmsubadd231pd?Hvfmsubadd231pd4#vfmsubadd231pdHvfmsubadd231pd42#vfmsubadd231pdAHvfmsubadd231pdHvfmsubadd231pdQHvfmsubadd231pdQHNhttps://www.felixcloutier.com/x86/vfmsubadd132pd:vfmsubadd213pd:vfmsubadd231pdMONITORXMONITORXmonitorx+Monitor a Linear Address Range with TimeoutmonitorxE VBLENDVPD VBLENDVPD vblendvpd= Variable Blend Packed Double Precision Floating-Point Values vblendvpd4  vblendvpd4/  vblendvpd4  vblendvpd42  VFPCLASSSD VFPCLASSSD vfpclasssd:Test Class of Scalar Double-Precision Floating-Point Value vfpclasssdJ vfpclasssdJ vfpclasssd+J vfpclasssd+J,https://www.felixcloutier.com/x86/vfpclasssdVMULSDVMULSDvmulsd6Multiply Scalar Double-Precision Floating-Point ValuesvmulsdHvmulsd+Hvmulsd4 vmulsdHvmulsd4+ vmulsd+HvmulsdQHvmulsdQHVPBLENDWVPBLENDWvpblendwBlend Packed Wordsvpblendw4 vpblendw4/ vpblendw4!vpblendw42!POPPOPpopPop a Value from the StackpopwPOPW3 popqPOPQ3popwPOPW3$popqPOPQ3+%https://www.felixcloutier.com/x86/popCMOVZCMOVZcmovzMove if zero (ZF == 1)cmovzw3  cmovzw3 $cmovzl3cmovzl3'cmovzq3cmovzq3+ VPMACSSWW VPMACSSWW vpmacsswwEPacked Multiply Accumulate with Saturation Signed Word to Signed Word vpmacssww" vpmacssww/" VPSHUFBITQMB VPSHUFBITQMB vpshufbitqmb@Shuffle Bits From Quadword Elements Using Byte Indexes Into Mask  vpshufbitqmbK vpshufbitqmbK vpshufbitqmb/K vpshufbitqmb/K vpshufbitqmbK vpshufbitqmbK vpshufbitqmb2K vpshufbitqmb2K vpshufbitqmbS vpshufbitqmbS vpshufbitqmb5S vpshufbitqmb5S.https://www.felixcloutier.com/x86/vpshufbitqmbXGETBVXGETBVxgetbv&Get Value of Extended Control Registerxgetbv(https://www.felixcloutier.com/x86/xgetbvVPCMPDVPCMPDvpcmpd'Compare Packed Signed Doubleword Values vpcmpd9Hvpcmpd9HvpcmpdHvpcmpdHvpcmpd:Hvpcmpd:HvpcmpdHvpcmpdHvpcmpd;Hvpcmpd;HvpcmpdHvpcmpdH0https://www.felixcloutier.com/x86/vpcmpd:vpcmpudADOXADOXadox<Unsigned Integer Addition of Two Operands with Overflow Flagadoxl7adoxl'7adoxq7adoxq+7&https://www.felixcloutier.com/x86/adox VCVTSD2SH VCVTSD2SH vcvtsd2shJConvert Scalar Double-Precision FP Value to Scalar Half-Precision FP Value vcvtsd2shR vcvtsd2sh+R vcvtsd2shR vcvtsd2sh+R vcvtsd2shQR vcvtsd2shQR+https://www.felixcloutier.com/x86/vcvtsd2sh VFMADD231SS VFMADD231SS vfmadd231ssCFused Multiply-Add of Scalar Single-Precision Floating-Point Values vfmadd231ssH vfmadd231ss'H vfmadd231ss4# vfmadd231ssH vfmadd231ss4'# vfmadd231ss'H vfmadd231ssQH vfmadd231ssQHEhttps://www.felixcloutier.com/x86/vfmadd132ss:vfmadd213ss:vfmadd231ssRSQRTSSRSQRTSSrsqrtssQCompute Reciprocal of Square Root of Scalar Single-Precision Floating-Point ValuersqrtssRSQRTSS3rsqrtssRSQRTSS3')https://www.felixcloutier.com/x86/rsqrtss VFNMADD213PD VFNMADD213PD vfnmadd213pdLFused Negative Multiply-Add of Packed Double-Precision Floating-Point Values vfnmadd213pd=H vfnmadd213pdH vfnmadd213pd?H vfnmadd213pdH vfnmadd213pdAH vfnmadd213pdH vfnmadd213pd=H vfnmadd213pd4# vfnmadd213pdH vfnmadd213pd4/# vfnmadd213pd?H vfnmadd213pd4# vfnmadd213pdH vfnmadd213pd42# vfnmadd213pdAH vfnmadd213pdH vfnmadd213pdQH vfnmadd213pdQHHhttps://www.felixcloutier.com/x86/vfnmadd132pd:vfnmadd213pd:vfnmadd231pd VFPCLASSSH VFPCLASSSH vfpclasssh8Test Class of Scalar Half-Precision Floating-Point Value vfpclassshR vfpclassshR vfpclasssh$R vfpclasssh$R,https://www.felixcloutier.com/x86/vfpclassshADDSUBPSADDSUBPSaddsubpsPacked Single-FP Add/Subtractaddsubps3addsubps3/*https://www.felixcloutier.com/x86/addsubps VFPCLASSPH VFPCLASSPH vfpclassph9Test Class of Packed Half-Precision Floating-Point Values  vfpclassphx<K vfpclassphx<K vfpclassphy>K vfpclassphy>K vfpclassphz@R vfpclassphz@R vfpclassphxK vfpclassphxK vfpclassphyK vfpclassphyK vfpclassphzR vfpclassphzR,https://www.felixcloutier.com/x86/vfpclassph VGETEXPSD VGETEXPSD vgetexpsdiExtract Exponent of Scalar Double-Precision Floating-Point Value as Double-Precision Floating-Point Value vgetexpsdH vgetexpsd+H vgetexpsdH vgetexpsd+H vgetexpsdRH vgetexpsdRH+https://www.felixcloutier.com/x86/vgetexpsdAANDAANDaandAtomically ANDaand'aand+VPCMPEQDVPCMPEQDvpcmpeqd+Compare Packed Doubleword Data for Equalityvpcmpeqd9Hvpcmpeqd9HvpcmpeqdHvpcmpeqdHvpcmpeqd:Hvpcmpeqd:HvpcmpeqdHvpcmpeqdHvpcmpeqd;Hvpcmpeqd;HvpcmpeqdHvpcmpeqdHvpcmpeqd4 vpcmpeqd4/ vpcmpeqd4!vpcmpeqd42! PCLMULQDQ PCLMULQDQ pclmulqdq"Carry-Less Quadword Multiplication pclmulqdq PCLMULQDQ& pclmulqdq PCLMULQDQ/&+https://www.felixcloutier.com/x86/pclmulqdq VGATHERPF1DPS VGATHERPF1DPS vgatherpf1dpsoSparse Prefetch Packed Single-Precision Floating-Point Data Values with Signed Doubleword Indices Using T1 Hint vgatherpf1dpsKLYhttps://www.felixcloutier.com/x86/vgatherpf1dps:vgatherpf1qps:vgatherpf1dpd:vgatherpf1qpdRDRANDRDRANDrdrandRead Random Numberrdrand *rdrand*rdrand*(https://www.felixcloutier.com/x86/rdrandCMOVGCMOVGcmovg&Move if greater (ZF == 0 and SF == OF)cmovgw3  cmovgw3 $cmovgl3cmovgl3'cmovgq3cmovgq3+ENDBR64ENDBR64endbr64%END (terminate) BRanch in 64-bit modeendbr64 )https://www.felixcloutier.com/x86/endbr64VRANGEPSVRANGEPSvrangepsXRange Restriction Calculation For Packed Pairs of Single-Precision Floating-Point Valuesvrangeps9JvrangepsJvrangeps:JvrangepsJvrangeps;JvrangepsJvrangeps9JvrangepsJvrangeps:JvrangepsJvrangeps;JvrangepsJvrangepsRJvrangepsRJ*https://www.felixcloutier.com/x86/vrangeps VGETMANTSS VGETMANTSS vgetmantssMExtract Normalized Mantissa from Scalar Single-Precision Floating-Point Value vgetmantssH vgetmantss'H vgetmantssH vgetmantss'H vgetmantssRH vgetmantssRH,https://www.felixcloutier.com/x86/vgetmantssCPUIDCPUIDcpuidCPU IdentificationcpuidCPUID3'https://www.felixcloutier.com/x86/cpuidVBCSTNEBF162PSVBCSTNEBF162PSvbcstnebf162ps;Load BF16 Element and Convert to FP32 Element With Broadcasvbcstnebf162ps$Zvbcstnebf162ps$Z VGATHERPF1QPS VGATHERPF1QPS vgatherpf1qpsmSparse Prefetch Packed Single-Precision Floating-Point Data Values with Signed Quadword Indices Using T1 Hint vgatherpf1qpsMLYhttps://www.felixcloutier.com/x86/vgatherpf1dps:vgatherpf1qps:vgatherpf1dpd:vgatherpf1qpdVADDSSVADDSSvaddss1Add Scalar Single-Precision Floating-Point ValuesvaddssHvaddss'Hvaddss4 vaddssHvaddss4' vaddss'HvaddssQHvaddssQHXLATBXLATBxlatbTable Look-up TranslationxlatXLATxlatXLAT,https://www.felixcloutier.com/x86/xlat:xlatb VCVTPS2DQ VCVTPS2DQ vcvtps2dqBConvert Packed Single-Precision FP Values to Packed Dword Integers vcvtps2dq9H vcvtps2dq:H vcvtps2dq;H vcvtps2dqH vcvtps2dqH vcvtps2dqH vcvtps2dq9H vcvtps2dq4  vcvtps2dqH vcvtps2dq4/  vcvtps2dq:H vcvtps2dq4  vcvtps2dqH vcvtps2dq42  vcvtps2dq;H vcvtps2dqH vcvtps2dqQH vcvtps2dqQH VGATHERQPS VGATHERQPS vgatherqpsRGather Packed Single-Precision Floating-Point Values Using Signed Quadword Indices vgatherqpsDH vgatherqpsHH vgatherqpsLH vgatherqpsD! vgatherqpsH!7https://www.felixcloutier.com/x86/vgatherqps:vgatherqpdVMOVSDVMOVSDvmovsd1Move Scalar Double-Precision Floating-Point Value vmovsd,Hvmovsd+Hvmovsd4+ vmovsd+Hvmovsd4+ vmovsd+HvmovsdHvmovsd vmovsdHVPHSUBWVPHSUBWvphsubw(Packed Horizontal Subtract Word Integersvphsubw4 vphsubw4/ vphsubw4!vphsubw42! VPMOVSXBD VPMOVSXBD vpmovsxbdDMove Packed Byte Integers to Doubleword Integers with Sign Extension vpmovsxbdH vpmovsxbdH vpmovsxbdH vpmovsxbd'H vpmovsxbd+H vpmovsxbd/H vpmovsxbd4  vpmovsxbdH vpmovsxbd4'  vpmovsxbd'H vpmovsxbd4! vpmovsxbdH vpmovsxbd4+! vpmovsxbd+H vpmovsxbdH vpmovsxbd/H VFNMADDPS VFNMADDPS vfnmaddpsLFused Negative Multiply-Add of Packed Single-Precision Floating-Point Values vfnmaddps$ vfnmaddps/$ vfnmaddps/$ vfnmaddps$ vfnmaddps2$ vfnmaddps2$VPDPBUSDVPDPBUSDvpdpbusdHPacked Dot Product of Unsigned-by-Singed Byte subvectors into Doublewordvpdpbusd9KvpdpbusdKvpdpbusd:KvpdpbusdKvpdpbusd;VvpdpbusdVvpdpbusd9KvpdpbusdWvpdpbusdKvpdpbusd/Wvpdpbusd:KvpdpbusdWvpdpbusdKvpdpbusd2Wvpdpbusd;VvpdpbusdV*https://www.felixcloutier.com/x86/vpdpbusdVPMAXSBVPMAXSBvpmaxsb&Maximum of Packed Signed Byte IntegersvpmaxsbIvpmaxsb/IvpmaxsbIvpmaxsb2IvpmaxsbIvpmaxsb5Ivpmaxsb4 vpmaxsbIvpmaxsb4/ vpmaxsb/Ivpmaxsb4!vpmaxsbIvpmaxsb42!vpmaxsb2IvpmaxsbIvpmaxsb5IRORXRORXrorx,Rotate Right Logical Without Affecting Flagsrorxl5rorxl'5rorxq5rorxq+5&https://www.felixcloutier.com/x86/rorxVORPDVORPDvorpd<Bitwise Logical OR of Double-Precision Floating-Point Valuesvorpd=JvorpdJvorpd?JvorpdJvorpdAJvorpdJvorpd=Jvorpd4 vorpdJvorpd4/ vorpd?Jvorpd4 vorpdJvorpd42 vorpdAJvorpdJVPCMPUDVPCMPUDvpcmpud)Compare Packed Unsigned Doubleword Values vpcmpud9Hvpcmpud9HvpcmpudHvpcmpudHvpcmpud:Hvpcmpud:HvpcmpudHvpcmpudHvpcmpud;Hvpcmpud;HvpcmpudHvpcmpudH0https://www.felixcloutier.com/x86/vpcmpd:vpcmpudPSWAPDPSWAPDpswapdPacked Swap Doublewordpswapd3pswapd3+CVTPD2DQCVTPD2DQcvtpd2dqBConvert Packed Double-Precision FP Values to Packed Dword Integerscvtpd2dq3cvtpd2dq3/*https://www.felixcloutier.com/x86/cvtpd2dqPSRADPSRADpsrad-Shift Packed Doubleword Data Right ArithmeticpsradPSRAL3 psradPSRAL3 psradPSRAL3+ psradPSRAL3psradPSRAL3psradPSRAL3/3https://www.felixcloutier.com/x86/psraw:psrad:psraqVPERMT2QVPERMT2Qvpermt2q=Full Permute of Quadwords From Two Tables Overwriting a Table vpermt2q=Hvpermt2qHvpermt2q?Hvpermt2qHvpermt2qAHvpermt2qHvpermt2q=Hvpermt2qHvpermt2q?Hvpermt2qHvpermt2qAHvpermt2qHPhttps://www.felixcloutier.com/x86/vpermt2w:vpermt2d:vpermt2q:vpermt2ps:vpermt2pd VCVTSD2SI VCVTSD2SI vcvtsd2si3Convert Scalar Double-Precision FP Value to Integer  vcvtsd2si4  vcvtsd2siH vcvtsd2si4+  vcvtsd2si+H vcvtsd2si4  vcvtsd2siH vcvtsd2si4+  vcvtsd2si+H vcvtsd2siQH vcvtsd2siQHVPSRAWVPSRAWvpsraw'Shift Packed Word Data Right ArithmeticvpsrawIvpsrawIvpsraw/IvpsrawIvpsrawIvpsraw/IvpsrawIvpsrawIvpsraw/Ivpsraw/Ivpsraw2Ivpsraw5Ivpsraw4 vpsrawIvpsraw4 vpsrawIvpsraw4/ vpsraw/Ivpsraw/Ivpsraw4!vpsrawIvpsraw4!vpsrawIvpsraw4/!vpsraw/Ivpsraw2IvpsrawIvpsrawIvpsraw/Ivpsraw5I VGETEXPSH VGETEXPSH vgetexpsheExtract Exponent of Scalar Half-Precision Floating-Point Value as Half-Precision Floating-Point Value vgetexpshR vgetexpsh$R vgetexpshR vgetexpsh$R vgetexpshRR vgetexpshRR+https://www.felixcloutier.com/x86/vgetexpshVPMULHWVPMULHWvpmulhw:Multiply Packed Signed Word Integers and Store High ResultvpmulhwIvpmulhw/IvpmulhwIvpmulhw2IvpmulhwIvpmulhw5Ivpmulhw4 vpmulhwIvpmulhw4/ vpmulhw/Ivpmulhw4!vpmulhwIvpmulhw42!vpmulhw2IvpmulhwIvpmulhw5IMOVNTPSMOVNTPSmovntpsKStore Packed Single-Precision Floating-Point Values Using Non-Temporal HintmovntpsMOVNTPS3/)https://www.felixcloutier.com/x86/movntpsVPADDWVPADDWvpaddwAdd Packed Word IntegersvpaddwIvpaddw/IvpaddwIvpaddw2IvpaddwIvpaddw5Ivpaddw4 vpaddwIvpaddw4/ vpaddw/Ivpaddw4!vpaddwIvpaddw42!vpaddw2IvpaddwIvpaddw5I VFMADDCPH VFMADDCPH vfmaddcphIFused Multiply-Add of Complex Packed Half-Precision Floating-Point Values vfmaddcph9K vfmaddcphK vfmaddcph:K vfmaddcphK vfmaddcph;R vfmaddcphR vfmaddcph9K vfmaddcphK vfmaddcph:K vfmaddcphK vfmaddcph;R vfmaddcphR vfmaddcphQR vfmaddcphQR6https://www.felixcloutier.com/x86/vfcmaddcph:vfmaddcphSHLSHLshlLogical Shift LeftshlbSHLB3 shlbSHLB3 shlbSHLB3 shlwSHLW3 shlwSHLW3 shlwSHLW3 shllSHLL3shllSHLL3shllSHLL3shlqSHLQ3shlqSHLQ3shlqSHLQ3shlbSHLB3#shlbSHLB3#shlbSHLB3#shlwSHLW3$shlwSHLW3$shlwSHLW3$shllSHLL3'shllSHLL3'shllSHLL3'shlqSHLQ3+shlqSHLQ3+shlqSHLQ3+1https://www.felixcloutier.com/x86/sal:sar:shl:shrVDPPSVDPPSvdpps<Dot Product of Packed Single Precision Floating-Point Valuesvdpps4 vdpps4/ vdpps4 vdpps42  VFNMADD132SS VFNMADD132SS vfnmadd132ssLFused Negative Multiply-Add of Scalar Single-Precision Floating-Point Values vfnmadd132ssH vfnmadd132ss'H vfnmadd132ss4# vfnmadd132ssH vfnmadd132ss4'# vfnmadd132ss'H vfnmadd132ssQH vfnmadd132ssQHHhttps://www.felixcloutier.com/x86/vfnmadd132ss:vfnmadd213ss:vfnmadd231ssVSCATTERPF1DPSVSCATTERPF1DPSvscatterpf1dps„Sparse Prefetch Packed Single-Precision Floating-Point Data Values with Signed Doubleword Indices Using T1 Hint with Intent to Writevscatterpf1dpsKL]https://www.felixcloutier.com/x86/vscatterpf1dps:vscatterpf1qps:vscatterpf1dpd:vscatterpf1qpdVFMSUBSSVFMSUBSSvfmsubssHFused Multiply-Subtract of Scalar Single-Precision Floating-Point Valuesvfmsubss$vfmsubss'$vfmsubss'$ VFNMADDPD VFNMADDPD vfnmaddpdLFused Negative Multiply-Add of Packed Double-Precision Floating-Point Values vfnmaddpd$ vfnmaddpd/$ vfnmaddpd/$ vfnmaddpd$ vfnmaddpd2$ vfnmaddpd2$ TILELOADDT1 TILELOADDT1 tileloaddt1#TILE LOAD Data with T1 caching hint tileloaddt1TS7https://www.felixcloutier.com/x86/tileloadd:tileloaddt1 VCVTUSI2SD VCVTUSI2SD vcvtusi2sdHConvert Unsigned Integer to Scalar Double-Precision Floating-Point Value vcvtusi2sdlH vcvtusi2sdqH vcvtusi2sdl'H vcvtusi2sdq+H vcvtusi2sdqQH,https://www.felixcloutier.com/x86/vcvtusi2sd VMOVDQU16 VMOVDQU16 vmovdqu16Move Unaligned Word Values vmovdqu160I vmovdqu16I vmovdqu163I vmovdqu16I vmovdqu166I vmovdqu16I vmovdqu16/I vmovdqu162I vmovdqu165I vmovdqu16I vmovdqu16/I vmovdqu16I vmovdqu162I vmovdqu16I vmovdqu165I vmovdqu16/I vmovdqu162I vmovdqu165IOhttps://www.felixcloutier.com/x86/movdqu:vmovdqu8:vmovdqu16:vmovdqu32:vmovdqu64MOVSLDUPMOVSLDUPmovsldup'Move Packed Single-FP Low and Duplicatemovsldup3movsldup3/*https://www.felixcloutier.com/x86/movsldupPHSUBDPHSUBDphsubd.Packed Horizontal Subtract Doubleword Integersphsubd3phsubd3+phsubd3phsubd3//https://www.felixcloutier.com/x86/phsubw:phsubdPMOVZXWQPMOVZXWQpmovzxwqBMove Packed Word Integers to Quadword Integers with Zero Extensionpmovzxwq3pmovzxwq3' VPMADD52HUQ VPMADD52HUQ vpmadd52huqjPacked Multiply of Unsigned 52-bit Unsigned Integers and Add High 52-bit Products to Quadword Accumulators vpmadd52huq=K vpmadd52huqK vpmadd52huq?K vpmadd52huqK vpmadd52huqAO vpmadd52huqO vpmadd52huq=K vpmadd52huqK vpmadd52huq[ vpmadd52huq/[ vpmadd52huq?K vpmadd52huqK vpmadd52huq[ vpmadd52huq2[ vpmadd52huqAO vpmadd52huqO-https://www.felixcloutier.com/x86/vpmadd52huq VGF2P8MULB VGF2P8MULB vgf2p8mulbGalois Field Multiply Bytes vgf2p8mulb vgf2p8mulb/ vgf2p8mulb vgf2p8mulb2 vgf2p8mulb vgf2p8mulb5 vgf2p8mulb vgf2p8mulb vgf2p8mulb/ vgf2p8mulb/ vgf2p8mulb vgf2p8mulb vgf2p8mulb2 vgf2p8mulb2 vgf2p8mulb vgf2p8mulb5 VPUNPCKLBW VPUNPCKLBW vpunpcklbw0Unpack and Interleave Low-Order Bytes into Words vpunpcklbwI vpunpcklbw/I vpunpcklbwI vpunpcklbw2I vpunpcklbwI vpunpcklbw5I vpunpcklbw4  vpunpcklbwI vpunpcklbw4/  vpunpcklbw/I vpunpcklbw4! vpunpcklbwI vpunpcklbw42! vpunpcklbw2I vpunpcklbwI vpunpcklbw5IKSHIFTRBKSHIFTRBkshiftrbShift Right 8-bit MaskskshiftrbJEhttps://www.felixcloutier.com/x86/kshiftrw:kshiftrb:kshiftrq:kshiftrdPFSUBRPFSUBRpfsubr&Packed Floating-Point Subtract ReversepfsubrPFSUBR3pfsubrPFSUBR3+MOVNTDQAMOVNTDQAmovntdqa.Load Double Quadword Non-Temporal Aligned Hintmovntdqa3/*https://www.felixcloutier.com/x86/movntdqaVPANDNQVPANDNQvpandnq3Bitwise Logical AND NOT of Packed Quadword Integers vpandnq=HvpandnqHvpandnq?HvpandnqHvpandnqAHvpandnqHvpandnq=HvpandnqHvpandnq?HvpandnqHvpandnqAHvpandnqHPMINUBPMINUBpminub(Minimum of Packed Unsigned Byte IntegerspminubPMINUB3 pminubPMINUB3+ pminubPMINUB3pminubPMINUB3//https://www.felixcloutier.com/x86/pminub:pminuwVPSRAVQVPSRAVQvpsravq4Variable Shift Packed Quadword Data Right Arithmetic vpsravq=HvpsravqHvpsravq?HvpsravqHvpsravqAHvpsravqHvpsravq=HvpsravqHvpsravq?HvpsravqHvpsravqAHvpsravqH9https://www.felixcloutier.com/x86/vpsravw:vpsravd:vpsravq VFMADD132PH VFMADD132PH vfmadd132phAFused Multiply-Add of Packed Half-Precision Floating-Point Values vfmadd132ph<K vfmadd132phK vfmadd132ph>K vfmadd132phK vfmadd132ph@R vfmadd132phR vfmadd132ph<K vfmadd132phK vfmadd132ph>K vfmadd132phK vfmadd132ph@R vfmadd132phR vfmadd132phQR vfmadd132phQRlhttps://www.felixcloutier.com/x86/vfmadd132ph:vfnmadd132ph:vfmadd213ph:vfnmadd213ph:vfmadd231ph:vfnmadd231phVDIVPDVDIVPDvdivpd4Divide Packed Double-Precision Floating-Point Valuesvdivpd=HvdivpdHvdivpd?HvdivpdHvdivpdAHvdivpdHvdivpd=Hvdivpd4 vdivpdHvdivpd4/ vdivpd?Hvdivpd4 vdivpdHvdivpd42 vdivpdAHvdivpdHvdivpdQHvdivpdQHVRSQRTSSVRSQRTSSvrsqrtssQCompute Reciprocal of Square Root of Scalar Single-Precision Floating-Point Valuevrsqrtss4 vrsqrtss4' RDSEEDRDSEEDrdseedRead Random SEEDrdseed +rdseed+rdseed+(https://www.felixcloutier.com/x86/rdseed VSCALEFSD VSCALEFSD vscalefsd_Scale Scalar Double-Precision Floating-Point Value With a Double-Precision Floating-Point Value vscalefsdH vscalefsd+H vscalefsdH vscalefsd+H vscalefsdQH vscalefsdQH+https://www.felixcloutier.com/x86/vscalefsdVPROTQVPROTQvprotqPacked Rotate Quadwordsvprotq"vprotq"vprotq/"vprotq/"vprotq/" VSHA512RNDS2 VSHA512RNDS2 vsha512rnds2&Perform Two Rounds of SHA512 Operation vsha512rnds2) VCVTPH2UW VCVTPH2UW vcvtph2uwZConvert Packed Half-Precision Floating-Point Values to Packed Unsigned Word Integer Values vcvtph2uw<K vcvtph2uw>K vcvtph2uw@R vcvtph2uwK vcvtph2uwK vcvtph2uwR vcvtph2uw<K vcvtph2uwK vcvtph2uw>K vcvtph2uwK vcvtph2uw@R vcvtph2uwR vcvtph2uwQR vcvtph2uwQR+https://www.felixcloutier.com/x86/vcvtph2uwVDIVSDVDIVSDvdivsd4Divide Scalar Double-Precision Floating-Point ValuesvdivsdHvdivsd+Hvdivsd4 vdivsdHvdivsd4+ vdivsd+HvdivsdQHvdivsdQHVMOVLPDVMOVLPDvmovlpd5Move Low Packed Double-Precision Floating-Point Valuevmovlpd4+ vmovlpd+Hvmovlpd4+ vmovlpd+HVPERMI2BVPERMI2Bvpermi2b;Full Permute of Bytes From Two Tables Overwriting the Index vpermi2bTvpermi2b/Tvpermi2bTvpermi2b2Tvpermi2bTvpermi2b5Tvpermi2bTvpermi2b/Tvpermi2bTvpermi2b2Tvpermi2bTvpermi2b5T*https://www.felixcloutier.com/x86/vpermi2bVPERMPDVPERMPDvpermpd0Permute Double-Precision Floating-Point Elementsvpermpd?HvpermpdAHvpermpd?HvpermpdHvpermpdHvpermpdAHvpermpdHvpermpdHvpermpd?Hvpermpd?Hvpermpd4!vpermpdHvpermpdHvpermpd42!vpermpdAHvpermpdAHvpermpdHvpermpdH)https://www.felixcloutier.com/x86/vpermpd VSHUFF32X4 VSHUFF32X4 vshuff32x4=Shuffle 128-Bit Packed Single-Precision Floating-Point Values vshuff32x4:H vshuff32x4H vshuff32x4;H vshuff32x4H vshuff32x4:H vshuff32x4H vshuff32x4;H vshuff32x4HXORXORxorLogical Exclusive ORxorbXORB3xorbXORB3 xorbXORB3  xorbXORB3 #xorwXORW3 xorwXORW3 xorwXORW3 xorwXORW3  xorwXORW3 $xorlXORL3xorlXORL3xorlXORL3xorlXORL3xorlXORL3'xorqXORQ3xorqXORQ3xorqXORQ3xorqXORQ3xorqXORQ3+xorbXORB3#xorbXORB3# xorwXORW3$xorwXORW3$xorwXORW3$ xorlXORL3'xorlXORL3'xorlXORL3'xorqXORQ3+xorqXORQ3+xorqXORQ3+%https://www.felixcloutier.com/x86/xor LDTILECFG LDTILECFG ldtilecfgLoaD TILE ConFiGuration ldtilecfg5+https://www.felixcloutier.com/x86/ldtilecfg VPACKSSWB VPACKSSWB vpacksswb,Pack Words into Bytes with Signed Saturation vpacksswbI vpacksswb/I vpacksswbI vpacksswb2I vpacksswbI vpacksswb5I vpacksswb4  vpacksswbI vpacksswb4/  vpacksswb/I vpacksswb4! vpacksswbI vpacksswb42! vpacksswb2I vpacksswbI vpacksswb5IVPMOVM2QVPMOVM2Qvpmovm2q8Expand Bits of Mask Register to Packed Quadword Integersvpmovm2qJvpmovm2qJvpmovm2qJEhttps://www.felixcloutier.com/x86/vpmovm2b:vpmovm2w:vpmovm2d:vpmovm2qPMULLWPMULLWpmullw9Multiply Packed Signed Word Integers and Store Low ResultpmullwPMULLW3 pmullwPMULLW3+ pmullwPMULLW3pmullwPMULLW3/(https://www.felixcloutier.com/x86/pmullw VCVTSS2SI VCVTSS2SI vcvtss2si9Convert Scalar Single-Precision FP Value to Dword Integer  vcvtss2si4  vcvtss2siH vcvtss2si4'  vcvtss2si'H vcvtss2si4  vcvtss2siH vcvtss2si4'  vcvtss2si'H vcvtss2siQH vcvtss2siQHBLSFILLBLSFILLblsfillFill From Lowest Set Bitblsfill6blsfill'6blsfill6blsfill+6VFMSUBSDVFMSUBSDvfmsubsdHFused Multiply-Subtract of Scalar Double-Precision Floating-Point Valuesvfmsubsd$vfmsubsd+$vfmsubsd+$VMOVSSVMOVSSvmovss2Move Scalar Single-Precision Floating-Point Values vmovss(Hvmovss'Hvmovss4' vmovss'Hvmovss4' vmovss'HvmovssHvmovss vmovssHVPSHLDVPSHLDvpshld Packed Shift Logical Doublewordsvpshld"vpshld/"vpshld/"(https://www.felixcloutier.com/x86/vpshld VPDPWSSDS VPDPWSSDS vpdpwssdsVPacked Dot Product of Signed-by-Signed Word subvectors into Doubleword with Saturation vpdpwssds9K vpdpwssdsK vpdpwssds:K vpdpwssdsK vpdpwssds;V vpdpwssdsV vpdpwssds9K vpdpwssdsW vpdpwssdsK vpdpwssds/W vpdpwssds:K vpdpwssdsW vpdpwssdsK vpdpwssds2W vpdpwssds;V vpdpwssdsV+https://www.felixcloutier.com/x86/vpdpwssdsSETBSETBsetbSet byte if below (CF == 1)setbSETCS3 setbSETCS3#PMOVSXWDPMOVSXWDpmovsxwdDMove Packed Word Integers to Doubleword Integers with Sign Extensionpmovsxwd3pmovsxwd3+KNOTDKNOTDknotdNOT 32-bit Mask RegisterknotdI9https://www.felixcloutier.com/x86/knotw:knotb:knotq:knotdVRCP14SDVRCP14SDvrcp14sdPCompute Approximate Reciprocal of a Scalar Double-Precision Floating-Point Valuevrcp14sdHvrcp14sd+Hvrcp14sdHvrcp14sd+H*https://www.felixcloutier.com/x86/vrcp14sdCVTPD2PSCVTPD2PScvtpd2psNConvert Packed Double-Precision FP Values to Packed Single-Precision FP Valuescvtpd2psCVTPD2PS3cvtpd2psCVTPD2PS3/*https://www.felixcloutier.com/x86/cvtpd2psJNCJNCjncJump if not carry (CF == 0)jncJCC3NjncJCC3OPMOVZXBWPMOVZXBWpmovzxbw>Move Packed Byte Integers to Word Integers with Zero Extensionpmovzxbw3pmovzxbw3+VPCMPQVPCMPQvpcmpq%Compare Packed Signed Quadword Values vpcmpq=Hvpcmpq=HvpcmpqHvpcmpqHvpcmpq?Hvpcmpq?HvpcmpqHvpcmpqHvpcmpqAHvpcmpqAHvpcmpqHvpcmpqH0https://www.felixcloutier.com/x86/vpcmpq:vpcmpuqJNGEJNGEjnge'Jump if not greater or equal (SF != OF)jngeJLT3NjngeJLT3O VPUNPCKLDQ VPUNPCKLDQ vpunpckldq:Unpack and Interleave Low-Order Doublewords into Quadwords vpunpckldq9H vpunpckldqH vpunpckldq:H vpunpckldqH vpunpckldq;H vpunpckldqH vpunpckldq9H vpunpckldq4  vpunpckldqH vpunpckldq4/  vpunpckldq:H vpunpckldq4! vpunpckldqH vpunpckldq42! vpunpckldq;H vpunpckldqHKADDWKADDWkaddwADD Two 16-bit MaskskaddwJ9https://www.felixcloutier.com/x86/kaddw:kaddb:kaddq:kaddd VPTERNLOGQ VPTERNLOGQ vpternlogq4Bitwise Ternary Logical Operation on Quadword Values  vpternlogq=H vpternlogqH vpternlogq?H vpternlogqH vpternlogqAH vpternlogqH vpternlogq=H vpternlogqH vpternlogq?H vpternlogqH vpternlogqAH vpternlogqH7https://www.felixcloutier.com/x86/vpternlogd:vpternlogqCMPPSCMPPScmpps5Compare Packed Single-Precision Floating-Point ValuescmppsCMPPS3cmppsCMPPS3/'https://www.felixcloutier.com/x86/cmppsPSUBDPSUBDpsubd#Subtract Packed Doubleword IntegerspsubdPSUBL3 psubdPSUBL3+ psubdPSUBL3psubdPSUBL3/3https://www.felixcloutier.com/x86/psubb:psubw:psubd VMOVDQA32 VMOVDQA32 vmovdqa32Move Aligned Doubleword Values vmovdqa320H vmovdqa32H vmovdqa323H vmovdqa32H vmovdqa326H vmovdqa32H vmovdqa32/H vmovdqa322H vmovdqa325H vmovdqa32H vmovdqa32/H vmovdqa32H vmovdqa322H vmovdqa32H vmovdqa325H vmovdqa32/H vmovdqa322H vmovdqa325H<https://www.felixcloutier.com/x86/movdqa:vmovdqa32:vmovdqa64LDDQULDDQUlddquLoad Unaligned Integer 128 Bitslddqu3/'https://www.felixcloutier.com/x86/lddqu VFMSUB231PD VFMSUB231PD vfmsub231pdHFused Multiply-Subtract of Packed Double-Precision Floating-Point Values vfmsub231pd=H vfmsub231pdH vfmsub231pd?H vfmsub231pdH vfmsub231pdAH vfmsub231pdH vfmsub231pd=H vfmsub231pd4# vfmsub231pdH vfmsub231pd4/# vfmsub231pd?H vfmsub231pd4# vfmsub231pdH vfmsub231pd42# vfmsub231pdAH vfmsub231pdH vfmsub231pdQH vfmsub231pdQHEhttps://www.felixcloutier.com/x86/vfmsub132pd:vfmsub213pd:vfmsub231pdVPADDUSWVPADDUSWvpaddusw:Add Packed Unsigned Word Integers with Unsigned SaturationvpadduswIvpaddusw/IvpadduswIvpaddusw2IvpadduswIvpaddusw5Ivpaddusw4 vpadduswIvpaddusw4/ vpaddusw/Ivpaddusw4!vpadduswIvpaddusw42!vpaddusw2IvpadduswIvpaddusw5IVPORDVPORDvpord0Bitwise Logical OR of Packed Doubleword Integers vpord9HvpordHvpord:HvpordHvpord;HvpordHvpord9HvpordHvpord:HvpordHvpord;HvpordHCVTPI2PDCVTPI2PDcvtpi2pdBConvert Packed Dword Integers to Packed Double-Precision FP Valuescvtpi2pdCVTPL2PD3cvtpi2pdCVTPL2PD3+*https://www.felixcloutier.com/x86/cvtpi2pdJLJLjlJump if less (SF != OF)jlJLT3NjlJLT3O VPERMI2PD VPERMI2PD vpermi2pd\Full Permute of Double-Precision Floating-Point Values From Two Tables Overwriting the Index  vpermi2pd=H vpermi2pdH vpermi2pd?H vpermi2pdH vpermi2pdAH vpermi2pdH vpermi2pd=H vpermi2pdH vpermi2pd?H vpermi2pdH vpermi2pdAH vpermi2pdHPhttps://www.felixcloutier.com/x86/vpermi2w:vpermi2d:vpermi2q:vpermi2ps:vpermi2pd VPACKUSDW VPACKUSDW vpackusdw4Pack Doublewords into Words with Unsigned Saturation vpackusdw9I vpackusdwI vpackusdw:I vpackusdwI vpackusdw;I vpackusdwI vpackusdw9I vpackusdw4  vpackusdwI vpackusdw4/  vpackusdw:I vpackusdw4! vpackusdwI vpackusdw42! vpackusdw;I vpackusdwIPACKSSDWPACKSSDWpackssdw2Pack Doublewords into Words with Signed SaturationpackssdwPACKSSLW3 packssdwPACKSSLW3+ packssdwPACKSSLW3packssdwPACKSSLW3/3https://www.felixcloutier.com/x86/packsswb:packssdwSETNZSETNZsetnzSet byte if not zero (ZF == 0)setnzSETNE3 setnzSETNE3#PXORPXORpxor#Packed Bitwise Logical Exclusive ORpxorPXOR3 pxorPXOR3+ pxorPXOR3pxorPXOR3/&https://www.felixcloutier.com/x86/pxor CMPNOXADD CMPNOXADD cmpnoxadd Compare for Not Overflow and Add cmpnoxadd' cmpnoxadd+CMCCMCcmcComplement Carry FlagcmcCMC3%https://www.felixcloutier.com/x86/cmcCMOVNLCMOVNLcmovnlMove if not less (SF == OF)cmovnlw3  cmovnlw3 $cmovnll3cmovnll3'cmovnlq3cmovnlq3+PAUSEPAUSEpauseSpin Loop HintpausePAUSE3'https://www.felixcloutier.com/x86/pause VCVTPS2UQQ VCVTPS2UQQ vcvtps2uqq`Convert Packed Single Precision Floating-Point Values to Packed Unsigned Quadword Integer Values vcvtps2uqq8J vcvtps2uqq9J vcvtps2uqq:J vcvtps2uqqJ vcvtps2uqqJ vcvtps2uqqJ vcvtps2uqq8J vcvtps2uqqJ vcvtps2uqq9J vcvtps2uqqJ vcvtps2uqq:J vcvtps2uqqJ vcvtps2uqqQJ vcvtps2uqqQJ,https://www.felixcloutier.com/x86/vcvtps2uqqVPMOVDWVPMOVDWvpmovdwDDown Convert Packed Doubleword Values to Word Values with Truncation vpmovdwHvpmovdw,HvpmovdwHvpmovdw0HvpmovdwHvpmovdw3HvpmovdwHvpmovdwHvpmovdwHvpmovdw+Hvpmovdw/Hvpmovdw2H<https://www.felixcloutier.com/x86/vpmovdw:vpmovsdw:vpmovusdw VSHUFF64X2 VSHUFF64X2 vshuff64x2=Shuffle 128-Bit Packed Double-Precision Floating-Point Values vshuff64x2?H vshuff64x2H vshuff64x2AH vshuff64x2H vshuff64x2?H vshuff64x2H vshuff64x2AH vshuff64x2HVFMADDSUB231PDVFMADDSUB231PDvfmaddsub231pdXFused Multiply-Alternating Add/Subtract of Packed Double-Precision Floating-Point Valuesvfmaddsub231pd=Hvfmaddsub231pdHvfmaddsub231pd?Hvfmaddsub231pdHvfmaddsub231pdAHvfmaddsub231pdHvfmaddsub231pd=Hvfmaddsub231pd4#vfmaddsub231pdHvfmaddsub231pd4/#vfmaddsub231pd?Hvfmaddsub231pd4#vfmaddsub231pdHvfmaddsub231pd42#vfmaddsub231pdAHvfmaddsub231pdHvfmaddsub231pdQHvfmaddsub231pdQHNhttps://www.felixcloutier.com/x86/vfmaddsub132pd:vfmaddsub213pd:vfmaddsub231pdVPSRAQVPSRAQvpsraq+Shift Packed Quadword Data Right Arithmeticvpsraq=Hvpsraq?HvpsraqAHvpsraqHvpsraqHvpsraq/HvpsraqHvpsraqHvpsraq/HvpsraqHvpsraqHvpsraq/Hvpsraq=HvpsraqHvpsraqHvpsraq/Hvpsraq?HvpsraqHvpsraqHvpsraq/HvpsraqAHvpsraqHvpsraqHvpsraq/HVPSHRDWVPSHRDWvpshrdw4Concatenate and Shift Packed Word Data Right Logical vpshrdwKvpshrdw/KvpshrdwKvpshrdw2KvpshrdwUvpshrdw5UvpshrdwKvpshrdw/KvpshrdwKvpshrdw2KvpshrdwUvpshrdw5UPSRAWPSRAWpsraw'Shift Packed Word Data Right ArithmeticpsrawPSRAW3 psrawPSRAW3 psrawPSRAW3+ psrawPSRAW3psrawPSRAW3psrawPSRAW3/3https://www.felixcloutier.com/x86/psraw:psrad:psraqXORPDXORPDxorpd>Bitwise Logical XOR for Double-Precision Floating-Point ValuesxorpdXORPD3xorpdXORPD3/'https://www.felixcloutier.com/x86/xorpdVCVTNEEBF162PSVCVTNEEBF162PSvcvtneebf162ps:Convert Even Elements of Packed BF16 Values to FP32 Valuesvcvtneebf162ps/Zvcvtneebf162ps2Z VFPCLASSSS VFPCLASSSS vfpclassss:Test Class of Scalar Single-Precision Floating-Point Value vfpclassssJ vfpclassssJ vfpclassss'J vfpclassss'J,https://www.felixcloutier.com/x86/vfpclassssVPMOVSQDVPMOVSQDvpmovsqdODown Convert Packed Quadword Values to Doubleword Values with Signed Saturation vpmovsqdHvpmovsqd,HvpmovsqdHvpmovsqd0HvpmovsqdHvpmovsqd3HvpmovsqdHvpmovsqdHvpmovsqdHvpmovsqd+Hvpmovsqd/Hvpmovsqd2H<https://www.felixcloutier.com/x86/vpmovqd:vpmovsqd:vpmovusqdPSLLDPSLLDpslld)Shift Packed Doubleword Data Left LogicalpslldPSLLL3 pslldPSLLL3 pslldPSLLL3+ pslldPSLLL3pslldPSLLL3pslldPSLLL3/3https://www.felixcloutier.com/x86/psllw:pslld:psllq CVTTSS2SI CVTTSS2SI cvttss2siIConvert with Truncation Scalar Single-Precision FP Value to Dword Integer cvttss2si CVTTSS2SL3 cvttss2si CVTTSS2SL3' cvttss2si CVTTSS2SQ3 cvttss2si CVTTSS2SQ3'+https://www.felixcloutier.com/x86/cvttss2siVPMOVQDVPMOVQDvpmovqdHDown Convert Packed Quadword Values to Doubleword Values with Truncation vpmovqdHvpmovqd,HvpmovqdHvpmovqd0HvpmovqdHvpmovqd3HvpmovqdHvpmovqdHvpmovqdHvpmovqd+Hvpmovqd/Hvpmovqd2H<https://www.felixcloutier.com/x86/vpmovqd:vpmovsqd:vpmovusqdPMOVSXBQPMOVSXBQpmovsxbqBMove Packed Byte Integers to Quadword Integers with Sign Extensionpmovsxbq3pmovsxbq3$VPEXTRWVPEXTRWvpextrw Extract Wordvpextrw4 vpextrwIvpextrw4$ vpextrw$IVPMOVSWBVPMOVSWBvpmovswbEDown Convert Packed Word Values to Byte Values with Signed Saturation vpmovswbIvpmovswb,IvpmovswbIvpmovswb0IvpmovswbIvpmovswb3IvpmovswbIvpmovswbIvpmovswbIvpmovswb+Ivpmovswb/Ivpmovswb2I<https://www.felixcloutier.com/x86/vpmovwb:vpmovswb:vpmovuswb VFMADD213PH VFMADD213PH vfmadd213phAFused Multiply-Add of Packed Half-Precision Floating-Point Values vfmadd213ph<K vfmadd213phK vfmadd213ph>K vfmadd213phK vfmadd213ph@R vfmadd213phR vfmadd213ph<K vfmadd213phK vfmadd213ph>K vfmadd213phK vfmadd213ph@R vfmadd213phR vfmadd213phQR vfmadd213phQRlhttps://www.felixcloutier.com/x86/vfmadd132ph:vfnmadd132ph:vfmadd213ph:vfnmadd213ph:vfmadd231ph:vfnmadd231ph VPMOVSXBW VPMOVSXBW vpmovsxbw>Move Packed Byte Integers to Word Integers with Sign Extension vpmovsxbwI vpmovsxbwI vpmovsxbwI vpmovsxbw+I vpmovsxbw/I vpmovsxbw2I vpmovsxbw4  vpmovsxbwI vpmovsxbw4+  vpmovsxbw+I vpmovsxbw4! vpmovsxbwI vpmovsxbw4/! vpmovsxbw/I vpmovsxbwI vpmovsxbw2IVAESENCVAESENCvaesenc+Perform One Round of an AES Encryption Flow vaesenc vaesencKvaesenc/ vaesenc/KvaesencvaesencKvaesenc2vaesenc2KvaesencHvaesenc5H CVTTPD2DQ CVTTPD2DQ cvttpd2dqRConvert with Truncation Packed Double-Precision FP Values to Packed Dword Integers cvttpd2dq3 cvttpd2dq3/+https://www.felixcloutier.com/x86/cvttpd2dqMOVMSKPDMOVMSKPDmovmskpd8Extract Packed Double-Precision Floating-Point Sign MaskmovmskpdMOVMSKPD3*https://www.felixcloutier.com/x86/movmskpdVORPSVORPSvorps<Bitwise Logical OR of Single-Precision Floating-Point Valuesvorps9JvorpsJvorps:JvorpsJvorps;JvorpsJvorps9Jvorps4 vorpsJvorps4/ vorps:Jvorps4 vorpsJvorps42 vorps;JvorpsJVMINSHVMINSHvminsh9Return Minimum Scalar Half-Precision Floating-Point ValuevminshRvminsh$RvminshRvminsh$RvminshRRvminshRR(https://www.felixcloutier.com/x86/vminshVMOVDDUPVMOVDDUPvmovddup Move One Double-FP and DuplicatevmovddupHvmovddupHvmovddupHvmovddup+Hvmovddup2Hvmovddup5Hvmovddup4 vmovddupHvmovddup4+ vmovddup+Hvmovddup4 vmovddupHvmovddup42 vmovddup2HvmovddupHvmovddup5H VGATHERDPS VGATHERDPS vgatherdpsTGather Packed Single-Precision Floating-Point Values Using Signed Doubleword Indices vgatherdpsBH vgatherdpsFH vgatherdpsJH vgatherdpsB! vgatherdpsF!7https://www.felixcloutier.com/x86/vgatherdps:vgatherdpdJBJBjbJump if below (CF == 1)jbJCS3NjbJCS3O PUNPCKLDQ PUNPCKLDQ punpckldq:Unpack and Interleave Low-Order Doublewords into Quadwords punpckldq PUNPCKLLQ3  punpckldq PUNPCKLLQ3'  punpckldq PUNPCKLLQ3 punpckldq PUNPCKLLQ3/Jhttps://www.felixcloutier.com/x86/punpcklbw:punpcklwd:punpckldq:punpcklqdq CVTTPS2DQ CVTTPS2DQ cvttps2dqRConvert with Truncation Packed Single-Precision FP Values to Packed Dword Integers cvttps2dq3 cvttps2dq3/+https://www.felixcloutier.com/x86/cvttps2dq VCVTTPD2QQ VCVTTPD2QQ vcvttpd2qqaConvert with Truncation Packed Double-Precision Floating-Point Values to Packed Quadword Integers vcvttpd2qq=J vcvttpd2qq?J vcvttpd2qqAJ vcvttpd2qqJ vcvttpd2qqJ vcvttpd2qqJ vcvttpd2qq=J vcvttpd2qqJ vcvttpd2qq?J vcvttpd2qqJ vcvttpd2qqAJ vcvttpd2qqJ vcvttpd2qqRJ vcvttpd2qqRJ,https://www.felixcloutier.com/x86/vcvttpd2qq CMPXCHG16B CMPXCHG16B cmpxchg16bCompare and Exchange 16 Bytes cmpxchg16b3/ 6https://www.felixcloutier.com/x86/cmpxchg8b:cmpxchg16bVPCMOVVPCMOVvpcmovPacked Conditional Movevpcmov"vpcmov/"vpcmov/"vpcmov"vpcmov2"vpcmov2" VPCMPESTRM VPCMPESTRM vpcmpestrm3Packed Compare Explicit Length Strings, Return Mask vpcmpestrmq4  vpcmpestrmq4/ VEXP2PSVEXP2PSvexp2psyApproximation to the Exponential 2^x of Packed Single-Precision Floating-Point Values with Less Than 2^-23 Relative Errorvexp2ps;Mvexp2psMvexp2ps;Mvexp2psMvexp2psRMvexp2psRM)https://www.felixcloutier.com/x86/vexp2psSETPESETPEsetpe!Set byte if parity even (PF == 1)setpeSETPS3 setpeSETPS3# VCVTUDQ2PH VCVTUDQ2PH vcvtudq2phZConvert Packed Unsigned Doubleword Integers to Packed Half-Precision Floating-Point Values vcvtudq2phx9K vcvtudq2phy:K vcvtudq2ph;R vcvtudq2phxK vcvtudq2phyK vcvtudq2phR vcvtudq2phx9K vcvtudq2phy:K vcvtudq2phxK vcvtudq2phyK vcvtudq2ph;R vcvtudq2phR vcvtudq2phQR vcvtudq2phQR,https://www.felixcloutier.com/x86/vcvtudq2phSBBSBBsbbSubtract with BorrowsbbbSBBB3sbbbSBBB3 sbbbSBBB3  sbbbSBBB3 #sbbwSBBW3 sbbwSBBW3 sbbwSBBW3 sbbwSBBW3  sbbwSBBW3 $sbblSBBL3sbblSBBL3sbblSBBL3sbblSBBL3sbblSBBL3'sbbqSBBQ3sbbqSBBQ3sbbqSBBQ3sbbqSBBQ3sbbqSBBQ3+sbbbSBBB3#sbbbSBBB3# sbbwSBBW3$sbbwSBBW3$sbbwSBBW3$ sbblSBBL3'sbblSBBL3'sbblSBBL3'sbbqSBBQ3+sbbqSBBQ3+sbbqSBBQ3+%https://www.felixcloutier.com/x86/sbbPADDBPADDBpaddbAdd Packed Byte IntegerspaddbPADDB3 paddbPADDB3+ paddbPADDB3paddbPADDB3/9https://www.felixcloutier.com/x86/paddb:paddw:paddd:paddq VFNMSUB231PS VFNMSUB231PS vfnmsub231psQFused Negative Multiply-Subtract of Packed Single-Precision Floating-Point Values vfnmsub231ps9H vfnmsub231psH vfnmsub231ps:H vfnmsub231psH vfnmsub231ps;H vfnmsub231psH vfnmsub231ps9H vfnmsub231ps4# vfnmsub231psH vfnmsub231ps4/# vfnmsub231ps:H vfnmsub231ps4# vfnmsub231psH vfnmsub231ps42# vfnmsub231ps;H vfnmsub231psH vfnmsub231psQH vfnmsub231psQHHhttps://www.felixcloutier.com/x86/vfnmsub132ps:vfnmsub213ps:vfnmsub231psVPDPWSSDVPDPWSSDvpdpwssdFPacked Dot Product of Signed-by-Signed Word subvectors into Doublewordvpdpwssd9KvpdpwssdKvpdpwssd:KvpdpwssdKvpdpwssd;VvpdpwssdVvpdpwssd9KvpdpwssdWvpdpwssdKvpdpwssd/Wvpdpwssd:KvpdpwssdWvpdpwssdKvpdpwssd2Wvpdpwssd;VvpdpwssdV*https://www.felixcloutier.com/x86/vpdpwssd PCMPESTRI PCMPESTRI pcmpestri4Packed Compare Explicit Length Strings, Return Index pcmpestriq3 pcmpestriq3/+https://www.felixcloutier.com/x86/pcmpestri VPERMILPS VPERMILPS vpermilps.Permute Single-Precision Floating-Point Values  vpermilps9H vpermilps:H vpermilps;H vpermilps9H vpermilpsH vpermilpsH vpermilps:H vpermilpsH vpermilpsH vpermilps;H vpermilpsH vpermilpsH vpermilps9H vpermilps9H vpermilps4  vpermilpsH vpermilps4  vpermilpsH vpermilps4/  vpermilps4/  vpermilps:H vpermilps:H vpermilps4  vpermilpsH vpermilps4  vpermilpsH vpermilps42  vpermilps42  vpermilps;H vpermilps;H vpermilpsH vpermilpsH+https://www.felixcloutier.com/x86/vpermilpsPCMPGTDPCMPGTDpcmpgtd:Compare Packed Signed Doubleword Integers for Greater ThanpcmpgtdPCMPGTL3 pcmpgtdPCMPGTL3+ pcmpgtdPCMPGTL3pcmpgtdPCMPGTL3/9https://www.felixcloutier.com/x86/pcmpgtb:pcmpgtw:pcmpgtdMOVUPDMOVUPDmovupd<Move Unaligned Packed Double-Precision Floating-Point ValuesmovupdMOVUPD3movupdMOVUPD3/movupdMOVUPD3/(https://www.felixcloutier.com/x86/movupdVMULSHVMULSHvmulsh:Fused Multiply Scalar Half-Precision Floating-Point ValuesvmulshRvmulsh$RvmulshRvmulsh$RvmulshQRvmulshQR(https://www.felixcloutier.com/x86/vmulshPSUBQPSUBQpsubq!Subtract Packed Quadword IntegerspsubqPSUBQ3psubqPSUBQ3+psubqPSUBQ3psubqPSUBQ3/'https://www.felixcloutier.com/x86/psubqVPMINSQVPMINSQvpminsq*Minimum of Packed Signed Quadword Integers vpminsq=HvpminsqHvpminsq?HvpminsqHvpminsqAHvpminsqHvpminsq=HvpminsqHvpminsq?HvpminsqHvpminsqAHvpminsqHWRGSBASEWRGSBASEwrgsbaseWRite GS segment BASEwrgsbase=wrgsbase=3https://www.felixcloutier.com/x86/wrfsbase:wrgsbase VFNMADD231SD VFNMADD231SD vfnmadd231sdLFused Negative Multiply-Add of Scalar Double-Precision Floating-Point Values vfnmadd231sdH vfnmadd231sd+H vfnmadd231sd4# vfnmadd231sdH vfnmadd231sd4+# vfnmadd231sd+H vfnmadd231sdQH vfnmadd231sdQHHhttps://www.felixcloutier.com/x86/vfnmadd132sd:vfnmadd213sd:vfnmadd231sd VFPCLASSPD VFPCLASSPD vfpclasspd;Test Class of Packed Double-Precision Floating-Point Values  vfpclasspdx=J vfpclasspdx=J vfpclasspdy?J vfpclasspdy?J vfpclasspdzAJ vfpclasspdzAJ vfpclasspdxJ vfpclasspdxJ vfpclasspdyJ vfpclasspdyJ vfpclasspdzJ vfpclasspdzJ,https://www.felixcloutier.com/x86/vfpclasspdINSERTPSINSERTPSinsertps3Insert Packed Single Precision Floating-Point Valueinsertps3insertps3'*https://www.felixcloutier.com/x86/insertps TILELOADD TILELOADD tileloaddTILE LOAD Data tileloaddTS7https://www.felixcloutier.com/x86/tileloadd:tileloaddt1KSHIFTLWKSHIFTLWkshiftlwShift Left 16-bit MaskskshiftlwHEhttps://www.felixcloutier.com/x86/kshiftlw:kshiftlb:kshiftlq:kshiftldVCOMISSVCOMISSvcomissLCompare Scalar Ordered Single-Precision Floating-Point Values and Set EFLAGSvcomiss4 vcomissHvcomiss4' vcomiss'HvcomissRHVPANDNDVPANDNDvpandnd5Bitwise Logical AND NOT of Packed Doubleword Integers vpandnd9HvpandndHvpandnd:HvpandndHvpandnd;HvpandndHvpandnd9HvpandndHvpandnd:HvpandndHvpandnd;HvpandndHVPSHUFBVPSHUFBvpshufbPacked Shuffle BytesvpshufbIvpshufb/IvpshufbIvpshufb2IvpshufbIvpshufb5Ivpshufb4 vpshufbIvpshufb4/ vpshufb/Ivpshufb4!vpshufbIvpshufb42!vpshufb2IvpshufbIvpshufb5I VPTESTNMQ VPTESTNMQ vptestnmq;Logical NAND of Packed Quadword Integer Values and Set Mask  vptestnmq=H vptestnmq=H vptestnmqH vptestnmqH vptestnmq?H vptestnmq?H vptestnmqH vptestnmqH vptestnmqAH vptestnmqAH vptestnmqH vptestnmqHIhttps://www.felixcloutier.com/x86/vptestnmb:vptestnmw:vptestnmd:vptestnmq SHA256RNDS2 SHA256RNDS2 sha256rnds2&Perform Two Rounds of SHA256 Operation sha256rnds2( sha256rnds2/(-https://www.felixcloutier.com/x86/sha256rnds2PADDUSBPADDUSBpaddusb:Add Packed Unsigned Byte Integers with Unsigned SaturationpaddusbPADDUSB3 paddusbPADDUSB3+ paddusbPADDUSB3paddusbPADDUSB3/1https://www.felixcloutier.com/x86/paddusb:padduswPMAXSWPMAXSWpmaxsw&Maximum of Packed Signed Word IntegerspmaxswPMAXSW3 pmaxswPMAXSW3+ pmaxswPMAXSW3pmaxswPMAXSW3/=https://www.felixcloutier.com/x86/pmaxsb:pmaxsw:pmaxsd:pmaxsqJNAEJNAEjnae$Jump if not above or equal (CF == 1)jnaeJCS3NjnaeJCS3OVADDPSVADDPSvaddps1Add Packed Single-Precision Floating-Point Valuesvaddps9HvaddpsHvaddps:HvaddpsHvaddps;HvaddpsHvaddps9Hvaddps4 vaddpsHvaddps4/ vaddps:Hvaddps4 vaddpsHvaddps42 vaddps;HvaddpsHvaddpsQHvaddpsQH VPMOVZXBW VPMOVZXBW vpmovzxbw>Move Packed Byte Integers to Word Integers with Zero Extension vpmovzxbwI vpmovzxbwI vpmovzxbwI vpmovzxbw+I vpmovzxbw/I vpmovzxbw2I vpmovzxbw4  vpmovzxbwI vpmovzxbw4+  vpmovzxbw+I vpmovzxbw4! vpmovzxbwI vpmovzxbw4/! vpmovzxbw/I vpmovzxbwI vpmovzxbw2IPHADDDPHADDDphaddd(Packed Horizontal Add Doubleword Integerphaddd3phaddd3+phaddd3phaddd3//https://www.felixcloutier.com/x86/phaddw:phadddVFRCZPSVFRCZPSvfrczps7Extract Fraction Packed Single-Precision Floating-Pointvfrczps"vfrczps/"vfrczps"vfrczps2"KORQKORQkorqBitwise Logical OR 64-bit MaskskorqI5https://www.felixcloutier.com/x86/korw:korb:korq:kordPCMPGTWPCMPGTWpcmpgtw4Compare Packed Signed Word Integers for Greater ThanpcmpgtwPCMPGTW3 pcmpgtwPCMPGTW3+ pcmpgtwPCMPGTW3pcmpgtwPCMPGTW3/9https://www.felixcloutier.com/x86/pcmpgtb:pcmpgtw:pcmpgtdVMOVNTPDVMOVNTPDvmovntpdKStore Packed Double-Precision Floating-Point Values Using Non-Temporal Hintvmovntpd4/ vmovntpd/Hvmovntpd42 vmovntpd2Hvmovntpd5HPOPCNTPOPCNTpopcnt Count of Number of Bits Set to 1popcntw  2popcntw $2popcntl32popcntl3'2popcntq32popcntq3+2(https://www.felixcloutier.com/x86/popcntVPANDVPANDvpandPacked Bitwise Logical ANDvpand4 vpand4/ vpand4!vpand42! VPDPBUUDS VPDPBUUDS vpdpbuudsZPacked Dot Product of Unsigned-by-Unsinged Byte subvectors into Doubleword with Saturation vpdpbuudsX vpdpbuuds/X vpdpbuudsX vpdpbuuds2XVPMOVQWVPMOVQWvpmovqwBDown Convert Packed Quadword Values to Word Values with Truncation vpmovqwHvpmovqw(HvpmovqwHvpmovqw,HvpmovqwHvpmovqw0HvpmovqwHvpmovqwHvpmovqwHvpmovqw'Hvpmovqw+Hvpmovqw/H<https://www.felixcloutier.com/x86/vpmovqw:vpmovsqw:vpmovusqwBLSMSKBLSMSKblsmskMask From Lowest Set Bitblsmskl4blsmskl'4blsmskq4blsmskq+4(https://www.felixcloutier.com/x86/blsmskVPMOVDBVPMOVDBvpmovdbDDown Convert Packed Doubleword Values to Byte Values with Truncation vpmovdbHvpmovdb(HvpmovdbHvpmovdb,HvpmovdbHvpmovdb0HvpmovdbHvpmovdbHvpmovdbHvpmovdb'Hvpmovdb+Hvpmovdb/H<https://www.felixcloutier.com/x86/vpmovdb:vpmovsdb:vpmovusdbCWDCWDcwdConvert Word to Doublewordcwtd3-https://www.felixcloutier.com/x86/cwd:cdq:cqoVPINSRQVPINSRQvpinsrqInsert Quadwordvpinsrq4 vpinsrqJvpinsrq4+ vpinsrq+JVPSIGNDVPSIGNDvpsignd"Packed Sign of Doubleword Integersvpsignd4 vpsignd4/ vpsignd4!vpsignd42!PINSRDPINSRDpinsrdInsert DoublewordpinsrdPINSRD3pinsrdPINSRD3'6https://www.felixcloutier.com/x86/pinsrb:pinsrd:pinsrqVPCMPUQVPCMPUQvpcmpuq'Compare Packed Unsigned Quadword Values vpcmpuq=Hvpcmpuq=HvpcmpuqHvpcmpuqHvpcmpuq?Hvpcmpuq?HvpcmpuqHvpcmpuqHvpcmpuqAHvpcmpuqAHvpcmpuqHvpcmpuqH0https://www.felixcloutier.com/x86/vpcmpq:vpcmpuqKADDDKADDDkadddADD Two 32-bit MaskskadddI9https://www.felixcloutier.com/x86/kaddw:kaddb:kaddq:kaddd VFPCLASSPS VFPCLASSPS vfpclassps;Test Class of Packed Single-Precision Floating-Point Values  vfpclasspsx9J vfpclasspsx9J vfpclasspsy:J vfpclasspsy:J vfpclasspsz;J vfpclasspsz;J vfpclasspsxJ vfpclasspsxJ vfpclasspsyJ vfpclasspsyJ vfpclasspszJ vfpclasspszJ,https://www.felixcloutier.com/x86/vfpclasspsKTESTBKTESTBktestb"Bit Test 8-bit Masks and Set FlagsktestbJ=https://www.felixcloutier.com/x86/ktestw:ktestb:ktestq:ktestdNEGNEGnegTwo's Complement NegationnegbNEGB3 negwNEGW3 neglNEGL3negqNEGQ3negbNEGB3#negwNEGW3$neglNEGL3'negqNEGQ3+%https://www.felixcloutier.com/x86/negPFRSQRTPFRSQRTpfrsqrt:Packed Floating-Point Reciprocal Square Root ApproximationpfrsqrtPFRSQRT3pfrsqrtPFRSQRT3+PFMULPFMULpfmulPacked Floating-Point MultiplypfmulPFMUL3pfmulPFMUL3+ANDNPSANDNPSandnpsHBitwise Logical AND NOT of Packed Single-Precision Floating-Point ValuesandnpsANDNPS3andnpsANDNPS3/(https://www.felixcloutier.com/x86/andnpsDPPDDPPDdppd<Dot Product of Packed Double Precision Floating-Point Valuesdppd3dppd3/&https://www.felixcloutier.com/x86/dppdPMOVSXBDPMOVSXBDpmovsxbdDMove Packed Byte Integers to Doubleword Integers with Sign Extensionpmovsxbd3pmovsxbd3' VCVTDQ2PH VCVTDQ2PH vcvtdq2ph@Convert Packed Dword Integers to Packed Half-Precision FP Values vcvtdq2phx9K vcvtdq2phy:K vcvtdq2ph;R vcvtdq2phxK vcvtdq2phyK vcvtdq2phR vcvtdq2phx9K vcvtdq2phy:K vcvtdq2phxK vcvtdq2phyK vcvtdq2ph;R vcvtdq2phR vcvtdq2phQR vcvtdq2phQR+https://www.felixcloutier.com/x86/vcvtdq2ph VDBPSADBW VDBPSADBW vdbpsadbw>Double Block Packed Sum-Absolute-Differences on Unsigned Bytes  vdbpsadbwI vdbpsadbw/I vdbpsadbwI vdbpsadbw2I vdbpsadbwI vdbpsadbw5I vdbpsadbwI vdbpsadbw/I vdbpsadbwI vdbpsadbw2I vdbpsadbwI vdbpsadbw5I+https://www.felixcloutier.com/x86/vdbpsadbw CMPXCHG8B CMPXCHG8B cmpxchg8bCompare and Exchange 8 Bytes cmpxchg8b CMPXCHG8B3+ 6https://www.felixcloutier.com/x86/cmpxchg8b:cmpxchg16b VPACKSSDW VPACKSSDW vpackssdw2Pack Doublewords into Words with Signed Saturation vpackssdw9I vpackssdwI vpackssdw:I vpackssdwI vpackssdw;I vpackssdwI vpackssdw9I vpackssdw4  vpackssdwI vpackssdw4/  vpackssdw:I vpackssdw4! vpackssdwI vpackssdw42! vpackssdw;I vpackssdwI VPBLENDVB VPBLENDVB vpblendvbVariable Blend Packed Bytes vpblendvb4  vpblendvb4/  vpblendvb4! vpblendvb42!VPPERMVPPERMvppermPacked Permute Bytesvpperm"vpperm/"vpperm/"VPSHLDVWVPSHLDVWvpshldvw<Concatenate and Variable Shift Packed Word Data Left Logical vpshldvwKvpshldvw/KvpshldvwKvpshldvw2KvpshldvwUvpshldvw5UvpshldvwKvpshldvw/KvpshldvwKvpshldvw2KvpshldvwUvpshldvw5U VPMASKMOVD VPMASKMOVD vpmaskmovd+Conditional Move Packed Doubleword Integers vpmaskmovd4/! vpmaskmovd42! vpmaskmovd4/! vpmaskmovd42!VPMAXSWVPMAXSWvpmaxsw&Maximum of Packed Signed Word IntegersvpmaxswIvpmaxsw/IvpmaxswIvpmaxsw2IvpmaxswIvpmaxsw5Ivpmaxsw4 vpmaxswIvpmaxsw4/ vpmaxsw/Ivpmaxsw4!vpmaxswIvpmaxsw42!vpmaxsw2IvpmaxswIvpmaxsw5ICMOVPCMOVPcmovpMove if parity (PF == 1)cmovpw3  cmovpw3 $cmovpl3cmovpl3'cmovpq3cmovpq3+ VEXTRACTF32X4 VEXTRACTF32X4 vextractf32x4AExtract 128 Bits of Packed Single-Precision Floating-Point Values vextractf32x4H vextractf32x40H vextractf32x4H vextractf32x40H vextractf32x4H vextractf32x4H vextractf32x4/H vextractf32x4/HMOVSXDMOVSXDmovsxd/Move Doubleword to Quadword with Sign-ExtensionmovslqMOVLQSX3movslqMOVLQSX3'.https://www.felixcloutier.com/x86/movsx:movsxd VINSERTI64X4 VINSERTI64X4 vinserti64x41Insert 256 Bits of Packed Quadword Integer Values vinserti64x4H vinserti64x42H vinserti64x4H vinserti64x42HVPSRLWVPSRLWvpsrlw$Shift Packed Word Data Right LogicalvpsrlwIvpsrlwIvpsrlw/IvpsrlwIvpsrlwIvpsrlw/IvpsrlwIvpsrlwIvpsrlw/Ivpsrlw/Ivpsrlw2Ivpsrlw5Ivpsrlw4 vpsrlwIvpsrlw4 vpsrlwIvpsrlw4/ vpsrlw/Ivpsrlw/Ivpsrlw4!vpsrlwIvpsrlw4!vpsrlwIvpsrlw4/!vpsrlw/Ivpsrlw2IvpsrlwIvpsrlwIvpsrlw/Ivpsrlw5IBLSRBLSRblsrReset Lowest Set Bitblsrl4blsrl'4blsrq4blsrq+4&https://www.felixcloutier.com/x86/blsr VFMSUB213PH VFMSUB213PH vfmsub213phFFused Multiply-Subtract of Packed Half-Precision Floating-Point Values vfmsub213ph<K vfmsub213phK vfmsub213ph>K vfmsub213phK vfmsub213ph@R vfmsub213phR vfmsub213ph<K vfmsub213phK vfmsub213ph>K vfmsub213phK vfmsub213ph@R vfmsub213phR vfmsub213phQR vfmsub213phQRlhttps://www.felixcloutier.com/x86/vfmsub132ph:vfnmsub132ph:vfmsub213ph:vfnmsub213ph:vfmsub231ph:vfnmsub231phCMOVNOCMOVNOcmovnoMove if not overflow (OF == 0)cmovnow3  cmovnow3 $cmovnol3cmovnol3'cmovnoq3cmovnoq3+ VFIXUPIMMSD VFIXUPIMMSD vfixupimmsd;Fix Up Special Scalar Double-Precision Floating-Point Value vfixupimmsdH vfixupimmsd+H vfixupimmsdH vfixupimmsd+H vfixupimmsdRH vfixupimmsdRH-https://www.felixcloutier.com/x86/vfixupimmsd VPTESTNMD VPTESTNMD vptestnmd=Logical NAND of Packed Doubleword Integer Values and Set Mask  vptestnmd9H vptestnmd9H vptestnmdH vptestnmdH vptestnmd:H vptestnmd:H vptestnmdH vptestnmdH vptestnmd;H vptestnmd;H vptestnmdH vptestnmdHIhttps://www.felixcloutier.com/x86/vptestnmb:vptestnmw:vptestnmd:vptestnmqKORTESTDKORTESTDkortestdOR 32-bit Masks and Set FlagskortestdIEhttps://www.felixcloutier.com/x86/kortestw:kortestb:kortestq:kortestdSHRXSHRXshrx+Logical Shift Right Without Affecting Flagsshrxl5shrxl'5shrxq5shrxq+50https://www.felixcloutier.com/x86/sarx:shlx:shrxTZCNTTZCNTtzcnt&Count the Number of Trailing Zero Bitstzcntw  4tzcntw $4tzcntl34tzcntl3'4tzcntq34tzcntq3+4'https://www.felixcloutier.com/x86/tzcntVPMULLQVPMULLQvpmullq=Multiply Packed Signed Quadword Integers and Store Low Result vpmullq=JvpmullqJvpmullq?JvpmullqJvpmullqAJvpmullqJvpmullq=JvpmullqJvpmullq?JvpmullqJvpmullqAJvpmullqJ VPSCATTERDD VPSCATTERDD vpscatterdd?Scatter Packed Doubleword Values with Signed Doubleword Indices vpscatterddCH vpscatterddGH vpscatterddKHQhttps://www.felixcloutier.com/x86/vpscatterdd:vpscatterdq:vpscatterqd:vpscatterqqPINSRBPINSRBpinsrb Insert Bytepinsrb3pinsrb3#6https://www.felixcloutier.com/x86/pinsrb:pinsrd:pinsrqROUNDSDROUNDSDroundsd3Round Scalar Double Precision Floating-Point Valuesroundsd3roundsd3+)https://www.felixcloutier.com/x86/roundsd VFMADD231PH VFMADD231PH vfmadd231phAFused Multiply-Add of Packed Half-Precision Floating-Point Values vfmadd231ph<K vfmadd231phK vfmadd231ph>K vfmadd231phK vfmadd231ph@R vfmadd231phR vfmadd231ph<K vfmadd231phK vfmadd231ph>K vfmadd231phK vfmadd231ph@R vfmadd231phR vfmadd231phQR vfmadd231phQRlhttps://www.felixcloutier.com/x86/vfmadd132ph:vfnmadd132ph:vfmadd213ph:vfnmadd213ph:vfmadd231ph:vfnmadd231ph PCMPISTRM PCMPISTRM pcmpistrm3Packed Compare Implicit Length Strings, Return Mask pcmpistrm3 pcmpistrm3/+https://www.felixcloutier.com/x86/pcmpistrmMWAITMWAITmwait Monitor WaitmwaitD'https://www.felixcloutier.com/x86/mwait CMPNBXADD CMPNBXADD cmpnbxaddCompare for Not Below and Add cmpnbxadd' cmpnbxadd+VRSQRTPSVRSQRTPSvrsqrtpsTCompute Reciprocals of Square Roots of Packed Single-Precision Floating-Point Valuesvrsqrtps4 vrsqrtps4/ vrsqrtps4 vrsqrtps42  PREFETCHIT0 PREFETCHIT0 prefetchit04Prefetch Code Into Instruction Caches using IT0 Hint prefetchit0PAPCMPEQQPCMPEQQpcmpeqq)Compare Packed Quadword Data for Equalitypcmpeqq3pcmpeqq3/)https://www.felixcloutier.com/x86/pcmpeqq VFMSUB231SS VFMSUB231SS vfmsub231ssHFused Multiply-Subtract of Scalar Single-Precision Floating-Point Values vfmsub231ssH vfmsub231ss'H vfmsub231ss4# vfmsub231ssH vfmsub231ss4'# vfmsub231ss'H vfmsub231ssQH vfmsub231ssQHEhttps://www.felixcloutier.com/x86/vfmsub132ss:vfmsub213ss:vfmsub231ssVFMSUBPDVFMSUBPDvfmsubpdHFused Multiply-Subtract of Packed Double-Precision Floating-Point Valuesvfmsubpd$vfmsubpd/$vfmsubpd/$vfmsubpd$vfmsubpd2$vfmsubpd2$VPLZCNTQVPLZCNTQvplzcntq@Count the Number of Leading Zero Bits for Packed Quadword Values vplzcntq=Nvplzcntq?NvplzcntqANvplzcntqNvplzcntqNvplzcntqNvplzcntq=NvplzcntqNvplzcntq?NvplzcntqNvplzcntqANvplzcntqN3https://www.felixcloutier.com/x86/vplzcntd:vplzcntq VFIXUPIMMPS VFIXUPIMMPS vfixupimmps<Fix Up Special Packed Single-Precision Floating-Point Values vfixupimmps9K vfixupimmpsK vfixupimmps:H vfixupimmpsH vfixupimmps;H vfixupimmpsH vfixupimmps9K vfixupimmpsK vfixupimmps:H vfixupimmpsH vfixupimmps;H vfixupimmpsH vfixupimmpsRH vfixupimmpsRH-https://www.felixcloutier.com/x86/vfixupimmpsCMOVNCCMOVNCcmovncMove if not carry (CF == 0)cmovncw3  cmovncw3 $cmovncl3cmovncl3'cmovncq3cmovncq3+ VRNDSCALEPD VRNDSCALEPD vrndscalepd^Round Packed Double-Precision Floating-Point Values To Include A Given Number Of Fraction Bits vrndscalepd=H vrndscalepd?H vrndscalepdAH vrndscalepdH vrndscalepdH vrndscalepdH vrndscalepd=H vrndscalepdH vrndscalepd?H vrndscalepdH vrndscalepdAH vrndscalepdH vrndscalepdRH vrndscalepdRH-https://www.felixcloutier.com/x86/vrndscalepd VADDSUBPD VADDSUBPD vaddsubpdPacked Double-FP Add/Subtract vaddsubpd4  vaddsubpd4/  vaddsubpd4  vaddsubpd42 VPSUBUSBVPSUBUSBvpsubusb?Subtract Packed Unsigned Byte Integers with Unsigned SaturationvpsubusbIvpsubusb/IvpsubusbIvpsubusb2IvpsubusbIvpsubusb5Ivpsubusb4 vpsubusbIvpsubusb4/ vpsubusb/Ivpsubusb4!vpsubusbIvpsubusb42!vpsubusb2IvpsubusbIvpsubusb5IVPOPCNTQVPOPCNTQvpopcntq-Packed Population Count for Quadword Integers vpopcntq=Kvpopcntq?KvpopcntqAPvpopcntqKvpopcntqKvpopcntqPvpopcntq=KvpopcntqKvpopcntq?KvpopcntqKvpopcntqAPvpopcntqPAESDECAESDECaesdec+Perform One Round of an AES Decryption FlowaesdecAESDEC'aesdecAESDEC/'(https://www.felixcloutier.com/x86/aesdecVANDNPSVANDNPSvandnpsHBitwise Logical AND NOT of Packed Single-Precision Floating-Point Valuesvandnps9JvandnpsJvandnps:JvandnpsJvandnps;JvandnpsJvandnps9Jvandnps4 vandnpsJvandnps4/ vandnps:Jvandnps4 vandnpsJvandnps42 vandnps;JvandnpsJVCOMISHVCOMISHvcomishJCompare Scalar Ordered Half-Precision Floating-Point Values and Set EFLAGSvcomishRvcomish$RvcomishRR)https://www.felixcloutier.com/x86/vcomishJMPJMPjmpJump UnconditionallyjmpJMP3NjmpJMP3OjmpqJMPjmpqJMP+%https://www.felixcloutier.com/x86/jmpVPADDSWVPADDSWvpaddsw6Add Packed Signed Word Integers with Signed SaturationvpaddswIvpaddsw/IvpaddswIvpaddsw2IvpaddswIvpaddsw5Ivpaddsw4 vpaddswIvpaddsw4/ vpaddsw/Ivpaddsw4!vpaddswIvpaddsw42!vpaddsw2IvpaddswIvpaddsw5IVPDPWUSDVPDPWUSDvpdpwusdHPacked Dot Product of Unsigned-by-Signed Word subvectors into DoublewordvpdpwusdYvpdpwusd/YvpdpwusdYvpdpwusd2Y VREDUCEPD VREDUCEPD vreducepdQPerform Reduction Transformation on Packed Double-Precision Floating-Point Values  vreducepd=J vreducepd?J vreducepdAJ vreducepdJ vreducepdJ vreducepdJ vreducepd=J vreducepdJ vreducepd?J vreducepdJ vreducepdAJ vreducepdJ+https://www.felixcloutier.com/x86/vreducepd VBROADCASTSS VBROADCASTSS vbroadcastss1Broadcast Single-Precision Floating-Point Element  vbroadcastssH vbroadcastssH vbroadcastss'H vbroadcastss'H vbroadcastss4! vbroadcastss4'  vbroadcastss4! vbroadcastssH vbroadcastss4'  vbroadcastss'H vbroadcastssH vbroadcastss'HVFMADDPSVFMADDPSvfmaddpsCFused Multiply-Add of Packed Single-Precision Floating-Point Valuesvfmaddps$vfmaddps/$vfmaddps/$vfmaddps$vfmaddps2$vfmaddps2$PADDWPADDWpaddwAdd Packed Word IntegerspaddwPADDW3 paddwPADDW3+ paddwPADDW3paddwPADDW3/9https://www.felixcloutier.com/x86/paddb:paddw:paddd:paddqCVTPS2DQCVTPS2DQcvtps2dqBConvert Packed Single-Precision FP Values to Packed Dword Integerscvtps2dq3cvtps2dq3/*https://www.felixcloutier.com/x86/cvtps2dq VCVTTSS2SI VCVTTSS2SI vcvttss2siIConvert with Truncation Scalar Single-Precision FP Value to Dword Integer  vcvttss2si4  vcvttss2siH vcvttss2si4'  vcvttss2si'H vcvttss2si4  vcvttss2siH vcvttss2si4'  vcvttss2si'H vcvttss2siRH vcvttss2siRHXADDXADDxaddExchange and AddxaddbXADDB3  xaddwXADDW3  xaddlXADDL3xaddqXADDQ3xaddbXADDB3# xaddwXADDW3$ xaddlXADDL3'xaddqXADDQ3+&https://www.felixcloutier.com/x86/xaddBLCIBLCIblciIsolate Lowest Clear Bitblci6blci'6blci6blci+6INSERTQINSERTQinsertq Insert Fieldinsertq3insertq3VPADDBVPADDBvpaddbAdd Packed Byte IntegersvpaddbIvpaddb/IvpaddbIvpaddb2IvpaddbIvpaddb5Ivpaddb4 vpaddbIvpaddb4/ vpaddb/Ivpaddb4!vpaddbIvpaddb42!vpaddb2IvpaddbIvpaddb5I VFMSUB213SD VFMSUB213SD vfmsub213sdHFused Multiply-Subtract of Scalar Double-Precision Floating-Point Values vfmsub213sdH vfmsub213sd+H vfmsub213sd4# vfmsub213sdH vfmsub213sd4+# vfmsub213sd+H vfmsub213sdQH vfmsub213sdQHEhttps://www.felixcloutier.com/x86/vfmsub132sd:vfmsub213sd:vfmsub231sdVPCMPGTQVPCMPGTQvpcmpgtq$Compare Packed Data for Greater Thanvpcmpgtq=Hvpcmpgtq=HvpcmpgtqHvpcmpgtqHvpcmpgtq?Hvpcmpgtq?HvpcmpgtqHvpcmpgtqHvpcmpgtqAHvpcmpgtqAHvpcmpgtqHvpcmpgtqHvpcmpgtq4 vpcmpgtq4/ vpcmpgtq4!vpcmpgtq42!VBLENDPDVBLENDPDvblendpd3Blend Packed Double Precision Floating-Point Valuesvblendpd4 vblendpd4/ vblendpd4 vblendpd42 PEXTRWPEXTRWpextrw Extract WordpextrwPEXTRW3 pextrwPEXTRW3pextrwPEXTRW3$(https://www.felixcloutier.com/x86/pextrw VFNMADD213PH VFNMADD213PH vfnmadd213phJFused Negative Multiply-Add of Packed Half-Precision Floating-Point Values vfnmadd213ph<K vfnmadd213phK vfnmadd213ph>K vfnmadd213phK vfnmadd213ph@R vfnmadd213phR vfnmadd213ph<K vfnmadd213phK vfnmadd213ph>K vfnmadd213phK vfnmadd213ph@R vfnmadd213phR vfnmadd213phQR vfnmadd213phQRlhttps://www.felixcloutier.com/x86/vfmadd132ph:vfnmadd132ph:vfmadd213ph:vfnmadd213ph:vfmadd231ph:vfnmadd231phMOVMOVmovMovemovbMOVB3 movbMOVB3  movbMOVB3 #movwMOVW3 movwMOVW3  movwMOVW3 $movabsl movlMOVL3movlMOVL3movlMOVL3'movabsq!movqMOVQ3movabsq3movqMOVQ3movqMOVQ3+movbMOVB3#movbMOVB3# movwMOVW3$movwMOVW3$ movlMOVL3'movlMOVL3'movqMOVQ3+movqMOVQ3+movabsl movabsq!'https://www.felixcloutier.com/x86/mov-2PEXTRBPEXTRBpextrb Extract Bytepextrb3pextrb3#6https://www.felixcloutier.com/x86/pextrb:pextrd:pextrqVMOVSHVMOVSHvmovsh0Move Scalar Half-Precision Floating-Point Valuesvmovsh%Rvmovsh$Rvmovsh$Rvmovsh$RvmovshRvmovshR(https://www.felixcloutier.com/x86/vmovshVMAXPHVMAXPHvmaxph:Return Maximum Packed Half-Precision Floating-Point Valuesvmaxph<KvmaxphKvmaxph>KvmaxphKvmaxph@RvmaxphRvmaxph<KvmaxphKvmaxph>KvmaxphKvmaxph@RvmaxphRvmaxphRRvmaxphRR(https://www.felixcloutier.com/x86/vmaxphVPCOMDVPCOMDvpcomd)Compare Packed Signed Doubleword Integersvpcomd"vpcomd/"MCOMMITMCOMMITmcommit Memory COMMITmcommit> VFNMSUBPD VFNMSUBPD vfnmsubpdQFused Negative Multiply-Subtract of Packed Double-Precision Floating-Point Values vfnmsubpd$ vfnmsubpd/$ vfnmsubpd/$ vfnmsubpd$ vfnmsubpd2$ vfnmsubpd2$PCMPGTBPCMPGTBpcmpgtb4Compare Packed Signed Byte Integers for Greater ThanpcmpgtbPCMPGTB3 pcmpgtbPCMPGTB3+ pcmpgtbPCMPGTB3pcmpgtbPCMPGTB3/9https://www.felixcloutier.com/x86/pcmpgtb:pcmpgtw:pcmpgtdCVTPS2PDCVTPS2PDcvtps2pdNConvert Packed Single-Precision FP Values to Packed Double-Precision FP Valuescvtps2pdCVTPS2PD3cvtps2pdCVTPS2PD3+*https://www.felixcloutier.com/x86/cvtps2pd PREFETCHW PREFETCHW prefetchw4Prefetch Data into Caches in Anticipation of a Write prefetchw3#B+https://www.felixcloutier.com/x86/prefetchwANDANDand Logical ANDandbANDB3andbANDB3 andbANDB3  andbANDB3 #andwANDW3 andwANDW3 andwANDW3 andwANDW3  andwANDW3 $andlANDL3andlANDL3andlANDL3andlANDL3andlANDL3'andqANDQ3andqANDQ3andqANDQ3andqANDQ3andqANDQ3+andbANDB3#andbANDB3# andwANDW3$andwANDW3$andwANDW3$ andlANDL3'andlANDL3'andlANDL3'andqANDQ3+andqANDQ3+andqANDQ3+%https://www.felixcloutier.com/x86/and PUNPCKHQDQ PUNPCKHQDQ punpckhqdq@Unpack and Interleave High-Order Quadwords into Double Quadwords punpckhqdq PUNPCKHQDQ3 punpckhqdq PUNPCKHQDQ3/Jhttps://www.felixcloutier.com/x86/punpckhbw:punpckhwd:punpckhdq:punpckhqdq VFMADDCSH VFMADDCSH vfmaddcshIFused Multiply-Add of Complex Scalar Half-Precision Floating-Point Values vfmaddcshR vfmaddcsh'R vfmaddcshR vfmaddcsh'R vfmaddcshQR vfmaddcshQR6https://www.felixcloutier.com/x86/vfcmaddcsh:vfmaddcshVFRCZPDVFRCZPDvfrczpd7Extract Fraction Packed Double-Precision Floating-Pointvfrczpd"vfrczpd/"vfrczpd"vfrczpd2" VPDPWSUDS VPDPWSUDS vpdpwsudsXPacked Dot Product of Signed-by-Unsigned Word subvectors into Doubleword with Saturation vpdpwsudsY vpdpwsuds/Y vpdpwsudsY vpdpwsuds2Y VPMACSDQL VPMACSDQL vpmacsdqlCPacked Multiply Accumulate Signed Low Doubleword to Signed Quadword vpmacsdql" vpmacsdql/"VPSHADVPSHADvpshad#Packed Shift Arithmetic Doublewordsvpshad"vpshad/"vpshad/" PREFETCHIT1 PREFETCHIT1 prefetchit14Prefetch Code Into Instruction Caches using IT1 Hint prefetchit1PASTCSTCstcSet Carry FlagstcSTC3%https://www.felixcloutier.com/x86/stc CVTTPD2PI CVTTPD2PI cvttpd2piRConvert with Truncation Packed Double-Precision FP Values to Packed Dword Integers cvttpd2pi CVTTPD2PL3 cvttpd2pi CVTTPD2PL3/+https://www.felixcloutier.com/x86/cvttpd2pi VGETMANTSD VGETMANTSD vgetmantsdMExtract Normalized Mantissa from Scalar Double-Precision Floating-Point Value vgetmantsdH vgetmantsd+H vgetmantsdH vgetmantsd+H vgetmantsdRH vgetmantsdRH,https://www.felixcloutier.com/x86/vgetmantsdBLCFILLBLCFILLblcfillFill From Lowest Clear Bitblcfill6blcfill'6blcfill6blcfill+6VSHUFPDVSHUFPDvshufpd5Shuffle Packed Double-Precision Floating-Point Valuesvshufpd=HvshufpdHvshufpd?HvshufpdHvshufpdAHvshufpdHvshufpd=Hvshufpd4 vshufpdHvshufpd4/ vshufpd?Hvshufpd4 vshufpdHvshufpd42 vshufpdAHvshufpdH VPMACSSDQH VPMACSSDQH vpmacssdqhTPacked Multiply Accumulate with Saturation Signed High Doubleword to Signed Quadword vpmacssdqh" vpmacssdqh/" VFMSUB132SS VFMSUB132SS vfmsub132ssHFused Multiply-Subtract of Scalar Single-Precision Floating-Point Values vfmsub132ssH vfmsub132ss'H vfmsub132ss4# vfmsub132ssH vfmsub132ss4'# vfmsub132ss'H vfmsub132ssQH vfmsub132ssQHEhttps://www.felixcloutier.com/x86/vfmsub132ss:vfmsub213ss:vfmsub231ss VFNMSUBSS VFNMSUBSS vfnmsubssQFused Negative Multiply-Subtract of Scalar Single-Precision Floating-Point Values vfnmsubss$ vfnmsubss'$ vfnmsubss'$ PUNPCKHWD PUNPCKHWD punpckhwd7Unpack and Interleave High-Order Words into Doublewords punpckhwd PUNPCKHWL3  punpckhwd PUNPCKHWL3+  punpckhwd PUNPCKHWL3 punpckhwd PUNPCKHWL3/Jhttps://www.felixcloutier.com/x86/punpckhbw:punpckhwd:punpckhdq:punpckhqdqCMOVNBECMOVNBEcmovnbe0Move if not below or equal (CF == 0 and ZF == 0)cmovnbew3  cmovnbew3 $cmovnbel3cmovnbel3'cmovnbeq3cmovnbeq3+PCMPEQBPCMPEQBpcmpeqb%Compare Packed Byte Data for EqualitypcmpeqbPCMPEQB3 pcmpeqbPCMPEQB3+ pcmpeqbPCMPEQB3pcmpeqbPCMPEQB3/9https://www.felixcloutier.com/x86/pcmpeqb:pcmpeqw:pcmpeqdPSLLDQPSLLDQpslldq)Shift Packed Double Quadword Left LogicalpslldqPSLLO3(https://www.felixcloutier.com/x86/pslldqSETNOSETNOsetno"Set byte if not overflow (OF == 0)setnoSETOC3 setnoSETOC3#CWDECWDEcwdeConvert Word to Doublewordcwtl3/https://www.felixcloutier.com/x86/cbw:cwde:cdqeSETNBESETNBEsetnbe4Set byte if not below or equal (CF == 0 and ZF == 0)setnbeSETHI3 setnbeSETHI3# TCMMIMFP16PS TCMMIMFP16PS tcmmimfp16pscTile Complex Matrix Multiply IMaginary part of FP16 tiles with Packed Single-precision accumulation tcmmimfp16psTTTVCMPPSVCMPPSvcmpps5Compare Packed Single-Precision Floating-Point Valuesvcmpps9Hvcmpps9HvcmppsHvcmppsHvcmpps:Hvcmpps:HvcmppsHvcmppsHvcmpps;Hvcmpps;HvcmppsHvcmppsHvcmpps4 vcmpps4/ vcmpps4 vcmpps42 vcmppsRHvcmppsRHVPBROADCASTMW2DVPBROADCASTMW2Dvpbroadcastmw2d?Broadcast Low Word of Mask Register to Packed Doubleword Valuesvpbroadcastmw2dNvpbroadcastmw2dNvpbroadcastmw2dNVPCOMQVPCOMQvpcomq'Compare Packed Signed Quadword Integersvpcomq"vpcomq/"VPABSQVPABSQvpabsq*Packed Absolute Value of Quadword Integers vpabsq=Hvpabsq?HvpabsqAHvpabsqHvpabsqHvpabsqHvpabsq=HvpabsqHvpabsq?HvpabsqHvpabsqAHvpabsqH VCVTTPD2UQQ VCVTTPD2UQQ vcvttpd2uqqjConvert with Truncation Packed Double-Precision Floating-Point Values to Packed Unsigned Quadword Integers vcvttpd2uqq=J vcvttpd2uqq?J vcvttpd2uqqAJ vcvttpd2uqqJ vcvttpd2uqqJ vcvttpd2uqqJ vcvttpd2uqq=J vcvttpd2uqqJ vcvttpd2uqq?J vcvttpd2uqqJ vcvttpd2uqqAJ vcvttpd2uqqJ vcvttpd2uqqRJ vcvttpd2uqqRJ-https://www.felixcloutier.com/x86/vcvttpd2uqqVFMADDSUB231PSVFMADDSUB231PSvfmaddsub231psXFused Multiply-Alternating Add/Subtract of Packed Single-Precision Floating-Point Valuesvfmaddsub231ps9Hvfmaddsub231psHvfmaddsub231ps:Hvfmaddsub231psHvfmaddsub231ps;Hvfmaddsub231psHvfmaddsub231ps9Hvfmaddsub231ps4#vfmaddsub231psHvfmaddsub231ps4/#vfmaddsub231ps:Hvfmaddsub231ps4#vfmaddsub231psHvfmaddsub231ps42#vfmaddsub231ps;Hvfmaddsub231psHvfmaddsub231psQHvfmaddsub231psQHNhttps://www.felixcloutier.com/x86/vfmaddsub132ps:vfmaddsub213ps:vfmaddsub231ps VPBROADCASTQ VPBROADCASTQ vpbroadcastqBroadcast Quadword Integer vpbroadcastqH vpbroadcastqH vpbroadcastqH vpbroadcastqH vpbroadcastqH vpbroadcastqH vpbroadcastq+H vpbroadcastq+H vpbroadcastq+H vpbroadcastqH vpbroadcastq4! vpbroadcastqH vpbroadcastq4+! vpbroadcastq+H vpbroadcastqH vpbroadcastq4! vpbroadcastqH vpbroadcastq4+! vpbroadcastq+H vpbroadcastqH vpbroadcastqH vpbroadcastq+HUhttps://www.felixcloutier.com/x86/vpbroadcastb:vpbroadcastw:vpbroadcastd:vpbroadcastq VPMOVUSWB VPMOVUSWB vpmovuswbGDown Convert Packed Word Values to Byte Values with Unsigned Saturation  vpmovuswbI vpmovuswb,I vpmovuswbI vpmovuswb0I vpmovuswbI vpmovuswb3I vpmovuswbI vpmovuswbI vpmovuswbI vpmovuswb+I vpmovuswb/I vpmovuswb2I<https://www.felixcloutier.com/x86/vpmovwb:vpmovswb:vpmovuswb VFNMSUB231SS VFNMSUB231SS vfnmsub231ssQFused Negative Multiply-Subtract of Scalar Single-Precision Floating-Point Values vfnmsub231ssH vfnmsub231ss'H vfnmsub231ss4# vfnmsub231ssH vfnmsub231ss4'# vfnmsub231ss'H vfnmsub231ssQH vfnmsub231ssQHHhttps://www.felixcloutier.com/x86/vfnmsub132ss:vfnmsub213ss:vfnmsub231ssVPRORVQVPRORVQvprorvq%Variable Rotate Packed Quadword Right vprorvq=HvprorvqHvprorvq?HvprorvqHvprorvqAHvprorvqHvprorvq=HvprorvqHvprorvq?HvprorvqHvprorvqAHvprorvqH?https://www.felixcloutier.com/x86/vprord:vprorvd:vprorq:vprorvq VRNDSCALESH VRNDSCALESH vrndscalesh[Round Scalar Half-Precision Floating-Point Value To Include A Given Number Of Fraction Bits vrndscaleshR vrndscalesh$R vrndscaleshR vrndscalesh$R vrndscaleshRR vrndscaleshRR-https://www.felixcloutier.com/x86/vrndscalesh SHA1RNDS4 SHA1RNDS4 sha1rnds4%Perform Four Rounds of SHA1 Operation sha1rnds4( sha1rnds4/(+https://www.felixcloutier.com/x86/sha1rnds4KTESTQKTESTQktestq#Bit Test 64-bit Masks and Set FlagsktestqI=https://www.felixcloutier.com/x86/ktestw:ktestb:ktestq:ktestd VEXTRACTPS VEXTRACTPS vextractps4Extract Packed Single Precision Floating-Point Value vextractps  vextractpsH vextractps4'  vextractps'H CLFLUSHOPT CLFLUSHOPT clflushoptFlush Cache Line Optimized clflushopt#:,https://www.felixcloutier.com/x86/clflushopt VFNMSUB132SS VFNMSUB132SS vfnmsub132ssQFused Negative Multiply-Subtract of Scalar Single-Precision Floating-Point Values vfnmsub132ssH vfnmsub132ss'H vfnmsub132ss4# vfnmsub132ssH vfnmsub132ss4'# vfnmsub132ss'H vfnmsub132ssQH vfnmsub132ssQHHhttps://www.felixcloutier.com/x86/vfnmsub132ss:vfnmsub213ss:vfnmsub231ss VSHUFI32X4 VSHUFI32X4 vshufi32x40Shuffle 128-Bit Packed Doubleword Integer Values vshufi32x4:H vshufi32x4H vshufi32x4;H vshufi32x4H vshufi32x4:H vshufi32x4H vshufi32x4;H vshufi32x4H VUNPCKHPS VUNPCKHPS vunpckhpsHUnpack and Interleave High Packed Single-Precision Floating-Point Values vunpckhps9H vunpckhpsH vunpckhps:H vunpckhpsH vunpckhps;H vunpckhpsH vunpckhps9H vunpckhps4  vunpckhpsH vunpckhps4/  vunpckhps:H vunpckhps4  vunpckhpsH vunpckhps42  vunpckhps;H vunpckhpsH VZEROUPPER VZEROUPPER vzeroupper Zero Upper Bits of YMM Registers vzeroupper4 ,https://www.felixcloutier.com/x86/vzeroupperVBROADCASTF32X8VBROADCASTF32X8vbroadcastf32x88Broadcast Eight Single-Precision Floating-Point Elementsvbroadcastf32x82Jvbroadcastf32x82JPSUBBPSUBBpsubbSubtract Packed Byte IntegerspsubbPSUBB3 psubbPSUBB3+ psubbPSUBB3psubbPSUBB3/3https://www.felixcloutier.com/x86/psubb:psubw:psubdPFCMPEQPFCMPEQpfcmpeq'Packed Floating-Point Compare for EqualpfcmpeqPFCMPEQ3pfcmpeqPFCMPEQ3+ VPMOVSXWD VPMOVSXWD vpmovsxwdDMove Packed Word Integers to Doubleword Integers with Sign Extension vpmovsxwdH vpmovsxwdH vpmovsxwdH vpmovsxwd+H vpmovsxwd/H vpmovsxwd2H vpmovsxwd4  vpmovsxwdH vpmovsxwd4+  vpmovsxwd+H vpmovsxwd4! vpmovsxwdH vpmovsxwd4/! vpmovsxwd/H vpmovsxwdH vpmovsxwd2HUNPCKLPSUNPCKLPSunpcklpsGUnpack and Interleave Low Packed Single-Precision Floating-Point ValuesunpcklpsUNPCKLPS3unpcklpsUNPCKLPS3/*https://www.felixcloutier.com/x86/unpcklps VFNMSUB132PS VFNMSUB132PS vfnmsub132psQFused Negative Multiply-Subtract of Packed Single-Precision Floating-Point Values vfnmsub132ps9H vfnmsub132psH vfnmsub132ps:H vfnmsub132psH vfnmsub132ps;H vfnmsub132psH vfnmsub132ps9H vfnmsub132ps4# vfnmsub132psH vfnmsub132ps4/# vfnmsub132ps:H vfnmsub132ps4# vfnmsub132psH vfnmsub132ps42# vfnmsub132ps;H vfnmsub132psH vfnmsub132psQH vfnmsub132psQHHhttps://www.felixcloutier.com/x86/vfnmsub132ps:vfnmsub213ps:vfnmsub231psVPMACSWWVPMACSWWvpmacsww5Packed Multiply Accumulate Signed Word to Signed Wordvpmacsww"vpmacsww/" VPCOMPRESSW VPCOMPRESSW vpcompresswBStore Sparse Packed Word Integer Values into Dense Memory/Register  vpcompressw0K vpcompresswK vpcompressw3K vpcompresswK vpcompressw6U vpcompresswU vpcompresswK vpcompresswK vpcompresswU vpcompressw/K vpcompressw2K vpcompressw5U VPHADDUBW VPHADDUBW vphaddubw+Packed Horizontal Add Unsigned Byte to Word vphaddubw" vphaddubw/"KADDQKADDQkaddqADD Two 64-bit MaskskaddqI9https://www.felixcloutier.com/x86/kaddw:kaddb:kaddq:kaddd VFMADD213PS VFMADD213PS vfmadd213psCFused Multiply-Add of Packed Single-Precision Floating-Point Values vfmadd213ps9H vfmadd213psH vfmadd213ps:H vfmadd213psH vfmadd213ps;H vfmadd213psH vfmadd213ps9H vfmadd213ps4# vfmadd213psH vfmadd213ps4/# vfmadd213ps:H vfmadd213ps4# vfmadd213psH vfmadd213ps42# vfmadd213ps;H vfmadd213psH vfmadd213psQH vfmadd213psQHEhttps://www.felixcloutier.com/x86/vfmadd132ps:vfmadd213ps:vfmadd231psVPSUBSBVPSUBSBvpsubsb;Subtract Packed Signed Byte Integers with Signed SaturationvpsubsbIvpsubsb/IvpsubsbIvpsubsb2IvpsubsbIvpsubsb5Ivpsubsb4 vpsubsbIvpsubsb4/ vpsubsb/Ivpsubsb4!vpsubsbIvpsubsb42!vpsubsb2IvpsubsbIvpsubsb5ICMOVPOCMOVPOcmovpoMove if parity odd (PF == 0)cmovpow3  cmovpow3 $cmovpol3cmovpol3'cmovpoq3cmovpoq3+ VCVTSH2SS VCVTSH2SS vcvtsh2ssJConvert Scalar Half-Precision FP Value to Scalar Double-Precision FP Value vcvtsh2ssR vcvtsh2ss$R vcvtsh2ssR vcvtsh2ss$R vcvtsh2ssRR vcvtsh2ssRR+https://www.felixcloutier.com/x86/vcvtsh2ssVPMOVSQWVPMOVSQWvpmovsqwIDown Convert Packed Quadword Values to Word Values with Signed Saturation vpmovsqwHvpmovsqw(HvpmovsqwHvpmovsqw,HvpmovsqwHvpmovsqw0HvpmovsqwHvpmovsqwHvpmovsqwHvpmovsqw'Hvpmovsqw+Hvpmovsqw/H<https://www.felixcloutier.com/x86/vpmovqw:vpmovsqw:vpmovusqw VCVTPD2DQ VCVTPD2DQ vcvtpd2dqBConvert Packed Double-Precision FP Values to Packed Dword Integers vcvtpd2dqx=H vcvtpd2dqy?H vcvtpd2dqAH vcvtpd2dqxH vcvtpd2dqyH vcvtpd2dqH vcvtpd2dqx=H vcvtpd2dqy?H vcvtpd2dqx4  vcvtpd2dqxH vcvtpd2dqy4  vcvtpd2dqyH vcvtpd2dqx4/  vcvtpd2dqy42  vcvtpd2dqAH vcvtpd2dqH vcvtpd2dqQH vcvtpd2dqQH VCVTTPS2QQ VCVTTPS2QQ vcvttps2qqnConvert with Truncation Packed Single Precision Floating-Point Values to Packed Singed Quadword Integer Values vcvttps2qq8J vcvttps2qq9J vcvttps2qq:J vcvttps2qqJ vcvttps2qqJ vcvttps2qqJ vcvttps2qq8J vcvttps2qqJ vcvttps2qq9J vcvttps2qqJ vcvttps2qq:J vcvttps2qqJ vcvttps2qqRJ vcvttps2qqRJ,https://www.felixcloutier.com/x86/vcvttps2qqVPSHRDQVPSHRDQvpshrdq8Concatenate and Shift Packed Quadword Data Right Logical vpshrdq=KvpshrdqKvpshrdq?KvpshrdqKvpshrdqAUvpshrdqUvpshrdq=KvpshrdqKvpshrdq?KvpshrdqKvpshrdqAUvpshrdqUINT3INT3int3Interrupt 3 (debug trap)int35https://www.felixcloutier.com/x86/intn:into:int3:int1VPSHLDDVPSHLDDvpshldd9Concatenate and Shift Packed Doubleword Data Left Logical vpshldd9KvpshlddKvpshldd:KvpshlddKvpshldd;UvpshlddUvpshldd9KvpshlddKvpshldd:KvpshlddKvpshldd;UvpshlddU CMPNSXADD CMPNSXADD cmpnsxaddCompare for Not Sign and Add cmpnsxadd' cmpnsxadd+ VFNMADD213PS VFNMADD213PS vfnmadd213psLFused Negative Multiply-Add of Packed Single-Precision Floating-Point Values vfnmadd213ps9H vfnmadd213psH vfnmadd213ps:H vfnmadd213psH vfnmadd213ps;H vfnmadd213psH vfnmadd213ps9H vfnmadd213ps4# vfnmadd213psH vfnmadd213ps4/# vfnmadd213ps:H vfnmadd213ps4# vfnmadd213psH vfnmadd213ps42# vfnmadd213ps;H vfnmadd213psH vfnmadd213psQH vfnmadd213psQHHhttps://www.felixcloutier.com/x86/vfnmadd132ps:vfnmadd213ps:vfnmadd231psVPSHUFDVPSHUFDvpshufdShuffle Packed Doublewordsvpshufd9Hvpshufd:Hvpshufd;HvpshufdHvpshufdHvpshufdHvpshufd9Hvpshufd4 vpshufdHvpshufd4/ vpshufd:Hvpshufd4!vpshufdHvpshufd42!vpshufd;HvpshufdHVPMINUBVPMINUBvpminub(Minimum of Packed Unsigned Byte IntegersvpminubIvpminub/IvpminubIvpminub2IvpminubIvpminub5Ivpminub4 vpminubIvpminub4/ vpminub/Ivpminub4!vpminubIvpminub42!vpminub2IvpminubIvpminub5IVPOPCNTBVPOPCNTBvpopcntb)Packed Population Count for Byte Integers vpopcntbKvpopcntbKvpopcntbSvpopcntb/Kvpopcntb2Kvpopcntb5SvpopcntbKvpopcntb/KvpopcntbKvpopcntb2KvpopcntbSvpopcntb5S VEXPANDPS VEXPANDPS vexpandpsKLoad Sparse Packed Single-Precision Floating-Point Values from Dense Memory  vexpandpsH vexpandpsH vexpandpsH vexpandps/H vexpandps2H vexpandps5H vexpandpsH vexpandps/H vexpandpsH vexpandps2H vexpandpsH vexpandps5H+https://www.felixcloutier.com/x86/vexpandpsPMOVMSKBPMOVMSKBpmovmskbMove Byte MaskpmovmskbPMOVMSKB3 pmovmskbPMOVMSKB3*https://www.felixcloutier.com/x86/pmovmskb VPMOVZXDQ VPMOVZXDQ vpmovzxdqHMove Packed Doubleword Integers to Quadword Integers with Zero Extension vpmovzxdqH vpmovzxdqH vpmovzxdqH vpmovzxdq+H vpmovzxdq/H vpmovzxdq2H vpmovzxdq4  vpmovzxdqH vpmovzxdq4+  vpmovzxdq+H vpmovzxdq4! vpmovzxdqH vpmovzxdq4/! vpmovzxdq/H vpmovzxdqH vpmovzxdq2HBLCSBLCSblcsSet Lowest Clear Bitblcs6blcs'6blcs6blcs+6LEALEAleaLoad Effective AddressleawLEAW3 "lealLEAL3"leaqLEAQ3"%https://www.felixcloutier.com/x86/leaVPSLLVQVPSLLVQvpsllvq0Variable Shift Packed Quadword Data Left Logicalvpsllvq=HvpsllvqHvpsllvq?HvpsllvqHvpsllvqAHvpsllvqHvpsllvq=Hvpsllvq4!vpsllvqHvpsllvq4/!vpsllvq?Hvpsllvq4!vpsllvqHvpsllvq42!vpsllvqAHvpsllvqH9https://www.felixcloutier.com/x86/vpsllvw:vpsllvd:vpsllvqCMOVAECMOVAEcmovae Move if above or equal (CF == 0)cmovaew3  cmovaew3 $cmovael3cmovael3'cmovaeq3cmovaeq3+VEXP2PDVEXP2PDvexp2pdyApproximation to the Exponential 2^x of Packed Double-Precision Floating-Point Values with Less Than 2^-23 Relative Errorvexp2pdAMvexp2pdMvexp2pdAMvexp2pdMvexp2pdRMvexp2pdRM)https://www.felixcloutier.com/x86/vexp2pd VFMSUB132SD VFMSUB132SD vfmsub132sdHFused Multiply-Subtract of Scalar Double-Precision Floating-Point Values vfmsub132sdH vfmsub132sd+H vfmsub132sd4# vfmsub132sdH vfmsub132sd4+# vfmsub132sd+H vfmsub132sdQH vfmsub132sdQHEhttps://www.felixcloutier.com/x86/vfmsub132sd:vfmsub213sd:vfmsub231sdMOVNTPDMOVNTPDmovntpdKStore Packed Double-Precision Floating-Point Values Using Non-Temporal HintmovntpdMOVNTPD3/)https://www.felixcloutier.com/x86/movntpdMAXSDMAXSDmaxsd;Return Maximum Scalar Double-Precision Floating-Point ValuemaxsdMAXSD3maxsdMAXSD3+'https://www.felixcloutier.com/x86/maxsd VCVTSS2SD VCVTSS2SD vcvtss2sdLConvert Scalar Single-Precision FP Value to Scalar Double-Precision FP Value vcvtss2sdH vcvtss2sd'H vcvtss2sd4  vcvtss2sdH vcvtss2sd4'  vcvtss2sd'H vcvtss2sdRH vcvtss2sdRHVFMADDPDVFMADDPDvfmaddpdCFused Multiply-Add of Packed Double-Precision Floating-Point Valuesvfmaddpd$vfmaddpd/$vfmaddpd/$vfmaddpd$vfmaddpd2$vfmaddpd2$ VMOVSHDUP VMOVSHDUP vmovshdup(Move Packed Single-FP High and Duplicate vmovshdupH vmovshdupH vmovshdupH vmovshdup/H vmovshdup2H vmovshdup5H vmovshdup4  vmovshdupH vmovshdup4/  vmovshdup/H vmovshdup4  vmovshdupH vmovshdup42  vmovshdup2H vmovshdupH vmovshdup5HVSM3MSG1VSM3MSG1vsm3msg1?Perform Initial Calculation for the Next Four SM3 Message Wordsvsm3msg1vsm3msg1/VFMADDSUB132PDVFMADDSUB132PDvfmaddsub132pdXFused Multiply-Alternating Add/Subtract of Packed Double-Precision Floating-Point Valuesvfmaddsub132pd=Hvfmaddsub132pdHvfmaddsub132pd?Hvfmaddsub132pdHvfmaddsub132pdAHvfmaddsub132pdHvfmaddsub132pd=Hvfmaddsub132pd4#vfmaddsub132pdHvfmaddsub132pd4/#vfmaddsub132pd?Hvfmaddsub132pd4#vfmaddsub132pdHvfmaddsub132pd42#vfmaddsub132pdAHvfmaddsub132pdHvfmaddsub132pdQHvfmaddsub132pdQHNhttps://www.felixcloutier.com/x86/vfmaddsub132pd:vfmaddsub213pd:vfmaddsub231pd VPMOVMSKB VPMOVMSKB vpmovmskbMove Byte Mask vpmovmskb4  vpmovmskb4!PMOVZXDQPMOVZXDQpmovzxdqHMove Packed Doubleword Integers to Quadword Integers with Zero Extensionpmovzxdq3pmovzxdq3+VSCATTERPF1QPSVSCATTERPF1QPSvscatterpf1qps‚Sparse Prefetch Packed Single-Precision Floating-Point Data Values with Signed Quadword Indices Using T1 Hint with Intent to Writevscatterpf1qpsML]https://www.felixcloutier.com/x86/vscatterpf1dps:vscatterpf1qps:vscatterpf1dpd:vscatterpf1qpdJGJGjg&Jump if greater (ZF == 0 and SF == OF)jgJGT3NjgJGT3O VCVTSD2USI VCVTSD2USI vcvtsd2usiSConvert Scalar Double-Precision Floating-Point Value to Unsigned Doubleword Integer vcvtsd2usiH vcvtsd2usi+H vcvtsd2usiH vcvtsd2usi+H vcvtsd2usiQH vcvtsd2usiQH,https://www.felixcloutier.com/x86/vcvtsd2usi VINSERTF64X4 VINSERTF64X4 vinsertf64x4@Insert 256 Bits of Packed Double-Precision Floating-Point Values vinsertf64x4H vinsertf64x42H vinsertf64x4H vinsertf64x42HPFRSQIT1PFRSQIT1pfrsqit18Packed Floating-Point Reciprocal Square Root Iteration 1pfrsqit1PFRSQIT13pfrsqit1PFRSQIT13+ VCVTUDQ2PD VCVTUDQ2PD vcvtudq2pd\Convert Packed Unsigned Doubleword Integers to Packed Double-Precision Floating-Point Values  vcvtudq2pd8H vcvtudq2pd9H vcvtudq2pd:H vcvtudq2pdH vcvtudq2pdH vcvtudq2pdH vcvtudq2pd8H vcvtudq2pdH vcvtudq2pd9H vcvtudq2pdH vcvtudq2pd:H vcvtudq2pdH,https://www.felixcloutier.com/x86/vcvtudq2pd VRNDSCALEPS VRNDSCALEPS vrndscaleps^Round Packed Single-Precision Floating-Point Values To Include A Given Number Of Fraction Bits vrndscaleps9H vrndscaleps:H vrndscaleps;H vrndscalepsH vrndscalepsH vrndscalepsH vrndscaleps9H vrndscalepsH vrndscaleps:H vrndscalepsH vrndscaleps;H vrndscalepsH vrndscalepsRH vrndscalepsRH-https://www.felixcloutier.com/x86/vrndscaleps VPCMPISTRI VPCMPISTRI vpcmpistri4Packed Compare Implicit Length Strings, Return Index vpcmpistri4  vpcmpistri4/ PFCMPGEPFCMPGEpfcmpge2Packed Floating-Point Compare for Greater or EqualpfcmpgePFCMPGE3pfcmpgePFCMPGE3+PANDPANDpandPacked Bitwise Logical ANDpandPAND3 pandPAND3+ pandPAND3pandPAND3/&https://www.felixcloutier.com/x86/pandPFACCPFACCpfacc Packed Floating-Point AccumulatepfaccPFACC3pfaccPFACC3+ PCMPESTRM PCMPESTRM pcmpestrm3Packed Compare Explicit Length Strings, Return Mask pcmpestrmq3 pcmpestrmq3/+https://www.felixcloutier.com/x86/pcmpestrmSETLESETLEsetle/Set byte if less or equal (ZF == 1 or SF != OF)setleSETLE3 setleSETLE3# VCVTPS2PH VCVTPS2PH vcvtps2ph<Convert Single-Precision FP value to Half-Precision FP value vcvtps2phH vcvtps2ph,H vcvtps2phH vcvtps2ph0H vcvtps2phH vcvtps2ph3H vcvtps2ph% vcvtps2phH vcvtps2ph% vcvtps2phH vcvtps2phH vcvtps2ph+% vcvtps2ph+H vcvtps2ph/% vcvtps2ph/H vcvtps2ph2H vcvtps2phRH vcvtps2phRH+https://www.felixcloutier.com/x86/vcvtps2phVPHSUBSWVPHSUBSWvphsubswFPacked Horizontal Subtract Signed Word Integers with Signed Saturationvphsubsw4 vphsubsw4/ vphsubsw4!vphsubsw42!VPINSRBVPINSRBvpinsrb Insert Bytevpinsrb4 vpinsrbIvpinsrb4# vpinsrb#IVPCOMBVPCOMBvpcomb#Compare Packed Signed Byte Integersvpcomb"vpcomb/"SETOSETOsetoSet byte if overflow (OF == 1)setoSETOS3 setoSETOS3#VFMADDSUB213PHVFMADDSUB213PHvfmaddsub213phVFused Multiply-Alternating Add/Subtract of Packed Half-Precision Floating-Point Valuesvfmaddsub213ph<Kvfmaddsub213phKvfmaddsub213ph>Kvfmaddsub213phKvfmaddsub213ph@Rvfmaddsub213phRvfmaddsub213ph<Kvfmaddsub213phKvfmaddsub213ph>Kvfmaddsub213phKvfmaddsub213ph@Rvfmaddsub213phRvfmaddsub213phQRvfmaddsub213phQRNhttps://www.felixcloutier.com/x86/vfmaddsub132ph:vfmaddsub213ph:vfmaddsub231phVPABSDVPABSDvpabsd,Packed Absolute Value of Doubleword Integersvpabsd9Hvpabsd:Hvpabsd;HvpabsdHvpabsdHvpabsdHvpabsd9Hvpabsd4 vpabsdHvpabsd4/ vpabsd:Hvpabsd4!vpabsdHvpabsd42!vpabsd;HvpabsdHVDIVSHVDIVSHvdivsh2Divide Scalar Half-Precision Floating-Point ValuesvdivshRvdivsh$RvdivshRvdivsh$RvdivshQRvdivshQR(https://www.felixcloutier.com/x86/vdivshVPAVGBVPAVGBvpavgbAverage Packed Byte IntegersvpavgbIvpavgb/IvpavgbIvpavgb2IvpavgbIvpavgb5Ivpavgb4 vpavgbIvpavgb4/ vpavgb/Ivpavgb4!vpavgbIvpavgb42!vpavgb2IvpavgbIvpavgb5IVPMINUDVPMINUDvpminud.Minimum of Packed Unsigned Doubleword Integersvpminud9HvpminudHvpminud:HvpminudHvpminud;HvpminudHvpminud9Hvpminud4 vpminudHvpminud4/ vpminud:Hvpminud4!vpminudHvpminud42!vpminud;HvpminudHVPMOVSDBVPMOVSDBvpmovsdbKDown Convert Packed Doubleword Values to Byte Values with Signed Saturation vpmovsdbHvpmovsdb(HvpmovsdbHvpmovsdb,HvpmovsdbHvpmovsdb0HvpmovsdbHvpmovsdbHvpmovsdbHvpmovsdb'Hvpmovsdb+Hvpmovsdb/H<https://www.felixcloutier.com/x86/vpmovdb:vpmovsdb:vpmovusdbVDIVPSVDIVPSvdivps4Divide Packed Single-Precision Floating-Point Valuesvdivps9HvdivpsHvdivps:HvdivpsHvdivps;HvdivpsHvdivps9Hvdivps4 vdivpsHvdivps4/ vdivps:Hvdivps4 vdivpsHvdivps42 vdivps;HvdivpsHvdivpsQHvdivpsQH VPMADCSWD VPMADCSWD vpmadcswd?Packed Multiply Add Accumulate Signed Word to Signed Doubleword vpmadcswd" vpmadcswd/"PUSHPUSHpushPush Value Onto the StackpushqPUSHQ3pushqPUSHQ3pushwPUSHW3 pushqPUSHQ3pushwPUSHW3$pushqPUSHQ3+&https://www.felixcloutier.com/x86/pushVPSHLWVPSHLWvpshlwPacked Shift Logical Wordsvpshlw"vpshlw/"vpshlw/"SETASETAseta'Set byte if above (CF == 0 and ZF == 0)setaSETHI3 setaSETHI3#PMINSWPMINSWpminsw&Minimum of Packed Signed Word IntegerspminswPMINSW3 pminswPMINSW3+ pminswPMINSW3pminswPMINSW3//https://www.felixcloutier.com/x86/pminsb:pminswCMOVLECMOVLEcmovle+Move if less or equal (ZF == 1 or SF != OF)cmovlew3  cmovlew3 $cmovlel3cmovlel3'cmovleqCMOVLEQ3cmovleqCMOVLEQ3+MOVDDUPMOVDDUPmovddup Move One Double-FP and Duplicatemovddup3movddup3+)https://www.felixcloutier.com/x86/movddupBLCICBLCICblcic%Isolate Lowest Set Bit and Complementblcic6blcic'6blcic6blcic+6PHSUBSWPHSUBSWphsubswFPacked Horizontal Subtract Signed Word Integers with Signed Saturationphsubsw3phsubsw3+phsubsw3phsubsw3/)https://www.felixcloutier.com/x86/phsubswUNPCKHPSUNPCKHPSunpckhpsHUnpack and Interleave High Packed Single-Precision Floating-Point ValuesunpckhpsUNPCKHPS3unpckhpsUNPCKHPS3/*https://www.felixcloutier.com/x86/unpckhpsRETRETretReturn from ProcedureretqRETretq%https://www.felixcloutier.com/x86/retPADDSBPADDSBpaddsb6Add Packed Signed Byte Integers with Signed SaturationpaddsbPADDSB3 paddsbPADDSB3+ paddsbPADDSB3paddsbPADDSB3//https://www.felixcloutier.com/x86/paddsb:paddswMOVQMOVQmovq Move Quadword movqMOVQ3 movqMOVQ3movqMOVQ3 movqMOVQ3 movqMOVQ3+ movqMOVQ3movqMOVQ3movqMOVQ3+movqMOVQ3+ movqMOVQ3+&https://www.felixcloutier.com/x86/movqVAESKEYGENASSISTVAESKEYGENASSISTvaeskeygenassistAES Round Key Generation Assistvaeskeygenassist vaeskeygenassist/  VMOVMSKPD VMOVMSKPD vmovmskpd8Extract Packed Double-Precision Floating-Point Sign Mask vmovmskpd4  vmovmskpd4 VPDPBSSDVPDPBSSDvpdpbssdFPacked Dot Product of Signed-by-Singed Byte subvectors into DoublewordvpdpbssdXvpdpbssd/XvpdpbssdXvpdpbssd2XCMOVGECMOVGEcmovge#Move if greater or equal (SF == OF)cmovgew3  cmovgew3 $cmovgel3cmovgel3'cmovgeq3cmovgeq3+VPOPCNTDVPOPCNTDvpopcntd/Packed Population Count for Doubleword Integers vpopcntd9Kvpopcntd:Kvpopcntd;PvpopcntdKvpopcntdKvpopcntdPvpopcntd9KvpopcntdKvpopcntd:KvpopcntdKvpopcntd;PvpopcntdPVRCPPHVRCPPHvrcpphNCompute Approximate Reciprocals of Packed Half-Precision Floating-Point Values vrcpph<Kvrcpph>Kvrcpph@RvrcpphKvrcpphKvrcpphRvrcpph<KvrcpphKvrcpph>KvrcpphKvrcpph@RvrcpphR(https://www.felixcloutier.com/x86/vrcpph VSHA512MSG1 VSHA512MSG1 vsha512msg1NPerform an Intermediate Calculation for the Next Four SHA512 Message Quadwords vsha512msg1) VPMOVZXBQ VPMOVZXBQ vpmovzxbqBMove Packed Byte Integers to Quadword Integers with Zero Extension vpmovzxbqH vpmovzxbqH vpmovzxbqH vpmovzxbq$H vpmovzxbq'H vpmovzxbq+H vpmovzxbq4  vpmovzxbqH vpmovzxbq4$  vpmovzxbq$H vpmovzxbq4! vpmovzxbqH vpmovzxbq4'! vpmovzxbq'H vpmovzxbqH vpmovzxbq+HKSHIFTLQKSHIFTLQkshiftlqShift Left 64-bit MaskskshiftlqIEhttps://www.felixcloutier.com/x86/kshiftlw:kshiftlb:kshiftlq:kshiftld VMOVNTDQA VMOVNTDQA vmovntdqa.Load Double Quadword Non-Temporal Aligned Hint vmovntdqa4/  vmovntdqa/H vmovntdqa42! vmovntdqa2H vmovntdqa5HPFMAXPFMAXpfmaxPacked Floating-Point MaximumpfmaxPFMAX3pfmaxPFMAX3+PSHUFWPSHUFWpshufwShuffle Packed WordspshufwPSHUFW3 pshufwPSHUFW3+ (https://www.felixcloutier.com/x86/pshufwVPERMWVPERMWvpermwPermute Word Integers vpermwIvpermw/IvpermwIvpermw2IvpermwIvpermw5IvpermwIvpermw/IvpermwIvpermw2IvpermwIvpermw5I/https://www.felixcloutier.com/x86/vpermd:vpermwVRCP14PSVRCP14PSvrcp14psPCompute Approximate Reciprocals of Packed Single-Precision Floating-Point Values vrcp14ps9Hvrcp14ps:Hvrcp14ps;Hvrcp14psHvrcp14psHvrcp14psHvrcp14ps9Hvrcp14psHvrcp14ps:Hvrcp14psHvrcp14ps;Hvrcp14psH*https://www.felixcloutier.com/x86/vrcp14psVPSRLQVPSRLQvpsrlq(Shift Packed Quadword Data Right Logicalvpsrlq=Hvpsrlq?HvpsrlqAHvpsrlqHvpsrlqHvpsrlq/HvpsrlqHvpsrlqHvpsrlq/HvpsrlqHvpsrlqHvpsrlq/Hvpsrlq=Hvpsrlq4 vpsrlqHvpsrlq4 vpsrlqHvpsrlq4/ vpsrlq/Hvpsrlq?Hvpsrlq4!vpsrlqHvpsrlq4!vpsrlqHvpsrlq4/!vpsrlq/HvpsrlqAHvpsrlqHvpsrlqHvpsrlq/HPMULDQPMULDQpmuldqDMultiply Packed Signed Doubleword Integers and Store Quadword Resultpmuldq3pmuldq3/(https://www.felixcloutier.com/x86/pmuldqVPERMQVPERMQvpermqPermute Quadword Integersvpermq?HvpermqAHvpermq?HvpermqHvpermqHvpermqAHvpermqHvpermqHvpermq?Hvpermq?Hvpermq4!vpermqHvpermqHvpermq42!vpermqAHvpermqAHvpermqHvpermqH(https://www.felixcloutier.com/x86/vpermq VFNMADD132PD VFNMADD132PD vfnmadd132pdLFused Negative Multiply-Add of Packed Double-Precision Floating-Point Values vfnmadd132pd=H vfnmadd132pdH vfnmadd132pd?H vfnmadd132pdH vfnmadd132pdAH vfnmadd132pdH vfnmadd132pd=H vfnmadd132pd4# vfnmadd132pdH vfnmadd132pd4/# vfnmadd132pd?H vfnmadd132pd4# vfnmadd132pdH vfnmadd132pd42# vfnmadd132pdAH vfnmadd132pdH vfnmadd132pdQH vfnmadd132pdQHHhttps://www.felixcloutier.com/x86/vfnmadd132pd:vfnmadd213pd:vfnmadd231pdVDPPDVDPPDvdppd<Dot Product of Packed Double Precision Floating-Point Valuesvdppd4 vdppd4/ KMOVDKMOVDkmovdMove 32-bit MaskkmovdIkmovdIkmovd'IkmovdIkmovd'I9https://www.felixcloutier.com/x86/kmovw:kmovb:kmovq:kmovd MASKMOVDQU MASKMOVDQU maskmovdqu'Store Selected Bytes of Double Quadword maskmovdqu,https://www.felixcloutier.com/x86/maskmovdquSQRTPDSQRTPDsqrtpdECompute Square Roots of Packed Double-Precision Floating-Point ValuessqrtpdSQRTPD3sqrtpdSQRTPD3/(https://www.felixcloutier.com/x86/sqrtpdVPINSRDVPINSRDvpinsrdInsert Doublewordvpinsrd4 vpinsrdJvpinsrd4' vpinsrd'JMULSSMULSSmulss6Multiply Scalar Single-Precision Floating-Point ValuesmulssMULSS3mulssMULSS3''https://www.felixcloutier.com/x86/mulssPADDDPADDDpadddAdd Packed Doubleword IntegerspadddPADDL3 padddPADDL3+ padddPADDL3padddPADDL3/9https://www.felixcloutier.com/x86/paddb:paddw:paddd:paddq VEXTRACTI64X4 VEXTRACTI64X4 vextracti64x42Extract 256 Bits of Packed Quadword Integer Values vextracti64x4H vextracti64x43H vextracti64x4H vextracti64x42H VFMADD231SD VFMADD231SD vfmadd231sdCFused Multiply-Add of Scalar Double-Precision Floating-Point Values vfmadd231sdH vfmadd231sd+H vfmadd231sd4# vfmadd231sdH vfmadd231sd4+# vfmadd231sd+H vfmadd231sdQH vfmadd231sdQHEhttps://www.felixcloutier.com/x86/vfmadd132sd:vfmadd213sd:vfmadd231sdVFMSUBPSVFMSUBPSvfmsubpsHFused Multiply-Subtract of Packed Single-Precision Floating-Point Valuesvfmsubps$vfmsubps/$vfmsubps/$vfmsubps$vfmsubps2$vfmsubps2$ VFNMADD213SD VFNMADD213SD vfnmadd213sdLFused Negative Multiply-Add of Scalar Double-Precision Floating-Point Values vfnmadd213sdH vfnmadd213sd+H vfnmadd213sd4# vfnmadd213sdH vfnmadd213sd4+# vfnmadd213sd+H vfnmadd213sdQH vfnmadd213sdQHHhttps://www.felixcloutier.com/x86/vfnmadd132sd:vfnmadd213sd:vfnmadd231sdVPERMDVPERMDvpermdPermute Doubleword Integers vpermd:HvpermdHvpermd;HvpermdHvpermd:Hvpermd4!vpermdHvpermd42!vpermd;HvpermdH/https://www.felixcloutier.com/x86/vpermd:vpermw VPEXPANDQ VPEXPANDQ vpexpandqELoad Sparse Packed Quadword Integer Values from Dense Memory/Register  vpexpandqH vpexpandqH vpexpandqH vpexpandq/H vpexpandq2H vpexpandq5H vpexpandqH vpexpandq/H vpexpandqH vpexpandq2H vpexpandqH vpexpandq5H+https://www.felixcloutier.com/x86/vpexpandqSUBPSSUBPSsubps6Subtract Packed Single-Precision Floating-Point ValuessubpsSUBPS3subpsSUBPS3/'https://www.felixcloutier.com/x86/subps VRSQRT28PS VRSQRT28PS vrsqrt28ps€Approximation to the Reciprocal Square Root of Packed Single-Precision Floating-Point Values with Less Than 2^-28 Relative Error vrsqrt28ps;M vrsqrt28psM vrsqrt28ps;M vrsqrt28psM vrsqrt28psRM vrsqrt28psRM,https://www.felixcloutier.com/x86/vrsqrt28psVTESTPSVTESTPSvtestps/Packed Single-Precision Floating-Point Bit Testvtestps4 vtestps4/ vtestps4 vtestps42 1https://www.felixcloutier.com/x86/vtestpd:vtestpsSHUFPSSHUFPSshufps5Shuffle Packed Single-Precision Floating-Point ValuesshufpsSHUFPS3shufpsSHUFPS3/(https://www.felixcloutier.com/x86/shufpsVPOPCNTWVPOPCNTWvpopcntw)Packed Population Count for Word Integers vpopcntwKvpopcntwKvpopcntwSvpopcntw/Kvpopcntw2Kvpopcntw5SvpopcntwKvpopcntw/KvpopcntwKvpopcntw2KvpopcntwSvpopcntw5SSETGESETGEsetge'Set byte if greater or equal (SF == OF)setgeSETGE3 setgeSETGE3# VRSQRT14PS VRSQRT14PS vrsqrt14ps`Compute Approximate Reciprocals of Square Roots of Packed Single-Precision Floating-Point Values  vrsqrt14ps9H vrsqrt14ps:H vrsqrt14ps;H vrsqrt14psH vrsqrt14psH vrsqrt14psH vrsqrt14ps9H vrsqrt14psH vrsqrt14ps:H vrsqrt14psH vrsqrt14ps;H vrsqrt14psH,https://www.felixcloutier.com/x86/vrsqrt14psVPMAXUDVPMAXUDvpmaxud.Maximum of Packed Unsigned Doubleword Integersvpmaxud9HvpmaxudHvpmaxud:HvpmaxudHvpmaxud;HvpmaxudHvpmaxud9Hvpmaxud4 vpmaxudHvpmaxud4/ vpmaxud:Hvpmaxud4!vpmaxudHvpmaxud42!vpmaxud;HvpmaxudHAESKEYGENASSISTAESKEYGENASSISTaeskeygenassistAES Round Key Generation AssistaeskeygenassistAESKEYGENASSIST'aeskeygenassistAESKEYGENASSIST/'1https://www.felixcloutier.com/x86/aeskeygenassistBEXTRBEXTRbextrBit Field Extractbextrl6bextrl4bextrl'6bextrl'4bextrq6bextrq4bextrq+6bextrq+4'https://www.felixcloutier.com/x86/bextrVPEXTRQVPEXTRQvpextrqExtract Quadwordvpextrq4 vpextrqJvpextrq4+ vpextrq+JVSUBPDVSUBPDvsubpd6Subtract Packed Double-Precision Floating-Point Valuesvsubpd=HvsubpdHvsubpd?HvsubpdHvsubpdAHvsubpdHvsubpd=Hvsubpd4 vsubpdHvsubpd4/ vsubpd?Hvsubpd4 vsubpdHvsubpd42 vsubpdAHvsubpdHvsubpdQHvsubpdQH VBLENDMPS VBLENDMPS vblendmpsLBlend Packed Single-Precision Floating-Point Vectors Using an OpMask Control  vblendmps9H vblendmpsH vblendmps:H vblendmpsH vblendmps;H vblendmpsH vblendmps9H vblendmpsH vblendmps:H vblendmpsH vblendmps;H vblendmpsH5https://www.felixcloutier.com/x86/vblendmpd:vblendmps CMPNLEXADD CMPNLEXADD cmpnlexadd&Compare for Not Less or Equals and Add cmpnlexadd' cmpnlexadd+VPINSRWVPINSRWvpinsrw Insert Wordvpinsrw4 vpinsrwIvpinsrw4$ vpinsrw$IVPMAXUBVPMAXUBvpmaxub(Maximum of Packed Unsigned Byte IntegersvpmaxubIvpmaxub/IvpmaxubIvpmaxub2IvpmaxubIvpmaxub5Ivpmaxub4 vpmaxubIvpmaxub4/ vpmaxub/Ivpmaxub4!vpmaxubIvpmaxub42!vpmaxub2IvpmaxubIvpmaxub5I VPBLENDMD VPBLENDMD vpblendmd0Blend Doubleword Vectors Using an OpMask Control  vpblendmd9H vpblendmdH vpblendmd:H vpblendmdH vpblendmd;H vpblendmdH vpblendmd9H vpblendmdH vpblendmd:H vpblendmdH vpblendmd;H vpblendmdH5https://www.felixcloutier.com/x86/vpblendmd:vpblendmq VMASKMOVDQU VMASKMOVDQU vmaskmovdqu'Store Selected Bytes of Double Quadword vmaskmovdqu VPCOMUBVPCOMUBvpcomub%Compare Packed Unsigned Byte Integersvpcomub"vpcomub/"DIVDIVdivUnsigned DividedivbDIVB3 divwDIVW3 divlDIVL3divqDIVQ3divbDIVB3#divwDIVW3$divlDIVL3'divqDIVQ3+%https://www.felixcloutier.com/x86/divVSUBPSVSUBPSvsubps6Subtract Packed Single-Precision Floating-Point Valuesvsubps9HvsubpsHvsubps:HvsubpsHvsubps;HvsubpsHvsubps9Hvsubps4 vsubpsHvsubps4/ vsubps:Hvsubps4 vsubpsHvsubps42 vsubps;HvsubpsHvsubpsQHvsubpsQHVSUBSDVSUBSDvsubsd6Subtract Scalar Double-Precision Floating-Point ValuesvsubsdHvsubsd+Hvsubsd4 vsubsdHvsubsd4+ vsubsd+HvsubsdQHvsubsdQHVZEROALLVZEROALLvzeroallZero All YMM Registersvzeroall4 *https://www.felixcloutier.com/x86/vzeroall VFMSUB213SS VFMSUB213SS vfmsub213ssHFused Multiply-Subtract of Scalar Single-Precision Floating-Point Values vfmsub213ssH vfmsub213ss'H vfmsub213ss4# vfmsub213ssH vfmsub213ss4'# vfmsub213ss'H vfmsub213ssQH vfmsub213ssQHEhttps://www.felixcloutier.com/x86/vfmsub132ss:vfmsub213ss:vfmsub231ss VAESDECLAST VAESDECLAST vaesdeclast,Perform Last Round of an AES Decryption Flow  vaesdeclast  vaesdeclastK vaesdeclast/  vaesdeclast/K vaesdeclast vaesdeclastK vaesdeclast2 vaesdeclast2K vaesdeclastH vaesdeclast5H PUNPCKLBW PUNPCKLBW punpcklbw0Unpack and Interleave Low-Order Bytes into Words punpcklbw PUNPCKLBW3  punpcklbw PUNPCKLBW3'  punpcklbw PUNPCKLBW3 punpcklbw PUNPCKLBW3/Jhttps://www.felixcloutier.com/x86/punpcklbw:punpcklwd:punpckldq:punpcklqdqCRC32CRC32crc32Accumulate CRC32 Value crc32bCRC32B3 crc32w4 crc32l3crc32bCRC32B3#crc32w4$crc32l3'crc32bCRC32B3 crc32qCRC32Q3crc32bCRC32B3#crc32qCRC32Q3+'https://www.felixcloutier.com/x86/crc32 VCVTUW2PH VCVTUW2PH vcvtuw2phTConvert Packed Unsigned Word Integers to Packed Half-Precision Floating-Point Values vcvtuw2ph<K vcvtuw2ph>K vcvtuw2ph@R vcvtuw2phK vcvtuw2phK vcvtuw2phR vcvtuw2ph<K vcvtuw2phK vcvtuw2ph>K vcvtuw2phK vcvtuw2ph@R vcvtuw2phR vcvtuw2phQR vcvtuw2phQR+https://www.felixcloutier.com/x86/vcvtuw2ph VFNMADDSS VFNMADDSS vfnmaddssLFused Negative Multiply-Add of Scalar Single-Precision Floating-Point Values vfnmaddss$ vfnmaddss'$ vfnmaddss'$PFPNACCPFPNACCpfpnacc2Packed Floating-Point Positive-Negative AccumulatepfpnaccPFPNACC3pfpnaccPFPNACC3+PMULHWPMULHWpmulhw:Multiply Packed Signed Word Integers and Store High ResultpmulhwPMULHW3 pmulhwPMULHW3+ pmulhwPMULHW3pmulhwPMULHW3/(https://www.felixcloutier.com/x86/pmulhwVMINSDVMINSDvminsd;Return Minimum Scalar Double-Precision Floating-Point ValuevminsdHvminsd+Hvminsd4 vminsdHvminsd4+ vminsd+HvminsdRHvminsdRHVRANGESSVRANGESSvrangessYRange Restriction Calculation For a pair of Scalar Single-Precision Floating-Point ValuesvrangessJvrangess'JvrangessJvrangess'JvrangessRJvrangessRJ*https://www.felixcloutier.com/x86/vrangessVPEXTRDVPEXTRDvpextrdExtract Doublewordvpextrd4 vpextrdJvpextrd4' vpextrd'JVADDPDVADDPDvaddpd1Add Packed Double-Precision Floating-Point Valuesvaddpd=HvaddpdHvaddpd?HvaddpdHvaddpdAHvaddpdHvaddpd=Hvaddpd4 vaddpdHvaddpd4/ vaddpd?Hvaddpd4 vaddpdHvaddpd42 vaddpdAHvaddpdHvaddpdQHvaddpdQHRSQRTPSRSQRTPSrsqrtpsTCompute Reciprocals of Square Roots of Packed Single-Precision Floating-Point ValuesrsqrtpsRSQRTPS3rsqrtpsRSQRTPS3/)https://www.felixcloutier.com/x86/rsqrtpsVMULPDVMULPDvmulpd6Multiply Packed Double-Precision Floating-Point Valuesvmulpd=HvmulpdHvmulpd?HvmulpdHvmulpdAHvmulpdHvmulpd=Hvmulpd4 vmulpdHvmulpd4/ vmulpd?Hvmulpd4 vmulpdHvmulpd42 vmulpdAHvmulpdHvmulpdQHvmulpdQHINCINCincIncrement by 1incbINCB3 incwINCW3 inclINCL3incqINCQ3incbINCB3#incwINCW3$inclINCL3'incqINCQ3+%https://www.felixcloutier.com/x86/incVFMSUBADD132PDVFMSUBADD132PDvfmsubadd132pdXFused Multiply-Alternating Subtract/Add of Packed Double-Precision Floating-Point Valuesvfmsubadd132pd=Hvfmsubadd132pdHvfmsubadd132pd?Hvfmsubadd132pdHvfmsubadd132pdAHvfmsubadd132pdHvfmsubadd132pd=Hvfmsubadd132pd4#vfmsubadd132pdHvfmsubadd132pd4/#vfmsubadd132pd?Hvfmsubadd132pd4#vfmsubadd132pdHvfmsubadd132pd42#vfmsubadd132pdAHvfmsubadd132pdHvfmsubadd132pdQHvfmsubadd132pdQHNhttps://www.felixcloutier.com/x86/vfmsubadd132pd:vfmsubadd213pd:vfmsubadd231pd VFNMADD213SS VFNMADD213SS vfnmadd213ssLFused Negative Multiply-Add of Scalar Single-Precision Floating-Point Values vfnmadd213ssH vfnmadd213ss'H vfnmadd213ss4# vfnmadd213ssH vfnmadd213ss4'# vfnmadd213ss'H vfnmadd213ssQH vfnmadd213ssQHHhttps://www.felixcloutier.com/x86/vfnmadd132ss:vfnmadd213ss:vfnmadd231ssDIVSDDIVSDdivsd4Divide Scalar Double-Precision Floating-Point ValuesdivsdDIVSD3divsdDIVSD3+'https://www.felixcloutier.com/x86/divsdSETPOSETPOsetpo Set byte if parity odd (PF == 0)setpoSETPC3 setpoSETPC3#BSRBSRbsrBit Scan ReversebsrwBSRW3  bsrwBSRW3 $bsrlBSRL3bsrlBSRL3'bsrqBSRQ3bsrqBSRQ3+%https://www.felixcloutier.com/x86/bsrVPSUBQVPSUBQvpsubq!Subtract Packed Quadword Integersvpsubq=HvpsubqHvpsubq?HvpsubqHvpsubqAHvpsubqHvpsubq=Hvpsubq4 vpsubqHvpsubq4/ vpsubq?Hvpsubq4!vpsubqHvpsubq42!vpsubqAHvpsubqHMOVNTQMOVNTQmovntq)Store of Quadword Using Non-Temporal HintmovntqMOVNTQ3+ (https://www.felixcloutier.com/x86/movntq VFCMADDCSH VFCMADDCSH vfcmaddcshSFused Conjugate Multiply-Add of Complex Scalar Half-Precision Floating-Point Values vfcmaddcshR vfcmaddcsh'R vfcmaddcshR vfcmaddcsh'R vfcmaddcshQR vfcmaddcshQR6https://www.felixcloutier.com/x86/vfcmaddcsh:vfmaddcshPABSBPABSBpabsb&Packed Absolute Value of Byte Integerspabsb3pabsb3+pabsb3pabsb3/9https://www.felixcloutier.com/x86/pabsb:pabsw:pabsd:pabsq VFMADDSUBPS VFMADDSUBPS vfmaddsubpsXFused Multiply-Alternating Add/Subtract of Packed Single-Precision Floating-Point Values vfmaddsubps$ vfmaddsubps/$ vfmaddsubps/$ vfmaddsubps$ vfmaddsubps2$ vfmaddsubps2$ VUNPCKLPD VUNPCKLPD vunpcklpdGUnpack and Interleave Low Packed Double-Precision Floating-Point Values vunpcklpd=H vunpcklpdH vunpcklpd?H vunpcklpdH vunpcklpdAH vunpcklpdH vunpcklpd=H vunpcklpd4  vunpcklpdH vunpcklpd4/  vunpcklpd?H vunpcklpd4  vunpcklpdH vunpcklpd42  vunpcklpdAH vunpcklpdH VCVTPD2UQQ VCVTPD2UQQ vcvtpd2uqqZConvert Packed Double-Precision Floating-Point Values to Packed Unsigned Quadword Integers vcvtpd2uqq=J vcvtpd2uqq?J vcvtpd2uqqAJ vcvtpd2uqqJ vcvtpd2uqqJ vcvtpd2uqqJ vcvtpd2uqq=J vcvtpd2uqqJ vcvtpd2uqq?J vcvtpd2uqqJ vcvtpd2uqqAJ vcvtpd2uqqJ vcvtpd2uqqQJ vcvtpd2uqqQJ,https://www.felixcloutier.com/x86/vcvtpd2uqq VGATHERQPD VGATHERQPD vgatherqpdRGather Packed Double-Precision Floating-Point Values Using Signed Quadword Indices vgatherqpdDH vgatherqpdHH vgatherqpdLH vgatherqpdD! vgatherqpdH!7https://www.felixcloutier.com/x86/vgatherqps:vgatherqpdSYSCALLSYSCALLsyscallFast System CallsyscallSYSCALL)https://www.felixcloutier.com/x86/syscallVMINPDVMINPDvminpd<Return Minimum Packed Double-Precision Floating-Point Valuesvminpd=HvminpdHvminpd?HvminpdHvminpdAHvminpdHvminpd=Hvminpd4 vminpdHvminpd4/ vminpd?Hvminpd4 vminpdHvminpd42 vminpdAHvminpdHvminpdRHvminpdRHKMOVQKMOVQkmovqMove 64-bit MaskkmovqIkmovqIkmovq+IkmovqIkmovq+I9https://www.felixcloutier.com/x86/kmovw:kmovb:kmovq:kmovdMOVAPDMOVAPDmovapd:Move Aligned Packed Double-Precision Floating-Point ValuesmovapdMOVAPD3movapdMOVAPD3/movapdMOVAPD3/(https://www.felixcloutier.com/x86/movapd VRSQRT28SD VRSQRT28SD vrsqrt28sdApproximation to the Reciprocal Square Root of a Scalar Double-Precision Floating-Point Value with Less Than 2^-28 Relative Error vrsqrt28sdM vrsqrt28sd+M vrsqrt28sdM vrsqrt28sd+M vrsqrt28sdRM vrsqrt28sdRM,https://www.felixcloutier.com/x86/vrsqrt28sdVCVTNE2PS2BF16VCVTNE2PS2BF16vcvtne2ps2bf16XConvert with Nearest-Even rounding 2 Single-Precision FP vectors into BFloat16 FP vector vcvtne2ps2bf169Kvcvtne2ps2bf16Kvcvtne2ps2bf16:Kvcvtne2ps2bf16Kvcvtne2ps2bf16;Qvcvtne2ps2bf16Qvcvtne2ps2bf169Kvcvtne2ps2bf16Kvcvtne2ps2bf16:Kvcvtne2ps2bf16Kvcvtne2ps2bf16;Qvcvtne2ps2bf16Q0https://www.felixcloutier.com/x86/vcvtne2ps2bf16MOVHPDMOVHPDmovhpd6Move High Packed Double-Precision Floating-Point ValuemovhpdMOVHPD3+movhpdMOVHPD3+(https://www.felixcloutier.com/x86/movhpdVSM3MSG2VSM3MSG2vsm3msg2=Perform Final Calculation for the Next Four SM3 Message Wordsvsm3msg2vsm3msg2/KMOVWKMOVWkmovwMove 16-bit MaskkmovwHkmovwHkmovw$HkmovwHkmovw$H9https://www.felixcloutier.com/x86/kmovw:kmovb:kmovq:kmovd AESENCLAST AESENCLAST aesenclast,Perform Last Round of an AES Encryption Flow aesenclast AESENCLAST' aesenclast AESENCLAST/',https://www.felixcloutier.com/x86/aesenclastVFMADDSSVFMADDSSvfmaddssCFused Multiply-Add of Scalar Single-Precision Floating-Point Valuesvfmaddss$vfmaddss'$vfmaddss'$KXORBKXORBkxorbBitwise Logical XOR 8-bit MaskskxorbJ9https://www.felixcloutier.com/x86/kxorw:kxorb:kxorq:kxord VCVTSD2SS VCVTSD2SS vcvtsd2ssLConvert Scalar Double-Precision FP Value to Scalar Single-Precision FP Value vcvtsd2ssH vcvtsd2ss+H vcvtsd2ss4  vcvtsd2ssH vcvtsd2ss4+  vcvtsd2ss+H vcvtsd2ssQH vcvtsd2ssQHNOPNOPnop No OperationnopNOP3%https://www.felixcloutier.com/x86/nopVPCMPGTBVPCMPGTBvpcmpgtb4Compare Packed Signed Byte Integers for Greater ThanvpcmpgtbIvpcmpgtbIvpcmpgtb/Ivpcmpgtb/IvpcmpgtbIvpcmpgtbIvpcmpgtb2Ivpcmpgtb2IvpcmpgtbIvpcmpgtbIvpcmpgtb5Ivpcmpgtb5Ivpcmpgtb4 vpcmpgtb4/ vpcmpgtb4!vpcmpgtb42!VSQRTPSVSQRTPSvsqrtpsECompute Square Roots of Packed Single-Precision Floating-Point Valuesvsqrtps9Hvsqrtps:Hvsqrtps;HvsqrtpsHvsqrtpsHvsqrtpsHvsqrtps9Hvsqrtps4 vsqrtpsHvsqrtps4/ vsqrtps:Hvsqrtps4 vsqrtpsHvsqrtps42 vsqrtps;HvsqrtpsHvsqrtpsQHvsqrtpsQHVPSHABVPSHABvpshabPacked Shift Arithmetic Bytesvpshab"vpshab/"vpshab/" VPEXPANDW VPEXPANDW vpexpandwALoad Sparse Packed Word Integer Values from Dense Memory/Register  vpexpandwK vpexpandwK vpexpandwU vpexpandw/K vpexpandw2K vpexpandw5U vpexpandwK vpexpandw/K vpexpandwK vpexpandw2K vpexpandwU vpexpandw5U5https://www.felixcloutier.com/x86/vpexpandb:vpexpandwCMOVNGECMOVNGEcmovnge'Move if not greater or equal (SF != OF)cmovngew3  cmovngew3 $cmovngel3cmovngel3'cmovngeq3cmovngeq3+ VPMACSDQH VPMACSDQH vpmacsdqhDPacked Multiply Accumulate Signed High Doubleword to Signed Quadword vpmacsdqh" vpmacsdqh/"T1MSKCT1MSKCt1mskcInverse Mask From Trailing Onest1mskc6t1mskc'6t1mskc6t1mskc+6 VFMADD231PS VFMADD231PS vfmadd231psCFused Multiply-Add of Packed Single-Precision Floating-Point Values vfmadd231ps9H vfmadd231psH vfmadd231ps:H vfmadd231psH vfmadd231ps;H vfmadd231psH vfmadd231ps9H vfmadd231ps4# vfmadd231psH vfmadd231ps4/# vfmadd231ps:H vfmadd231ps4# vfmadd231psH vfmadd231ps42# vfmadd231ps;H vfmadd231psH vfmadd231psQH vfmadd231psQHEhttps://www.felixcloutier.com/x86/vfmadd132ps:vfmadd213ps:vfmadd231psBLENDVPSBLENDVPSblendvps= Variable Blend Packed Single Precision Floating-Point Valuesblendvps3blendvps3/*https://www.felixcloutier.com/x86/blendvps MOVDIR64B MOVDIR64B movdir64bMOVe to DIRect store 64 Bytes movdir64b51+https://www.felixcloutier.com/x86/movdir64bVANDPDVANDPDvandpdDBitwise Logical AND of Packed Double-Precision Floating-Point Valuesvandpd=JvandpdJvandpd?JvandpdJvandpdAJvandpdJvandpd=Jvandpd4 vandpdJvandpd4/ vandpd?Jvandpd4 vandpdJvandpd42 vandpdAJvandpdJ VCVTPH2DQ VCVTPH2DQ vcvtph2dq@Convert Packed Half-Precision FP Values to Packed Dword Integers vcvtph2dq.K vcvtph2dq<K vcvtph2dq>R vcvtph2dqK vcvtph2dqK vcvtph2dqR vcvtph2dq.K vcvtph2dqK vcvtph2dq<K vcvtph2dqK vcvtph2dq>R vcvtph2dqR vcvtph2dqQR vcvtph2dqQR+https://www.felixcloutier.com/x86/vcvtph2dq VEXTRACTI128 VEXTRACTI128 vextracti128Extract Packed Integer Values vextracti1284! vextracti1284/!fhttps://www.felixcloutier.com/x86/vextracti128:vextracti32x4:vextracti64x2:vextracti32x8:vextracti64x4 CMPNZXADD CMPNZXADD cmpnzxaddCompare for Not Zero and Add cmpnzxadd' cmpnzxadd+ VCVTTPS2UDQ VCVTTPS2UDQ vcvttps2udqrConvert with Truncation Packed Single-Precision Floating-Point Values to Packed Unsigned Doubleword Integer Values vcvttps2udq9H vcvttps2udq:H vcvttps2udq;H vcvttps2udqH vcvttps2udqH vcvttps2udqH vcvttps2udq9H vcvttps2udqH vcvttps2udq:H vcvttps2udqH vcvttps2udq;H vcvttps2udqH vcvttps2udqRH vcvttps2udqRH-https://www.felixcloutier.com/x86/vcvttps2udqMAXSSMAXSSmaxss;Return Maximum Scalar Single-Precision Floating-Point ValuemaxssMAXSS3maxssMAXSS3''https://www.felixcloutier.com/x86/maxssVPHADDSWVPHADDSWvphaddswAPacked Horizontal Add Signed Word Integers with Signed Saturationvphaddsw4 vphaddsw4/ vphaddsw4!vphaddsw42! VSCALEFPH VSCALEFPH vscalefph[Scale Packed Half-Precision Floating-Point Values With Half-Precision Floating-Point Values vscalefph<K vscalefphK vscalefph>K vscalefphK vscalefph@R vscalefphR vscalefph<K vscalefphK vscalefph>K vscalefphK vscalefph@R vscalefphR vscalefphQR vscalefphQR+https://www.felixcloutier.com/x86/vscalefphPMAXSBPMAXSBpmaxsb&Maximum of Packed Signed Byte Integerspmaxsb3pmaxsb3/=https://www.felixcloutier.com/x86/pmaxsb:pmaxsw:pmaxsd:pmaxsqSETNSSETNSsetnsSet byte if not sign (SF == 0)setnsSETPL3 setnsSETPL3#SFENCESFENCEsfence Store FencesfenceSFENCE3 (https://www.felixcloutier.com/x86/sfence VEXTRACTF64X4 VEXTRACTF64X4 vextractf64x4AExtract 256 Bits of Packed Double-Precision Floating-Point Values vextractf64x4H vextractf64x43H vextractf64x4H vextractf64x42H VPMACSSDD VPMACSSDD vpmacssddQPacked Multiply Accumulate with Saturation Signed Doubleword to Signed Doubleword vpmacssdd" vpmacssdd/"VPSRADVPSRADvpsrad-Shift Packed Doubleword Data Right Arithmeticvpsrad9Hvpsrad:Hvpsrad;HvpsradHvpsradHvpsrad/HvpsradHvpsradHvpsrad/HvpsradHvpsradHvpsrad/Hvpsrad9Hvpsrad4 vpsradHvpsrad4 vpsradHvpsrad4/ vpsrad/Hvpsrad:Hvpsrad4!vpsradHvpsrad4!vpsradHvpsrad4/!vpsrad/Hvpsrad;HvpsradHvpsradHvpsrad/HVROUNDSSVROUNDSSvroundss3Round Scalar Single Precision Floating-Point Valuesvroundss4 vroundss4' VFRCZSSVFRCZSSvfrczss7Extract Fraction Scalar Single-Precision Floating Pointvfrczss"vfrczss'" VSCATTERDPD VSCATTERDPD vscatterdpdTScatter Packed Double-Precision Floating-Point Values with Signed Doubleword Indices vscatterdpdCH vscatterdpdCH vscatterdpdGHQhttps://www.felixcloutier.com/x86/vscatterdps:vscatterdpd:vscatterqps:vscatterqpdCBWCBWcbwConvert Byte to Wordcbtw3/https://www.felixcloutier.com/x86/cbw:cwde:cdqeMULSDMULSDmulsd6Multiply Scalar Double-Precision Floating-Point ValuesmulsdMULSD3mulsdMULSD3+'https://www.felixcloutier.com/x86/mulsd VCVTTSH2SI VCVTTSH2SI vcvttsh2siGConvert with Truncation Scalar Half-Precision FP Value to Dword Integer vcvttsh2siR vcvttsh2si$R vcvttsh2siR vcvttsh2si$R vcvttsh2siRR vcvttsh2siRR,https://www.felixcloutier.com/x86/vcvttsh2siKNOTQKNOTQknotqNOT 64-bit Mask RegisterknotqI9https://www.felixcloutier.com/x86/knotw:knotb:knotq:knotd VPMACSSDQL VPMACSSDQL vpmacssdqlSPacked Multiply Accumulate with Saturation Signed Low Doubleword to Signed Quadword vpmacssdql" vpmacssdql/" VSCATTERDPS VSCATTERDPS vscatterdpsTScatter Packed Single-Precision Floating-Point Values with Signed Doubleword Indices vscatterdpsCH vscatterdpsGH vscatterdpsKHQhttps://www.felixcloutier.com/x86/vscatterdps:vscatterdpd:vscatterqps:vscatterqpd STTILECFG STTILECFG sttilecfgSTore TILE ConFiGuration sttilecfg5+https://www.felixcloutier.com/x86/sttilecfg VPBROADCASTD VPBROADCASTD vpbroadcastdBroadcast Doubleword Integer vpbroadcastdH vpbroadcastdH vpbroadcastdH vpbroadcastdH vpbroadcastdH vpbroadcastdH vpbroadcastd'H vpbroadcastd'H vpbroadcastd'H vpbroadcastdH vpbroadcastd4! vpbroadcastdH vpbroadcastd4'! vpbroadcastd'H vpbroadcastdH vpbroadcastd4! vpbroadcastdH vpbroadcastd4'! vpbroadcastd'H vpbroadcastdH vpbroadcastdH vpbroadcastd'HUhttps://www.felixcloutier.com/x86/vpbroadcastb:vpbroadcastw:vpbroadcastd:vpbroadcastqKANDDKANDDkandd Bitwise Logical AND 32-bit MaskskanddI9https://www.felixcloutier.com/x86/kandw:kandb:kandq:kandd VPCLMULQDQ VPCLMULQDQ vpclmulqdq"Carry-Less Quadword Multiplication  vpclmulqdq  vpclmulqdqK vpclmulqdq/  vpclmulqdq/K vpclmulqdq vpclmulqdqK vpclmulqdq2 vpclmulqdq2K vpclmulqdqH vpclmulqdq5HBTRBTRbtrBit Test and Reset btrwBTRW3 btrwBTRW  btrlBTRL3btrlBTRLbtrqBTRQ3btrqBTRQbtrwBTRW3$btrwBTRW$ btrlBTRL3'btrlBTRL'btrqBTRQ3+btrqBTRQ+%https://www.felixcloutier.com/x86/btrCMOVOCMOVOcmovoMove if overflow (OF == 1)cmovow3  cmovow3 $cmovol3cmovol3'cmovoq3cmovoq3+ VINSERTF32X4 VINSERTF32X4 vinsertf32x4@Insert 128 Bits of Packed Single-Precision Floating-Point Values vinsertf32x4H vinsertf32x4/H vinsertf32x4H vinsertf32x4/H vinsertf32x4H vinsertf32x4/H vinsertf32x4H vinsertf32x4/H VPBLENDMQ VPBLENDMQ vpblendmq.Blend Quadword Vectors Using an OpMask Control  vpblendmq=H vpblendmqH vpblendmq?H vpblendmqH vpblendmqAH vpblendmqH vpblendmq=H vpblendmqH vpblendmq?H vpblendmqH vpblendmqAH vpblendmqH5https://www.felixcloutier.com/x86/vpblendmd:vpblendmqPMOVZXBDPMOVZXBDpmovzxbdDMove Packed Byte Integers to Doubleword Integers with Zero Extensionpmovzxbd3pmovzxbd3'VPAVGWVPAVGWvpavgwAverage Packed Word IntegersvpavgwIvpavgw/IvpavgwIvpavgw2IvpavgwIvpavgw5Ivpavgw4 vpavgwIvpavgw4/ vpavgw/Ivpavgw4!vpavgwIvpavgw42!vpavgw2IvpavgwIvpavgw5IBLSICBLSICblsic%Isolate Lowest Set Bit and Complementblsic6blsic'6blsic6blsic+6AXORAXORaxorAtomically XORaxor'axor+MOVNTSDMOVNTSDmovntsdKStore Scalar Double-Precision Floating-Point Values Using Non-Temporal Hintmovntsd3+TILEZEROTILEZEROtilezeroTILE ZERO datatilezeroT*https://www.felixcloutier.com/x86/tilezeroVADDSDVADDSDvaddsd1Add Scalar Double-Precision Floating-Point ValuesvaddsdHvaddsd+Hvaddsd4 vaddsdHvaddsd4+ vaddsd+HvaddsdQHvaddsdQHVPCMPUWVPCMPUWvpcmpuw#Compare Packed Unsigned Word Values vpcmpuwIvpcmpuwIvpcmpuw/Ivpcmpuw/IvpcmpuwIvpcmpuwIvpcmpuw2Ivpcmpuw2IvpcmpuwIvpcmpuwIvpcmpuw5Ivpcmpuw5I0https://www.felixcloutier.com/x86/vpcmpw:vpcmpuwVPERMT2DVPERMT2Dvpermt2d?Full Permute of Doublewords From Two Tables Overwriting a Table vpermt2d9Hvpermt2dHvpermt2d:Hvpermt2dHvpermt2d;Hvpermt2dHvpermt2d9Hvpermt2dHvpermt2d:Hvpermt2dHvpermt2d;Hvpermt2dHPhttps://www.felixcloutier.com/x86/vpermt2w:vpermt2d:vpermt2q:vpermt2ps:vpermt2pd SHA256MSG2 SHA256MSG2 sha256msg2HPerform a Final Calculation for the Next Four SHA256 Message Doublewords sha256msg2( sha256msg2/(,https://www.felixcloutier.com/x86/sha256msg2VPMAXUWVPMAXUWvpmaxuw(Maximum of Packed Unsigned Word IntegersvpmaxuwIvpmaxuw/IvpmaxuwIvpmaxuw2IvpmaxuwIvpmaxuw5Ivpmaxuw4 vpmaxuwIvpmaxuw4/ vpmaxuw/Ivpmaxuw4!vpmaxuwIvpmaxuw42!vpmaxuw2IvpmaxuwIvpmaxuw5I AESDECLAST AESDECLAST aesdeclast,Perform Last Round of an AES Decryption Flow aesdeclast AESDECLAST' aesdeclast AESDECLAST/',https://www.felixcloutier.com/x86/aesdeclastVPRORDVPRORDvprordRotate Packed Doubleword Right vprord9Hvprord:Hvprord;HvprordHvprordHvprordHvprord9HvprordHvprord:HvprordHvprord;HvprordH?https://www.felixcloutier.com/x86/vprord:vprorvd:vprorq:vprorvqVPRORVDVPRORVDvprorvd'Variable Rotate Packed Doubleword Right vprorvd9HvprorvdHvprorvd:HvprorvdHvprorvd;HvprorvdHvprorvd9HvprorvdHvprorvd:HvprorvdHvprorvd;HvprorvdH?https://www.felixcloutier.com/x86/vprord:vprorvd:vprorq:vprorvqVRCP28PSVRCP28PSvrcp28pstApproximation to the Reciprocal of Packed Single-Precision Floating-Point Values with Less Than 2^-28 Relative Errorvrcp28ps;Mvrcp28psMvrcp28ps;Mvrcp28psMvrcp28psRMvrcp28psRM*https://www.felixcloutier.com/x86/vrcp28psVMOVWVMOVWvmovw Move WordvmovwRvmovwRvmovw$Rvmovw$R'https://www.felixcloutier.com/x86/vmovw VFIXUPIMMSS VFIXUPIMMSS vfixupimmss;Fix Up Special Scalar Single-Precision Floating-Point Value vfixupimmssH vfixupimmss'H vfixupimmssH vfixupimmss'H vfixupimmssRH vfixupimmssRH-https://www.felixcloutier.com/x86/vfixupimmss VFNMADD231PS VFNMADD231PS vfnmadd231psLFused Negative Multiply-Add of Packed Single-Precision Floating-Point Values vfnmadd231ps9H vfnmadd231psH vfnmadd231ps:H vfnmadd231psH vfnmadd231ps;H vfnmadd231psH vfnmadd231ps9H vfnmadd231ps4# vfnmadd231psH vfnmadd231ps4/# vfnmadd231ps:H vfnmadd231ps4# vfnmadd231psH vfnmadd231ps42# vfnmadd231ps;H vfnmadd231psH vfnmadd231psQH vfnmadd231psQHHhttps://www.felixcloutier.com/x86/vfnmadd132ps:vfnmadd213ps:vfnmadd231psKXORDKXORDkxord Bitwise Logical XOR 32-bit MaskskxordI9https://www.felixcloutier.com/x86/kxorw:kxorb:kxorq:kxord PREFETCHNTA PREFETCHNTA prefetchnta(Prefetch Data Into Caches using NTA Hint prefetchnta PREFETCHNTA3# SUBSDSUBSDsubsd6Subtract Scalar Double-Precision Floating-Point ValuessubsdSUBSD3subsdSUBSD3+'https://www.felixcloutier.com/x86/subsd VGATHERPF1DPD VGATHERPF1DPD vgatherpf1dpdoSparse Prefetch Packed Double-Precision Floating-Point Data Values with Signed Doubleword Indices Using T1 Hint vgatherpf1dpdGLYhttps://www.felixcloutier.com/x86/vgatherpf1dps:vgatherpf1qps:vgatherpf1dpd:vgatherpf1qpd VPBROADCASTW VPBROADCASTW vpbroadcastwBroadcast Word Integer vpbroadcastwI vpbroadcastwI vpbroadcastwI vpbroadcastwI vpbroadcastwI vpbroadcastwI vpbroadcastw$I vpbroadcastw$I vpbroadcastw$I vpbroadcastwI vpbroadcastw4! vpbroadcastwI vpbroadcastw4$! vpbroadcastw$I vpbroadcastwI vpbroadcastw4! vpbroadcastwI vpbroadcastw4$! vpbroadcastw$I vpbroadcastwI vpbroadcastwI vpbroadcastw$IUhttps://www.felixcloutier.com/x86/vpbroadcastb:vpbroadcastw:vpbroadcastd:vpbroadcastqPMOVSXBWPMOVSXBWpmovsxbw>Move Packed Byte Integers to Word Integers with Sign Extensionpmovsxbw3pmovsxbw3+ROUNDPSROUNDPSroundps3Round Packed Single Precision Floating-Point Valuesroundps3roundps3/)https://www.felixcloutier.com/x86/roundps TILESTORED TILESTORED tilestoredTILE STORE Data tilestoredST,https://www.felixcloutier.com/x86/tilestoredUMWAITUMWAITumwaitUser mode Monitor WaitumwaitG(https://www.felixcloutier.com/x86/umwaitHSUBPDHSUBPDhsubpd$Packed Double-FP Horizontal Subtracthsubpd3hsubpd3/(https://www.felixcloutier.com/x86/hsubpdPABSDPABSDpabsd,Packed Absolute Value of Doubleword Integerspabsd3pabsd3+pabsd3pabsd3/9https://www.felixcloutier.com/x86/pabsb:pabsw:pabsd:pabsqSETNGSETNGsetng-Set byte if not greater (ZF == 1 or SF != OF)setngSETLE3 setngSETLE3# VFMADD132PS VFMADD132PS vfmadd132psCFused Multiply-Add of Packed Single-Precision Floating-Point Values vfmadd132ps9H vfmadd132psH vfmadd132ps:H vfmadd132psH vfmadd132ps;H vfmadd132psH vfmadd132ps9H vfmadd132ps4# vfmadd132psH vfmadd132ps4/# vfmadd132ps:H vfmadd132ps4# vfmadd132psH vfmadd132ps42# vfmadd132ps;H vfmadd132psH vfmadd132psQH vfmadd132psQHEhttps://www.felixcloutier.com/x86/vfmadd132ps:vfmadd213ps:vfmadd231psVPHADDBDVPHADDBDvphaddbd6Packed Horizontal Add Signed Byte to Signed Doublewordvphaddbd"vphaddbd/"VPROTDVPROTDvprotdPacked Rotate Doublewordsvprotd"vprotd"vprotd/"vprotd/"vprotd/" VSCALEFPS VSCALEFPS vscalefps_Scale Packed Single-Precision Floating-Point Values With Single-Precision Floating-Point Values vscalefps9H vscalefpsH vscalefps:H vscalefpsH vscalefps;H vscalefpsH vscalefps9H vscalefpsH vscalefps:H vscalefpsH vscalefps;H vscalefpsH vscalefpsQH vscalefpsQH+https://www.felixcloutier.com/x86/vscalefpsVUCOMISHVUCOMISHvucomishLUnordered Compare Scalar Half-Precision Floating-Point Values and Set EFLAGSvucomishRvucomish$RvucomishRR*https://www.felixcloutier.com/x86/vucomishVXORPDVXORPDvxorpd>Bitwise Logical XOR for Double-Precision Floating-Point Valuesvxorpd=JvxorpdJvxorpd?JvxorpdJvxorpdAJvxorpdJvxorpd=Jvxorpd4 vxorpdJvxorpd4/ vxorpd?Jvxorpd4 vxorpdJvxorpd42 vxorpdAJvxorpdJ VCVTDQ2PS VCVTDQ2PS vcvtdq2psBConvert Packed Dword Integers to Packed Single-Precision FP Values vcvtdq2ps9H vcvtdq2ps:H vcvtdq2ps;H vcvtdq2psH vcvtdq2psH vcvtdq2psH vcvtdq2ps9H vcvtdq2ps4  vcvtdq2psH vcvtdq2ps4/  vcvtdq2ps:H vcvtdq2ps4  vcvtdq2psH vcvtdq2ps42  vcvtdq2ps;H vcvtdq2psH vcvtdq2psQH vcvtdq2psQHSETNAESETNAEsetnae(Set byte if not above or equal (CF == 1)setnaeSETCS3 setnaeSETCS3#UNPCKHPDUNPCKHPDunpckhpdHUnpack and Interleave High Packed Double-Precision Floating-Point ValuesunpckhpdUNPCKHPD3unpckhpdUNPCKHPD3/*https://www.felixcloutier.com/x86/unpckhpdVBROADCASTF64X4VBROADCASTF64X4vbroadcastf64x47Broadcast Four Double-Precision Floating-Point Elementsvbroadcastf64x42Hvbroadcastf64x42HMOVDQUMOVDQUmovdquMove Unaligned Double QuadwordmovdquMOVOU3movdquMOVOU3/movdquMOVOU3/Ohttps://www.felixcloutier.com/x86/movdqu:vmovdqu8:vmovdqu16:vmovdqu32:vmovdqu64VPMOVM2WVPMOVM2Wvpmovm2w4Expand Bits of Mask Register to Packed Word Integersvpmovm2wIvpmovm2wIvpmovm2wIEhttps://www.felixcloutier.com/x86/vpmovm2b:vpmovm2w:vpmovm2d:vpmovm2q GF2P8AFFINEQB GF2P8AFFINEQB gf2p8affineqb(Galois Field (2^8) Affine Transformation gf2p8affineqb gf2p8affineqb//https://www.felixcloutier.com/x86/gf2p8affineqbKXNORQKXNORQkxnorq!Bitwise Logical XNOR 64-bit MaskskxnorqI=https://www.felixcloutier.com/x86/kxnorw:kxnorb:kxnorq:kxnordVPSLLVWVPSLLVWvpsllvw,Variable Shift Packed Word Data Left Logical vpsllvwIvpsllvw/IvpsllvwIvpsllvw2IvpsllvwIvpsllvw5IvpsllvwIvpsllvw/IvpsllvwIvpsllvw2IvpsllvwIvpsllvw5I9https://www.felixcloutier.com/x86/vpsllvw:vpsllvd:vpsllvq VFMADD132SS VFMADD132SS vfmadd132ssCFused Multiply-Add of Scalar Single-Precision Floating-Point Values vfmadd132ssH vfmadd132ss'H vfmadd132ss4# vfmadd132ssH vfmadd132ss4'# vfmadd132ss'H vfmadd132ssQH vfmadd132ssQHEhttps://www.felixcloutier.com/x86/vfmadd132ss:vfmadd213ss:vfmadd231ssUCOMISSUCOMISSucomissNUnordered Compare Scalar Single-Precision Floating-Point Values and Set EFLAGSucomissUCOMISS3ucomissUCOMISS3')https://www.felixcloutier.com/x86/ucomissCMPSSCMPSScmpss5Compare Scalar Single-Precision Floating-Point ValuescmpssCMPSS3cmpssCMPSS3''https://www.felixcloutier.com/x86/cmpssCMOVNSCMOVNScmovnsMove if not sign (SF == 0)cmovnsw3  cmovnsw3 $cmovnsl3cmovnsl3'cmovnsq3cmovnsq3+MOVLHPSMOVLHPSmovlhps>Move Packed Single-Precision Floating-Point Values Low to HighmovlhpsMOVLHPS3)https://www.felixcloutier.com/x86/movlhps VPGATHERQQ VPGATHERQQ vpgatherqq;Gather Packed Quadword Values Using Signed Quadword Indices vpgatherqqDH vpgatherqqHH vpgatherqqLH vpgatherqqD! vpgatherqqH!7https://www.felixcloutier.com/x86/vpgatherqd:vpgatherqq VCVTPH2PS VCVTPH2PS vcvtph2ps>Convert Half-Precision FP Values to Single-Precision FP Values vcvtph2psH vcvtph2psH vcvtph2psH vcvtph2ps+H vcvtph2ps/H vcvtph2ps2H vcvtph2ps% vcvtph2psH vcvtph2ps+% vcvtph2ps+H vcvtph2ps% vcvtph2psH vcvtph2ps/% vcvtph2ps/H vcvtph2psH vcvtph2ps2H vcvtph2psRH vcvtph2psRH6https://www.felixcloutier.com/x86/vcvtph2ps:vcvtph2psx VPHMINPOSUW VPHMINPOSUW vphminposuw3Packed Horizontal Minimum of Unsigned Word Integers vphminposuw4  vphminposuw4/ SETPSETPsetpSet byte if parity (PF == 1)setpSETPS3 setpSETPS3# TDPFP16PS TDPFP16PS tdpfp16psHTile Dot Product of FP16 tiles with Packed Single-precision accumulation tdpfp16psTTT VCVTPH2PSX VCVTPH2PSX vcvtph2psx>Convert Half-Precision FP Values to Single-Precision FP Values vcvtph2psx.K vcvtph2psx<K vcvtph2psx>R vcvtph2psxK vcvtph2psxK vcvtph2psxR vcvtph2psx.K vcvtph2psxK vcvtph2psx<K vcvtph2psxK vcvtph2psx>R vcvtph2psxR vcvtph2psxRR vcvtph2psxRR6https://www.felixcloutier.com/x86/vcvtph2ps:vcvtph2psx VCVTTPH2UDQ VCVTTPH2UDQ vcvttph2udqpConvert with Truncation Packed Half-Precision Floating-Point Values to Packed Unsigned Doubleword Integer Values vcvttph2udq.K vcvttph2udq<K vcvttph2udq>R vcvttph2udqK vcvttph2udqK vcvttph2udqR vcvttph2udq.K vcvttph2udqK vcvttph2udq<K vcvttph2udqK vcvttph2udq>R vcvttph2udqR vcvttph2udqRR vcvttph2udqRR-https://www.felixcloutier.com/x86/vcvttph2udqSHLDSHLDshld#Integer Double Precision Shift Left shldw3  shldw3  shldl3shldl3shldq3shldq3shldw3$ shldw3$ shldl3'shldl3'shldq3+shldq3+&https://www.felixcloutier.com/x86/shldJNZJNZjnzJump if not zero (ZF == 0)jnzJNE3NjnzJNE3OORORorLogical Inclusive ORorbORB3orbORB3 orbORB3  orbORB3 #orwORW3 orwORW3 orwORW3 orwORW3  orwORW3 $orlORL3orlORL3orlORL3orlORL3orlORL3'orqORQ3orqORQ3orqORQ3orqORQ3orqORQ3+orbORB3#orbORB3# orwORW3$orwORW3$orwORW3$ orlORL3'orlORL3'orlORL3'orqORQ3+orqORQ3+orqORQ3+$https://www.felixcloutier.com/x86/orRDGSBASERDGSBASErdgsbaseReaD GS segment BASErdgsbase=rdgsbase=3https://www.felixcloutier.com/x86/rdfsbase:rdgsbaseTDPBUSDTDPBUSDtdpbusdOTile Dot Product of Unsigned bytes by Signed bytes with Doubleword accumulationtdpbusdTTTAhttps://www.felixcloutier.com/x86/tdpbssd:tdpbsud:tdpbusd:tdpbuud VFMADD132SD VFMADD132SD vfmadd132sdCFused Multiply-Add of Scalar Double-Precision Floating-Point Values vfmadd132sdH vfmadd132sd+H vfmadd132sd4# vfmadd132sdH vfmadd132sd4+# vfmadd132sd+H vfmadd132sdQH vfmadd132sdQHEhttps://www.felixcloutier.com/x86/vfmadd132sd:vfmadd213sd:vfmadd231sdSUBSSSUBSSsubss6Subtract Scalar Single-Precision Floating-Point ValuessubssSUBSS3subssSUBSS3''https://www.felixcloutier.com/x86/subss VFMADD213SD VFMADD213SD vfmadd213sdCFused Multiply-Add of Scalar Double-Precision Floating-Point Values vfmadd213sdH vfmadd213sd+H vfmadd213sd4# vfmadd213sdH vfmadd213sd4+# vfmadd213sd+H vfmadd213sdQH vfmadd213sdQHEhttps://www.felixcloutier.com/x86/vfmadd132sd:vfmadd213sd:vfmadd231sdVPHADDWQVPHADDWQvphaddwq4Packed Horizontal Add Signed Word to Signed Quadwordvphaddwq"vphaddwq/"PADDSWPADDSWpaddsw6Add Packed Signed Word Integers with Signed SaturationpaddswPADDSW3 paddswPADDSW3+ paddswPADDSW3paddswPADDSW3//https://www.felixcloutier.com/x86/paddsb:paddsw VPCOMPRESSD VPCOMPRESSD vpcompressdHStore Sparse Packed Doubleword Integer Values into Dense Memory/Register  vpcompressdH vpcompressd0H vpcompressdH vpcompressd3H vpcompressdH vpcompressd6H vpcompressdH vpcompressdH vpcompressdH vpcompressd/H vpcompressd2H vpcompressd5H-https://www.felixcloutier.com/x86/vpcompressdBTBTbtBit Test btwBTW3 btwBTW  btlBTL3btlBTLbtqBTQ3btqBTQbtwBTW3$btwBTW$ btlBTL3'btlBTL'btqBTQ3+btqBTQ+$https://www.felixcloutier.com/x86/btKANDQKANDQkandq Bitwise Logical AND 64-bit MaskskandqI9https://www.felixcloutier.com/x86/kandw:kandb:kandq:kandd VRNDSCALESD VRNDSCALESD vrndscalesd]Round Scalar Double-Precision Floating-Point Value To Include A Given Number Of Fraction Bits vrndscalesdH vrndscalesd+H vrndscalesdH vrndscalesd+H vrndscalesdRH vrndscalesdRH-https://www.felixcloutier.com/x86/vrndscalesdVSCATTERPF0QPSVSCATTERPF0QPSvscatterpf0qps‚Sparse Prefetch Packed Single-Precision Floating-Point Data Values with Signed Quadword Indices Using T0 Hint with Intent to Writevscatterpf0qpsML]https://www.felixcloutier.com/x86/vscatterpf0dps:vscatterpf0qps:vscatterpf0dpd:vscatterpf0qpdPMULHRSWPMULHRSWpmulhrswOPacked Multiply Signed Word Integers and Store High Result with Round and Scalepmulhrsw3pmulhrsw3+pmulhrsw3pmulhrsw3/*https://www.felixcloutier.com/x86/pmulhrsw VFMSUB213PS VFMSUB213PS vfmsub213psHFused Multiply-Subtract of Packed Single-Precision Floating-Point Values vfmsub213ps9H vfmsub213psH vfmsub213ps:H vfmsub213psH vfmsub213ps;H vfmsub213psH vfmsub213ps9H vfmsub213ps4# vfmsub213psH vfmsub213ps4/# vfmsub213ps:H vfmsub213ps4# vfmsub213psH vfmsub213ps42# vfmsub213ps;H vfmsub213psH vfmsub213psQH vfmsub213psQHEhttps://www.felixcloutier.com/x86/vfmsub132ps:vfmsub213ps:vfmsub231ps VPMOVSXDQ VPMOVSXDQ vpmovsxdqHMove Packed Doubleword Integers to Quadword Integers with Sign Extension vpmovsxdqH vpmovsxdqH vpmovsxdqH vpmovsxdq+H vpmovsxdq/H vpmovsxdq2H vpmovsxdq4  vpmovsxdqH vpmovsxdq4+  vpmovsxdq+H vpmovsxdq4! vpmovsxdqH vpmovsxdq4/! vpmovsxdq/H vpmovsxdqH vpmovsxdq2H VCVTSI2SS VCVTSI2SS vcvtsi2ss9Convert Dword Integer to Scalar Single-Precision FP Value  vcvtsi2ssl4  vcvtsi2sslH vcvtsi2ssq4  vcvtsi2ssqH vcvtsi2ssl4'  vcvtsi2ssl'H vcvtsi2ssq4+  vcvtsi2ssq+H vcvtsi2sslQH vcvtsi2ssqQHSAHFSAHFsahfStore AH into FlagssahfSAHF<&https://www.felixcloutier.com/x86/sahfCLWBCLWBclwbCache Line Write Backclwb#;&https://www.felixcloutier.com/x86/clwb VPACKUSWB VPACKUSWB vpackuswb.Pack Words into Bytes with Unsigned Saturation vpackuswbI vpackuswb/I vpackuswbI vpackuswb2I vpackuswbI vpackuswb5I vpackuswb4  vpackuswbI vpackuswb4/  vpackuswb/I vpackuswb4! vpackuswbI vpackuswb42! vpackuswb2I vpackuswbI vpackuswb5I TDPBF16PS TDPBF16PS tdpbf16psHTile Dot Product of BF16 tiles with Packed Single-precision accumulation tdpbf16psTTT+https://www.felixcloutier.com/x86/tdpbf16ps PUNPCKHBW PUNPCKHBW punpckhbw1Unpack and Interleave High-Order Bytes into Words punpckhbw PUNPCKHBW3  punpckhbw PUNPCKHBW3+  punpckhbw PUNPCKHBW3 punpckhbw PUNPCKHBW3/Jhttps://www.felixcloutier.com/x86/punpckhbw:punpckhwd:punpckhdq:punpckhqdq VPHADDUWD VPHADDUWD vphadduwd1Packed Horizontal Add Unsigned Word to Doubleword vphadduwd" vphadduwd/"VPMULTISHIFTQBVPMULTISHIFTQBvpmultishiftqb3Select Packed Unaligned Bytes from Quadword Sources vpmultishiftqb=KvpmultishiftqbKvpmultishiftqb?KvpmultishiftqbKvpmultishiftqbATvpmultishiftqbTvpmultishiftqb=KvpmultishiftqbKvpmultishiftqb?KvpmultishiftqbKvpmultishiftqbATvpmultishiftqbT0https://www.felixcloutier.com/x86/vpmultishiftqbVFMSUBADD132PHVFMSUBADD132PHvfmsubadd132phVFused Multiply-Alternating Subtract/Add of Packed Half-Precision Floating-Point Valuesvfmsubadd132ph<Kvfmsubadd132phKvfmsubadd132ph>Kvfmsubadd132phKvfmsubadd132ph@Rvfmsubadd132phRvfmsubadd132ph<Kvfmsubadd132phKvfmsubadd132ph>Kvfmsubadd132phKvfmsubadd132ph@Rvfmsubadd132phRvfmsubadd132phQRvfmsubadd132phQRNhttps://www.felixcloutier.com/x86/vfmsubadd132ph:vfmsubadd213ph:vfmsubadd231phVPHADDDVPHADDDvphaddd(Packed Horizontal Add Doubleword Integervphaddd4 vphaddd4/ vphaddd4!vphaddd42!PALIGNRPALIGNRpalignrPacked Align Rightpalignr3palignr3+palignr3palignr3/)https://www.felixcloutier.com/x86/palignr VFMSUB132PD VFMSUB132PD vfmsub132pdHFused Multiply-Subtract of Packed Double-Precision Floating-Point Values vfmsub132pd=H vfmsub132pdH vfmsub132pd?H vfmsub132pdH vfmsub132pdAH vfmsub132pdH vfmsub132pd=H vfmsub132pd4# vfmsub132pdH vfmsub132pd4/# vfmsub132pd?H vfmsub132pd4# vfmsub132pdH vfmsub132pd42# vfmsub132pdAH vfmsub132pdH vfmsub132pdQH vfmsub132pdQHEhttps://www.felixcloutier.com/x86/vfmsub132pd:vfmsub213pd:vfmsub231pdVPMOVWBVPMOVWBvpmovwb>Down Convert Packed Word Values to Byte Values with Truncation vpmovwbIvpmovwb,IvpmovwbIvpmovwb0IvpmovwbIvpmovwb3IvpmovwbIvpmovwbIvpmovwbIvpmovwb+Ivpmovwb/Ivpmovwb2I<https://www.felixcloutier.com/x86/vpmovwb:vpmovswb:vpmovuswbVPHADDDQVPHADDDQvphadddq:Packed Horizontal Add Signed Doubleword to Signed Quadwordvphadddq"vphadddq/"KADDBKADDBkaddbADD Two 8-bit MaskskaddbJ9https://www.felixcloutier.com/x86/kaddw:kaddb:kaddq:kaddd VCVTQQ2PD VCVTQQ2PD vcvtqq2pdQConvert Packed Quadword Integers to Packed Double-Precision Floating-Point Values vcvtqq2pd=J vcvtqq2pd?J vcvtqq2pdAJ vcvtqq2pdJ vcvtqq2pdJ vcvtqq2pdJ vcvtqq2pd=J vcvtqq2pdJ vcvtqq2pd?J vcvtqq2pdJ vcvtqq2pdAJ vcvtqq2pdJ vcvtqq2pdQJ vcvtqq2pdQJ+https://www.felixcloutier.com/x86/vcvtqq2pdPCMPEQWPCMPEQWpcmpeqw%Compare Packed Word Data for EqualitypcmpeqwPCMPEQW3 pcmpeqwPCMPEQW3+ pcmpeqwPCMPEQW3pcmpeqwPCMPEQW3/9https://www.felixcloutier.com/x86/pcmpeqb:pcmpeqw:pcmpeqdSETNASETNAsetna*Set byte if not above (CF == 1 or ZF == 1)setnaSETLS3 setnaSETLS3#VRSQRTSHVRSQRTSHvrsqrtshOCompute Reciprocal of Square Root of Scalar Half-Precision Floating-Point ValuevrsqrtshRvrsqrtsh$RvrsqrtshRvrsqrtsh$R*https://www.felixcloutier.com/x86/vrsqrtshJPEJPEjpeJump if parity even (PF == 1)jpeJPS3NjpeJPS3O VFNMSUB132PH VFNMSUB132PH vfnmsub132phOFused Negative Multiply-Subtract of Packed Half-Precision Floating-Point Values vfnmsub132ph<K vfnmsub132phK vfnmsub132ph>K vfnmsub132phK vfnmsub132ph@R vfnmsub132phR vfnmsub132ph<K vfnmsub132phK vfnmsub132ph>K vfnmsub132phK vfnmsub132ph@R vfnmsub132phR vfnmsub132phQR vfnmsub132phQRlhttps://www.felixcloutier.com/x86/vfmsub132ph:vfnmsub132ph:vfmsub213ph:vfnmsub213ph:vfmsub231ph:vfnmsub231phBLSIBLSIblsiIsolate Lowest Set Bitblsil4blsil'4blsiq4blsiq+4&https://www.felixcloutier.com/x86/blsiCMOVNBCMOVNBcmovnbMove if not below (CF == 0)cmovnbw3  cmovnbw3 $cmovnbl3cmovnbl3'cmovnbq3cmovnbq3+ADCADCadcAdd with CarryadcbADCB3adcbADCB3 adcbADCB3  adcbADCB3 #adcwADCW3 adcwADCW3 adcwADCW3 adcwADCW3  adcwADCW3 $adclADCL3adclADCL3adclADCL3adclADCL3adclADCL3'adcqADCQ3adcqADCQ3adcqADCQ3adcqADCQ3adcqADCQ3+adcbADCB3#adcbADCB3# adcwADCW3$adcwADCW3$adcwADCW3$ adclADCL3'adclADCL3'adclADCL3'adcqADCQ3+adcqADCQ3+adcqADCQ3+%https://www.felixcloutier.com/x86/adcJEJEjeJump if equal (ZF == 1)jeJEQ3NjeJEQ3OJNGJNGjng)Jump if not greater (ZF == 1 or SF != OF)jngJLE3NjngJLE3OVANDNPDVANDNPDvandnpdHBitwise Logical AND NOT of Packed Double-Precision Floating-Point Valuesvandnpd=JvandnpdJvandnpd?JvandnpdJvandnpdAJvandnpdJvandnpd=Jvandnpd4 vandnpdJvandnpd4/ vandnpd?Jvandnpd4 vandnpdJvandnpd42 vandnpdAJvandnpdJ VPMOVSXBQ VPMOVSXBQ vpmovsxbqBMove Packed Byte Integers to Quadword Integers with Sign Extension vpmovsxbqH vpmovsxbqH vpmovsxbqH vpmovsxbq$H vpmovsxbq'H vpmovsxbq+H vpmovsxbq4  vpmovsxbqH vpmovsxbq4$  vpmovsxbq$H vpmovsxbq4! vpmovsxbqH vpmovsxbq4'! vpmovsxbq'H vpmovsxbqH vpmovsxbq+HVPSIGNBVPSIGNBvpsignbPacked Sign of Byte Integersvpsignb4 vpsignb4/ vpsignb4!vpsignb42!VMOVHLPSVMOVHLPSvmovhlps>Move Packed Single-Precision Floating-Point Values High to Lowvmovhlps4 vmovhlpsHVSCATTERPF0QPDVSCATTERPF0QPDvscatterpf0qpd‚Sparse Prefetch Packed Double-Precision Floating-Point Data Values with Signed Quadword Indices Using T0 Hint with Intent to Writevscatterpf0qpdML]https://www.felixcloutier.com/x86/vscatterpf0dps:vscatterpf0qps:vscatterpf0dpd:vscatterpf0qpdPFMINPFMINpfminPacked Floating-Point MinimumpfminPFMIN3pfminPFMIN3+PINSRQPINSRQpinsrqInsert QuadwordpinsrqPINSRQ3pinsrqPINSRQ3+6https://www.felixcloutier.com/x86/pinsrb:pinsrd:pinsrq VFNMSUB132PD VFNMSUB132PD vfnmsub132pdQFused Negative Multiply-Subtract of Packed Double-Precision Floating-Point Values vfnmsub132pd=H vfnmsub132pdH vfnmsub132pd?H vfnmsub132pdH vfnmsub132pdAH vfnmsub132pdH vfnmsub132pd=H vfnmsub132pd4# vfnmsub132pdH vfnmsub132pd4/# vfnmsub132pd?H vfnmsub132pd4# vfnmsub132pdH vfnmsub132pd42# vfnmsub132pdAH vfnmsub132pdH vfnmsub132pdQH vfnmsub132pdQHHhttps://www.felixcloutier.com/x86/vfnmsub132pd:vfnmsub213pd:vfnmsub231pdPSHUFLWPSHUFLWpshuflwShuffle Packed Low WordspshuflwPSHUFLW3pshuflwPSHUFLW3/)https://www.felixcloutier.com/x86/pshuflwADDSDADDSDaddsd1Add Scalar Double-Precision Floating-Point ValuesaddsdADDSD3addsdADDSD3+'https://www.felixcloutier.com/x86/addsdRDPIDRDPIDrdpidRead Processor IDrdpid,'https://www.felixcloutier.com/x86/rdpidVFMADDSUB132PSVFMADDSUB132PSvfmaddsub132psXFused Multiply-Alternating Add/Subtract of Packed Single-Precision Floating-Point Valuesvfmaddsub132ps9Hvfmaddsub132psHvfmaddsub132ps:Hvfmaddsub132psHvfmaddsub132ps;Hvfmaddsub132psHvfmaddsub132ps9Hvfmaddsub132ps4#vfmaddsub132psHvfmaddsub132ps4/#vfmaddsub132ps:Hvfmaddsub132ps4#vfmaddsub132psHvfmaddsub132ps42#vfmaddsub132ps;Hvfmaddsub132psHvfmaddsub132psQHvfmaddsub132psQHNhttps://www.felixcloutier.com/x86/vfmaddsub132ps:vfmaddsub213ps:vfmaddsub231psVPMINSWVPMINSWvpminsw&Minimum of Packed Signed Word IntegersvpminswIvpminsw/IvpminswIvpminsw2IvpminswIvpminsw5Ivpminsw4 vpminswIvpminsw4/ vpminsw/Ivpminsw4!vpminswIvpminsw42!vpminsw2IvpminswIvpminsw5IPSUBUSBPSUBUSBpsubusb?Subtract Packed Unsigned Byte Integers with Unsigned SaturationpsubusbPSUBUSB3 psubusbPSUBUSB3+ psubusbPSUBUSB3psubusbPSUBUSB3/1https://www.felixcloutier.com/x86/psubusb:psubuswKORWKORWkorwBitwise Logical OR 16-bit MaskskorwH5https://www.felixcloutier.com/x86/korw:korb:korq:kordKXORQKXORQkxorq Bitwise Logical XOR 64-bit MaskskxorqI9https://www.felixcloutier.com/x86/kxorw:kxorb:kxorq:kxordVANDPSVANDPSvandpsDBitwise Logical AND of Packed Single-Precision Floating-Point Valuesvandps9JvandpsJvandps:JvandpsJvandps;JvandpsJvandps9Jvandps4 vandpsJvandps4/ vandps:Jvandps4 vandpsJvandps42 vandps;JvandpsJVPMOVD2MVPMOVD2Mvpmovd2m9Move Signs of Packed Doubleword Integers to Mask Registervpmovd2mJvpmovd2mJvpmovd2mJEhttps://www.felixcloutier.com/x86/vpmovb2m:vpmovw2m:vpmovd2m:vpmovq2mPMOVZXBQPMOVZXBQpmovzxbqBMove Packed Byte Integers to Quadword Integers with Zero Extensionpmovzxbq3pmovzxbq3$VPSRLVDVPSRLVDvpsrlvd3Variable Shift Packed Doubleword Data Right Logicalvpsrlvd9HvpsrlvdHvpsrlvd:HvpsrlvdHvpsrlvd;HvpsrlvdHvpsrlvd9Hvpsrlvd4!vpsrlvdHvpsrlvd4/!vpsrlvd:Hvpsrlvd4!vpsrlvdHvpsrlvd42!vpsrlvd;HvpsrlvdH9https://www.felixcloutier.com/x86/vpsrlvw:vpsrlvd:vpsrlvq VSCATTERQPD VSCATTERQPD vscatterqpdRScatter Packed Double-Precision Floating-Point Values with Signed Quadword Indices vscatterqpdEH vscatterqpdIH vscatterqpdMHQhttps://www.felixcloutier.com/x86/vscatterdps:vscatterdpd:vscatterqps:vscatterqpdCMOVCCMOVCcmovcMove if carry (CF == 1)cmovcw3  cmovcw3 $cmovcl3cmovcl3'cmovcq3cmovcq3+STMXCSRSTMXCSRstmxcsrStore MXCSR Register StatestmxcsrSTMXCSR3')https://www.felixcloutier.com/x86/stmxcsrVROUNDPDVROUNDPDvroundpd3Round Packed Double Precision Floating-Point Valuesvroundpd4 vroundpd4/ vroundpd4 vroundpd42 VSHUFPSVSHUFPSvshufps5Shuffle Packed Single-Precision Floating-Point Valuesvshufps9HvshufpsHvshufps:HvshufpsHvshufps;HvshufpsHvshufps9Hvshufps4 vshufpsHvshufps4/ vshufps:Hvshufps4 vshufpsHvshufps42 vshufps;HvshufpsHVGF2P8AFFINEQBVGF2P8AFFINEQBvgf2p8affineqb(Galois Field (2^8) Affine Transformationvgf2p8affineqb=vgf2p8affineqbvgf2p8affineqb?vgf2p8affineqbvgf2p8affineqbAvgf2p8affineqbvgf2p8affineqb=vgf2p8affineqbvgf2p8affineqbvgf2p8affineqb/vgf2p8affineqb?vgf2p8affineqbvgf2p8affineqbvgf2p8affineqb2vgf2p8affineqbAvgf2p8affineqbKXNORWKXNORWkxnorw!Bitwise Logical XNOR 16-bit MaskskxnorwH=https://www.felixcloutier.com/x86/kxnorw:kxnorb:kxnorq:kxnordVPSRLDQVPSRLDQvpsrldq*Shift Packed Double Quadword Right Logicalvpsrldq4 vpsrldqIvpsrldq/Ivpsrldq!vpsrldqIvpsrldq2IvpsrldqIvpsrldq5IVFMADDSUB231PHVFMADDSUB231PHvfmaddsub231phVFused Multiply-Alternating Add/Subtract of Packed Half-Precision Floating-Point Valuesvfmaddsub231ph<Kvfmaddsub231phKvfmaddsub231ph>Kvfmaddsub231phKvfmaddsub231ph@Rvfmaddsub231phRvfmaddsub231ph<Kvfmaddsub231phKvfmaddsub231ph>Kvfmaddsub231phKvfmaddsub231ph@Rvfmaddsub231phRvfmaddsub231phQRvfmaddsub231phQRNhttps://www.felixcloutier.com/x86/vfmaddsub132ph:vfmaddsub213ph:vfmaddsub231ph VEXTRACTF32X8 VEXTRACTF32X8 vextractf32x8AExtract 256 Bits of Packed Single-Precision Floating-Point Values vextractf32x8J vextractf32x83J vextractf32x8J vextractf32x82JVFMADDSUB132PHVFMADDSUB132PHvfmaddsub132phVFused Multiply-Alternating Add/Subtract of Packed Half-Precision Floating-Point Valuesvfmaddsub132ph<Kvfmaddsub132phKvfmaddsub132ph>Kvfmaddsub132phKvfmaddsub132ph@Rvfmaddsub132phRvfmaddsub132ph<Kvfmaddsub132phKvfmaddsub132ph>Kvfmaddsub132phKvfmaddsub132ph@Rvfmaddsub132phRvfmaddsub132phQRvfmaddsub132phQRNhttps://www.felixcloutier.com/x86/vfmaddsub132ph:vfmaddsub213ph:vfmaddsub231phMULMULmulUnsigned MultiplymulbMULB3 mulwMULW3 mullMULL3mulqMULQ3mulbMULB3#mulwMULW3$mullMULL3'mulqMULQ3+%https://www.felixcloutier.com/x86/mulVPCMPEQWVPCMPEQWvpcmpeqw%Compare Packed Word Data for EqualityvpcmpeqwIvpcmpeqwIvpcmpeqw/Ivpcmpeqw/IvpcmpeqwIvpcmpeqwIvpcmpeqw2Ivpcmpeqw2IvpcmpeqwIvpcmpeqwIvpcmpeqw5Ivpcmpeqw5Ivpcmpeqw4 vpcmpeqw4/ vpcmpeqw4!vpcmpeqw42!VPMACSDDVPMACSDDvpmacsddAPacked Multiply Accumulate Signed Doubleword to Signed Doublewordvpmacsdd"vpmacsdd/" VEXTRACTI32X4 VEXTRACTI32X4 vextracti32x44Extract 128 Bits of Packed Doubleword Integer Values vextracti32x4H vextracti32x40H vextracti32x4H vextracti32x40H vextracti32x4H vextracti32x4H vextracti32x4/H vextracti32x4/HVCMPSHVCMPSHvcmpsh3Compare Scalar Half-Precision Floating-Point ValuesvcmpshRvcmpshRvcmpsh$Rvcmpsh$RvcmpshRRvcmpshRR(https://www.felixcloutier.com/x86/vcmpsh VPEXPANDB VPEXPANDB vpexpandbALoad Sparse Packed Byte Integer Values from Dense Memory/Register  vpexpandbK vpexpandbK vpexpandbU vpexpandb/K vpexpandb2K vpexpandb5U vpexpandbK vpexpandb/K vpexpandbK vpexpandb2K vpexpandbU vpexpandb5U5https://www.felixcloutier.com/x86/vpexpandb:vpexpandwCLDCLDcldClear Direction FlagcldCLD3%https://www.felixcloutier.com/x86/cldPSHUFBPSHUFBpshufbPacked Shuffle BytespshufbPSHUFB3pshufbPSHUFB3+pshufbPSHUFB3pshufbPSHUFB3/(https://www.felixcloutier.com/x86/pshufbVPSHLBVPSHLBvpshlbPacked Shift Logical Bytesvpshlb"vpshlb/"vpshlb/"VPSLLDQVPSLLDQvpslldq)Shift Packed Double Quadword Left Logicalvpslldq4 vpslldqIvpslldq/Ivpslldq!vpslldqIvpslldq2IvpslldqIvpslldq5IVPSHUFLWVPSHUFLWvpshuflwShuffle Packed Low WordsvpshuflwIvpshuflwIvpshuflwIvpshuflw/Ivpshuflw2Ivpshuflw5Ivpshuflw4 vpshuflwIvpshuflw4/ vpshuflw/Ivpshuflw4!vpshuflwIvpshuflw42!vpshuflw2IvpshuflwIvpshuflw5ISHRSHRshrLogical Shift RightshrbSHRB3 shrbSHRB3 shrbSHRB3 shrwSHRW3 shrwSHRW3 shrwSHRW3 shrlSHRL3shrlSHRL3shrlSHRL3shrqSHRQ3shrqSHRQ3shrqSHRQ3shrbSHRB3#shrbSHRB3#shrbSHRB3#shrwSHRW3$shrwSHRW3$shrwSHRW3$shrlSHRL3'shrlSHRL3'shrlSHRL3'shrqSHRQ3+shrqSHRQ3+shrqSHRQ3+1https://www.felixcloutier.com/x86/sal:sar:shl:shr VFNMADD132PH VFNMADD132PH vfnmadd132phJFused Negative Multiply-Add of Packed Half-Precision Floating-Point Values vfnmadd132ph<K vfnmadd132phK vfnmadd132ph>K vfnmadd132phK vfnmadd132ph@R vfnmadd132phR vfnmadd132ph<K vfnmadd132phK vfnmadd132ph>K vfnmadd132phK vfnmadd132ph@R vfnmadd132phR vfnmadd132phQR vfnmadd132phQRlhttps://www.felixcloutier.com/x86/vfmadd132ph:vfnmadd132ph:vfmadd213ph:vfnmadd213ph:vfmadd231ph:vfnmadd231ph VCVTPS2UDQ VCVTPS2UDQ vcvtps2udqbConvert Packed Single-Precision Floating-Point Values to Packed Unsigned Doubleword Integer Values vcvtps2udq9H vcvtps2udq:H vcvtps2udq;H vcvtps2udqH vcvtps2udqH vcvtps2udqH vcvtps2udq9H vcvtps2udqH vcvtps2udq:H vcvtps2udqH vcvtps2udq;H vcvtps2udqH vcvtps2udqQH vcvtps2udqQH,https://www.felixcloutier.com/x86/vcvtps2udqVPSLLDVPSLLDvpslld)Shift Packed Doubleword Data Left Logicalvpslld9Hvpslld:Hvpslld;HvpslldHvpslldHvpslld/HvpslldHvpslldHvpslld/HvpslldHvpslldHvpslld/Hvpslld9Hvpslld4 vpslldHvpslld4 vpslldHvpslld4/ vpslld/Hvpslld:Hvpslld4!vpslldHvpslld4!vpslldHvpslld4/!vpslld/Hvpslld;HvpslldHvpslldHvpslld/H VFMSUB231SD VFMSUB231SD vfmsub231sdHFused Multiply-Subtract of Scalar Double-Precision Floating-Point Values vfmsub231sdH vfmsub231sd+H vfmsub231sd4# vfmsub231sdH vfmsub231sd4+# vfmsub231sd+H vfmsub231sdQH vfmsub231sdQHEhttps://www.felixcloutier.com/x86/vfmsub132sd:vfmsub213sd:vfmsub231sd VREDUCEPS VREDUCEPS vreducepsQPerform Reduction Transformation on Packed Single-Precision Floating-Point Values  vreduceps9J vreduceps:J vreduceps;J vreducepsJ vreducepsJ vreducepsJ vreduceps9J vreducepsJ vreduceps:J vreducepsJ vreduceps;J vreducepsJ+https://www.felixcloutier.com/x86/vreducepsKANDNBKANDNBkandnb#Bitwise Logical AND NOT 8-bit MaskskandnbJ=https://www.felixcloutier.com/x86/kandnw:kandnb:kandnq:kandndPMAXUBPMAXUBpmaxub(Maximum of Packed Unsigned Byte IntegerspmaxubPMAXUB3 pmaxubPMAXUB3+ pmaxubPMAXUB3pmaxubPMAXUB3//https://www.felixcloutier.com/x86/pmaxub:pmaxuw PREFETCHT2 PREFETCHT2 prefetcht2'Prefetch Data Into Caches using T2 Hint prefetcht2 PREFETCHT23# VUCOMISDVUCOMISDvucomisdNUnordered Compare Scalar Double-Precision Floating-Point Values and Set EFLAGSvucomisd4 vucomisdHvucomisd4+ vucomisd+HvucomisdRHPMULHRWPMULHRWpmulhrw!Packed Multiply High Rounded WordpmulhrwPMULHRW3pmulhrwPMULHRW3+ VFMSUB213SH VFMSUB213SH vfmsub213shFFused Multiply-Subtract of Scalar Half-Precision Floating-Point Values vfmsub213shR vfmsub213sh$R vfmsub213shR vfmsub213sh$R vfmsub213shQR vfmsub213shQRlhttps://www.felixcloutier.com/x86/vfmsub132sh:vfnmsub132sh:vfmsub213sh:vfnmsub213sh:vfmsub231sh:vfnmsub231sh VPCOMPRESSQ VPCOMPRESSQ vpcompressqFStore Sparse Packed Quadword Integer Values into Dense Memory/Register  vpcompressqH vpcompressq0H vpcompressqH vpcompressq3H vpcompressqH vpcompressq6H vpcompressqH vpcompressqH vpcompressqH vpcompressq/H vpcompressq2H vpcompressq5H-https://www.felixcloutier.com/x86/vpcompressq VSM3RNDS2 VSM3RNDS2 vsm3rnds2#Perform Two Rounds of SM3 Operation vsm3rnds2 vsm3rnds2/VRCP14SSVRCP14SSvrcp14ssPCompute Approximate Reciprocal of a Scalar Single-Precision Floating-Point Valuevrcp14ssHvrcp14ss'Hvrcp14ssHvrcp14ss'H*https://www.felixcloutier.com/x86/vrcp14ss PUNPCKLQDQ PUNPCKLQDQ punpcklqdq?Unpack and Interleave Low-Order Quadwords into Double Quadwords punpcklqdq PUNPCKLQDQ3 punpcklqdq PUNPCKLQDQ3/Jhttps://www.felixcloutier.com/x86/punpcklbw:punpcklwd:punpckldq:punpcklqdqSETNLSETNLsetnlSet byte if not less (SF == OF)setnlSETGE3 setnlSETGE3#VSM4KEY4VSM4KEY4vsm4key4(Perform Four Rounds of SM4 Key Expansionvsm4key4vsm4key4/vsm4key4vsm4key42LAHFLAHFlahfLoad AH from FlagslahfLAHF<&https://www.felixcloutier.com/x86/lahf VSCATTERQPS VSCATTERQPS vscatterqpsRScatter Packed Single-Precision Floating-Point Values with Signed Quadword Indices vscatterqpsEH vscatterqpsIH vscatterqpsMHQhttps://www.felixcloutier.com/x86/vscatterdps:vscatterdpd:vscatterqps:vscatterqpdAADDAADDaaddAtomically ADDaadd'aadd+ANDNANDNandnLogical AND NOTandnl4andnl'4andnq4andnq+4&https://www.felixcloutier.com/x86/andn VRSQRT28SS VRSQRT28SS vrsqrt28ssApproximation to the Reciprocal Square Root of a Scalar Single-Precision Floating-Point Value with Less Than 2^-28 Relative Error vrsqrt28ssM vrsqrt28ss'M vrsqrt28ssM vrsqrt28ss'M vrsqrt28ssRM vrsqrt28ssRM,https://www.felixcloutier.com/x86/vrsqrt28ssBSFBSFbsfBit Scan ForwardbsfwBSFW3  bsfwBSFW3 $bsflBSFL3bsflBSFL3'bsfqBSFQ3bsfqBSFQ3+%https://www.felixcloutier.com/x86/bsfPFRCPIT1PFRCPIT1pfrcpit1,Packed Floating-Point Reciprocal Iteration 1pfrcpit1PFRCPIT13pfrcpit1PFRCPIT13+VSQRTSDVSQRTSDvsqrtsdCCompute Square Root of Scalar Double-Precision Floating-Point ValuevsqrtsdHvsqrtsd+Hvsqrtsd4 vsqrtsdHvsqrtsd4+ vsqrtsd+HvsqrtsdQHvsqrtsdQHVMOVAPSVMOVAPSvmovaps:Move Aligned Packed Single-Precision Floating-Point Valuesvmovaps0HvmovapsHvmovaps3HvmovapsHvmovaps6HvmovapsHvmovaps/Hvmovaps2Hvmovaps5Hvmovaps4 vmovapsHvmovaps4/ vmovaps/Hvmovaps4 vmovapsHvmovaps42 vmovaps2HvmovapsHvmovaps5Hvmovaps4/ vmovaps/Hvmovaps42 vmovaps2Hvmovaps5HKSHIFTLBKSHIFTLBkshiftlbShift Left 8-bit MaskskshiftlbJEhttps://www.felixcloutier.com/x86/kshiftlw:kshiftlb:kshiftlq:kshiftldJOJOjoJump if overflow (OF == 1)joJOS3NjoJOS3OVCVTNEOBF162PSVCVTNEOBF162PSvcvtneobf162ps9Convert Odd Elements of Packed BF16 Values to FP32 Valuesvcvtneobf162ps/Zvcvtneobf162ps2Z PREFETCHT0 PREFETCHT0 prefetcht0'Prefetch Data Into Caches using T0 Hint prefetcht0 PREFETCHT03# KSHIFTRWKSHIFTRWkshiftrwShift Right 16-bit MaskskshiftrwHEhttps://www.felixcloutier.com/x86/kshiftrw:kshiftrb:kshiftrq:kshiftrdVPMULDQVPMULDQvpmuldqDMultiply Packed Signed Doubleword Integers and Store Quadword Resultvpmuldq=HvpmuldqHvpmuldq?HvpmuldqHvpmuldqAHvpmuldqHvpmuldq=Hvpmuldq4 vpmuldqHvpmuldq4/ vpmuldq?Hvpmuldq4!vpmuldqHvpmuldq42!vpmuldqAHvpmuldqHVPSHLDWVPSHLDWvpshldw3Concatenate and Shift Packed Word Data Left Logical vpshldwKvpshldw/KvpshldwKvpshldw2KvpshldwUvpshldw5UvpshldwKvpshldw/KvpshldwKvpshldw2KvpshldwUvpshldw5UVFMULCSHVFMULCSHvfmulcshEFused Multiply of Complex Scalar Half-Precision Floating-Point ValuesvfmulcshRvfmulcsh'RvfmulcshRvfmulcsh'RvfmulcshQRvfmulcshQR4https://www.felixcloutier.com/x86/vfcmulcsh:vfmulcshVPHSUBDQVPHSUBDQvphsubdq?Packed Horizontal Subtract Signed Doubleword to Signed Quadwordvphsubdq"vphsubdq/"KSHIFTRDKSHIFTRDkshiftrdShift Right 32-bit MaskskshiftrdIEhttps://www.felixcloutier.com/x86/kshiftrw:kshiftrb:kshiftrq:kshiftrdMOVQ2DQMOVQ2DQmovq2dq1Move Quadword from MMX Technology to XMM Registermovq2dq3)https://www.felixcloutier.com/x86/movq2dqSETGSETGsetg*Set byte if greater (ZF == 0 and SF == OF)setgSETGT3 setgSETGT3# VPMOVZXBD VPMOVZXBD vpmovzxbdDMove Packed Byte Integers to Doubleword Integers with Zero Extension vpmovzxbdH vpmovzxbdH vpmovzxbdH vpmovzxbd'H vpmovzxbd+H vpmovzxbd/H vpmovzxbd4  vpmovzxbdH vpmovzxbd4'  vpmovzxbd'H vpmovzxbd4! vpmovzxbdH vpmovzxbd4+! vpmovzxbd+H vpmovzxbdH vpmovzxbd/H VPMOVZXWQ VPMOVZXWQ vpmovzxwqBMove Packed Word Integers to Quadword Integers with Zero Extension vpmovzxwqH vpmovzxwqH vpmovzxwqH vpmovzxwq'H vpmovzxwq+H vpmovzxwq/H vpmovzxwq4  vpmovzxwqH vpmovzxwq4'  vpmovzxwq'H vpmovzxwq4! vpmovzxwqH vpmovzxwq4+! vpmovzxwq+H vpmovzxwqH vpmovzxwq/HTDPBSUDTDPBSUDtdpbsudOTile Dot Product of Signed bytes by Unsigned bytes with Doubleword accumulationtdpbsudTTTAhttps://www.felixcloutier.com/x86/tdpbssd:tdpbsud:tdpbusd:tdpbuudPBLENDWPBLENDWpblendwBlend Packed Wordspblendw3pblendw3/)https://www.felixcloutier.com/x86/pblendw VFNMSUB231SD VFNMSUB231SD vfnmsub231sdQFused Negative Multiply-Subtract of Scalar Double-Precision Floating-Point Values vfnmsub231sdH vfnmsub231sd+H vfnmsub231sd4# vfnmsub231sdH vfnmsub231sd4+# vfnmsub231sd+H vfnmsub231sdQH vfnmsub231sdQHHhttps://www.felixcloutier.com/x86/vfnmsub132sd:vfnmsub213sd:vfnmsub231sd VFNMSUB231SH VFNMSUB231SH vfnmsub231shOFused Negative Multiply-Subtract of Scalar Half-Precision Floating-Point Values vfnmsub231shR vfnmsub231sh$R vfnmsub231shR vfnmsub231sh$R vfnmsub231shQR vfnmsub231shQRlhttps://www.felixcloutier.com/x86/vfmsub132sh:vfnmsub132sh:vfmsub213sh:vfnmsub213sh:vfmsub231sh:vfnmsub231shVPMINSBVPMINSBvpminsb&Minimum of Packed Signed Byte IntegersvpminsbIvpminsb/IvpminsbIvpminsb2IvpminsbIvpminsb5Ivpminsb4 vpminsbIvpminsb4/ vpminsb/Ivpminsb4!vpminsbIvpminsb42!vpminsb2IvpminsbIvpminsb5IADDPSADDPSaddps1Add Packed Single-Precision Floating-Point ValuesaddpsADDPS3addpsADDPS3/'https://www.felixcloutier.com/x86/addpsKORTESTQKORTESTQkortestqOR 64-bit Masks and Set FlagskortestqIEhttps://www.felixcloutier.com/x86/kortestw:kortestb:kortestq:kortestdPINSRWPINSRWpinsrw Insert WordpinsrwPINSRW3 pinsrwPINSRW3$ pinsrwPINSRW3pinsrwPINSRW3$(https://www.felixcloutier.com/x86/pinsrwMOVSDMOVSDmovsd1Move Scalar Double-Precision Floating-Point ValuemovsdMOVSD3movsdMOVSD3+movsdMOVSD3+'https://www.felixcloutier.com/x86/movsdAESIMCAESIMCaesimc+Perform the AES InvMixColumn TransformationaesimcAESIMC'aesimcAESIMC/'(https://www.felixcloutier.com/x86/aesimcSETZSETZsetzSet byte if zero (ZF == 1)setzSETEQ3 setzSETEQ3#VADDPHVADDPHvaddph/Add Packed Half-Precision Floating-Point Valuesvaddph<KvaddphKvaddph>KvaddphKvaddph@RvaddphRvaddph<KvaddphKvaddph>KvaddphKvaddph@RvaddphRvaddphQRvaddphQR(https://www.felixcloutier.com/x86/vaddph VCVTTSH2USI VCVTTSH2USI vcvttsh2usiVConvert with Truncation Scalar Half-Precision Floating-Point Value to Unsigned Integer vcvttsh2usiR vcvttsh2usi$R vcvttsh2usiR vcvttsh2usi$R vcvttsh2usiRR vcvttsh2usiRR-https://www.felixcloutier.com/x86/vcvttsh2usiVCMPSSVCMPSSvcmpss5Compare Scalar Single-Precision Floating-Point ValuesvcmpssHvcmpssHvcmpss'Hvcmpss'Hvcmpss4 vcmpss4' vcmpssRHvcmpssRH VEXTRACTF64X2 VEXTRACTF64X2 vextractf64x2AExtract 128 Bits of Packed Double-Precision Floating-Point Values vextractf64x2J vextractf64x20J vextractf64x2J vextractf64x20J vextractf64x2J vextractf64x2J vextractf64x2/J vextractf64x2/JCDQCDQcdqConvert Doubleword to Quadwordcltd3-https://www.felixcloutier.com/x86/cwd:cdq:cqoVMOVHPDVMOVHPDvmovhpd6Move High Packed Double-Precision Floating-Point Valuevmovhpd4+ vmovhpd+Hvmovhpd4+ vmovhpd+H VFNMSUB213SH VFNMSUB213SH vfnmsub213shOFused Negative Multiply-Subtract of Scalar Half-Precision Floating-Point Values vfnmsub213shR vfnmsub213sh$R vfnmsub213shR vfnmsub213sh$R vfnmsub213shQR vfnmsub213shQRlhttps://www.felixcloutier.com/x86/vfmsub132sh:vfnmsub132sh:vfmsub213sh:vfnmsub213sh:vfmsub231sh:vfnmsub231shVPORQVPORQvporq.Bitwise Logical OR of Packed Quadword Integers vporq=HvporqHvporq?HvporqHvporqAHvporqHvporq=HvporqHvporq?HvporqHvporqAHvporqH VFMADD213PD VFMADD213PD vfmadd213pdCFused Multiply-Add of Packed Double-Precision Floating-Point Values vfmadd213pd=H vfmadd213pdH vfmadd213pd?H vfmadd213pdH vfmadd213pdAH vfmadd213pdH vfmadd213pd=H vfmadd213pd4# vfmadd213pdH vfmadd213pd4/# vfmadd213pd?H vfmadd213pd4# vfmadd213pdH vfmadd213pd42# vfmadd213pdAH vfmadd213pdH vfmadd213pdQH vfmadd213pdQHEhttps://www.felixcloutier.com/x86/vfmadd132pd:vfmadd213pd:vfmadd231pdJPOJPOjpoJump if parity odd (PF == 0)jpoJPC3NjpoJPC3OPANDNPANDNpandnPacked Bitwise Logical AND NOTpandnPANDN3 pandnPANDN3+ pandnPANDN3pandnPANDN3/'https://www.felixcloutier.com/x86/pandn VBLENDVPS VBLENDVPS vblendvps= Variable Blend Packed Single Precision Floating-Point Values vblendvps4  vblendvps4/  vblendvps4  vblendvps42  VFCMULCPH VFCMULCPH vfcmulcphOFused Conjugate Multiply of Complex Packed Half-Precision Floating-Point Values vfcmulcph9K vfcmulcphK vfcmulcph:K vfcmulcphK vfcmulcph;R vfcmulcphR vfcmulcph9K vfcmulcphK vfcmulcph:K vfcmulcphK vfcmulcph;R vfcmulcphR vfcmulcphQR vfcmulcphQR4https://www.felixcloutier.com/x86/vfcmulcph:vfmulcphVPHSUBWDVPHSUBWDvphsubwd;Packed Horizontal Subtract Signed Word to Signed Doublewordvphsubwd"vphsubwd/" SHA1NEXTE SHA1NEXTE sha1nexte1Calculate SHA1 State Variable E after Four Rounds sha1nexte( sha1nexte/(+https://www.felixcloutier.com/x86/sha1nexteSUBPDSUBPDsubpd6Subtract Packed Double-Precision Floating-Point ValuessubpdSUBPD3subpdSUBPD3/'https://www.felixcloutier.com/x86/subpdVMOVLPSVMOVLPSvmovlps6Move Low Packed Single-Precision Floating-Point Valuesvmovlps4+ vmovlps+Hvmovlps4+ vmovlps+HJNPJNPjnpJump if not parity (PF == 0)jnpJPC3NjnpJPC3OVFRCZSDVFRCZSDvfrczsd7Extract Fraction Scalar Double-Precision Floating-Pointvfrczsd"vfrczsd+"VPMOVB2MVPMOVB2Mvpmovb2m3Move Signs of Packed Byte Integers to Mask Registervpmovb2mIvpmovb2mIvpmovb2mIEhttps://www.felixcloutier.com/x86/vpmovb2m:vpmovw2m:vpmovd2m:vpmovq2mCDQECDQEcdqeConvert Doubleword to Quadwordcltq3/https://www.felixcloutier.com/x86/cbw:cwde:cdqeVDIVSSVDIVSSvdivss4Divide Scalar Single-Precision Floating-Point ValuesvdivssHvdivss'Hvdivss4 vdivssHvdivss4' vdivss'HvdivssQHvdivssQHCOMISSCOMISScomissLCompare Scalar Ordered Single-Precision Floating-Point Values and Set EFLAGScomissCOMISS3comissCOMISS3'(https://www.felixcloutier.com/x86/comiss PCMPISTRI PCMPISTRI pcmpistri4Packed Compare Implicit Length Strings, Return Index pcmpistri3 pcmpistri3/+https://www.felixcloutier.com/x86/pcmpistriAORAORaor Atomically ORaor'aor+SETESETEseteSet byte if equal (ZF == 1)seteSETEQ3 seteSETEQ3#PADDQPADDQpaddqAdd Packed Quadword IntegerspaddqPADDQ3paddqPADDQ3+paddqPADDQ3paddqPADDQ3/9https://www.felixcloutier.com/x86/paddb:paddw:paddd:paddqVPMADDWDVPMADDWDvpmaddwd,Multiply and Add Packed Signed Word IntegersvpmaddwdIvpmaddwd/IvpmaddwdIvpmaddwd2IvpmaddwdIvpmaddwd5Ivpmaddwd4 vpmaddwdIvpmaddwd4/ vpmaddwd/Ivpmaddwd4!vpmaddwdIvpmaddwd42!vpmaddwd2IvpmaddwdIvpmaddwd5I VMOVSLDUP VMOVSLDUP vmovsldup'Move Packed Single-FP Low and Duplicate vmovsldupH vmovsldupH vmovsldupH vmovsldup/H vmovsldup2H vmovsldup5H vmovsldup4  vmovsldupH vmovsldup4/  vmovsldup/H vmovsldup4  vmovsldupH vmovsldup42  vmovsldup2H vmovsldupH vmovsldup5HJNLEJNLEjnle0Jump if not less or equal (ZF == 0 and SF == OF)jnleJGT3NjnleJGT3OVPMAXSQVPMAXSQvpmaxsq*Maximum of Packed Signed Quadword Integers vpmaxsq=HvpmaxsqHvpmaxsq?HvpmaxsqHvpmaxsqAHvpmaxsqHvpmaxsq=HvpmaxsqHvpmaxsq?HvpmaxsqHvpmaxsqAHvpmaxsqHPEXTPEXTpextParallel Bits Extractpextl5pextl'5pextq5pextq+5&https://www.felixcloutier.com/x86/pextVPSUBSWVPSUBSWvpsubsw;Subtract Packed Signed Word Integers with Signed SaturationvpsubswIvpsubsw/IvpsubswIvpsubsw2IvpsubswIvpsubsw5Ivpsubsw4 vpsubswIvpsubsw4/ vpsubsw/Ivpsubsw4!vpsubswIvpsubsw42!vpsubsw2IvpsubswIvpsubsw5IVRANGESDVRANGESDvrangesdYRange Restriction Calculation For a pair of Scalar Double-Precision Floating-Point ValuesvrangesdJvrangesd+JvrangesdJvrangesd+JvrangesdRJvrangesdRJ*https://www.felixcloutier.com/x86/vrangesdVSUBSHVSUBSHvsubsh4Subtract Scalar Half-Precision Floating-Point ValuesvsubshRvsubsh$RvsubshRvsubsh$RvsubshQRvsubshQR(https://www.felixcloutier.com/x86/vsubshPSRLQPSRLQpsrlq(Shift Packed Quadword Data Right LogicalpsrlqPSRLQ3 psrlqPSRLQ3 psrlqPSRLQ3+ psrlqPSRLQ3psrlqPSRLQ3psrlqPSRLQ3/3https://www.felixcloutier.com/x86/psrlw:psrld:psrlqMAXPDMAXPDmaxpd<Return Maximum Packed Double-Precision Floating-Point ValuesmaxpdMAXPD3maxpdMAXPD3/'https://www.felixcloutier.com/x86/maxpdMOVDMOVDmovdMove Doublewordmovd3 movd3movd3 movd3' movd3movd3'movd3' movd3'+https://www.felixcloutier.com/x86/movd:movqVPSHAQVPSHAQvpshaq!Packed Shift Arithmetic Quadwordsvpshaq"vpshaq/"vpshaq/" VFMSUB231PH VFMSUB231PH vfmsub231phFFused Multiply-Subtract of Packed Half-Precision Floating-Point Values vfmsub231ph<K vfmsub231phK vfmsub231ph>K vfmsub231phK vfmsub231ph@R vfmsub231phR vfmsub231ph<K vfmsub231phK vfmsub231ph>K vfmsub231phK vfmsub231ph@R vfmsub231phR vfmsub231phQR vfmsub231phQRlhttps://www.felixcloutier.com/x86/vfmsub132ph:vfnmsub132ph:vfmsub213ph:vfnmsub213ph:vfmsub231ph:vfnmsub231phPSUBWPSUBWpsubwSubtract Packed Word IntegerspsubwPSUBW3 psubwPSUBW3+ psubwPSUBW3psubwPSUBW3/3https://www.felixcloutier.com/x86/psubb:psubw:psubdKORDKORDkordBitwise Logical OR 32-bit MaskskordI5https://www.felixcloutier.com/x86/korw:korb:korq:kordVMOVQVMOVQvmovq Move Quadword vmovq4 vmovqHvmovq4 vmovqHvmovq4 vmovqHvmovq4+ vmovq+Hvmovq4+ vmovq+HPMINUDPMINUDpminud.Minimum of Packed Unsigned Doubleword Integerspminud3pminud3//https://www.felixcloutier.com/x86/pminud:pminuq VEXTRACTI64X2 VEXTRACTI64X2 vextracti64x22Extract 128 Bits of Packed Quadword Integer Values vextracti64x2J vextracti64x20J vextracti64x2J vextracti64x20J vextracti64x2J vextracti64x2J vextracti64x2/J vextracti64x2/JVPTESTMWVPTESTMWvptestmw6Logical AND of Packed Word Integer Values and Set Mask vptestmwIvptestmwIvptestmw/Ivptestmw/IvptestmwIvptestmwIvptestmw2Ivptestmw2IvptestmwIvptestmwIvptestmw5Ivptestmw5IEhttps://www.felixcloutier.com/x86/vptestmb:vptestmw:vptestmd:vptestmqBZHIBZHIbzhi3Zero High Bits Starting with Specified Bit Positionbzhil5bzhil'5bzhiq5bzhiq+5&https://www.felixcloutier.com/x86/bzhiPFNACCPFNACCpfnacc)Packed Floating-Point Negative AccumulatepfnaccPFNACC3pfnaccPFNACC3+CVTPS2PICVTPS2PIcvtps2piBConvert Packed Single-Precision FP Values to Packed Dword Integerscvtps2piCVTPS2PL3cvtps2piCVTPS2PL3+*https://www.felixcloutier.com/x86/cvtps2pi VFIXUPIMMPD VFIXUPIMMPD vfixupimmpd<Fix Up Special Packed Double-Precision Floating-Point Values vfixupimmpd=H vfixupimmpdH vfixupimmpd?H vfixupimmpdH vfixupimmpdAH vfixupimmpdH vfixupimmpd=H vfixupimmpdH vfixupimmpd?H vfixupimmpdH vfixupimmpdAH vfixupimmpdH vfixupimmpdRH vfixupimmpdRH-https://www.felixcloutier.com/x86/vfixupimmpdVMOVHPSVMOVHPSvmovhps7Move High Packed Single-Precision Floating-Point Valuesvmovhps4+ vmovhps+Hvmovhps4+ vmovhps+HCMOVLCMOVLcmovlMove if less (SF != OF)cmovlw3  cmovlw3 $cmovll3cmovll3'cmovlq3cmovlq3+TZMSKTZMSKtzmskMask From Trailing Zerostzmsk6tzmsk'6tzmsk6tzmsk+6 VGETMANTPS VGETMANTPS vgetmantpsOExtract Normalized Mantissas from Packed Single-Precision Floating-Point Values vgetmantps9H vgetmantps:H vgetmantps;H vgetmantpsH vgetmantpsH vgetmantpsH vgetmantps9H vgetmantpsH vgetmantps:H vgetmantpsH vgetmantps;H vgetmantpsH vgetmantpsRH vgetmantpsRH,https://www.felixcloutier.com/x86/vgetmantpsVPSIGNWVPSIGNWvpsignwPacked Sign of Word Integersvpsignw4 vpsignw4/ vpsignw4!vpsignw42!VFMSUBADD231PSVFMSUBADD231PSvfmsubadd231psXFused Multiply-Alternating Subtract/Add of Packed Single-Precision Floating-Point Valuesvfmsubadd231ps9Hvfmsubadd231psHvfmsubadd231ps:Hvfmsubadd231psHvfmsubadd231ps;Hvfmsubadd231psHvfmsubadd231ps9Hvfmsubadd231ps4#vfmsubadd231psHvfmsubadd231ps4/#vfmsubadd231ps:Hvfmsubadd231ps4#vfmsubadd231psHvfmsubadd231ps42#vfmsubadd231ps;Hvfmsubadd231psHvfmsubadd231psQHvfmsubadd231psQHNhttps://www.felixcloutier.com/x86/vfmsubadd132ps:vfmsubadd213ps:vfmsubadd231psSETBESETBEsetbe/Set byte if below or equal (CF == 1 or ZF == 1)setbeSETLS3 setbeSETLS3#KSHIFTLDKSHIFTLDkshiftldShift Left 32-bit MaskskshiftldIEhttps://www.felixcloutier.com/x86/kshiftlw:kshiftlb:kshiftlq:kshiftld VADDSUBPS VADDSUBPS vaddsubpsPacked Single-FP Add/Subtract vaddsubps4  vaddsubps4/  vaddsubps4  vaddsubps42 VPERMI2DVPERMI2Dvpermi2dAFull Permute of Doublewords From Two Tables Overwriting the Index vpermi2d9Hvpermi2dHvpermi2d:Hvpermi2dHvpermi2d;Hvpermi2dHvpermi2d9Hvpermi2dHvpermi2d:Hvpermi2dHvpermi2d;Hvpermi2dHPhttps://www.felixcloutier.com/x86/vpermi2w:vpermi2d:vpermi2q:vpermi2ps:vpermi2pdVMOVNTPSVMOVNTPSvmovntpsKStore Packed Single-Precision Floating-Point Values Using Non-Temporal Hintvmovntps4/ vmovntps/Hvmovntps42 vmovntps2Hvmovntps5HCMOVBCMOVBcmovbMove if below (CF == 1)cmovbw3  cmovbw3 $cmovbl3cmovbl3'cmovbq3cmovbq3+PFRCPIT2PFRCPIT2pfrcpit2,Packed Floating-Point Reciprocal Iteration 2pfrcpit23pfrcpit23+SETCSETCsetcSet byte if carry (CF == 1)setcSETCS3 setcSETCS3# VFMADD132PD VFMADD132PD vfmadd132pdCFused Multiply-Add of Packed Double-Precision Floating-Point Values vfmadd132pd=H vfmadd132pdH vfmadd132pd?H vfmadd132pdH vfmadd132pdAH vfmadd132pdH vfmadd132pd=H vfmadd132pd4# vfmadd132pdH vfmadd132pd4/# vfmadd132pd?H vfmadd132pd4# vfmadd132pdH vfmadd132pd42# vfmadd132pdAH vfmadd132pdH vfmadd132pdQH vfmadd132pdQHEhttps://www.felixcloutier.com/x86/vfmadd132pd:vfmadd213pd:vfmadd231pdTESTTESTtestLogical ComparetestbTESTB3testbTESTB3 testbTESTB3  testwTESTW3 testwTESTW3 testwTESTW3  testlTESTL3testlTESTL3testlTESTL3testqTESTQ3testqTESTQ3testqTESTQ3testbTESTB3#testbTESTB3# testwTESTW3$testwTESTW3$ testlTESTL3'testlTESTL3'testqTESTQ3+testqTESTQ3+&https://www.felixcloutier.com/x86/testPCMPGTQPCMPGTQpcmpgtq$Compare Packed Data for Greater Thanpcmpgtq3pcmpgtq3/)https://www.felixcloutier.com/x86/pcmpgtqVPBLENDDVPBLENDDvpblenddBlend Packed Doublewordsvpblendd4!vpblendd4/!vpblendd4!vpblendd42!*https://www.felixcloutier.com/x86/vpblenddVPSHRDVDVPSHRDVDvpshrdvdCConcatenate and Variable Shift Packed Doubleword Data Right Logical vpshrdvd9KvpshrdvdKvpshrdvd:KvpshrdvdKvpshrdvd;UvpshrdvdUvpshrdvd9KvpshrdvdKvpshrdvd:KvpshrdvdKvpshrdvd;UvpshrdvdUMINPSMINPSminps<Return Minimum Packed Single-Precision Floating-Point ValuesminpsMINPS3minpsMINPS3/'https://www.felixcloutier.com/x86/minpsADDPDADDPDaddpd1Add Packed Double-Precision Floating-Point ValuesaddpdADDPD3addpdADDPD3/'https://www.felixcloutier.com/x86/addpdVMOVDVMOVDvmovdMove Doublewordvmovd4 vmovdHvmovd4 vmovdHvmovd4' vmovd'Hvmovd4' vmovd'HVSUBPHVSUBPHvsubph4Subtract Packed Half-Precision Floating-Point Valuesvsubph<KvsubphKvsubph>KvsubphKvsubph@RvsubphRvsubph<KvsubphKvsubph>KvsubphKvsubph@RvsubphRvsubphQRvsubphQR(https://www.felixcloutier.com/x86/vsubph VFMSUB231SH VFMSUB231SH vfmsub231shFFused Multiply-Subtract of Scalar Half-Precision Floating-Point Values vfmsub231shR vfmsub231sh$R vfmsub231shR vfmsub231sh$R vfmsub231shQR vfmsub231shQRlhttps://www.felixcloutier.com/x86/vfmsub132sh:vfnmsub132sh:vfmsub213sh:vfnmsub213sh:vfmsub231sh:vfnmsub231shVMULSSVMULSSvmulss6Multiply Scalar Single-Precision Floating-Point ValuesvmulssHvmulss'Hvmulss4 vmulssHvmulss4' vmulss'HvmulssQHvmulssQH VPMOVUSQB VPMOVUSQB vpmovusqbKDown Convert Packed Quadword Values to Byte Values with Unsigned Saturation  vpmovusqbH vpmovusqb%H vpmovusqbH vpmovusqb(H vpmovusqbH vpmovusqb,H vpmovusqbH vpmovusqbH vpmovusqbH vpmovusqb$H vpmovusqb'H vpmovusqb+H<https://www.felixcloutier.com/x86/vpmovqb:vpmovsqb:vpmovusqbCMPLXADDCMPLXADDcmplxaddCompare for Less and Addcmplxadd'cmplxadd+VPTESTMQVPTESTMQvptestmq:Logical AND of Packed Quadword Integer Values and Set Mask vptestmq=Hvptestmq=HvptestmqHvptestmqHvptestmq?Hvptestmq?HvptestmqHvptestmqHvptestmqAHvptestmqAHvptestmqHvptestmqHEhttps://www.felixcloutier.com/x86/vptestmb:vptestmw:vptestmd:vptestmqVFMSUBADD213PDVFMSUBADD213PDvfmsubadd213pdXFused Multiply-Alternating Subtract/Add of Packed Double-Precision Floating-Point Valuesvfmsubadd213pd=Hvfmsubadd213pdHvfmsubadd213pd?Hvfmsubadd213pdHvfmsubadd213pdAHvfmsubadd213pdHvfmsubadd213pd=Hvfmsubadd213pd4#vfmsubadd213pdHvfmsubadd213pd4/#vfmsubadd213pd?Hvfmsubadd213pd4#vfmsubadd213pdHvfmsubadd213pd42#vfmsubadd213pdAHvfmsubadd213pdHvfmsubadd213pdQHvfmsubadd213pdQHNhttps://www.felixcloutier.com/x86/vfmsubadd132pd:vfmsubadd213pd:vfmsubadd231pdCMOVNZCMOVNZcmovnzMove if not zero (ZF == 0)cmovnzw3  cmovnzw3 $cmovnzl3cmovnzl3'cmovnzq3cmovnzq3+VADDSHVADDSHvaddsh/Add Scalar Half-Precision Floating-Point ValuesvaddshRvaddsh$RvaddshRvaddsh$RvaddshQRvaddshQR(https://www.felixcloutier.com/x86/vaddshPSADBWPSADBWpsadbw#Compute Sum of Absolute DifferencespsadbwPSADBW3 psadbwPSADBW3+ psadbwPSADBW3psadbwPSADBW3/(https://www.felixcloutier.com/x86/psadbwHSUBPSHSUBPShsubps$Packed Single-FP Horizontal Subtracthsubps3hsubps3/(https://www.felixcloutier.com/x86/hsubps VPMADCSSWD VPMADCSSWD vpmadcsswdOPacked Multiply Add Accumulate with Saturation Signed Word to Signed Doubleword vpmadcsswd" vpmadcsswd/" TILERELEASE TILERELEASE tilereleaseTILE RELEASE register state tilerelease-https://www.felixcloutier.com/x86/tilerelease VCVTPS2PHX VCVTPS2PHX vcvtps2phx<Convert Single-Precision FP value to Half-Precision FP value vcvtps2phxx9K vcvtps2phxy:K vcvtps2phx;R vcvtps2phxxK vcvtps2phxyK vcvtps2phxR vcvtps2phxx9K vcvtps2phxy:K vcvtps2phxxK vcvtps2phxyK vcvtps2phx;R vcvtps2phxR vcvtps2phxQR vcvtps2phxQR,https://www.felixcloutier.com/x86/vcvtps2phxVPSRAVDVPSRAVDvpsravd6Variable Shift Packed Doubleword Data Right Arithmeticvpsravd9HvpsravdHvpsravd:HvpsravdHvpsravd;HvpsravdHvpsravd9Hvpsravd4!vpsravdHvpsravd4/!vpsravd:Hvpsravd4!vpsravdHvpsravd42!vpsravd;HvpsravdH9https://www.felixcloutier.com/x86/vpsravw:vpsravd:vpsravqVPSRAVWVPSRAVWvpsravw0Variable Shift Packed Word Data Right Arithmetic vpsravwIvpsravw/IvpsravwIvpsravw2IvpsravwIvpsravw5IvpsravwIvpsravw/IvpsravwIvpsravw2IvpsravwIvpsravw5I9https://www.felixcloutier.com/x86/vpsravw:vpsravd:vpsravqMPSADBWMPSADBWmpsadbw3Compute Multiple Packed Sums of Absolute Differencempsadbw3mpsadbw3/)https://www.felixcloutier.com/x86/mpsadbwINTINTintCall to Interrupt ProcedureintINT VREDUCESH VREDUCESH vreduceshPPerform Reduction Transformation on a Scalar Half-Precision Floating-Point Value vreduceshR vreducesh$R vreduceshR vreducesh$R vreduceshRR vreduceshRR+https://www.felixcloutier.com/x86/vreducesh VRSQRT14SD VRSQRT14SD vrsqrt14sdaCompute Approximate Reciprocal of a Square Root of a Scalar Double-Precision Floating-Point Value vrsqrt14sdH vrsqrt14sd+H vrsqrt14sdH vrsqrt14sd+H,https://www.felixcloutier.com/x86/vrsqrt14sdRDPRURDPRUrdpru$Read Processor Register in User moderdpru. VCVTUDQ2PS VCVTUDQ2PS vcvtudq2ps\Convert Packed Unsigned Doubleword Integers to Packed Single-Precision Floating-Point Values vcvtudq2ps9H vcvtudq2ps:H vcvtudq2ps;H vcvtudq2psH vcvtudq2psH vcvtudq2psH vcvtudq2ps9H vcvtudq2psH vcvtudq2ps:H vcvtudq2psH vcvtudq2ps;H vcvtudq2psH vcvtudq2psQH vcvtudq2psQH,https://www.felixcloutier.com/x86/vcvtudq2ps VCVTNEEPH2PS VCVTNEEPH2PS vcvtneeph2ps:Convert Even Elements of Packed FP16 Values to FP32 Values vcvtneeph2ps/Z vcvtneeph2ps2Z VRNDSCALEPH VRNDSCALEPH vrndscaleph\Round Packed Half-Precision Floating-Point Values To Include A Given Number Of Fraction Bits vrndscaleph<K vrndscaleph>K vrndscaleph@R vrndscalephK vrndscalephK vrndscalephR vrndscaleph<K vrndscalephK vrndscaleph>K vrndscalephK vrndscaleph@R vrndscalephR vrndscalephRR vrndscalephRR-https://www.felixcloutier.com/x86/vrndscaleph VFNMADD132PS VFNMADD132PS vfnmadd132psLFused Negative Multiply-Add of Packed Single-Precision Floating-Point Values vfnmadd132ps9H vfnmadd132psH vfnmadd132ps:H vfnmadd132psH vfnmadd132ps;H vfnmadd132psH vfnmadd132ps9H vfnmadd132ps4# vfnmadd132psH vfnmadd132ps4/# vfnmadd132ps:H vfnmadd132ps4# vfnmadd132psH vfnmadd132ps42# vfnmadd132ps;H vfnmadd132psH vfnmadd132psQH vfnmadd132psQHHhttps://www.felixcloutier.com/x86/vfnmadd132ps:vfnmadd213ps:vfnmadd231psVMOVDQAVMOVDQAvmovdqaMove Aligned Double Quadwordvmovdqa4 vmovdqa4/ vmovdqa4 vmovdqa42 vmovdqa4/ vmovdqa42 VFMADDSUB213PDVFMADDSUB213PDvfmaddsub213pdXFused Multiply-Alternating Add/Subtract of Packed Double-Precision Floating-Point Valuesvfmaddsub213pd=Hvfmaddsub213pdHvfmaddsub213pd?Hvfmaddsub213pdHvfmaddsub213pdAHvfmaddsub213pdHvfmaddsub213pd=Hvfmaddsub213pd4#vfmaddsub213pdHvfmaddsub213pd4/#vfmaddsub213pd?Hvfmaddsub213pd4#vfmaddsub213pdHvfmaddsub213pd42#vfmaddsub213pdAHvfmaddsub213pdHvfmaddsub213pdQHvfmaddsub213pdQHNhttps://www.felixcloutier.com/x86/vfmaddsub132pd:vfmaddsub213pd:vfmaddsub231pdMOVBEMOVBEmovbeMove Data After Swapping Bytesmovbew3 $/movbel3'/movbeq3+/movbew3$ /movbel3'/movbeq3+/'https://www.felixcloutier.com/x86/movbeJNLJNLjnlJump if not less (SF == OF)jnlJGE3NjnlJGE3OVCVTPH2WVCVTPH2Wvcvtph2wQConvert Packed Half-Precision Floating-Point Values to Packed Word Integer Valuesvcvtph2w<Kvcvtph2w>Kvcvtph2w@Rvcvtph2wKvcvtph2wKvcvtph2wRvcvtph2w<Kvcvtph2wKvcvtph2w>Kvcvtph2wKvcvtph2w@Rvcvtph2wRvcvtph2wQRvcvtph2wQR*https://www.felixcloutier.com/x86/vcvtph2wCMOVNGCMOVNGcmovng)Move if not greater (ZF == 1 or SF != OF)cmovngw3  cmovngw3 $cmovngl3cmovngl3'cmovngq3cmovngq3+ CMPBEXADD CMPBEXADD cmpbexadd#Compare for Below or Equals and Add cmpbexadd' cmpbexadd+ VMOVMSKPS VMOVMSKPS vmovmskps8Extract Packed Single-Precision Floating-Point Sign Mask vmovmskps4  vmovmskps4 MULXMULXmulx)Unsigned Multiply Without Affecting Flagsmulxl5mulxl'5mulxq5mulxq+5&https://www.felixcloutier.com/x86/mulxRCRRCRrcrRotate Right through Carry FlagrcrbRCRB3 rcrbRCRB3 rcrbRCRB3 rcrwRCRW3 rcrwRCRW3 rcrwRCRW3 rcrlRCRL3rcrlRCRL3rcrlRCRL3rcrqRCRQ3rcrqRCRQ3rcrqRCRQ3rcrbRCRB3#rcrbRCRB3#rcrbRCRB3#rcrwRCRW3$rcrwRCRW3$rcrwRCRW3$rcrlRCRL3'rcrlRCRL3'rcrlRCRL3'rcrqRCRQ3+rcrqRCRQ3+rcrqRCRQ3+1https://www.felixcloutier.com/x86/rcl:rcr:rol:ror VMOVDQU64 VMOVDQU64 vmovdqu64Move Unaligned Quadword Values vmovdqu640H vmovdqu64H vmovdqu643H vmovdqu64H vmovdqu646H vmovdqu64H vmovdqu64/H vmovdqu642H vmovdqu645H vmovdqu64H vmovdqu64/H vmovdqu64H vmovdqu642H vmovdqu64H vmovdqu645H vmovdqu64/H vmovdqu642H vmovdqu645HOhttps://www.felixcloutier.com/x86/movdqu:vmovdqu8:vmovdqu16:vmovdqu32:vmovdqu64VPCMPEQQVPCMPEQQvpcmpeqq)Compare Packed Quadword Data for Equalityvpcmpeqq=Hvpcmpeqq=HvpcmpeqqHvpcmpeqqHvpcmpeqq?Hvpcmpeqq?HvpcmpeqqHvpcmpeqqHvpcmpeqqAHvpcmpeqqAHvpcmpeqqHvpcmpeqqHvpcmpeqq4 vpcmpeqq4/ vpcmpeqq4!vpcmpeqq42! VPMADD52LUQ VPMADD52LUQ vpmadd52luqdPacked Multiply of Unsigned 52-bit Integers and Add the Low 52-bit Products to Quadword Accumulators vpmadd52luq=K vpmadd52luqK vpmadd52luq?K vpmadd52luqK vpmadd52luqAO vpmadd52luqO vpmadd52luq=K vpmadd52luqK vpmadd52luq[ vpmadd52luq/[ vpmadd52luq?K vpmadd52luqK vpmadd52luq[ vpmadd52luq2[ vpmadd52luqAO vpmadd52luqO-https://www.felixcloutier.com/x86/vpmadd52luqVPMOVW2MVPMOVW2Mvpmovw2m3Move Signs of Packed Word Integers to Mask Registervpmovw2mIvpmovw2mIvpmovw2mIEhttps://www.felixcloutier.com/x86/vpmovb2m:vpmovw2m:vpmovd2m:vpmovq2mVPMULUDQVPMULUDQvpmuludq,Multiply Packed Unsigned Doubleword Integersvpmuludq=HvpmuludqHvpmuludq?HvpmuludqHvpmuludqAHvpmuludqHvpmuludq=Hvpmuludq4 vpmuludqHvpmuludq4/ vpmuludq?Hvpmuludq4!vpmuludqHvpmuludq42!vpmuludqAHvpmuludqHVPMINUWVPMINUWvpminuw(Minimum of Packed Unsigned Word IntegersvpminuwIvpminuw/IvpminuwIvpminuw2IvpminuwIvpminuw5Ivpminuw4 vpminuwIvpminuw4/ vpminuw/Ivpminuw4!vpminuwIvpminuw42!vpminuw2IvpminuwIvpminuw5I VFNMADDSD VFNMADDSD vfnmaddsdLFused Negative Multiply-Add of Scalar Double-Precision Floating-Point Values vfnmaddsd$ vfnmaddsd+$ vfnmaddsd+$VBROADCASTI128VBROADCASTI128vbroadcasti128"Broadcast 128 Bits of Integer Datavbroadcasti1284/!JBEJBEjbe+Jump if below or equal (CF == 1 or ZF == 1)jbeJLS3NjbeJLS3OCVTDQ2PSCVTDQ2PScvtdq2psBConvert Packed Dword Integers to Packed Single-Precision FP Valuescvtdq2ps3cvtdq2ps3/*https://www.felixcloutier.com/x86/cvtdq2psVHADDPDVHADDPDvhaddpdPacked Double-FP Horizontal Addvhaddpd4 vhaddpd4/ vhaddpd4 vhaddpd42 VBROADCASTI32X8VBROADCASTI32X8vbroadcasti32x8#Broadcast Eight Doubleword Elementsvbroadcasti32x82Jvbroadcasti32x82J VFNMADD231SS VFNMADD231SS vfnmadd231ssLFused Negative Multiply-Add of Scalar Single-Precision Floating-Point Values vfnmadd231ssH vfnmadd231ss'H vfnmadd231ss4# vfnmadd231ssH vfnmadd231ss4'# vfnmadd231ss'H vfnmadd231ssQH vfnmadd231ssQHHhttps://www.felixcloutier.com/x86/vfnmadd132ss:vfnmadd213ss:vfnmadd231ssXCHGXCHGxchg&Exchange Register/Memory with RegisterxchgbXCHGB3  xchgbXCHGB3 #xchgwXCHGW3  xchgwXCHGW3  xchgwXCHGW3  xchgwXCHGW3 $xchglXCHGL3xchglXCHGL3xchglXCHGL3xchglXCHGL3'xchgqXCHGQ3xchgqXCHGQ3xchgqXCHGQ3xchgqXCHGQ3+xchgbXCHGB3# xchgwXCHGW3$ xchglXCHGL3'xchgqXCHGQ3+&https://www.felixcloutier.com/x86/xchgVFMADDSDVFMADDSDvfmaddsdCFused Multiply-Add of Scalar Double-Precision Floating-Point Valuesvfmaddsd$vfmaddsd+$vfmaddsd+$VBROADCASTF128VBROADCASTF128vbroadcastf128(Broadcast 128 Bit of Floating-Point Datavbroadcastf1284/ VMOVUPDVMOVUPDvmovupd<Move Unaligned Packed Double-Precision Floating-Point Valuesvmovupd0HvmovupdHvmovupd3HvmovupdHvmovupd6HvmovupdHvmovupd/Hvmovupd2Hvmovupd5Hvmovupd4 vmovupdHvmovupd4/ vmovupd/Hvmovupd4 vmovupdHvmovupd42 vmovupd2HvmovupdHvmovupd5Hvmovupd4/ vmovupd/Hvmovupd42 vmovupd2Hvmovupd5HDPPSDPPSdpps<Dot Product of Packed Single Precision Floating-Point Valuesdpps3dpps3/&https://www.felixcloutier.com/x86/dppsVPADDDVPADDDvpadddAdd Packed Doubleword Integersvpaddd9HvpadddHvpaddd:HvpadddHvpaddd;HvpadddHvpaddd9Hvpaddd4 vpadddHvpaddd4/ vpaddd:Hvpaddd4!vpadddHvpaddd42!vpaddd;HvpadddHPREFETCHPREFETCHprefetchPrefetch Data into Cachesprefetch#@ VBROADCASTSD VBROADCASTSD vbroadcastsd1Broadcast Double-Precision Floating-Point Element  vbroadcastsdH vbroadcastsdH vbroadcastsd+H vbroadcastsd+H vbroadcastsd4! vbroadcastsdH vbroadcastsd4+  vbroadcastsd+H vbroadcastsdH vbroadcastsd+H VFMADD231PD VFMADD231PD vfmadd231pdCFused Multiply-Add of Packed Double-Precision Floating-Point Values vfmadd231pd=H vfmadd231pdH vfmadd231pd?H vfmadd231pdH vfmadd231pdAH vfmadd231pdH vfmadd231pd=H vfmadd231pd4# vfmadd231pdH vfmadd231pd4/# vfmadd231pd?H vfmadd231pd4# vfmadd231pdH vfmadd231pd42# vfmadd231pdAH vfmadd231pdH vfmadd231pdQH vfmadd231pdQHEhttps://www.felixcloutier.com/x86/vfmadd132pd:vfmadd213pd:vfmadd231pdEMMSEMMSemmsExit MMX StateemmsEMMS3 &https://www.felixcloutier.com/x86/emmsMASKMOVQMASKMOVQmaskmovq Store Selected Bytes of QuadwordmaskmovqMASKMOVQ *https://www.felixcloutier.com/x86/maskmovq CMPLEXADD CMPLEXADD cmplexadd"Compare for Less or Equals and Add cmplexadd' cmplexadd+ CMPNLXADD CMPNLXADD cmpnlxaddCompare for Not Less and Add cmpnlxadd' cmpnlxadd+VMOVDQUVMOVDQUvmovdquMove Unaligned Double Quadwordvmovdqu4 vmovdqu4/ vmovdqu4 vmovdqu42 vmovdqu4/ vmovdqu42 VPCMPEQBVPCMPEQBvpcmpeqb%Compare Packed Byte Data for EqualityvpcmpeqbIvpcmpeqbIvpcmpeqb/Ivpcmpeqb/IvpcmpeqbIvpcmpeqbIvpcmpeqb2Ivpcmpeqb2IvpcmpeqbIvpcmpeqbIvpcmpeqb5Ivpcmpeqb5Ivpcmpeqb4 vpcmpeqb4/ vpcmpeqb4!vpcmpeqb42! VPDPWUSDS VPDPWUSDS vpdpwusdsXPacked Dot Product of Unsigned-by-Signed Word subvectors into Doubleword with Saturation vpdpwusdsY vpdpwusds/Y vpdpwusdsY vpdpwusds2Y VPDPWUUDS VPDPWUUDS vpdpwuudsZPacked Dot Product of Unsigned-by-Unsigned Word subvectors into Doubleword with Saturation vpdpwuudsY vpdpwuuds/Y vpdpwuudsY vpdpwuuds2Y VPGATHERQD VPGATHERQD vpgatherqd=Gather Packed Doubleword Values Using Signed Quadword Indices vpgatherqdDH vpgatherqdHH vpgatherqdLH vpgatherqdD! vpgatherqdH!7https://www.felixcloutier.com/x86/vpgatherqd:vpgatherqqVPHADDWVPHADDWvphaddw#Packed Horizontal Add Word Integersvphaddw4 vphaddw4/ vphaddw4!vphaddw42!VPMOVM2BVPMOVM2Bvpmovm2b4Expand Bits of Mask Register to Packed Byte Integersvpmovm2bIvpmovm2bIvpmovm2bIEhttps://www.felixcloutier.com/x86/vpmovm2b:vpmovm2w:vpmovm2d:vpmovm2q VPBLENDMW VPBLENDMW vpblendmw*Blend Word Vectors Using an OpMask Control  vpblendmwI vpblendmw/I vpblendmwI vpblendmw2I vpblendmwI vpblendmw5I vpblendmwI vpblendmw/I vpblendmwI vpblendmw2I vpblendmwI vpblendmw5I5https://www.felixcloutier.com/x86/vpblendmb:vpblendmw VPUNPCKHBW VPUNPCKHBW vpunpckhbw1Unpack and Interleave High-Order Bytes into Words vpunpckhbwI vpunpckhbw/I vpunpckhbwI vpunpckhbw2I vpunpckhbwI vpunpckhbw5I vpunpckhbw4  vpunpckhbwI vpunpckhbw4/  vpunpckhbw/I vpunpckhbw4! vpunpckhbwI vpunpckhbw42! vpunpckhbw2I vpunpckhbwI vpunpckhbw5IVSQRTPDVSQRTPDvsqrtpdECompute Square Roots of Packed Double-Precision Floating-Point Valuesvsqrtpd9Hvsqrtpd:HvsqrtpdAHvsqrtpdHvsqrtpdHvsqrtpdHvsqrtpd9Hvsqrtpd4 vsqrtpdHvsqrtpd4/ vsqrtpd:Hvsqrtpd4 vsqrtpdHvsqrtpd42 vsqrtpdAHvsqrtpdHvsqrtpdQHvsqrtpdQHGF2P8AFFINEINVQBGF2P8AFFINEINVQBgf2p8affineinvqb0Galois Field (2^8) Affine Inverse Transformationgf2p8affineinvqbgf2p8affineinvqb/2https://www.felixcloutier.com/x86/gf2p8affineinvqbKUNPCKBWKUNPCKBWkunpckbw!Unpack and Interleave 8-bit MaskskunpckbwH<https://www.felixcloutier.com/x86/kunpckbw:kunpckwd:kunpckdq VPDPBSUDS VPDPBSUDS vpdpbsudsXPacked Dot Product of Signed-by-Unsinged Byte subvectors into Doubleword with Saturation vpdpbsudsX vpdpbsuds/X vpdpbsudsX vpdpbsuds2XVMINPSVMINPSvminps<Return Minimum Packed Single-Precision Floating-Point Valuesvminps9HvminpsHvminps:HvminpsHvminps;HvminpsHvminps9Hvminps4 vminpsHvminps4/ vminps:Hvminps4 vminpsHvminps42 vminps;HvminpsHvminpsRHvminpsRH VCVTQQ2PH VCVTQQ2PH vcvtqq2phOConvert Packed Quadword Integers to Packed Half-Precision Floating-Point Values vcvtqq2phx=K vcvtqq2phy?K vcvtqq2phzAR vcvtqq2phxK vcvtqq2phyK vcvtqq2phzR vcvtqq2phx=K vcvtqq2phy?K vcvtqq2phzAR vcvtqq2phxK vcvtqq2phyK vcvtqq2phzR vcvtqq2phzQR vcvtqq2phzQR+https://www.felixcloutier.com/x86/vcvtqq2ph VCVTTSS2USI VCVTTSS2USI vcvttss2usiXConvert with Truncation Scalar Single-Precision Floating-Point Value to Unsigned Integer vcvttss2usiH vcvttss2usi'H vcvttss2usiH vcvttss2usi'H vcvttss2usiRH vcvttss2usiRH-https://www.felixcloutier.com/x86/vcvttss2usiJGEJGEjge#Jump if greater or equal (SF == OF)jgeJGE3NjgeJGE3OPORPORporPacked Bitwise Logical ORporPOR3 porPOR3+ porPOR3porPOR3/%https://www.felixcloutier.com/x86/porCMOVBECMOVBEcmovbe+Move if below or equal (CF == 1 or ZF == 1)cmovbew3  cmovbew3 $cmovbel3cmovbel3'cmovbeq3cmovbeq3+PCMPEQDPCMPEQDpcmpeqd+Compare Packed Doubleword Data for EqualitypcmpeqdPCMPEQL3 pcmpeqdPCMPEQL3+ pcmpeqdPCMPEQL3pcmpeqdPCMPEQL3/9https://www.felixcloutier.com/x86/pcmpeqb:pcmpeqw:pcmpeqdCMPSDCMPSDcmpsd5Compare Scalar Double-Precision Floating-Point ValuescmpsdCMPSD3cmpsdCMPSD3+'https://www.felixcloutier.com/x86/cmpsd VCVTSS2SH VCVTSS2SH vcvtss2shJConvert Scalar Single-Precision FP Value to Scalar Half-Precision FP Value vcvtss2shR vcvtss2sh'R vcvtss2shR vcvtss2sh'R vcvtss2shQR vcvtss2shQR+https://www.felixcloutier.com/x86/vcvtss2shVPADDSBVPADDSBvpaddsb6Add Packed Signed Byte Integers with Signed SaturationvpaddsbIvpaddsb/IvpaddsbIvpaddsb2IvpaddsbIvpaddsb5Ivpaddsb4 vpaddsbIvpaddsb4/ vpaddsb/Ivpaddsb4!vpaddsbIvpaddsb42!vpaddsb2IvpaddsbIvpaddsb5ICVTPD2PICVTPD2PIcvtpd2piBConvert Packed Double-Precision FP Values to Packed Dword Integerscvtpd2piCVTPD2PL3cvtpd2piCVTPD2PL3/*https://www.felixcloutier.com/x86/cvtpd2piVPADDQVPADDQvpaddqAdd Packed Quadword Integersvpaddq=HvpaddqHvpaddq?HvpaddqHvpaddqAHvpaddqHvpaddq=Hvpaddq4 vpaddqHvpaddq4/ vpaddq?Hvpaddq4!vpaddqHvpaddq42!vpaddqAHvpaddqH VGETMANTPD VGETMANTPD vgetmantpdOExtract Normalized Mantissas from Packed Double-Precision Floating-Point Values vgetmantpd=H vgetmantpd?H vgetmantpdAH vgetmantpdH vgetmantpdH vgetmantpdH vgetmantpd=H vgetmantpdH vgetmantpd?H vgetmantpdH vgetmantpdAH vgetmantpdH vgetmantpdRH vgetmantpdRH,https://www.felixcloutier.com/x86/vgetmantpdSETNGESETNGEsetnge+Set byte if not greater or equal (SF != OF)setngeSETLT3 setngeSETLT3#VPSHRDDVPSHRDDvpshrdd:Concatenate and Shift Packed Doubleword Data Right Logical vpshrdd9KvpshrddKvpshrdd:KvpshrddKvpshrdd;UvpshrddUvpshrdd9KvpshrddKvpshrdd:KvpshrddKvpshrdd;UvpshrddUBLENDVPDBLENDVPDblendvpd= Variable Blend Packed Double Precision Floating-Point Valuesblendvpd3blendvpd3/*https://www.felixcloutier.com/x86/blendvpdSETNLESETNLEsetnle4Set byte if not less or equal (ZF == 0 and SF == OF)setnleSETGT3 setnleSETGT3#SQRTSSSQRTSSsqrtssCCompute Square Root of Scalar Single-Precision Floating-Point ValuesqrtssSQRTSS3sqrtssSQRTSS3'(https://www.felixcloutier.com/x86/sqrtss VCVTTPD2UDQ VCVTTPD2UDQ vcvttpd2udqlConvert with Truncation Packed Double-Precision Floating-Point Values to Packed Unsigned Doubleword Integers vcvttpd2udqx=H vcvttpd2udqy?H vcvttpd2udqAH vcvttpd2udqxH vcvttpd2udqyH vcvttpd2udqH vcvttpd2udqx=H vcvttpd2udqy?H vcvttpd2udqxH vcvttpd2udqyH vcvttpd2udqAH vcvttpd2udqH vcvttpd2udqRH vcvttpd2udqRH-https://www.felixcloutier.com/x86/vcvttpd2udqDIVPSDIVPSdivps4Divide Packed Single-Precision Floating-Point ValuesdivpsDIVPS3divpsDIVPS3/'https://www.felixcloutier.com/x86/divpsCMPCMPcmpCompare Two OperandscmpbCMPB3cmpbCMPB3 cmpbCMPB3  cmpbCMPB3 #cmpwCMPW3 cmpwCMPW3 cmpwCMPW3 cmpwCMPW3  cmpwCMPW3 $cmplCMPL3cmplCMPL3cmplCMPL3cmplCMPL3cmplCMPL3'cmpqCMPQ3cmpqCMPQ3cmpqCMPQ3cmpqCMPQ3cmpqCMPQ3+cmpbCMPB3#cmpbCMPB3# cmpwCMPW3$cmpwCMPW3$cmpwCMPW3$ cmplCMPL3'cmplCMPL3'cmplCMPL3'cmpqCMPQ3+cmpqCMPQ3+cmpqCMPQ3+%https://www.felixcloutier.com/x86/cmpMOVSXMOVSXmovsxMove with Sign-Extension movsbwMOVBWSX3  movsbwMOVBWSX3 #movsblMOVBLSX3 movswlMOVWLSX3 movsblMOVBLSX3#movswlMOVWLSX3$movsbqMOVBQSX3 movswqMOVWQSX3 movsbqMOVBQSX3#movswqMOVWQSX3$.https://www.felixcloutier.com/x86/movsx:movsxd VCVTTPH2W VCVTTPH2W vcvttph2waConvert with Truncation Packed Half-Precision Floating-Point Values to Packed Word Integer Values vcvttph2w<K vcvttph2w>K vcvttph2w@R vcvttph2wK vcvttph2wK vcvttph2wR vcvttph2w<K vcvttph2wK vcvttph2w>K vcvttph2wK vcvttph2w@R vcvttph2wR vcvttph2wRR vcvttph2wRR+https://www.felixcloutier.com/x86/vcvttph2wVMAXPSVMAXPSvmaxps<Return Maximum Packed Single-Precision Floating-Point Valuesvmaxps9HvmaxpsHvmaxps:HvmaxpsHvmaxps;HvmaxpsHvmaxps9Hvmaxps4 vmaxpsHvmaxps4/ vmaxps:Hvmaxps4 vmaxpsHvmaxps42 vmaxps;HvmaxpsHvmaxpsRHvmaxpsRHIMULIMULimulSigned MultiplyimulbIMULB3 imulwIMULW3 imullIMULL3imulqIMULQ3imulbIMULB3#imulwIMULW3$imullIMULL3'imulqIMULQ3+imulwIMULW3  imulwIMULW3 $imullIMULL3imullIMULL3'imulqIMULQ3imulqIMULQ3+imulw3  imulw3  imulw3 $imulw3 $imull3imull3imull3'imull3'imulqIMUL3Q3imulqIMUL3Q3imulqIMUL3Q3+imulqIMUL3Q3+&https://www.felixcloutier.com/x86/imulCMPPXADDCMPPXADDcmppxaddCompare for Parity and Addcmppxadd'cmppxadd+KUNPCKDQKUNPCKDQkunpckdq"Unpack and Interleave 32-bit MaskskunpckdqI<https://www.felixcloutier.com/x86/kunpckbw:kunpckwd:kunpckdqPEXTRDPEXTRDpextrdExtract Doublewordpextrd3pextrd3'6https://www.felixcloutier.com/x86/pextrb:pextrd:pextrq VCVTSH2USI VCVTSH2USI vcvtsh2usiQConvert Scalar Half-Precision Floating-Point Value to Unsigned Doubleword Integer vcvtsh2usiR vcvtsh2usi$R vcvtsh2usiR vcvtsh2usi$R vcvtsh2usiQR vcvtsh2usiQR,https://www.felixcloutier.com/x86/vcvtsh2usiMOVNTDQMOVNTDQmovntdq-Store Double Quadword Using Non-Temporal HintmovntdqMOVNTO3/)https://www.felixcloutier.com/x86/movntdq VFNMSUB213PD VFNMSUB213PD vfnmsub213pdQFused Negative Multiply-Subtract of Packed Double-Precision Floating-Point Values vfnmsub213pd=H vfnmsub213pdH vfnmsub213pd?H vfnmsub213pdH vfnmsub213pdAH vfnmsub213pdH vfnmsub213pd=H vfnmsub213pd4# vfnmsub213pdH vfnmsub213pd4/# vfnmsub213pd?H vfnmsub213pd4# vfnmsub213pdH vfnmsub213pd42# vfnmsub213pdAH vfnmsub213pdH vfnmsub213pdQH vfnmsub213pdQHHhttps://www.felixcloutier.com/x86/vfnmsub132pd:vfnmsub213pd:vfnmsub231pdKXORWKXORWkxorw Bitwise Logical XOR 16-bit MaskskxorwH9https://www.felixcloutier.com/x86/kxorw:kxorb:kxorq:kxordSALSALsalArithmetic Shift LeftsalbSALB3 salbSALB3 salbSALB3 salwSALW3 salwSALW3 salwSALW3 sallSALL3sallSALL3sallSALL3salqSALQ3salqSALQ3salqSALQ3salbSALB3#salbSALB3#salbSALB3#salwSALW3$salwSALW3$salwSALW3$sallSALL3'sallSALL3'sallSALL3'salqSALQ3+salqSALQ3+salqSALQ3+1https://www.felixcloutier.com/x86/sal:sar:shl:shrRORRORror Rotate RightrorbRORB3 rorbRORB3 rorbRORB3 rorwRORW3 rorwRORW3 rorwRORW3 rorlRORL3rorlRORL3rorlRORL3rorqRORQ3rorqRORQ3rorqRORQ3rorbRORB3#rorbRORB3#rorbRORB3#rorwRORW3$rorwRORW3$rorwRORW3$rorlRORL3'rorlRORL3'rorlRORL3'rorqRORQ3+rorqRORQ3+rorqRORQ3+1https://www.felixcloutier.com/x86/rcl:rcr:rol:ror VCVTSH2SI VCVTSH2SI vcvtsh2si7Convert Scalar Half-Precision FP Value to Dword Integer vcvtsh2siR vcvtsh2si$R vcvtsh2siR vcvtsh2si$R vcvtsh2siQR vcvtsh2siQR+https://www.felixcloutier.com/x86/vcvtsh2siCMOVNLECMOVNLEcmovnle0Move if not less or equal (ZF == 0 and SF == OF)cmovnlew3  cmovnlew3 $cmovnlel3cmovnlel3'cmovnleq3cmovnleq3+ADCXADCXadcx9Unsigned Integer Addition of Two Operands with Carry Flagadcxl7adcxl'7adcxq7adcxq+7&https://www.felixcloutier.com/x86/adcxORPSORPSorps<Bitwise Logical OR of Single-Precision Floating-Point ValuesorpsORPS3orpsORPS3/&https://www.felixcloutier.com/x86/orpsVPCMPGTWVPCMPGTWvpcmpgtw4Compare Packed Signed Word Integers for Greater ThanvpcmpgtwIvpcmpgtwIvpcmpgtw/Ivpcmpgtw/IvpcmpgtwIvpcmpgtwIvpcmpgtw2Ivpcmpgtw2IvpcmpgtwIvpcmpgtwIvpcmpgtw5Ivpcmpgtw5Ivpcmpgtw4 vpcmpgtw4/ vpcmpgtw4!vpcmpgtw42!VPROTBVPROTBvprotbPacked Rotate Bytesvprotb"vprotb"vprotb/"vprotb/"vprotb/" VPTESTNMW VPTESTNMW vptestnmw7Logical NAND of Packed Word Integer Values and Set Mask  vptestnmwI vptestnmwI vptestnmw/I vptestnmw/I vptestnmwI vptestnmwI vptestnmw2I vptestnmw2I vptestnmwI vptestnmwI vptestnmw5I vptestnmw5IIhttps://www.felixcloutier.com/x86/vptestnmb:vptestnmw:vptestnmd:vptestnmqMAXPSMAXPSmaxps<Return Maximum Packed Single-Precision Floating-Point ValuesmaxpsMAXPS3maxpsMAXPS3/'https://www.felixcloutier.com/x86/maxpsTPAUSETPAUSEtpause Timed PAUSEtpauseG(https://www.felixcloutier.com/x86/tpause VEXTRACTF128 VEXTRACTF128 vextractf128$Extract Packed Floating-Point Values vextractf1284  vextractf1284/ fhttps://www.felixcloutier.com/x86/vextractf128:vextractf32x4:vextractf64x2:vextractf32x8:vextractf64x4 VPMOVSXWQ VPMOVSXWQ vpmovsxwqBMove Packed Word Integers to Quadword Integers with Sign Extension vpmovsxwqH vpmovsxwqH vpmovsxwqH vpmovsxwq'H vpmovsxwq+H vpmovsxwq/H vpmovsxwq4  vpmovsxwqH vpmovsxwq4'  vpmovsxwq'H vpmovsxwq4! vpmovsxwqH vpmovsxwq4+! vpmovsxwq+H vpmovsxwqH vpmovsxwq/HSETLSETLsetlSet byte if less (SF != OF)setlSETLT3 setlSETLT3#VAESDECVAESDECvaesdec+Perform One Round of an AES Decryption Flow vaesdec vaesdecKvaesdec/ vaesdec/KvaesdecvaesdecKvaesdec2vaesdec2KvaesdecHvaesdec5H VFMSUB213PD VFMSUB213PD vfmsub213pdHFused Multiply-Subtract of Packed Double-Precision Floating-Point Values vfmsub213pd=H vfmsub213pdH vfmsub213pd?H vfmsub213pdH vfmsub213pdAH vfmsub213pdH vfmsub213pd=H vfmsub213pd4# vfmsub213pdH vfmsub213pd4/# vfmsub213pd?H vfmsub213pd4# vfmsub213pdH vfmsub213pd42# vfmsub213pdAH vfmsub213pdH vfmsub213pdQH vfmsub213pdQHEhttps://www.felixcloutier.com/x86/vfmsub132pd:vfmsub213pd:vfmsub231pdVPEXTRBVPEXTRBvpextrb Extract Bytevpextrb4 vpextrbIvpextrb4# vpextrb#IVPORVPORvporPacked Bitwise Logical ORvpor4 vpor4/ vpor4!vpor42! VINSERTPS VINSERTPS vinsertps3Insert Packed Single Precision Floating-Point Value vinsertps4  vinsertpsH vinsertps4'  vinsertps'HPMINSDPMINSDpminsd,Minimum of Packed Signed Doubleword Integerspminsd3pminsd3//https://www.felixcloutier.com/x86/pminsd:pminsqVPMAXUQVPMAXUQvpmaxuq,Maximum of Packed Unsigned Quadword Integers vpmaxuq=HvpmaxuqHvpmaxuq?HvpmaxuqHvpmaxuqAHvpmaxuqHvpmaxuq=HvpmaxuqHvpmaxuq?HvpmaxuqHvpmaxuqAHvpmaxuqHPEXTRQPEXTRQpextrqExtract Quadwordpextrq3pextrq3+6https://www.felixcloutier.com/x86/pextrb:pextrd:pextrqVPSUBUSWVPSUBUSWvpsubusw?Subtract Packed Unsigned Word Integers with Unsigned SaturationvpsubuswIvpsubusw/IvpsubuswIvpsubusw2IvpsubuswIvpsubusw5Ivpsubusw4 vpsubuswIvpsubusw4/ vpsubusw/Ivpsubusw4!vpsubuswIvpsubusw42!vpsubusw2IvpsubuswIvpsubusw5IVMULPHVMULPHvmulph4Multiply Packed Half-Precision Floating-Point Valuesvmulph<KvmulphKvmulph>KvmulphKvmulph@RvmulphRvmulph<KvmulphKvmulph>KvmulphKvmulph@RvmulphRvmulphQRvmulphQR(https://www.felixcloutier.com/x86/vmulphRDPMCRDPMCrdpmc#Read Performance-Monitoring CounterrdpmcRDPMC-'https://www.felixcloutier.com/x86/rdpmc VCVTUQQ2PS VCVTUQQ2PS vcvtuqq2psZConvert Packed Unsigned Quadword Integers to Packed Single-Precision Floating-Point Values vcvtuqq2psx=J vcvtuqq2psy?J vcvtuqq2psAJ vcvtuqq2psxJ vcvtuqq2psyJ vcvtuqq2psJ vcvtuqq2psx=J vcvtuqq2psy?J vcvtuqq2psxJ vcvtuqq2psyJ vcvtuqq2psAJ vcvtuqq2psJ vcvtuqq2psQJ vcvtuqq2psQJ,https://www.felixcloutier.com/x86/vcvtuqq2ps VPMASKMOVQ VPMASKMOVQ vpmaskmovq)Conditional Move Packed Quadword Integers vpmaskmovq4/! vpmaskmovq42! vpmaskmovq4/! vpmaskmovq42! VPUNPCKHQDQ VPUNPCKHQDQ vpunpckhqdq@Unpack and Interleave High-Order Quadwords into Double Quadwords vpunpckhqdq=H vpunpckhqdqH vpunpckhqdq?H vpunpckhqdqH vpunpckhqdqAH vpunpckhqdqH vpunpckhqdq=H vpunpckhqdq4  vpunpckhqdqH vpunpckhqdq4/  vpunpckhqdq?H vpunpckhqdq4! vpunpckhqdqH vpunpckhqdq42! vpunpckhqdqAH vpunpckhqdqHVRCP28SDVRCP28SDvrcp28sduApproximation to the Reciprocal of a Scalar Double-Precision Floating-Point Value with Less Than 2^-28 Relative Errorvrcp28sdMvrcp28sd+Mvrcp28sdMvrcp28sd+Mvrcp28sdRMvrcp28sdRM*https://www.felixcloutier.com/x86/vrcp28sd VCVTPD2PH VCVTPD2PH vcvtpd2phLConvert Packed Double-Precision FP Values to Packed Half-Precision FP Values vcvtpd2phx=K vcvtpd2phy?K vcvtpd2phzAR vcvtpd2phxK vcvtpd2phyK vcvtpd2phzR vcvtpd2phx=K vcvtpd2phy?K vcvtpd2phzAR vcvtpd2phxK vcvtpd2phyK vcvtpd2phzR vcvtpd2phzQR vcvtpd2phzQR+https://www.felixcloutier.com/x86/vcvtpd2phMOVNTIMOVNTImovnti(Store Doubleword Using Non-Temporal HintmovntilMOVNTIL3'movntiqMOVNTIQ3+(https://www.felixcloutier.com/x86/movntiVHADDPSVHADDPSvhaddpsPacked Single-FP Horizontal Addvhaddps4 vhaddps4/ vhaddps4 vhaddps42  VPMOVUSQW VPMOVUSQW vpmovusqwKDown Convert Packed Quadword Values to Word Values with Unsigned Saturation  vpmovusqwH vpmovusqw(H vpmovusqwH vpmovusqw,H vpmovusqwH vpmovusqw0H vpmovusqwH vpmovusqwH vpmovusqwH vpmovusqw'H vpmovusqw+H vpmovusqw/H<https://www.felixcloutier.com/x86/vpmovqw:vpmovsqw:vpmovusqw VCVTUSI2SH VCVTUSI2SH vcvtusi2shFConvert Unsigned Integer to Scalar Half-Precision Floating-Point Value vcvtusi2shlR vcvtusi2shqR vcvtusi2shl'R vcvtusi2shq+R vcvtusi2shlQR vcvtusi2shqQR,https://www.felixcloutier.com/x86/vcvtusi2sh VPEXPANDD VPEXPANDD vpexpanddGLoad Sparse Packed Doubleword Integer Values from Dense Memory/Register  vpexpanddH vpexpanddH vpexpanddH vpexpandd/H vpexpandd2H vpexpandd5H vpexpanddH vpexpandd/H vpexpanddH vpexpandd2H vpexpanddH vpexpandd5H+https://www.felixcloutier.com/x86/vpexpanddVRSQRTPHVRSQRTPHvrsqrtphRCompute Reciprocals of Square Roots of Packed Half-Precision Floating-Point Values vrsqrtph<Kvrsqrtph>Kvrsqrtph@RvrsqrtphKvrsqrtphKvrsqrtphRvrsqrtph<KvrsqrtphKvrsqrtph>KvrsqrtphKvrsqrtph@RvrsqrtphR*https://www.felixcloutier.com/x86/vrsqrtphMOVDIRIMOVDIRImovdiriMOVe to DIRect store Integermovdiri'0movdiri+0)https://www.felixcloutier.com/x86/movdiri VPBLENDMB VPBLENDMB vpblendmb*Blend Byte Vectors Using an OpMask Control  vpblendmbI vpblendmb/I vpblendmbI vpblendmb2I vpblendmbI vpblendmb5I vpblendmbI vpblendmb/I vpblendmbI vpblendmb2I vpblendmbI vpblendmb5I5https://www.felixcloutier.com/x86/vpblendmb:vpblendmw VCVTUSI2SS VCVTUSI2SS vcvtusi2ssHConvert Unsigned Integer to Scalar Single-Precision Floating-Point Value vcvtusi2sslH vcvtusi2ssqH vcvtusi2ssl'H vcvtusi2ssq+H vcvtusi2sslQH vcvtusi2ssqQH,https://www.felixcloutier.com/x86/vcvtusi2ss VCVTSI2SD VCVTSI2SD vcvtsi2sd9Convert Dword Integer to Scalar Double-Precision FP Value  vcvtsi2sdl4  vcvtsi2sdlH vcvtsi2sdq4  vcvtsi2sdqH vcvtsi2sdl4'  vcvtsi2sdl'H vcvtsi2sdq4+  vcvtsi2sdq+H vcvtsi2sdqQHMOVAPSMOVAPSmovaps:Move Aligned Packed Single-Precision Floating-Point ValuesmovapsMOVAPS3movapsMOVAPS3/movapsMOVAPS3/(https://www.felixcloutier.com/x86/movaps VGETEXPPD VGETEXPPD vgetexppdlExtract Exponents of Packed Double-Precision Floating-Point Values as Double-Precision Floating-Point Values vgetexppd=H vgetexppd?H vgetexppdAH vgetexppdH vgetexppdH vgetexppdH vgetexppd=H vgetexppdH vgetexppd?H vgetexppdH vgetexppdAH vgetexppdH vgetexppdRH vgetexppdRH+https://www.felixcloutier.com/x86/vgetexppdVPSHRDVQVPSHRDVQvpshrdvqAConcatenate and Variable Shift Packed Quadword Data Right Logical vpshrdvq=KvpshrdvqKvpshrdvq?KvpshrdvqKvpshrdvqAUvpshrdvqUvpshrdvq=KvpshrdvqKvpshrdvq?KvpshrdvqKvpshrdvqAUvpshrdvqUVRCP28SSVRCP28SSvrcp28ssuApproximation to the Reciprocal of a Scalar Single-Precision Floating-Point Value with Less Than 2^-28 Relative Errorvrcp28ssMvrcp28ss'Mvrcp28ssMvrcp28ss'Mvrcp28ssRMvrcp28ssRM*https://www.felixcloutier.com/x86/vrcp28ss VCVTDQ2PD VCVTDQ2PD vcvtdq2pdBConvert Packed Dword Integers to Packed Double-Precision FP Values vcvtdq2pd8H vcvtdq2pd9H vcvtdq2pd:H vcvtdq2pdH vcvtdq2pdH vcvtdq2pdH vcvtdq2pd8H vcvtdq2pd4  vcvtdq2pdH vcvtdq2pd4+  vcvtdq2pd9H vcvtdq2pd4  vcvtdq2pdH vcvtdq2pd4/  vcvtdq2pd:H vcvtdq2pdH VCVTTPH2UW VCVTTPH2UW vcvttph2uwjConvert with Truncation Packed Half-Precision Floating-Point Values to Packed Unsigned Word Integer Values vcvttph2uw<K vcvttph2uw>K vcvttph2uw@R vcvttph2uwK vcvttph2uwK vcvttph2uwR vcvttph2uw<K vcvttph2uwK vcvttph2uw>K vcvttph2uwK vcvttph2uw@R vcvttph2uwR vcvttph2uwRR vcvttph2uwRR,https://www.felixcloutier.com/x86/vcvttph2uwVRCPSHVRCPSHvrcpshMCompute Approximate Reciprocal of Scalar Half-Precision Floating-Point ValuesvrcpshRvrcpsh$RvrcpshRvrcpsh$R(https://www.felixcloutier.com/x86/vrcpshVSCATTERPF0DPDVSCATTERPF0DPDvscatterpf0dpd„Sparse Prefetch Packed Double-Precision Floating-Point Data Values with Signed Doubleword Indices Using T0 Hint with Intent to Writevscatterpf0dpdGL]https://www.felixcloutier.com/x86/vscatterpf0dps:vscatterpf0qps:vscatterpf0dpd:vscatterpf0qpdPDEPPDEPpdepParallel Bits Depositpdepl5pdepl'5pdepq5pdepq+5&https://www.felixcloutier.com/x86/pdep VMOVDQU32 VMOVDQU32 vmovdqu32 Move Unaligned Doubleword Values vmovdqu320H vmovdqu32H vmovdqu323H vmovdqu32H vmovdqu326H vmovdqu32H vmovdqu32/H vmovdqu322H vmovdqu325H vmovdqu32H vmovdqu32/H vmovdqu32H vmovdqu322H vmovdqu32H vmovdqu325H vmovdqu32/H vmovdqu322H vmovdqu325HOhttps://www.felixcloutier.com/x86/movdqu:vmovdqu8:vmovdqu16:vmovdqu32:vmovdqu64 VFNMSUBPS VFNMSUBPS vfnmsubpsQFused Negative Multiply-Subtract of Packed Single-Precision Floating-Point Values vfnmsubps$ vfnmsubps/$ vfnmsubps/$ vfnmsubps$ vfnmsubps2$ vfnmsubps2$CMOVNECMOVNEcmovneMove if not equal (ZF == 0)cmovnew3  cmovnew3 $cmovnel3cmovnel3'cmovneq3cmovneq3+KANDNQKANDNQkandnq$Bitwise Logical AND NOT 64-bit MaskskandnqI=https://www.felixcloutier.com/x86/kandnw:kandnb:kandnq:kandnd VFCMULCSH VFCMULCSH vfcmulcshOFused Conjugate Multiply of Complex Scalar Half-Precision Floating-Point Values vfcmulcshR vfcmulcsh'R vfcmulcshR vfcmulcsh'R vfcmulcshQR vfcmulcshQR4https://www.felixcloutier.com/x86/vfcmulcsh:vfmulcshVPABSBVPABSBvpabsb&Packed Absolute Value of Byte IntegersvpabsbIvpabsbIvpabsbIvpabsb/Ivpabsb2Ivpabsb5Ivpabsb4 vpabsbIvpabsb4/ vpabsb/Ivpabsb4!vpabsbIvpabsb42!vpabsb2IvpabsbIvpabsb5I VPERMILPD VPERMILPD vpermilpd.Permute Double-Precision Floating-Point Values  vpermilpd=H vpermilpd?H vpermilpdAH vpermilpd=H vpermilpdH vpermilpdH vpermilpd?H vpermilpdH vpermilpdH vpermilpdAH vpermilpdH vpermilpdH vpermilpd=H vpermilpd=H vpermilpd4  vpermilpdH vpermilpd4  vpermilpdH vpermilpd4/  vpermilpd4/  vpermilpd?H vpermilpd?H vpermilpd4  vpermilpdH vpermilpd4  vpermilpdH vpermilpd42  vpermilpd42  vpermilpdAH vpermilpdAH vpermilpdH vpermilpdH+https://www.felixcloutier.com/x86/vpermilpd VPGATHERDD VPGATHERDD vpgatherdd?Gather Packed Doubleword Values Using Signed Doubleword Indices vpgatherddBH vpgatherddFH vpgatherddJH vpgatherddB! vpgatherddF!7https://www.felixcloutier.com/x86/vpgatherdd:vpgatherdq PHMINPOSUW PHMINPOSUW phminposuw3Packed Horizontal Minimum of Unsigned Word Integers phminposuw3 phminposuw3/,https://www.felixcloutier.com/x86/phminposuwVSQRTSSVSQRTSSvsqrtssCCompute Square Root of Scalar Single-Precision Floating-Point ValuevsqrtssHvsqrtss'Hvsqrtss4 vsqrtssHvsqrtss4' vsqrtss'HvsqrtssQHvsqrtssQH VINSERTI32X8 VINSERTI32X8 vinserti32x83Insert 256 Bits of Packed Doubleword Integer Values vinserti32x8J vinserti32x82J vinserti32x8J vinserti32x82JADDSSADDSSaddss1Add Scalar Single-Precision Floating-Point ValuesaddssADDSS3addssADDSS3''https://www.felixcloutier.com/x86/addssKANDNWKANDNWkandnw$Bitwise Logical AND NOT 16-bit MaskskandnwH=https://www.felixcloutier.com/x86/kandnw:kandnb:kandnq:kandndNOTNOTnotOne's Complement NegationnotbNOTB3 notwNOTW3 notlNOTL3notqNOTQ3notbNOTB3#notwNOTW3$notlNOTL3'notqNOTQ3+%https://www.felixcloutier.com/x86/not VGATHERPF0QPS VGATHERPF0QPS vgatherpf0qpsmSparse Prefetch Packed Single-Precision Floating-Point Data Values with Signed Quadword Indices Using T0 Hint vgatherpf0qpsMLYhttps://www.felixcloutier.com/x86/vgatherpf0dps:vgatherpf0qps:vgatherpf0dpd:vgatherpf0qpdHADDPDHADDPDhaddpdPacked Double-FP Horizontal Addhaddpd3haddpd3/(https://www.felixcloutier.com/x86/haddpd VCVTSI2SH VCVTSI2SH vcvtsi2sh7Convert Dword Integer to Scalar Half-Precision FP Value vcvtsi2shlR vcvtsi2shqR vcvtsi2shl'R vcvtsi2shq+R vcvtsi2shlQR vcvtsi2shqQR+https://www.felixcloutier.com/x86/vcvtsi2shBLENDPDBLENDPDblendpd3Blend Packed Double Precision Floating-Point Valuesblendpd3blendpd3/)https://www.felixcloutier.com/x86/blendpdCMOVPECMOVPEcmovpeMove if parity even (PF == 1)cmovpew3  cmovpew3 $cmovpel3cmovpel3'cmovpeq3cmovpeq3+ VPDPBSSDS VPDPBSSDS vpdpbssdsVPacked Dot Product of Signed-by-Singed Byte subvectors into Doubleword with Saturation vpdpbssdsX vpdpbssds/X vpdpbssdsX vpdpbssds2XVMOVLHPSVMOVLHPSvmovlhps>Move Packed Single-Precision Floating-Point Values Low to Highvmovlhps4 vmovlhpsHTDPBUUDTDPBUUDtdpbuudQTile Dot Product of Unsigned bytes by Unsigned bytes with Doubleword accumulationtdpbuudTTTAhttps://www.felixcloutier.com/x86/tdpbssd:tdpbsud:tdpbusd:tdpbuudMULPDMULPDmulpd6Multiply Packed Double-Precision Floating-Point ValuesmulpdMULPD3mulpdMULPD3/'https://www.felixcloutier.com/x86/mulpdVPXORQVPXORQvpxorq8Bitwise Logical Exclusive OR of Packed Quadword Integers vpxorq=HvpxorqHvpxorq?HvpxorqHvpxorqAHvpxorqHvpxorq=HvpxorqHvpxorq?HvpxorqHvpxorqAHvpxorqH CMPNPXADD CMPNPXADD cmpnpxaddCompare for Not Parity and Add cmpnpxadd' cmpnpxadd+PSRLWPSRLWpsrlw$Shift Packed Word Data Right LogicalpsrlwPSRLW3 psrlwPSRLW3 psrlwPSRLW3+ psrlwPSRLW3psrlwPSRLW3psrlwPSRLW3/3https://www.felixcloutier.com/x86/psrlw:psrld:psrlqPMAXUWPMAXUWpmaxuw(Maximum of Packed Unsigned Word Integerspmaxuw3pmaxuw3//https://www.felixcloutier.com/x86/pmaxub:pmaxuwPSIGNDPSIGNDpsignd"Packed Sign of Doubleword Integerspsignd3psignd3+psignd3psignd3/6https://www.felixcloutier.com/x86/psignb:psignw:psigndJNOJNOjnoJump if not overflow (OF == 0)jnoJOC3NjnoJOC3ORCLRCLrclRotate Left through Carry FlagrclbRCLB3 rclbRCLB3 rclbRCLB3 rclwRCLW3 rclwRCLW3 rclwRCLW3 rcllRCLL3rcllRCLL3rcllRCLL3rclqRCLQ3rclqRCLQ3rclqRCLQ3rclbRCLB3#rclbRCLB3#rclbRCLB3#rclwRCLW3$rclwRCLW3$rclwRCLW3$rcllRCLL3'rcllRCLL3'rcllRCLL3'rclqRCLQ3+rclqRCLQ3+rclqRCLQ3+1https://www.felixcloutier.com/x86/rcl:rcr:rol:rorVMOVDQU8VMOVDQU8vmovdqu8Move Unaligned Byte Valuesvmovdqu80Ivmovdqu8Ivmovdqu83Ivmovdqu8Ivmovdqu86Ivmovdqu8Ivmovdqu8/Ivmovdqu82Ivmovdqu85Ivmovdqu8Ivmovdqu8/Ivmovdqu8Ivmovdqu82Ivmovdqu8Ivmovdqu85Ivmovdqu8/Ivmovdqu82Ivmovdqu85IOhttps://www.felixcloutier.com/x86/movdqu:vmovdqu8:vmovdqu16:vmovdqu32:vmovdqu64 VPERMIL2PS VPERMIL2PS vpermil2ps:Permute Two-Source Single-Precision Floating-Point Vectors vpermil2ps" vpermil2ps/" vpermil2ps/" vpermil2ps" vpermil2ps2" vpermil2ps2" VGETEXPPH VGETEXPPH vgetexpphhExtract Exponents of Packed Half-Precision Floating-Point Values as Half-Precision Floating-Point Values vgetexpph<K vgetexpph>K vgetexpph@R vgetexpphK vgetexpphK vgetexpphR vgetexpph<K vgetexpphK vgetexpph>K vgetexpphK vgetexpph@R vgetexpphR vgetexpphRR vgetexpphRR+https://www.felixcloutier.com/x86/vgetexpphCMOVECMOVEcmoveMove if equal (ZF == 1)cmovew3  cmovew3 $cmovel3cmovel3'cmoveq3cmoveq3+ADDADDaddAddaddbADDB3addbADDB3 addbADDB3  addbADDB3 #addwADDW3 addwADDW3 addwADDW3 addwADDW3  addwADDW3 $addlADDL3addlADDL3addlADDL3addlADDL3addlADDL3'addqADDQ3addqADDQ3addqADDQ3addqADDQ3addqADDQ3+addbADDB3#addbADDB3# addwADDW3$addwADDW3$addwADDW3$ addlADDL3'addlADDL3'addlADDL3'addqADDQ3+addqADDQ3+addqADDQ3+%https://www.felixcloutier.com/x86/addBTSBTSbtsBit Test and Set btswBTSW3 btswBTSW  btslBTSL3btslBTSLbtsqBTSQ3btsqBTSQbtswBTSW3$btswBTSW$ btslBTSL3'btslBTSL'btsqBTSQ3+btsqBTSQ+%https://www.felixcloutier.com/x86/btsJNAJNAjna&Jump if not above (CF == 1 or ZF == 1)jnaJLS3NjnaJLS3OMOVDQAMOVDQAmovdqaMove Aligned Double QuadwordmovdqaMOVO3movdqaMOVO3/movdqaMOVO3/<https://www.felixcloutier.com/x86/movdqa:vmovdqa32:vmovdqa64 VGETEXPSS VGETEXPSS vgetexpssiExtract Exponent of Scalar Single-Precision Floating-Point Value as Single-Precision Floating-Point Value vgetexpssH vgetexpss'H vgetexpssH vgetexpss'H vgetexpssRH vgetexpssRH+https://www.felixcloutier.com/x86/vgetexpssROLROLrol Rotate LeftrolbROLB3 rolbROLB3 rolbROLB3 rolwROLW3 rolwROLW3 rolwROLW3 rollROLL3rollROLL3rollROLL3rolqROLQ3rolqROLQ3rolqROLQ3rolbROLB3#rolbROLB3#rolbROLB3#rolwROLW3$rolwROLW3$rolwROLW3$rollROLL3'rollROLL3'rollROLL3'rolqROLQ3+rolqROLQ3+rolqROLQ3+1https://www.felixcloutier.com/x86/rcl:rcr:rol:ror VCVTPD2QQ VCVTPD2QQ vcvtpd2qqQConvert Packed Double-Precision Floating-Point Values to Packed Quadword Integers vcvtpd2qq=J vcvtpd2qq?J vcvtpd2qqAJ vcvtpd2qqJ vcvtpd2qqJ vcvtpd2qqJ vcvtpd2qq=J vcvtpd2qqJ vcvtpd2qq?J vcvtpd2qqJ vcvtpd2qqAJ vcvtpd2qqJ vcvtpd2qqQJ vcvtpd2qqQJ+https://www.felixcloutier.com/x86/vcvtpd2qqVFMSUBADD231PHVFMSUBADD231PHvfmsubadd231phVFused Multiply-Alternating Subtract/Add of Packed Half-Precision Floating-Point Valuesvfmsubadd231ph<Kvfmsubadd231phKvfmsubadd231ph>Kvfmsubadd231phKvfmsubadd231ph@Rvfmsubadd231phRvfmsubadd231ph<Kvfmsubadd231phKvfmsubadd231ph>Kvfmsubadd231phKvfmsubadd231ph@Rvfmsubadd231phRvfmsubadd231phQRvfmsubadd231phQRNhttps://www.felixcloutier.com/x86/vfmsubadd132ph:vfmsubadd213ph:vfmsubadd231phVPSHLDVQVPSHLDVQvpshldvq@Concatenate and Variable Shift Packed Quadword Data Left Logical vpshldvq=KvpshldvqKvpshldvq?KvpshldvqKvpshldvqAUvpshldvqUvpshldvq=KvpshldvqKvpshldvq?KvpshldvqKvpshldvqAUvpshldvqU VPHADDUBQ VPHADDUBQ vphaddubq/Packed Horizontal Add Unsigned Byte to Quadword vphaddubq" vphaddubq/"MFENCEMFENCEmfence Memory FencemfenceMFENCE3(https://www.felixcloutier.com/x86/mfenceJCJCjcJump if carry (CF == 1)jcJCS3NjcJCS3OKXNORBKXNORBkxnorb Bitwise Logical XNOR 8-bit MaskskxnorbJ=https://www.felixcloutier.com/x86/kxnorw:kxnorb:kxnorq:kxnord VCVTTPD2DQ VCVTTPD2DQ vcvttpd2dqRConvert with Truncation Packed Double-Precision FP Values to Packed Dword Integers vcvttpd2dqx=H vcvttpd2dqy?H vcvttpd2dqAH vcvttpd2dqxH vcvttpd2dqyH vcvttpd2dqH vcvttpd2dqx=H vcvttpd2dqy?H vcvttpd2dqx4  vcvttpd2dqxH vcvttpd2dqy4  vcvttpd2dqyH vcvttpd2dqx4/  vcvttpd2dqy42  vcvttpd2dqAH vcvttpd2dqH vcvttpd2dqRH vcvttpd2dqRH VINSERTF32X8 VINSERTF32X8 vinsertf32x8@Insert 256 Bits of Packed Single-Precision Floating-Point Values vinsertf32x8J vinsertf32x82J vinsertf32x8J vinsertf32x82J VUNPCKLPS VUNPCKLPS vunpcklpsGUnpack and Interleave Low Packed Single-Precision Floating-Point Values vunpcklps9H vunpcklpsH vunpcklps:H vunpcklpsH vunpcklps;H vunpcklpsH vunpcklps9H vunpcklps4  vunpcklpsH vunpcklps4/  vunpcklps:H vunpcklps4  vunpcklpsH vunpcklps42  vunpcklps;H vunpcklpsHVMPSADBWVMPSADBWvmpsadbw3Compute Multiple Packed Sums of Absolute Differencevmpsadbw4 vmpsadbw4/ vmpsadbw4!vmpsadbw42!VPHADDBQVPHADDBQvphaddbq4Packed Horizontal Add Signed Byte to Signed Quadwordvphaddbq"vphaddbq/"VPRORQVPRORQvprorqRotate Packed Quadword Right vprorq=Hvprorq?HvprorqAHvprorqHvprorqHvprorqHvprorq=HvprorqHvprorq?HvprorqHvprorqAHvprorqH?https://www.felixcloutier.com/x86/vprord:vprorvd:vprorq:vprorvqKANDWKANDWkandw Bitwise Logical AND 16-bit MaskskandwH9https://www.felixcloutier.com/x86/kandw:kandb:kandq:kandd VFNMSUB213SS VFNMSUB213SS vfnmsub213ssQFused Negative Multiply-Subtract of Scalar Single-Precision Floating-Point Values vfnmsub213ssH vfnmsub213ss'H vfnmsub213ss4# vfnmsub213ssH vfnmsub213ss4'# vfnmsub213ss'H vfnmsub213ssQH vfnmsub213ssQHHhttps://www.felixcloutier.com/x86/vfnmsub132ss:vfnmsub213ss:vfnmsub231ssVPCOMWVPCOMWvpcomw#Compare Packed Signed Word Integersvpcomw"vpcomw/" VCVTQQ2PS VCVTQQ2PS vcvtqq2psQConvert Packed Quadword Integers to Packed Single-Precision Floating-Point Values vcvtqq2psx=J vcvtqq2psy?J vcvtqq2psAJ vcvtqq2psxJ vcvtqq2psyJ vcvtqq2psJ vcvtqq2psx=J vcvtqq2psy?J vcvtqq2psxJ vcvtqq2psyJ vcvtqq2psAJ vcvtqq2psJ vcvtqq2psQJ vcvtqq2psQJ+https://www.felixcloutier.com/x86/vcvtqq2psVPCMPGTDVPCMPGTDvpcmpgtd:Compare Packed Signed Doubleword Integers for Greater Thanvpcmpgtd9Hvpcmpgtd9HvpcmpgtdHvpcmpgtdHvpcmpgtd:Hvpcmpgtd:HvpcmpgtdHvpcmpgtdHvpcmpgtd;Hvpcmpgtd;HvpcmpgtdHvpcmpgtdHvpcmpgtd4 vpcmpgtd4/ vpcmpgtd4!vpcmpgtd42!PMOVSXWQPMOVSXWQpmovsxwqBMove Packed Word Integers to Quadword Integers with Sign Extensionpmovsxwq3pmovsxwq3'VPSHLDVDVPSHLDVDvpshldvdBConcatenate and Variable Shift Packed Doubleword Data Left Logical vpshldvd9KvpshldvdKvpshldvd:KvpshldvdKvpshldvd;UvpshldvdUvpshldvd9KvpshldvdKvpshldvd:KvpshldvdKvpshldvd;UvpshldvdUPFADDPFADDpfaddPacked Floating-Point AddpfaddPFADD3pfaddPFADD3+JNEJNEjneJump if not equal (ZF == 0)jneJNE3NjneJNE3OVFMSUBADD213PSVFMSUBADD213PSvfmsubadd213psXFused Multiply-Alternating Subtract/Add of Packed Single-Precision Floating-Point Valuesvfmsubadd213ps9Hvfmsubadd213psHvfmsubadd213ps:Hvfmsubadd213psHvfmsubadd213ps;Hvfmsubadd213psHvfmsubadd213ps9Hvfmsubadd213ps4#vfmsubadd213psHvfmsubadd213ps4/#vfmsubadd213ps:Hvfmsubadd213ps4#vfmsubadd213psHvfmsubadd213ps42#vfmsubadd213ps;Hvfmsubadd213psHvfmsubadd213psQHvfmsubadd213psQHNhttps://www.felixcloutier.com/x86/vfmsubadd132ps:vfmsubadd213ps:vfmsubadd231psVPXORDVPXORDvpxord:Bitwise Logical Exclusive OR of Packed Doubleword Integers vpxord9HvpxordHvpxord:HvpxordHvpxord;HvpxordHvpxord9HvpxordHvpxord:HvpxordHvpxord;HvpxordH VFMSUB231PS VFMSUB231PS vfmsub231psHFused Multiply-Subtract of Packed Single-Precision Floating-Point Values vfmsub231ps9H vfmsub231psH vfmsub231ps:H vfmsub231psH vfmsub231ps;H vfmsub231psH vfmsub231ps9H vfmsub231ps4# vfmsub231psH vfmsub231ps4/# vfmsub231ps:H vfmsub231ps4# vfmsub231psH vfmsub231ps42# vfmsub231ps;H vfmsub231psH vfmsub231psQH vfmsub231psQHEhttps://www.felixcloutier.com/x86/vfmsub132ps:vfmsub213ps:vfmsub231psVPMINSDVPMINSDvpminsd,Minimum of Packed Signed Doubleword Integersvpminsd9HvpminsdHvpminsd:HvpminsdHvpminsd;HvpminsdHvpminsd9Hvpminsd4 vpminsdHvpminsd4/ vpminsd:Hvpminsd4!vpminsdHvpminsd42!vpminsd;HvpminsdHMOVUPSMOVUPSmovups<Move Unaligned Packed Single-Precision Floating-Point ValuesmovupsMOVUPS3movupsMOVUPS3/movupsMOVUPS3/(https://www.felixcloutier.com/x86/movupsMOVNTSSMOVNTSSmovntssKStore Scalar Single-Precision Floating-Point Values Using Non-Temporal Hintmovntss3'VPTESTMBVPTESTMBvptestmb6Logical AND of Packed Byte Integer Values and Set Mask vptestmbIvptestmbIvptestmb/Ivptestmb/IvptestmbIvptestmbIvptestmb2Ivptestmb2IvptestmbIvptestmbIvptestmb5Ivptestmb5IEhttps://www.felixcloutier.com/x86/vptestmb:vptestmw:vptestmd:vptestmq VREDUCESS VREDUCESS vreducessRPerform Reduction Transformation on a Scalar Single-Precision Floating-Point Value vreducessJ vreducess'J vreducessJ vreducess'J+https://www.felixcloutier.com/x86/vreducessVPANDQVPANDQvpandq/Bitwise Logical AND of Packed Quadword Integers vpandq=HvpandqHvpandq?HvpandqHvpandqAHvpandqHvpandq=HvpandqHvpandq?HvpandqHvpandqAHvpandqHVPSHLQVPSHLQvpshlqPacked Shift Logical Quadwordsvpshlq"vpshlq/"vpshlq/" VPDPBUSDS VPDPBUSDS vpdpbusdsXPacked Dot Product of Unsigned-by-Singed Byte subvectors into Doubleword with Saturation vpdpbusds9K vpdpbusdsK vpdpbusds:K vpdpbusdsK vpdpbusds;V vpdpbusdsV vpdpbusds9K vpdpbusdsW vpdpbusdsK vpdpbusds/W vpdpbusds:K vpdpbusdsW vpdpbusdsK vpdpbusds2W vpdpbusds;V vpdpbusdsV+https://www.felixcloutier.com/x86/vpdpbusds VFNMADD231PH VFNMADD231PH vfnmadd231phJFused Negative Multiply-Add of Packed Half-Precision Floating-Point Values vfnmadd231ph<K vfnmadd231phK vfnmadd231ph>K vfnmadd231phK vfnmadd231ph@R vfnmadd231phR vfnmadd231ph<K vfnmadd231phK vfnmadd231ph>K vfnmadd231phK vfnmadd231ph@R vfnmadd231phR vfnmadd231phQR vfnmadd231phQRlhttps://www.felixcloutier.com/x86/vfmadd132ph:vfnmadd132ph:vfmadd213ph:vfnmadd213ph:vfmadd231ph:vfnmadd231phJECXZJECXZjecxzJump if ECX register is 0jecxzJCXZL3NVRCP28PDVRCP28PDvrcp28pdtApproximation to the Reciprocal of Packed Double-Precision Floating-Point Values with Less Than 2^-28 Relative Errorvrcp28pdAMvrcp28pdMvrcp28pdAMvrcp28pdMvrcp28pdRMvrcp28pdRM*https://www.felixcloutier.com/x86/vrcp28pdPAVGWPAVGWpavgwAverage Packed Word IntegerspavgwPAVGW3 pavgwPAVGW3+ pavgwPAVGW3pavgwPAVGW3/-https://www.felixcloutier.com/x86/pavgb:pavgwMOVMSKPSMOVMSKPSmovmskps8Extract Packed Single-Precision Floating-Point Sign MaskmovmskpsMOVMSKPS3*https://www.felixcloutier.com/x86/movmskps VREDUCESD VREDUCESD vreducesdRPerform Reduction Transformation on a Scalar Double-Precision Floating-Point Value vreducesdJ vreducesd+J vreducesdJ vreducesd+J+https://www.felixcloutier.com/x86/vreducesdJSJSjsJump if sign (SF == 1)jsJMI3NjsJMI3OCMPBXADDCMPBXADDcmpbxaddCompare for Below and Addcmpbxadd'cmpbxadd+CMPXCHGCMPXCHGcmpxchgCompare and ExchangecmpxchgbCMPXCHGB3  cmpxchgwCMPXCHGW3  cmpxchglCMPXCHGL3cmpxchgqCMPXCHGQ3cmpxchgbCMPXCHGB3# cmpxchgwCMPXCHGW3$ cmpxchglCMPXCHGL3'cmpxchgqCMPXCHGQ3+)https://www.felixcloutier.com/x86/cmpxchgKXNORDKXNORDkxnord!Bitwise Logical XNOR 32-bit MaskskxnordI=https://www.felixcloutier.com/x86/kxnorw:kxnorb:kxnorq:kxnordJAEJAEjae Jump if above or equal (CF == 0)jaeJCC3NjaeJCC3OPSUBSWPSUBSWpsubsw;Subtract Packed Signed Word Integers with Signed SaturationpsubswPSUBSW3 psubswPSUBSW3+ psubswPSUBSW3psubswPSUBSW3//https://www.felixcloutier.com/x86/psubsb:psubswCALLCALLcallCall ProcedurecallCALLOcallqcallq+&https://www.felixcloutier.com/x86/callVALIGNDVALIGNDvaligndAlign Doubleword Vectors valignd9HvaligndHvalignd:HvaligndHvalignd;HvaligndHvalignd9HvaligndHvalignd:HvaligndHvalignd;HvaligndH1https://www.felixcloutier.com/x86/valignd:valignqVBROADCASTF64X2VBROADCASTF64X2vbroadcastf64x26Broadcast Two Double-Precision Floating-Point Elementsvbroadcastf64x2/Jvbroadcastf64x2/Jvbroadcastf64x2/Jvbroadcastf64x2/J VCVTTSD2SI VCVTTSD2SI vcvttsd2siJConvert with Truncation Scalar Double-Precision FP Value to Signed Integer  vcvttsd2si4  vcvttsd2siH vcvttsd2si4+  vcvttsd2si+H vcvttsd2si4  vcvttsd2siH vcvttsd2si4+  vcvttsd2si+H vcvttsd2siRH vcvttsd2siRHWRFSBASEWRFSBASEwrfsbaseWRite FS segment BASEwrfsbase=wrfsbase=3https://www.felixcloutier.com/x86/wrfsbase:wrgsbasePSHUFHWPSHUFHWpshufhwShuffle Packed High WordspshufhwPSHUFHW3pshufhwPSHUFHW3/)https://www.felixcloutier.com/x86/pshufhw SHA256MSG1 SHA256MSG1 sha256msg1PPerform an Intermediate Calculation for the Next Four SHA256 Message Doublewords sha256msg1( sha256msg1/(,https://www.felixcloutier.com/x86/sha256msg1PFSUBPFSUBpfsubPacked Floating-Point SubtractpfsubPFSUB3pfsubPFSUB3+ VSHUFI64X2 VSHUFI64X2 vshufi64x2.Shuffle 128-Bit Packed Quadword Integer Values vshufi64x2?H vshufi64x2H vshufi64x2AH vshufi64x2H vshufi64x2?H vshufi64x2H vshufi64x2AH vshufi64x2HCMPOXADDCMPOXADDcmpoxaddCompare for Overflow and Addcmpoxadd'cmpoxadd+VDIVPHVDIVPHvdivph2Divide Packed Half-Precision Floating-Point Valuesvdivph<KvdivphKvdivph>KvdivphKvdivph@RvdivphRvdivph<KvdivphKvdivph>KvdivphKvdivph@RvdivphRvdivphQRvdivphQR(https://www.felixcloutier.com/x86/vdivphROUNDPDROUNDPDroundpd3Round Packed Double Precision Floating-Point Valuesroundpd3roundpd3/)https://www.felixcloutier.com/x86/roundpdVPHADDBWVPHADDBWvphaddbw0Packed Horizontal Add Signed Byte to Signed Wordvphaddbw"vphaddbw/"PBLENDVBPBLENDVBpblendvbVariable Blend Packed Bytespblendvb3pblendvb3/*https://www.felixcloutier.com/x86/pblendvb VBCSTNESH2PS VBCSTNESH2PS vbcstnesh2ps<Load FP16 Element and Convert to FP32 Element with Broadcast vbcstnesh2ps$Z vbcstnesh2ps$Z