Analysis of the performance tasks from JBreak (part 4)

An analysis of the fourth and final task:

    // Option 1
    public double octaPow(double a) {
        return Math.pow(a, 8);
    }

    // Option 2
    public double octaPow(double a) {
        return a * a * a * a * a * a * a * a;
    }

    // Option 3
    public double octaPow(double a) {
        return Math.pow(Math.pow(Math.pow(a, 2), 2), 2);
    }

    // Option 4
    public double octaPow(double a) {
        a *= a;
        a *= a;
        return a * a;
    }

The task (simplified):
Determine which methods are faster and which are slower (JRE 1.8.0_161).
Under the cut: benchmarks, some assembler, and an analysis of the JVM's optimizations.

Other publications in the series: Part 1, Part 2, and Part 3.

Commentary on the task


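As you know, floating-point arithmetic is notorious:

  1. It is complex and implementation-dependent.
  2. It is not associative.
  3. It produces counterintuitive results.
  4. In most cases, comparing results with == is meaningless.

With this in mind, it is important to understand that the proposed methods may return different results. The task, however, looks at them from the point of view of performance, not arithmetic.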

Some examples
    public static void main(String[] args) {
        double value = 1e15;
        double delta = 0.0001;
        System.out.println(value + delta == value); // true

        double a = 1.010101;
        double b = 101.0101;
        double c = 10101.01;
        System.out.println((a * b) * c != a * (b * c)); // true
    }
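The same kind of check can be applied to the four methods themselves. The following is a minimal sketch, not part of the original benchmark: the class name `OctaPowResults`, the helper `ulpDistance`, and the sample input are illustrative assumptions. It evaluates all four variants for one input and reports how far each result lies from `Math.pow(a, 8)`, measured in ulps.

```java
class OctaPowResults {
    static double mathOctaPow(double a) { return Math.pow(a, 8); }

    static double plainOctaPow(double a) { return a * a * a * a * a * a * a * a; }

    static double trickyMathOctaPow(double a) { return Math.pow(Math.pow(Math.pow(a, 2), 2), 2); }

    static double trickyPlainOctaPow(double a) { a *= a; a *= a; return a * a; }

    // Hypothetical helper: distance between two doubles in units in the last place of the reference.
    static double ulpDistance(double reference, double value) {
        return Math.abs(value - reference) / Math.ulp(reference);
    }

    public static void main(String[] args) {
        double a = 1.010101; // arbitrary sample input (assumption)
        double reference = mathOctaPow(a);
        System.out.println("mathOctaPow        = " + reference);
        System.out.println("plainOctaPow       : " + ulpDistance(reference, plainOctaPow(a)) + " ulp away");
        System.out.println("trickyMathOctaPow  : " + ulpDistance(reference, trickyMathOctaPow(a)) + " ulp away");
        System.out.println("trickyPlainOctaPow : " + ulpDistance(reference, trickyPlainOctaPow(a)) + " ulp away");
    }
}
```

For inputs whose powers are exactly representable, such as 2.0, all four variants agree; for other inputs small differences in the last bits may appear.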


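Obviously wrong answers


This task offered four kinds of algorithms, so there are more potential answers as well.
All the options are the same, because Java has a cool JIT compiler!
The second or the fourth option is the fastest, because it is just plain multiplication.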

Detailed analysis of the methods under study


    public double mathOctaPow(double a) {
        return Math.pow(a, 8);
    }

    public double plainOctaPow(double a) {
        return a * a * a * a * a * a * a * a;
    }

    public double trickyMathOctaPow(double a) {
        return Math.pow(Math.pow(Math.pow(a, 2), 2), 2);
    }

    public double trickyPlainOctaPow(double a) {
        a *= a;
        a *= a;
        return a * a;
    }

The disassembled code was printed using the following set of flags:

 -XX:+UnlockDiagnosticVMOptions -XX:CompileCommand=print,<>.<> -XX:PrintAssemblyOptions=intel 

plainOctaPow


Let's start with the simplest case, plainOctaPow. In fact, the code

    a * a * a * a * a * a * a * a

is equivalent to the code

    ((((((a * a) * a) * a) * a) * a) * a) * a

because the multiplication operator is left-associative.

In effect, the JIT compiler (c1) compiled this code into the following set of instructions (the xmm0 register holds just the value of the double a parameter):

    0x0000000002c96a3e: vmovapd xmm1, xmm0
    0x0000000002c96a42: vmulsd xmm1, xmm1, xmm0
    0x0000000002c96a46: vmulsd xmm1, xmm1, xmm0
    0x0000000002c96a4a: vmulsd xmm1, xmm1, xmm0
    0x0000000002c96a4e: vmulsd xmm1, xmm1, xmm0
    0x0000000002c96a52: vmulsd xmm1, xmm1, xmm0
    0x0000000002c96a56: vmulsd xmm1, xmm1, xmm0
    0x0000000002c96a5a: vmulsd xmm1, xmm1, xmm0
    0x0000000002c96a5e: vmovapd xmm0, xmm1

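A brief guide to the instructions
vmovapd xmm1, xmm2 copies aligned packed double-precision floating-point values from register xmm2 to register xmm1 (from here on, double-precision floating-point values are simply called doubles). An XMM register is 128 bits wide, so it holds up to two doubles at a time. The instruction also supports the YMM and ZMM registers, which are 256 and 512 bits wide respectively.

vmulsd xmm1, xmm2, xmm3 multiplies the low double in register xmm2 by the low double in register xmm3 and writes the result to the low half of register xmm1. Unlike the previous instruction it is scalar, so it multiplies a single pair of doubles at a time. Its packed counterpart, vmulpd, multiplies up to two doubles at once with XMM registers, and up to four and eight with YMM and ZMM registers respectively.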

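The sequence of instructions matches exactly what is written in the code: the intermediate result is multiplied by a, step by step. In this case the compiler is not allowed to optimize the resulting code by breaking left-associativity, because floating-point multiplication is not associative and a different grouping could change the result.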
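The restriction on regrouping can be illustrated with a small experiment. The sketch below is only an illustration: the class name `AssociativityCheck`, the seed, and the sample count are assumptions. It checks that the unparenthesized product always matches the explicitly left-grouped one bit for bit, and counts how often the regrouped, squaring-based product yields a different double.

```java
import java.util.Random;

class AssociativityCheck {
    static double implicit(double a) { return a * a * a * a * a * a * a * a; }

    static double leftGrouped(double a) { return ((((((a * a) * a) * a) * a) * a) * a) * a; }

    // Regrouped product: three squarings instead of seven sequential multiplications.
    static double squared(double a) {
        double a2 = a * a;
        double a4 = a2 * a2;
        return a4 * a4;
    }

    // Counts the inputs for which the regrouped product differs from the left-to-right one.
    static int countMismatches(long seed, int samples) {
        Random random = new Random(seed);
        int mismatches = 0;
        for (int i = 0; i < samples; i++) {
            double a = 1.0 + random.nextDouble(); // values in [1, 2), so a^8 cannot overflow
            if (Double.doubleToLongBits(implicit(a)) != Double.doubleToLongBits(leftGrouped(a))) {
                throw new AssertionError("left-associativity violated for " + a);
            }
            if (implicit(a) != squared(a)) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        int samples = 100_000; // arbitrary sample count (assumption)
        System.out.println("inputs where regrouping changes the result: "
                + countMismatches(42L, samples) + " of " + samples);
    }
}
```

Any non-zero count is enough to show that a compiler which silently regrouped the product would change observable results.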

trickyPlainOctaPow


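Let me remind you that we are not required to obtain an identical result. So we can try to optimize the code ourselves by reducing the number of operations, for example by replacing the sequential multiplications with three squaring operations.

The code of the trickyPlainOctaPow() method compiles, quite predictably, into the following sequence of instructions: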

    0x0000000002b501be: vmovapd xmm1, xmm0
    0x0000000002b501c2: vmulsd xmm1, xmm1, xmm0
    0x0000000002b501c6: vmovapd xmm0, xmm1
    0x0000000002b501ca: vmulsd xmm0, xmm0, xmm1
    0x0000000002b501ce: vmovapd xmm1, xmm0
    0x0000000002b501d2: vmulsd xmm1, xmm1, xmm0
    0x0000000002b501d6: vmovapd xmm0, xmm1

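As you can see, the total number of operations has decreased: instead of seven multiplications we now have three multiplications, plus two vmovapd instructions that prepare the second operand of a multiplication. Taking the nominal latency of each instruction into account, the resulting code is about twice as fast.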
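The "about twice" estimate can be reproduced with back-of-the-envelope arithmetic. The sketch below assumes that every instruction in each listing waits for the previous one, so the latencies simply add up. The latency values (4 cycles for vmulsd, 1 cycle for a register-to-register vmovapd) are illustrative assumptions rather than measurements, and the class and method names are hypothetical.

```java
class LatencyEstimate {
    // Total latency of a fully serial chain of register moves and multiplications.
    static int chainLatency(int moves, int multiplications, int moveLatency, int mulLatency) {
        return moves * moveLatency + multiplications * mulLatency;
    }

    public static void main(String[] args) {
        int moveLatency = 1; // assumed cycles for vmovapd xmm, xmm
        int mulLatency = 4;  // assumed cycles for vmulsd

        // Instruction counts taken from the two listings, including the first and last vmovapd.
        int plain = chainLatency(2, 7, moveLatency, mulLatency);
        int tricky = chainLatency(4, 3, moveLatency, mulLatency);

        System.out.println("plainOctaPow       ~ " + plain + " cycles");
        System.out.println("trickyPlainOctaPow ~ " + tricky + " cycles");
        System.out.println("ratio              ~ " + (double) plain / tricky);
    }
}
```

With these assumed numbers the chains come out at 30 and 16 cycles, a ratio a little under 2. Real latencies depend on the microarchitecture, so this is a rough model only.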

mathOctaPow


Let's look inside the implementation of the Math.pow() method:

    public static double pow(double a, double b) {
        return StrictMath.pow(a, b);
    }

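The first thing to note is that the exponent, passed as the second argument, is of type double. Because of this, the function cannot be implemented as simply as plain multiplication.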
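To make the difference concrete: for an integer exponent a power can be computed with multiplications alone, as in the hypothetical helper `intPow` below (exponentiation by squaring; it is neither part of the JDK nor of the original benchmark). Math.pow() additionally has to handle exponents such as 0.5, for which no such shortcut exists.

```java
class IntegerPower {
    // Exponentiation by squaring for a non-negative integer exponent.
    static double intPow(double base, int exponent) {
        if (exponent < 0) {
            throw new IllegalArgumentException("exponent must be non-negative");
        }
        double result = 1.0;
        double factor = base;
        while (exponent > 0) {
            if ((exponent & 1) == 1) {
                result *= factor; // this bit of the exponent is set: include the current power
            }
            factor *= factor;     // base^1, base^2, base^4, ...
            exponent >>= 1;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("intPow(1.5, 8)     = " + intPow(1.5, 8));
        System.out.println("Math.pow(1.5, 8)   = " + Math.pow(1.5, 8));
        // A fractional exponent cannot be expressed as repeated multiplication.
        System.out.println("Math.pow(2.0, 0.5) = " + Math.pow(2.0, 0.5));
        System.out.println("Math.sqrt(2.0)     = " + Math.sqrt(2.0));
    }
}
```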

At the same time, StrictMath.pow() is a native method:

  public static native double pow(double a, double b); 

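Taken at face value, this means that a call to Math.pow() ends up as a call to a native method through JNI. On the other hand, the JDK makes extensive use of intrinsics (see the full list of HotSpot intrinsics). One of them is _dpow, the intrinsic that replaces calls to Math.pow().

The latter means that after warm-up, once the code has been compiled by the JIT compiler, the code that computes the power ends up directly inside the mathOctaPow() method.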
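A minimal driver that makes mathOctaPow() hot enough to be JIT-compiled might look like the sketch below. The class name `WarmUpDriver` and the iteration count are assumptions chosen for illustration. Running it with the diagnostic flags listed earlier, and with the hsdis disassembler plugin available to the JVM, should print the compiled code.

```java
class WarmUpDriver {
    static double mathOctaPow(double a) {
        return Math.pow(a, 8);
    }

    // Calls the method repeatedly and accumulates the results so the work is not discarded.
    static double warmUp(int iterations) {
        double sum = 0.0;
        for (int i = 0; i < iterations; i++) {
            sum += mathOctaPow(1.0 + i * 1e-9);
        }
        return sum;
    }

    public static void main(String[] args) {
        // Iteration count is an assumption: large enough that HotSpot is expected to compile the method.
        System.out.println(warmUp(1_000_000));
    }
}
```

Note that the JIT compiler may inline mathOctaPow() into the loop of warmUp(), in which case the intrinsic body shows up in the caller's listing as well.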

Assembly code of the mathOctaPow() method
  0x0000000002aaacd0: vmovsd xmm1,QWORD PTR [rip+0xffffffffffffff68] # 0x0000000002aaac40 ; {section_word} 0x0000000002aaacd8: vmovsd QWORD PTR [rsp],xmm1 0x0000000002aaacdd: fld QWORD PTR [rsp] 0x0000000002aaace0: vmovsd QWORD PTR [rsp],xmm0 0x0000000002aaace5: fld QWORD PTR [rsp] 0x0000000002aaace8: movabs rax,0x6c4ba7d0 ; {external_word} 0x0000000002aaacf2: fld QWORD PTR [rax] 0x0000000002aaacf4: fucomip st,st(2) 0x0000000002aaacf6: jp 0x0000000002aaad0f 0x0000000002aaacfc: jne 0x0000000002aaad0f 0x0000000002aaad02: fxch st(1) 0x0000000002aaad04: ffree st(0) 0x0000000002aaad06: fincstp 0x0000000002aaad08: fmul st,st(0) 0x0000000002aaad0a: jmp 0x0000000002aab166 0x0000000002aaad0f: fldz 0x0000000002aaad11: fucomip st,st(1) 0x0000000002aaad13: ja 0x0000000002aaad96 0x0000000002aaad19: fld st(1) 0x0000000002aaad1b: fld st(1) 0x0000000002aaad1d: sub rsp,0x8 0x0000000002aaad21: fstcw WORD PTR [rsp] 0x0000000002aaad25: mov eax,DWORD PTR [rsp] 0x0000000002aaad28: or eax,0x300 0x0000000002aaad2e: push rax 0x0000000002aaad2f: fldcw WORD PTR [rsp] 0x0000000002aaad32: pop rax 0x0000000002aaad33: fyl2x 0x0000000002aaad35: sub rsp,0x8 0x0000000002aaad39: fld st(0) 0x0000000002aaad3b: frndint 0x0000000002aaad3d: fsubr st(1),st 0x0000000002aaad3f: fistp DWORD PTR [rsp] 0x0000000002aaad42: f2xm1 0x0000000002aaad44: fld1 0x0000000002aaad46: faddp st(1),st 0x0000000002aaad48: mov eax,DWORD PTR [rsp] 0x0000000002aaad4b: mov ecx,0xfffff800 0x0000000002aaad50: add eax,0x3ff 0x0000000002aaad56: mov edx,eax 0x0000000002aaad58: shl eax,0x14 0x0000000002aaad5b: add edx,0x1 0x0000000002aaad5e: cmove eax,ecx 0x0000000002aaad61: cmp edx,0x1 0x0000000002aaad64: cmove eax,ecx 0x0000000002aaad67: test ecx,edx 0x0000000002aaad69: cmovne eax,ecx 0x0000000002aaad6c: mov DWORD PTR [rsp+0x4],eax 0x0000000002aaad70: mov DWORD PTR [rsp],0x0 0x0000000002aaad77: fmul QWORD PTR [rsp] 0x0000000002aaad7a: add rsp,0x8 0x0000000002aaad7e: fldcw WORD PTR [rsp] 0x0000000002aaad81: add rsp,0x8 
0x0000000002aaad85: fucomi st,st(0) 0x0000000002aaad87: jp 0x0000000002aaae36 0x0000000002aaad8d: ffree st(2) 0x0000000002aaad8f: ffree st(1) 0x0000000002aaad91: jmp 0x0000000002aab166 0x0000000002aaad96: fld st(1) 0x0000000002aaad98: frndint 0x0000000002aaad9a: fucomi st,st(2) 0x0000000002aaad9c: jne 0x0000000002aaae36 0x0000000002aaada2: sub rsp,0x8 0x0000000002aaada6: fistp QWORD PTR [rsp] 0x0000000002aaada9: fld st(1) 0x0000000002aaadab: fld st(1) 0x0000000002aaadad: fabs 0x0000000002aaadaf: sub rsp,0x8 0x0000000002aaadb3: fstcw WORD PTR [rsp] 0x0000000002aaadb7: mov eax,DWORD PTR [rsp] 0x0000000002aaadba: or eax,0x300 0x0000000002aaadc0: push rax 0x0000000002aaadc1: fldcw WORD PTR [rsp] 0x0000000002aaadc4: pop rax 0x0000000002aaadc5: fyl2x 0x0000000002aaadc7: sub rsp,0x8 0x0000000002aaadcb: fld st(0) 0x0000000002aaadcd: frndint 0x0000000002aaadcf: fsubr st(1),st 0x0000000002aaadd1: fistp DWORD PTR [rsp] 0x0000000002aaadd4: f2xm1 0x0000000002aaadd6: fld1 0x0000000002aaadd8: faddp st(1),st 0x0000000002aaadda: mov eax,DWORD PTR [rsp] 0x0000000002aaaddd: mov ecx,0xfffff800 0x0000000002aaade2: add eax,0x3ff 0x0000000002aaade8: mov edx,eax 0x0000000002aaadea: shl eax,0x14 0x0000000002aaaded: add edx,0x1 0x0000000002aaadf0: cmove eax,ecx 0x0000000002aaadf3: cmp edx,0x1 0x0000000002aaadf6: cmove eax,ecx 0x0000000002aaadf9: test ecx,edx 0x0000000002aaadfb: cmovne eax,ecx 0x0000000002aaadfe: mov DWORD PTR [rsp+0x4],eax 0x0000000002aaae02: mov DWORD PTR [rsp],0x0 0x0000000002aaae09: fmul QWORD PTR [rsp] 0x0000000002aaae0c: add rsp,0x8 0x0000000002aaae10: fldcw WORD PTR [rsp] 0x0000000002aaae13: add rsp,0x8 0x0000000002aaae17: fucomi st,st(0) 0x0000000002aaae19: pop rax 0x0000000002aaae1a: jp 0x0000000002aaae36 0x0000000002aaae20: ffree st(2) 0x0000000002aaae22: ffree st(1) 0x0000000002aaae24: test eax,0x1 0x0000000002aaae29: je 0x0000000002aab166 0x0000000002aaae2f: fchs 0x0000000002aaae31: jmp 0x0000000002aab166 0x0000000002aaae36: ffree st(0) 0x0000000002aaae38: 
fincstp 0x0000000002aaae3a: mov QWORD PTR [rsp-0x28],rsp 0x0000000002aaae3f: sub rsp,0x80 0x0000000002aaae46: mov QWORD PTR [rsp+0x78],rax 0x0000000002aaae4b: mov QWORD PTR [rsp+0x70],rcx 0x0000000002aaae50: mov QWORD PTR [rsp+0x68],rdx 0x0000000002aaae55: mov QWORD PTR [rsp+0x60],rbx 0x0000000002aaae5a: mov QWORD PTR [rsp+0x50],rbp 0x0000000002aaae5f: mov QWORD PTR [rsp+0x48],rsi 0x0000000002aaae64: mov QWORD PTR [rsp+0x40],rdi 0x0000000002aaae69: mov QWORD PTR [rsp+0x38],r8 0x0000000002aaae6e: mov QWORD PTR [rsp+0x30],r9 0x0000000002aaae73: mov QWORD PTR [rsp+0x28],r10 0x0000000002aaae78: mov QWORD PTR [rsp+0x20],r11 0x0000000002aaae7d: mov QWORD PTR [rsp+0x18],r12 0x0000000002aaae82: mov QWORD PTR [rsp+0x10],r13 0x0000000002aaae87: mov QWORD PTR [rsp+0x8],r14 0x0000000002aaae8c: mov QWORD PTR [rsp],r15 0x0000000002aaae90: sub rsp,0x100 0x0000000002aaae97: vextractf128 XMMWORD PTR [rsp],ymm0,0x1 0x0000000002aaae9e: vextractf128 XMMWORD PTR [rsp+0x10],ymm1,0x1 0x0000000002aaaea6: vextractf128 XMMWORD PTR [rsp+0x20],ymm2,0x1 0x0000000002aaaeae: vextractf128 XMMWORD PTR [rsp+0x30],ymm3,0x1 0x0000000002aaaeb6: vextractf128 XMMWORD PTR [rsp+0x40],ymm4,0x1 0x0000000002aaaebe: vextractf128 XMMWORD PTR [rsp+0x50],ymm5,0x1 0x0000000002aaaec6: vextractf128 XMMWORD PTR [rsp+0x60],ymm6,0x1 0x0000000002aaaece: vextractf128 XMMWORD PTR [rsp+0x70],ymm7,0x1 0x0000000002aaaed6: vextractf128 XMMWORD PTR [rsp+0x80],ymm8,0x1 0x0000000002aaaee1: vextractf128 XMMWORD PTR [rsp+0x90],ymm9,0x1 0x0000000002aaaeec: vextractf128 XMMWORD PTR [rsp+0xa0],ymm10,0x1 0x0000000002aaaef7: vextractf128 XMMWORD PTR [rsp+0xb0],ymm11,0x1 0x0000000002aaaf02: vextractf128 XMMWORD PTR [rsp+0xc0],ymm12,0x1 0x0000000002aaaf0d: vextractf128 XMMWORD PTR [rsp+0xd0],ymm13,0x1 0x0000000002aaaf18: vextractf128 XMMWORD PTR [rsp+0xe0],ymm14,0x1 0x0000000002aaaf23: vextractf128 XMMWORD PTR [rsp+0xf0],ymm15,0x1 0x0000000002aaaf2e: sub rsp,0x100 0x0000000002aaaf35: vmovdqu XMMWORD PTR [rsp],xmm0 0x0000000002aaaf3a: 
vmovdqu XMMWORD PTR [rsp+0x10],xmm1 0x0000000002aaaf40: vmovdqu XMMWORD PTR [rsp+0x20],xmm2 0x0000000002aaaf46: vmovdqu XMMWORD PTR [rsp+0x30],xmm3 0x0000000002aaaf4c: vmovdqu XMMWORD PTR [rsp+0x40],xmm4 0x0000000002aaaf52: vmovdqu XMMWORD PTR [rsp+0x50],xmm5 0x0000000002aaaf58: vmovdqu XMMWORD PTR [rsp+0x60],xmm6 0x0000000002aaaf5e: vmovdqu XMMWORD PTR [rsp+0x70],xmm7 0x0000000002aaaf64: vmovdqu XMMWORD PTR [rsp+0x80],xmm8 0x0000000002aaaf6d: vmovdqu XMMWORD PTR [rsp+0x90],xmm9 0x0000000002aaaf76: vmovdqu XMMWORD PTR [rsp+0xa0],xmm10 0x0000000002aaaf7f: vmovdqu XMMWORD PTR [rsp+0xb0],xmm11 0x0000000002aaaf88: vmovdqu XMMWORD PTR [rsp+0xc0],xmm12 0x0000000002aaaf91: vmovdqu XMMWORD PTR [rsp+0xd0],xmm13 0x0000000002aaaf9a: vmovdqu XMMWORD PTR [rsp+0xe0],xmm14 0x0000000002aaafa3: vmovdqu XMMWORD PTR [rsp+0xf0],xmm15 0x0000000002aaafac: sub rsp,0x10 0x0000000002aaafb0: fstp QWORD PTR [rsp] 0x0000000002aaafb3: fstp QWORD PTR [rsp+0x8] 0x0000000002aaafb7: vmovsd xmm0,QWORD PTR [rsp] 0x0000000002aaafbc: vmovsd xmm1,QWORD PTR [rsp+0x8] 0x0000000002aaafc2: sub rsp,0x20 0x0000000002aaafc6: test esp,0xf 0x0000000002aaafcc: je 0x0000000002aaafe4 0x0000000002aaafd2: sub rsp,0x8 0x0000000002aaafd6: call 0x000000006bf240d0 ; {runtime_call} 0x0000000002aaafdb: add rsp,0x8 0x0000000002aaafdf: jmp 0x0000000002aaafe9 0x0000000002aaafe4: call 0x000000006bf240d0 ; {runtime_call} 0x0000000002aaafe9: add rsp,0x20 0x0000000002aaafed: vmovsd QWORD PTR [rsp],xmm0 0x0000000002aaaff2: fld QWORD PTR [rsp] 0x0000000002aaaff5: add rsp,0x10 0x0000000002aaaff9: vmovdqu xmm0,XMMWORD PTR [rsp] 0x0000000002aaaffe: vmovdqu xmm1,XMMWORD PTR [rsp+0x10] 0x0000000002aab004: vmovdqu xmm2,XMMWORD PTR [rsp+0x20] 0x0000000002aab00a: vmovdqu xmm3,XMMWORD PTR [rsp+0x30] 0x0000000002aab010: vmovdqu xmm4,XMMWORD PTR [rsp+0x40] 0x0000000002aab016: vmovdqu xmm5,XMMWORD PTR [rsp+0x50] 0x0000000002aab01c: vmovdqu xmm6,XMMWORD PTR [rsp+0x60] 0x0000000002aab022: vmovdqu xmm7,XMMWORD PTR [rsp+0x70] 0x0000000002aab028: 
vmovdqu xmm8,XMMWORD PTR [rsp+0x80] 0x0000000002aab031: vmovdqu xmm9,XMMWORD PTR [rsp+0x90] 0x0000000002aab03a: vmovdqu xmm10,XMMWORD PTR [rsp+0xa0] 0x0000000002aab043: vmovdqu xmm11,XMMWORD PTR [rsp+0xb0] 0x0000000002aab04c: vmovdqu xmm12,XMMWORD PTR [rsp+0xc0] 0x0000000002aab055: vmovdqu xmm13,XMMWORD PTR [rsp+0xd0] 0x0000000002aab05e: vmovdqu xmm14,XMMWORD PTR [rsp+0xe0] 0x0000000002aab067: vmovdqu xmm15,XMMWORD PTR [rsp+0xf0] 0x0000000002aab070: add rsp,0x100 0x0000000002aab077: vinsertf128 ymm0,ymm0,XMMWORD PTR [rsp],0x1 0x0000000002aab07e: vinsertf128 ymm1,ymm1,XMMWORD PTR [rsp+0x10],0x1 0x0000000002aab086: vinsertf128 ymm2,ymm2,XMMWORD PTR [rsp+0x20],0x1 0x0000000002aab08e: vinsertf128 ymm3,ymm3,XMMWORD PTR [rsp+0x30],0x1 0x0000000002aab096: vinsertf128 ymm4,ymm4,XMMWORD PTR [rsp+0x40],0x1 0x0000000002aab09e: vinsertf128 ymm5,ymm5,XMMWORD PTR [rsp+0x50],0x1 0x0000000002aab0a6: vinsertf128 ymm6,ymm6,XMMWORD PTR [rsp+0x60],0x1 0x0000000002aab0ae: vinsertf128 ymm7,ymm7,XMMWORD PTR [rsp+0x70],0x1 0x0000000002aab0b6: vinsertf128 ymm8,ymm8,XMMWORD PTR [rsp+0x80],0x1 0x0000000002aab0c1: vinsertf128 ymm9,ymm9,XMMWORD PTR [rsp+0x90],0x1 0x0000000002aab0cc: vinsertf128 ymm10,ymm10,XMMWORD PTR [rsp+0xa0],0x1 0x0000000002aab0d7: vinsertf128 ymm11,ymm11,XMMWORD PTR [rsp+0xb0],0x1 0x0000000002aab0e2: vinsertf128 ymm12,ymm12,XMMWORD PTR [rsp+0xc0],0x1 0x0000000002aab0ed: vinsertf128 ymm13,ymm13,XMMWORD PTR [rsp+0xd0],0x1 0x0000000002aab0f8: vinsertf128 ymm14,ymm14,XMMWORD PTR [rsp+0xe0],0x1 0x0000000002aab103: vinsertf128 ymm15,ymm15,XMMWORD PTR [rsp+0xf0],0x1 0x0000000002aab10e: add rsp,0x100 0x0000000002aab115: mov r15,QWORD PTR [rsp] 0x0000000002aab119: mov r14,QWORD PTR [rsp+0x8] 0x0000000002aab11e: mov r13,QWORD PTR [rsp+0x10] 0x0000000002aab123: mov r12,QWORD PTR [rsp+0x18] 0x0000000002aab128: mov r11,QWORD PTR [rsp+0x20] 0x0000000002aab12d: mov r10,QWORD PTR [rsp+0x28] 0x0000000002aab132: mov r9,QWORD PTR [rsp+0x30] 0x0000000002aab137: mov r8,QWORD PTR [rsp+0x38] 
0x0000000002aab13c: mov rdi,QWORD PTR [rsp+0x40] 0x0000000002aab141: mov rsi,QWORD PTR [rsp+0x48] 0x0000000002aab146: mov rbp,QWORD PTR [rsp+0x50] 0x0000000002aab14b: mov rbx,QWORD PTR [rsp+0x60] 0x0000000002aab150: mov rdx,QWORD PTR [rsp+0x68] 0x0000000002aab155: mov rcx,QWORD PTR [rsp+0x70] 0x0000000002aab15a: mov rax,QWORD PTR [rsp+0x78] 0x0000000002aab15f: add rsp,0x80 0x0000000002aab166: fstp QWORD PTR [rsp] 0x0000000002aab169: vmovsd xmm0,QWORD PTR [rsp] ;*invokestatic pow ; - ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark::mathOctaPow@4 (line 55) 

Here the first instruction writes the constant 8.0 into register xmm1, while the value a is already in register xmm0. What follows is the body of the intrinsic.

trickyMathOctaPow


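Surprisingly, instead of a single call to the expensive Math.pow(), we now get three. The JIT compiler replaced the body of the trickyMathOctaPow() method with three consecutive copies of the _dpow implementation.

Sequentially inlined _dpow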
  0x0000000002a70b14: vmovsd xmm1,QWORD PTR [rip+0xffffffffffffff44] # 0x0000000002a70a60 ; {section_word} 0x0000000002a70b1c: vmovsd QWORD PTR [rsp],xmm1 0x0000000002a70b21: fld QWORD PTR [rsp] 0x0000000002a70b24: vmovsd QWORD PTR [rsp],xmm0 0x0000000002a70b29: fld QWORD PTR [rsp] 0x0000000002a70b2c: movabs rax,0x6c4ba7d0 ; {external_word} 0x0000000002a70b36: fld QWORD PTR [rax] 0x0000000002a70b38: fucomip st,st(2) 0x0000000002a70b3a: jp 0x0000000002a70b53 0x0000000002a70b40: jne 0x0000000002a70b53 0x0000000002a70b46: fxch st(1) 0x0000000002a70b48: ffree st(0) 0x0000000002a70b4a: fincstp 0x0000000002a70b4c: fmul st,st(0) 0x0000000002a70b4e: jmp 0x0000000002a70faa 0x0000000002a70b53: fldz 0x0000000002a70b55: fucomip st,st(1) 0x0000000002a70b57: ja 0x0000000002a70bda 0x0000000002a70b5d: fld st(1) 0x0000000002a70b5f: fld st(1) 0x0000000002a70b61: sub rsp,0x8 0x0000000002a70b65: fstcw WORD PTR [rsp] 0x0000000002a70b69: mov eax,DWORD PTR [rsp] 0x0000000002a70b6c: or eax,0x300 0x0000000002a70b72: push rax 0x0000000002a70b73: fldcw WORD PTR [rsp] 0x0000000002a70b76: pop rax 0x0000000002a70b77: fyl2x 0x0000000002a70b79: sub rsp,0x8 0x0000000002a70b7d: fld st(0) 0x0000000002a70b7f: frndint 0x0000000002a70b81: fsubr st(1),st 0x0000000002a70b83: fistp DWORD PTR [rsp] 0x0000000002a70b86: f2xm1 0x0000000002a70b88: fld1 0x0000000002a70b8a: faddp st(1),st 0x0000000002a70b8c: mov eax,DWORD PTR [rsp] 0x0000000002a70b8f: mov ecx,0xfffff800 0x0000000002a70b94: add eax,0x3ff 0x0000000002a70b9a: mov edx,eax 0x0000000002a70b9c: shl eax,0x14 0x0000000002a70b9f: add edx,0x1 0x0000000002a70ba2: cmove eax,ecx 0x0000000002a70ba5: cmp edx,0x1 0x0000000002a70ba8: cmove eax,ecx 0x0000000002a70bab: test ecx,edx 0x0000000002a70bad: cmovne eax,ecx 0x0000000002a70bb0: mov DWORD PTR [rsp+0x4],eax 0x0000000002a70bb4: mov DWORD PTR [rsp],0x0 0x0000000002a70bbb: fmul QWORD PTR [rsp] 0x0000000002a70bbe: add rsp,0x8 0x0000000002a70bc2: fldcw WORD PTR [rsp] 0x0000000002a70bc5: add rsp,0x8 
0x0000000002a70bc9: fucomi st,st(0) 0x0000000002a70bcb: jp 0x0000000002a70c7a 0x0000000002a70bd1: ffree st(2) 0x0000000002a70bd3: ffree st(1) 0x0000000002a70bd5: jmp 0x0000000002a70faa 0x0000000002a70bda: fld st(1) 0x0000000002a70bdc: frndint 0x0000000002a70bde: fucomi st,st(2) 0x0000000002a70be0: jne 0x0000000002a70c7a 0x0000000002a70be6: sub rsp,0x8 0x0000000002a70bea: fistp QWORD PTR [rsp] 0x0000000002a70bed: fld st(1) 0x0000000002a70bef: fld st(1) 0x0000000002a70bf1: fabs 0x0000000002a70bf3: sub rsp,0x8 0x0000000002a70bf7: fstcw WORD PTR [rsp] 0x0000000002a70bfb: mov eax,DWORD PTR [rsp] 0x0000000002a70bfe: or eax,0x300 0x0000000002a70c04: push rax 0x0000000002a70c05: fldcw WORD PTR [rsp] 0x0000000002a70c08: pop rax 0x0000000002a70c09: fyl2x 0x0000000002a70c0b: sub rsp,0x8 0x0000000002a70c0f: fld st(0) 0x0000000002a70c11: frndint 0x0000000002a70c13: fsubr st(1),st 0x0000000002a70c15: fistp DWORD PTR [rsp] 0x0000000002a70c18: f2xm1 0x0000000002a70c1a: fld1 0x0000000002a70c1c: faddp st(1),st 0x0000000002a70c1e: mov eax,DWORD PTR [rsp] 0x0000000002a70c21: mov ecx,0xfffff800 0x0000000002a70c26: add eax,0x3ff 0x0000000002a70c2c: mov edx,eax 0x0000000002a70c2e: shl eax,0x14 0x0000000002a70c31: add edx,0x1 0x0000000002a70c34: cmove eax,ecx 0x0000000002a70c37: cmp edx,0x1 0x0000000002a70c3a: cmove eax,ecx 0x0000000002a70c3d: test ecx,edx 0x0000000002a70c3f: cmovne eax,ecx 0x0000000002a70c42: mov DWORD PTR [rsp+0x4],eax 0x0000000002a70c46: mov DWORD PTR [rsp],0x0 0x0000000002a70c4d: fmul QWORD PTR [rsp] 0x0000000002a70c50: add rsp,0x8 0x0000000002a70c54: fldcw WORD PTR [rsp] 0x0000000002a70c57: add rsp,0x8 0x0000000002a70c5b: fucomi st,st(0) 0x0000000002a70c5d: pop rax 0x0000000002a70c5e: jp 0x0000000002a70c7a 0x0000000002a70c64: ffree st(2) 0x0000000002a70c66: ffree st(1) 0x0000000002a70c68: test eax,0x1 0x0000000002a70c6d: je 0x0000000002a70faa 0x0000000002a70c73: fchs 0x0000000002a70c75: jmp 0x0000000002a70faa 0x0000000002a70c7a: ffree st(0) 0x0000000002a70c7c: 
fincstp 0x0000000002a70c7e: mov QWORD PTR [rsp-0x28],rsp 0x0000000002a70c83: sub rsp,0x80 0x0000000002a70c8a: mov QWORD PTR [rsp+0x78],rax 0x0000000002a70c8f: mov QWORD PTR [rsp+0x70],rcx 0x0000000002a70c94: mov QWORD PTR [rsp+0x68],rdx 0x0000000002a70c99: mov QWORD PTR [rsp+0x60],rbx 0x0000000002a70c9e: mov QWORD PTR [rsp+0x50],rbp 0x0000000002a70ca3: mov QWORD PTR [rsp+0x48],rsi 0x0000000002a70ca8: mov QWORD PTR [rsp+0x40],rdi 0x0000000002a70cad: mov QWORD PTR [rsp+0x38],r8 0x0000000002a70cb2: mov QWORD PTR [rsp+0x30],r9 0x0000000002a70cb7: mov QWORD PTR [rsp+0x28],r10 0x0000000002a70cbc: mov QWORD PTR [rsp+0x20],r11 0x0000000002a70cc1: mov QWORD PTR [rsp+0x18],r12 0x0000000002a70cc6: mov QWORD PTR [rsp+0x10],r13 0x0000000002a70ccb: mov QWORD PTR [rsp+0x8],r14 0x0000000002a70cd0: mov QWORD PTR [rsp],r15 0x0000000002a70cd4: sub rsp,0x100 0x0000000002a70cdb: vextractf128 XMMWORD PTR [rsp],ymm0,0x1 0x0000000002a70ce2: vextractf128 XMMWORD PTR [rsp+0x10],ymm1,0x1 0x0000000002a70cea: vextractf128 XMMWORD PTR [rsp+0x20],ymm2,0x1 0x0000000002a70cf2: vextractf128 XMMWORD PTR [rsp+0x30],ymm3,0x1 0x0000000002a70cfa: vextractf128 XMMWORD PTR [rsp+0x40],ymm4,0x1 0x0000000002a70d02: vextractf128 XMMWORD PTR [rsp+0x50],ymm5,0x1 0x0000000002a70d0a: vextractf128 XMMWORD PTR [rsp+0x60],ymm6,0x1 0x0000000002a70d12: vextractf128 XMMWORD PTR [rsp+0x70],ymm7,0x1 0x0000000002a70d1a: vextractf128 XMMWORD PTR [rsp+0x80],ymm8,0x1 0x0000000002a70d25: vextractf128 XMMWORD PTR [rsp+0x90],ymm9,0x1 0x0000000002a70d30: vextractf128 XMMWORD PTR [rsp+0xa0],ymm10,0x1 0x0000000002a70d3b: vextractf128 XMMWORD PTR [rsp+0xb0],ymm11,0x1 0x0000000002a70d46: vextractf128 XMMWORD PTR [rsp+0xc0],ymm12,0x1 0x0000000002a70d51: vextractf128 XMMWORD PTR [rsp+0xd0],ymm13,0x1 0x0000000002a70d5c: vextractf128 XMMWORD PTR [rsp+0xe0],ymm14,0x1 0x0000000002a70d67: vextractf128 XMMWORD PTR [rsp+0xf0],ymm15,0x1 0x0000000002a70d72: sub rsp,0x100 0x0000000002a70d79: vmovdqu XMMWORD PTR [rsp],xmm0 0x0000000002a70d7e: 
vmovdqu XMMWORD PTR [rsp+0x10],xmm1 0x0000000002a70d84: vmovdqu XMMWORD PTR [rsp+0x20],xmm2 0x0000000002a70d8a: vmovdqu XMMWORD PTR [rsp+0x30],xmm3 0x0000000002a70d90: vmovdqu XMMWORD PTR [rsp+0x40],xmm4 0x0000000002a70d96: vmovdqu XMMWORD PTR [rsp+0x50],xmm5 0x0000000002a70d9c: vmovdqu XMMWORD PTR [rsp+0x60],xmm6 0x0000000002a70da2: vmovdqu XMMWORD PTR [rsp+0x70],xmm7 0x0000000002a70da8: vmovdqu XMMWORD PTR [rsp+0x80],xmm8 0x0000000002a70db1: vmovdqu XMMWORD PTR [rsp+0x90],xmm9 0x0000000002a70dba: vmovdqu XMMWORD PTR [rsp+0xa0],xmm10 0x0000000002a70dc3: vmovdqu XMMWORD PTR [rsp+0xb0],xmm11 0x0000000002a70dcc: vmovdqu XMMWORD PTR [rsp+0xc0],xmm12 0x0000000002a70dd5: vmovdqu XMMWORD PTR [rsp+0xd0],xmm13 0x0000000002a70dde: vmovdqu XMMWORD PTR [rsp+0xe0],xmm14 0x0000000002a70de7: vmovdqu XMMWORD PTR [rsp+0xf0],xmm15 0x0000000002a70df0: sub rsp,0x10 0x0000000002a70df4: fstp QWORD PTR [rsp] 0x0000000002a70df7: fstp QWORD PTR [rsp+0x8] 0x0000000002a70dfb: vmovsd xmm0,QWORD PTR [rsp] 0x0000000002a70e00: vmovsd xmm1,QWORD PTR [rsp+0x8] 0x0000000002a70e06: sub rsp,0x20 0x0000000002a70e0a: test esp,0xf 0x0000000002a70e10: je 0x0000000002a70e28 0x0000000002a70e16: sub rsp,0x8 0x0000000002a70e1a: call 0x000000006bf240d0 ; {runtime_call} 0x0000000002a70e1f: add rsp,0x8 0x0000000002a70e23: jmp 0x0000000002a70e2d 0x0000000002a70e28: call 0x000000006bf240d0 ; {runtime_call} 0x0000000002a70e2d: add rsp,0x20 0x0000000002a70e31: vmovsd QWORD PTR [rsp],xmm0 0x0000000002a70e36: fld QWORD PTR [rsp] 0x0000000002a70e39: add rsp,0x10 0x0000000002a70e3d: vmovdqu xmm0,XMMWORD PTR [rsp] 0x0000000002a70e42: vmovdqu xmm1,XMMWORD PTR [rsp+0x10] 0x0000000002a70e48: vmovdqu xmm2,XMMWORD PTR [rsp+0x20] 0x0000000002a70e4e: vmovdqu xmm3,XMMWORD PTR [rsp+0x30] 0x0000000002a70e54: vmovdqu xmm4,XMMWORD PTR [rsp+0x40] 0x0000000002a70e5a: vmovdqu xmm5,XMMWORD PTR [rsp+0x50] 0x0000000002a70e60: vmovdqu xmm6,XMMWORD PTR [rsp+0x60] 0x0000000002a70e66: vmovdqu xmm7,XMMWORD PTR [rsp+0x70] 0x0000000002a70e6c: 
vmovdqu xmm8,XMMWORD PTR [rsp+0x80] 0x0000000002a70e75: vmovdqu xmm9,XMMWORD PTR [rsp+0x90] 0x0000000002a70e7e: vmovdqu xmm10,XMMWORD PTR [rsp+0xa0] 0x0000000002a70e87: vmovdqu xmm11,XMMWORD PTR [rsp+0xb0] 0x0000000002a70e90: vmovdqu xmm12,XMMWORD PTR [rsp+0xc0] 0x0000000002a70e99: vmovdqu xmm13,XMMWORD PTR [rsp+0xd0] 0x0000000002a70ea2: vmovdqu xmm14,XMMWORD PTR [rsp+0xe0] 0x0000000002a70eab: vmovdqu xmm15,XMMWORD PTR [rsp+0xf0] 0x0000000002a70eb4: add rsp,0x100 0x0000000002a70ebb: vinsertf128 ymm0,ymm0,XMMWORD PTR [rsp],0x1 0x0000000002a70ec2: vinsertf128 ymm1,ymm1,XMMWORD PTR [rsp+0x10],0x1 0x0000000002a70eca: vinsertf128 ymm2,ymm2,XMMWORD PTR [rsp+0x20],0x1 0x0000000002a70ed2: vinsertf128 ymm3,ymm3,XMMWORD PTR [rsp+0x30],0x1 0x0000000002a70eda: vinsertf128 ymm4,ymm4,XMMWORD PTR [rsp+0x40],0x1 0x0000000002a70ee2: vinsertf128 ymm5,ymm5,XMMWORD PTR [rsp+0x50],0x1 0x0000000002a70eea: vinsertf128 ymm6,ymm6,XMMWORD PTR [rsp+0x60],0x1 0x0000000002a70ef2: vinsertf128 ymm7,ymm7,XMMWORD PTR [rsp+0x70],0x1 0x0000000002a70efa: vinsertf128 ymm8,ymm8,XMMWORD PTR [rsp+0x80],0x1 0x0000000002a70f05: vinsertf128 ymm9,ymm9,XMMWORD PTR [rsp+0x90],0x1 0x0000000002a70f10: vinsertf128 ymm10,ymm10,XMMWORD PTR [rsp+0xa0],0x1 0x0000000002a70f1b: vinsertf128 ymm11,ymm11,XMMWORD PTR [rsp+0xb0],0x1 0x0000000002a70f26: vinsertf128 ymm12,ymm12,XMMWORD PTR [rsp+0xc0],0x1 0x0000000002a70f31: vinsertf128 ymm13,ymm13,XMMWORD PTR [rsp+0xd0],0x1 0x0000000002a70f3c: vinsertf128 ymm14,ymm14,XMMWORD PTR [rsp+0xe0],0x1 0x0000000002a70f47: vinsertf128 ymm15,ymm15,XMMWORD PTR [rsp+0xf0],0x1 0x0000000002a70f52: add rsp,0x100 0x0000000002a70f59: mov r15,QWORD PTR [rsp] 0x0000000002a70f5d: mov r14,QWORD PTR [rsp+0x8] 0x0000000002a70f62: mov r13,QWORD PTR [rsp+0x10] 0x0000000002a70f67: mov r12,QWORD PTR [rsp+0x18] 0x0000000002a70f6c: mov r11,QWORD PTR [rsp+0x20] 0x0000000002a70f71: mov r10,QWORD PTR [rsp+0x28] 0x0000000002a70f76: mov r9,QWORD PTR [rsp+0x30] 0x0000000002a70f7b: mov r8,QWORD PTR [rsp+0x38] 
0x0000000002a70f80: mov rdi,QWORD PTR [rsp+0x40] 0x0000000002a70f85: mov rsi,QWORD PTR [rsp+0x48] 0x0000000002a70f8a: mov rbp,QWORD PTR [rsp+0x50] 0x0000000002a70f8f: mov rbx,QWORD PTR [rsp+0x60] 0x0000000002a70f94: mov rdx,QWORD PTR [rsp+0x68] 0x0000000002a70f99: mov rcx,QWORD PTR [rsp+0x70] 0x0000000002a70f9e: mov rax,QWORD PTR [rsp+0x78] 0x0000000002a70fa3: add rsp,0x80 0x0000000002a70faa: fstp QWORD PTR [rsp] 0x0000000002a70fad: vmovsd xmm0,QWORD PTR [rsp] ;*invokestatic pow ; - ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark::trickyMathOctaPow@4 (line 63) 0x0000000002a70fb2: vmovsd xmm1,QWORD PTR [rip+0xfffffffffffffaae] # 0x0000000002a70a68 ; {section_word} 0x0000000002a70fba: vmovsd QWORD PTR [rsp],xmm1 0x0000000002a70fbf: fld QWORD PTR [rsp] 0x0000000002a70fc2: vmovsd QWORD PTR [rsp],xmm0 0x0000000002a70fc7: fld QWORD PTR [rsp] 0x0000000002a70fca: movabs rax,0x6c4ba7d0 ; {external_word} 0x0000000002a70fd4: fld QWORD PTR [rax] 0x0000000002a70fd6: fucomip st,st(2) 0x0000000002a70fd8: jp 0x0000000002a70ff1 0x0000000002a70fde: jne 0x0000000002a70ff1 0x0000000002a70fe4: fxch st(1) 0x0000000002a70fe6: ffree st(0) 0x0000000002a70fe8: fincstp 0x0000000002a70fea: fmul st,st(0) 0x0000000002a70fec: jmp 0x0000000002a71448 0x0000000002a70ff1: fldz 0x0000000002a70ff3: fucomip st,st(1) 0x0000000002a70ff5: ja 0x0000000002a71078 0x0000000002a70ffb: fld st(1) 0x0000000002a70ffd: fld st(1) 0x0000000002a70fff: sub rsp,0x8 0x0000000002a71003: fstcw WORD PTR [rsp] 0x0000000002a71007: mov eax,DWORD PTR [rsp] 0x0000000002a7100a: or eax,0x300 0x0000000002a71010: push rax 0x0000000002a71011: fldcw WORD PTR [rsp] 0x0000000002a71014: pop rax 0x0000000002a71015: fyl2x 0x0000000002a71017: sub rsp,0x8 0x0000000002a7101b: fld st(0) 0x0000000002a7101d: frndint 0x0000000002a7101f: fsubr st(1),st 0x0000000002a71021: fistp DWORD PTR [rsp] 0x0000000002a71024: f2xm1 0x0000000002a71026: fld1 0x0000000002a71028: faddp st(1),st 0x0000000002a7102a: mov eax,DWORD PTR [rsp] 0x0000000002a7102d: 
mov ecx,0xfffff800 0x0000000002a71032: add eax,0x3ff 0x0000000002a71038: mov edx,eax 0x0000000002a7103a: shl eax,0x14 0x0000000002a7103d: add edx,0x1 0x0000000002a71040: cmove eax,ecx 0x0000000002a71043: cmp edx,0x1 0x0000000002a71046: cmove eax,ecx 0x0000000002a71049: test ecx,edx 0x0000000002a7104b: cmovne eax,ecx 0x0000000002a7104e: mov DWORD PTR [rsp+0x4],eax 0x0000000002a71052: mov DWORD PTR [rsp],0x0 0x0000000002a71059: fmul QWORD PTR [rsp] 0x0000000002a7105c: add rsp,0x8 0x0000000002a71060: fldcw WORD PTR [rsp] 0x0000000002a71063: add rsp,0x8 0x0000000002a71067: fucomi st,st(0) 0x0000000002a71069: jp 0x0000000002a71118 0x0000000002a7106f: ffree st(2) 0x0000000002a71071: ffree st(1) 0x0000000002a71073: jmp 0x0000000002a71448 0x0000000002a71078: fld st(1) 0x0000000002a7107a: frndint 0x0000000002a7107c: fucomi st,st(2) 0x0000000002a7107e: jne 0x0000000002a71118 0x0000000002a71084: sub rsp,0x8 0x0000000002a71088: fistp QWORD PTR [rsp] 0x0000000002a7108b: fld st(1) 0x0000000002a7108d: fld st(1) 0x0000000002a7108f: fabs 0x0000000002a71091: sub rsp,0x8 0x0000000002a71095: fstcw WORD PTR [rsp] 0x0000000002a71099: mov eax,DWORD PTR [rsp] 0x0000000002a7109c: or eax,0x300 0x0000000002a710a2: push rax 0x0000000002a710a3: fldcw WORD PTR [rsp] 0x0000000002a710a6: pop rax 0x0000000002a710a7: fyl2x 0x0000000002a710a9: sub rsp,0x8 0x0000000002a710ad: fld st(0) 0x0000000002a710af: frndint 0x0000000002a710b1: fsubr st(1),st 0x0000000002a710b3: fistp DWORD PTR [rsp] 0x0000000002a710b6: f2xm1 0x0000000002a710b8: fld1 0x0000000002a710ba: faddp st(1),st 0x0000000002a710bc: mov eax,DWORD PTR [rsp] 0x0000000002a710bf: mov ecx,0xfffff800 0x0000000002a710c4: add eax,0x3ff 0x0000000002a710ca: mov edx,eax 0x0000000002a710cc: shl eax,0x14 0x0000000002a710cf: add edx,0x1 0x0000000002a710d2: cmove eax,ecx 0x0000000002a710d5: cmp edx,0x1 0x0000000002a710d8: cmove eax,ecx 0x0000000002a710db: test ecx,edx 0x0000000002a710dd: cmovne eax,ecx 0x0000000002a710e0: mov DWORD PTR [rsp+0x4],eax 
0x0000000002a710e4: mov DWORD PTR [rsp],0x0 0x0000000002a710eb: fmul QWORD PTR [rsp] 0x0000000002a710ee: add rsp,0x8 0x0000000002a710f2: fldcw WORD PTR [rsp] 0x0000000002a710f5: add rsp,0x8 0x0000000002a710f9: fucomi st,st(0) 0x0000000002a710fb: pop rax 0x0000000002a710fc: jp 0x0000000002a71118 0x0000000002a71102: ffree st(2) 0x0000000002a71104: ffree st(1) 0x0000000002a71106: test eax,0x1 0x0000000002a7110b: je 0x0000000002a71448 0x0000000002a71111: fchs 0x0000000002a71113: jmp 0x0000000002a71448 0x0000000002a71118: ffree st(0) 0x0000000002a7111a: fincstp 0x0000000002a7111c: mov QWORD PTR [rsp-0x28],rsp 0x0000000002a71121: sub rsp,0x80 0x0000000002a71128: mov QWORD PTR [rsp+0x78],rax 0x0000000002a7112d: mov QWORD PTR [rsp+0x70],rcx 0x0000000002a71132: mov QWORD PTR [rsp+0x68],rdx 0x0000000002a71137: mov QWORD PTR [rsp+0x60],rbx 0x0000000002a7113c: mov QWORD PTR [rsp+0x50],rbp 0x0000000002a71141: mov QWORD PTR [rsp+0x48],rsi 0x0000000002a71146: mov QWORD PTR [rsp+0x40],rdi 0x0000000002a7114b: mov QWORD PTR [rsp+0x38],r8 0x0000000002a71150: mov QWORD PTR [rsp+0x30],r9 0x0000000002a71155: mov QWORD PTR [rsp+0x28],r10 0x0000000002a7115a: mov QWORD PTR [rsp+0x20],r11 0x0000000002a7115f: mov QWORD PTR [rsp+0x18],r12 0x0000000002a71164: mov QWORD PTR [rsp+0x10],r13 0x0000000002a71169: mov QWORD PTR [rsp+0x8],r14 0x0000000002a7116e: mov QWORD PTR [rsp],r15 0x0000000002a71172: sub rsp,0x100 0x0000000002a71179: vextractf128 XMMWORD PTR [rsp],ymm0,0x1 0x0000000002a71180: vextractf128 XMMWORD PTR [rsp+0x10],ymm1,0x1 0x0000000002a71188: vextractf128 XMMWORD PTR [rsp+0x20],ymm2,0x1 0x0000000002a71190: vextractf128 XMMWORD PTR [rsp+0x30],ymm3,0x1 0x0000000002a71198: vextractf128 XMMWORD PTR [rsp+0x40],ymm4,0x1 0x0000000002a711a0: vextractf128 XMMWORD PTR [rsp+0x50],ymm5,0x1 0x0000000002a711a8: vextractf128 XMMWORD PTR [rsp+0x60],ymm6,0x1 0x0000000002a711b0: vextractf128 XMMWORD PTR [rsp+0x70],ymm7,0x1 0x0000000002a711b8: vextractf128 XMMWORD PTR [rsp+0x80],ymm8,0x1 
0x0000000002a711c3: vextractf128 XMMWORD PTR [rsp+0x90],ymm9,0x1 0x0000000002a711ce: vextractf128 XMMWORD PTR [rsp+0xa0],ymm10,0x1 0x0000000002a711d9: vextractf128 XMMWORD PTR [rsp+0xb0],ymm11,0x1 0x0000000002a711e4: vextractf128 XMMWORD PTR [rsp+0xc0],ymm12,0x1 0x0000000002a711ef: vextractf128 XMMWORD PTR [rsp+0xd0],ymm13,0x1 0x0000000002a711fa: vextractf128 XMMWORD PTR [rsp+0xe0],ymm14,0x1 0x0000000002a71205: vextractf128 XMMWORD PTR [rsp+0xf0],ymm15,0x1 0x0000000002a71210: sub rsp,0x100 0x0000000002a71217: vmovdqu XMMWORD PTR [rsp],xmm0 0x0000000002a7121c: vmovdqu XMMWORD PTR [rsp+0x10],xmm1 0x0000000002a71222: vmovdqu XMMWORD PTR [rsp+0x20],xmm2 0x0000000002a71228: vmovdqu XMMWORD PTR [rsp+0x30],xmm3 0x0000000002a7122e: vmovdqu XMMWORD PTR [rsp+0x40],xmm4 0x0000000002a71234: vmovdqu XMMWORD PTR [rsp+0x50],xmm5 0x0000000002a7123a: vmovdqu XMMWORD PTR [rsp+0x60],xmm6 0x0000000002a71240: vmovdqu XMMWORD PTR [rsp+0x70],xmm7 0x0000000002a71246: vmovdqu XMMWORD PTR [rsp+0x80],xmm8 0x0000000002a7124f: vmovdqu XMMWORD PTR [rsp+0x90],xmm9 0x0000000002a71258: vmovdqu XMMWORD PTR [rsp+0xa0],xmm10 0x0000000002a71261: vmovdqu XMMWORD PTR [rsp+0xb0],xmm11 0x0000000002a7126a: vmovdqu XMMWORD PTR [rsp+0xc0],xmm12 0x0000000002a71273: vmovdqu XMMWORD PTR [rsp+0xd0],xmm13 0x0000000002a7127c: vmovdqu XMMWORD PTR [rsp+0xe0],xmm14 0x0000000002a71285: vmovdqu XMMWORD PTR [rsp+0xf0],xmm15 0x0000000002a7128e: sub rsp,0x10 0x0000000002a71292: fstp QWORD PTR [rsp] 0x0000000002a71295: fstp QWORD PTR [rsp+0x8] 0x0000000002a71299: vmovsd xmm0,QWORD PTR [rsp] 0x0000000002a7129e: vmovsd xmm1,QWORD PTR [rsp+0x8] 0x0000000002a712a4: sub rsp,0x20 0x0000000002a712a8: test esp,0xf 0x0000000002a712ae: je 0x0000000002a712c6 0x0000000002a712b4: sub rsp,0x8 0x0000000002a712b8: call 0x000000006bf240d0 ; {runtime_call} 0x0000000002a712bd: add rsp,0x8 0x0000000002a712c1: jmp 0x0000000002a712cb 0x0000000002a712c6: call 0x000000006bf240d0 ; {runtime_call} 0x0000000002a712cb: add rsp,0x20 
0x0000000002a712cf: vmovsd QWORD PTR [rsp],xmm0 0x0000000002a712d4: fld QWORD PTR [rsp] 0x0000000002a712d7: add rsp,0x10 0x0000000002a712db: vmovdqu xmm0,XMMWORD PTR [rsp] 0x0000000002a712e0: vmovdqu xmm1,XMMWORD PTR [rsp+0x10] 0x0000000002a712e6: vmovdqu xmm2,XMMWORD PTR [rsp+0x20] 0x0000000002a712ec: vmovdqu xmm3,XMMWORD PTR [rsp+0x30] 0x0000000002a712f2: vmovdqu xmm4,XMMWORD PTR [rsp+0x40] 0x0000000002a712f8: vmovdqu xmm5,XMMWORD PTR [rsp+0x50] 0x0000000002a712fe: vmovdqu xmm6,XMMWORD PTR [rsp+0x60] 0x0000000002a71304: vmovdqu xmm7,XMMWORD PTR [rsp+0x70] 0x0000000002a7130a: vmovdqu xmm8,XMMWORD PTR [rsp+0x80] 0x0000000002a71313: vmovdqu xmm9,XMMWORD PTR [rsp+0x90] 0x0000000002a7131c: vmovdqu xmm10,XMMWORD PTR [rsp+0xa0] 0x0000000002a71325: vmovdqu xmm11,XMMWORD PTR [rsp+0xb0] 0x0000000002a7132e: vmovdqu xmm12,XMMWORD PTR [rsp+0xc0] 0x0000000002a71337: vmovdqu xmm13,XMMWORD PTR [rsp+0xd0] 0x0000000002a71340: vmovdqu xmm14,XMMWORD PTR [rsp+0xe0] 0x0000000002a71349: vmovdqu xmm15,XMMWORD PTR [rsp+0xf0] 0x0000000002a71352: add rsp,0x100 0x0000000002a71359: vinsertf128 ymm0,ymm0,XMMWORD PTR [rsp],0x1 0x0000000002a71360: vinsertf128 ymm1,ymm1,XMMWORD PTR [rsp+0x10],0x1 0x0000000002a71368: vinsertf128 ymm2,ymm2,XMMWORD PTR [rsp+0x20],0x1 0x0000000002a71370: vinsertf128 ymm3,ymm3,XMMWORD PTR [rsp+0x30],0x1 0x0000000002a71378: vinsertf128 ymm4,ymm4,XMMWORD PTR [rsp+0x40],0x1 0x0000000002a71380: vinsertf128 ymm5,ymm5,XMMWORD PTR [rsp+0x50],0x1 0x0000000002a71388: vinsertf128 ymm6,ymm6,XMMWORD PTR [rsp+0x60],0x1 0x0000000002a71390: vinsertf128 ymm7,ymm7,XMMWORD PTR [rsp+0x70],0x1 0x0000000002a71398: vinsertf128 ymm8,ymm8,XMMWORD PTR [rsp+0x80],0x1 0x0000000002a713a3: vinsertf128 ymm9,ymm9,XMMWORD PTR [rsp+0x90],0x1 0x0000000002a713ae: vinsertf128 ymm10,ymm10,XMMWORD PTR [rsp+0xa0],0x1 0x0000000002a713b9: vinsertf128 ymm11,ymm11,XMMWORD PTR [rsp+0xb0],0x1 0x0000000002a713c4: vinsertf128 ymm12,ymm12,XMMWORD PTR [rsp+0xc0],0x1 0x0000000002a713cf: vinsertf128 
ymm13,ymm13,XMMWORD PTR [rsp+0xd0],0x1 0x0000000002a713da: vinsertf128 ymm14,ymm14,XMMWORD PTR [rsp+0xe0],0x1 0x0000000002a713e5: vinsertf128 ymm15,ymm15,XMMWORD PTR [rsp+0xf0],0x1 0x0000000002a713f0: add rsp,0x100 0x0000000002a713f7: mov r15,QWORD PTR [rsp] 0x0000000002a713fb: mov r14,QWORD PTR [rsp+0x8] 0x0000000002a71400: mov r13,QWORD PTR [rsp+0x10] 0x0000000002a71405: mov r12,QWORD PTR [rsp+0x18] 0x0000000002a7140a: mov r11,QWORD PTR [rsp+0x20] 0x0000000002a7140f: mov r10,QWORD PTR [rsp+0x28] 0x0000000002a71414: mov r9,QWORD PTR [rsp+0x30] 0x0000000002a71419: mov r8,QWORD PTR [rsp+0x38] 0x0000000002a7141e: mov rdi,QWORD PTR [rsp+0x40] 0x0000000002a71423: mov rsi,QWORD PTR [rsp+0x48] 0x0000000002a71428: mov rbp,QWORD PTR [rsp+0x50] 0x0000000002a7142d: mov rbx,QWORD PTR [rsp+0x60] 0x0000000002a71432: mov rdx,QWORD PTR [rsp+0x68] 0x0000000002a71437: mov rcx,QWORD PTR [rsp+0x70] 0x0000000002a7143c: mov rax,QWORD PTR [rsp+0x78] 0x0000000002a71441: add rsp,0x80 0x0000000002a71448: fstp QWORD PTR [rsp] 0x0000000002a7144b: vmovsd xmm0,QWORD PTR [rsp] ;*invokestatic pow ; - ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark::trickyMathOctaPow@10 (line 63) 0x0000000002a71450: vmovsd xmm1,QWORD PTR [rip+0xfffffffffffff618] # 0x0000000002a70a70 ; {section_word} 0x0000000002a71458: vmovsd QWORD PTR [rsp],xmm1 0x0000000002a7145d: fld QWORD PTR [rsp] 0x0000000002a71460: vmovsd QWORD PTR [rsp],xmm0 0x0000000002a71465: fld QWORD PTR [rsp] 0x0000000002a71468: movabs rax,0x6c4ba7d0 ; {external_word} 0x0000000002a71472: fld QWORD PTR [rax] 0x0000000002a71474: fucomip st,st(2) 0x0000000002a71476: jp 0x0000000002a7148f 0x0000000002a7147c: jne 0x0000000002a7148f 0x0000000002a71482: fxch st(1) 0x0000000002a71484: ffree st(0) 0x0000000002a71486: fincstp 0x0000000002a71488: fmul st,st(0) 0x0000000002a7148a: jmp 0x0000000002a718e6 0x0000000002a7148f: fldz 0x0000000002a71491: fucomip st,st(1) 0x0000000002a71493: ja 0x0000000002a71516 0x0000000002a71499: fld st(1) 0x0000000002a7149b: 
fld st(1) 0x0000000002a7149d: sub rsp,0x8 0x0000000002a714a1: fstcw WORD PTR [rsp] 0x0000000002a714a5: mov eax,DWORD PTR [rsp] 0x0000000002a714a8: or eax,0x300 0x0000000002a714ae: push rax 0x0000000002a714af: fldcw WORD PTR [rsp] 0x0000000002a714b2: pop rax 0x0000000002a714b3: fyl2x 0x0000000002a714b5: sub rsp,0x8 0x0000000002a714b9: fld st(0) 0x0000000002a714bb: frndint 0x0000000002a714bd: fsubr st(1),st 0x0000000002a714bf: fistp DWORD PTR [rsp] 0x0000000002a714c2: f2xm1 0x0000000002a714c4: fld1 0x0000000002a714c6: faddp st(1),st 0x0000000002a714c8: mov eax,DWORD PTR [rsp] 0x0000000002a714cb: mov ecx,0xfffff800 0x0000000002a714d0: add eax,0x3ff 0x0000000002a714d6: mov edx,eax 0x0000000002a714d8: shl eax,0x14 0x0000000002a714db: add edx,0x1 0x0000000002a714de: cmove eax,ecx 0x0000000002a714e1: cmp edx,0x1 0x0000000002a714e4: cmove eax,ecx 0x0000000002a714e7: test ecx,edx 0x0000000002a714e9: cmovne eax,ecx 0x0000000002a714ec: mov DWORD PTR [rsp+0x4],eax 0x0000000002a714f0: mov DWORD PTR [rsp],0x0 0x0000000002a714f7: fmul QWORD PTR [rsp] 0x0000000002a714fa: add rsp,0x8 0x0000000002a714fe: fldcw WORD PTR [rsp] 0x0000000002a71501: add rsp,0x8 0x0000000002a71505: fucomi st,st(0) 0x0000000002a71507: jp 0x0000000002a715b6 0x0000000002a7150d: ffree st(2) 0x0000000002a7150f: ffree st(1) 0x0000000002a71511: jmp 0x0000000002a718e6 0x0000000002a71516: fld st(1) 0x0000000002a71518: frndint 0x0000000002a7151a: fucomi st,st(2) 0x0000000002a7151c: jne 0x0000000002a715b6 0x0000000002a71522: sub rsp,0x8 0x0000000002a71526: fistp QWORD PTR [rsp] 0x0000000002a71529: fld st(1) 0x0000000002a7152b: fld st(1) 0x0000000002a7152d: fabs 0x0000000002a7152f: sub rsp,0x8 0x0000000002a71533: fstcw WORD PTR [rsp] 0x0000000002a71537: mov eax,DWORD PTR [rsp] 0x0000000002a7153a: or eax,0x300 0x0000000002a71540: push rax 0x0000000002a71541: fldcw WORD PTR [rsp] 0x0000000002a71544: pop rax 0x0000000002a71545: fyl2x 0x0000000002a71547: sub rsp,0x8 0x0000000002a7154b: fld st(0) 0x0000000002a7154d: 
frndint 0x0000000002a7154f: fsubr st(1),st 0x0000000002a71551: fistp DWORD PTR [rsp] 0x0000000002a71554: f2xm1 0x0000000002a71556: fld1 0x0000000002a71558: faddp st(1),st 0x0000000002a7155a: mov eax,DWORD PTR [rsp] 0x0000000002a7155d: mov ecx,0xfffff800 0x0000000002a71562: add eax,0x3ff 0x0000000002a71568: mov edx,eax 0x0000000002a7156a: shl eax,0x14 0x0000000002a7156d: add edx,0x1 0x0000000002a71570: cmove eax,ecx 0x0000000002a71573: cmp edx,0x1 0x0000000002a71576: cmove eax,ecx 0x0000000002a71579: test ecx,edx 0x0000000002a7157b: cmovne eax,ecx 0x0000000002a7157e: mov DWORD PTR [rsp+0x4],eax 0x0000000002a71582: mov DWORD PTR [rsp],0x0 0x0000000002a71589: fmul QWORD PTR [rsp] 0x0000000002a7158c: add rsp,0x8 0x0000000002a71590: fldcw WORD PTR [rsp] 0x0000000002a71593: add rsp,0x8 0x0000000002a71597: fucomi st,st(0) 0x0000000002a71599: pop rax 0x0000000002a7159a: jp 0x0000000002a715b6 0x0000000002a715a0: ffree st(2) 0x0000000002a715a2: ffree st(1) 0x0000000002a715a4: test eax,0x1 0x0000000002a715a9: je 0x0000000002a718e6 0x0000000002a715af: fchs 0x0000000002a715b1: jmp 0x0000000002a718e6 0x0000000002a715b6: ffree st(0) 0x0000000002a715b8: fincstp 0x0000000002a715ba: mov QWORD PTR [rsp-0x28],rsp 0x0000000002a715bf: sub rsp,0x80 0x0000000002a715c6: mov QWORD PTR [rsp+0x78],rax 0x0000000002a715cb: mov QWORD PTR [rsp+0x70],rcx 0x0000000002a715d0: mov QWORD PTR [rsp+0x68],rdx 0x0000000002a715d5: mov QWORD PTR [rsp+0x60],rbx 0x0000000002a715da: mov QWORD PTR [rsp+0x50],rbp 0x0000000002a715df: mov QWORD PTR [rsp+0x48],rsi 0x0000000002a715e4: mov QWORD PTR [rsp+0x40],rdi 0x0000000002a715e9: mov QWORD PTR [rsp+0x38],r8 0x0000000002a715ee: mov QWORD PTR [rsp+0x30],r9 0x0000000002a715f3: mov QWORD PTR [rsp+0x28],r10 0x0000000002a715f8: mov QWORD PTR [rsp+0x20],r11 0x0000000002a715fd: mov QWORD PTR [rsp+0x18],r12 0x0000000002a71602: mov QWORD PTR [rsp+0x10],r13 0x0000000002a71607: mov QWORD PTR [rsp+0x8],r14 0x0000000002a7160c: mov QWORD PTR [rsp],r15 0x0000000002a71610: sub 
rsp,0x100 0x0000000002a71617: vextractf128 XMMWORD PTR [rsp],ymm0,0x1 0x0000000002a7161e: vextractf128 XMMWORD PTR [rsp+0x10],ymm1,0x1 0x0000000002a71626: vextractf128 XMMWORD PTR [rsp+0x20],ymm2,0x1 0x0000000002a7162e: vextractf128 XMMWORD PTR [rsp+0x30],ymm3,0x1 0x0000000002a71636: vextractf128 XMMWORD PTR [rsp+0x40],ymm4,0x1 0x0000000002a7163e: vextractf128 XMMWORD PTR [rsp+0x50],ymm5,0x1 0x0000000002a71646: vextractf128 XMMWORD PTR [rsp+0x60],ymm6,0x1 0x0000000002a7164e: vextractf128 XMMWORD PTR [rsp+0x70],ymm7,0x1 0x0000000002a71656: vextractf128 XMMWORD PTR [rsp+0x80],ymm8,0x1 0x0000000002a71661: vextractf128 XMMWORD PTR [rsp+0x90],ymm9,0x1 0x0000000002a7166c: vextractf128 XMMWORD PTR [rsp+0xa0],ymm10,0x1 0x0000000002a71677: vextractf128 XMMWORD PTR [rsp+0xb0],ymm11,0x1 0x0000000002a71682: vextractf128 XMMWORD PTR [rsp+0xc0],ymm12,0x1 0x0000000002a7168d: vextractf128 XMMWORD PTR [rsp+0xd0],ymm13,0x1 0x0000000002a71698: vextractf128 XMMWORD PTR [rsp+0xe0],ymm14,0x1 0x0000000002a716a3: vextractf128 XMMWORD PTR [rsp+0xf0],ymm15,0x1 0x0000000002a716ae: sub rsp,0x100 0x0000000002a716b5: vmovdqu XMMWORD PTR [rsp],xmm0 0x0000000002a716ba: vmovdqu XMMWORD PTR [rsp+0x10],xmm1 0x0000000002a716c0: vmovdqu XMMWORD PTR [rsp+0x20],xmm2 0x0000000002a716c6: vmovdqu XMMWORD PTR [rsp+0x30],xmm3 0x0000000002a716cc: vmovdqu XMMWORD PTR [rsp+0x40],xmm4 0x0000000002a716d2: vmovdqu XMMWORD PTR [rsp+0x50],xmm5 0x0000000002a716d8: vmovdqu XMMWORD PTR [rsp+0x60],xmm6 0x0000000002a716de: vmovdqu XMMWORD PTR [rsp+0x70],xmm7 0x0000000002a716e4: vmovdqu XMMWORD PTR [rsp+0x80],xmm8 0x0000000002a716ed: vmovdqu XMMWORD PTR [rsp+0x90],xmm9 0x0000000002a716f6: vmovdqu XMMWORD PTR [rsp+0xa0],xmm10 0x0000000002a716ff: vmovdqu XMMWORD PTR [rsp+0xb0],xmm11 0x0000000002a71708: vmovdqu XMMWORD PTR [rsp+0xc0],xmm12 0x0000000002a71711: vmovdqu XMMWORD PTR [rsp+0xd0],xmm13 0x0000000002a7171a: vmovdqu XMMWORD PTR [rsp+0xe0],xmm14 0x0000000002a71723: vmovdqu XMMWORD PTR [rsp+0xf0],xmm15 
0x0000000002a7172c: sub rsp,0x10 0x0000000002a71730: fstp QWORD PTR [rsp] 0x0000000002a71733: fstp QWORD PTR [rsp+0x8] 0x0000000002a71737: vmovsd xmm0,QWORD PTR [rsp] 0x0000000002a7173c: vmovsd xmm1,QWORD PTR [rsp+0x8] 0x0000000002a71742: sub rsp,0x20 0x0000000002a71746: test esp,0xf 0x0000000002a7174c: je 0x0000000002a71764 0x0000000002a71752: sub rsp,0x8 0x0000000002a71756: call 0x000000006bf240d0 ; {runtime_call} 0x0000000002a7175b: add rsp,0x8 0x0000000002a7175f: jmp 0x0000000002a71769 0x0000000002a71764: call 0x000000006bf240d0 ; {runtime_call} 0x0000000002a71769: add rsp,0x20 0x0000000002a7176d: vmovsd QWORD PTR [rsp],xmm0 0x0000000002a71772: fld QWORD PTR [rsp] 0x0000000002a71775: add rsp,0x10 0x0000000002a71779: vmovdqu xmm0,XMMWORD PTR [rsp] 0x0000000002a7177e: vmovdqu xmm1,XMMWORD PTR [rsp+0x10] 0x0000000002a71784: vmovdqu xmm2,XMMWORD PTR [rsp+0x20] 0x0000000002a7178a: vmovdqu xmm3,XMMWORD PTR [rsp+0x30] 0x0000000002a71790: vmovdqu xmm4,XMMWORD PTR [rsp+0x40] 0x0000000002a71796: vmovdqu xmm5,XMMWORD PTR [rsp+0x50] 0x0000000002a7179c: vmovdqu xmm6,XMMWORD PTR [rsp+0x60] 0x0000000002a717a2: vmovdqu xmm7,XMMWORD PTR [rsp+0x70] 0x0000000002a717a8: vmovdqu xmm8,XMMWORD PTR [rsp+0x80] 0x0000000002a717b1: vmovdqu xmm9,XMMWORD PTR [rsp+0x90] 0x0000000002a717ba: vmovdqu xmm10,XMMWORD PTR [rsp+0xa0] 0x0000000002a717c3: vmovdqu xmm11,XMMWORD PTR [rsp+0xb0] 0x0000000002a717cc: vmovdqu xmm12,XMMWORD PTR [rsp+0xc0] 0x0000000002a717d5: vmovdqu xmm13,XMMWORD PTR [rsp+0xd0] 0x0000000002a717de: vmovdqu xmm14,XMMWORD PTR [rsp+0xe0] 0x0000000002a717e7: vmovdqu xmm15,XMMWORD PTR [rsp+0xf0] 0x0000000002a717f0: add rsp,0x100 0x0000000002a717f7: vinsertf128 ymm0,ymm0,XMMWORD PTR [rsp],0x1 0x0000000002a717fe: vinsertf128 ymm1,ymm1,XMMWORD PTR [rsp+0x10],0x1 0x0000000002a71806: vinsertf128 ymm2,ymm2,XMMWORD PTR [rsp+0x20],0x1 0x0000000002a7180e: vinsertf128 ymm3,ymm3,XMMWORD PTR [rsp+0x30],0x1 0x0000000002a71816: vinsertf128 ymm4,ymm4,XMMWORD PTR [rsp+0x40],0x1 
0x0000000002a7181e: vinsertf128 ymm5,ymm5,XMMWORD PTR [rsp+0x50],0x1 0x0000000002a71826: vinsertf128 ymm6,ymm6,XMMWORD PTR [rsp+0x60],0x1 0x0000000002a7182e: vinsertf128 ymm7,ymm7,XMMWORD PTR [rsp+0x70],0x1 0x0000000002a71836: vinsertf128 ymm8,ymm8,XMMWORD PTR [rsp+0x80],0x1 0x0000000002a71841: vinsertf128 ymm9,ymm9,XMMWORD PTR [rsp+0x90],0x1 0x0000000002a7184c: vinsertf128 ymm10,ymm10,XMMWORD PTR [rsp+0xa0],0x1 0x0000000002a71857: vinsertf128 ymm11,ymm11,XMMWORD PTR [rsp+0xb0],0x1 0x0000000002a71862: vinsertf128 ymm12,ymm12,XMMWORD PTR [rsp+0xc0],0x1 0x0000000002a7186d: vinsertf128 ymm13,ymm13,XMMWORD PTR [rsp+0xd0],0x1 0x0000000002a71878: vinsertf128 ymm14,ymm14,XMMWORD PTR [rsp+0xe0],0x1 0x0000000002a71883: vinsertf128 ymm15,ymm15,XMMWORD PTR [rsp+0xf0],0x1 0x0000000002a7188e: add rsp,0x100 0x0000000002a71895: mov r15,QWORD PTR [rsp] 0x0000000002a71899: mov r14,QWORD PTR [rsp+0x8] 0x0000000002a7189e: mov r13,QWORD PTR [rsp+0x10] 0x0000000002a718a3: mov r12,QWORD PTR [rsp+0x18] 0x0000000002a718a8: mov r11,QWORD PTR [rsp+0x20] 0x0000000002a718ad: mov r10,QWORD PTR [rsp+0x28] 0x0000000002a718b2: mov r9,QWORD PTR [rsp+0x30] 0x0000000002a718b7: mov r8,QWORD PTR [rsp+0x38] 0x0000000002a718bc: mov rdi,QWORD PTR [rsp+0x40] 0x0000000002a718c1: mov rsi,QWORD PTR [rsp+0x48] 0x0000000002a718c6: mov rbp,QWORD PTR [rsp+0x50] 0x0000000002a718cb: mov rbx,QWORD PTR [rsp+0x60] 0x0000000002a718d0: mov rdx,QWORD PTR [rsp+0x68] 0x0000000002a718d5: mov rcx,QWORD PTR [rsp+0x70] 0x0000000002a718da: mov rax,QWORD PTR [rsp+0x78] 0x0000000002a718df: add rsp,0x80 0x0000000002a718e6: fstp QWORD PTR [rsp] 0x0000000002a718e9: vmovsd xmm0,QWORD PTR [rsp] ;*invokestatic pow ; - ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark::trickyMathOctaPow@16 (line 63) 

However, the implementation of the _dpow intrinsic has an interesting feature: it handles a "special case". Here is a snippet from library_call.cpp in the OpenJDK 8 sources:

 //------------------------------inline_pow-------------------------------------
 // Inline power instructions, if possible.
 bool LibraryCallKit::inline_pow() {
   // Pseudocode for pow
   // if (y == 2) {
   //   return x * x;
   // } else {
   //   if (x <= 0.0) {
   //     long longy = (long)y;
   //     if ((double)longy == y) { // if y is long
   //       if (y + 1 == y) longy = 0; // huge number: even
   //       result = ((1&longy) == 0)?-DPow(abs(x), y):DPow(abs(x), y);
   //     } else {
   //       result = NaN;
   //     }
   //   } else {
   //     result = DPow(x,y);
   //   }
   //   if (result != result)? {
   //     result = uncommon_trap() or runtime_call();
   //   }
   //   return result;
   // }
   /* code omitted */
 }

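The HotSpot developers handled one particular case separately: squaring a number. In that case the code substituted by the JIT compiler does nothing but compute x * x. Let's find this check in the disassembled code, using the first call, Math.pow(a, 2), as an example: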

 0x0000000002a70b14: vmovsd xmm1,QWORD PTR [rip+0xffffffffffffff44]  ; load 2.0 into xmm1
 0x0000000002a70b1c: vmovsd QWORD PTR [rsp],xmm1
 0x0000000002a70b21: fld QWORD PTR [rsp]                              ; push 2.0 onto the FPU register stack
 0x0000000002a70b24: vmovsd QWORD PTR [rsp],xmm0
 0x0000000002a70b29: fld QWORD PTR [rsp]                              ; push a onto the FPU register stack
 0x0000000002a70b2c: movabs rax,0x6c4ba7d0
 0x0000000002a70b36: fld QWORD PTR [rax]                              ; push the constant 2.0 onto the FPU register stack
 0x0000000002a70b38: fucomip st,st(2)                                 ; compare 2.0 with 2.0
 0x0000000002a70b3a: jp 0x0000000002a70b53
 0x0000000002a70b40: jne 0x0000000002a70b53
 0x0000000002a70b46: fxch st(1)                                       ; swap, then drop 2.0: only a remains on the FPU stack
 0x0000000002a70b48: ffree st(0)
 0x0000000002a70b4a: fincstp
 0x0000000002a70b4c: fmul st,st(0)                                    ; multiply a by a
 0x0000000002a70b4e: jmp 0x0000000002a70faa
 ; code omitted
 0x0000000002a70faa: fstp QWORD PTR [rsp]
 0x0000000002a70fad: vmovsd xmm0,QWORD PTR [rsp]                      ; xmm0 now holds a * a
 ; code omitted
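
The control flow of the inline_pow pseudocode can be written out as plain Java to make the branches easier to follow. This is only a sketch, not HotSpot code: the class PowSketch and the methods powSketch and dPow are hypothetical names, and Math.pow stands in both for the DPow node and for the runtime-call fallback. For negative bases the sketch negates the result when the exponent is odd, which is what the test eax,0x1 / je / fchs sequence in the disassembly does; the ternary in the source comment appears to read the other way round.

```java
// Hypothetical illustration of the inline_pow pseudocode; not HotSpot code.
final class PowSketch {

    // Stand-in for the DPow node (assumption: the real one is an x87 sequence).
    static double dPow(double x, double y) {
        return Math.pow(x, y);
    }

    static double powSketch(double x, double y) {
        if (y == 2.0) {
            return x * x;                      // special case: a single multiplication
        }
        double result;
        if (x <= 0.0) {
            long longY = (long) y;
            if ((double) longY == y) {         // y holds an integral value
                if (y + 1 == y) {
                    longY = 0;                 // huge exponent: treated as even
                }
                double magnitude = dPow(Math.abs(x), y);
                // Odd exponent: flip the sign (mirrors test eax,0x1 / fchs).
                result = ((longY & 1) == 0) ? magnitude : -magnitude;
            } else {
                result = Double.NaN;           // negative base, fractional exponent
            }
        } else {
            result = dPow(x, y);
        }
        if (result != result) {                // NaN check
            result = Math.pow(x, y);           // stand-in for uncommon_trap()/runtime_call()
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(powSketch(3.0, 2.0));   // takes the x * x shortcut
        System.out.println(powSketch(-2.0, 3.0));  // negative base, odd exponent
        System.out.println(powSketch(-2.0, 0.5));  // falls through to the NaN path
    }
}
```

The point of the sketch is the first branch: when the exponent is exactly 2, none of the logarithm and exponent machinery is reached.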

Benchmark


Benchmark code:

 import java.util.concurrent.TimeUnit;

 import org.openjdk.jmh.annotations.*;
 import org.openjdk.jmh.infra.Blackhole;

 @Fork(value = 3, warmups = 0)
 @Warmup(iterations = 5, time = 1_000, timeUnit = TimeUnit.MILLISECONDS)
 @Measurement(iterations = 10, time = 1_000, timeUnit = TimeUnit.MILLISECONDS)
 @OutputTimeUnit(value = TimeUnit.NANOSECONDS)
 @BenchmarkMode(Mode.AverageTime)
 @State(Scope.Benchmark)
 public class MathBenchmark {
     // Read from a field so the JIT cannot constant-fold the argument.
     public double a;

     @Setup
     public void setup() {
         a = 1234567.890;
     }

     @Benchmark
     public void mathOctaPowBenchmark(Blackhole bh) {
         bh.consume(mathOctaPow(a));
     }

     @Benchmark
     public void plainOctaPowBenchmark(Blackhole bh) {
         bh.consume(plainOctaPow(a));
     }

     @Benchmark
     public void trickyMathOctaPowBenchmark(Blackhole bh) {
         bh.consume(trickyMathOctaPow(a));
     }

     @Benchmark
     public void trickyPlainOctaPowBenchmark(Blackhole bh) {
         bh.consume(trickyPlainOctaPow(a));
     }

     public double mathOctaPow(double a) {
         return Math.pow(a, 8);
     }

     public double plainOctaPow(double a) {
         return a * a * a * a * a * a * a * a;
     }

     public double trickyMathOctaPow(double a) {
         return Math.pow(Math.pow(Math.pow(a, 2), 2), 2);
     }

     public double trickyPlainOctaPow(double a) {
         a *= a;
         a *= a;
         return a * a;
     }
 }
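
The benchmark measures time only. Since the four variants round differently, it can be worth checking separately that they agree up to rounding for the input used here. The following is a minimal sketch of such a check: the class OctaPowAgreement and the helper relDiff are hypothetical and not part of the original benchmark, while the four methods are copied from it as static methods.

```java
// Hypothetical sanity check; not part of the original benchmark.
final class OctaPowAgreement {

    static double mathOctaPow(double a) {
        return Math.pow(a, 8);
    }

    static double plainOctaPow(double a) {
        return a * a * a * a * a * a * a * a;
    }

    static double trickyMathOctaPow(double a) {
        return Math.pow(Math.pow(Math.pow(a, 2), 2), 2);
    }

    static double trickyPlainOctaPow(double a) {
        a *= a;
        a *= a;
        return a * a;
    }

    // Relative difference between two non-zero finite values.
    static double relDiff(double x, double y) {
        return Math.abs(x - y) / Math.max(Math.abs(x), Math.abs(y));
    }

    public static void main(String[] args) {
        double a = 1234567.890;                      // same input as in setup()
        double reference = trickyPlainOctaPow(a);    // arbitrary choice of reference
        System.out.println("mathOctaPow       " + relDiff(mathOctaPow(a), reference));
        System.out.println("plainOctaPow      " + relDiff(plainOctaPow(a), reference));
        System.out.println("trickyMathOctaPow " + relDiff(trickyMathOctaPow(a), reference));
    }
}
```

The printed values are relative differences; anything on the order of a few units in the last place is ordinary floating-point rounding, not a bug in one of the variants.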

Results:

 Benchmark                                  Mode  Cnt   Score   Error  Units
 MathBenchmark.mathOctaPowBenchmark         avgt   30  76,041 ± 0,428  ns/op
 MathBenchmark.plainOctaPowBenchmark        avgt   30   4,174 ± 0,027  ns/op
 MathBenchmark.trickyMathOctaPowBenchmark   avgt   30   3,010 ± 0,014  ns/op
 MathBenchmark.trickyPlainOctaPowBenchmark  avgt   30   3,011 ± 0,015  ns/op

Full benchmark results
# JMH version: 1.20
# VM version: JDK 1.8.0_161, VM 25.161-b12
# VM invoker: C:\Program Files\Java\jre1.8.0_161\bin\java.exe
# VM options: <none>
# Warmup: 5 iterations, 1000 ms each
# Measurement: 10 iterations, 1000 ms each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.mathOctaPowBenchmark

# Run progress: 0,00% complete, ETA 00:03:00
# Fork: 1 of 3
# Warmup Iteration 1: 77,026 ns/op
# Warmup Iteration 2: 76,561 ns/op
# Warmup Iteration 3: 77,623 ns/op
# Warmup Iteration 4: 76,192 ns/op
# Warmup Iteration 5: 76,012 ns/op
Iteration 1: 75,947 ns/op
Iteration 2: 75,739 ns/op
Iteration 3: 75,864 ns/op
Iteration 4: 76,179 ns/op
Iteration 5: 75,934 ns/op
Iteration 6: 75,783 ns/op
Iteration 7: 75,820 ns/op
Iteration 8: 75,898 ns/op
Iteration 9: 75,798 ns/op
Iteration 10: 76,053 ns/op

# Run progress: 8,33% complete, ETA 00:02:48
# Fork: 2 of 3
# Warmup Iteration 1: 75,975 ns/op
# Warmup Iteration 2: 76,008 ns/op
# Warmup Iteration 3: 75,867 ns/op
# Warmup Iteration 4: 76,061 ns/op
# Warmup Iteration 5: 75,710 ns/op
Iteration 1: 75,874 ns/op
Iteration 2: 75,862 ns/op
Iteration 3: 76,080 ns/op
Iteration 4: 75,948 ns/op
Iteration 5: 75,848 ns/op
Iteration 6: 75,883 ns/op
Iteration 7: 76,004 ns/op
Iteration 8: 75,790 ns/op
Iteration 9: 75,894 ns/op
Iteration 10: 75,847 ns/op

# Run progress: 16,67% complete, ETA 00:02:33
# Fork: 3 of 3
# Warmup Iteration 1: 75,778 ns/op
# Warmup Iteration 2: 75,850 ns/op
# Warmup Iteration 3: 75,878 ns/op
# Warmup Iteration 4: 76,025 ns/op
# Warmup Iteration 5: 76,450 ns/op
Iteration 1: 75,791 ns/op
Iteration 2: 75,941 ns/op
Iteration 3: 75,652 ns/op
Iteration 4: 75,795 ns/op
Iteration 5: 75,906 ns/op
Iteration 6: 78,971 ns/op
Iteration 7: 76,055 ns/op
Iteration 8: 75,736 ns/op
Iteration 9: 75,816 ns/op
Iteration 10: 77,537 ns/op

Result "ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.mathOctaPowBenchmark":
76,041 ±(99.9%) 0,428 ns/op [Average]
(min, avg, max) = (75,652, 76,041, 78,971), stdev = 0,640
CI (99.9%): [75,614, 76,469] (assumes normal distribution)

# JMH version: 1.20
# VM version: JDK 1.8.0_161, VM 25.161-b12
# VM invoker: C:\Program Files\Java\jre1.8.0_161\bin\java.exe
# VM options: <none>
# Warmup: 5 iterations, 1000 ms each
# Measurement: 10 iterations, 1000 ms each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.plainOctaPowBenchmark

# Run progress: 25,00% complete, ETA 00:02:17
# Fork: 1 of 3
# Warmup Iteration 1: 4,622 ns/op
# Warmup Iteration 2: 4,406 ns/op
# Warmup Iteration 3: 4,169 ns/op
# Warmup Iteration 4: 4,163 ns/op
# Warmup Iteration 5: 4,153 ns/op
Iteration 1: 4,141 ns/op
Iteration 2: 4,144 ns/op
Iteration 3: 4,141 ns/op
Iteration 4: 4,141 ns/op
Iteration 5: 4,149 ns/op
Iteration 6: 4,136 ns/op
Iteration 7: 4,143 ns/op
Iteration 8: 4,136 ns/op
Iteration 9: 4,140 ns/op
Iteration 10: 4,134 ns/op

# Run progress: 33,33% complete, ETA 00:02:02
# Fork: 2 of 3
# Warmup Iteration 1: 4,567 ns/op
# Warmup Iteration 2: 4,267 ns/op
# Warmup Iteration 3: 4,162 ns/op
# Warmup Iteration 4: 4,155 ns/op
# Warmup Iteration 5: 4,157 ns/op
Iteration 1: 4,157 ns/op
Iteration 2: 4,151 ns/op
Iteration 3: 4,161 ns/op
Iteration 4: 4,175 ns/op
Iteration 5: 4,136 ns/op
Iteration 6: 4,154 ns/op
Iteration 7: 4,192 ns/op
Iteration 8: 4,206 ns/op
Iteration 9: 4,203 ns/op
Iteration 10: 4,180 ns/op

# Run progress: 41,67% complete, ETA 00:01:47
# Fork: 3 of 3
# Warmup Iteration 1: 4,569 ns/op
# Warmup Iteration 2: 4,204 ns/op
# Warmup Iteration 3: 4,172 ns/op
# Warmup Iteration 4: 4,151 ns/op
# Warmup Iteration 5: 4,159 ns/op
Iteration 1: 4,141 ns/op
Iteration 2: 4,175 ns/op
Iteration 3: 4,182 ns/op
Iteration 4: 4,205 ns/op
Iteration 5: 4,246 ns/op
Iteration 6: 4,186 ns/op
Iteration 7: 4,273 ns/op
Iteration 8: 4,240 ns/op
Iteration 9: 4,169 ns/op
Iteration 10: 4,270 ns/op

Result "ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.plainOctaPowBenchmark":
4,174 ±(99.9%) 0,027 ns/op [Average]
(min, avg, max) = (4,134, 4,174, 4,273), stdev = 0,040
CI (99.9%): [4,147, 4,201] (assumes normal distribution)

# JMH version: 1.20
# VM version: JDK 1.8.0_161, VM 25.161-b12
# VM invoker: C:\Program Files\Java\jre1.8.0_161\bin\java.exe
# VM options: <none>
# Warmup: 5 iterations, 1000 ms each
# Measurement: 10 iterations, 1000 ms each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.trickyMathOctaPowBenchmark

# Run progress: 50,00% complete, ETA 00:01:31
# Fork: 1 of 3
# Warmup Iteration 1: 3,396 ns/op
# Warmup Iteration 2: 3,237 ns/op
# Warmup Iteration 3: 3,156 ns/op
# Warmup Iteration 4: 3,020 ns/op
# Warmup Iteration 5: 3,001 ns/op
Iteration 1: 2,995 ns/op
Iteration 2: 3,012 ns/op
Iteration 3: 3,014 ns/op
Iteration 4: 2,997 ns/op
Iteration 5: 3,025 ns/op
Iteration 6: 3,015 ns/op
Iteration 7: 3,004 ns/op
Iteration 8: 2,999 ns/op
Iteration 9: 3,033 ns/op
Iteration 10: 3,003 ns/op

# Run progress: 58,33% complete, ETA 00:01:16
# Fork: 2 of 3
# Warmup Iteration 1: 3,409 ns/op
# Warmup Iteration 2: 3,230 ns/op
# Warmup Iteration 3: 3,057 ns/op
# Warmup Iteration 4: 3,027 ns/op
# Warmup Iteration 5: 3,010 ns/op
Iteration 1: 3,001 ns/op
Iteration 2: 3,001 ns/op
Iteration 3: 3,023 ns/op
Iteration 4: 3,097 ns/op
Iteration 5: 3,017 ns/op
Iteration 6: 2,997 ns/op
Iteration 7: 3,017 ns/op
Iteration 8: 3,011 ns/op
Iteration 9: 2,998 ns/op
Iteration 10: 2,991 ns/op

# Run progress: 66,67% complete, ETA 00:01:01
# Fork: 3 of 3
# Warmup Iteration 1: 3,476 ns/op
# Warmup Iteration 2: 3,188 ns/op
# Warmup Iteration 3: 2,998 ns/op
# Warmup Iteration 4: 2,984 ns/op
# Warmup Iteration 5: 3,023 ns/op
Iteration 1: 2,999 ns/op
Iteration 2: 3,004 ns/op
Iteration 3: 2,998 ns/op
Iteration 4: 3,059 ns/op
Iteration 5: 3,001 ns/op
Iteration 6: 3,006 ns/op
Iteration 7: 3,002 ns/op
Iteration 8: 2,994 ns/op
Iteration 9: 3,005 ns/op
Iteration 10: 2,989 ns/op

Result "ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.trickyMathOctaPowBenchmark":
3,010 ±(99.9%) 0,014 ns/op [Average]
(min, avg, max) = (2,989, 3,010, 3,097), stdev = 0,022
CI (99.9%): [2,996, 3,025] (assumes normal distribution)

# JMH version: 1.20
# VM version: JDK 1.8.0_161, VM 25.161-b12
# VM invoker: C:\Program Files\Java\jre1.8.0_161\bin\java.exe
# VM options: <none>
# Warmup: 5 iterations, 1000 ms each
# Measurement: 10 iterations, 1000 ms each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.trickyPlainOctaPowBenchmark

# Run progress: 75,00% complete, ETA 00:00:45
# Fork: 1 of 3
# Warmup Iteration 1: 3,353 ns/op
# Warmup Iteration 2: 3,169 ns/op
# Warmup Iteration 3: 2,985 ns/op
# Warmup Iteration 4: 3,004 ns/op
# Warmup Iteration 5: 3,018 ns/op
Iteration 1: 2,994 ns/op
Iteration 2: 2,986 ns/op
Iteration 3: 2,986 ns/op
Iteration 4: 3,041 ns/op
Iteration 5: 3,000 ns/op
Iteration 6: 2,993 ns/op
Iteration 7: 2,999 ns/op
Iteration 8: 3,001 ns/op
Iteration 9: 3,024 ns/op
Iteration 10: 2,995 ns/op

# Run progress: 83,33% complete, ETA 00:00:30
# Fork: 2 of 3
# Warmup Iteration 1: 3,371 ns/op
# Warmup Iteration 2: 3,190 ns/op
# Warmup Iteration 3: 3,010 ns/op
# Warmup Iteration 4: 2,992 ns/op
# Warmup Iteration 5: 2,995 ns/op
Iteration 1: 2,993 ns/op
Iteration 2: 3,007 ns/op
Iteration 3: 2,999 ns/op
Iteration 4: 3,006 ns/op
Iteration 5: 2,992 ns/op
Iteration 6: 3,009 ns/op
Iteration 7: 3,013 ns/op
Iteration 8: 3,012 ns/op
Iteration 9: 3,010 ns/op
Iteration 10: 3,000 ns/op

# Run progress: 91,67% complete, ETA 00:00:15
# Fork: 3 of 3
# Warmup Iteration 1: 3,388 ns/op
# Warmup Iteration 2: 3,239 ns/op
# Warmup Iteration 3: 3,046 ns/op
# Warmup Iteration 4: 3,146 ns/op
# Warmup Iteration 5: 3,008 ns/op
Iteration 1: 3,023 ns/op
Iteration 2: 3,048 ns/op
Iteration 3: 3,039 ns/op
Iteration 4: 3,094 ns/op
Iteration 5: 3,024 ns/op
Iteration 6: 3,004 ns/op
Iteration 7: 2,991 ns/op
Iteration 8: 3,025 ns/op
Iteration 9: 3,006 ns/op
Iteration 10: 3,006 ns/op

Result "ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.trickyPlainOctaPowBenchmark":
3,011 ±(99.9%) 0,015 ns/op [Average]
(min, avg, max) = (2,986, 3,011, 3,094), stdev = 0,023
CI (99.9%): [2,996, 3,026] (assumes normal distribution)

# Run complete. Total time: 00:03:03

Benchmark                                  Mode  Cnt   Score   Error  Units
MathBenchmark.mathOctaPowBenchmark         avgt   30  76,041 ± 0,428  ns/op
MathBenchmark.plainOctaPowBenchmark        avgt   30   4,174 ± 0,027  ns/op
MathBenchmark.trickyMathOctaPowBenchmark   avgt   30   3,010 ± 0,014  ns/op
MathBenchmark.trickyPlainOctaPowBenchmark  avgt   30   3,011 ± 0,015  ns/op

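The benchmark results confirm our reasoning: the difference between Math.pow(a, 2) and (a * a) turned out to be insignificant.

To demonstrate how effective the _dpow intrinsic is, we can run the same benchmark with the intrinsic disabled: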

 Benchmark                                  Mode  Cnt    Score   Error  Units
 MathBenchmark.mathOctaPowBenchmark         avgt   30  195,222 ± 0,850  ns/op
 MathBenchmark.plainOctaPowBenchmark        avgt   30    4,183 ± 0,030  ns/op
 MathBenchmark.trickyMathOctaPowBenchmark   avgt   30   41,158 ± 0,381  ns/op
 MathBenchmark.trickyPlainOctaPowBenchmark  avgt   30    3,081 ± 0,032  ns/op
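
To put the two result tables side by side, the ratios between the reported scores can be computed directly. The snippet below is just that arithmetic: the class PowSpeedups and the method ratio are hypothetical names, and the numbers are the average scores (ns/op) copied from the tables above.

```java
// Hypothetical helper: plain arithmetic on the scores reported above.
final class PowSpeedups {

    // How many times slower the first score is than the second.
    static double ratio(double slowerNsPerOp, double fasterNsPerOp) {
        return slowerNsPerOp / fasterNsPerOp;
    }

    public static void main(String[] args) {
        // Scores with the _dpow intrinsic enabled (default VM options).
        double mathWith = 76.041;
        double trickyMathWith = 3.010;
        // Scores with -XX:DisableIntrinsic=_dpow.
        double mathWithout = 195.222;
        double trickyMathWithout = 41.158;

        System.out.printf("Math.pow(a, 8) vs nested Math.pow(.., 2), intrinsic on: %.1fx%n",
                ratio(mathWith, trickyMathWith));
        System.out.printf("Math.pow(a, 8), intrinsic off vs on: %.1fx%n",
                ratio(mathWithout, mathWith));
        System.out.printf("nested Math.pow(.., 2), intrinsic off vs on: %.1fx%n",
                ratio(trickyMathWithout, trickyMathWith));
    }
}
```

On these numbers, disabling the intrinsic makes Math.pow(a, 8) roughly 2.6 times slower and the nested Math.pow(.., 2) variant roughly 13.7 times slower, while the two plain-multiplication variants stay where they were, as expected since they never call Math.pow.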

Full benchmark results
# JMH version: 1.20
# VM version: JDK 1.8.0_161, VM 25.161-b12
# VM invoker: C:\Program Files\Java\jre1.8.0_161\bin\java.exe
# VM options: -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_dpow
# Warmup: 5 iterations, 1000 ms each
# Measurement: 10 iterations, 1000 ms each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.mathOctaPowBenchmark

# Run progress: 0,00% complete, ETA 00:03:00
# Fork: 1 of 3
# Warmup Iteration 1: 194,013 ns/op
# Warmup Iteration 2: 197,926 ns/op
# Warmup Iteration 3: 197,374 ns/op
# Warmup Iteration 4: 197,242 ns/op
# Warmup Iteration 5: 202,265 ns/op
Iteration 1: 198,168 ns/op
Iteration 2: 198,107 ns/op
Iteration 3: 197,629 ns/op
Iteration 4: 195,174 ns/op
Iteration 5: 194,771 ns/op
Iteration 6: 194,804 ns/op
Iteration 7: 194,732 ns/op
Iteration 8: 194,932 ns/op
Iteration 9: 194,964 ns/op
Iteration 10: 194,774 ns/op

# Run progress: 8,33% complete, ETA 00:02:48
# Fork: 2 of 3
# Warmup Iteration 1: 200,032 ns/op
# Warmup Iteration 2: 200,323 ns/op
# Warmup Iteration 3: 195,602 ns/op
# Warmup Iteration 4: 194,705 ns/op
# Warmup Iteration 5: 194,277 ns/op
Iteration 1: 194,657 ns/op
Iteration 2: 195,459 ns/op
Iteration 3: 199,108 ns/op
Iteration 4: 195,154 ns/op
Iteration 5: 195,208 ns/op
Iteration 6: 194,692 ns/op
Iteration 7: 194,406 ns/op
Iteration 8: 194,979 ns/op
Iteration 9: 194,950 ns/op
Iteration 10: 194,234 ns/op

# Run progress: 16,67% complete, ETA 00:02:33
# Fork: 3 of 3
# Warmup Iteration 1: 193,094 ns/op
# Warmup Iteration 2: 192,849 ns/op
# Warmup Iteration 3: 195,101 ns/op
# Warmup Iteration 4: 195,456 ns/op
# Warmup Iteration 5: 194,698 ns/op
Iteration 1: 194,806 ns/op
Iteration 2: 194,887 ns/op
Iteration 3: 194,863 ns/op
Iteration 4: 195,134 ns/op
Iteration 5: 194,379 ns/op
Iteration 6: 193,851 ns/op
Iteration 7: 194,085 ns/op
Iteration 8: 194,743 ns/op
Iteration 9: 194,486 ns/op
Iteration 10: 194,508 ns/op

Result "ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.mathOctaPowBenchmark":
195,222 ±(99.9%) 0,850 ns/op [Average]
(min, avg, max) = (193,851, 195,222, 199,108), stdev = 1,272
CI (99.9%): [194,372, 196,071] (assumes normal distribution)

# JMH version: 1.20
# VM version: JDK 1.8.0_161, VM 25.161-b12
# VM invoker: C:\Program Files\Java\jre1.8.0_161\bin\java.exe
# VM options: -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_dpow
# Warmup: 5 iterations, 1000 ms each
# Measurement: 10 iterations, 1000 ms each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.plainOctaPowBenchmark

# Run progress: 25,00% complete, ETA 00:02:17
# Fork: 1 of 3
# Warmup Iteration 1: 4,569 ns/op
# Warmup Iteration 2: 4,238 ns/op
# Warmup Iteration 3: 4,167 ns/op
# Warmup Iteration 4: 4,211 ns/op
# Warmup Iteration 5: 4,267 ns/op
Iteration 1: 4,185 ns/op
Iteration 2: 4,280 ns/op
Iteration 3: 4,186 ns/op
Iteration 4: 4,202 ns/op
Iteration 5: 4,193 ns/op
Iteration 6: 4,360 ns/op
Iteration 7: 4,191 ns/op
Iteration 8: 4,181 ns/op
Iteration 9: 4,176 ns/op
Iteration 10: 4,170 ns/op

# Run progress: 33,33% complete, ETA 00:02:02
# Fork: 2 of 3
# Warmup Iteration 1: 4,573 ns/op
# Warmup Iteration 2: 4,218 ns/op
# Warmup Iteration 3: 4,176 ns/op
# Warmup Iteration 4: 4,155 ns/op
# Warmup Iteration 5: 4,279 ns/op
Iteration 1: 4,251 ns/op
Iteration 2: 4,207 ns/op
Iteration 3: 4,175 ns/op
Iteration 4: 4,174 ns/op
Iteration 5: 4,182 ns/op
Iteration 6: 4,196 ns/op
Iteration 7: 4,169 ns/op
Iteration 8: 4,164 ns/op
Iteration 9: 4,175 ns/op
Iteration 10: 4,157 ns/op

# Run progress: 41,67% complete, ETA 00:01:47
# Fork: 3 of 3
# Warmup Iteration 1: 4,561 ns/op
# Warmup Iteration 2: 4,193 ns/op
# Warmup Iteration 3: 4,139 ns/op
# Warmup Iteration 4: 4,152 ns/op
# Warmup Iteration 5: 4,154 ns/op
Iteration 1: 4,141 ns/op
Iteration 2: 4,144 ns/op
Iteration 3: 4,157 ns/op
Iteration 4: 4,141 ns/op
Iteration 5: 4,162 ns/op
Iteration 6: 4,135 ns/op
Iteration 7: 4,166 ns/op
Iteration 8: 4,156 ns/op
Iteration 9: 4,160 ns/op
Iteration 10: 4,144 ns/op

Result "ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.plainOctaPowBenchmark":
4,183 ±(99.9%) 0,030 ns/op [Average]
(min, avg, max) = (4,135, 4,183, 4,360), stdev = 0,045
CI (99.9%): [4,152, 4,213] (assumes normal distribution)

# JMH version: 1.20
# VM version: JDK 1.8.0_161, VM 25.161-b12
# VM invoker: C:\Program Files\Java\jre1.8.0_161\bin\java.exe
# VM options: -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_dpow
# Warmup: 5 iterations, 1000 ms each
# Measurement: 10 iterations, 1000 ms each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.trickyMathOctaPowBenchmark

# Run progress: 50,00% complete, ETA 00:01:31
# Fork: 1 of 3
# Warmup Iteration 1: 41,544 ns/op
# Warmup Iteration 2: 41,150 ns/op
# Warmup Iteration 3: 41,312 ns/op
# Warmup Iteration 4: 41,196 ns/op
# Warmup Iteration 5: 41,002 ns/op
Iteration 1: 43,681 ns/op
Iteration 2: 41,183 ns/op
Iteration 3: 41,598 ns/op
Iteration 4: 41,703 ns/op
Iteration 5: 41,365 ns/op
Iteration 6: 41,210 ns/op
Iteration 7: 41,380 ns/op
Iteration 8: 41,413 ns/op
Iteration 9: 41,481 ns/op
Iteration 10: 41,763 ns/op

# Run progress: 58,33% complete, ETA 00:01:16
# Fork: 2 of 3
# Warmup Iteration 1: 41,665 ns/op
# Warmup Iteration 2: 40,970 ns/op
# Warmup Iteration 3: 40,872 ns/op
# Warmup Iteration 4: 40,926 ns/op
# Warmup Iteration 5: 40,794 ns/op
Iteration 1: 41,103 ns/op
Iteration 2: 40,991 ns/op
Iteration 3: 40,859 ns/op
Iteration 4: 41,046 ns/op
Iteration 5: 41,241 ns/op
Iteration 6: 40,711 ns/op
Iteration 7: 40,571 ns/op
Iteration 8: 40,928 ns/op
Iteration 9: 40,662 ns/op
Iteration 10: 40,911 ns/op

# Run progress: 66,67% complete, ETA 00:01:01
# Fork: 3 of 3
# Warmup Iteration 1: 42,068 ns/op
# Warmup Iteration 2: 41,017 ns/op
# Warmup Iteration 3: 41,260 ns/op
# Warmup Iteration 4: 41,147 ns/op
# Warmup Iteration 5: 40,777 ns/op
Iteration 1: 41,060 ns/op
Iteration 2: 40,881 ns/op
Iteration 3: 41,014 ns/op
Iteration 4: 40,826 ns/op
Iteration 5: 40,977 ns/op
Iteration 6: 40,837 ns/op
Iteration 7: 41,023 ns/op
Iteration 8: 40,749 ns/op
Iteration 9: 40,959 ns/op
Iteration 10: 40,611 ns/op

Result «ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.trickyMathOctaPowBenchmark»:
41,158 ±(99.9%) 0,381 ns/op [Average]
(min, avg, max) = (40,571, 41,158, 43,681), stdev = 0,570
CI (99.9%): [40,777, 41,538] (assumes normal distribution)

# JMH version: 1.20
# VM version: JDK 1.8.0_161, VM 25.161-b12
# VM invoker: C:\Program Files\Java\jre1.8.0_161\bin\java.exe
# VM options: -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_dpow
# Warmup: 5 iterations, 1000 ms each
# Measurement: 10 iterations, 1000 ms each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.trickyPlainOctaPowBenchmark

# Run progress: 75,00% complete, ETA 00:00:45
# Fork: 1 of 3
# Warmup Iteration 1: 3,384 ns/op
# Warmup Iteration 2: 3,214 ns/op
# Warmup Iteration 3: 3,063 ns/op
# Warmup Iteration 4: 3,051 ns/op
# Warmup Iteration 5: 3,073 ns/op
Iteration 1: 3,090 ns/op
Iteration 2: 3,045 ns/op
Iteration 3: 3,054 ns/op
Iteration 4: 3,074 ns/op
Iteration 5: 3,058 ns/op
Iteration 6: 3,059 ns/op
Iteration 7: 3,075 ns/op
Iteration 8: 3,092 ns/op
Iteration 9: 3,155 ns/op
Iteration 10: 3,089 ns/op

# Run progress: 83,33% complete, ETA 00:00:30
# Fork: 2 of 3
# Warmup Iteration 1: 3,442 ns/op
# Warmup Iteration 2: 3,315 ns/op
# Warmup Iteration 3: 3,027 ns/op
# Warmup Iteration 4: 3,031 ns/op
# Warmup Iteration 5: 3,051 ns/op
Iteration 1: 3,032 ns/op
Iteration 2: 3,051 ns/op
Iteration 3: 3,050 ns/op
Iteration 4: 3,076 ns/op
Iteration 5: 3,067 ns/op
Iteration 6: 3,018 ns/op
Iteration 7: 3,034 ns/op
Iteration 8: 3,017 ns/op
Iteration 9: 3,041 ns/op
Iteration 10: 3,023 ns/op

# Run progress: 91,67% complete, ETA 00:00:15
# Fork: 3 of 3
# Warmup Iteration 1: 3,415 ns/op
# Warmup Iteration 2: 3,276 ns/op
# Warmup Iteration 3: 3,344 ns/op
# Warmup Iteration 4: 3,226 ns/op
# Warmup Iteration 5: 3,072 ns/op
Iteration 1: 3,150 ns/op
Iteration 2: 3,132 ns/op
Iteration 3: 3,172 ns/op
Iteration 4: 3,101 ns/op
Iteration 5: 3,053 ns/op
Iteration 6: 3,061 ns/op
Iteration 7: 3,106 ns/op
Iteration 8: 3,150 ns/op
Iteration 9: 3,097 ns/op
Iteration 10: 3,204 ns/op

Result «ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.trickyPlainOctaPowBenchmark»:
3,081 ±(99.9%) 0,032 ns/op [Average]
(min, avg, max) = (3,017, 3,081, 3,204), stdev = 0,048
CI (99.9%): [3,049, 3,113] (assumes normal distribution)

# Run complete. Total time: 00:03:03

Benchmark                                  Mode  Cnt    Score   Error  Units
MathBenchmark.mathOctaPowBenchmark         avgt   30  195,222 ± 0,850  ns/op
MathBenchmark.plainOctaPowBenchmark        avgt   30    4,183 ± 0,030  ns/op
MathBenchmark.trickyMathOctaPowBenchmark   avgt   30   41,158 ± 0,381  ns/op
MathBenchmark.trickyPlainOctaPowBenchmark  avgt   30    3,081 ± 0,032  ns/op

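With the intrinsic disabled, these numbers show the cost of honest calls to the native method StrictMath.pow(). An interesting fact: three nested calls to StrictMath.pow(x, 2) (about 41 ns) are still much faster than a single StrictMath.pow(x, 8) (about 195 ns). This suggests that the native implementation also has a special case for squaring.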
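The two StrictMath.pow() call shapes compared above can be checked for their values (not their speed) with a small sketch. The class name StrictPowCheck and the sample input are assumptions made for illustration; only StrictMath.pow comes from the article. Because floating-point results need not match bit for bit, the sketch reports a relative difference instead of comparing with ==.

```java
// Illustrative sketch; StrictPowCheck and the sample input are hypothetical.
final class StrictPowCheck {

    // One call with exponent 8.
    static double direct(double a) {
        return StrictMath.pow(a, 8);
    }

    // Three nested calls with exponent 2: ((a^2)^2)^2 = a^8.
    static double nested(double a) {
        return StrictMath.pow(StrictMath.pow(StrictMath.pow(a, 2), 2), 2);
    }

    // Relative difference between the two results (assumes direct(a) != 0).
    static double relativeDiff(double a) {
        double d = direct(a);
        double n = nested(a);
        return Math.abs(d - n) / Math.abs(d);
    }

    public static void main(String[] args) {
        double a = 1.010101;
        System.out.println("direct        = " + direct(a));
        System.out.println("nested        = " + nested(a));
        System.out.println("relative diff = " + relativeDiff(a));
    }
}
```

This only confirms that both shapes compute the same power up to rounding; the timing difference has to be measured with a harness such as JMH, as in the logs above.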

Conclusion


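The story of how the _dpow intrinsic is implemented deserves a chapter of its own. Judging by the changes in the OpenJDK repository, the intrinsic keeps changing from release to release, and the developers keep forgetting special cases. Andrey Pangin (apangin) spoke about this at the Joker 2016 conference in his talk "Myths and facts about slow Java".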
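To make "special case" concrete, a fast path of the kind discussed here could look like the sketch below. This is not HotSpot's code: the class and method names are hypothetical, and the general path simply delegates to Math.pow.

```java
// Illustrative sketch only; this is not the actual _dpow intrinsic.
final class PowFastPath {

    // Squaring is handled with a single multiplication; every other
    // exponent falls through to the general-purpose routine.
    static double pow(double x, double y) {
        if (y == 2.0) {
            return x * x;
        }
        return Math.pow(x, y);
    }

    public static void main(String[] args) {
        System.out.println(pow(3.0, 2.0)); // takes the fast path
        System.out.println(pow(2.0, 8.0)); // takes the general path
    }
}
```

A check like this is cheap, which is why nested squarings can end up costing little more than plain multiplications when the special case is present.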

Correct answer


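Options 3 and 4 are equally fast, thanks to a special case in the intrinsic implementation that essentially reduces Math.pow(x, 2) to x * x.

Option 2 is slower because it performs more operations.

Option 1 is significantly slower: even though the intrinsic is used, it still runs the complex general logic for raising a number to a power of type double.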
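The difference in operation count between options 2 and 4 can be made explicit. The following is a minimal sketch: the class name and the counter are hypothetical instrumentation, not part of the original benchmark, and it says nothing about timing.

```java
// Illustrative sketch: counts multiplications; all names are hypothetical.
final class MulCount {
    static int muls; // number of multiplications performed since the last reset

    static double mul(double x, double y) {
        muls++;
        return x * y;
    }

    // Option 2: a * a * ... * a evaluated left to right, 7 multiplications.
    static double plainOctaPow(double a) {
        double r = a;
        for (int i = 0; i < 7; i++) {
            r = mul(r, a);
        }
        return r;
    }

    // Option 4: repeated squaring, 3 multiplications.
    static double trickyPlainOctaPow(double a) {
        a = mul(a, a);
        a = mul(a, a);
        return mul(a, a);
    }

    public static void main(String[] args) {
        muls = 0;
        double plain = plainOctaPow(2.0);
        int plainMuls = muls;

        muls = 0;
        double tricky = trickyPlainOctaPow(2.0);
        int trickyMuls = muls;

        System.out.println(plain + " with " + plainMuls + " multiplications");   // 256.0 with 7 multiplications
        System.out.println(tricky + " with " + trickyMuls + " multiplications"); // 256.0 with 3 multiplications
    }
}
```

In both variants each multiplication depends on the previous result, so the shorter chain in option 4 is what makes it cheaper.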

Statistics


Two conference participants gave the correct answer. Another five answers were partially correct. In total, 32 answers were submitted.

PS


All the code is on GitHub: jbreak2018-pow-perf-test

Source: https://habr.com/ru/post/J351812/

