diff --git a/README.md b/README.md
index 30ed37427b23d8fc6c5d208757cf9ca11fc27ba7..c601fa44f88deba92a9f33abcb1c295dc7c3a04f 100644
--- a/README.md
+++ b/README.md
@@ -309,45 +309,38 @@ DIV_03 has a slightly higher error rate, but general-purpose use
 <br>
 ### Speed test
+
+
 * 32-bit results
 1. Multiply
- MUL_03 < MUL_02 < MUL_01
-
-<br>
+ MUL_03 = MUL_02 < MUL_01
-Running the multiplication in 32-bit showed that the variants without a cast were faster.
-<br>
-<br>
 2. Division
- DIV_01 = DIV_03 < DIV_02
+ DIV_03 < DIV_02 < DIV_01
-<br>
 <br>
 * 64-bit results
 1. Multiply
- MUL_03 = MUL_02 < MUL_01
+ MUL_03 < MUL_02 < MUL_01
-<br>
-<br>
 2. Division
- DIV_02 < DIV_03 < DIV_01
+ DIV_01 = DIV_02 < DIV_03
 <br>
 <br>
-
 ### Sin
 #### sin table
@@ -390,6 +383,67 @@ fx_1516_SinTable is a table built for fx_s1516, so to fit fx_s1615
 <br>
 <br>
+
+
+### double vs long long
+
+* We measured which is faster when multiplication and division are performed after casting to double versus long long.
+
+1. Multiply test
+
+
+ ```
+Flat profile:
+
+Each sample counts as 0.01 seconds.
+  %   cumulative   self              self     total
+ time   seconds   seconds    calls   s/call   s/call  name
+ 63.43      2.00     2.00        1     2.00     2.00  fx_1615_double_mul_test
+ 12.69      2.41     0.40        1     0.40     0.40  fx_1615_longlong_mul1_test
+ 12.05      2.79     0.38        1     0.38     0.38  fx_1615_longlong_mul2_test
+ 12.05      3.17     0.38        1     0.38     0.38  fx_1615_longlong_mul3_test
+
+ Call graph
+
+
+granularity: each sample hit covers 2 byte(s) for 0.32% of 3.17 seconds
+
+index % time    self  children    called     name
+                                                 <spontaneous>
+[1]    100.0    0.00    3.17                 main [1]
+                2.00    0.00       1/1           fx_1615_double_mul_test [2]
+                0.40    0.00       1/1           fx_1615_longlong_mul1_test [3]
+                0.38    0.00       1/1           fx_1615_longlong_mul3_test [5]
+                0.38    0.00       1/1           fx_1615_longlong_mul2_test [4]
+-----------------------------------------------
+                2.00    0.00       1/1           main [1]
+[2]     63.3    2.00    0.00       1         fx_1615_double_mul_test [2]
+-----------------------------------------------
+                0.40    0.00       1/1           main [1]
+[3]     12.7    0.40    0.00       1         fx_1615_longlong_mul1_test [3]
+-----------------------------------------------
+                0.38    0.00       1/1           main [1]
+[4]     12.0    0.38    0.00       1         fx_1615_longlong_mul2_test [4]
+-----------------------------------------------
+                0.38    0.00       1/1           main [1]
+[5]     12.0    0.38    0.00       1         fx_1615_longlong_mul3_test [5]
+-----------------------------------------------
+
+Index by function name
+
+   [2] fx_1615_double_mul_test     [4] fx_1615_longlong_mul2_test
+   [3] fx_1615_longlong_mul1_test  [5] fx_1615_longlong_mul3_test
+
+ ```
+
+[Summary of results]
+* The double version takes longer than the long long versions: 2.00 s versus 0.38 to 0.40 s in the profile above.
+* fx_1615_longlong_mul3_test, which right-shifts a (a value in fx_s1615 format) by 8 and b by 7 before multiplying the two, was the fastest long long variant, tied with fx_1615_longlong_mul2_test at 0.38 s in this profile.
+
+<br>
+<br>
+
+
 ## 0819 assignment
 ### Requirements specification
diff --git a/multipleOutput.txt b/multipleOutput.txt
index dbde321dc734230a46d48c3bdd520a7ea70c4bb0..fd2989454c5a9ce16619f17fe87ef8e78df1f2d9 100644
--- a/multipleOutput.txt
+++ b/multipleOutput.txt
@@ -3,38 +3,38 @@ Flat profile:
 Each sample counts as 0.01 seconds.
   %   cumulative   self              self     total
  time   seconds   seconds    calls   s/call   s/call  name
- 42.53      2.13     2.13        1     2.13     2.13  fx_1615_double_test
- 33.54      3.82     1.68        1     1.68     1.68  fx_1615_longlong_div03_test
- 11.98      4.42     0.60        1     0.60     0.60  fx_1615_longlong_div01_test
- 11.58      5.00     0.58        1     0.58     0.58  fx_1615_longlong_div02_test
+ 63.43      2.00     2.00        1     2.00     2.00  fx_1615_double_mul_test
+ 12.69      2.41     0.40        1     0.40     0.40  fx_1615_longlong_mul1_test
+ 12.05      2.79     0.38        1     0.38     0.38  fx_1615_longlong_mul2_test
+ 12.05      3.17     0.38        1     0.38     0.38  fx_1615_longlong_mul3_test
  Call graph
-granularity: each sample hit covers 2 byte(s) for 0.20% of 5.00 seconds
+granularity: each sample hit covers 2 byte(s) for 0.32% of 3.17 seconds
 index % time    self  children    called     name
                                                  <spontaneous>
-[1]    100.0    0.00    5.00                 main [1]
-                2.13    0.00       1/1           fx_1615_double_test [2]
-                1.68    0.00       1/1           fx_1615_longlong_div03_test [3]
-                0.60    0.00       1/1           fx_1615_longlong_div01_test [4]
-                0.58    0.00       1/1           fx_1615_longlong_div02_test [5]
+[1]    100.0    0.00    3.17                 main [1]
+                2.00    0.00       1/1           fx_1615_double_mul_test [2]
+                0.40    0.00       1/1           fx_1615_longlong_mul1_test [3]
+                0.38    0.00       1/1           fx_1615_longlong_mul3_test [5]
+                0.38    0.00       1/1           fx_1615_longlong_mul2_test [4]
 -----------------------------------------------
-                2.13    0.00       1/1           main [1]
-[2]     42.7    2.13    0.00       1         fx_1615_double_test [2]
+                2.00    0.00       1/1           main [1]
+[2]     63.3    2.00    0.00       1         fx_1615_double_mul_test [2]
 -----------------------------------------------
-                1.68    0.00       1/1           main [1]
-[3]     33.7    1.68    0.00       1         fx_1615_longlong_div03_test [3]
+                0.40    0.00       1/1           main [1]
+[3]     12.7    0.40    0.00       1         fx_1615_longlong_mul1_test [3]
 -----------------------------------------------
-                0.60    0.00       1/1           main [1]
-[4]     12.0    0.60    0.00       1         fx_1615_longlong_div01_test [4]
+                0.38    0.00       1/1           main [1]
+[4]     12.0    0.38    0.00       1         fx_1615_longlong_mul2_test [4]
 -----------------------------------------------
-                0.58    0.00       1/1           main [1]
-[5]     11.6    0.58    0.00       1         fx_1615_longlong_div02_test [5]
+                0.38    0.00       1/1           main [1]
+[5]     12.0    0.38    0.00       1         fx_1615_longlong_mul3_test [5]
 -----------------------------------------------
 Index by function name
-   [2] fx_1615_double_test         [5] fx_1615_longlong_div02_test
-   [4] fx_1615_longlong_div01_test [3] fx_1615_longlong_div03_test
+   [2] fx_1615_double_mul_test     [4] fx_1615_longlong_mul2_test
+   [3] fx_1615_longlong_mul1_test  [5] fx_1615_longlong_mul3_test
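
Note on the profiled variants: the README summary describes multiplying fx_s1615 values through a double cast, through a long long cast, and with operands pre-shifted by 8 and 7 bits. A minimal C sketch of those three ideas follows. Everything named here is an assumption for illustration: the `fx_s1615` typedef (taken to be a 32-bit signed value with 15 fraction bits) and the helpers `to_fx`, `mul_double`, `mul_longlong`, and `mul_preshift` are hypothetical and are not the repository's actual test functions. Only the shift-by-8 and shift-by-7 step comes from the summary text.

```c
#include <stdint.h>

/* Assumed layout: sign + 16 integer bits + 15 fraction bits in 32 bits. */
typedef int32_t fx_s1615;
#define FX_Q 15

/* Hypothetical helper: convert a real number to the fixed-point raw value. */
static inline fx_s1615 to_fx(double v) {
    return (fx_s1615)(v * (double)(1 << FX_Q));
}

/* Variant 1 (assumed shape): multiply in double, then rescale by 2^15. */
static inline fx_s1615 mul_double(fx_s1615 a, fx_s1615 b) {
    return (fx_s1615)((double)a * (double)b / (double)(1 << FX_Q));
}

/* Variant 2 (assumed shape): widen to long long, multiply, shift back by 15. */
static inline fx_s1615 mul_longlong(fx_s1615 a, fx_s1615 b) {
    return (fx_s1615)(((long long)a * (long long)b) >> FX_Q);
}

/* Variant 3: pre-shift a by 8 and b by 7 (8 + 7 = 15), then multiply.
 * The low bits of both operands are discarded before the multiply.
 * Right-shifting negative values is implementation-defined in C, so this
 * sketch is only exercised with non-negative inputs. */
static inline fx_s1615 mul_preshift(fx_s1615 a, fx_s1615 b) {
    return (fx_s1615)((long long)(a >> 8) * (long long)(b >> 7));
}
```

For 1.5 × 2.25 all three variants agree, because the low bits of both raw operands are zero. For raw operands 32769 and 32769, the pre-shifted variant returns 32768 while the widened multiply returns 32770, which shows the precision that the pre-shift gives up.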