Is Pythonic Code Really Efficient? Why?

Pythonic code is famous for being efficient, but how exactly is it efficient, and what makes that possible?

Let's look at the CPU time directly.

ex1) Several ways to add values to a list

# list-comprehension
def list_comprehension(x):
    result = [i*i for i in range(x)]
    return result

def list_append(x):
    result=[]
    for i in range(x):
        result.append(i*i)
    return result

def list_extend(x):
    result = []
    result.extend(i*i for i in range(x))
    return result

import cProfile  # lets us inspect CPU time

# cProfile reports how many calls were made and how long they took, as shown below.

if __name__ == "__main__":
    cProfile.run('list_comprehension(100000000)')
    cProfile.run('list_append(100000000)')
    cProfile.run('list_extend(100000000)')

         5 function calls in 16.079 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001   12.395   12.395 <ipython-input-20-50a3b396c8b8>:2(list_comprehension)
        1   12.394   12.394   12.394   12.394 <ipython-input-20-50a3b396c8b8>:3(<listcomp>)
        1    3.683    3.683   16.078   16.078 <string>:1(<module>)
        1    0.001    0.001   16.079   16.079 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

         100000004 function calls in 33.958 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1   19.644   19.644   31.368   31.368 <ipython-input-20-50a3b396c8b8>:6(list_append)
        1    2.582    2.582   33.950   33.950 <string>:1(<module>)
        1    0.007    0.007   33.957   33.957 {built-in method builtins.exec}
100000000   11.723    0.000   11.723    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

         100000006 function calls in 32.000 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   29.513   29.513 <ipython-input-20-50a3b396c8b8>:12(list_extend)
100000001   15.330    0.000   15.330    0.000 <ipython-input-20-50a3b396c8b8>:14(<genexpr>)
        1    2.479    2.479   31.993   31.993 <string>:1(<module>)
        1    0.007    0.007   32.000   32.000 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1   14.183   14.183   29.513   29.513 {method 'extend' of 'list' objects}

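# The list comprehension wins clearly. Why?

Look at the number of function calls: 5 for list_comprehension, 100,000,004 for list_append, and 100,000,006 for list_extend.
The difference comes down to how many times a function is called.
Each extra call adds avoidable work, such as looking up names and attributes: list_append has to find result.append and call it on every iteration, and list_extend has to resume the generator expression for every element.
The other two repeat that work N times, while the list comprehension builds the whole list inside a single <listcomp> call.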
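
One way to check this explanation is to look at the bytecode instead of the timings. The sketch below was not part of the original measurements. It uses the standard dis module, the helper name opnames is made up for illustration, and it assumes CPython, where a list comprehension is compiled to a dedicated LIST_APPEND instruction while the explicit loop has to look up and call result.append each time.

```python
import dis
import types


def list_comprehension(x):
    return [i * i for i in range(x)]


def list_append(x):
    result = []
    for i in range(x):
        result.append(i * i)
    return result


def opnames(func_or_code):
    """Collect bytecode instruction names, including nested code objects.

    Hypothetical helper for this sketch. Older CPython versions compile a
    comprehension into a separate code object stored in co_consts, so the
    search has to recurse into those as well.
    """
    code = getattr(func_or_code, "__code__", func_or_code)
    names = {instruction.opname for instruction in dis.get_instructions(code)}
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names |= opnames(const)
    return names


comprehension_ops = opnames(list_comprehension)
append_ops = opnames(list_append)

# The comprehension appends with a single bytecode instruction;
# the explicit loop goes through a method lookup and a call instead.
print("comprehension uses LIST_APPEND:", "LIST_APPEND" in comprehension_ops)
print("append loop uses LIST_APPEND:  ", "LIST_APPEND" in append_ops)
```

Calling dis.dis(list_append) prints the full listing if you want to see the attribute lookup and call instructions themselves; their exact names differ between CPython versions.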

ex2) Merging Dictionaries

Memory profiling could not be run in the IPython environment.
So it was done in the file Pythonic_Code_Memory_Usage.py instead.

'''
d1 = {i:i for i in range(10000000)}
d2 = {j:j for j in range(10000001,20000000)}
from memory_profiler import profile

@profile(precision=4)
def for_loop(d1, d2):
    result = {}

    for k in d1:
        result[k] = d1[k]
    for k in d2:
        result[k] = d2[k]

    return result

@profile(precision=4)
def update_method(d1,d2):
    result = {}
    result.update(d1)
    result.update(d2)
    return result

@profile(precision=4)
def dict_comprehension(d1,d2):
    result = {k:v for d in [d1,d2] for k,v in d.items()}
    return result

@profile(precision=4)
def dict_kwargs(d1,d2):
    result = {**d1,**d2}
    return result

if __name__ == "__main__":
    data1 = for_loop(d1,d2)
    data2 = update_method(d1,d2)
    data3 = dict_comprehension(d1,d2)
    data4 = dict_kwargs(d1,d2)    '''

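The update method and dictionary unpacking ({**d1, **d2}, the dict_kwargs function above) were the fastest.
For loop - the loop body has to run once per key, with name lookups and a separate item assignment on every iteration, so it produces a large number of Occurrences.
Unpacking - in contrast, the merge is a single expression that only touches local variables, so its Occurrences count is 1.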
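
For readers who do not have memory_profiler installed, the comparison can be approximated with the standard library. This is only a sketch and not how the numbers above were produced: it swaps in tracemalloc, uses far smaller dictionaries, and the names peak_bytes and dict_unpacking are invented for illustration. tracemalloc only tracks allocations made through Python's allocator, so its figures will not match memory_profiler's.

```python
import tracemalloc


def for_loop(d1, d2):
    result = {}
    for k in d1:
        result[k] = d1[k]
    for k in d2:
        result[k] = d2[k]
    return result


def update_method(d1, d2):
    result = {}
    result.update(d1)
    result.update(d2)
    return result


def dict_comprehension(d1, d2):
    return {k: v for d in [d1, d2] for k, v in d.items()}


def dict_unpacking(d1, d2):
    # Same technique as dict_kwargs in the original script.
    return {**d1, **d2}


def peak_bytes(func, *args):
    """Hypothetical helper: return (result, peak bytes traced while func ran)."""
    tracemalloc.start()
    try:
        result = func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


# Assumption: 100,000 keys instead of the original 10,000,000, to keep the run short.
d1 = {i: i for i in range(100_000)}
d2 = {j: j for j in range(100_001, 200_000)}

results = {}
for func in (for_loop, update_method, dict_comprehension, dict_unpacking):
    merged, peak = peak_bytes(func, d1, d2)
    results[func.__name__] = merged
    print(f"{func.__name__:20s} peak = {peak / 1024:.1f} KiB")
```

All four functions should return the same dictionary, so any difference in the printed peaks comes from how the result is built and not from what it contains.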

ex3) Several ways to format strings

def get_fstring(x):
    return [f'Format {i}/{x}' for i in range(x)]

def get_formatted_string(x):
    return ['Format {i}/{x}'.format(i=i, x=x) for i in range(x)]

def get_percented_string(x):
    return ['Format %(i)d/%(x)d' % {'i': i, 'x': x} for i in range(x)]

import cProfile
cProfile.run('get_fstring(100000000)')
cProfile.run('get_formatted_string(100000000)')
cProfile.run('get_percented_string(100000000)')

         5 function calls in 54.034 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.003    0.003   45.480   45.480 <ipython-input-2-9508acec4fc4>:1(get_fstring)
        1   45.477   45.477   45.477   45.477 <ipython-input-2-9508acec4fc4>:2(<listcomp>)
        1    8.550    8.550   54.029   54.029 <string>:1(<module>)
        1    0.004    0.004   54.033   54.033 {built-in method builtins.exec}
        1    0.001    0.001    0.001    0.001 {method 'disable' of '_lsprof.Profiler' objects}

         100000005 function calls in 91.966 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.002    0.002   81.615   81.615 <ipython-input-2-9508acec4fc4>:4(get_formatted_string)
        1   21.801   21.801   81.612   81.612 <ipython-input-2-9508acec4fc4>:5(<listcomp>)
        1   10.342   10.342   91.956   91.956 <string>:1(<module>)
        1    0.009    0.009   91.966   91.966 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
100000000   59.811    0.000   59.811    0.000 {method 'format' of 'str' objects}

         5 function calls in 71.969 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.002    0.002   63.683   63.683 <ipython-input-2-9508acec4fc4>:7(get_percented_string)
        1   63.681   63.681   63.681   63.681 <ipython-input-2-9508acec4fc4>:8(<listcomp>)
        1    8.280    8.280   71.964   71.964 <string>:1(<module>)
        1    0.005    0.005   71.969   71.969 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

# The f-string is much faster than format.

Here too, the number of function calls differs dramatically: 100,000,005 for str.format versus 5 for the f-string.
But why? They all look like the same kind of string formatting.
- F-strings provide a way to embed expressions inside string literals, using a minimal syntax. It should be noted that an f-string is really an expression evaluated at run time, not a constant value
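- Source: https://www.python.org/dev/peps/pep-0498/
- In other words, an f-string is not a constant value but an expression evaluated at run time.
- My own reading is that str.format is an ordinary method, so it has to be looked up and called on every iteration, while the f-string is evaluated in place with no call at all. The %-formatting version also shows only 5 calls yet still takes 71.969 seconds, so call count is probably not the whole story; that version also builds a new dict on every iteration.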
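
cProfile adds its own bookkeeping to every profiled call, so the 100,000,000 calls to format are measured with some extra overhead. As a cross-check without a profiler attached, the three functions can be timed with the standard timeit module. This is a sketch under assumptions, not a rerun of the results above: N, the repeat counts, and the timings dictionary are all chosen here for illustration, and the numbers will vary by machine and Python version.

```python
import timeit


def get_fstring(x):
    return [f'Format {i}/{x}' for i in range(x)]


def get_formatted_string(x):
    return ['Format {i}/{x}'.format(i=i, x=x) for i in range(x)]


def get_percented_string(x):
    return ['Format %(i)d/%(x)d' % {'i': i, 'x': x} for i in range(x)]


# Assumption: far smaller than the original 100,000,000, to keep the run short.
N = 10_000

timings = {}
for func in (get_fstring, get_formatted_string, get_percented_string):
    # Take the best of 5 repeats, each running the function 10 times.
    timings[func.__name__] = min(
        timeit.repeat(lambda: func(N), number=10, repeat=5)
    )

for name, seconds in timings.items():
    print(f"{name:22s} {seconds:.4f} s")
```

Taking the minimum of several repeats is the usual way to reduce noise from other processes; it says nothing about memory use, only elapsed time.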

License: Attribution, NonCommercial, NoDerivatives