improvement Archives

Hacking the performance of Python Solutions with a custom-built library

Posted on July 31, 2023 by SatyakiDe in api, cloud, code, computing, cython, Data Science, numpy, objects, Pandas, Performance, Python, Technology

Today, I’m very excited to demonstrate an easy way to speed up Python code. This post will be a short & crisp walkthrough of improving the overall performance of compute-heavy scripts.

Why not view the demo before going through it?

Demo

Isn’t it exciting? Let’s understand the steps to improve your code.

Python Packages:

pip install cython

Why this way?

Cython is a Python-to-C compiler. It can significantly improve performance for specific tasks, especially those with heavy computation and loops. Also, Cython’s syntax is very similar to Python, which makes it easy to learn.

Let’s consider an example where we calculate the sum of squares for a list of numbers. The code without optimization would look like this:

perfTest_1.py (First untuned Python class.)

#########################################################
#### Written By: SATYAKI DE                          ####
#### Written On: 31-Jul-2023                         ####
#### Modified On 31-Jul-2023                         ####
####                                                 ####
#### Objective: This is the main calling             ####
#### python script that will invoke the              ####
#### first version of accute computation.            ####
####                                                 ####
#########################################################
from clsConfigClient import clsConfigClient as cf

import time
start = time.time()

n_val = cf.conf['INPUT_VAL']

def compute_sum_of_squares(n):
    return sum([i**2 for i in range(n)])

n = n_val

print(compute_sum_of_squares(n))

print(f"Test - 1: Execution time: {time.time() - start} seconds")

Here, n_val contains the value “1000000000”, i.e., one billion.
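A quick way to sanity-check the printed result is the closed form for this sum: 0^2 + 1^2 + ... + (n-1)^2 equals (n-1) * n * (2n-1) / 6. The snippet below is a minimal sketch and not part of the original scripts; both function names are hypothetical. It also uses a generator expression, which avoids building a list of n elements in memory before summing.

```python
def sum_of_squares_loop(n):
    # Generator expression: squares are produced one at a time,
    # so no intermediate list of n elements is created.
    return sum(i * i for i in range(n))


def sum_of_squares_closed_form(n):
    # Closed form for 0^2 + 1^2 + ... + (n-1)^2.
    return (n - 1) * n * (2 * n - 1) // 6


# Cross-check the two versions on small inputs where the loop is cheap.
for n in (1, 10, 1000):
    assert sum_of_squares_loop(n) == sum_of_squares_closed_form(n)
```

For a large n, the closed form returns immediately, so it can serve as a reference value when comparing the timed runs.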

Now, let’s optimize it using Cython. After installing the abovementioned package, you will have to create a .pyx file, say “compute.pyx”, with the following code:

cpdef double compute_sum_of_squares(int n):
    return sum([i**2 for i in range(n)])

Now, create a setup.py file to compile it:

###########################################################
#### Written By: SATYAKI DE                            ####
#### Written On: 31-Jul-2023                           ####
#### Modified On 31-Jul-2023                           ####
####                                                   ####
#### Objective: This is the main calling               ####
#### python script that will create the                ####
#### compiled library after executing the compute.pyx. ####
####                                                   ####
###########################################################

from setuptools import setup
from Cython.Build import cythonize

setup(
    ext_modules = cythonize("compute.pyx")
)

Compile it using the command:

python setup.py build_ext --inplace

This will look like the following –

Finally, you can import the function from the compiled extension module (built from the “.pyx” file) inside the improved code.

perfTest_2.py (Tuned Python script that calls the compiled library.)

#########################################################
#### Written By: SATYAKI DE                          ####
#### Written On: 31-Jul-2023                         ####
#### Modified On 31-Jul-2023                         ####
####                                                 ####
#### Objective: This is the main calling             ####
#### python script that will invoke the              ####
#### optimized & precompiled custom library, which   ####
#### will significantly improve the performance.     ####
####                                                 ####
#########################################################
from clsConfigClient import clsConfigClient as cf
from compute import compute_sum_of_squares

import time
start = time.time()

n_val = cf.conf['INPUT_VAL']

n = n_val

print(compute_sum_of_squares(n))

print(f"Test - 2: Execution time with Cython: {time.time() - start} seconds")

By compiling to C, Cython can speed up loops and function calls, leading to a significant speedup for CPU-bound tasks.

Please note that while Cython can dramatically improve performance, it can make the code more complex and harder to debug. Therefore, starting with regular Python and switching to Cython for the performance-critical parts of the code is recommended.
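To compare both versions on the same input in one run, a small timing harness can help. This is a minimal sketch, assuming the compiled module is named compute as above; time_call and compare are hypothetical helper names, and the harness falls back to the pure-Python version when the compiled module is not available.

```python
import time


def py_sum_of_squares(n):
    # Pure-Python reference implementation.
    return sum(i * i for i in range(n))


try:
    # Assumption: compute.pyx was built in the current directory
    # with "python setup.py build_ext --inplace".
    from compute import compute_sum_of_squares as cy_sum_of_squares
except ImportError:
    cy_sum_of_squares = None  # compiled module not available


def time_call(func, n):
    # Return (result, elapsed seconds) for a single call.
    start = time.perf_counter()
    result = func(n)
    return result, time.perf_counter() - start


def compare(n):
    # Hypothetical helper: time every available version on the same input.
    timings = {"python": time_call(py_sum_of_squares, n)}
    if cy_sum_of_squares is not None:
        timings["cython"] = time_call(cy_sum_of_squares, n)
    return timings


if __name__ == "__main__":
    for name, (result, seconds) in compare(1_000_000).items():
        print(f"{name}: result={result}, time={seconds:.4f} seconds")
```

Keep in mind that the Cython function above is declared to return a double, so for very large inputs its printed value may be a rounded floating-point number rather than an exact integer.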

So, finally, we’ve done it. I know that this post is relatively shorter than my earlier posts. But, I think, you can get a good hack to improve some of your long-running jobs by applying this trick.

I’ll bring some more exciting topics in the coming days from the Python verse. Please share & subscribe to my post & let me know your feedback.

Till then, Happy Avenging! 🙂

Note: All the data & scenarios posted here are representational data & scenarios & available over the internet & for educational purposes only. Some of the images (except my photo) we’ve used are available over the net. We don’t claim ownership of these images. There is always room for improvement & especially in the prediction quality.

Python performance improvement with 3.11 Version

Posted on October 30, 2022 by SatyakiDe in call, cloud, code, computing, Crossplatform, function, Python

Today, we’ll share another performance improvement, this time from the latest Python 3.11 version. You can consider this a significant advancement over the past versions. Last time, I posted about 3.7 in one of my earlier posts. It is worth keeping everyone updated on these performance upgrades, as Python is slowly catching up with some of the fastest programming languages.

But, before that, I want to share the latest stats of the machine where I tried these tests (As there is a change of system compared to last time).

Let us explore the base code –

##############################################
#### Written By: SATYAKI DE               ####
#### Written On: 06-May-2021              ####
#### Modified On: 30-Oct-2022             ####
####                                      ####
#### Objective: Main calling scripts for  ####
#### normal execution.                    ####
##############################################

from timeit import default_timer as timer

def vecCompute(sizeNum):
    try:
        total = 0
        for i in range(1, sizeNum):
            for j in range(1, sizeNum):
                total += i + j
        return total
    except Exception as e:
        x = str(e)
        print('Error: ', x)

        return 0


def main():

    start = timer()

    totalM = 0
    totalM = vecCompute(100000)

    print('The result is : ' + str(totalM))
    duration = timer() - start
    print('It took ' + str(duration) + ' seconds to compute')

if __name__ == '__main__':
    main()

And here is the outcome comparison between the 3.10 & 3.11 –

The above screenshot shows an improvement of 23% on average compared to the previous version.
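When comparing two interpreter versions like this, it helps to label every timing with the interpreter that produced it and to compute the percentage the same way each time. The following is a minimal sketch; the helper names and the sample timings in the last lines are hypothetical and are not the measurements from the screenshot.

```python
import platform
from timeit import default_timer as timer


def interpreter_label():
    # e.g. an implementation name followed by its version string.
    return f"{platform.python_implementation()} {platform.python_version()}"


def timed(func, *args):
    # Run func once and return (result, elapsed seconds).
    start = timer()
    result = func(*args)
    return result, timer() - start


def percent_improvement(old_seconds, new_seconds):
    # Share of the old run time that the new version saves.
    return (old_seconds - new_seconds) / old_seconds * 100.0


if __name__ == "__main__":
    result, seconds = timed(sum, range(1_000_000))
    print(f"[{interpreter_label()}] took {seconds:.4f} seconds")

    # Hypothetical timings, only to show the arithmetic.
    print(f"{percent_improvement(10.0, 7.7):.1f}% faster")
```

Running the same script under each interpreter and feeding the two durations into percent_improvement gives a figure that is directly comparable across runs.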

These performance stats are crucial. The result shows how Python is slowly emerging as the universal language for various kinds of work and is now targeting one of the vital areas, i.e., improvement of performance.

So, finally, we have done it.

I’ll bring some more exciting topics in the coming days from the Python verse.

Till then, Happy Avenging! 🙂


Another marvelous performance tuning tricks in Python

Posted on May 6, 2021 by SatyakiDe in call, cloud, code, computing, Data Science, function, Python

Hi Guys!

Today, I’ll show another way to drastically improve the performance of Python code. Last time, we took advantage of vector computing by using GPU-based computation. This time, we’ll explore PyPy, an alternative Python implementation with a just-in-time (JIT) compiler, as opposed to the standard CPython interpreter.

What is PyPy?

According to the standard description available over the net ->

PyPy is a very compliant Python interpreter that is a worthy alternative to CPython. By installing and running your application with it, you can gain noticeable speed improvements. How much of an improvement you’ll see depends on the application you’re running.

What is a JIT (Just-In-Time) compiler?

A compiled programming language is generally faster in execution, as the compiler generates machine code for the target CPU architecture & OS ahead of time. However, such programs are harder to port to another system. Example: C, C++, etc.

Interpreted languages are easy to port to a new system. However, they lack performance. Example: Perl, Matlab, etc.

Python falls between the two: the standard interpreter compiles the source to bytecode & then interprets that bytecode. Hence, it can perform better than purely interpreted languages, but not as well as compiler-driven languages.

A just-in-time compiler takes advantage of both worlds. It identifies frequently repeated code at run time & converts those chunks into machine code for optimum performance.
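To see the bytecode step for yourself, the standard dis module can list the instructions that the interpreter generates for a function. This is a small illustrative sketch; sum_below is a hypothetical example function, and the exact instruction names differ between Python versions and implementations, so no specific output is shown.

```python
import dis


def sum_below(size):
    # A small loop, similar in spirit to the scripts in this post.
    total = 0
    for i in range(1, size):
        total += i
    return total


# dis.get_instructions yields one entry per bytecode instruction.
opnames = [instr.opname for instr in dis.get_instructions(sum_below)]
print(opnames)
```

A loop like this is the kind of repeated code that a JIT compiler can detect at run time and translate into machine code.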

To prepare the environment, you need to install the following on macOS (I’m using a MacBook) –

brew install pypy3

Let’s revisit our code.

Step 1: largeCompute.py (The main script, which will take part in the performance comparison between both interpreters):

##############################################
#### Written By: SATYAKI DE               ####
#### Written On: 06-May-2021              ####
####                                      ####
#### Objective: Main calling scripts for  ####
#### normal execution.                    ####
##############################################

from timeit import default_timer as timer

def vecCompute(sizeNum):
    try:
        total = 0
        for i in range(1, sizeNum):
            for j in range(1, sizeNum):
                total += i + j

        return total
    except Exception as e:
        x = str(e)
        print('Error: ', x)

        return 0


def main():

    start = timer()

    totalM = 0
    totalM = vecCompute(100000)

    print('The result is : ' + str(totalM))

    duration = timer() - start

    print('It took ' + str(duration) + ' seconds to compute')

if __name__ == '__main__':
    main()


Key snippets from the above script –

for i in range(1, sizeNum):
    for j in range(1, sizeNum):
        total += i + j

The vecCompute function loops over every pair (i, j) with both indices running from 1 to sizeNum - 1, i.e., roughly 100000 * 100000 iterations for our input, and adds i + j to the running total on each iteration.
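Because the loop body is simple arithmetic, the expected total can be worked out by hand, which gives a cheap correctness check for any interpreter. This is a minimal sketch with hypothetical function names, assuming size_num is at least 1.

```python
def vec_compute(size_num):
    # Same nested loop as in largeCompute.py, without the timing code.
    total = 0
    for i in range(1, size_num):
        for j in range(1, size_num):
            total += i + j
    return total


def vec_compute_closed_form(size_num):
    # i and j each take the values 1..size_num-1, and every value is
    # paired with (size_num - 1) values of the other index, so:
    # total = 2 * (size_num - 1) * (1 + 2 + ... + (size_num - 1))
    #       = size_num * (size_num - 1) ** 2
    return size_num * (size_num - 1) ** 2


# Cross-check on small sizes where the nested loop is cheap.
for size in (1, 2, 10, 50):
    assert vec_compute(size) == vec_compute_closed_form(size)
```

The closed form is only a verification aid here; the nested loop remains the workload being timed.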

Let’s see how it performs.

To run the script with PyPy, you need to use the following command –

pypy largeCompute.py

Or, you have to mention the specific path as follows –

/Users/satyaki_de/Desktop/pypy3.7-v7.3.4-osx64/bin/pypy largeCompute.py

**Performance Comparison between two interpreters**

As you can see, there is a significant performance improvement, i.e., (352.079 / 14.503) = 24.276. So, for this script, PyPy is roughly 24 times faster than the standard Python interpreter. That brings it much closer to what you would expect from compiled code such as C++.
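The arithmetic behind that figure is easy to reproduce. The sketch below uses the two timings quoted above; the speedup helper is a hypothetical name introduced only for illustration.

```python
def speedup(baseline_seconds, improved_seconds):
    # How many times faster the improved run is than the baseline.
    return baseline_seconds / improved_seconds


cpython_seconds = 352.079  # standard interpreter, from the comparison above
pypy_seconds = 14.503      # PyPy, from the comparison above

ratio = speedup(cpython_seconds, pypy_seconds)
time_saved_percent = (1 - pypy_seconds / cpython_seconds) * 100

print(f"Speedup: {ratio:.3f}x")
print(f"Run time saved: {time_saved_percent:.2f}%")
```

Reporting both the ratio and the share of run time saved makes it easier to compare this result with percentage-based figures from other benchmarks.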

Where not to use?

PyPy works best with pure Python applications. It does not work as well with code that depends on C extensions, so you won’t get the same benefit there. However, I strongly believe that one day we may use this for most of our use cases.

For more information, please visit this link. So, this is another short yet effective post. 🙂

So, finally, we have done it.

I’ll bring some more exciting topics in the coming days from the Python verse.

Till then, Happy Avenging! 😀


Performance improvement of Python application programming

Posted on January 18, 2021 (updated June 2, 2021) by SatyakiDe in Azure, cloud, code, computing, Data Science, design, features, gui, integration, Keras, numpy, Pandas, Python, snippet, table, Technology, vector

Hello guys,

Today, I’ll be demonstrating a short but significant topic. It is widely reported that, on many occasions, Python is relatively slower than other popular programming languages like C++, Java, or even the latest version of PHP.

I found a relatively old post with a comparison shown between Python and the other popular languages. You can find the details at this link.

However, I haven’t verified the outcome. So, I can’t comment on the final statistics provided on that link.

My purpose is to find cases where I can apply certain tricks to improve performance drastically.

One preferable option would be the use of Cython. That involves the middle ground between C & Python & brings the best out of both worlds.

The other option would be the use of GPU for vector computations. That would drastically increase the processing power. Today, we’ll be exploring this option.

Let’s find out what we need to prepare our environment before we try this out.

Step – 1 (Installing dependent packages):

pip install pyopencl
pip install plaidml-keras

So, we will be taking advantage of the Keras package to use our GPU. And, the screen should look like this –

**Installation Process of Python-based Packages**

Once we’ve installed the packages, we’ll configure the package as shown on the next screen.

For our case, we need to install pandas, as we’ll be using numpy, which gets installed along with it as a dependency.

**Installation of supplemental packages**

Let’s explore our standard snippet to test this use case.

Case 1 (Normal computational code in Python):

##############################################
#### Written By: SATYAKI DE               ####
#### Written On: 18-Jan-2020              ####
####                                      ####
#### Objective: Main calling scripts for  ####
#### normal execution.                    ####
##############################################

import numpy as np
from timeit import default_timer as timer

def pow(a, b, c):
    for i in range(a.size):
        c[i] = a[i] ** b[i]

def main():
    vec_size = 100000000

    a = b = np.array(np.random.sample(vec_size), dtype=np.float32)
    c = np.zeros(vec_size, dtype=np.float32)

    start = timer()
    pow(a, b, c)
    duration = timer() - start

    print(duration)

if __name__ == '__main__':
    main()

Case 2 (GPU-based computational code in Python):

#################################################
#### Written By: SATYAKI DE                  ####
#### Written On: 18-Jan-2020                 ####
####                                         ####
#### Objective: Main calling scripts for     ####
#### use of GPU to speed-up the performance. ####
#################################################

import numpy as np
from timeit import default_timer as timer

# Adding GPU Instance
from os import environ
environ["KERAS_BACKEND"] = "plaidml.keras.backend"

def pow(a, b):
    return a ** b

def main():
    vec_size = 100000000

    a = b = np.array(np.random.sample(vec_size), dtype=np.float32)
    c = np.zeros(vec_size, dtype=np.float32)

    start = timer()
    c = pow(a, b)
    duration = timer() - start

    print(duration)

if __name__ == '__main__':
    main()

And, here comes the output for your comparisons –

Case 1 Vs Case 2:

As you can see, there is a significant improvement that we can achieve using this. Note that in Case 2, the element-wise power (a ** b) is evaluated as a single vectorized array operation instead of a Python-level loop. However, it has limited scope; you won’t get the benefits everywhere. Until Python itself improves on the performance side, you are better off exploring either of the two options that I’ve discussed here (I didn’t cover much of Cython here. Maybe some other day.).
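The difference between the two cases can be reproduced on a much smaller array, together with a check that both approaches produce the same values. This is a minimal sketch that uses NumPy only and makes no assumption about a GPU backend; the function names and the reduced vec_size are chosen for illustration.

```python
import numpy as np
from timeit import default_timer as timer


def pow_loop(a, b):
    # Case 1 style: one Python-level iteration per element.
    c = np.zeros(a.size, dtype=a.dtype)
    for i in range(a.size):
        c[i] = a[i] ** b[i]
    return c


def pow_vectorized(a, b):
    # Case 2 style: a single array operation handled inside NumPy.
    return a ** b


def main(vec_size=100_000):
    rng = np.random.default_rng(0)
    a = rng.random(vec_size, dtype=np.float32)
    b = rng.random(vec_size, dtype=np.float32)

    start = timer()
    c_loop = pow_loop(a, b)
    loop_seconds = timer() - start

    start = timer()
    c_vec = pow_vectorized(a, b)
    vec_seconds = timer() - start

    # Both approaches should agree up to floating-point tolerance.
    assert np.allclose(c_loop, c_vec)
    print(f"loop: {loop_seconds:.4f} s, vectorized: {vec_seconds:.4f} s")


if __name__ == "__main__":
    main()
```

Keeping vec_size small makes the loop version finish quickly, while the gap between the two timings is still clearly visible.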

To get the codebase, you can refer to the following GitHub link.

So, finally, we have done it.

I’ll bring some more exciting topics in the coming days from the Python verse.

Till then, Happy Avenging! 😀

