Tuning your model using the Python-based low-code machine-learning library PyCaret

Today, I’ll discuss another important topic before sharing the use case planned for next month, as I still need some time to finish that one. We’ll see how we can leverage the capabilities of a low-code machine-learning library named PyCaret.

But before going through the details, why don’t we watch the demo first?

Demo

Architecture:

Let us understand the flow of events –

As one can see, the initial training requests are triggered through the PyCaret-driven training process. The application then processes them & identifies the best model among the candidate combinations.

Python Packages:

The following Python packages are necessary to develop this use case –

pip install pandas
pip install pycaret

PyCaret depends on several other popular Python packages, which pip installs along with it. Make sure all of them install successfully before you run the scripts.
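Since so many packages are involved, it can help to confirm what actually got installed before running the scripts. The following is only a minimal sketch using the standard library; the helper name `installed_version` is my own and is not part of PyCaret, and scikit-learn is listed merely as one example of a PyCaret dependency.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # The two packages installed above, plus one of PyCaret's dependencies
    for name in ("pandas", "pycaret", "scikit-learn"):
        print(name, "->", installed_version(name) or "not installed")
```

If any of the entries prints "not installed", rerun the corresponding pip command before moving on.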

CODE:

clsConfigClient.py (Main configuration file)

	################################################
	#### Written By: SATYAKI DE ####
	#### Written On: 15-May-2020 ####
	#### Modified On: 31-Mar-2023 ####
	#### ####
	#### Objective: This script is a config ####
	#### file, contains all the keys for ####
	#### personal AI-driven voice assistant. ####
	#### ####
	################################################

	import os
	import platform as pl

	class clsConfigClient(object):
	    Curr_Path = os.path.dirname(os.path.realpath(__file__))

	    os_det = pl.system()
	    if os_det == "Windows":
	        sep = '\\'
	    else:
	        sep = '/'

	    conf = {
	        'APP_ID': 1,
	        'ARCH_DIR': Curr_Path + sep + 'arch' + sep,
	        'PROFILE_PATH': Curr_Path + sep + 'profile' + sep,
	        'LOG_PATH': Curr_Path + sep + 'log' + sep,
	        'DATA_PATH': Curr_Path + sep + 'data' + sep,
	        'MODEL_PATH': Curr_Path + sep + 'model' + sep,
	        'TEMP_PATH': Curr_Path + sep + 'temp' + sep,
	        'MODEL_DIR': 'model',
	        'APP_DESC_1': 'PyCaret Training!',
	        'DEBUG_IND': 'N',
	        'INIT_PATH': Curr_Path,
	        'FILE_NAME': 'Titanic.csv',
	        'MODEL_NAME': 'PyCaret-ft-personal-2023-03-31-04-29-53',
	        'TITLE': "PyCaret Training!",
	        'PATH' : Curr_Path,
	        'OUT_DIR': 'data'
	    }

I’m skipping this section as it is self-explanatory.
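One possible simplification, shown here only as a sketch: `pathlib` can build the same directory entries without checking the operating system by hand. The function name `build_paths` is hypothetical; the keys mirror the directory entries of `clsConfigClient.conf`, and the trailing separator is kept so that concatenating a file name still works the way the scripts expect.

```python
import os
from pathlib import Path


def build_paths(base_dir):
    """Build the directory entries of the config, each with a trailing separator."""
    base = Path(base_dir)
    sub_dirs = {
        'ARCH_DIR': 'arch',
        'PROFILE_PATH': 'profile',
        'LOG_PATH': 'log',
        'DATA_PATH': 'data',
        'MODEL_PATH': 'model',
        'TEMP_PATH': 'temp',
    }
    # str(Path) already uses the separator of the current OS;
    # os.sep is appended so that path + file name keeps working.
    return {key: str(base / sub) + os.sep for key, sub in sub_dirs.items()}


if __name__ == "__main__":
    paths = build_paths(Path.cwd())
    print(paths['DATA_PATH'] + 'Titanic.csv')
```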

clsTrainModel.py (This is the main class that contains the core logic of low-code machine-learning library to evaluate the best model for your solutions.)

	#####################################################
	#### Written By: SATYAKI DE ####
	#### Written On: 31-Mar-2023 ####
	#### Modified On 31-Mar-2023 ####
	#### ####
	#### Objective: This is the main class that ####
	#### contains the core logic of low-code ####
	#### machine-learning library to evaluate the ####
	#### best model for your solutions. ####
	#### ####
	#####################################################

	import clsL as cl
	from clsConfigClient import clsConfigClient as cf
	import datetime

	# Import necessary libraries
	import pandas as p
	from pycaret.classification import *

	# Disabling warnings
	def warn(*args, **kwargs):
	    pass

	import warnings
	warnings.warn = warn

	######################################
	### Get your global values ####
	######################################
	debug_ind = 'Y'

	# Initiating Logging Instances
	clog = cl.clsL()
	###############################################
	### End of Global Section ###
	###############################################


	class clsTrainModel:
	    def __init__(self):
	        self.model_path = cf.conf['MODEL_PATH']
	        self.model_name = cf.conf['MODEL_NAME']

	    def trainModel(self, FullFileName):
	        try:
	            df = p.read_csv(FullFileName)
	            row_count = int(df.shape[0])
	            print('Number of rows: ', str(row_count))

	            print(df)

	            # Initialize the setup in PyCaret
	            clf_setup = setup(
	                data=df,
	                target="Survived",
	                train_size=0.8, # 80% for training, 20% for testing
	                categorical_features=["Sex", "Embarked"],
	                ordinal_features={"Pclass": ["1", "2", "3"]},
	                ignore_features=["Name", "Ticket", "Cabin", "PassengerId"],
	                #silent=True, # Set to False for interactive setup
	            )

	            # Compare various models
	            best_model = compare_models()

	            # Create a specific model (e.g., Random Forest)
	            rf_model = create_model("rf")

	            # Hyperparameter tuning
	            tuned_rf_model = tune_model(rf_model)

	            # Evaluate model performance
	            plot_model(tuned_rf_model, plot="confusion_matrix")
	            plot_model(tuned_rf_model, plot="auc")

	            # Finalize the model (train on the complete dataset)
	            final_rf_model = finalize_model(tuned_rf_model)

	            # Make predictions on new data
	            new_data = df.drop("Survived", axis=1)
	            predictions = predict_model(final_rf_model, data=new_data)

	            # Writing into the Model
	            FullModelName = self.model_path + self.model_name

	            print('Model Output @:: ', str(FullModelName))
	            print()

	            # Save the fine-tuned model
	            save_model(final_rf_model, FullModelName)

	            return 0

	        except Exception as e:
	            x = str(e)
	            print('Error: ', x)

	            return 1

Let us understand the code in simple terms –

Import necessary libraries and load the Titanic dataset.
Initialize the PyCaret setup, specifying the target variable, train-test split, categorical and ordinal features, and features to ignore.
Compare various models to find the best-performing one.
Create a specific model (Random Forest in this case).
Perform hyper-parameter tuning on the Random Forest model.
Evaluate the model’s performance using a confusion matrix and AUC-ROC curve.
Finalize the model by training it on the complete dataset.
Make predictions on new data.
Save the trained model for future use.
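A few optional arguments can change how the compare and tune steps above behave. The sketch below assumes PyCaret's functional classification API and that `setup()` has already been called, as in `trainModel()`. The function name `compare_and_tune`, the choice of AUC as the metric, and the 25 tuning iterations are illustrative assumptions, not what the script above uses.

```python
def compare_and_tune(metric="AUC", n_iter=25):
    """Rank models by `metric`, then tune a Random Forest for the same metric.

    Assumes pycaret.classification.setup() has already been run.
    """
    # Imported here so this module can be loaded without PyCaret installed
    from pycaret.classification import (
        compare_models, create_model, tune_model, pull,
    )

    # Rank all candidate models by the chosen metric instead of Accuracy
    best_model = compare_models(sort=metric)
    leaderboard = pull()  # score grid of the last call, as a DataFrame

    # Build a Random Forest & try more hyper-parameter candidates than the default
    rf_model = create_model("rf")
    tuned_rf_model = tune_model(rf_model, optimize=metric, n_iter=n_iter)

    return best_model, tuned_rf_model, leaderboard
```

Keeping the leaderboard around makes it easier to check whether the tuned Random Forest is actually competitive with the model that `compare_models()` ranked first.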

trainPYCARETModel.py (This is the main calling Python script that will invoke the training class of the PyCaret package.)

	#####################################################
	#### Written By: SATYAKI DE ####
	#### Written On: 31-Mar-2023 ####
	#### Modified On 31-Mar-2023 ####
	#### ####
	#### Objective: This is the main calling ####
	#### python script that will invoke the ####
	#### training class of Pycaret package. ####
	#### ####
	#####################################################

	import clsL as cl
	from clsConfigClient import clsConfigClient as cf
	import datetime

	import clsTrainModel as tm

	# Disabling warnings
	def warn(*args, **kwargs):
	    pass

	import warnings
	warnings.warn = warn

	######################################
	### Get your global values ####
	######################################
	debug_ind = 'Y'

	# Initiating Logging Instances
	clog = cl.clsL()

	data_path = cf.conf['DATA_PATH']
	data_file_name = cf.conf['FILE_NAME']

	tModel = tm.clsTrainModel()

	######################################
	#### Global Flag ########
	######################################

	def main():
	    try:
	        var = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
	        print('*' * 120)
	        print('Start Time: ' + str(var))
	        print('*' * 120)

	        FullFileName = data_path + data_file_name

	        r1 = tModel.trainModel(FullFileName)

	        if r1 == 0:
	            print('Successfully Trained!')
	        else:
	            print('Failed to Train!')

	        print('*' * 120)
	        var1 = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
	        print('End Time: ' + str(var1))

	    except Exception as e:
	        x = str(e)
	        print('Error: ', x)

	if __name__ == "__main__":
	    main()

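The above code is pretty self-explanatory as well: it builds the full path of the data file, calls trainModel() & reports success or failure based on the return code.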

testPYCARETModel.py (This is the main calling Python script that will invoke the testing script for the PyCaret package.)

	#####################################################
	#### Written By: SATYAKI DE ####
	#### Written On: 31-Mar-2023 ####
	#### Modified On 31-Mar-2023 ####
	#### ####
	#### Objective: This is the main calling ####
	#### python script that will invoke the ####
	#### testing script for PyCaret package. ####
	#### ####
	#####################################################

	import clsL as cl
	from clsConfigClient import clsConfigClient as cf
	import datetime

	from pycaret.classification import load_model, predict_model

	import pandas as p

	# Disabling warnings
	def warn(*args, **kwargs):
	    pass

	import warnings
	warnings.warn = warn

	######################################
	### Get your global values ####
	######################################
	debug_ind = 'Y'

	# Initiating Logging Instances
	clog = cl.clsL()

	model_path = cf.conf['MODEL_PATH']
	model_name = cf.conf['MODEL_NAME']

	######################################
	#### Global Flag ########
	######################################

	def main():
	    try:
	        var = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
	        print('*' * 120)
	        print('Start Time: ' + str(var))
	        print('*' * 120)

	        FullFileName = model_path + model_name

	        # Load the saved model
	        loaded_model = load_model(FullFileName)

	        # Prepare new data for testing (make sure it has the same columns as the original data)
	        new_data = p.DataFrame({
	            "Pclass": [3, 1],
	            "Sex": ["male", "female"],
	            "Age": [22, 38],
	            "SibSp": [1, 1],
	            "Parch": [0, 0],
	            "Fare": [7.25, 71.2833],
	            "Embarked": ["S", "C"]
	        })

	        # Make predictions using the loaded model
	        predictions = predict_model(loaded_model, data=new_data)

	        # Display the predictions
	        print(predictions)

	        print('*' * 120)
	        var1 = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
	        print('End Time: ' + str(var1))

	    except Exception as e:
	        x = str(e)
	        print('Error: ', x)

	if __name__ == "__main__":
	    main()

In this code, the application loads the stored model & then predicts survival for two sample passengers using the tuned PyCaret model.
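The DataFrame returned by `predict_model()` contains the input columns plus extra prediction columns. As far as I know, PyCaret 3.x names them `prediction_label` and `prediction_score`, while 2.x used `Label` and `Score`; please treat those names as an assumption to verify against your installed version. The helper `find_prediction_columns` below is hypothetical and only looks at column names, so it can be tried without PyCaret.

```python
def find_prediction_columns(columns):
    """Return (label_column, score_column) found in `columns`.

    Either entry is None when none of the known names is present.
    """
    label_candidates = ("prediction_label", "Label")
    score_candidates = ("prediction_score", "Score")
    names = list(columns)

    label = next((c for c in label_candidates if c in names), None)
    score = next((c for c in score_candidates if c in names), None)
    return label, score


if __name__ == "__main__":
    # Column names as they might appear after predict_model() on the test data
    cols = ["Pclass", "Sex", "Age", "SibSp", "Parch", "Fare", "Embarked",
            "prediction_label", "prediction_score"]
    print(find_prediction_columns(cols))
```

In the test script, this could be used as `label_col, score_col = find_prediction_columns(predictions.columns)` before printing only the columns of interest.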

Conclusion:

The above code demonstrates an end-to-end binary classification pipeline using the PyCaret library for the Titanic dataset. The goal is to predict whether a passenger survived based on the available features. Here are some conclusions you can draw from the code and data:

Ease of use: The code showcases how PyCaret simplifies the machine learning process, from data preprocessing to model training, evaluation, and deployment. With just a few lines of code, you can perform tasks that would require much more effort using lower-level libraries.
Model selection: The compare_models() function provides a quick and easy way to compare various machine learning algorithms and identify the best-performing one based on the chosen evaluation metric (accuracy by default). This comparison helps you select a suitable model for the given problem.
Hyper-parameter tuning: The tune_model() function automates the process of hyper-parameter tuning to improve model performance. In the example, we tuned a Random Forest model to improve its predictive power.
Model evaluation: PyCaret provides several built-in visualization tools for assessing model performance. In the example, we used a confusion matrix and AUC-ROC curve to evaluate the performance of the tuned Random Forest model.
Model deployment: The example demonstrates how to make predictions using the trained model and save the model for future use. This step shows how PyCaret can streamline the process of deploying a machine-learning model in a production environment.
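To make the evaluation point above a little more concrete, here is a small hand-rolled sketch of what a confusion matrix and the accuracy metric actually count for a binary problem. It does not use PyCaret at all; the function names and the sample labels are made up purely for illustration.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


def accuracy(counts):
    """Share of correct predictions among all predictions."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total if total else 0.0


if __name__ == "__main__":
    # Made-up labels (1 = survived, 0 = did not survive)
    actual = [1, 0, 1, 1, 0, 0, 1, 0]
    predicted = [1, 0, 0, 1, 0, 1, 1, 0]
    counts = confusion_counts(actual, predicted)
    print(counts)            # {'tp': 3, 'fp': 1, 'tn': 3, 'fn': 1}
    print(accuracy(counts))  # 0.75
```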

It is important to note that the conclusions drawn from the code and data are specific to the Titanic dataset and the chosen features. Adjust the feature engineering, preprocessing, and model selection steps for different datasets or problems accordingly. However, the general workflow and benefits provided by PyCaret would remain the same.
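One way to keep the workflow reusable across datasets is to collect the dataset-specific arguments of `setup()` in one place. The following is just a sketch under that assumption; the helper name `make_setup_args` is hypothetical, and it only covers the `setup()` parameters already used in this post.

```python
def make_setup_args(target, categorical=(), ignore=(), train_size=0.8):
    """Collect dataset-specific arguments for PyCaret's setup() in one dict."""
    if not 0.0 < train_size < 1.0:
        raise ValueError("train_size must be between 0 and 1")
    overlap = set(categorical) & set(ignore)
    if overlap:
        raise ValueError(f"columns both categorical and ignored: {sorted(overlap)}")
    return {
        "target": target,
        "train_size": train_size,
        "categorical_features": list(categorical),
        "ignore_features": list(ignore),
    }


if __name__ == "__main__":
    # The Titanic settings from clsTrainModel.py, expressed as data
    titanic_args = make_setup_args(
        target="Survived",
        categorical=["Sex", "Embarked"],
        ignore=["Name", "Ticket", "Cabin", "PassengerId"],
    )
    print(titanic_args)
```

The resulting dictionary could then be passed as `setup(data=df, **titanic_args)`, so that switching to another dataset means changing only these arguments.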

So, finally, we’ve done it.

I know that this post is relatively longer than my earlier posts. But I think you can get all the details once you go through it.

You will find the complete codebase at the following GitHub link.

I’ll bring some more exciting topics in the coming days from the Python verse. Please share & subscribe to my post & let me know your feedback.

Till then, Happy Avenging! 🙂

Note: All the data & scenarios posted here are representational, available over the internet & meant for educational purposes only. Some of the images (except my photo) we’ve used are available over the net. We don’t claim ownership of these images. There is always room for improvement, especially in the prediction quality.

Published by SatyakiDe