Keras Archives

AGENTIC AI IN THE ENTERPRISE: STRATEGY, ARCHITECTURE, AND IMPLEMENTATION – PART 3

Posted on July 31, 2025 (updated August 31, 2025) by SatyakiDe in agents, ai, anthropic, api, audio, Azure, bharatgpt, BOT, call, circuitbreaker, clob, cloud, Computer-Vision, computing, CPU, Crossplatform, Data Science, deepseek, design, exposure, Fabric, faiss, features, function, gpt3, GPU, grok, gui, Haystack, HuggingFace, ibm, IoT, json, Keras, LangChain, Langflow, Linear-Regression, Listagg, llm, Logistic-regression, loop, machine-learning, mcpprotocol, Microsoft, mobile, Model, mulesoft, natural-language, neural prophet, ngrok, objects, Open-CV, openai, oracle-cloud, Performance, pl sql, Polars, prophet-api, React, Real-time, sarvam, Silicon, StabilityAI, StableDefussion, Technology, Tensorflow, Torch, video, voice, watson

This is a continuation of my previous post, which can be found here.

Let us recap the key takeaways from our previous post –

Enterprise AI, utilizing the Model Context Protocol (MCP), leverages an open standard that enables AI systems to securely and consistently access enterprise data and tools. MCP replaces brittle “N×M” integrations between models and systems with a standardized client–server pattern: an MCP host (e.g., IDE or chatbot) runs an MCP client that communicates with lightweight MCP servers, which wrap external systems via JSON-RPC. Servers expose three assets—Resources (data), Tools (actions), and Prompts (templates)—behind permissions, access control, and auditability. This design enables real-time context, reduces hallucinations, supports model- and cloud-agnostic interoperability, and accelerates “build once, integrate everywhere” deployment. A typical flow (e.g., retrieving a customer’s latest order) encompasses intent parsing, authorized tool invocation, query translation/execution, and the return of a normalized JSON result to the model for natural-language delivery. Performance introduces modest overhead (RPC hops, JSON (de)serialization, network transit) and scale considerations (request volume, significant results, context-window pressure). Mitigations include in-memory/semantic caching, optimized SQL with indexing, pagination, and filtering, connection pooling, and horizontal scaling with load balancing. In practice, small latency costs are often outweighed by the benefits of higher accuracy, stronger governance, and a decoupled, scalable architecture.
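To make the flow above concrete, here is a minimal sketch of the JSON-RPC 2.0 messages behind a single tool call. The `tools/call` method name comes from the MCP specification; the tool name `get_latest_order`, its arguments & the sample response are hypothetical & only illustrate the shape of the exchange.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids should be unique within a session


def build_tool_call(tool_name, arguments):
    """Build a JSON-RPC 2.0 request asking an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def extract_text(response):
    """Return the text parts of a tool result, or raise on a JSON-RPC error."""
    if "error" in response:
        raise RuntimeError(response["error"]["message"])
    return [p["text"] for p in response["result"]["content"] if p["type"] == "text"]


# Hypothetical tool & arguments, for illustration only
request = build_tool_call("get_latest_order", {"customer_id": "C-1001"})
wire_message = json.dumps(request)  # what would travel to the MCP server

# A made-up response in the shape a server would send back
sample_response = {
    "jsonrpc": "2.0",
    "id": request["id"],
    "result": {
        "content": [
            {"type": "text", "text": '{"order_id": "O-42", "status": "shipped"}'}
        ]
    },
}

order = json.loads(extract_text(sample_response)[0])
print(order["status"])
```

The model never sees the database or the SQL. It only receives the normalized JSON result, which it then turns into a natural-language answer.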

How does MCP compare with other AI integration approaches?

Compared to other approaches, the Model Context Protocol (MCP) offers a uniquely standardized and secure framework for AI-tool integration, shifting from brittle, custom-coded connections to a universal plug-and-play model. It is not a replacement for underlying systems, such as APIs or databases, but instead acts as an intelligent, secure abstraction layer designed explicitly for AI agents.

MCP vs. Custom API integrations:

Custom API integration was the traditional method for connecting AI applications to external systems before standards like MCP emerged.

Custom API integrations (traditional): Each AI application requires a custom-built connector for every external system it needs to access, leading to an N x M integration problem (the number of connectors grows as the product of the number of models and the number of systems). This approach is resource-intensive, challenging to maintain, and prone to breaking when underlying APIs change.
MCP: The standardized protocol turns the N x M problem into roughly N + M by creating a universal interface. Tool creators build a single MCP server for their system, and any MCP-compatible AI agent can access it without custom connector code. This process decouples the AI model from the underlying implementation details, drastically reducing integration and maintenance costs.
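A quick back-of-the-envelope sketch of the difference, assuming one connector per model-system pair in the custom approach & one MCP client per model plus one MCP server per system in the MCP approach. The function names & the sample counts are made up for illustration.

```python
def custom_connectors(models, systems):
    """Custom approach: every model needs its own connector to every system."""
    return models * systems


def mcp_components(models, systems):
    """MCP approach: one client per model plus one server per system."""
    return models + systems


# Illustrative sizes only
for n_models, n_systems in [(3, 5), (10, 20), (20, 50)]:
    print(
        f"{n_models} models x {n_systems} systems -> "
        f"custom: {custom_connectors(n_models, n_systems)}, "
        f"MCP: {mcp_components(n_models, n_systems)}"
    )
```

With 10 models & 20 systems, that is 200 bespoke connectors against 30 standard components.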

For more detailed information, please refer to the following link.

MCP vs. Retrieval-Augmented Generation (RAG):

RAG is a technique that retrieves static documents to augment an LLM’s knowledge, while MCP focuses on live interactions. They are complementary, not competing.

RAG:
- Focus: Retrieving and summarizing static, unstructured data, such as documents, manuals, or knowledge bases.
- Best for: Providing background knowledge and general information, as in a policy lookup tool or customer service bot.
- Data type: Unstructured, static knowledge.
MCP:
- Focus: Accessing and acting on real-time, structured, and dynamic data from databases, APIs, and business systems.
- Best for: Agentic use cases involving real-world actions, like pulling live sales reports from a CRM or creating a ticket in a project management tool.
- Data type: Structured, real-time, and dynamic data.

MCP vs. LLM plugins and extensions:

Before MCP, platforms like OpenAI offered proprietary plugin systems to extend LLM capabilities.

LLM plugins:
- Proprietary: Tied to a specific AI vendor (e.g., OpenAI).
- Limited: Rely on the vendor’s API function-calling mechanism, which focuses on call formatting but not standardized execution.
- Centralized: Managed by the AI vendor, creating a risk of vendor lock-in.
MCP:
- Open standard: Based on a public, interoperable protocol (JSON-RPC 2.0), making it model-agnostic and usable across different platforms.
- Infrastructure layer: Provides a standardized infrastructure for agents to discover and use any compliant tool, regardless of the underlying LLM.
- Decentralized: Promotes a flexible ecosystem and reduces the risk of vendor lock-in.
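The discovery point is easier to see in code. Below is a minimal, in-memory sketch of how an MCP-style server might answer `tools/list` & `tools/call` requests over JSON-RPC 2.0. The two tools are hypothetical, & a real MCP server also handles the initialization handshake & publishes an input schema per tool, both of which are left out here.

```python
# Registry of tools: name -> (description, callable). All names are hypothetical.
TOOLS = {
    "create_ticket": (
        "Create a ticket in the project tracker",
        lambda args: f"Ticket created: {args['title']}",
    ),
    "sales_report": (
        "Return the live sales total for a region",
        lambda args: f"Sales for {args['region']}: 1250",
    ),
}


def error(req_id, code, message):
    """Build a JSON-RPC 2.0 error response."""
    return {"jsonrpc": "2.0", "id": req_id, "error": {"code": code, "message": message}}


def handle(request):
    """Dispatch one JSON-RPC 2.0 request the way a tiny MCP-style server might."""
    req_id, method = request.get("id"), request.get("method")

    if method == "tools/list":
        # Discovery: any compliant agent can ask what the server offers
        tools = [{"name": n, "description": d} for n, (d, _) in TOOLS.items()]
        result = {"tools": tools}
    elif method == "tools/call":
        name = request["params"]["name"]
        if name not in TOOLS:
            return error(req_id, -32602, f"Unknown tool: {name}")
        text = TOOLS[name][1](request["params"].get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}]}
    else:
        return error(req_id, -32601, "Method not found")

    return {"jsonrpc": "2.0", "id": req_id, "result": result}


listing = handle({"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
call = handle({
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {"name": "create_ticket", "arguments": {"title": "Fix login"}},
})
print([t["name"] for t in listing["result"]["tools"]])
print(call["result"]["content"][0]["text"])
```

Because the agent learns the tool names at run time, the same agent code works against any server that follows the protocol, whichever LLM sits behind it.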

How has enterprise AI with MCP opened up specific architecture patterns for Azure, AWS & GCP?

Microsoft Azure:

The “agent factory” pattern: Azure focuses on providing managed services for building and orchestrating AI agents, tightly integrated with its enterprise security and governance features. The MCP architecture is a core component of the Azure AI Foundry, serving as a secure, managed “agent factory.”

Azure architecture pattern with MCP:

AI orchestration layer: The Azure AI Agent Service, within Azure AI Foundry, acts as the central host and orchestrator. It provides the control plane for creating, deploying, and managing multiple specialized agents, and it natively supports the MCP standard.
AI model layer: Agents in the Foundry can be powered by various models, including those from Azure OpenAI Service, commercial models from partners, or open-source models.
MCP server and tool layer: MCP servers are deployed using serverless functions, such as Azure Functions or Azure Logic Apps, to wrap existing enterprise systems. These servers expose tools for interacting with enterprise data sources like SharePoint, Azure AI Search, and Azure Blob Storage.
Data and security layer: Data is secured using Microsoft Entra ID (formerly Azure AD) for authentication and access control, with robust security policies enforced via Azure API Management. Access to data sources, such as databases and storage, is managed securely through private networks and Managed Identity.

Amazon Web Services (AWS):

The “composable serverless agent” pattern: AWS emphasizes a modular, composable, and serverless approach, leveraging its extensive portfolio of services to build sophisticated, flexible, and scalable AI solutions. The MCP architecture here aligns with the principle of creating lightweight, event-driven services that AI agents can orchestrate.

AWS architecture pattern with MCP:

AI orchestration layer: Amazon Bedrock Agents or custom agent frameworks deployed via AWS Fargate or Lambda act as the MCP hosts. Bedrock Agents provide built-in orchestration, while custom agents offer greater flexibility and customization options.
AI model layer: The models are sourced from Amazon Bedrock, which provides a wide selection of foundation models.
MCP server and tool layer: MCP servers are deployed as serverless AWS Lambda functions. AWS offers pre-built MCP servers for many of its services, including the AWS Serverless MCP Server for managing serverless applications and the AWS Lambda Tool MCP Server for invoking existing Lambda functions as tools.
Data and security layer: Access is tightly controlled using AWS Identity and Access Management (IAM) roles and policies, with fine-grained permissions for each MCP server. Private data sources like databases (Amazon DynamoDB) and storage (Amazon S3) are accessed securely within a Virtual Private Cloud (VPC).

Google Cloud Platform (GCP):

The “unified workbench” pattern: GCP focuses on providing a unified, open, and data-centric platform for AI development. The MCP architecture on GCP integrates natively with the Vertex AI platform, treating MCP servers as first-class tools that can be dynamically discovered and used within a single workbench.

GCP architecture pattern with MCP:

AI orchestration layer: The Vertex AI Agent Builder serves as the central environment for building and managing conversational AI and other agents. It orchestrates workflows and manages tool invocation for agents.
AI model layer: Agents use foundation models available through the Vertex AI Model Garden or the Gemini API.
MCP server and tool layer: MCP servers are deployed as containerized microservices on Cloud Run or managed by services like App Engine. These servers contain tools that interact with GCP services, such as BigQuery, Cloud Storage, and Cloud SQL. GCP offers pre-built MCP server implementations, such as the GCP MCP Toolbox, for integration with its databases.
Data and security layer: Vertex AI Vector Search and other data sources are encapsulated within the MCP server tools to provide contextual information. Access to these services is managed by Identity and Access Management (IAM) and secured through virtual private clouds. The MCP server can leverage Vertex AI Context Caching for improved performance.
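For a side-by-side view, the three patterns can be captured in a small lookup structure. This is only a sketch that restates the services named above; the dictionary keys & the `compare()` helper are my own naming, not part of any cloud SDK.

```python
# Layer -> services per cloud, restating the three patterns described above
PATTERNS = {
    "Azure": {
        "pattern": "agent factory",
        "orchestration": ["Azure AI Agent Service (Azure AI Foundry)"],
        "models": ["Azure OpenAI Service", "partner models", "open-source models"],
        "mcp_servers": ["Azure Functions", "Azure Logic Apps"],
        "security": ["Microsoft Entra ID", "Azure API Management", "Managed Identity"],
    },
    "AWS": {
        "pattern": "composable serverless agent",
        "orchestration": ["Amazon Bedrock Agents", "custom agents on Fargate or Lambda"],
        "models": ["Amazon Bedrock"],
        "mcp_servers": ["AWS Lambda"],
        "security": ["IAM", "VPC"],
    },
    "GCP": {
        "pattern": "unified workbench",
        "orchestration": ["Vertex AI Agent Builder"],
        "models": ["Vertex AI Model Garden", "Gemini API"],
        "mcp_servers": ["Cloud Run", "App Engine"],
        "security": ["IAM", "VPC"],
    },
}


def compare(layer):
    """Return {cloud: services} for one layer across all three clouds."""
    return {cloud: spec[layer] for cloud, spec in PATTERNS.items()}


for cloud, services in compare("mcp_servers").items():
    print(f"{cloud}: {', '.join(services)}")
```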

Note that each pattern above refers only to the native technology of its respective cloud. In practice, a better-suited technology can often be used in place of a tool mentioned here. This is a concept-level comparison rather than a survey of industry implementation approaches.

We’ll conclude this post here & continue with a deeper dive in the next post.

Till then, Happy Avenging! 🙂

Note: All the data & scenarios posted here are representational data & scenarios & available over the internet & for educational purposes only. There is always room for improvement in this kind of model & the solution associated with it. I’ve shown the basic ways to achieve the same for educational purposes only.

RAG implementation of LLMs by using Python, Haystack & React (Part – 2)

Posted on September 29, 2023 by SatyakiDe in api, Azure, cloud, code, computing, Data Science, design, faiss, gpt3, Haystack, json, Keras, machine-learning, natural-language, numpy, objects, openai, Pandas, Performance, Python, Technology

Today, we’ll share the second installment of the RAG implementation. If you are new here, please visit the previous post for full context.

In this post, we’ll be discussing the Haystack framework more. Again, before discussing the main context, I want to present the demo here.

Demo

FLOW OF EVENTS:

Let us look at the flow diagram, which captures the sequence of events in the process & shows where we’ll focus our attention today.

As you can see, today we’ll discuss the red dotted line, which covers contextualizing the source data into the Vector DB.

Let us understand the flow of events here –

The main Python application will consume the nested JSON by invoking the museum API in multiple threads.
The application will clean the nested data & extract the relevant attributes after flattening the JSON.
It will create the unstructured text-based context, which is later fed to the Vector DB framework.
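The flatten-then-build-text steps above can be sketched as follows. This is a simplified, pure-Python illustration, not the code from the repository; the `flatten()` & `to_context()` helpers & the sample record are hypothetical.

```python
def flatten(obj, parent_key="", sep="_"):
    """Recursively flatten nested dicts & lists into a single-level dict."""
    items = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
            items.update(flatten(value, new_key, sep))
    elif isinstance(obj, list):
        for idx, value in enumerate(obj):
            new_key = f"{parent_key}{sep}{idx}" if parent_key else str(idx)
            items.update(flatten(value, new_key, sep))
    else:
        items[parent_key] = obj  # leaf value
    return items


def to_context(flat):
    """Turn a flat record into a plain-text context, skipping empty values."""
    return ". ".join(f"{k}: {v}" for k, v in flat.items() if v not in (None, ""))


# Made-up record, loosely shaped like a nested museum object
record = {
    "objectID": 1,
    "title": "Vase",
    "constituents": [{"name": "Unknown", "role": "Artist"}],
    "measurements": {"height": 10},
    "culture": "",
}

flat = flatten(record)
context = to_context(flat)
print(context)
```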

IMPORTANT PACKAGES:

pip install farm-haystack==1.19.0
pip install Flask==2.2.5
pip install Flask-Cors==4.0.0
pip install Flask-JWT-Extended==4.5.2
pip install Flask-Session==0.5.0
pip install openai==0.27.8
pip install pandas==2.0.3
pip install tensorflow==2.11.1

We’re using the Metropolitan Museum API to feed the data to our Vector DB. For more information, please visit the following link. The API is free to use & we’re using it here for educational scenarios.

CODE:

We’ll discuss the tokenization part highlighted in a red dotted line from the above picture.

Python:

We’ll discuss the scripts in the diagram as part of the flow mentioned above.

clsExtractJSON.py (This is the main class that will extract the content from the museum API using parallel calls.)

def genData(self):
    try:
        base_url = self.base_url
        header_token = self.header_token
        basePath = self.basePath
        outputPath = self.outputPath
        mergedFile = self.mergedFile
        subdir = self.subdir
        Ind = self.Ind
        var_1 = datetime.now().strftime("%H.%M.%S")


        devVal = list()
        objVal = list()

        # Main Details
        headers = {'Cookie':header_token}
        payload={}

        url = base_url + '/departments'

        date_ranges = self.generateFirstDayOfLastTenYears()

        # Getting all the departments
        try:
            print('Department URL:')
            print(str(url))

            response = requests.request("GET", url, headers=headers, data=payload)
            parsed_data = json.loads(response.text)

            print('Department JSON:')
            print(str(parsed_data))

            # Extract the "departmentId" values into a Python list
            devVal = [dept_det['departmentId'] for dept_det in parsed_data['departments'] if 'departmentId' in dept_det]

        except Exception as e:
            x = str(e)
            print('Error: ', x)
            devVal = list()

        # List to hold thread objects
        threads = []

        # Calling the Data using threads
        for dep in devVal:
            t = threading.Thread(target=self.getDataThread, args=(dep, base_url, headers, payload, date_ranges, objVal, subdir, Ind,))
            threads.append(t)
            t.start()

        # Wait for all threads to complete
        for t in threads:
            t.join()

        res = self.mergeCsvFilesInDirectory(basePath, outputPath, mergedFile)

        if res == 0:
            print('Successful!')
        else:
            print('Failure!')

        return 0

    except Exception as e:
        x = str(e)
        print('Error: ', x)

        return 1

The above code translates into the following steps –

The above method first calls the generateFirstDayOfLastTenYears() method to build the list of dates & then fetches all the unique departments by calling the departments API.
Then, it starts one getDataThread() thread per department so that the relevant APIs are called in parallel, which reduces the overall wait time & creates individual smaller files.
Finally, the application will invoke the mergeCsvFilesInDirectory() method to merge all the chunk files into one consolidated historical file.
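The post starts one raw thread per department. An alternative worth knowing is a bounded thread pool from the standard library, which caps the number of simultaneous API calls & surfaces worker exceptions. The sketch below swaps in that technique; `fetch_department()` is a stand-in that returns fake IDs instead of calling the museum API.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_department(dept_id):
    """Stand-in for getDataThread(); a real version would call the museum API."""
    return dept_id, [dept_id * 100 + i for i in range(3)]  # fake object IDs


def fetch_all(dept_ids, max_workers=4):
    """Fetch every department in parallel with a bounded pool of threads."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(fetch_department, d) for d in dept_ids]
        for fut in as_completed(futures):
            dept_id, object_ids = fut.result()  # re-raises any worker exception
            results[dept_id] = object_ids
    return results


data = fetch_all([1, 3, 4])
print(data)
```

Each worker returns its own result instead of appending to a shared list, so no locking is needed.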

def generateFirstDayOfLastTenYears(self):
    yearRange = self.yearRange
    date_format = "%Y-%m-%d"
    current_year = datetime.now().year

    date_ranges = []
    for year in range(current_year - yearRange, current_year + 1):
        first_day_of_year_full = datetime(year, 1, 1)
        first_day_of_year = first_day_of_year_full.strftime(date_format)
        date_ranges.append(first_day_of_year)

    return date_ranges

The above method generates the first day of each year, starting yearRange years back & ending with the current year. With a yearRange of 10, that gives eleven dates.
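Here is a standalone version of the same logic that can be run outside the class. The function name `first_days()` & the optional `current_year` parameter are my additions, so that the output is reproducible.

```python
from datetime import datetime


def first_days(year_range, current_year=None):
    """Return 'YYYY-01-01' for each year from (current - year_range) to current."""
    current_year = current_year or datetime.now().year
    return [
        datetime(year, 1, 1).strftime("%Y-%m-%d")
        for year in range(current_year - year_range, current_year + 1)
    ]


dates = first_days(10, current_year=2023)
print(len(dates), dates[0], dates[-1])
```

Note that the range is inclusive at both ends, so the list holds year_range + 1 entries.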

def getDataThread(self, dep, base_url, headers, payload, date_ranges, objVal, subdir, Ind):
    try:
        cnt = 0
        cnt_x = 1
        var_1 = datetime.now().strftime("%H.%M.%S")

        for x_start_date in date_ranges:
            try:
                urlM = base_url + '/objects?metadataDate=' + str(x_start_date) + '&departmentIds=' + str(dep)

                print('Nested URL:')
                print(str(urlM))

                response_obj = requests.request("GET", urlM, headers=headers, data=payload)
                objectDets = json.loads(response_obj.text)

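                # Keep only the IDs returned by this call. Iterating over the shared
                # objVal list would re-fetch IDs collected for earlier dates & by other threads.
                # 'objectIDs' may be missing or null, hence the fallback to an empty list.
                currIds = objectDets.get('objectIDs') or []
                objVal.extend(currIds)

                for objId in currIds: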
                    urlS = base_url + '/objects/' + str(objId)

                    print('Final URL:')
                    print(str(urlS))

                    response_det = requests.request("GET", urlS, headers=headers, data=payload)
                    objDetJSON = response_det.text

                    retDB = self.createData(objDetJSON)
                    retDB['departmentId'] = str(dep)

                    if cnt == 0:
                        df_M = retDB
                    else:
                        d_frames = [df_M, retDB]
                        df_M = pd.concat(d_frames)

                    if cnt == 1000:
                        cnt = 0
                        clog.logr('df_M_' + var_1 + '_' + str(cnt_x) + '_' + str(dep) +'.csv', Ind, df_M, subdir)
                        cnt_x += 1
                        df_M = pd.DataFrame()

                    cnt += 1

            except Exception as e:
                x = str(e)
                print('Error X:', x)
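        # Write out any rows left over from the last partial batch
        if cnt > 0 and not df_M.empty:
            clog.logr('df_M_' + var_1 + '_' + str(cnt_x) + '_' + str(dep) +'.csv', Ind, df_M, subdir)

        return 0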

    except Exception as e:
        x = str(e)
        print('Error: ', x)

        return 1

The above method calls the objects API for each date & department, fetches the details of every returned object ID & writes the results out in chunk files of 1,000 records.
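The chunking idea can be isolated into a small generator. This is a minimal sketch with a hypothetical helper, `batches()`, which is not part of the original class; it also yields the final partial batch so that no rows are lost.

```python
def batches(rows, size):
    """Yield lists of at most `size` rows, including the final partial one."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # leftover rows that did not fill a full batch
        yield batch


chunks = list(batches(range(2500), 1000))
print([len(c) for c in chunks])
```

Each yielded batch would map to one chunk file on disk.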

def mergeCsvFilesInDirectory(self, directory_path, output_path, output_file):
    try:
        csv_files = [file for file in os.listdir(directory_path) if file.endswith('.csv')]
        data_frames = []

        for file in csv_files:
            encodings_to_try = ['utf-8', 'utf-8-sig', 'latin-1', 'cp1252']
            for encoding in encodings_to_try:
                try:
                    FullFileName = os.path.join(directory_path, file)
                    print('File Name: ', FullFileName)
                    df = pd.read_csv(FullFileName, encoding=encoding)
                    data_frames.append(df)
                    break  # Stop trying other encodings if the reading is successful
                except UnicodeDecodeError:
                    continue

        if not data_frames:
            raise Exception("Unable to read CSV files. Check encoding or file format.")

        merged_df = pd.concat(data_frames, ignore_index=True)

        merged_full_name = os.path.join(output_path, output_file)
        merged_df.to_csv(merged_full_name, index=False)

        for file in csv_files:
            os.remove(os.path.join(directory_path, file))

        return 0

    except Exception as e:
        x = str(e)
        print('Error: ', x)
        return 1

The above method will merge all the small files into a single consolidated historical file that spans over ten years (the first day of each year, to be precise) & then removes the chunk files.

For the complete code, please visit the GitHub.

1_ReadMuseumJSON.py (This is the main script that invokes the class, which extracts the content from the museum API using parallel calls.)

#########################################################
#### Written By: SATYAKI DE                          ####
#### Written On: 27-Jun-2023                         ####
#### Modified On 28-Jun-2023                         ####
####                                                 ####
#### Objective: This is the main calling             ####
#### python script that will invoke the              ####
#### shortcut application created inside MAC         ####
#### environment including MacBook, IPad or IPhone.  ####
####                                                 ####
#########################################################
import datetime
from clsConfigClient import clsConfigClient as cf

import clsExtractJSON as cej

########################################################
################    Global Area   ######################
########################################################

cJSON = cej.clsExtractJSON()

basePath = cf.conf['DATA_PATH']
outputPath = cf.conf['OUTPUT_PATH']
mergedFile = cf.conf['MERGED_FILE']

########################################################
################  End Of Global Area   #################
########################################################

# Disabling Warning
def warn(*args, **kwargs):
    pass

import warnings
warnings.warn = warn

def main():
    try:
        var = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
        print('*'*120)
        print('Start Time: ' + str(var))
        print('*'*120)

        r1 = cJSON.genData()

        if r1 == 0:
            print()
            print('Successfully Scraped!')
        else:
            print()
            print('Failed to Scrape!')

        print('*'*120)
        var1 = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
        print('End Time: ' + str(var1))

    except Exception as e:
        x = str(e)
        print('Error: ', x)

if __name__ == '__main__':
    main()

The above script instantiates the clsExtractJSON class & then calls its genData() method.

clsCreateList.py (This is the main class that will extract the relevant attributes from the historical files & then create the right input text for the documents that are contextualized into the Vector DB framework.)

def createRec(self):
    try:
        basePath = self.basePath
        fileName = self.fileName
        Ind = self.Ind
        subdir = self.subdir
        base_url = self.base_url
        outputPath = self.outputPath
        mergedFile = self.mergedFile
        cleanedFile = self.cleanedFile

        FullFileName = outputPath + mergedFile

        df = pd.read_csv(FullFileName)
        df2 = df[listCol]
        dfFin = df2.drop_duplicates().reset_index(drop=True)

        dfFin['artist_URL'] = dfFin['artistWikidata_URL'].combine_first(dfFin['artistULAN_URL'])
        dfFin['object_URL'] = dfFin['objectURL'].combine_first(dfFin['objectWikidata_URL'])
        dfFin['Wiki_URL'] = dfFin['Wikidata_URL'].combine_first(dfFin['AAT_URL']).combine_first(dfFin['URL']).combine_first(dfFin['object_URL'])

        # Dropping the source URL columns that were merged above
        dfFin.drop(columns=['artistWikidata_URL', 'artistULAN_URL', 'objectURL', 'objectWikidata_URL',
                            'AAT_URL', 'Wikidata_URL', 'URL'], inplace=True)

        # Save the filtered DataFrame to a new CSV file
        #clog.logr(cleanedFile, Ind, dfFin, subdir)
        res = self.addHash(dfFin)

        if res == 0:
            print('Added Hash!')
        else:
            print('Failed to add hash!')

        # Generate the text for each row in the dataframe
        for _, row in dfFin.iterrows():
            x = self.genPrompt(row)
            self.addDocument(x, cleanedFile)

        return documents

    except Exception as e:
        x = str(e)
        print('Record Error: ', x)

        return documents

The above code will read the data from the consolidated historical file created in the earlier steps, clean it by removing all the duplicate records (if any) & finally create three consolidated URL columns: artist, object & wiki.

Also, this application will replace each hyperlink with a specific hash value, & it is the hash that feeds into the vector DB, since the vector DB does not work well with raw URLs. Hence, we will store the URLs in a separate file along with the associated hash value & later fetch them through a lookup on the OpenAI response.

Then, this application will generate prompts dynamically & finally create the documents for the later steps of vector DB consumption by invoking the addDocument() method.
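The addHash() method itself is not shown in this post, so the following is only a sketch of the URL-to-hash idea under my own assumptions: a truncated SHA-256 digest as the token & an in-memory dictionary as the lookup. The helper names & sample URLs are hypothetical; the column names are the ones created in createRec().

```python
import hashlib


def url_to_hash(url, length=12):
    """Return a short, deterministic token for a URL (SHA-256, truncated)."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()[:length]


def replace_urls(record, url_fields, lookup):
    """Swap URL values for hash tokens & remember the mapping in `lookup`."""
    cleaned = dict(record)  # leave the caller's record untouched
    for field in url_fields:
        url = cleaned.get(field)
        if url:
            token = url_to_hash(url)
            lookup[token] = url
            cleaned[field] = token
    return cleaned


# Made-up record for illustration
record = {
    "title": "Vase",
    "artist_URL": "https://example.com/artist/1",
    "object_URL": "https://example.com/object/42",
    "Wiki_URL": None,
}

lookup = {}
cleaned = replace_urls(record, ["artist_URL", "object_URL", "Wiki_URL"], lookup)

# Later: map a token found in the model's answer back to the real URL
print(lookup[cleaned["object_URL"]])
```

Because the hash is deterministic, the same URL always maps to the same token, which keeps the lookup file stable across runs.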

For more details, please visit the GitHub link.

1_1_testCreateRec.py (This is the main script that will call the above class.)

#########################################################
#### Written By: SATYAKI DE                          ####
#### Written On: 27-Jun-2023                         ####
#### Modified On 28-Jun-2023                         ####
####                                                 ####
#### Objective: This is the main calling             ####
#### python script that will invoke the              ####
#### shortcut application created inside MAC         ####
#### environment including MacBook, IPad or IPhone.  ####
####                                                 ####
#########################################################

from clsConfigClient import clsConfigClient as cf
import clsL as log
import clsCreateList as ccl

from datetime import datetime, timedelta

# Disabling Warning
def warn(*args, **kwargs):
    pass

import warnings
warnings.warn = warn

###############################################
###           Global Section                ###
###############################################

#Initiating Logging Instances
clog = log.clsL()
cl = ccl.clsCreateList()

var = datetime.now().strftime(".%H.%M.%S")

documents = []

###############################################
###    End of Global Section                ###
###############################################
def main():
    try:
        var = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
        print('*'*120)
        print('Start Time: ' + str(var))
        print('*'*120)

        print('*'*240)
        print('Creating Index store:: ')
        print('*'*240)

        documents = cl.createRec()

        print('Inserted Sample Records: ')
        print(str(documents))
        print('\n')

        r1 = len(documents)

        if r1 > 0:
            print()
            print('Successfully Indexed sample records!')
        else:
            print()
            print('Failed to index sample records!')

        print('*'*120)
        var1 = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
        print('End Time: ' + str(var1))

    except Exception as e:
        x = str(e)
        print('Error: ', x)

if __name__ == '__main__':
    main()

The above script instantiates the main class & invokes its createRec() method to create the documents that feed the vector DB.

The above test script is used to test the clsCreateList class. However, the class will also be used inside another class.
– Satyaki

clsFeedVectorDB.py (This is the main class that will feed the documents into the vector DB.)

#########################################################
#### Written By: SATYAKI DE                          ####
#### Written On: 27-Jun-2023                         ####
#### Modified On 28-Sep-2023                         ####
####                                                 ####
#### Objective: This is the main calling             ####
#### python script that will invoke the              ####
#### haystack framework to contextualize the docs    ####
#### inside the vector DB.                           ####
####                                                 ####
#########################################################

from haystack.document_stores.faiss import FAISSDocumentStore
from haystack.nodes import DensePassageRetriever
import openai
import pandas as pd
import os
import clsCreateList as ccl

from clsConfigClient import clsConfigClient as cf
import clsL as log

from datetime import datetime, timedelta

# Disabling Warning
def warn(*args, **kwargs):
    pass

import warnings
warnings.warn = warn

###############################################
###           Global Section                ###
###############################################

Ind = cf.conf['DEBUG_IND']
openAIKey = cf.conf['OPEN_AI_KEY']

os.environ["TOKENIZERS_PARALLELISM"] = "false"

#Initiating Logging Instances
clog = log.clsL()
cl = ccl.clsCreateList()

var = datetime.now().strftime(".%H.%M.%S")

# Encode your data to create embeddings
documents = []

var_1 = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
print('*'*120)
print('Start Time: ' + str(var_1))
print('*'*120)

print('*'*240)
print('Creating Index store:: ')
print('*'*240)

documents = cl.createRec()

print('Inserted Sample Records: ')
print(documents[:5])
print('\n')
print('Type:')
print(type(documents))

r1 = len(documents)

if r1 > 0:
    print()
    print('Successfully Indexed records!')
else:
    print()
    print('Failed to index records!')

print('*'*120)
var_2 = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
print('End Time: ' + str(var_2))

# Passing OpenAI API Key
openai.api_key = openAIKey

###############################################
###    End of Global Section                ###
###############################################

class clsFeedVectorDB:
    def __init__(self):
        self.basePath = cf.conf['DATA_PATH']
        self.modelFileName = cf.conf['CACHE_FILE']
        self.vectorDBPath = cf.conf['VECTORDB_PATH']
        self.vectorDBFileName = cf.conf['VECTORDB_FILE_NM']
        self.queryModel = cf.conf['QUERY_MODEL']
        self.passageModel = cf.conf['PASSAGE_MODEL']

    def retrieveDocuments(self, question, retriever, top_k=3):
        return retriever.retrieve(question, top_k=top_k)

    def generateAnswerWithGPT3(self, retrievedDocs, question):
        documents_text = " ".join([doc.content for doc in retrievedDocs])
        prompt = f"Given the following documents: {documents_text}, answer the question: {question}"

        response = openai.Completion.create(
            model="text-davinci-003",
            prompt=prompt,
            max_tokens=150
        )
        return response.choices[0].text.strip()

    def ragAnswerWithHaystackAndGPT3(self, question, retriever):
        retrievedDocs = self.retrieveDocuments(question, retriever)
        return self.generateAnswerWithGPT3(retrievedDocs, question)

    def genData(self, strVal):
        try:
            basePath = self.basePath
            modelFileName = self.modelFileName
            vectorDBPath = self.vectorDBPath
            vectorDBFileName = self.vectorDBFileName
            queryModel = self.queryModel
            passageModel = self.passageModel

            print('*'*120)
            print('Index Your Data for Retrieval:')
            print('*'*120)

            FullFileName = basePath + modelFileName
            FullVectorDBname = vectorDBPath + vectorDBFileName

            sqlite_path = "sqlite:///" + FullVectorDBname + '.db'
            print('Vector DB Path: ', str(sqlite_path))

            indexFile = "vectorDB/" + str(vectorDBFileName) + '.faiss'
            indexConfig = "vectorDB/" + str(vectorDBFileName) + ".json"

            print('File: ', str(indexFile))
            print('Config: ', str(indexConfig))

            # Initialize DocumentStore
            document_store = FAISSDocumentStore(sql_url=sqlite_path)

            document_store.write_documents(documents)

            # Initialize Retriever
            retriever = DensePassageRetriever(document_store=document_store,
                                              query_embedding_model=queryModel,
                                              passage_embedding_model=passageModel,
                                              use_gpu=False)

            document_store.update_embeddings(retriever=retriever)

            # Reuse the index & config paths computed above when saving
            document_store.save(index_path=indexFile, config_path=indexConfig)

            print('*'*120)
            print('Testing with RAG & OpenAI...')
            print('*'*120)

            answer = self.ragAnswerWithHaystackAndGPT3(strVal, retriever)

            print('*'*120)
            print('Testing Answer:: ')
            print(answer)
            print('*'*120)

            return 0

        except Exception as e:
            x = str(e)
            print('Error: ', x)

            return 1

In the above script, the following essential steps took place –

First, the application calls the clsCreateList class to store all the documents inside a dictionary.
Then, it stores the data inside the vector DB, creates the embeddings & saves the FAISS index along with its config, which will be reused later (if you remember, we’ve used this as a model in our previous post).
Finally, it tests a few sample use cases by supplying the proper context to OpenAI & confirming the response.

Here is a short clip of how the RAG models contextualize with the source data.

RAG-Model Contextualization

So, finally, we’ve done it.

I know that this post is relatively bigger than my earlier post. But, I think, you can get all the details once you go through it.

You will get the complete codebase in the following GitHub link.

I’ll bring some more exciting topics in the coming days from the Python verse. Please share & subscribe to my post & let me know your feedback.

Till then, Happy Avenging! 🙂

Note: All the data & scenarios posted here are representational data & scenarios & available over the internet & for educational purposes only. Some of the images (except my photo) we’ve used are available over the net. We don’t claim ownership of these images. There is always room for improvement & especially in the prediction quality.

RAG implementation of LLMs by using Python, Haystack & React (Part – 1)

Posted on August 31, 2023 by SatyakiDe in analytic function, api, Azure, chainlang, cloud, code, faiss, gpt3, Haystack, integration, IoT, java, json, Keras, machine-learning, mobile, natural-language, numpy, openai, Pandas, Performance, Python, React, Real-time, regex, replace, snippet, sql, String Manipulation, Technology, Tensorflow

Today, I will share a new post in a multi-part series about building end-to-end LLM applications that feed source data through a RAG implementation. I’ll also use the OpenAI Python SDK and Haystack embeddings in this case.

In this post, I’ve directly subscribed to OpenAI & I’m not using OpenAI from Azure. However, I’ll explore that in the future as well.

Before I explain the process to invoke this new library, why not view the demo first & then discuss it?

Demo

FLOW OF EVENTS:

Let us look at the flow diagram as it captures the sequence of events that unfold as part of the process.

As you can see, to enable this large & complex solution, we must first establish the capabilities to build applications powered by LLMs, Transformer models, vector search, and more. You can use state-of-the-art NLP models to perform question-answering, answer generation, semantic document search, or build tools capable of complex decision-making and query resolution. Steps 1 & 2 cover embedding the data & creating that informed repository. We’ll discuss them in our second part.

Once you have the informed repository, the system can interact with the end-users. As part of the query (shown in step 3), the prompt & the question are shared with the process engine, which then reduces the volume, fetches the relevant context from our informed repository & returns the tuned context as part of the response (shown in steps 4, 5 & 6).

Then, this tuned context is shared with OpenAI to produce a better response, along with a summary & concluding remarks that are user-friendly & easier for end-users to understand (shown in steps 8 & 9).
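To make the last two steps a little more concrete, here is a minimal sketch of how retrieved context & the user's question could be packed into a chat-style prompt. The function name build_messages, the max_chars budget & the sample passages are assumptions for illustration only; they are not part of the project code.

```python
from typing import Dict, List


def build_messages(context_docs: List[str], question: str, max_chars: int = 2000) -> List[Dict[str, str]]:
    """Assemble a chat-style prompt from retrieved context and the user's question."""
    context_parts: List[str] = []
    used = 0
    for doc in context_docs:
        # Stop adding context once the (hypothetical) character budget is spent.
        if used + len(doc) > max_chars:
            break
        context_parts.append(doc)
        used += len(doc)

    context = "\n".join(context_parts)
    return [
        {"role": "system", "content": "Answer only from the supplied context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]


if __name__ == "__main__":
    # Hypothetical passages standing in for the retrieved context.
    docs = ["The vase dates to around 500 BC.", "It is held in the Greek and Roman Art department."]
    for msg in build_messages(docs, "How old is the vase?"):
        print(msg["role"], "->", msg["content"])
```

Capping the context before the call keeps the request small, which is the main reason a RAG layer can lower the cost per question.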

IMPORTANT PACKAGES:

The following are the important packages that are essential to this project –

pip install farm-haystack==1.19.0
pip install Flask==2.2.5
pip install Flask-Cors==4.0.0
pip install Flask-JWT-Extended==4.5.2
pip install Flask-Session==0.5.0
pip install openai==0.27.8
pip install pandas==2.0.3
pip install tensorflow==2.11.1

CODE:

We have both the front-end, built with React, & the back-end APIs, built with Python Flask & OpenAI, to create this experience.

Python:

Today, we’ll be going in reverse mode. We’ll first discuss the main script & then explain all the other class scripts.

flaskServer.py (This is the main calling Python script to invoke the RAG-Server.)

#########################################################
#### Written By: SATYAKI DE                          ####
#### Written On: 27-Jun-2023                         ####
#### Modified On 28-Jun-2023                         ####
####                                                 ####
#### Objective: This is the main calling             ####
#### python script that will invoke the              ####
#### shortcut application created inside MAC         ####
#### environment including MacBook, IPad or IPhone.  ####
####                                                 ####
#########################################################

from flask import Flask, jsonify, request, session
from flask_cors import CORS
from werkzeug.security import check_password_hash, generate_password_hash
from flask_jwt_extended import JWTManager, jwt_required, create_access_token
import pandas as pd
from clsConfigClient import clsConfigClient as cf
import clsL as log
import clsContentScrapper as csc
import clsRAGOpenAI as crao
import csv
from datetime import timedelta
import os
import re
import json

########################################################
################    Global Area   ######################
########################################################
#Initiating Logging Instances
clog = log.clsL()

admin_key = cf.conf['ADMIN_KEY']
secret_key = cf.conf['SECRET_KEY']
session_path = cf.conf['SESSION_PATH']
sessionFile = cf.conf['SESSION_CACHE_FILE']

app = Flask(__name__)
CORS(app)  # This will enable CORS for all routes
app.config['JWT_SECRET_KEY'] = admin_key  # Change this!
app.secret_key = secret_key

jwt = JWTManager(app)

users = cf.conf['USER_NM']
passwd = cf.conf['USER_PWD']

cCScrapper = csc.clsContentScrapper()
cr = crao.clsRAGOpenAI()

# Disabling Warning
def warn(*args, **kwargs):
    pass

import warnings
warnings.warn = warn

# Define the aggregation functions
def join_unique(series):
    unique_vals = series.drop_duplicates().astype(str)
    return ', '.join(filter(lambda x: x != 'nan', unique_vals))

# Building the preaggregate cache
def groupImageWiki():
    try:
        base_path = cf.conf['OUTPUT_PATH']
        inputFile = cf.conf['CLEANED_FILE']
        outputFile = cf.conf['CLEANED_FILE_SHORT']
        subdir = cf.conf['SUBDIR_OUT']
        Ind = cf.conf['DEBUG_IND']

        inputCleanedFileLookUp = base_path + inputFile

        #Opening the file in dataframe
        df = pd.read_csv(inputCleanedFileLookUp)
        hash_values = df['Total_Hash'].unique()

        # Work on a copy so the column assignments below do not touch a view of df
        dFin = df[['primaryImage','Wiki_URL','Total_Hash']].copy()

        # Ensure columns are strings and not NaN
        # Convert columns to string and replace 'nan' with an empty string
        dFin['primaryImage'] = dFin['primaryImage'].astype(str).replace('nan', '')
        dFin['Wiki_URL'] = dFin['Wiki_URL'].astype(str).replace('nan', '')

        # drop_duplicates() returns a new frame, so keep the result
        dFin = dFin.drop_duplicates()

        # Group by 'Total_Hash' and aggregate
        dfAgg = dFin.groupby('Total_Hash').agg({'primaryImage': join_unique,'Wiki_URL': join_unique}).reset_index()

        return dfAgg

    except Exception as e:
        x = str(e)
        print('Error: ', x)

        df = pd.DataFrame()

        return df

resDf = groupImageWiki()

########################################################
################  End  Global Area  ####################
########################################################

def extractRemoveUrls(hash_value):
    image_urls = ''
    wiki_urls = ''
    # Parse the inner message JSON string
    try:

        resDf['Total_Hash'] = resDf['Total_Hash'].astype(int)
        filtered_df = resDf[resDf['Total_Hash'] == int(hash_value)]

        if not filtered_df.empty:
            image_urls = filtered_df['primaryImage'].values[0]
            wiki_urls = filtered_df['Wiki_URL'].values[0]

        return image_urls, wiki_urls

    except Exception as e:
        x = str(e)
        print('extractRemoveUrls Error: ', x)
        return image_urls, wiki_urls

def isIncomplete(line):
    """Check if a line appears to be incomplete."""

    # Check if the line ends with certain patterns indicating it might be incomplete.
    incomplete_patterns = [': [Link](', ': Approximately ', ': ']
    return any(line.endswith(pattern) for pattern in incomplete_patterns)

def filterData(data):
    """Return only the complete lines from the data."""

    lines = data.split('\n')
    complete_lines = [line for line in lines if not isIncomplete(line)]

    return '\n'.join(complete_lines)

def updateCounter(sessionFile):
    try:
        counter = 0

        # Check if the CSV file exists
        if os.path.exists(sessionFile):
            with open(sessionFile, 'r') as f:
                reader = csv.reader(f)
                for row in reader:
                    # Assuming the counter is the first value in the CSV
                    counter = int(row[0])

        # Increment counter
        counter += 1

        # Write counter back to CSV
        with open(sessionFile, 'w', newline='') as f:
            writer = csv.writer(f)
            writer.writerow([counter])

        return counter
    except Exception as e:
        x = str(e)
        print('Error: ', x)

        return 1

def getPreviousResult():
    try:
        fullFileName = session_path + sessionFile
        newCounterValue = updateCounter(fullFileName)

        return newCounterValue
    except Exception as e:
        x = str(e)
        print('Error: ', x)

        return 1

@app.route('/login', methods=['POST'])
def login():
    # get_json(silent=True) returns None instead of raising on a missing or invalid body
    payload = request.get_json(silent=True) or {}
    username = payload.get('username')
    password = payload.get('password')

    # Log the user name only; never print the password
    print('User Name: ', str(username))

    #if username not in users or not check_password_hash(users.get(username), password):
    if username is None or password is None or (username not in users) or (password not in passwd):
        return jsonify({'login': False}), 401

    access_token = create_access_token(identity=username)
    return jsonify(access_token=access_token)

@app.route('/chat', methods=['POST'])
def get_chat():
    try:
        #session["key"] = "1D98KI"
        #session_id = session.sid
        #print('Session Id: ', str(session_id))

        cnt = getPreviousResult()
        print('Running Session Count: ', str(cnt))

        username = request.json.get('username', None)
        message = request.json.get('message', None)

        print('User: ', str(username))
        print('Content: ', str(message))

        if cnt == 1:
            retList = cCScrapper.extractCatalog()
        else:
            hashValue, cleanedData = cr.getData(str(message))
            print('Main Hash Value:', str(hashValue))

            imageUrls, wikiUrls = extractRemoveUrls(hashValue)
            print('Image URLs: ', str(imageUrls))
            print('Wiki URLs: ', str(wikiUrls))
            print('Clean Text:')
            print(str(cleanedData))
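            # Build the payload with json.dumps so quotes in the answer cannot break the JSON.
            # getData() returns newlines escaped as '\\n', so restore them before serializing.
            record = {"Id": str(cleanedData).replace('\\n', '\n'), "Image": str(imageUrls), "Wiki": str(wikiUrls)}
            retList = json.dumps({"records": [record]})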

        response = {
            'message': retList
        }

        print('JSON: ', str(response))
        return jsonify(response)

    except Exception as e:
        x = str(e)

        response = {
            'message': 'Error: ' + x
        }
        return jsonify(response)

@app.route('/api/data', methods=['GET'])
@jwt_required()
def get_data():
    response = {
        'message': 'Hello from Flask!'
    }
    return jsonify(response)

if __name__ == '__main__':
    app.run(debug=True)

Let us understand some of the important sections of the above script –

Function – login():

The login function retrieves a ‘username’ and ‘password’ from the JSON request and logs the attempt. If either value is not found in the configured user or password lists, it returns a failure JSON response with a 401 status. Otherwise, it creates an access token and returns it in a JSON response.
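The commented-out line inside login() hints at a hashed-password check. As one possible direction (not the method used in this post), here is a minimal sketch that ties each password to its own user by storing a salted hash. It uses only the Python standard library; the in-memory store & the function names register & verify are hypothetical.

```python
import hashlib
import hmac
import os
from typing import Dict, Tuple

# Hypothetical in-memory user store: username -> (salt, derived key).
UserStore = Dict[str, Tuple[bytes, bytes]]


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a key from the password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def register(store: UserStore, username: str, password: str) -> None:
    """Store a fresh random salt and the derived key, never the plain password."""
    salt = os.urandom(16)
    store[username] = (salt, hash_password(password, salt))


def verify(store: UserStore, username: str, password: str) -> bool:
    """Return True only when the password matches the one registered for this user."""
    record = store.get(username)
    if record is None:
        return False
    salt, expected = record
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, hash_password(password, salt))


if __name__ == "__main__":
    store: UserStore = {}
    register(store, "alice", "s3cret")
    print(verify(store, "alice", "s3cret"))
    print(verify(store, "alice", "wrong"))
```

In the Flask route, a check like verify() would sit where the membership test is today, before create_access_token is called.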

Function – get_chat():

The get_chat function retrieves the running session count and the user input from the JSON request. On the first request (a count of 1), it returns the catalog data. Otherwise, it passes the user’s message to the RAG framework, which retrieves the context & gets the refined response from OpenAI, and then looks up the image URLs and wiki URLs for the returned hash value. If an error arises, the function captures it and returns the error as a JSON message.
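The records payload that get_chat returns is a JSON string nested inside the response message, which the front-end parses again. Here is a minimal sketch of serializing such a payload with json.dumps, so that quotes or newlines inside the answer are escaped automatically. The helper name build_records_payload & the example.org URLs are assumptions for illustration.

```python
import json


def build_records_payload(text: str, image_urls: str, wiki_urls: str) -> str:
    """Serialize one chat record; json.dumps escapes quotes and newlines."""
    return json.dumps({"records": [{"Id": text, "Image": image_urls, "Wiki": wiki_urls}]})


if __name__ == "__main__":
    payload = build_records_payload(
        'A "red-figure" vase.\nSecond line.',
        "https://example.org/vase.jpg",
        "https://example.org/a, https://example.org/b",
    )
    print(payload)
    # Parsing the payload gives back the original text, quotes and newline included.
    print(json.loads(payload)["records"][0]["Id"])
```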

Function – updateCounter():

The updateCounter function checks if a given CSV file exists and retrieves its counter value. It then increments the counter and writes it back to the CSV. If any errors occur, an error message is printed, and the function returns a value of 1.
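The read, increment & write-back cycle can be tried in isolation. Below is a minimal standalone sketch of the same idea that runs against a temporary file; the name update_counter & the file name session.csv are illustrative assumptions. Note that in the script above, an error also returns 1, which get_chat treats the same way as a first request.

```python
import csv
import os
import tempfile


def update_counter(path: str) -> int:
    """Read the last stored counter (0 if the file is missing), add one, write it back."""
    counter = 0
    if os.path.exists(path):
        with open(path, newline="") as f:
            for row in csv.reader(f):
                if row:  # skip blank lines
                    counter = int(row[0])

    counter += 1

    with open(path, "w", newline="") as f:
        csv.writer(f).writerow([counter])

    return counter


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        session_file = os.path.join(tmp, "session.csv")
        print(update_counter(session_file))  # first call on a fresh file
        print(update_counter(session_file))  # second call reads the stored value
```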

Function – extractRemoveUrls():

The extractRemoveUrls function attempts to filter a data frame, resDf, based on a provided hash value to extract image and wiki URLs. If the data frame contains matching entries, it retrieves the corresponding URLs. Any errors encountered are printed, but the function always returns the image and wiki URLs, even if they are empty.

clsContentScrapper.py (This is the main class that brings the default options for the users if they agree with the initial prompt by the bot.)

#####################################################
#### Written By: SATYAKI DE                      ####
#### Written On: 27-May-2023                     ####
#### Modified On 28-May-2023                     ####
####                                             ####
#### Objective: This is the main calling         ####
#### python class that will invoke the           ####
#### LangChain of package to extract             ####
#### the transcript from the YouTube videos &    ####
#### then answer the questions based on the      ####
#### topics selected by the users.               ####
####                                             ####
#####################################################

from langchain.document_loaders import YoutubeLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.chat_models import ChatOpenAI
from langchain.chains import LLMChain

from langchain.prompts.chat import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
)

from googleapiclient.discovery import build

import clsTemplate as ct
from clsConfigClient import clsConfigClient as cf

import os

from flask import jsonify
import requests

###############################################
###           Global Section                ###
###############################################
open_ai_Key = cf.conf['OPEN_AI_KEY']
os.environ["OPENAI_API_KEY"] = open_ai_Key
embeddings = OpenAIEmbeddings(openai_api_key=open_ai_Key)

YouTube_Key = cf.conf['YOUTUBE_KEY']
youtube = build('youtube', 'v3', developerKey=YouTube_Key)

# Disabling Warning
def warn(*args, **kwargs):
    pass

import warnings
warnings.warn = warn

###############################################
###    End of Global Section                ###
###############################################

class clsContentScrapper:
    def __init__(self):
        self.model_name = cf.conf['MODEL_NAME']
        self.temp_val = cf.conf['TEMP_VAL']
        self.max_cnt = int(cf.conf['MAX_CNT'])
        self.url = cf.conf['BASE_URL']
        self.header_token = cf.conf['HEADER_TOKEN']

    def extractCatalog(self):
        try:
            base_url = self.url
            header_token = self.header_token

            url = base_url + '/departments'

            print('Full URL: ', str(url))

            payload={}
            headers = {'Cookie': header_token}

            response = requests.request("GET", url, headers=headers, data=payload)

            x = response.text

            return x
        except Exception as e:
            discussedTopic = []
            x = str(e)
            print('Error: ', x)

            return x

Let us understand the core part that we require from this class.

Function – extractCatalog():

The extractCatalog function uses specific headers to make a GET request to a constructed URL. The URL is derived by appending ‘/departments’ to a base_url, and a header token is used in the request headers. If successful, it returns the text of the response; if there’s an exception, it prints the error and returns the error message.

clsRAGOpenAI.py (This is the main class that brings the RAG-enabled context that is fed to OpenAI for a refined response at a lower cost.)

#########################################################
#### Written By: SATYAKI DE                          ####
#### Written On: 27-Jun-2023                         ####
#### Modified On 28-Jun-2023                         ####
####                                                 ####
#### Objective: This is the main calling             ####
#### python script that will invoke the              ####
#### shortcut application created inside MAC         ####
#### environment including MacBook, IPad or IPhone.  ####
####                                                 ####
#########################################################

from haystack.document_stores.faiss import FAISSDocumentStore
from haystack.nodes import DensePassageRetriever
import openai

from clsConfigClient import clsConfigClient as cf
import clsL as log

# Disabling Warning
def warn(*args, **kwargs):
    pass

import warnings
warnings.warn = warn

import os
import re
###############################################
###           Global Section                ###
###############################################
Ind = cf.conf['DEBUG_IND']
queryModel = cf.conf['QUERY_MODEL']
passageModel = cf.conf['PASSAGE_MODEL']

#Initiating Logging Instances
clog = log.clsL()

os.environ["TOKENIZERS_PARALLELISM"] = "false"

vectorDBFileName = cf.conf['VECTORDB_FILE_NM']

indexFile = "vectorDB/" + str(vectorDBFileName) + '.faiss'
indexConfig = "vectorDB/" + str(vectorDBFileName) + ".json"

print('File: ', str(indexFile))
print('Config: ', str(indexConfig))

# Also, provide `config_path` parameter if you set it when calling the `save()` method:
new_document_store = FAISSDocumentStore.load(index_path=indexFile, config_path=indexConfig)

# Initialize Retriever
retriever = DensePassageRetriever(document_store=new_document_store,
                                  query_embedding_model=queryModel,
                                  passage_embedding_model=passageModel,
                                  use_gpu=False)


###############################################
###    End of Global Section                ###
###############################################

class clsRAGOpenAI:
    def __init__(self):
        self.basePath = cf.conf['DATA_PATH']
        self.fileName = cf.conf['FILE_NAME']
        self.Ind = cf.conf['DEBUG_IND']
        self.subdir = str(cf.conf['OUT_DIR'])
        self.base_url = cf.conf['BASE_URL']
        self.outputPath = cf.conf['OUTPUT_PATH']
        self.vectorDBPath = cf.conf['VECTORDB_PATH']
        self.openAIKey = cf.conf['OPEN_AI_KEY']
        self.temp = cf.conf['TEMP_VAL']
        self.modelName = cf.conf['MODEL_NAME']
        self.maxToken = cf.conf['MAX_TOKEN']

    def extractHash(self, text):
        try:
            # Regular expression pattern to match "Ref: {'<digits>'}" and capture the digits
            pattern = r"Ref: \{'(\d+)'\}"
            match = re.search(pattern, text)

            if match:
                return match.group(1)
            else:
                return None
        except Exception as e:
            x = str(e)
            print('Error: ', x)

            return None

    def removeSentencesWithNaN(self, text):
        try:
            # Split text into sentences using regular expression
            sentences = re.split(r'(?<!\w\.\w.)(?<![A-Z][a-z]\.)(?<=\.|\?)\s', text)
            # Filter out sentences containing 'nan'
            filteredSentences = [sentence for sentence in sentences if 'nan' not in sentence]
            # Rejoin the sentences
            return ' '.join(filteredSentences)
        except Exception as e:
            x = str(e)
            print('Error: ', x)

            return ''

    def retrieveDocumentsReader(self, question, top_k=9):
        return retriever.retrieve(question, top_k=top_k)

    def generateAnswerWithGPT3(self, retrieved_docs, question):
        try:
            openai.api_key = self.openAIKey
            temp = self.temp
            modelName = self.modelName
            maxToken = self.maxToken

            documentsText = " ".join([doc.content for doc in retrieved_docs])

            filteredDocs = self.removeSentencesWithNaN(documentsText)
            hashValue = self.extractHash(filteredDocs)

            print('RAG Docs:: ')
            print(filteredDocs)
            #prompt = f"Given the following documents: {documentsText}, answer the question accurately based on the above data with the supplied http urls: {question}"

            # Set up a chat-style prompt with your data
            messages = [
                {"role": "system", "content": "You are a helpful assistant, answer the question accurately based on the above data with the supplied http urls. Only relevant content needs to publish. Please do not provide the facts or the texts that results crossing the max_token limits."},
                {"role": "user", "content": filteredDocs}
            ]

            # Chat style invoking the latest model
            response = openai.ChatCompletion.create(
                model=modelName,
                messages=messages,
                temperature = temp,
                max_tokens=maxToken
            )
            return hashValue, response.choices[0].message['content'].strip().replace('\n','\\n')
        except Exception as e:
            x = str(e)
            print('failed to get from OpenAI: ', x)
            # Return a (hashValue, answer) pair so the caller can still unpack two values
            return None, 'Not Available!'

    def ragAnswerWithHaystackAndGPT3(self, question):
        retrievedDocs = self.retrieveDocumentsReader(question)
        return self.generateAnswerWithGPT3(retrievedDocs, question)

    def getData(self, strVal):
        try:
            print('*'*120)
            print('Index Your Data for Retrieval:')
            print('*'*120)

            print('Response from New Docs: ')
            print()

            hashValue, answer = self.ragAnswerWithHaystackAndGPT3(strVal)

            print('GPT3 Answer::')
            print(answer)
            print('Hash Value:')
            print(str(hashValue))

            print('*'*240)
            print('End Of Use RAG to Generate Answers:')
            print('*'*240)

            return hashValue, answer
        except Exception as e:
            x = str(e)
            print('Error: ', x)
            answer = x
            hashValue = 1

            return hashValue, answer

Let us understand some of the important blocks –

Function – ragAnswerWithHaystackAndGPT3():

The ragAnswerWithHaystackAndGPT3 function retrieves relevant documents for a given question using the retrieveDocumentsReader method. It then generates an answer for the query using GPT-3 with the retrieved documents via the generateAnswerWithGPT3 method. The final response is returned.

Function – generateAnswerWithGPT3():

The generateAnswerWithGPT3 function, given a list of retrieved documents and a question, communicates with OpenAI’s GPT-3 to generate an answer. It first processes the documents, filtering and extracting a hash value. Using a chat-style format, it prompts GPT-3 with the processed documents and captures its response. If an error occurs, an error message is printed, and “Not Available!” is returned.
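The document processing step can be tried without calling OpenAI. The following is a minimal sketch that reuses the two regular expressions from the class as standalone functions; the snake_case names & the sample text are assumptions for illustration.

```python
import re
from typing import Optional


def remove_sentences_with_nan(text: str) -> str:
    """Split text into sentences and drop every sentence that contains 'nan'."""
    sentences = re.split(r'(?<!\w\.\w.)(?<![A-Z][a-z]\.)(?<=\.|\?)\s', text)
    return ' '.join(s for s in sentences if 'nan' not in s)


def extract_hash(text: str) -> Optional[str]:
    """Return the digits from the first "Ref: {'<digits>'}" marker, if any."""
    match = re.search(r"Ref: \{'(\d+)'\}", text)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Hypothetical retrieved passage with one unusable sentence.
    raw = "The vase is Greek. Its height is nan. Ref: {'12345'}"
    cleaned = remove_sentences_with_nan(raw)
    print(cleaned)
    print(extract_hash(cleaned))
```

One caveat with this approach: the check is a plain substring test, so a sentence containing a word such as "finance" is dropped as well.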

Function – retrieveDocumentsReader():

The retrieveDocumentsReader function takes in a question and an optional parameter, top_k (defaulted to 9). It calls the retriever.retrieve method with the given parameters. The retrieval returns at most nine documents from the RAG engine, which are then fed to OpenAI.
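To illustrate what the top_k cap does, here is a toy sketch that ranks passages & keeps only the best few. The word-overlap score is a deliberately simple stand-in; the real project uses Haystack’s DensePassageRetriever with a FAISS index, which compares embeddings instead. All names & sample passages below are hypothetical.

```python
import heapq
from typing import List, Tuple


def score(question: str, passage: str) -> int:
    """Toy relevance score: number of distinct question words found in the passage."""
    q_words = set(question.lower().split())
    p_words = set(passage.lower().split())
    return len(q_words & p_words)


def retrieve(question: str, passages: List[str], top_k: int = 9) -> List[str]:
    """Return at most top_k passages, highest score first."""
    scored: List[Tuple[int, str]] = [(score(question, p), p) for p in passages]
    best = heapq.nlargest(top_k, scored, key=lambda pair: pair[0])
    return [p for _, p in best]


if __name__ == "__main__":
    passages = ["greek vase red figure", "roman coin silver", "greek coin", "egyptian mask"]
    print(retrieve("greek vase", passages, top_k=2))
```

A smaller top_k sends less context to OpenAI, while a larger one lowers the chance of missing the relevant passage.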

React:

App.js (This is the main React script that creates the interface & parses the data, apart from handling the authentication.)

// App.js
import React, { useState } from 'react';
import axios from 'axios';
import './App.css';

const App = () => {
  const [isLoggedIn, setIsLoggedIn] = useState(false);
  const [username, setUsername] = useState('');
  const [password, setPassword] = useState('');
  const [message, setMessage] = useState('');
  const [chatLog, setChatLog] = useState([{ sender: 'MuBot', message: 'Welcome to MuBot! Please explore the world of History from our brilliant collections! Do you want to proceed to see the catalog?'}]);

  const handleLogin = async (e) => {
    e.preventDefault();
    try {
      const response = await axios.post('http://localhost:5000/login', { username, password });
      if (response.status === 200) {
        setIsLoggedIn(true);
      }
    } catch (error) {
      console.error('Login error:', error);
    }
  };

  const sendMessage = async (username) => {
    if (message.trim() === '') return;

    // Create a new chat entry
    const newChatEntry = {
      sender: 'user',
      message: message.trim(),
    };

    // Clear the input field
    setMessage('');

    try {
      // Make API request to Python-based API
      const response = await axios.post('http://localhost:5000/chat', { message: newChatEntry.message }); // Replace with your API endpoint URL
      const responseData = response.data;

      // Print the response to the console for debugging
      console.log('API Response:', responseData);

      // Parse the nested JSON from the 'message' attribute
      const jsonData = JSON.parse(responseData.message);

      // Check if the data contains 'departments'
      if (jsonData.departments) {

        // Extract the 'departments' attribute from the parsed data
        const departments = jsonData.departments;

        // Extract the department names and create a single string with line breaks
        const botResponseText = departments.reduce((acc, department) => {return acc + department.departmentId + ' ' + department.displayName + '\n';}, '');

        // Update the chat log with the bot's response
        setChatLog((prevChatLog) => [...prevChatLog, { sender: 'user', message: message }, { sender: 'bot', message: botResponseText },]);
      }
      else if (jsonData.records)
      {
        // Data structure 2: Artwork information
        const records = jsonData.records;

        // Prepare chat entries
        const chatEntries = [];

        // Iterate through records and extract text, image, and wiki information
        records.forEach((record) => {
          const textInfo = Object.entries(record).map(([key, value]) => {
            if (key !== 'Image' && key !== 'Wiki') {
              return `${key}: ${value}`;
            }
            return null;
          }).filter((info) => info !== null).join('\n');

          const imageLink = record.Image;
          //const wikiLinks = JSON.parse(record.Wiki.replace(/'/g, '"'));
          //const wikiLinks = record.Wiki;
          const wikiLinks = record.Wiki.split(',').map(link => link.trim());

          console.log('Wiki:', wikiLinks);

          // Check if there is a valid image link
          const hasValidImage = imageLink && imageLink !== '[]';

          const imageElement = hasValidImage ? (
            <img src={imageLink} alt="Artwork" style={{ maxWidth: '100%' }} />
          ) : null;

          // Create JSX elements for rendering the wiki links (if available)
          const wikiElements = wikiLinks.map((link, index) => (
            <div key={index}>
              <a href={link} target="_blank" rel="noopener noreferrer">
                Wiki Link {index + 1}
              </a>
            </div>
          ));

          if (textInfo) {
            chatEntries.push({ sender: 'bot', message: textInfo });
          }

          if (imageElement) {
            chatEntries.push({ sender: 'bot', message: imageElement });
          }

          if (wikiElements.length > 0) {
            chatEntries.push({ sender: 'bot', message: wikiElements });
          }
        });

        // Update the chat log with the bot's response
        setChatLog((prevChatLog) => [...prevChatLog, { sender: 'user', message }, ...chatEntries, ]);
      }

    } catch (error) {
      console.error('Error sending message:', error);
    }
  };

  if (!isLoggedIn) {
    return (
      <div className="login-container">
        <h2>Welcome to the MuBot</h2>
        <form onSubmit={handleLogin} className="login-form">
          <input
            type="text"
            placeholder="Enter your name"
            value={username}
            onChange={(e) => setUsername(e.target.value)}
            required
          />
          <input
            type="password"
            placeholder="Enter your password"
            value={password}
            onChange={(e) => setPassword(e.target.value)}
            required
          />
          <button type="submit">Login</button>
        </form>
      </div>
    );
  }

  return (
    <div className="chat-container">
      <div className="chat-header">
        <h2>Hello, {username}</h2>
        <h3>Chat with MuBot</h3>
      </div>
      <div className="chat-log">
        {chatLog.map((chatEntry, index) => (
          <div
            key={index}
            className={`chat-entry ${chatEntry.sender === 'user' ? 'user' : 'bot'}`}
          >
            <span className="user-name">{chatEntry.sender === 'user' ? username : 'MuBot'}</span>
            <p className="chat-message">{chatEntry.message}</p>
          </div>
        ))}
      </div>
      <div className="chat-input">
        <input
          type="text"
          placeholder="Type your message..."
          value={message}
          onChange={(e) => setMessage(e.target.value)}
          onKeyPress={(e) => {
            if (e.key === 'Enter') {
              sendMessage();
            }
          }}
        />
        <button onClick={sendMessage}>Send</button>
      </div>
    </div>
  );
};

export default App;

Please find some of the important logic –

Function – handleLogin():

The handleLogin asynchronous function responds to an event by preventing its default action. It attempts to post a login request with a username and password to a local server endpoint. If the response is successful with a status of 200, it updates a state variable to indicate a successful login; otherwise, it logs any encountered errors.

Function – sendMessage():

The sendMessage asynchronous function is designed to handle the user’s chat interaction:

If the message is empty (after trimming spaces), the function exits without further action.
A chat entry object is created with the sender set as ‘user’ and the trimmed message.
The input field’s message is cleared, and an API request is made to a local server endpoint with the chat message.
If the API responds with a ‘departments’ attribute in its JSON, a bot response is crafted by iterating over department details.
If the API responds with ‘records’ indicating artwork information, the bot crafts responses for each record, extracting text, images, and wiki links, and generating JSX elements for rendering them.
After processing the API response, the chat log state is updated with the user’s original message and the bot’s responses.
Errors, if encountered, are logged to the console.

This function enables interactive chat with bot responses that vary based on the nature of the data received from the API.
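If you want to exercise the API without the React front-end, the same branching can be sketched in Python. This is only a rough mirror of the steps listed above, returning plain text lines instead of JSX; the function name render_reply & the sample payloads are assumptions for illustration.

```python
import json
from typing import List


def render_reply(message_json: str) -> List[str]:
    """Turn the nested 'message' JSON into display lines, branching on its shape."""
    data = json.loads(message_json)
    lines: List[str] = []

    if data.get("departments"):
        # Catalog response: one line per department.
        for dept in data["departments"]:
            lines.append(f"{dept['departmentId']} {dept['displayName']}")
    elif data.get("records"):
        # Artwork response: text fields first, then the image, then the wiki links.
        for record in data["records"]:
            for key, value in record.items():
                if key not in ("Image", "Wiki"):
                    lines.append(f"{key}: {value}")
            if record.get("Image"):
                lines.append(f"Image: {record['Image']}")
            links = [part.strip() for part in record.get("Wiki", "").split(",")]
            for index, link in enumerate(links):
                if link:
                    lines.append(f"Wiki Link {index + 1}: {link}")

    return lines


if __name__ == "__main__":
    sample = '{"records":[{"Id":"A vase.","Image":"http://x/i.jpg","Wiki":"http://w/1, http://w/2"}]}'
    for line in render_reply(sample):
        print(line)
```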

DIRECTORY STRUCTURES:

Let us explore the directory structure, from the parent down to some of the important child folders. It should look like this –

So, finally, we’ve done it.

I know that this post is relatively bigger than my earlier post. But, I think, you can get all the details once you go through it.

You will get the complete codebase in the following GitHub link.

I’ll bring some more exciting topics in the coming days from the Python verse. Please share & subscribe to my post & let me know your feedback.

Till then, Happy Avenging! 🙂

Tuning your model using the python-based low-code machine-learning library PyCaret

Posted on March 31, 2023 by SatyakiDe in ai, Azure, call, cloud, code, computing, Crossplatform, Data Science, features, function, json, Keras, machine-learning, Model, numpy, Pandas, Performance, Python, Technology, video

Today, I’ll discuss another important topic before sharing an excellent use case next month, as I still need some time to finish that one. We’ll see how we can leverage the brilliant capability of a low-code machine-learning library named PyCaret.

But before going through the details, why don’t we view the demo & then go through it?

Demo

Architecture:

Let us understand the flow of events –

As one can see, the training script triggers the initial training requests, & PyCaret trains the candidate models. The application then processes the results & identifies the best model among the combinations it tried.

Python Packages:

The following Python packages are necessary to develop this use case –

pip install pandas
pip install pycaret

PyCaret depends on a combination of other popular Python packages, so they all need to install successfully before you can run it.

CODE:

clsConfigClient.py (Main configuration file)

	################################################
	#### Written By: SATYAKI DE ####
	#### Written On: 15-May-2020 ####
	#### Modified On: 31-Mar-2023 ####
	#### ####
	#### Objective: This script is a config ####
	#### file, contains all the keys for ####
	#### personal AI-driven voice assistant. ####
	#### ####
	################################################

	import os
	import platform as pl

	class clsConfigClient(object):
	    Curr_Path = os.path.dirname(os.path.realpath(__file__))

	    os_det = pl.system()
	    if os_det == "Windows":
	        sep = '\\'
	    else:
	        sep = '/'

	    conf = {
	        'APP_ID': 1,
	        'ARCH_DIR': Curr_Path + sep + 'arch' + sep,
	        'PROFILE_PATH': Curr_Path + sep + 'profile' + sep,
	        'LOG_PATH': Curr_Path + sep + 'log' + sep,
	        'DATA_PATH': Curr_Path + sep + 'data' + sep,
	        'MODEL_PATH': Curr_Path + sep + 'model' + sep,
	        'TEMP_PATH': Curr_Path + sep + 'temp' + sep,
	        'MODEL_DIR': 'model',
	        'APP_DESC_1': 'PyCaret Training!',
	        'DEBUG_IND': 'N',
	        'INIT_PATH': Curr_Path,
	        'FILE_NAME': 'Titanic.csv',
	        'MODEL_NAME': 'PyCaret-ft-personal-2023-03-31-04-29-53',
	        'TITLE': "PyCaret Training!",
	        'PATH' : Curr_Path,
	        'OUT_DIR': 'data'
	    }

I’m skipping this section as it is self-explanatory.
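For readers who want to adapt the path entries above, here is a minimal sketch (not the author's code) of building the same directory values with `os.path.join`, which picks the separator for the running OS so the explicit `platform` check is not needed. The helper name `build_paths`, the `DIRS` mapping, and the `base_dir` argument are assumptions made for illustration. The original values end with a trailing separator, which the sketch preserves by joining with an empty last component.

```python
import os

# Hypothetical mapping of config keys to sub-folder names, mirroring the post.
DIRS = {
    "ARCH_DIR": "arch",
    "PROFILE_PATH": "profile",
    "LOG_PATH": "log",
    "DATA_PATH": "data",
    "MODEL_PATH": "model",
    "TEMP_PATH": "temp",
}


def build_paths(base_dir, dirs=DIRS):
    """Return {key: '<base_dir><sep><sub-folder><sep>'} for every entry in dirs."""
    # Joining with '' appends the trailing separator the rest of the code relies on
    # when it concatenates a file name directly onto the path.
    return {key: os.path.join(base_dir, sub, "") for key, sub in dirs.items()}


if __name__ == "__main__":
    for key, value in build_paths(os.getcwd()).items():
        print(key, "->", value)
```

Because the trailing separator is kept, expressions such as `data_path + data_file_name` in the calling scripts would continue to work unchanged.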

clsTrainModel.py (This is the main class that contains the core logic of low-code machine-learning library to evaluate the best model for your solutions.)

	#####################################################
	#### Written By: SATYAKI DE ####
	#### Written On: 31-Mar-2023 ####
	#### Modified On 31-Mar-2023 ####
	#### ####
	#### Objective: This is the main class that ####
	#### contains the core logic of low-code ####
	#### machine-learning library to evaluate the ####
	#### best model for your solutions. ####
	#### ####
	#####################################################

	import clsL as cl
	from clsConfigClient import clsConfigClient as cf
	import datetime

	# Import necessary libraries
	import pandas as p
	from pycaret.classification import *

	# Disabling Warning
	def warn(*args, **kwargs):
	    pass

	import warnings
	warnings.warn = warn

	######################################
	### Get your global values ####
	######################################
	debug_ind = 'Y'

	# Initiating Logging Instances
	clog = cl.clsL()
	###############################################
	### End of Global Section ###
	###############################################


	class clsTrainModel:
	    def __init__(self):
	        self.model_path = cf.conf['MODEL_PATH']
	        self.model_name = cf.conf['MODEL_NAME']

	    def trainModel(self, FullFileName):
	        try:
	            df = p.read_csv(FullFileName)
	            row_count = int(df.shape[0])
	            print('Number of rows: ', str(row_count))

	            print(df)

	            # Initialize the setup in PyCaret
	            clf_setup = setup(
	                data=df,
	                target="Survived",
	                train_size=0.8, # 80% for training, 20% for testing
	                categorical_features=["Sex", "Embarked"],
	                ordinal_features={"Pclass": ["1", "2", "3"]},
	                ignore_features=["Name", "Ticket", "Cabin", "PassengerId"],
	                #silent=True, # Set to False for interactive setup
	            )

	            # Compare various models
	            best_model = compare_models()

	            # Create a specific model (e.g., Random Forest)
	            rf_model = create_model("rf")

	            # Hyperparameter tuning
	            tuned_rf_model = tune_model(rf_model)

	            # Evaluate model performance
	            plot_model(tuned_rf_model, plot="confusion_matrix")
	            plot_model(tuned_rf_model, plot="auc")

	            # Finalize the model (train on the complete dataset)
	            final_rf_model = finalize_model(tuned_rf_model)

	            # Make predictions on new data
	            new_data = df.drop("Survived", axis=1)
	            predictions = predict_model(final_rf_model, data=new_data)

	            # Writing into the Model
	            FullModelName = self.model_path + self.model_name

	            print('Model Output @:: ', str(FullModelName))
	            print()

	            # Save the fine-tuned model
	            save_model(final_rf_model, FullModelName)

	            return 0

	        except Exception as e:
	            x = str(e)
	            print('Error: ', x)

	            return 1

Let us understand the code in simple terms –

Import necessary libraries and load the Titanic dataset.
Initialize the PyCaret setup, specifying the target variable, train-test split, categorical and ordinal features, and features to ignore.
Compare various models to find the best-performing one.
Create a specific model (Random Forest in this case).
Perform hyper-parameter tuning on the Random Forest model.
Evaluate the model’s performance using a confusion matrix and AUC-ROC curve.
Finalize the model by training it on the complete dataset.
Make predictions on new data.
Save the trained model for future use.

trainPYCARETModel.py (This is the main calling python script that will invoke the training class of PyCaret package.)

	#####################################################
	#### Written By: SATYAKI DE ####
	#### Written On: 31-Mar-2023 ####
	#### Modified On 31-Mar-2023 ####
	#### ####
	#### Objective: This is the main calling ####
	#### python script that will invoke the ####
	#### training class of Pycaret package. ####
	#### ####
	#####################################################

	import clsL as cl
	from clsConfigClient import clsConfigClient as cf
	import datetime

	import clsTrainModel as tm

	# Disabling Warning
	def warn(*args, **kwargs):
	    pass

	import warnings
	warnings.warn = warn

	######################################
	### Get your global values ####
	######################################
	debug_ind = 'Y'

	# Initiating Logging Instances
	clog = cl.clsL()

	data_path = cf.conf['DATA_PATH']
	data_file_name = cf.conf['FILE_NAME']

	tModel = tm.clsTrainModel()

	######################################
	#### Global Flag ########
	######################################

	def main():
	    try:
	        var = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
	        print('*' * 120)
	        print('Start Time: ' + str(var))
	        print('*' * 120)

	        FullFileName = data_path + data_file_name

	        r1 = tModel.trainModel(FullFileName)

	        if r1 == 0:
	            print('Successfully Trained!')
	        else:
	            print('Failed to Train!')

	        print('*' * 120)
	        var1 = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
	        print('End Time: ' + str(var1))

	    except Exception as e:
	        x = str(e)
	        print('Error: ', x)

	if __name__ == "__main__":
	    main()

The above code is pretty self-explanatory as well.

testPYCARETModel.py (This is the main calling python script that will invoke the testing script for PyCaret package.)

	#####################################################
	#### Written By: SATYAKI DE ####
	#### Written On: 31-Mar-2023 ####
	#### Modified On 31-Mar-2023 ####
	#### ####
	#### Objective: This is the main calling ####
	#### python script that will invoke the ####
	#### testing script for PyCaret package. ####
	#### ####
	#####################################################

	import clsL as cl
	from clsConfigClient import clsConfigClient as cf
	import datetime

	from pycaret.classification import load_model, predict_model

	import pandas as p

	# Disabling Warning
	def warn(*args, **kwargs):
	    pass

	import warnings
	warnings.warn = warn

	######################################
	### Get your global values ####
	######################################
	debug_ind = 'Y'

	# Initiating Logging Instances
	clog = cl.clsL()

	model_path = cf.conf['MODEL_PATH']
	model_name = cf.conf['MODEL_NAME']

	######################################
	#### Global Flag ########
	######################################

	def main():
	    try:
	        var = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
	        print('*' * 120)
	        print('Start Time: ' + str(var))
	        print('*' * 120)

	        FullFileName = model_path + model_name

	        # Load the saved model
	        loaded_model = load_model(FullFileName)

	        # Prepare new data for testing (make sure it has the same columns as the original data)
	        new_data = p.DataFrame({
	            "Pclass": [3, 1],
	            "Sex": ["male", "female"],
	            "Age": [22, 38],
	            "SibSp": [1, 1],
	            "Parch": [0, 0],
	            "Fare": [7.25, 71.2833],
	            "Embarked": ["S", "C"]
	        })

	        # Make predictions using the loaded model
	        predictions = predict_model(loaded_model, data=new_data)

	        # Display the predictions
	        print(predictions)

	        print('*' * 120)
	        var1 = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
	        print('End Time: ' + str(var1))

	    except Exception as e:
	        x = str(e)
	        print('Error: ', x)

	if __name__ == "__main__":
	    main()

In this code, the application loads the stored model & then predicts the outcome for new records using the tuned PyCaret model.

Conclusion:

The above code demonstrates an end-to-end binary classification pipeline using the PyCaret library for the Titanic dataset. The goal is to predict whether a passenger survived based on the available features. Here are some conclusions you can draw from the code and data:

Ease of use: The code showcases how PyCaret simplifies the machine learning process, from data preprocessing to model training, evaluation, and deployment. With just a few lines of code, you can perform tasks that would require much more effort using lower-level libraries.
Model selection: The compare_models() function provides a quick and easy way to compare various machine learning algorithms and identify the best-performing one based on the chosen evaluation metric (accuracy by default). This comparison helps you pick a suitable model for the given problem.
Hyper-parameter tuning: The tune_model() function automates the process of hyper-parameter tuning to improve model performance. We tuned a Random Forest model to optimize its predictive power in the example.
Model evaluation: PyCaret provides several built-in visualization tools for assessing model performance. In the example, we used a confusion matrix and AUC-ROC curve to evaluate the performance of the tuned Random Forest model.
Model deployment: The example demonstrates how to make predictions using the trained model and save the model for future use. This deployment showcases how PyCaret can streamline the process of deploying a machine-learning model in a production environment.
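The model-selection point above notes that compare_models() ranks by accuracy unless told otherwise. As a hedged sketch, and not part of the author's code, the ranking metric can be changed through the `sort` argument and several candidates kept through `n_select`. The function name `rank_models_by_auc`, the `top_n` argument, and the `session_id` value are assumptions for illustration; PyCaret is imported inside the function so the module still loads where the library is not installed.

```python
def rank_models_by_auc(df, target="Survived", top_n=3):
    """Return the top_n candidate models ranked by AUC instead of accuracy."""
    # Imported lazily so this file can be loaded without PyCaret present.
    from pycaret.classification import setup, compare_models

    # session_id fixes the random seed so repeated runs are comparable.
    setup(data=df, target=target, session_id=42)

    # With n_select > 1, compare_models returns a list of trained models.
    return compare_models(sort="AUC", n_select=top_n)
```

Calling `rank_models_by_auc(df)` with the Titanic data frame would then return a list that can be passed on to tune_model() one model at a time.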

It is important to note that the conclusions drawn from the code and data are specific to the Titanic dataset and the chosen features. Adjust the feature engineering, preprocessing, and model selection steps for different datasets or problems accordingly. However, the general workflow and benefits provided by PyCaret would remain the same.

So, finally, we’ve done it.

I know that this post is relatively bigger than my earlier post. But, I think, you can get all the details once you go through it.

You will get the complete codebase in the following GitHub link.

I’ll bring some more exciting topics in the coming days from the Python verse. Please share & subscribe to my post & let me know your feedback.

Till then, Happy Avenging! 🙂

Realtime reading from a Streaming using Computer Vision

Posted on July 26, 2022 by SatyakiDe in api, Azure, call, cloud, code, Computer-Vision, computing, Crossplatform, Data Science, exposure, extends, function, gui, IoT, json, Keras, machine-learning, matplotlib, mobile, Model, numpy, objects, Open-CV, Pandas, pytesseract, Python, Real-time, snippet, video

This week we’re going to extend one of our earlier posts & try to read an entire text from a live stream using computer vision. If you want to view the previous post, please click the following link.

But, before we proceed, why don’t we view the demo first?

Demo

Architecture:

Let us understand the architecture flow –

The above diagram shows that the application, which uses Open-CV, analyzes individual frames from the source, extracts the complete text within the video, displays it on top of the target screen & also prints it in the console.

Python Packages:

pip install imutils==0.5.4
pip install matplotlib==3.5.2
pip install numpy==1.21.6
pip install opencv-contrib-python==4.6.0.66
pip install opencv-contrib-python-headless==4.6.0.66
pip install opencv-python==4.6.0.66
pip install opencv-python-headless==4.6.0.66
pip install pandas==1.3.5
pip install Pillow==9.1.1
pip install pytesseract==0.3.9
pip install python-dateutil==2.8.2

CODE:

Let us now understand the code. For this use case, we will only discuss three python scripts. The application needs more than these three, but we have already discussed the others in some of the earlier posts. Hence, we will skip them here.

clsReadingTextFromStream.py (This is the main class of python script that will extract the text from the WebCAM streaming in real-time.)

	##################################################
	#### Written By: SATYAKI DE ####
	#### Written On: 22-Jul-2022 ####
	#### Modified On 25-Jul-2022 ####
	#### ####
	#### Objective: This is the main class of ####
	#### python script that will invoke the ####
	#### extraction of texts from a WebCAM. ####
	#### ####
	##################################################

	# Importing necessary packages
	from clsConfig import clsConfig as cf

	from imutils.object_detection import non_max_suppression
	import numpy as np
	import pytesseract
	import imutils
	import time
	import cv2

	###############################################
	### Global Section ###
	###############################################

	# Two output layer names for the text detector model

	lNames = cf.conf['LAYER_DET']

	# Tesseract OCR text param values

	strVal = "-l " + str(cf.conf['LANG']) + " --oem " + str(cf.conf['OEM_VAL']) + " --psm " + str(cf.conf['PSM_VAL']) + ""
	config = (strVal)

	###############################################
	### End of Global Section ###
	###############################################

	class clsReadingTextFromStream:
	def __init__(self):
	self.sep = str(cf.conf['SEP'])
	self.Curr_Path = str(cf.conf['INIT_PATH'])
	self.CacheL = int(cf.conf['CACHE_LIM'])
	self.modelPath = str(cf.conf['MODEL_PATH']) + str(cf.conf['MODEL_FILE_NAME'])
	self.minConf = float(cf.conf['MIN_CONFIDENCE'])
	self.wt = int(cf.conf['WIDTH'])
	self.ht = int(cf.conf['HEIGHT'])
	self.pad = float(cf.conf['PADDING'])
	self.title = str(cf.conf['TITLE'])
	self.Otitle = str(cf.conf['ORIG_TITLE'])
	self.drawTag = cf.conf['DRAW_TAG']
	self.aRange = int(cf.conf['ASCII_RANGE'])
	self.sParam = cf.conf['SUBTRACT_PARAM']

	def findBoundBox(self, boxes, res, rW, rH, orig, origW, origH, pad):
	try:
	# Loop over the bounding boxes
	for (spX, spY, epX, epY) in boxes:
	# Scale the bounding box coordinates based on the respective
	# ratios
	spX = int(spX * rW)
	spY = int(spY * rH)
	epX = int(epX * rW)
	epY = int(epY * rH)

	# To obtain a better OCR of the text we can potentially
	# apply a bit of padding surrounding the bounding box.
	# And, computing the deltas in both the x and y directions
	dX = int((epX - spX) * pad)
	dY = int((epY - spY) * pad)

	# Apply padding to each side of the bounding box, respectively
	spX = max(0, spX - dX)
	spY = max(0, spY - dY)
	epX = min(origW, epX + (dX * 2))
	epY = min(origH, epY + (dY * 2))

	# Extract the actual padded ROI
	roi = orig[spY:epY, spX:epX]

	# Choose the proper OCR Config
	text = pytesseract.image_to_string(roi, config=config)

	# Add the bounding box coordinates and OCR'd text to the list
	# of results
	res.append(((spX, spY, epX, epY), text))

	# Sort the results bounding box coordinates from top to bottom
	res = sorted(res, key=lambda r:r[0][1])

	return res
	except Exception as e:
	x = str(e)
	print(x)

	return res

	def predictText(self, imgScore, imgGeo):
	try:
	minConf = self.minConf

	# Initializing the bounding box rectangles & confidence score by
	# extracting the rows & columns from the imgScore volume.
	(numRows, numCols) = imgScore.shape[2:4]
	rects = []
	confScore = []

	for y in range(0, numRows):
	# Extract the imgScore probabilities to derive potential
	# bounding box coordinates that surround text
	imgScoreData = imgScore[0, 0, y]
	xVal0 = imgGeo[0, 0, y]
	xVal1 = imgGeo[0, 1, y]
	xVal2 = imgGeo[0, 2, y]
	xVal3 = imgGeo[0, 3, y]
	anglesData = imgGeo[0, 4, y]

	for x in range(0, numCols):
	# If our score does not have sufficient probability,
	# ignore it
	if imgScoreData[x] < minConf:
	continue

	# Compute the offset factor as our resulting feature
	# maps will be 4x smaller than the input frame
	(offX, offY) = (x * 4.0, y * 4.0)

	# Extract the rotation angle for the prediction and
	# then compute the sin and cosine
	angle = anglesData[x]
	cos = np.cos(angle)
	sin = np.sin(angle)

	# Derive the width and height of the bounding box from
	# imgGeo
	h = xVal0[x] + xVal2[x]
	w = xVal1[x] + xVal3[x]

	# Compute both the starting and ending (x, y)-coordinates
	# for the text prediction bounding box
	epX = int(offX + (cos * xVal1[x]) + (sin * xVal2[x]))
	epY = int(offY - (sin * xVal1[x]) + (cos * xVal2[x]))
	spX = int(epX - w)
	spY = int(epY - h)

	# Adding bounding box coordinates and probability score
	# to the respective lists
	rects.append((spX, spY, epX, epY))
	confScore.append(imgScoreData[x])

	# return a tuple of the bounding boxes and associated confScore
	return (rects, confScore)

	except Exception as e:
	x = str(e)
	print(x)

	rects = []
	confScore = []

	return (rects, confScore)

	def processStream(self, debugInd, var):
	try:
	sep = self.sep
	Curr_Path = self.Curr_Path
	CacheL = self.CacheL
	modelPath = self.modelPath
	minConf = self.minConf
	wt = self.wt
	ht = self.ht
	pad = self.pad
	title = self.title
	Otitle = self.Otitle
	drawTag = self.drawTag
	aRange = self.aRange
	sParam = self.sParam

	val = 0

	# Initialize the video stream and allow the camera sensor to warm up
	print("[INFO] Starting video stream…")
	cap = cv2.VideoCapture(0)

	# Loading the pre-trained text detector
	print("[INFO] Loading Text Detector…")
	net = cv2.dnn.readNet(modelPath)

	# Loop over the frames from the video stream
	while True:
	try:
	# Grab the frame from our video stream and resize it
	success, frame = cap.read()

	orig = frame.copy()
	(origH, origW) = frame.shape[:2]

	# Setting new width and height and then determine the ratio in change
	# for both the width and height
	(newW, newH) = (wt, ht)
	rW = origW / float(newW)
	rH = origH / float(newH)

	# Resize the frame and grab the new frame dimensions
	frame = cv2.resize(frame, (newW, newH))
	(H, W) = frame.shape[:2]

	# Construct a blob from the frame and then perform a forward pass of
	# the model to obtain the two output layer sets
	blob = cv2.dnn.blobFromImage(frame, 1.0, (W, H), sParam, swapRB=True, crop=False)
	net.setInput(blob)
	(confScore, imgGeo) = net.forward(lNames)

	# Decode the predictions, then apply non-maxima suppression to
	# suppress weak, overlapping bounding boxes
	(rects, confidences) = self.predictText(confScore, imgGeo)
	boxes = non_max_suppression(np.array(rects), probs=confidences)

	# Initialize the list of results
	res = []

	# Getting BoundingBox boundaries
	res = self.findBoundBox(boxes, res, rW, rH, orig, origW, origH, pad)

	for ((spX, spY, epX, epY), text) in res:
	# Display the text OCR by using Tesseract APIs
	print("Reading Text::")
	print("=" *60)
	print(text)
	print("=" *60)

	# Removing the non-ASCII text so it can draw the text on the frame
	# using OpenCV, then draw the text and a bounding box surrounding
	# the text region of the input frame
	text = "".join([c if ord(c) < aRange else "" for c in text]).strip()
	output = orig.copy()

	cv2.rectangle(output, (spX, spY), (epX, epY), drawTag, 2)
	cv2.putText(output, text, (spX, spY - 20), cv2.FONT_HERSHEY_SIMPLEX, 1.2, drawTag, 3)

	# Show the output frame
	cv2.imshow(title, output)
	#cv2.imshow(Otitle, frame)

	# If the `q` key was pressed, break from the loop
	if cv2.waitKey(1) == ord('q'):
	break

	val = 0

	except Exception as e:
	x = str(e)
	print(x)

	val = 1

	# Performing cleanup at the end
	cap.release()
	cv2.destroyAllWindows()

	return val
	except Exception as e:
	x = str(e)
	print('Error:', x)

	return 1

Please find the key snippet from the above script –

# Two output layer names for the text detector model

lNames = cf.conf['LAYER_DET']

# Tesseract OCR text param values

strVal = "-l " + str(cf.conf['LANG']) + " --oem " + str(cf.conf['OEM_VAL']) + " --psm " + str(cf.conf['PSM_VAL']) + ""
config = (strVal)

The first line contains the two output layers’ names for the text detector model. The first layer gives the probability that a region contains text, & the second is used to derive the bounding box coordinates of the predicted text.

The second line contains various options for the Tesseract APIs. You need to understand these options in detail to make them work. These are the essential options for our use case –

Language – The intended language, for example, English, Spanish, Hindi, Bengali, etc.
OEM flag – In this case, the application will use 4 to indicate the LSTM neural net model for OCR.
PSM value – In this case, the selected value is 7, indicating that the application treats the ROI as a single line of text.

For more details, please refer to the config file.
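To make the option string easier to inspect, here is a small sketch of how it is assembled. The helper name `build_tesseract_config` is hypothetical, and the values passed in the demo are illustrative only; in the application they come from the `LANG`, `OEM_VAL` & `PSM_VAL` entries of the config file. The one detail worth noticing is that `--oem` and `--psm` take two plain hyphens.

```python
def build_tesseract_config(lang, oem, psm):
    """Build the option string handed to pytesseract through `config=`."""
    # Tesseract's long options need two ASCII hyphens; a typographic dash
    # copied from a web page would not be recognised as an option.
    return "-l {} --oem {} --psm {}".format(lang, oem, psm)


if __name__ == "__main__":
    # Illustrative values only, not the ones from the author's config file.
    print(build_tesseract_config("eng", 1, 7))
```

The resulting string is what the script passes as `config` to `pytesseract.image_to_string`.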

print("[INFO] Loading Text Detector...")
net = cv2.dnn.readNet(modelPath)

The above lines load the pre-trained text-detection model into memory for evaluation.

# Setting new width and height and then determine the ratio in change
# for both the width and height
(newW, newH) = (wt, ht)
rW = origW / float(newW)
rH = origH / float(newH)

# Resize the frame and grab the new frame dimensions
frame = cv2.resize(frame, (newW, newH))
(H, W) = frame.shape[:2]

# Construct a blob from the frame and then perform a forward pass of
# the model to obtain the two output layer sets
blob = cv2.dnn.blobFromImage(frame, 1.0, (W, H), sParam, swapRB=True, crop=False)
net.setInput(blob)
(confScore, imgGeo) = net.forward(lNames)

# Decode the predictions, then apply non-maxima suppression to
# suppress weak, overlapping bounding boxes
(rects, confidences) = self.predictText(confScore, imgGeo)
boxes = non_max_suppression(np.array(rects), probs=confidences)

The above lines prepare each frame for detection: the frame is resized to the configured width & height, & a forward pass of the model returns the two output layer sets. The application then decodes the predictions & applies non-maxima suppression to remove weak, overlapping bounding boxes. In short, this identifies the potential text regions & puts a bounding box around each.
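The decoding step inside predictText can be easier to follow for a single cell of the output maps. The sketch below mirrors the arithmetic of the class shown earlier using only the standard library; the function name `decode_cell`, the tuple layout of `geo`, and the default `min_conf` of 0.5 are assumptions for illustration, not the author's API.

```python
import math


def decode_cell(x, y, score, geo, min_conf=0.5):
    """Decode one feature-map cell into (spX, spY, epX, epY), or None if too weak.

    geo = (d0, d1, d2, d3, angle), matching xVal0..xVal3 and anglesData
    at column x of row y.
    """
    if score < min_conf:
        return None  # not confident enough to be treated as text

    d0, d1, d2, d3, angle = geo

    # The feature maps are 4x smaller than the resized input frame.
    off_x, off_y = x * 4.0, y * 4.0
    cos, sin = math.cos(angle), math.sin(angle)

    # Height and width of the box from the geometry values.
    h = d0 + d2
    w = d1 + d3

    # End point first, then the start point by subtracting width and height.
    ep_x = int(off_x + cos * d1 + sin * d2)
    ep_y = int(off_y - sin * d1 + cos * d2)
    return (int(ep_x - w), int(ep_y - h), ep_x, ep_y)


if __name__ == "__main__":
    print(decode_cell(10, 5, 0.9, (4.0, 6.0, 8.0, 2.0, 0.0)))
```

The boxes produced this way, together with their scores, are what the script hands to non_max_suppression.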

# Initialize the list of results
res = []

# Getting BoundingBox boundaries
res = self.findBoundBox(boxes, res, rW, rH, orig, origW, origH, pad)

The above function will create the bounding box surrounding the predicted text regions. Also, we will capture the expected text inside the result variable.

for (spX, spY, epX, epY) in boxes:
  # Scale the bounding box coordinates based on the respective
  # ratios
  spX = int(spX * rW)
  spY = int(spY * rH)
  epX = int(epX * rW)
  epY = int(epY * rH)

  # To obtain a better OCR of the text we can potentially
  # apply a bit of padding surrounding the bounding box.
  # And, computing the deltas in both the x and y directions
  dX = int((epX - spX) * pad)
  dY = int((epY - spY) * pad)

  # Apply padding to each side of the bounding box, respectively
  spX = max(0, spX - dX)
  spY = max(0, spY - dY)
  epX = min(origW, epX + (dX * 2))
  epY = min(origH, epY + (dY * 2))

  # Extract the actual padded ROI
  roi = orig[spY:epY, spX:epX]

Now, the application scales the bounding boxes back to the original frame size using the previously computed ratios, so that text recognition runs on the full-resolution frame. In this process, the application also pads the bounding boxes & then extracts the padded region of interest.
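A worked example may help here. The sketch below repeats the scaling & padding arithmetic from findBoundBox for one box, including the clamping to the frame edges; the function name `scale_and_pad` and the numbers in the demo are made up for illustration.

```python
def scale_and_pad(box, r_w, r_h, orig_w, orig_h, pad):
    """Scale a box from the resized frame to the original frame, then pad it."""
    sp_x, sp_y, ep_x, ep_y = box

    # Scale by the width and height ratios (original size / resized size).
    sp_x, sp_y = int(sp_x * r_w), int(sp_y * r_h)
    ep_x, ep_y = int(ep_x * r_w), int(ep_y * r_h)

    # Padding is a fraction of the box size in each direction.
    d_x = int((ep_x - sp_x) * pad)
    d_y = int((ep_y - sp_y) * pad)

    # Same padding as the original code: one delta on the start side,
    # two deltas on the end side, clamped to the frame boundaries.
    sp_x = max(0, sp_x - d_x)
    sp_y = max(0, sp_y - d_y)
    ep_x = min(orig_w, ep_x + d_x * 2)
    ep_y = min(orig_h, ep_y + d_y * 2)
    return (sp_x, sp_y, ep_x, ep_y)


if __name__ == "__main__":
    # A box near the top-left corner: the start x is clamped to 0.
    print(scale_and_pad((10, 10, 50, 30), 2.0, 1.5, 640, 480, 0.25))
```

The returned coordinates are then used to slice the region of interest out of the original frame, as in `orig[spY:epY, spX:epX]`.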

# Choose the proper OCR Config
text = pytesseract.image_to_string(roi, config=config)

# Add the bounding box coordinates and OCR'd text to the list
# of results
res.append(((spX, spY, epX, epY), text))

Using OCR options, the application extracts the text within the video frame & adds that to the res list.

# Sort the results bounding box coordinates from top to bottom
res = sorted(res, key=lambda r:r[0][1])

It then sends a sorted output to the primary calling functions.
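The sort key `r[0][1]` picks the starting y-coordinate of each box, so results come back in reading order from the top of the frame to the bottom. A tiny sketch with made-up boxes & text:

```python
# Each entry is ((spX, spY, epX, epY), text); the values are invented.
results = [
    ((40, 120, 200, 150), "second line"),
    ((35, 20, 180, 50), "first line"),
    ((50, 300, 210, 330), "third line"),
]

# r[0] is the box tuple and r[0][1] is its top edge (spY).
ordered = sorted(results, key=lambda r: r[0][1])
texts = [text for _, text in ordered]

if __name__ == "__main__":
    print(texts)
```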

for ((spX, spY, epX, epY), text) in res:
  # Display the text OCR by using Tesseract APIs
  print("Reading Text::")
  print("=" *60)
  print(text)
  print("=" *60)

  # Removing the non-ASCII text so it can draw the text on the frame
  # using OpenCV, then draw the text and a bounding box surrounding
  # the text region of the input frame
  text = "".join([c if ord(c) < aRange else "" for c in text]).strip()
  output = orig.copy()

  cv2.rectangle(output, (spX, spY), (epX, epY), drawTag, 2)
  cv2.putText(output, text, (spX, spY - 20), cv2.FONT_HERSHEY_SIMPLEX, 1.2, drawTag, 3)

  # Show the output frame
  cv2.imshow(title, output)

Finally, it takes each text region along with its text & draws both on top of the source video. It also strips non-ASCII characters at this stage, since OpenCV cannot draw them & they would otherwise show up as cryptic text.
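The character filter can be tried on its own. In the sketch below the helper name `keep_ascii` is hypothetical, and the default limit of 128 is an assumption standing in for the `ASCII_RANGE` value from the config file.

```python
def keep_ascii(text, limit=128):
    """Drop every character whose code point is at or above `limit`, then trim."""
    return "".join(c for c in text if ord(c) < limit).strip()


if __name__ == "__main__":
    print(keep_ascii("  Héllo, wörld™ \n"))
```

Note that the filter drops accented letters rather than transliterating them, which is acceptable for an on-screen overlay but loses information for languages other than English.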

readingVideo.py (Main calling script.)

	#####################################################
	#### Written By: SATYAKI DE ####
	#### Written On: 22-Jul-2022 ####
	#### Modified On 25-Jul-2022 ####
	#### ####
	#### Objective: This is the main calling ####
	#### python script that will invoke the ####
	#### clsReadingTextFromStream class to initiate ####
	#### the reading capability in real-time ####
	#### & display text via Web-CAM. ####
	#####################################################

	# We keep the setup code in a different class as shown below.
	import clsReadingTextFromStream as rtfs

	from clsConfig import clsConfig as cf

	import datetime
	import logging

	###############################################
	### Global Section ###
	###############################################
	# Instantiating all the main class

	x1 = rtfs.clsReadingTextFromStream()

	###############################################
	### End of Global Section ###
	###############################################

	def main():
	    try:
	        # Other useful variables
	        debugInd = 'Y'
	        var = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
	        var1 = datetime.datetime.now()

	        print('Start Time: ', str(var))
	        # End of useful variables

	        # Initiating Log Class
	        general_log_path = str(cf.conf['LOG_PATH'])

	        # Enabling Logging Info
	        logging.basicConfig(filename=general_log_path + 'readingTextFromVideo.log', level=logging.INFO)

	        print('Started reading text from videos!')

	        # Execute all the pass
	        r1 = x1.processStream(debugInd, var)

	        if (r1 == 0):
	            print('Successfully read text from the Live Stream!')
	        else:
	            print('Failed to read text from the Live Stream!')

	        var2 = datetime.datetime.now()

	        c = var2 - var1
	        minutes = c.total_seconds() / 60
	        print('Total difference in minutes: ', str(minutes))

	        # var2 holds the end timestamp; var1 is the start
	        print('End Time: ', str(var2))

	    except Exception as e:
	        x = str(e)
	        print('Error: ', x)

	if __name__ == "__main__":
	    main()

Please find the key snippet –

# Instantiating all the main class

x1 = rtfs.clsReadingTextFromStream()

# Execute all the pass
r1 = x1.processStream(debugInd, var)

if (r1 == 0):
    print('Successfully read text from the Live Stream!')
else:
    print('Failed to read text from the Live Stream!')

The above lines instantiate the main class & then invoke the function that extracts text from the live video stream, printing a success or failure message based on the return value.

FOLDER STRUCTURE:

Here is the folder structure that contains all the files & directories in macOS –

You will get the complete codebase in the following GitHub link.

Unfortunately, I cannot upload the model due to its size. I will share it on a need basis.

I’ll bring some more exciting topics in the coming days from the Python verse. Please share & subscribe to my post & let me know your feedback.

Till then, Happy Avenging! 🙂

Note: All the data & scenarios posted here are representational, available over the internet & for educational purposes only. Some of the images (except my photo) that we’ve used are available over the net. We don’t claim ownership of these images. There is always room for improvement, especially in the prediction quality.

Live visual reading using Convolutional Neural Network (CNN) through Python-based machine-learning application.

Posted on January 19, 2022 by SatyakiDe in api, cloud, code, combining, Computer-Vision, computing, Crossplatform, Data Science, exposure, features, function, gui, integration, IoT, json, Keras, machine-learning, matplotlib, member function, Model, numpy, objects, Open-CV, Pandas, Pickle, Python, Scikit-Learn, snippet, Technology, Tensorflow, video

This week we’re planning to touch on an exciting topic: visually reading characters from a WebCAM & predicting the letters using CNN methods. Before we dig deep, why don’t we see the demo run first?

Isn’t it fascinating? As we can see, the computer can record events and read like humans, thanks to the brilliant packages available in Python, which help us predict the correct letter out of an image.

What do we need to test it out?

Preferably an external WebCAM.
A moderate or good laptop to test this out.
Python
And a few other packages that we’ll mention in the next block.

What Python packages do we need?

Some of the critical packages that we need to test out this application are –

cmake==3.22.1
dlib==19.19.0
face-recognition==1.3.0
face-recognition-models==0.3.0
imutils==0.5.3
jsonschema==4.4.0
keras==2.7.0
Keras-Preprocessing==1.1.2
matplotlib==3.5.1
matplotlib-inline==0.1.3
oauthlib==3.1.1
opencv-contrib-python==4.1.2.30
opencv-contrib-python-headless==4.4.0.46
opencv-python==4.5.5.62
opencv-python-headless==4.5.5.62
pickleshare==0.7.5
Pillow==9.0.0
python-dateutil==2.8.2
requests==2.27.1
requests-oauthlib==1.3.0
scikit-image==0.19.1
scikit-learn==1.0.2
tensorboard==2.7.0
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
tensorflow==2.7.0
tensorflow-estimator==2.7.0
tensorflow-io-gcs-filesystem==0.23.1
tqdm==4.62.3

What is CNN?

In deep learning, a convolutional neural network (CNN/ConvNet) is a class of deep neural networks most commonly applied to analyze visual imagery.

We can understand from the above picture that a CNN generally takes an image as input. Rather than treating each pixel in isolation, the network slides small filters over neighbouring pixels to pick up local patterns such as edges & curves. During training, the weights and biases of the model are adjusted to detect the desired letters (in our use case) from the image. Like other algorithms, the data also has to pass through a pre-processing stage. However, a CNN needs relatively less pre-processing than most other deep-learning algorithms.
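To make the filter idea concrete, here is a minimal pure-Python sketch of the sliding-window operation a convolutional layer performs, written without any library so each step is visible. The function name `convolve2d`, the toy image, and the hand-picked kernel are illustrative assumptions; in a real CNN the kernel values are learned during training, and frameworks such as Keras compute this far more efficiently.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1

    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the kernel with the patch under it and add everything up.
            total = 0
            for a in range(k_h):
                for b in range(k_w):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        output.append(row)
    return output


if __name__ == "__main__":
    # A 4x4 image that is dark on the left and bright on the right.
    image = [[0, 0, 1, 1] for _ in range(4)]
    # A hand-written kernel that responds to a dark-to-bright vertical edge.
    kernel = [[-1, 1], [-1, 1]]
    for row in convolve2d(image, kernel):
        print(row)
```

The output is non-zero only where the window straddles the edge, which is the sense in which a filter "detects" a local pattern.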

If you want to know more about this, there is an excellent article on CNN with some on-point animations explaining this concept. Please read it here.

Where do we get the data sets for our testing?

For testing, we are fortunate enough to have Kaggle with us. We have received a wide variety of sample data, which you can get from here.

Our use-case:

From the above diagram, one can see that the python application will consume a live video feed of any random letters (both printed & handwritten) & predict the character as part of the machine learning model that we trained.

Code:

clsConfig.py (Configuration file for the entire application.)

	################################################
	#### Written By: SATYAKI DE ####
	#### Written On: 15-May-2020 ####
	#### Modified On: 28-Dec-2021 ####
	#### ####
	#### Objective: This script is a config ####
	#### file, contains all the keys for ####
	#### Machine-Learning & streaming dashboard.####
	#### ####
	################################################

	import os
	import platform as pl

	class clsConfig(object):
	    Curr_Path = os.path.dirname(os.path.realpath(__file__))

	    os_det = pl.system()
	    if os_det == "Windows":
	        sep = '\\'
	    else:
	        sep = '/'

	    conf = {
	'APP_ID': 1,
	'ARCH_DIR': Curr_Path + sep + 'arch' + sep,
	'PROFILE_PATH': Curr_Path + sep + 'profile' + sep,
	'LOG_PATH': Curr_Path + sep + 'log' + sep,
	'REPORT_PATH': Curr_Path + sep + 'report',
	'FILE_NAME': Curr_Path + sep + 'Data' + sep + 'A_Z_Handwritten_Data.csv',
	'SRC_PATH': Curr_Path + sep + 'data' + sep,
	'APP_DESC_1': 'Old Video Enhancement!',
	'DEBUG_IND': 'N',
	'INIT_PATH': Curr_Path,
	'SUBDIR': 'data',
	'SEP': sep,
	'testRatio':0.2,
	'valRatio':0.2,
	'epochsVal':8,
	'activationType':'relu',
	'activationType2':'softmax',
	'numOfClasses':26,
	'kernelSize':(3, 3),
	'poolSize':(2, 2),
	'filterVal1':32,
	'filterVal2':64,
	'filterVal3':128,
	'stridesVal':2,
	'monitorVal':'val_loss',
	'paddingVal1':'same',
	'paddingVal2':'valid',
	'reshapeVal':28,
	'reshapeVal1':(28,28),
	'patienceVal1':1,
	'patienceVal2':2,
	'sleepTime':3,
	'sleepTime1':6,
	'factorVal':0.2,
	'learningRateVal':0.001,
	'minDeltaVal':0,
	'minLrVal':0.0001,
	'verboseFlag':0,
	'modeInd':'auto',
	'shuffleVal':100,
	'DenkseVal1':26,
	'DenkseVal2':64,
	'DenkseVal3':128,
	'predParam':9,
	'word_dict':{0:'A',1:'B',2:'C',3:'D',4:'E',5:'F',6:'G',7:'H',8:'I',9:'J',10:'K',11:'L',12:'M',13:'N',14:'O',15:'P',16:'Q',17:'R',18:'S',19:'T',20:'U',21:'V',22:'W',23:'X', 24:'Y',25:'Z'},
	'width':640,
	'height':480,
	'imgSize': (32,32),
	'threshold': 0.45,
	'imgDimension': (400, 440),
	'imgSmallDim': (7, 7),
	'imgMidDim': (28, 28),
	'reshapeParam1':1,
	'reshapeParam2':28,
	'colorFeed':(0,0,130),
	'colorPredict':(0,25,255)
	}


The important parameters to note from the above snippet are –

'testRatio':0.2,
'valRatio':0.2,
'epochsVal':8,
'activationType':'relu',
'activationType2':'softmax',
'numOfClasses':26,
'kernelSize':(3, 3),
'poolSize':(2, 2),
'word_dict':{0:'A',1:'B',2:'C',3:'D',4:'E',5:'F',6:'G',7:'H',8:'I',9:'J',10:'K',11:'L',12:'M',13:'N',14:'O',15:'P',16:'Q',17:'R',18:'S',19:'T',20:'U',21:'V',22:'W',23:'X', 24:'Y',25:'Z'},

Since we have 26 letters, we have set numOfClasses to 26.

Since the model works with numbers rather than characters, we had to map each character to a numeric class index. The word_dict parameter stores that mapping in a python dictionary, & the application uses it to translate the model's final numeric output back into the corresponding character as the prediction.
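As a minimal sketch of that translation step, the snippet below rebuilds the same index-to-letter mapping & decodes a made-up probability vector. The helper name decode_prediction & the sample_output values are hypothetical; in the application, the vector comes from model.predict.

```python
import string

# Same index -> letter mapping as word_dict in clsConfig.py, built programmatically
word_dict = dict(enumerate(string.ascii_uppercase))


def decode_prediction(probabilities):
    """Return the letter whose class index carries the highest probability."""
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return word_dict[best_index]


# Hypothetical model output: 26 values with the peak at index 2
sample_output = [0.01] * 26
sample_output[2] = 0.75

print(decode_prediction(sample_output))
```

Building the dictionary from string.ascii_uppercase is only a convenience here; the hand-written dictionary in the config file is equivalent.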

2. clsAlphabetReading.py (Main training class to teach the model to predict alphabets from visual reader.)

	###############################################
	#### Written By: SATYAKI DE ####
	#### Written On: 17-Jan-2022 ####
	#### Modified On 17-Jan-2022 ####
	#### ####
	#### Objective: This python script will ####
	#### teach & perfect the model to read ####
	#### visual alphabets using Convolutional ####
	#### Neural Network (CNN). ####
	###############################################

	import matplotlib.pyplot as plt
	import numpy as np
	import pandas as p
	from keras.models import Sequential
	from keras.layers import Dense, Flatten, Conv2D, MaxPool2D
	from tensorflow.keras.optimizers import Adam
	from keras.callbacks import ReduceLROnPlateau, EarlyStopping
	from keras.utils.np_utils import to_categorical
	from sklearn.model_selection import train_test_split
	from sklearn.utils import shuffle

	import pickle

	import os
	import platform as pl

	from clsConfig import clsConfig as cf

	class clsAlphabetReading:
	    def __init__(self):
	        self.sep = str(cf.conf['SEP'])
	        self.Curr_Path = str(cf.conf['INIT_PATH'])
	        self.fileName = str(cf.conf['FILE_NAME'])
	        self.testRatio = float(cf.conf['testRatio'])
	        self.valRatio = float(cf.conf['valRatio'])
	        self.epochsVal = int(cf.conf['epochsVal'])
	        self.activationType = str(cf.conf['activationType'])
	        self.activationType2 = str(cf.conf['activationType2'])
	        self.numOfClasses = int(cf.conf['numOfClasses'])
	        self.kernelSize = cf.conf['kernelSize']
	        self.poolSize = cf.conf['poolSize']
	        self.filterVal1 = int(cf.conf['filterVal1'])
	        self.filterVal2 = int(cf.conf['filterVal2'])
	        self.filterVal3 = int(cf.conf['filterVal3'])
	        self.stridesVal = int(cf.conf['stridesVal'])
	        self.monitorVal = str(cf.conf['monitorVal'])
	        self.paddingVal1 = str(cf.conf['paddingVal1'])
	        self.paddingVal2 = str(cf.conf['paddingVal2'])
	        self.reshapeVal = int(cf.conf['reshapeVal'])
	        self.reshapeVal1 = cf.conf['reshapeVal1']
	        self.patienceVal1 = int(cf.conf['patienceVal1'])
	        self.patienceVal2 = int(cf.conf['patienceVal2'])
	        self.sleepTime = int(cf.conf['sleepTime'])
	        self.sleepTime1 = int(cf.conf['sleepTime1'])
	        self.factorVal = float(cf.conf['factorVal'])
	        self.learningRateVal = float(cf.conf['learningRateVal'])
	        self.minDeltaVal = int(cf.conf['minDeltaVal'])
	        self.minLrVal = float(cf.conf['minLrVal'])
	        self.verboseFlag = int(cf.conf['verboseFlag'])
	        self.modeInd = str(cf.conf['modeInd'])
	        self.shuffleVal = int(cf.conf['shuffleVal'])
	        self.DenkseVal1 = int(cf.conf['DenkseVal1'])
	        self.DenkseVal2 = int(cf.conf['DenkseVal2'])
	        self.DenkseVal3 = int(cf.conf['DenkseVal3'])
	        self.predParam = int(cf.conf['predParam'])
	        self.word_dict = cf.conf['word_dict']

	    def applyCNN(self, X_Train, Y_Train_Catg, X_Validation, Y_Validation_Catg):
	        try:
	            testRatio = self.testRatio
	            epochsVal = self.epochsVal
	            activationType = self.activationType
	            activationType2 = self.activationType2
	            numOfClasses = self.numOfClasses
	            kernelSize = self.kernelSize
	            poolSize = self.poolSize
	            filterVal1 = self.filterVal1
	            filterVal2 = self.filterVal2
	            filterVal3 = self.filterVal3
	            stridesVal = self.stridesVal
	            monitorVal = self.monitorVal
	            paddingVal1 = self.paddingVal1
	            paddingVal2 = self.paddingVal2
	            reshapeVal = self.reshapeVal
	            patienceVal1 = self.patienceVal1
	            patienceVal2 = self.patienceVal2
	            sleepTime = self.sleepTime
	            sleepTime1 = self.sleepTime1
	            factorVal = self.factorVal
	            learningRateVal = self.learningRateVal
	            minDeltaVal = self.minDeltaVal
	            minLrVal = self.minLrVal
	            verboseFlag = self.verboseFlag
	            modeInd = self.modeInd
	            shuffleVal = self.shuffleVal
	            DenkseVal1 = self.DenkseVal1
	            DenkseVal2 = self.DenkseVal2
	            DenkseVal3 = self.DenkseVal3

	            model = Sequential()

	            model.add(Conv2D(filters=filterVal1, kernel_size=kernelSize, activation=activationType, input_shape=(28,28,1)))
	            model.add(MaxPool2D(pool_size=poolSize, strides=stridesVal))

	            model.add(Conv2D(filters=filterVal2, kernel_size=kernelSize, activation=activationType, padding = paddingVal1))
	            model.add(MaxPool2D(pool_size=poolSize, strides=stridesVal))

	            model.add(Conv2D(filters=filterVal3, kernel_size=kernelSize, activation=activationType, padding = paddingVal2))
	            model.add(MaxPool2D(pool_size=poolSize, strides=stridesVal))

	            model.add(Flatten())

	            model.add(Dense(DenkseVal2,activation = activationType))
	            model.add(Dense(DenkseVal3,activation = activationType))

	            model.add(Dense(DenkseVal1,activation = activationType2))

	            model.compile(optimizer = Adam(learning_rate=learningRateVal), loss='categorical_crossentropy', metrics=['accuracy'])
	            reduce_lr = ReduceLROnPlateau(monitor=monitorVal, factor=factorVal, patience=patienceVal1, min_lr=minLrVal)
	            early_stop = EarlyStopping(monitor=monitorVal, min_delta=minDeltaVal, patience=patienceVal2, verbose=verboseFlag, mode=modeInd)

	            fittedModel = model.fit(X_Train, Y_Train_Catg, epochs=epochsVal, callbacks=[reduce_lr, early_stop], validation_data = (X_Validation,Y_Validation_Catg))

	            return (model, fittedModel)

	        except Exception as e:
	            x = str(e)
	            model = Sequential()
	            print('Error: ', x)

	            return (model, model)

	    def trainModel(self, debugInd, var):
	        try:
	            sep = self.sep
	            Curr_Path = self.Curr_Path
	            fileName = self.fileName
	            epochsVal = self.epochsVal
	            valRatio = self.valRatio
	            predParam = self.predParam
	            testRatio = self.testRatio
	            reshapeVal = self.reshapeVal
	            numOfClasses = self.numOfClasses
	            sleepTime = self.sleepTime
	            sleepTime1 = self.sleepTime1
	            shuffleVal = self.shuffleVal
	            reshapeVal1 = self.reshapeVal1

	            # Dictionary for getting characters from index values
	            word_dict = self.word_dict

	            print('File Name: ', str(fileName))

	            # Read the data
	            df_HW_Alphabet = p.read_csv(fileName).astype('float32')

	            # Sample Data
	            print('Sample Data: ')
	            print(df_HW_Alphabet.head())

	            # Split the data into (x - our data) & (y - the predicted label)
	            x = df_HW_Alphabet.drop('0',axis = 1)
	            y = df_HW_Alphabet['0']

	            # Splitting into train, test & validation sets
	            X_Train, X_Test, Y_Train, Y_Test = train_test_split(x, y, test_size = testRatio)
	            X_Train, X_Validation, Y_Train, Y_Validation = train_test_split(X_Train, Y_Train, test_size = valRatio)

	            # Reshaping the data in csv file to display as an image
	            X_Train = np.reshape(X_Train.values, (X_Train.shape[0], reshapeVal, reshapeVal))
	            X_Test = np.reshape(X_Test.values, (X_Test.shape[0], reshapeVal, reshapeVal))
	            X_Validation = np.reshape(X_Validation.values, (X_Validation.shape[0], reshapeVal, reshapeVal))

	            print("Train Data Shape: ", X_Train.shape)
	            print("Test Data Shape: ", X_Test.shape)
	            print("Validation Data shape: ", X_Validation.shape)

	            # Plotting the number of alphabets in the dataset
	            Y_Train_Num = np.int0(y)
	            count = np.zeros(numOfClasses, dtype='int')
	            for i in Y_Train_Num:
	                count[i] +=1

	            alphabets = []
	            for i in word_dict.values():
	                alphabets.append(i)

	            fig, ax = plt.subplots(1,1, figsize=(7,7))
	            ax.barh(alphabets, count)

	            plt.xlabel("Number of elements ")
	            plt.ylabel("Alphabets")
	            plt.grid()
	            plt.show(block=False)
	            plt.pause(sleepTime)
	            plt.close()

	            # Shuffling the data
	            shuff = shuffle(X_Train[:shuffleVal])

	            # Model reshaping the training & test dataset
	            X_Train = X_Train.reshape(X_Train.shape[0],X_Train.shape[1],X_Train.shape[2],1)
	            print("Shape of Train Data: ", X_Train.shape)

	            X_Test = X_Test.reshape(X_Test.shape[0], X_Test.shape[1], X_Test.shape[2],1)
	            print("Shape of Test Data: ", X_Test.shape)

	            X_Validation = X_Validation.reshape(X_Validation.shape[0], X_Validation.shape[1], X_Validation.shape[2],1)
	            print("Shape of Validation data: ", X_Validation.shape)

	            # Converting the labels to categorical values
	            Y_Train_Catg = to_categorical(Y_Train, num_classes = numOfClasses, dtype='int')
	            print("Shape of Train Labels: ", Y_Train_Catg.shape)

	            Y_Test_Catg = to_categorical(Y_Test, num_classes = numOfClasses, dtype='int')
	            print("Shape of Test Labels: ", Y_Test_Catg.shape)

	            Y_Validation_Catg = to_categorical(Y_Validation, num_classes = numOfClasses, dtype='int')
	            print("Shape of validation labels: ", Y_Validation_Catg.shape)

	            model, history = self.applyCNN(X_Train, Y_Train_Catg, X_Validation, Y_Validation_Catg)

	            print('Model Summary: ')
	            print(model.summary())

	            # Displaying the accuracies & losses for train & validation set
	            print("Validation Accuracy :", history.history['val_accuracy'])
	            print("Training Accuracy :", history.history['accuracy'])
	            print("Validation Loss :", history.history['val_loss'])
	            print("Training Loss :", history.history['loss'])

	            # Displaying the Loss Graph
	            plt.figure(1)
	            plt.plot(history.history['loss'])
	            plt.plot(history.history['val_loss'])
	            plt.legend(['training','validation'])
	            plt.title('Loss')
	            plt.xlabel('epoch')
	            plt.show(block=False)
	            plt.pause(sleepTime1)
	            plt.close()

	            # Displaying the Accuracy Graph
	            plt.figure(2)
	            plt.plot(history.history['accuracy'])
	            plt.plot(history.history['val_accuracy'])
	            plt.legend(['training','validation'])
	            plt.title('Accuracy')
	            plt.xlabel('epoch')
	            plt.show(block=False)
	            plt.pause(sleepTime1)
	            plt.close()

	            # Making the model to predict
	            pred = model.predict(X_Test[:predParam])

	            print('Test Details::')
	            print('X_Test: ', X_Test.shape)
	            print('Y_Test_Catg: ', Y_Test_Catg.shape)

	            try:
	                score = model.evaluate(X_Test, Y_Test_Catg, verbose=0)
	                print('Test Score = ', score[0])
	                print('Test Accuracy = ', score[1])
	            except Exception as e:
	                x = str(e)
	                print('Error: ', x)

	            # Displaying some of the test images & their predicted labels
	            fig, ax = plt.subplots(3,3, figsize=(8,9))
	            axes = ax.flatten()

	            for i in range(9):
	                axes[i].imshow(np.reshape(X_Test[i], reshapeVal1), cmap="Greys")
	                # Use the model output (pred), not the true label, for the title
	                predLabel = word_dict[np.argmax(pred[i])]
	                print('Prediction: ', predLabel)
	                axes[i].set_title("Test Prediction: " + predLabel)
	                axes[i].grid()
	            plt.show(block=False)
	            plt.pause(sleepTime1)
	            plt.close()

	            fileName = Curr_Path + sep + 'Model' + sep + 'model_trained_' + str(epochsVal) + '.p'
	            print('Model Name: ', str(fileName))

	            pickle_out = open(fileName, 'wb')
	            pickle.dump(model, pickle_out)
	            pickle_out.close()

	            return 0
	        except Exception as e:
	            x = str(e)
	            print('Error: ', x)

	            return 1


Some of the key snippets from the above scripts are –

x = df_HW_Alphabet.drop('0',axis = 1)
y = df_HW_Alphabet['0']

In the above snippet, we have split the data into images & their corresponding labels.

X_Train, X_Test, Y_Train, Y_Test = train_test_split(x, y, test_size = testRatio)
X_Train, X_Validation, Y_Train, Y_Validation = train_test_split(X_Train, Y_Train, test_size = valRatio)

X_Train = np.reshape(X_Train.values, (X_Train.shape[0], reshapeVal, reshapeVal))
X_Test = np.reshape(X_Test.values, (X_Test.shape[0], reshapeVal, reshapeVal))
X_Validation = np.reshape(X_Validation.values, (X_Validation.shape[0], reshapeVal, reshapeVal))


print("Train Data Shape: ", X_Train.shape)
print("Test Data Shape: ", X_Test.shape)
print("Validation Data shape: ", X_Validation.shape)

We split the data into Train, Test & Validation sets so that the model is tuned against the validation data & finally judged on test data it has never seen. We then reshape the raw data, turning each row of 784 data columns into a 28×28 pixel image.

The following snippet counts how many samples each letter has in the data set & plots that distribution as a horizontal bar chart in matplotlib.
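To see what the two-stage split & the reshape amount to, here is a minimal pure-Python sketch. The helper names & the row count of 100,000 are hypothetical, & the rounding used by scikit-learn may differ by a row from what is shown here.

```python
def split_sizes(total_rows, test_ratio=0.2, val_ratio=0.2):
    """Mirror the two-stage split: carve out test first, then validation from the rest."""
    test_rows = round(total_rows * test_ratio)
    remaining = total_rows - test_rows
    val_rows = round(remaining * val_ratio)
    train_rows = remaining - val_rows
    return {'train': train_rows, 'validation': val_rows, 'test': test_rows}


def to_image(flat_pixels, side=28):
    """Cut a flat row of side*side pixel values into `side` rows of `side` pixels."""
    return [flat_pixels[r * side:(r + 1) * side] for r in range(side)]


# Hypothetical data set size
sizes = split_sizes(100000)
print(sizes)

# Hypothetical flat row: pixel values 0..783 stand in for one CSV record
image = to_image(list(range(784)))
print(len(image), len(image[0]))
```

With both ratios at 0.2, the validation set is 20% of the remaining 80%, so roughly 64% of the rows end up in training, 16% in validation & 20% in test.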

Y_Train_Num = np.int0(y)
count = np.zeros(numOfClasses, dtype='int')
for i in Y_Train_Num:
    count[i] +=1

alphabets = []
for i in word_dict.values():
    alphabets.append(i)

fig, ax = plt.subplots(1,1, figsize=(7,7))
ax.barh(alphabets, count)

plt.xlabel("Number of elements ")
plt.ylabel("Alphabets")
plt.grid()
plt.show(block=False)
plt.pause(sleepTime)
plt.close()

Note that we call plt.show with block=False. Together with plt.pause(sleepTime) & plt.close(), this displays the chart for a few seconds & then lets execution continue without human intervention.

# Model reshaping the training & test dataset
X_Train = X_Train.reshape(X_Train.shape[0],X_Train.shape[1],X_Train.shape[2],1)
print("Shape of Train Data: ", X_Train.shape)

X_Test = X_Test.reshape(X_Test.shape[0], X_Test.shape[1], X_Test.shape[2],1)
print("Shape of Test Data: ", X_Test.shape)

X_Validation = X_Validation.reshape(X_Validation.shape[0], X_Validation.shape[1], X_Validation.shape[2],1)
print("Shape of Validation data: ", X_Validation.shape)

# Converting the labels to categorical values
Y_Train_Catg = to_categorical(Y_Train, num_classes = numOfClasses, dtype='int')
print("Shape of Train Labels: ", Y_Train_Catg.shape)

Y_Test_Catg = to_categorical(Y_Test, num_classes = numOfClasses, dtype='int')
print("Shape of Test Labels: ", Y_Test_Catg.shape)

Y_Validation_Catg = to_categorical(Y_Validation, num_classes = numOfClasses, dtype='int')
print("Shape of validation labels: ", Y_Validation_Catg.shape)

In the above snippet, the application reshapes all three sets of images to add a channel dimension & one-hot encodes their labels before calling the primary CNN function.
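The two transformations can be sketched in pure Python as follows. These helpers are hypothetical stand-ins that only imitate what reshape(..., 1) & to_categorical produce; the application itself relies on NumPy & Keras for this.

```python
def add_channel(image):
    """Wrap every pixel in its own list: (rows, cols) -> (rows, cols, 1)."""
    return [[[pixel] for pixel in row] for row in image]


def one_hot(labels, num_classes):
    """Turn each numeric label into a vector with a single 1 at that index."""
    encoded = []
    for label in labels:
        row = [0] * num_classes
        row[int(label)] = 1
        encoded.append(row)
    return encoded


# Hypothetical 2x2 grayscale image
tiny_image = [[0, 255], [128, 64]]
with_channel = add_channel(tiny_image)
print(with_channel)

# Hypothetical float labels, as they look after reading the CSV as float32
labels = [0.0, 2.0, 25.0]
encoded = one_hot(labels, 26)
print(encoded[1].index(1))
```

The extra channel dimension tells Conv2D that the images are single-channel (grayscale), & the one-hot vectors match the 26-way softmax output that categorical_crossentropy expects.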

model = Sequential()

model.add(Conv2D(filters=filterVal1, kernel_size=kernelSize, activation=activationType, input_shape=(28,28,1)))
model.add(MaxPool2D(pool_size=poolSize, strides=stridesVal))

model.add(Conv2D(filters=filterVal2, kernel_size=kernelSize, activation=activationType, padding = paddingVal1))
model.add(MaxPool2D(pool_size=poolSize, strides=stridesVal))

model.add(Conv2D(filters=filterVal3, kernel_size=kernelSize, activation=activationType, padding = paddingVal2))
model.add(MaxPool2D(pool_size=poolSize, strides=stridesVal))

model.add(Flatten())

model.add(Dense(DenkseVal2,activation = activationType))
model.add(Dense(DenkseVal3,activation = activationType))

model.add(Dense(DenkseVal1,activation = activationType2))

model.compile(optimizer = Adam(learning_rate=learningRateVal), loss='categorical_crossentropy', metrics=['accuracy'])
reduce_lr = ReduceLROnPlateau(monitor=monitorVal, factor=factorVal, patience=patienceVal1, min_lr=minLrVal)
early_stop = EarlyStopping(monitor=monitorVal, min_delta=minDeltaVal, patience=patienceVal2, verbose=verboseFlag, mode=modeInd)


fittedModel = model.fit(X_Train, Y_Train_Catg, epochs=epochsVal, callbacks=[reduce_lr, early_stop],  validation_data = (X_Validation,Y_Validation_Catg))

return (model, fittedModel)

In the above snippet, each convolution layer is followed by a maxpool layer, which reduces the size of the extracted feature maps. The output of the last maxpool layer is flattened into a one-dimensional vector & supplied as input to the Dense layers. The model is then compiled & trained on the training dataset.

We have used the Adam optimizer, together with the ReduceLROnPlateau & EarlyStopping callbacks, & trained the application for up to eight epochs for better accuracy & predictions.
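To follow how the image shrinks on its way through the network, the sketch below walks the layer stack with plain arithmetic. It assumes the values from clsConfig.py (3×3 kernels, 2×2 pooling with stride 2, filters of 32, 64 & 128) & that the first convolution uses the Keras default padding of 'valid'; the helper names are our own.

```python
def conv_out(size, kernel, padding):
    """'same' keeps the size (stride 1); 'valid' trims kernel - 1 pixels."""
    return size if padding == 'same' else size - kernel + 1


def pool_out(size, pool, stride):
    """Output size of pooling without padding."""
    return (size - pool) // stride + 1


# (padding, filters) for the three Conv2D layers, in order
layers = [('valid', 32), ('same', 64), ('valid', 128)]

size = 28  # input images are 28x28x1
trace = []
for padding, filters in layers:
    after_conv = conv_out(size, 3, padding)
    size = pool_out(after_conv, 2, 2)
    trace.append((filters, after_conv, size))

# Flatten() turns the last (size, size, filters) block into one vector
flattened = size * size * layers[-1][1]

for filters, after_conv, after_pool in trace:
    print(filters, 'filters:', after_conv, '->', after_pool)
print('Flattened length:', flattened)
```

Under these assumptions, the 28×28 input ends up as a 2×2×128 block, so the first Dense layer receives a vector of 512 values. You can compare this against the output of model.summary().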
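The effect of the two callbacks configured in applyCNN can be illustrated with a simplified simulation. This is only a sketch of the idea, not how Keras implements ReduceLROnPlateau or EarlyStopping (for instance, it ignores min_delta & cooldown & shares one "best" value between both); the validation losses are made-up numbers, while the other values come from clsConfig.py.

```python
def simulate_callbacks(val_losses, lr=0.001, factor=0.2, lr_patience=1,
                       min_lr=0.0001, stop_patience=2):
    """Return (epochs_run, learning rate used in each epoch) for a loss sequence."""
    best = float('inf')
    lr_wait = 0
    stop_wait = 0
    lr_history = []
    for epoch, loss in enumerate(val_losses, start=1):
        lr_history.append(lr)
        if loss < best:
            # Improvement: remember it and reset both counters
            best = loss
            lr_wait = 0
            stop_wait = 0
        else:
            lr_wait += 1
            stop_wait += 1
            if lr_wait >= lr_patience:
                # Plateau: shrink the learning rate, but never below min_lr
                lr = max(lr * factor, min_lr)
                lr_wait = 0
            if stop_wait >= stop_patience:
                # No improvement for stop_patience epochs in a row: stop early
                return epoch, lr_history
    return len(val_losses), lr_history


# Hypothetical validation losses for the eight configured epochs
val_losses = [0.30, 0.20, 0.21, 0.19, 0.20, 0.21, 0.18, 0.17]
epochs_run, lr_history = simulate_callbacks(val_losses)

print('Epochs run:', epochs_run)
print('Learning rates:', lr_history)
```

In this made-up run, the learning rate drops after each epoch without improvement & training stops after two such epochs in a row, which is why eight epochs is an upper bound rather than a guarantee.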

# Displaying the accuracies & losses for train & validation set
print("Validation Accuracy :", history.history['val_accuracy'])
print("Training Accuracy :", history.history['accuracy'])
print("Validation Loss :", history.history['val_loss'])
print("Training Loss :", history.history['loss'])

# Displaying the Loss Graph
plt.figure(1)
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.legend(['training','validation'])
plt.title('Loss')
plt.xlabel('epoch')
plt.show(block=False)
plt.pause(sleepTime1)
plt.close()

# Dsiplaying the Accuracy Graph
plt.figure(2)
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.legend(['training','validation'])
plt.title('Accuracy')
plt.xlabel('epoch')
plt.show(block=False)
plt.pause(sleepTime1)
plt.close()

Also, we have captured the training & validation Accuracy & Loss & plotted them in two separate graphs for better understanding.

try:
    score = model.evaluate(X_Test, Y_Test_Catg, verbose=0)
    print('Test Score = ', score[0])
    print('Test Accuracy = ', score[1])
except Exception as e:
    x = str(e)
    print('Error: ', x)

Also, the application evaluates the model that we trained & validated with the training & validation data. This time, we have used the held-out test data to compute the loss (reported as the test score) & the accuracy.

# Displaying some of the test images & their predicted labels
fig, ax = plt.subplots(3,3, figsize=(8,9))
axes = ax.flatten()

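for i in range(9):
    axes[i].imshow(np.reshape(X_Test[i], reshapeVal1), cmap="Greys")
    # pred was computed earlier as model.predict(X_Test[:predParam]);
    # use it, not the true label, for the title
    predLabel = word_dict[np.argmax(pred[i])]
    print('Prediction: ', predLabel)
    axes[i].set_title("Test Prediction: " + predLabel)
    axes[i].grid()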
plt.show(block=False)
plt.pause(sleepTime1)
plt.close()

Finally, the application plots nine of the test images in a 3×3 grid, each titled with its label, as a quick visual check of the output.

fileName = Curr_Path + sep + 'Model' + sep + 'model_trained_' + str(epochsVal) + '.p'
print('Model Name: ', str(fileName))

pickle_out = open(fileName, 'wb')
pickle.dump(model, pickle_out)
pickle_out.close()

As the last step, the application serializes the trained model using the pickle package & saves it under a specific location, from which the reader application will load it.
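A minimal sketch of the same save-&-load round trip, using a plain dictionary as a stand-in for the trained model & a temporary folder instead of the Model directory (both are our assumptions). Whether a Keras model object pickles cleanly can depend on the installed version, so it is worth testing this on your own setup; Keras also provides model.save as an alternative.

```python
import os
import pickle
import tempfile


def save_object(obj, file_name):
    """Serialize obj to file_name; the with-block closes the file even on errors."""
    with open(file_name, 'wb') as pickle_out:
        pickle.dump(obj, pickle_out)


def load_object(file_name):
    """Read back whatever save_object wrote."""
    with open(file_name, 'rb') as pickle_in:
        return pickle.load(pickle_in)


# Hypothetical stand-in for the trained model
stand_in_model = {'epochs': 8, 'classes': 26}

epochsVal = 8
model_dir = tempfile.mkdtemp()  # stands in for Curr_Path + sep + 'Model'
file_name = os.path.join(model_dir, 'model_trained_' + str(epochsVal) + '.p')

save_object(stand_in_model, file_name)
restored = load_object(file_name)
print(restored == stand_in_model)
```

Only load pickle files that you created yourself, since unpickling can execute arbitrary code.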

3. trainingVisualDataRead.py (Main application that will invoke the training class to predict alphabet through WebCam using Convolutional Neural Network (CNN).)

	###############################################
	#### Written By: SATYAKI DE ####
	#### Written On: 17-Jan-2022 ####
	#### Modified On 17-Jan-2022 ####
	#### ####
	#### Objective: This is the main calling ####
	#### python script that will invoke the ####
	#### clsAlphabetReading class to initiate ####
	#### teach & perfect the model to read ####
	#### visual alphabets using Convolutional ####
	#### Neural Network (CNN). ####
	###############################################

	# We keep the setup code in a different class as shown below.
	import clsAlphabetReading as ar
	from clsConfig import clsConfig as cf

	import datetime
	import logging

	###############################################
	### Global Section ###
	###############################################
	# Instantiating the training class

	x1 = ar.clsAlphabetReading()

	###############################################
	### End of Global Section ###
	###############################################

	def main():
	    try:
	        # Other useful variables
	        debugInd = 'Y'
	        var = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
	        var1 = datetime.datetime.now()

	        print('Start Time: ', str(var))
	        # End of useful variables

	        # Initiating Log Class
	        general_log_path = str(cf.conf['LOG_PATH'])

	        # Enabling Logging Info
	        logging.basicConfig(filename=general_log_path + 'restoreVideo.log', level=logging.INFO)

	        print('Started Transformation!')

	        # Execute all the pass
	        r1 = x1.trainModel(debugInd, var)

	        if (r1 == 0):
	            print('Successfully Visual Alphabet Training Completed!')
	        else:
	            print('Failed to complete the Visual Alphabet Training!')

	        var2 = datetime.datetime.now()

	        c = var2 - var1
	        minutes = c.total_seconds() / 60
	        print('Total difference in minutes: ', str(minutes))

	        print('End Time: ', str(var2))

	    except Exception as e:
	        x = str(e)
	        print('Error: ', x)

	if __name__ == "__main__":
	    main()


And the core snippet from the above script is –

x1 = ar.clsAlphabetReading()

Instantiate the main class.

r1 = x1.trainModel(debugInd, var)

The python application will invoke the class & capture the returned value inside the r1 variable.

4. readingVisualData.py (Reading the model to predict Alphabet using WebCAM.)

	###############################################
	#### Written By: SATYAKI DE ####
	#### Written On: 18-Jan-2022 ####
	#### Modified On 18-Jan-2022 ####
	#### ####
	#### Objective: This python script will ####
	#### scan the live video feed from the ####
	#### web-cam & predict the alphabet that ####
	#### read it. ####
	###############################################

	# We keep the setup code in a different class as shown below.
	from clsConfig import clsConfig as cf

	import datetime
	import logging
	import cv2
	import pickle
	import numpy as np
	###############################################
	### Global Section ###
	###############################################

	sep = str(cf.conf['SEP'])
	Curr_Path = str(cf.conf['INIT_PATH'])
	fileName = str(cf.conf['FILE_NAME'])
	epochsVal = int(cf.conf['epochsVal'])
	numOfClasses = int(cf.conf['numOfClasses'])
	word_dict = cf.conf['word_dict']
	width = int(cf.conf['width'])
	height = int(cf.conf['height'])
	imgSize = cf.conf['imgSize']
	threshold = float(cf.conf['threshold'])
	imgDimension = cf.conf['imgDimension']
	imgSmallDim = cf.conf['imgSmallDim']
	imgMidDim = cf.conf['imgMidDim']
	reshapeParam1 = int(cf.conf['reshapeParam1'])
	reshapeParam2 = int(cf.conf['reshapeParam2'])
	colorFeed = cf.conf['colorFeed']
	colorPredict = cf.conf['colorPredict']
	###############################################
	### End of Global Section ###
	###############################################

	def main():
	    try:
	        # Other useful variables
	        debugInd = 'Y'
	        var = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
	        var1 = datetime.datetime.now()

	        # Status flag: stays 1 (failure) unless the user quits cleanly
	        r1 = 1

	        print('Start Time: ', str(var))
	        # End of useful variables

	        # Initiating Log Class
	        general_log_path = str(cf.conf['LOG_PATH'])

	        # Enabling Logging Info
	        logging.basicConfig(filename=general_log_path + 'restoreVideo.log', level=logging.INFO)

	        print('Started Live Streaming!')

	        cap = cv2.VideoCapture(0)
	        cap.set(3, width)
	        cap.set(4, height)

	        fileName = Curr_Path + sep + 'Model' + sep + 'model_trained_' + str(epochsVal) + '.p'
	        print('Model Name: ', str(fileName))

	        pickle_in = open(fileName, 'rb')
	        model = pickle.load(pickle_in)
	        pickle_in.close()

	        while True:
	            status, img = cap.read()

	            if status == False:
	                break

	            img_copy = img.copy()

	            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
	            img = cv2.resize(img, imgDimension)

	            img_copy = cv2.GaussianBlur(img_copy, imgSmallDim, 0)
	            img_gray = cv2.cvtColor(img_copy, cv2.COLOR_BGR2GRAY)
	            bin, img_thresh = cv2.threshold(img_gray, 100, 255, cv2.THRESH_BINARY_INV)

	            img_final = cv2.resize(img_thresh, imgMidDim)
	            img_final = np.reshape(img_final, (reshapeParam1,reshapeParam2,reshapeParam2,reshapeParam1))

	            img_pred = word_dict[np.argmax(model.predict(img_final))]

	            # Extracting Probability Values
	            Predict_X = model.predict(img_final)
	            probVal = round(np.amax(Predict_X) * 100)

	            cv2.putText(img, "Live Feed : (" + str(probVal) + "%) ", (20,25), cv2.FONT_HERSHEY_TRIPLEX, 0.7, color = colorFeed)
	            cv2.putText(img, "Prediction: " + img_pred, (20,410), cv2.FONT_HERSHEY_DUPLEX, 1.3, color = colorPredict)

	            cv2.imshow("Original Image", img)

	            if cv2.waitKey(1) & 0xFF == ord('q'):
	                r1=0
	                break

	        # Releasing the camera & closing the preview window
	        cap.release()
	        cv2.destroyAllWindows()

	        if (r1 == 0):
	            print('Successfully Alphabets predicted!')
	        else:
	            print('Failed to predict alphabet!')

	        var2 = datetime.datetime.now()

	        c = var2 - var1
	        minutes = c.total_seconds() / 60
	        print('Total Run Time in minutes: ', str(minutes))

	        print('End Time: ', str(var2))

	    except Exception as e:
	        x = str(e)
	        print('Error: ', x)

	if __name__ == "__main__":
	    main()


And the key snippet from the above code is –

cap = cv2.VideoCapture(0)
cap.set(3, width)
cap.set(4, height)

The application reads the live video data from the WebCAM & sets the width & height of the captured frames (property ids 3 & 4 correspond to cv2.CAP_PROP_FRAME_WIDTH & cv2.CAP_PROP_FRAME_HEIGHT).

fileName = Curr_Path + sep + 'Model' + sep + 'model_trained_' + str(epochsVal) + '.p'
print('Model Name: ', str(fileName))

pickle_in = open(fileName, 'rb')
model = pickle.load(pickle_in)

The application loads the trained model that the previous script saved using the pickle package.

while True:
    status, img = cap.read()

    if status == False:
        break

The application reads frames from the WebCAM & exits the loop when the video transmission ends or a frame cannot be read.

img_copy = img.copy()

img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img = cv2.resize(img, imgDimension)

img_copy = cv2.GaussianBlur(img_copy, imgSmallDim, 0)
img_gray = cv2.cvtColor(img_copy, cv2.COLOR_BGR2GRAY)
bin, img_thresh = cv2.threshold(img_gray, 100, 255, cv2.THRESH_BINARY_INV)

img_final = cv2.resize(img_thresh, imgMidDim)
img_final = np.reshape(img_final, (reshapeParam1,reshapeParam2,reshapeParam2,reshapeParam1))


img_pred = word_dict[np.argmax(model.predict(img_final))]

We first clone the original video frame. The copy is blurred, converted from BGR to grayscale & thresholded for better prediction outcomes. The image is then resized & reshaped to match the model input. Finally, the np.argmax function extracts the class index with the highest predicted probability, which is translated into a letter using the word_dict dictionary & displayed on top of the Live View.
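To show what the inverse binary threshold does to a frame, here is a minimal pure-Python sketch of the rule behind cv2.THRESH_BINARY_INV. The helper name & the tiny 2×3 "frame" are hypothetical; the threshold of 100 & the maximum of 255 are the values used in the script.

```python
def threshold_binary_inv(gray_rows, thresh=100, max_val=255):
    """Pixels brighter than thresh become 0; all others become max_val."""
    return [[0 if pixel > thresh else max_val for pixel in row]
            for row in gray_rows]


# Hypothetical grayscale patch: bright paper (about 240+) with dark ink (about 20-30)
patch = [
    [250, 240, 30],
    [245, 20, 25],
]

inverted = threshold_binary_inv(patch)
print(inverted)
```

Dark ink on light paper becomes bright strokes on a black background, which presumably matches how the characters appear in the training data.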

# Extracting Probability Values
Predict_X = model.predict(img_final)
probVal = round(np.amax(Predict_X) * 100)

Also, the application derives the confidence score (the highest predicted probability, expressed as a percentage) & displays it on top of the Live View.
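The config file defines a threshold of 0.45, & the reader script loads it, but the loop shown above does not apply it. As one possible use, the hypothetical helper below only reports a letter when the confidence reaches the threshold; this is our own sketch, not part of the original application.

```python
def format_prediction(probabilities, labels, threshold=0.45):
    """Return (letter or '?', confidence in percent) for one prediction vector."""
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    confidence = probabilities[best_index]
    percent = round(confidence * 100)
    if confidence >= threshold:
        return labels[best_index], percent
    # Below the threshold: show a placeholder instead of a shaky guess
    return '?', percent


# Hypothetical three-class outputs for illustration
labels = {0: 'A', 1: 'B', 2: 'C'}
confident = format_prediction([0.1, 0.7, 0.2], labels)
unsure = format_prediction([0.4, 0.35, 0.25], labels)

print(confident)
print(unsure)
```

Suppressing low-confidence letters can make the live overlay less jumpy when no character is in front of the camera.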

if cv2.waitKey(1) & 0xFF == ord('q'):
    r1=0
    break

The above code lets the developer exit the application by pressing the “q” key on the keyboard, after which the program terminates.
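The loop compares the key code only against ord('q'). If you also want the Esc key to end the session, a minimal sketch could look like the following; the helper name should_quit is hypothetical, & 27 is the ASCII code of Esc.

```python
ESC_KEY = 27  # ASCII code of the Esc key


def should_quit(raw_key_code):
    """Decide whether a value returned by cv2.waitKey means 'stop'."""
    # Keep only the lowest byte, as the original check does with & 0xFF
    key = raw_key_code & 0xFF
    return key == ord('q') or key == ESC_KEY


print(should_quit(ord('q')))  # 'q' pressed
print(should_quit(27))        # Esc pressed
print(should_quit(-1))        # no key pressed within the wait time
```

Inside the loop, the check would then read: if should_quit(cv2.waitKey(1)): followed by the same r1=0 & break.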

So, we’ve done it.

You will get the complete codebase in the following Github link.

I’ll bring some more exciting topics in the coming days from the Python verse. Please share & subscribe to my post & let me know your feedback.

Till then, Happy Avenging! 😀

Performance improvement of Python application programming

Posted on January 18, 2021June 2, 2021 by SatyakiDe in Azure, cloud, code, computing, Data Science, design, features, gui, integration, Keras, numpy, Pandas, Python, snippet, table, Technology, vector

Hello guys,

Today, I’ll be demonstrating a short but significant topic. It is widely reported that, on many occasions, Python is relatively slower than statically typed, compiled languages like C++ or Java, or even the latest version of PHP.

I found a relatively old post with a comparison shown between Python and the other popular languages. You can find the details at this link.

However, I haven’t verified the outcome. So, I can’t comment on the final statistics provided on that link.

My purpose is to find cases where I can apply certain tricks to improve performance drastically.

One preferable option would be the use of Cython, which occupies the middle ground between C & Python & brings out the best of both worlds.

The other option would be the use of a GPU for vector computations, which can drastically increase throughput for workloads that parallelize well. Today, we’ll be exploring this option.

Let’s find out what we need to prepare in our environment before we try this out.

Step – 1 (Installing dependent packages):

pip install pyopencl
pip install plaidml-keras

So, we will be taking advantage of the Keras package to use our GPU. And, the screen should look like this –

**Installation Process of Python-based Packages**

Once we’ve installed the packages, we’ll configure the package as shown on the next screen.

For our case, we need to install pandas, since we’ll be using numpy, which gets installed along with it as a dependency.

**Installation of supplemental packages**

Let’s explore our standard snippet to test this use case.

Case 1 (Normal computational code in Python):

##############################################
#### Written By: SATYAKI DE               ####
#### Written On: 18-Jan-2020              ####
####                                      ####
#### Objective: Main calling scripts for  ####
#### normal execution.                    ####
##############################################

import numpy as np
from timeit import default_timer as timer

def pow(a, b, c):
    for i in range(a.size):
         c[i] = a[i] ** b[i]

def main():
    vec_size = 100000000

    a = b = np.array(np.random.sample(vec_size), dtype=np.float32)
    c = np.zeros(vec_size, dtype=np.float32)

    start = timer()
    pow(a, b, c)
    duration = timer() - start

    print(duration)

if __name__ == '__main__':
    main()

Case 2 (GPU-based computational code in Python):

#################################################
#### Written By: SATYAKI DE                  ####
#### Written On: 18-Jan-2020                 ####
####                                         ####
#### Objective: Main calling scripts for     ####
#### use of GPU to speed-up the performance. ####
#################################################

import numpy as np
from timeit import default_timer as timer

# Adding GPU Instance
from os import environ
environ["KERAS_BACKEND"] = "plaidml.keras.backend"

def pow(a, b):
    return a ** b

def main():
    vec_size = 100000000

    a = b = np.array(np.random.sample(vec_size), dtype=np.float32)
    c = np.zeros(vec_size, dtype=np.float32)

    start = timer()
    c = pow(a, b)
    duration = timer() - start

    print(duration)

if __name__ == '__main__':
    main()

And, here comes the output for your comparisons –

Case 1 Vs Case 2:

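As you can see, there is a significant improvement that we can achieve using this. However, it has limited scope; you will not get the benefit everywhere. Keep in mind that Case 2 also replaces the explicit Python loop with a single vectorized expression (a ** b), which NumPy evaluates in compiled code, so the speed-up measured here also reflects vectorization; the KERAS_BACKEND setting only takes effect for code that runs through Keras. Until Python itself improves on the performance side, it is worth exploring either of the two options that I’ve discussed here (I didn’t cover much of Cython here. Maybe some other day.).

To get the codebase, you can refer to the following Github link.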
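A minimal sketch, assuming only NumPy is installed, that times the element-wise loop from Case 1 against the single expression from Case 2 on a much smaller vector (the vec_size of 100,000 is our own choice so that the loop finishes quickly). It also checks that both approaches produce the same values; the timings depend on your machine, so none are shown here.

```python
import numpy as np
from timeit import default_timer as timer


def pow_loop(a, b, c):
    # Case 1 style: one Python-level iteration per element
    for i in range(a.size):
        c[i] = a[i] ** b[i]


def pow_vectorized(a, b):
    # Case 2 style: a single NumPy expression over the whole array
    return a ** b


vec_size = 100_000  # hypothetical small size, far below the 100,000,000 used above

a = b = np.array(np.random.sample(vec_size), dtype=np.float32)
c_loop = np.zeros(vec_size, dtype=np.float32)

start = timer()
pow_loop(a, b, c_loop)
loop_seconds = timer() - start

start = timer()
c_vec = pow_vectorized(a, b)
vec_seconds = timer() - start

print('Loop       :', loop_seconds)
print('Vectorized :', vec_seconds)
print('Same result:', np.allclose(c_loop, c_vec))
```

Running the two variants side by side on the same machine helps to see how much of the improvement comes from avoiding the Python loop.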

So, finally, we have done it.

I’ll bring some more exciting topics in the coming days from the Python verse.

Till then, Happy Avenging! 😀

Note: All the data & scenario posted here are representational data & scenarios & available over the internet & for educational purpose only.
