Building Python-based best-route apps for Indian Railways

Hi Guys!

Today, I’ll present a way to get the best route from Indian Railways train between two specific sources & destination using third-party API.

This approach is particularly beneficial if you want to integrate this logic in Azure Function or Lambda Function or any serverless functions.

Before we dig into the details. Let us explore what kind of cloud-based architecture we can implement this.

Architecture

Fig: 1 (Cloud Architecture)

In this case, I’ve considered Azure as the implementation platform.

Let’s discuss how the events will take place. At first, a user searches for the best routes between two fixed stations. The user has to provide the source & destination stations. The request will go through the Azure Firewall after validating the initial authentication. As part of the API service, it will check for similar queries & if it is there, then it will fetch it from the cache & send it back to the user through their mobile application. However, for the first time, it will retrieve the information from the DB & keep a copy in the cache. This part also managed through a load balancer for high-level availability. However, periodically system will push the data from the cache to the DB with the updated information.

Let’s see the program directory structure –

ProgramDir

Let’s discuss our code –

1. clsConfig.py (This script contains all the parameters for the main Indian Railway API & try to get the response between two railway stations. Hence, the name comes into the picture.)

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
##############################################
#### Written By: SATYAKI DE               ####
#### Written On: 12-Oct-2019              ####
####                                      ####
#### Objective: This script is a config   ####
#### file, contains all the keys for      ####
#### azure cosmos db. Application will    ####
#### process these information & perform  ####
#### various CRUD operation on Cosmos DB. ####
##############################################

import os
import platform as pl

class clsConfig(object):
    Curr_Path = os.path.dirname(os.path.realpath(__file__))

    os_det = pl.system()
    if os_det == "Windows":
        sep = '\\'
    else:
        sep = '/'

    config = {
        'APP_ID': 1,
        'URL': "https://trains.p.rapidapi.com/",
        'RAPID_API_HOST': "trains.p.rapidapi.com",
        'RAPID_API_KEY': "hrfjjdfjfjfjfjxxxxxjffjjfjfjfjfjfjfjf",
        'RAPID_API_TYPE': "application/json",
        'ARCH_DIR': Curr_Path + sep + 'arch' + sep,
        'PROFILE_PATH': Curr_Path + sep + 'profile' + sep,
        'LOG_PATH': Curr_Path + sep + 'log' + sep,
        'REPORT_PATH': Curr_Path + sep + 'report',
        'APP_DESC_1': 'Indian Railway Train Schedule Search',
        'DEBUG_IND': 'N',
        'INIT_PATH': Curr_Path,
        'COL_LIST': ['name','train_num','train_from','train_to','classes','departTime','arriveTime','Mon','Tue','Wed','Thu','Fri','Sat','Sun']
    }

As of now, I’ve replaced the API Key with the dummy value.

2. clsIndianRailway.py (This script will invoke the main Indian Railway API & try to get the response between two railway stations. Hence, the name comes into the picture.)

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
##############################################
#### Written By: SATYAKI DE               ####
#### Written On: 20-Dec-2019              ####
#### Modified On 20-Dec-2019              ####
####                                      ####
#### Objective: Main scripts to invoke    ####
#### Indian Railway API.                  ####
##############################################

import requests
import logging
import json
from clsConfig import clsConfig as cf

class clsIndianRailway:
    def __init__(self):
        self.url = cf.config['URL']
        self.rapidapi_host = cf.config['RAPID_API_HOST']
        self.rapidapi_key = cf.config['RAPID_API_KEY']
        self.type = cf.config['RAPID_API_TYPE']

    def searchQry(self, rawQry):
        try:
            url = self.url
            rapidapi_host = self.rapidapi_host
            rapidapi_key = self.rapidapi_key
            type = self.type

            Ipayload = "{\"search\":\"" + rawQry + "\"}"

            jpayload = json.dumps(Ipayload)
            payload = json.loads(jpayload)

            print('Input JSON: ', str(payload))

            headers = {
                'x-rapidapi-host': rapidapi_host,
                'x-rapidapi-key': rapidapi_key,
                'content-type': type,
                'accept': type
                }

            response = requests.request("POST", url, data=payload, headers=headers)

            ResJson  = response.text

            jdata = json.dumps(ResJson)
            ResJson = json.loads(jdata)

            return ResJson

        except Exception as e:
            ResJson = ''
            x = str(e)
            print(x)

            logging.info(x)
            ResJson = {'errorDetails': x}

            return ResJson

Let’s explain the critical snippet from the code.

url = self.url
rapidapi_host = self.rapidapi_host
rapidapi_key = self.rapidapi_key
type = self.type

Ipayload = "{\"search\":\"" + rawQry + "\"}"

jpayload = json.dumps(Ipayload)
payload = json.loads(jpayload)

The first four lines are to receive the parameter values. Our application needs to frame the search query, which is done in the IPayload variable. After that, our app will convert it into a json object type.

headers = {
    'x-rapidapi-host': rapidapi_host,
    'x-rapidapi-key': rapidapi_key,
    'content-type': type,
    'accept': type
    }

response = requests.request("POST", url, data=payload, headers=headers)

Now, the application will prepare the headers & send the request & received the response. Finally, that response will be sent by this script to the main callee application after extracting part of the response & converting that back to JSON are as follows –

response = requests.request("POST", url, data=payload, headers=headers)

ResJson  = response.text

jdata = json.dumps(ResJson)
ResJson = json.loads(jdata)

return ResJson

3. callIndianRailwayAPI.py (This is the main script which invokes the main Indian Railway API & tries to get the response between two railway stations. Hence, the name comes into the picture.)

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
##############################################
#### Written By: SATYAKI DE               ####
#### Written On: 20-Dec-2019              ####
#### Modified On 20-Dec-2019              ####
####                                      ####
#### Objective: Main calling scripts.     ####
##############################################

from clsConfig import clsConfig as cf
import pandas as p
import clsL as cl
import logging
import datetime
import json
import clsIndianRailway as ct
import re
import numpy as np

# Disbling Warning
def warn(*args, **kwargs):
    pass

import warnings
warnings.warn = warn

# Lookup functions from
# Azure cloud SQL DB

var = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")

def getArriveTimeOnly(row):
    try:
        # Using regular expression to fetch time part only

        lkp_arriveTime = str(row['arriveTime'])

        str_arr_time, remain = lkp_arriveTime.split('+')

        return str_arr_time

    except Exception as e:
        x = str(e)
        str_arr_time = ''

        return str_arr_time

def getArriveDateDiff(row):
    try:
        # Using regular expression to fetch time part only

        lkp_arriveTime = str(row['arriveTime'])

        first_half, str_date_diff_init = lkp_arriveTime.split('+')

        # Replacing the text part from it & only capturing the integer part
        str_date_diff = int(re.sub(r"[a-z]","",str_date_diff_init, flags=re.I))

        return str_date_diff

    except Exception as e:
        x = str(e)
        str_date_diff = 0

        return str_date_diff

def getArriveTimeDiff(row):
    try:
        # Using regular expression to fetch time part only

        lkp_arriveTimeM = str(row['arriveTimeM'])

        str_time_diff_init = int(re.sub(r'[^\w\s]', '', lkp_arriveTimeM))

        # Replacing the text part from it & only capturing the integer part
        str_time_diff = (2400 - str_time_diff_init)

        return str_time_diff

    except Exception as e:
        x = str(e)
        str_time_diff = 0

        return str_time_diff

def main():
    try:
        dfSrc = p.DataFrame()
        df_ret = p.DataFrame()
        ret_2 = ''
        debug_ind = 'Y'
        col_list = cf.config['COL_LIST']

        general_log_path = str(cf.config['LOG_PATH'])

        # Enabling Logging Info
        logging.basicConfig(filename=general_log_path + 'consolidatedIR.log', level=logging.INFO)

        # Initiating Log Class
        l = cl.clsL()

        # Moving previous day log files to archive directory
        arch_dir = cf.config['ARCH_DIR']
        log_dir = cf.config['LOG_PATH']

        tmpR0 = "*" * 157

        logging.info(tmpR0)
        tmpR9 = 'Start Time: ' + str(var)
        logging.info(tmpR9)
        logging.info(tmpR0)

        print("Archive Directory:: ", arch_dir)
        print("Log Directory::", log_dir)
        tmpR1 = 'Log Directory::' + log_dir
        logging.info(tmpR1)

        # Query using parameters
        rawQry = str(input('Please enter the name of the train service that you want to find out (Either by Name or by Number): '))

        x1 = ct.clsIndianRailway()
        ret_2 = x1.searchQry(rawQry)

        # Capturing the JSON Payload
        res = json.loads(ret_2)

        # Converting dictionary to Pandas Dataframe
        # df_ret = p.read_json(ret_2, orient='records')

        df_ret = p.io.json.json_normalize(res)
        df_ret.columns = df_ret.columns.map(lambda x: x.split(".")[-1])

        # Resetting the column orders as per JSON
        # df_ret = df_ret[list(res[0].keys())]
        column_order = col_list
        df_mod_ret = df_ret.reindex(column_order, axis=1)

        # Sorting the source data for better viewing
        df_mod_resp = df_mod_ret.sort_values(by=['train_from','train_to','train_num'])

        l.logr('1.IndianRailway_' + var + '.csv', debug_ind, df_mod_resp, 'log')

        # Fetching Data for Delhi To Howrah
        df_del_how = df_mod_resp[(df_mod_resp['train_from'] == 'NDLS') & (df_mod_resp['train_to'] == 'HWH')]

        l.logr('2.IndianRailway_Delhi2Howrah_' + var + '.csv', debug_ind, df_del_how, 'log')

        # Splitting Arrive time into two separate fields for better calculation
        df_del_how['arriveTimeM'] = df_del_how.apply(lambda row: getArriveTimeOnly(row), axis=1)
        df_del_how['arriveTimeDayDiff'] = df_del_how.apply(lambda row: getArriveDateDiff(row), axis=1)
        df_del_how['arriveTimeDiff'] = df_del_how.apply(lambda row: getArriveTimeDiff(row), axis=1)

        l.logr('3.IndianRailway_Del2How_Mod_' + var + '.csv', debug_ind, df_del_how, 'log')

        # To fetch the best route which saves time
        lstTimeDayDiff = df_del_how['arriveTimeDayDiff'].values.tolist()
        min_lstTimeDayDiff = int(min(lstTimeDayDiff))

        df_min_timedaydiff = df_del_how[(df_del_how['arriveTimeDayDiff'] == min_lstTimeDayDiff)]

        l.logr('4.IndianRailway_Del2How_TimeCalc_' + var + '.csv', debug_ind, df_min_timedaydiff, 'log')

        # Now application will check the maximum arrivetimediff, this will bring the record
        # which arrives early at Howrah station
        lstTimeDiff = df_min_timedaydiff['arriveTimeDiff'].values.tolist()
        max_lstTimeDiff = int(max(lstTimeDiff))

        df_best_route = df_min_timedaydiff[(df_min_timedaydiff['arriveTimeDiff'] == max_lstTimeDiff)]

        # Dropping unwanted columns
        df_best_route.drop(columns=['arriveTimeM'], inplace=True)
        df_best_route.drop(columns=['arriveTimeDayDiff'], inplace=True)
        df_best_route.drop(columns=['arriveTimeDiff'], inplace=True)

        l.logr('5.IndianRailway_Del2How_BestRoute_' + var + '.csv', debug_ind, df_best_route, 'log')

        print("-" * 60)

        print('Realtime Indian Railway Data:: ')
        logging.info('Realtime Indian Railway Data:: ')
        print(df_mod_resp)
        print()
        print('Best Route from Delhi -> Howrah:: ')
        print(df_best_route)
        print()

        # Checking execution status
        ret_val_2 = df_best_route.shape[0]

        if ret_val_2 == 0:
            print("Indian Railway hasn't returned any rows. Please check your queries!")
            logging.info("Indian Railway hasn't returned any rows. Please check your queries!")
            print("*" * 157)
            logging.info(tmpR0)
        else:
            print("Successfuly row feteched!")
            logging.info("Successfuly row feteched!")
            print("*" * 157)
            logging.info(tmpR0)

        print('Finding Story points..')
        print("*" * 157)
        logging.info('Finding Story points..')
        logging.info(tmpR0)


        tmpR10 = 'End Time: ' + str(var)
        logging.info(tmpR10)
        logging.info(tmpR0)

    except ValueError:
        print("No relevant data to proceed!")
        logging.info("No relevant data to proceed!")

    except Exception as e:
        print("Top level Error: args:{0}, message{1}".format(e.args, e.message))

if __name__ == "__main__":
    main()

Key snippet to explore –

# Query using parameters
rawQry = str(input('Please enter the name of the train service that you want to find out (Either by Name or by Number): '))

In this case, we make it interactive mode. However, in the actual scenario, you would receive these values from your mobile application.

x1 = ct.clsIndianRailway()
ret_2 = x1.searchQry(rawQry)

# Capturing the JSON Payload
res = json.loads(ret_2)

The above four lines initially invoke the API & receive the JSON response.

# Converting dictionary to Pandas Dataframe
df_ret = p.io.json.json_normalize(res)
df_ret.columns = df_ret.columns.map(lambda x: x.split(".")[-1])

# Resetting the column orders as per JSON
column_order = col_list
df_mod_ret = df_ret.reindex(column_order, axis=1)

# Sorting the source data for better viewing
df_mod_resp = df_mod_ret.sort_values(by=['train_from','train_to','train_num'])

In these last five lines, our application will convert the JSON & serialize it into pandas dataframe, which is sorted after that.

The result will look like this –

SerializeJson2PandasDF

This is exceptionally critical, as this will allow you to achieve your target. Without flattening the data, you won’t get to your goal.

# Fetching Data for Delhi To Howrah
df_del_how = df_mod_resp[(df_mod_resp['train_from'] == 'NDLS') & (df_mod_resp['train_to'] == 'HWH')]

As the line suggested, our application will pick-up only those records between New Delhi & Howrah. Thus, we’ve used our filter to eliminate additional records. And, the data will look like this –

SilteredRecords

Now, we need to identify the minimum time taken by anyone of the two records. For that, we’ll be doing some calculations to fetch the minimum time taken by the application.

# Splitting Arrive time into two separate fields for better calculation
df_del_how['arriveTimeM'] = df_del_how.apply(lambda row: getArriveTimeOnly(row), axis=1)
df_del_how['arriveTimeDayDiff'] = df_del_how.apply(lambda row: getArriveDateDiff(row), axis=1)
df_del_how['arriveTimeDiff'] = df_del_how.apply(lambda row: getArriveTimeDiff(row), axis=1)

To do that, we’ll be generating a couple of derived columns (shown above), which we’ll be using the fetch the shortest duration. And, the data should look like this –

CalculatedFields

These are the two fields, which we’re using for our calculation. First, we’re splitting arriveTime into two separate columns i.e. arriveTimeM & arriveTimeDayDiff. However, arriveTimeDiff is a calculated field.

So, our logic to find the best routes –

  • arriveTimeDayDiff = Take the minimum of the records. If you have multiple candidates, then we’ll pick all of them. In this case, we’ll get two records.
  • ArrivalDiff = (24:00 – <Train’s Arrival Time>), then take the maximum of the value

Note that, in this case, we haven’t considered the departure time. You can add that logic to improvise & correct your prediction.

The above steps can be seen in the following snippet –

# To fetch the best route which saves time
lstTimeDayDiff = df_del_how['arriveTimeDayDiff'].values.tolist()
min_lstTimeDayDiff = int(min(lstTimeDayDiff))

df_min_timedaydiff = df_del_how[(df_del_how['arriveTimeDayDiff'] == min_lstTimeDayDiff)]

l.logr('4.IndianRailway_Del2How_TimeCalc_' + var + '.csv', debug_ind, df_min_timedaydiff, 'log')

# Now application will check the maximum arrivetimediff, this will bring the record
# which arrives early at Howrah station
lstTimeDiff = df_min_timedaydiff['arriveTimeDiff'].values.tolist()
max_lstTimeDiff = int(max(lstTimeDiff))

df_best_route = df_min_timedaydiff[(df_min_timedaydiff['arriveTimeDiff'] == max_lstTimeDiff)]

Let’s see how it runs –

Output

As you can see that NDLS (New Delhi), we’ve three records marked in the GREEN square box. However, as destination HWH (Howrah), we’ve only two records marked in the RED square box. However, as part of our calculation, we’ll pick the record marked with the BLUE square box.

Let’s see how the log directory generates all the files –

Log_Dir

Let’s see the final output in our csv file –

BestRoute

So, finally, we’ve achieved it. 😀

Let me know – how do you like this post. Please share your suggestion & comments.

I’ll be back with another installment from the Python verse.

Till then – Happy Avenging!

Note: All the data posted here are representational data & available over the internet & for educational purpose only.

Building GUI application using Python

Hi Guys!

Today, We’ll be exploring to create a GUI application using Python.

We’ll be using a briefcase package, one of the popular utilities from Python-verse.

The advantage of this package is you can create an application on Windows, MAC, Linux & Android using the same piece of code.

Let’s explore!

Step – 1:

We’ll be installing briefcase package –

1. Installing Packages - 1

Step – 2:

Install the toga package –

2. Installing Packages - 2.JPG

Step – 3:

Install the pycairo package –

3. Installing Packages - 3

Some O/S, you might not be able to install pycairo. In that case, you need to install it from a third-party site & need to install a wheel package.

4. Installing Packages - 4

Step – 4:

Finally, the last package –

5. Installing Packages - 5
  • For Windows, you need to install the Wix toolset.
  • For MAC, there is no additional tool you need to install.
  • For iOS, You need to have X-Code & a developer account.
  • For Linux, You need to install GTK 3.10 or later.
  • For Android, you need to install the Android studio.
  • For web-app, you need to use the Django framework.

Let’s create the virtual directory –

python -m venv –copies .env

.env\Scripts\activate.bat

Let’s create the default app –

There will be a series of inputs that you need to pass it to create the default application –

6. Creating First Step

Please find the RED highlighted options.

Let’s find the run commands for different environments  –

python setup.py windows -s

python setup.py macos -s

python setup.py linux -s

python setup.py ios -s

python setup.py android -s

Let’s run that in Windows –

7. Creating First App

Now, we’ll modify our code & we’ll add some text field & calculation logic in this Apps.

Let’s review the central directory structure –

11. Main Directory Structures

Now, explore the SDPythonApp directory & we’ll find the following structure –

10. Directory Structure

Let’s discuss our code –

1. app.py (This script will contain the main logic of GUI Apps & will invoke be the main application)

###############################################
#### Written By: SATYAKI DE                ####
#### Written On: 24-Nov-2019               ####
####                                       ####
#### Objective: This script will create a  ####
#### GUI application with sample function. ####
####                                       ####
###############################################

"""
First IOS App made from Python
"""
import toga
from toga.style import Pack
from toga.style.pack import COLUMN, ROW, LEFT, RIGHT


class Sdapp(toga.App):

    def two_digit_decimal(self, n1):
        n = round(n1, 2)
        return n

    def calculate(self, widget):
        try:
            # Dummy Tax Calculation
            val = (float(self.f_input.value) * 4.5) * 2.7 / 100.0
            self.c_input.value = self.two_digit_decimal(val)
        except Exception:
            self.c_input.value = 'Please provide numeric values!'

    def startup(self):
        # Create a main window with a name matching the app
        self.main_window = toga.MainWindow(title=self.name)

        # Create a main content box
        f_box = toga.Box()
        c_box = toga.Box()
        box = toga.Box()

        self.c_input = toga.TextInput(readonly=True)
        self.f_input = toga.TextInput()

        self.c_label = toga.Label('$', style=Pack(text_align=LEFT))
        self.f_label = toga.Label('Salary', style=Pack(text_align=LEFT))
        self.join_label = toga.Label('Tax Amount', style=Pack(text_align=RIGHT))

        # Button Activity
        button = toga.Button('Generate Tax', on_press=self.calculate)

        f_box.add(self.f_input)
        f_box.add(self.f_label)

        c_box.add(self.join_label)
        c_box.add(self.c_input)
        c_box.add(self.c_label)

        box.add(f_box)
        box.add(c_box)
        box.add(button)

        box.style.update(direction=COLUMN, padding_top=10)
        f_box.style.update(direction=ROW, padding=5)
        c_box.style.update(direction=ROW, padding=5)

        self.c_input.style.update(flex=1)
        self.f_input.style.update(flex=1, padding_left=160)
        self.c_label.style.update(width=100, padding_left=10)
        self.f_label.style.update(width=100, padding_left=10)
        self.join_label.style.update(width=150, padding_right=10)

        button.style.update(padding=15, flex=1)

        # Add the content on the main window
        self.main_window.content = box

        # Show the main window
        self.main_window.show()


def main():
    return Sdapp('SDApp', 'com.firstapp.SDPythonApp')

Let’s discuss the key lines –

self.c_label = toga.Label('

The following lines are textbox boilerplate in the main application.

# Button Activity
button = toga.Button('Generate Tax', on_press=self.calculate)

The above lines will trigger the event when someone clicks the application & it will trigger the function named calculate.

def calculate(self, widget):
    try:
        # Dummy Tax Calculation
        val = (float(self.f_input.value) * 4.5) * 2.7 / 100.0
        self.c_input.value = self.two_digit_decimal(val)
    except Exception:
        self.c_input.value = 'Please provide numeric values!'

In the above function, we’ve prepared a dummy calculation logic for TAX calculation. And, finally, we’ll be extracting two digits numeric digits after decimal by invoking the two_digit_decimal function.

def two_digit_decimal(self, n1):
    n = round(n1, 2)
    return n

This function will return two digits after decimal places.

In the above function, we’ve prepared a dummy calculation logic for TAX calculation. And, finally, we’ll be extracting two digits numeric digits after decimal by invoking the two_digit_decimal function.

Let’s run our application –

8.1. Before Advanced App

Let’s provide input say 1234 as shown in the above figures & click the Generate Tax button marked in RED. This will prompt the following screen.

8. Advanced App

Let’s explore if someone provides invalid input –

9. Exception Cases

As expected, this will throw a proper warning to its application user.

So, we’ve done it. Building our first python based GUI application across multiple platforms.

Please share your review.

So, we’ll come out with another new post in the coming days!

N.B.: This is demonstrated for RnD/study purposes. All the data posted here are representational data & available over the internet & for educational purpose only.

Sending SMS using 3rd party API by integrating with custom-built BOT in Python

Hi Guys!

Today, We’re going to discuss the way to send SMS through popular 3rd-party API (Twilio) using Python 3.7.

Before that, you need to register with Twilio. By default, they will give you some credit in order to explore their API.

And, then you can get a virtual number from them, which will be used to exchange SMS between your trusted numbers for trial Account.

1. Booking Phone Number

The basic architecture can be depicted are as follows –

14. FeatureImage

How to get a verified number for your trial account?

Here is the way, you have to do that –

10. VerifiedNumbers

You can create your own trial account by using this link.

Apart from that, you need to download & install Ngrok. This is available for multi-platform. For our case, we’re using Windows.

The purpose is to run your local web service through a global API like interface. I’ll explain that later.

You need to register & install that on your computer –

2. Ngrok

Once, you download & install you need to use the global link of any running local server application like this –

3. GetURL

This is the dummy link. I’ll hide the original link. However, every time when you restart the application, you’ll get a new link. So, you will be safe anyway. 🙂

4. UpdateLink

Once, you get the link, you have to update that global link under the messaging section. Remember that, you have to keep the “/sms” part after that.

Let’s see our sample code. here, I would be integrating my custom developed BOT developed in Python. However, I’ll be only calling that library. We’re not going post any script or explain that over here.

1. serverSms.py ( This script is a server script, which is using flask framework & it will respond to the user’s text message by my custom developed BOT using Python)

# /usr/bin/env python
##############################################
#### Written By: SATYAKI DE               ####
#### Written On: 03-Nov-2019              ####
####                                      ####
#### Objective: This script will respond  ####
#### by BOT created by me. And, reply to  ####
#### sender about their queries.          ####
#### We're using Twillio API for this.    ####
####                                      ####
##############################################

from flask import Flask, request, redirect
from twilio import twiml
from twilio.twiml.messaging_response import Message, MessagingResponse
import logging
from flask import request
from SDChatbots.clsTalk2Bot import clsTalk2Bot

app = Flask(__name__)

@app.route("/sms", methods=['GET', 'POST'])
def sms_ahoy_reply():
    """Respond to incoming messages with a friendly SMS."""
    # Start our response
    # resp = twiml.Response()
    message_body = request.form['Body']

    print(message_body)
    logging.info(message_body)

    y = clsTalk2Bot()
    ret_val = y.TalkNow(message_body)
    zMsg = str(ret_val)
    print('Response: ', str(zMsg))

    resp = MessagingResponse()

    # Add a message
    resp.message(zMsg)

    return str(resp)

if __name__ == "__main__":
    app.run(debug=True)

Key lines from the above scripts are –

@app.route("/sms", methods=['GET', 'POST'])

The route is a way to let your application understand to trigger the appropriate functionalities inside your API.

message_body = request.form['Body']

Here, the application is capturing the incoming SMS & print that in your server log. We’ll see that when we run our application.

y = clsTalk2Bot()
ret_val = y.TalkNow(message_body)
zMsg = str(ret_val)

Now, the application is calling my developed python BOT & retrieve the response & convert it as a string before pushing the response SMS to the user, who originally send the SMS.

resp = MessagingResponse() --This is for Python 3.7 +

# Add a message
resp.message(zMsg)

return str(resp)

Finally, you are preparing the return SMS & send it back to the user.

For the old version, the following line might work –

resp = twiml.Response()

But, just check with the Twilio API.

Let’s run our server application. You will see the following screen –

11. ServerResponse

Let’s see, if one someone ask some question. How the application will respond –

7.1. BotIntegratedSMS

And, let’s explore how our server application is receiving it & the response from the server –

6. ServerResponse

Note that, we’ll be only sending the text to SMS, not the statistics sent by my BOT marked in RED.  😀

Let’s check the response from the BOT –

7.2. BotIntegratedSMS

Yes! We did it. 😀

But, make sure you are regularly checking your billing as this will cost you money. Always, check the current balance –

9. BillingInfo

You can check the usage from the following tab –

12. Usage

You can create a billing alarm to monitor your usage –

13. BillingAlert

Let me know, how do you like it.

So, we’ll come out with another exciting post in the coming days!

N.B.: This is demonstrated for RnD/study purposes. All the data posted here are representational data & available over the internet.

Converting text to voice in Python

Hi Guys!

Today, we’ll be discussing one new post of converting text into a voice using some third-party APIs. This is particularly very useful in many such cases, where you can use this method to get more realistic communication.

There are many such providers, where you can get an almost realistic voice for both males & females. However, most of them are subscription-based. So, you have to be very careful about your budget & how to proceed.

For testing purposes, I’ll be using voice.org to simulate this.

Let’s look out the architecture of this process –

FlowS

As you can see, the user-initiated the application & provide some input in the form of plain text. Once the data is given, the app will send it to the third-party API for the process. Now, the Third-party API will verify the authentication & then it will check all the associate parameters before it starting to generate the audio response. After that, it will send the payload & that will be received by the calling python application. Here, it will be decoded & create the audio file & finally, that will be played at the invoking computer.

This third-party API has lots of limitations. However, they are giving you the platform to test your concept.

As of now, they support the following languages – English, Chinese, Catalan, French, Finnish, Dutch, Danish, German, Italian, Japanese, Korean, Polish, Norwegian, Portuguese, Russian, Spanish & Sweedish.

In our case, we’ll be checking with English.

To work with this, you need to have the following modules installed in python –

  • playsound
  • requests
  • base64

Let’s see the directory structure –

1. Directory

Again, we are not going to discuss any script, which we’ve already discussed here.

Hence, we’re skipping clsL.py here.

1. clsConfig.py (This script contains all the parameters of the server.)

##############################################
#### Written By: SATYAKI DE               ####
#### Written On: 12-Oct-2019              ####
####                                      ####
#### Objective: This script is a config   ####
#### file, contains all the keys for      ####
#### azure cosmos db. Application will    ####
#### process these information & perform  ####
#### various CRUD operation on Cosmos DB. ####
##############################################

import os
import platform as pl

class clsConfig(object):
    Curr_Path = os.path.dirname(os.path.realpath(__file__))

    os_det = pl.system()
    if os_det == "Windows":
        sep = '\\'
    else:
        sep = '/'

    config = {
        'APP_ID': 1,
        'url': "https://voicerss-text-to-speech.p.rapidapi.com/",
        'host': "voicerss-text-to-speech.p.rapidapi.com",
        'api_key': "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
        'targetFile': "Bot_decode.mp3",
        'pitch_speed': "-6",
        'bot_language': "en-us",
        'audio_type': "mp3",
        'audio_freq': "22khz_8bit_stereo",
        'query_string_api': "hhhhhhhhhhhhhhhhhhhhhhhhhhhh",
        'b64_encoding': True,
        'APP_DESC_1': 'Text to voice conversion.',
        'DEBUG_IND': 'N',
        'INIT_PATH': Curr_Path,
        'LOG_PATH': Curr_Path + sep + 'log' + sep
    }

For security reasons, sensitive information masked with the dummy value.

‘api_key’: “xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx”,

‘query_string_api’: “hhhhhhhhhhhhhhhhhhhhhhhhhhhh”,

This two information is private to each subscriber. Hence, I’ve removed them & updated with some dummy values.

You have to fill-up with your subscribed information.

2. clsText2Voice.py (This script will convert the text data into an audio file using a GET API request from the third-party API & then play that using the web media player.)

###############################################
#### Written By: SATYAKI DE                ####
#### Written On: 27-Oct-2019               ####
#### Modified On 27-Oct-2019               ####
####                                       ####
#### Objective: Main class converting      ####
#### text to voice using third-party API.  ####
###############################################

from playsound import playsound
import requests
import base64
from clsConfig import clsConfig as cf

class clsText2Voice:
    def __init__(self):
        self.url = cf.config['url']
        self.api_key = cf.config['api_key']
        self.targetFile = cf.config['targetFile']
        self.pitch_speed = cf.config['pitch_speed']
        self.bot_language = cf.config['bot_language']
        self.audio_type = cf.config['audio_type']
        self.audio_freq = cf.config['audio_freq']
        self.b64_encoding = cf.config['b64_encoding']
        self.query_string_api = cf.config['query_string_api']
        self.host = cf.config['host']

    def getAudio(self, srcString):
        try:
            url = self.url
            api_key = self.api_key
            tarFile = self.targetFile
            pitch_speed = self.pitch_speed
            bot_language = self.bot_language
            audio_type = self.audio_type
            audio_freq = self.audio_freq
            b64_encoding = self.b64_encoding
            query_string_api = self.query_string_api
            host = self.host

            querystring = {
                "r": pitch_speed,
                "c": audio_type,
                "f": audio_freq,
                "src": srcString,
                "hl": bot_language,
                "key": query_string_api,
                "b64": b64_encoding
            }

            headers = {
                'x-rapidapi-host': host,
                'x-rapidapi-key': api_key
            }

            response = requests.request("GET", url, headers=headers, params=querystring)

            # Converting to MP3
            targetFile = tarFile
            mp3File_64_decode = base64.decodebytes(bytes(response.text, encoding="utf-8"))
            mp3File_result = open(targetFile, 'wb')

            # create a writable mp3File and write the decoding result
            mp3File_result.write(mp3File_64_decode)
            mp3File_result.close()

            playsound(targetFile)

            return 0
        except Exception as e:
            x = str(e)
            print('Error: ', x)

            return 1

Few crucial lines from the above script –

querystring = {
    "r": pitch_speed,
    "c": audio_type,
    "f": audio_freq,
    "src": srcString,
    "hl": bot_language,
    "key": query_string_api,
    "b64": b64_encoding
}

You can configure the voice of the audio by adjusting all the configurations. And, the text content will receive at srcString. So, whatever user will be typing that will be directly captured here & form the JSON payload accordingly.

response = requests.request("GET", url, headers=headers, params=querystring)

In this case, you will be receiving the audio file in the form of a base64 text file. Hence, you need to convert them back to the sound file by these following lines –

# Converting to MP3
targetFile = tarFile
mp3File_64_decode = base64.decodebytes(bytes(response.text, encoding="utf-8"))
mp3File_result = open(targetFile, 'wb')

# create a writable mp3File and write the decoding result
mp3File_result.write(mp3File_64_decode)
mp3File_result.close()

As you can see that, we’ve extracted the response.text & then we’ve decoded that to byte object to form the mp3 sound file at the receiving end.

Once we have our mp3 file ready, the following line simply plays the audio record.

playsound(targetFile)

Thus you can hear the actual voice.

3. callText2Voice.py (This is the main script that will invoke the text to voice API & then playback the audio once it gets the response from the third-party API.)

###############################################
#### Written By: SATYAKI DE                ####
#### Written On: 27-Oct-2019               ####
#### Modified On 27-Oct-2019               ####
####                                       ####
#### Objective: Main class converting      ####
#### text to voice using third-party API.  ####
###############################################

from clsConfig import clsConfig as cf
import clsL as cl
import logging
import datetime
import clsText2Voice as ct

# Disbling Warning
def warn(*args, **kwargs):
    pass

import warnings
warnings.warn = warn

var = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")

def main():
    try:
        ret_2 = ''
        debug_ind = 'Y'

        general_log_path = str(cf.config['LOG_PATH'])

        # Enabling Logging Info
        logging.basicConfig(filename=general_log_path + 'consolidatedTwitter.log', level=logging.INFO)

        # Initiating Log Class
        l = cl.clsL()

        # Moving previous day log files to archive directory
        log_dir = cf.config['LOG_PATH']

        tmpR0 = "*" * 157

        logging.info(tmpR0)
        tmpR9 = 'Start Time: ' + str(var)
        logging.info(tmpR9)
        logging.info(tmpR0)

        print("Log Directory::", log_dir)
        tmpR1 = 'Log Directory::' + log_dir
        logging.info(tmpR1)

        # Query using parameters
        rawQry = str(input('Enter your string:'))

        x1 = ct.clsText2Voice()
        ret_2 = x1.getAudio(rawQry)

        if ret_2 == 0:
            print("Successfully converted from text to voice!")
            logging.info("Successfully converted from text to voice!")
            print("*" * 157)
            logging.info(tmpR0)
        else:
            print("Successfuly converted!")
            logging.info("Successfuly converted!")
            print("*" * 157)
            logging.info(tmpR0)

        print("*" * 157)
        logging.info(tmpR0)

        tmpR10 = 'End Time: ' + str(var)
        logging.info(tmpR10)
        logging.info(tmpR0)

    except ValueError:
        print("No relevant data to proceed!")
        logging.info("No relevant data to proceed!")

    except Exception as e:
        print("Top level Error: args:{0}, message{1}".format(e.args, e.message))

if __name__ == "__main__":
    main()

Essential lines from the above script –

# Query using parameters
rawQry = str(input('Enter your string:'))

x1 = ct.clsText2Voice()
ret_2 = x1.getAudio(rawQry)

As you can see, here the user will be passing the text content, which will be given to our class & then it will project the audio sound of that text.

Let’s see how it runs –

Input Text: Welcome to Satyaki De’s blog. This site mainly deals with the Python, SQL from different DBs & many useful areas from the leading cloud providers.

And, here is the run command under Windows OS looks like –

2. Windows_Run

And, please find the sample voice that it generates –

So, We’ve done it! 😀

Let us know your comment on this.

So, we’ll come out with another exciting post in the coming days!

N.B.: This is demonstrated for RnD/study purposes. All the data posted here are representational data & available over the internet.

Prepare analytics based on streaming data from Twitter using Python

Hi Guys!

Today, we will be projecting an analytics storyline based on streaming data from twitter’s developer account.

I want to make sure that this solely for educational purposes & no data analysis has provided to any agency or third-party apps. So, when you are planning to use this API, make sure that you strictly follow these rules.

In order to create a streaming channel from Twitter, you need to create one developer account.

As I’m a huge soccer fan, I would like to refer to one soccer place on Twitter for this. In this case, we’ll be checking BA Analytics for this.

6. Origin_Site

Please find the steps to create one developer account –

Step -1: 

You have to go to the following link. Over there you need to submit the request in order to create the account. You need to provide proper justification as to why you need that account. I’m not going into those forms. They are self-explanatory.

Once, your developer account activated, you need to click the following link as shown below –

1. TwitterSetup

Once you clicked that, the program will lead to you the following page –

2. TwitterSetup - Continue

If you don’t have any app, the first page will look something like the above page.

Step 2:

3. TwiterSetup - Continue

Now, you need to fill-up the following details. For security reasons, I’ll be hiding sensitive data here.

Step 3:

4. TwitterSetUp - Continue

After creating that, you need to go to the next tab i.e. key’s & tokens. The initial screen will only have Consumer API keys.

Step 4:

To generate the Access token, you need to click the create button from the above screenshot & then the new page will look like this –

5. TwitterSetUp - Continue

Our program will be using all these pieces of information.

So, now we’re ready for our Python program.

In order to access Twitter API through python, you need to install the following package –

pip install python-twitter

Let’s see the directory structure –

7. Directory

Let’s check only the relevant scripts here. We’re not going to discuss the clsL.py as we’ve already discussed. Please refer to the old post.

1. clsConfig.py (This script contains all the parameters of the server.)

##############################################
#### Written By: SATYAKI DE               ####
#### Written On: 12-Oct-2019              ####
####                                      ####
#### Objective: This script is a config   ####
#### file, contains all the keys for      ####
#### azure cosmos db. Application will    ####
#### process these information & perform  ####
#### various CRUD operation on Cosmos DB. ####
##############################################

import os
import platform as pl

class clsConfig(object):
    Curr_Path = os.path.dirname(os.path.realpath(__file__))

    os_det = pl.system()
    if os_det == "Windows":
        sep = '\\'
    else:
        sep = '/'

    config = {
        'APP_ID': 1,
        'EMAIL_SRC_JSON_FILE': Curr_Path + sep + 'src_file' + sep + 'srcEmail.json',
        'TWITTER_SRC_JSON_FILE': Curr_Path + sep + 'src_file' + sep + 'srcTwitter.json',
        'HR_SRC_JSON_FILE': Curr_Path + sep + 'src_file' + sep + 'srcHR.json',
        'ACCESS_TOKEN': '99999999-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX',
        'ACCESS_SECRET': 'YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY',
        'CONSUMER_KEY': "aaaaaaaaaaaaaaaaaaaaaaa",
        'CONSUMER_SECRET': 'HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH',
        'ARCH_DIR': Curr_Path + sep + 'arch' + sep,
        'PROFILE_PATH': Curr_Path + sep + 'profile' + sep,
        'LOG_PATH': Curr_Path + sep + 'log' + sep,
        'REPORT_PATH': Curr_Path + sep + 'report',
        'APP_DESC_1': 'Feedback Communication',
        'DEBUG_IND': 'N',
        'INIT_PATH': Curr_Path
    }

For security reasons, I’ve removed the original keys with dummy keys. You have to fill-up your own keys.

2. clsTwitter.py (This script will fetch data from Twitter API & process the same & send it to the calling method.)

###############################################
#### Written By: SATYAKI DE                ####
#### Written On: 12-Oct-2019               ####
#### Modified On 12-Oct-2019               ####
####                                       ####
#### Objective: Main class fetching sample ####
#### data from Twitter API.                ####
###############################################

import twitter
from clsConfig import clsConfig as cf
import json
import re
import string
import logging

class clsTwitter:
    def __init__(self):
        self.access_token = cf.config['ACCESS_TOKEN']
        self.access_secret = cf.config['ACCESS_SECRET']
        self.consumer_key = cf.config['CONSUMER_KEY']
        self.consumer_secret = cf.config['CONSUMER_SECRET']

    def find_element(self, srcJson, key):
        """Pull all values of specified key from nested JSON."""
        arr = []

        def fetch(srcJson, arr, key):
            """Recursively search for values of key in JSON tree."""
            if isinstance(srcJson, dict):
                for k, v in srcJson.items():
                    if isinstance(v, (dict, list)):
                        fetch(v, arr, key)
                    elif k == key:
                        arr.append(v)
            elif isinstance(srcJson, list):
                for item in srcJson:
                    fetch(item, arr, key)
            return arr

        finJson = fetch(srcJson, arr, key)
        return finJson

    def searchQry(self, rawQry):
        try:
            fin_dict = {}
            finJson = ''
            res = ''
            cnt = 0

            # Parameters to invoke Twitter API
            ACCESS_TOKEN = self.access_token
            ACCESS_SECRET = self.access_secret
            CONSUMER_KEY = self.consumer_key
            CONSUMER_SECRET = self.consumer_secret

            tmpR20 = 'Raw Query: ' + str(rawQry)
            logging.info(tmpR20)

            finJson = '['

            if rawQry == '':
                print('No data to proceed!')
                logging.info('No data to proceed!')
            else:
                t = twitter.Api(
                                  consumer_key = CONSUMER_KEY,
                                  consumer_secret = CONSUMER_SECRET,
                                  access_token_key = ACCESS_TOKEN,
                                  access_token_secret = ACCESS_SECRET
                               )

                response = t.GetSearch(raw_query=rawQry)
                print('Total Records fetched:', str(len(response)))

                for i in response:

                    # Converting them to json
                    data = str(i)
                    res_json = json.loads(data)

                    # Calling individual key
                    id = res_json['id']
                    tmpR19 = 'Id: ' + str(id)
                    logging.info(tmpR19)

                    try:
                        f_count = res_json['quoted_status']['user']['followers_count']
                    except:
                        f_count = 0
                    tmpR21 = 'Followers Count: ' + str(f_count)
                    logging.info(tmpR21)

                    try:
                        r_count = res_json['quoted_status']['retweet_count']
                    except:
                        r_count = 0
                    tmpR22 = 'Retweet Count: ' + str(r_count)
                    logging.info(tmpR22)

                    text = self.find_element(res_json, 'text')

                    for j in text:
                        strF = re.sub(f'[^{re.escape(string.printable)}]', '', str(j))
                        pat = re.compile(r'[\t\n]')
                        strG = pat.sub("", strF)
                        res = "".join(strG)

                    # Forming return dictionary
                    #fin_dict.update({id:'id', f_count: 'followerCount', r_count: 'reTweetCount', res: 'msgPost'})
                    if cnt == 0:
                        finJson = finJson + '{"id":' + str(id) + ',"followerCount":' + str(f_count) + ',"reTweetCount":' + str(r_count) + ', "msgPost":"' + str(res) + '"}'
                    else:
                        finJson = finJson + ', {"id":' + str(id) + ',"followerCount":' + str(f_count) + ',"reTweetCount":' + str(r_count) + ', "msgPost":"' + str(res) + '"}'

                    cnt += 1

            finJson = finJson + ']'

            jdata = json.dumps(finJson)
            ResJson = json.loads(jdata)

            return ResJson

        except Exception as e:
            ResJson = ''
            x = str(e)
            print(x)

            logging.info(x)
            ResJson = {'errorDetails' : x}

            return ResJson

The key lines from this snippet are as follows –

def find_element(self, srcJson, key):
    """Pull all values of specified key from nested JSON."""
    arr = []

    def fetch(srcJson, arr, key):
        """Recursively search for values of key in JSON tree."""
        if isinstance(srcJson, dict):
            for k, v in srcJson.items():
                if isinstance(v, (dict, list)):
                    fetch(v, arr, key)
                elif k == key:
                    arr.append(v)
        elif isinstance(srcJson, list):
            for item in srcJson:
                fetch(item, arr, key)
        return arr

    finJson = fetch(srcJson, arr, key)
    return finJson

This function will check against a specific key & based on that it will search from the supplied JSON & returns the value. This would be particularly very useful when you don’t have any fixed position of your elements.

t = twitter.Api(
                  consumer_key = CONSUMER_KEY,
                  consumer_secret = CONSUMER_SECRET,
                  access_token_key = ACCESS_TOKEN,
                  access_token_secret = ACCESS_SECRET
               )

response = t.GetSearch(raw_query=rawQry)

In this case, Python application will receive the JSON response using the new Twitter API.

id = res_json['id']
try:
    f_count = res_json['quoted_status']['user']['followers_count']
except:
    f_count = 0
try:
    r_count = res_json['quoted_status']['retweet_count']
except:
    r_count = 0

Fetching specific fixed position elements from the response API.

text = self.find_element(res_json, 'text')

Fetching the dynamic position based element using our customized function.

for j in text:
    strF = re.sub(f'[^{re.escape(string.printable)}]', '', str(j))
    pat = re.compile(r'[\t\n]')
    strG = pat.sub("", strF)
    res = "".join(strG)

Removing non-printable characters & white spaces from the extracted text field in order to get clean data.

if cnt == 0:
    finJson = finJson + '{"id":' + str(id) + ',"followerCount":' + str(f_count) + ',"reTweetCount":' + str(r_count) + ', "msgPost":"' + str(res) + '"}'
else:
    finJson = finJson + ', {"id":' + str(id) + ',"followerCount":' + str(f_count) + ',"reTweetCount":' + str(r_count) + ', "msgPost":"' + str(res) + '"}'

Finally, generating a JSON string dynamically.

jdata = json.dumps(finJson)
ResJson = json.loads(jdata)

And, returning the JSON to our calling program.

3. callTwitterAPI.py (This is the main script that will invoke the Twitter API & then project the analytic report based on the available Twitter data.)

##############################################
#### Written By: SATYAKI DE               ####
#### Written On: 12-Oct-2019              ####
#### Modified On 12-Oct-2019              ####
####                                      ####
#### Objective: Main calling scripts.     ####
##############################################

from clsConfig import clsConfig as cf
import pandas as p
import clsL as cl
import logging
import datetime
import json
import clsTwitter as ct

# Disbling Warning
def warn(*args, **kwargs):
    pass

import warnings
warnings.warn = warn

def getMaximumFollower(df):
    try:
        d1 = df['followerCount'].max()
        d1_max_str = int(d1)

        return d1_max_str
    except Exception as e:
        x = str(e)
        print(x)
        dt_part1 = 0

        return dt_part1

def getMaximumRetweet(df):
    try:
        d1 = df['reTweetCount'].max()
        d1_max_str = int(d1)

        return d1_max_str
    except Exception as e:
        x = str(e)
        print(x)
        dt_part1 = ''

        return dt_part1

# Lookup functions from
# Azure cloud SQL DB

var = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")

def main():
    try:
        dfSrc = p.DataFrame()
        df_ret = p.DataFrame()
        ret_2 = ''
        debug_ind = 'Y'

        general_log_path = str(cf.config['LOG_PATH'])

        # Enabling Logging Info
        logging.basicConfig(filename=general_log_path + 'consolidatedTwitter.log', level=logging.INFO)

        # Initiating Log Class
        l = cl.clsL()

        # Moving previous day log files to archive directory
        arch_dir = cf.config['ARCH_DIR']
        log_dir = cf.config['LOG_PATH']

        tmpR0 = "*" * 157

        logging.info(tmpR0)
        tmpR9 = 'Start Time: ' + str(var)
        logging.info(tmpR9)
        logging.info(tmpR0)

        print("Archive Directory:: ", arch_dir)
        print("Log Directory::", log_dir)
        tmpR1 = 'Log Directory::' + log_dir
        logging.info(tmpR1)

        # Query using parameters
        rawQry = 'q=from%3ABlades_analytic&src=typd'

        x1 = ct.clsTwitter()
        ret_2 = x1.searchQry(rawQry)

        # Capturing the JSON Payload
        res = json.loads(ret_2)

        # Converting dictionary to Pandas Dataframe
        df_ret = p.read_json(ret_2, orient='records')

        # Resetting the column orders as per JSON
        df_ret = df_ret[list(res[0].keys())]

        l.logr('1.Twitter_' + var + '.csv', debug_ind, df_ret, 'log')

        print('Realtime Twitter Data:: ')
        logging.info('Realtime Twitter Data:: ')
        print(df_ret)
        print()

        # Checking execution status
        ret_val_2 = df_ret.shape[0]

        if ret_val_2 == 0:
            print("Twitter hasn't returned any rows. Please check your queries!")
            logging.info("Twitter hasn't returned any rows. Please check your queries!")
            print("*" * 157)
            logging.info(tmpR0)
        else:
            print("Successfuly row feteched!")
            logging.info("Successfuly row feteched!")
            print("*" * 157)
            logging.info(tmpR0)

        print('Finding Story points..')
        print("*" * 157)
        logging.info('Finding Story points..')
        logging.info(tmpR0)

        # Performing Basic Aggregate
        # 1. Find the user who has maximum Followers
        df_ret['MaxFollower'] = getMaximumFollower(df_ret)

        # 2. Find the user who has maximum Re-Tweets
        df_ret['MaxTweet'] = getMaximumRetweet(df_ret)

        # Getting Status
        df_MaxFollower = df_ret[(df_ret['followerCount'] == df_ret['MaxFollower'])]

        # Dropping Columns
        df_MaxFollower.drop(['reTweetCount'], axis=1, inplace=True)
        df_MaxFollower.drop(['MaxTweet'], axis=1, inplace=True)

        l.logr('2.Twitter_Maximum_Follower_' + var + '.csv', debug_ind, df_MaxFollower, 'log')

        print('Maximum Follower:: ')
        print(df_MaxFollower)
        print("*" * 157)
        logging.info(tmpR0)

        df_MaxTwitter = df_ret[(df_ret['reTweetCount'] == df_ret['MaxTweet'])]
        print()

        # Dropping Columns
        df_MaxTwitter.drop(['followerCount'], axis=1, inplace=True)
        df_MaxTwitter.drop(['MaxFollower'], axis=1, inplace=True)

        l.logr('3.Twitter_Maximum_Retweet_' + var + '.csv', debug_ind, df_MaxTwitter, 'log')

        print('Maximum Re-Twitt:: ')
        print(df_MaxTwitter)
        print("*" * 157)
        logging.info(tmpR0)

        tmpR10 = 'End Time: ' + str(var)
        logging.info(tmpR10)
        logging.info(tmpR0)

    except ValueError:
        print("No relevant data to proceed!")
        logging.info("No relevant data to proceed!")

    except Exception as e:
        print("Top level Error: args:{0}, message{1}".format(e.args, e.message))

if __name__ == "__main__":
    main()

And, here are the key lines –

x1 = ct.clsTwitter()
ret_2 = x1.searchQry(rawQry)

Our application is instantiating the newly developed class.

# Capturing the JSON Payload
res = json.loads(ret_2)

# Converting dictionary to Pandas Dataframe
df_ret = p.read_json(ret_2, orient='records')

# Resetting the column orders as per JSON
df_ret = df_ret[list(res[0].keys())]

Converting the JSON to pandas dataframe for our analytic data point.

def getMaximumFollower(df):
    try:
        d1 = df['followerCount'].max()
        d1_max_str = int(d1)

        return d1_max_str
    except Exception as e:
        x = str(e)
        print(x)
        dt_part1 = 0

        return dt_part1

def getMaximumRetweet(df):
    try:
        d1 = df['reTweetCount'].max()
        d1_max_str = int(d1)

        return d1_max_str
    except Exception as e:
        x = str(e)
        print(x)
        dt_part1 = ''

        return dt_part1

These two functions declared above in the calling script are generating the maximum data point from the Re-Tweet & Followers from our returned dataset.

# Getting Status
df_MaxFollower = df_ret[(df_ret['followerCount'] == df_ret['MaxFollower'])]

And, this is the way, our application will fetch the maximum twitter dataset –

df_MaxTwitter = df_ret[(df_ret['reTweetCount'] == df_ret['MaxTweet'])]

And, you can customize your output by dropping unwanted columns in the specific dataset.

And, here is the output on Windows, which looks like –

8. WindowsRun

And, here is the windows log directory –

WindowsRunLog

So, we’ve achieved our target data point.

So, we’ll come out with another exciting post in the coming days!

N.B.: This is demonstrated for RnD/study purposes. All the data posted here are representational data & available over the internet.

Explaining New Python Library – Regular Expression in JSON

Hi Guys!

As discussed, here is the continuation of the previous post. We’ll explain the regular expression from the library that I’ve created recently.

First, let me share the calling script for regular expression –

##############################################
#### Written By: SATYAKI DE               ####
#### Written On: 08-Sep-2019              ####
####                                      ####
#### Objective: Main calling scripts.     ####
##############################################

from dnpr.clsDnpr import clsDnpr
import datetime as dt
import json

# Disbling Warning
def warn(*args, **kwargs):
    pass

import warnings
warnings.warn = warn

# Lookup functions from
# Azure cloud SQL DB

def main():
    try:
        # Initializing the class
        t = clsDnpr()
        
        srcJson = [
                    {"FirstName": "Satyaki", "LastName": "De", "Sal": 1000},
                    {"FirstName": "Satyaki", "LastName": "De", "Sal": 1000},
                    {"FirstName": "Archi", "LastName": "Bose", "Sal": 500},
                    {"FirstName": "Archi", "LastName": "Bose", "Sal": 7000},
                    {"FirstName": "Deb", "LastName": "Sen", "Sal": 9500}
                  ]

        print("4. Checking regular expression functionality!")
        print()

        var13 = dt.datetime.now().strftime("%Y-%m-%d %H-%M-%S")
        print("Start Time: ", str(var13))

        print('::Function Regex_Like:: ')
        print()

        tarColumn = 'FirstName'
        print('Target Column for Rexex_Like: ', tarColumn)
        inpPattern = r"\bSa"
        print('Input Pattern: ', str(inpPattern))

        # Invoking the distinct function
        tarJson = t.regex_like(srcJson, tarColumn, inpPattern)

        print('End of Function Regex_Like!')
        print()

        print("*" * 157)
        print("Output Data: ")
        tarJsonFormat = json.dumps(tarJson, indent=1)
        print(str(tarJsonFormat))
        print("*" * 157)

        if not tarJson:
            print()
            print("No relevant output data!")
            print("*" * 157)
        else:
            print()
            print("Relevant output data comes!")
            print("*" * 157)

        var14 = dt.datetime.now().strftime("%Y-%m-%d %H-%M-%S")
        print("End Time: ", str(var14))

        var15 = dt.datetime.now().strftime("%Y-%m-%d %H-%M-%S")
        print("Start Time: ", str(var15))

        print('::Function Regex_Replace:: ')
        print()

        tarColumn = 'FirstName'
        print('Target Column for Rexex_Replace: ', tarColumn)
        inpPattern = r"\bSa"
        print('Input Pattern: ', str(inpPattern))
        replaceString = 'Ka'
        print('Replacing Character: ', replaceString)

        # Invoking the distinct function
        tarJson = t.regex_replace(srcJson, tarColumn, inpPattern, replaceString)

        print('End of Function Rexex_Replace!')
        print()

        print("*" * 157)
        print("Output Data: ")
        tarJsonFormat = json.dumps(tarJson, indent=1)
        print(str(tarJsonFormat))
        print("*" * 157)

        if not tarJson:
            print()
            print("No relevant output data!")
            print("*" * 157)
        else:
            print()
            print("Relevant output data comes!")
            print("*" * 157)

        var16 = dt.datetime.now().strftime("%Y-%m-%d %H-%M-%S")
        print("End Time: ", str(var16))

        var17 = dt.datetime.now().strftime("%Y-%m-%d %H-%M-%S")
        print("Start Time: ", str(var17))

        print('::Function Regex_Substr:: ')
        print()

        tarColumn = 'FirstName'
        print('Target Column for Regex_Substr: ', tarColumn)
        inpPattern = r"\bSa"
        print('Input Pattern: ', str(inpPattern))

        # Invoking the distinct function
        tarJson = t.regex_substr(srcJson, tarColumn, inpPattern)

        print('End of Function Regex_Substr!')
        print()

        print("*" * 157)
        print("Output Data: ")
        tarJsonFormat = json.dumps(tarJson, indent=1)
        print(str(tarJsonFormat))
        print("*" * 157)

        if not tarJson:
            print()
            print("No relevant output data!")
            print("*" * 157)
        else:
            print()
            print("Relevant output data comes!")
            print("*" * 157)

        var18 = dt.datetime.now().strftime("%Y-%m-%d %H-%M-%S")
        print("End Time: ", str(var18))

        print("=" * 157)
        print("End of regular expression function!")
        print("=" * 157)



    except ValueError:
        print("No relevant data to proceed!")

    except Exception as e:
        print("Top level Error: args:{0}, message{1}".format(e.args, e.message))

if __name__ == "__main__":
    main()

As per the library, we’ll discuss the following functionalities –

  1. regex_like
  2. regex_replace
  3. regex_substr

Now, let us check how to call these functions.

1. regex_like:

Following is the base skeleton in order to invoke this function –

regex_like(Input Json, Target Column, Pattern To Match) return Output Json

Here are the key lines in the script –

srcJson = [
            {"FirstName": "Satyaki", "LastName": "De", "Sal": 1000},
            {"FirstName": "Satyaki", "LastName": "De", "Sal": 1000},
            {"FirstName": "Archi", "LastName": "Bose", "Sal": 500},
            {"FirstName": "Archi", "LastName": "Bose", "Sal": 7000},
            {"FirstName": "Deb", "LastName": "Sen", "Sal": 9500}
          ]

# Invoking the distinct function
tarJson = t.regex_like(srcJson, tarColumn, inpPattern)

2. regex_replace:

Following is the base skeleton in order to invoke this function –

regex_replace(Input Json, Target Column, Pattern to Replace) return Output Json

Here are the key lines in the script –

tarColumn = 'FirstName'
print('Target Column for Rexex_Replace: ', tarColumn)
inpPattern = r"\bSa"
print('Input Pattern: ', str(inpPattern))
replaceString = 'Ka'
print('Replacing Character: ', replaceString)

# Invoking the distinct function
tarJson = t.regex_replace(srcJson, tarColumn, inpPattern, replaceString)

As you can see, here ‘Sa’ with ‘Ka’ provided it matches the specific pattern in the JSON.

3. regex_replace:

Following is the base skeleton in order to invoke this function –

regex_substr(Input Json, Target Column, Pattern to substring) return Output Json

Here are the key lines –

tarColumn = 'FirstName'
print('Target Column for Regex_Substr: ', tarColumn)
inpPattern = r"\bSa"
print('Input Pattern: ', str(inpPattern))

# Invoking the distinct function
tarJson = t.regex_substr(srcJson, tarColumn, inpPattern)

In this case, we’ve subtracted a part of the JSON string & return the final result as JSON.

Let us first see the sample input JSON –

SourceJSON_Regex

Let us check how it looks when we run the calling script –

  • regex_like:
Regex_Like

This function will retrieve the elements, which will start with ‘Sa‘. As a result, we’ll see the following two elements in the Payload.

  • regex_replace:
Regex_Replace

In this case, we’re replacing any string which starts with ‘Sa‘ & replaced with the ‘Ka‘.

  • regex_substr:
Regex_Substr

As you can see that the first element FirstName changed the name from “Satyaki” to “tyaki“.

So, finally, we’ve achieved our target.

I’ll post the next exciting concept very soon.

Till then! Happy Avenging! 😀

N.B.: This is demonstrated for RnD/study purposes. All the data posted here are representational data & available over the internet.

Explaining new Python library

Hi Guys!

Here is the post as to how to call this Dnpr library & what are the current limitations of this library.

Before we start let’s post the calling script & then explain how we can use them –

##############################################
#### Written By: SATYAKI DE               ####
#### Written On: 08-Sep-2019              ####
####                                      ####
#### Objective: Main calling scripts.     ####
##############################################

from dnpr.clsDnpr import clsDnpr
import datetime as dt
import json

# Disbling Warning
def warn(*args, **kwargs):
    pass

import warnings
warnings.warn = warn

# Lookup functions from


def main():
    try:
        srcJson = [
                    {"FirstName": "Satyaki", "LastName": "De", "Sal": 1000},
                    {"FirstName": "Satyaki", "LastName": "De", "Sal": 1000},
                    {"FirstName": "Archi", "LastName": "Bose", "Sal": 500},
                    {"FirstName": "Archi", "LastName": "Bose", "Sal": 7000},
                    {"FirstName": "Deb", "LastName": "Sen", "Sal": 9500}
                  ]

        print("=" * 157)
        print("Checking distinct function!")
        print("=" * 157)
        print()

        print("*" * 157)
        print("Input Data: ")
        srcJsonFormat = json.dumps(srcJson, indent=1)
        print(str(srcJsonFormat))
        print("*" * 157)

        # Initializing the class
        t = clsDnpr()

        print("1. Checking distinct functionality!")

        var1 = dt.datetime.now().strftime("%Y-%m-%d %H-%M-%S")
        print("Start Time: ", str(var1))

        # Invoking the distinct function
        tarJson = t.distinct(srcJson)

        print("*" * 157)
        print("Output Data: ")
        tarJsonFormat = json.dumps(tarJson, indent=1)
        print(str(tarJsonFormat))
        print("*" * 157)

        if not tarJson:
            print()
            print("No relevant output data!")
            print("*" * 157)
        else:
            print()
            print("Relevant output data comes!")
            print("*" * 157)

        var2 = dt.datetime.now().strftime("%Y-%m-%d %H-%M-%S")
        print("End Time: ", str(var2))

        print("=" * 157)
        print("End of distinct function!")
        print("=" * 157)

        print("2. Checking nvl functionality!")

        srcJson_1 = [
            {"FirstName": "Satyaki", "LastName": "", "Sal": 1000},
            {"FirstName": "Archi", "LastName": "Bose", "Sal": 500},
            {"FirstName": "Deb", "LastName": "", "Sal": 9500}
        ]

        var3 = dt.datetime.now().strftime("%Y-%m-%d %H-%M-%S")
        print("Start Time: ", str(var3))

        strDef = 'FNU'
        print("Default Value: ", strDef)
        srcColName = 'LastName'
        print('Candidate Column for NVL: ', srcColName)

        # Invoking the nvl function
        tarJson_1 = t.nvl(srcJson_1, srcColName, strDef)

        print("*" * 157)
        print("Output Data: ")
        tarJsonFormat_1 = json.dumps(tarJson_1, indent=1)
        print(str(tarJsonFormat_1))
        print("*" * 157)

        if not tarJson_1:
            print()
            print("No relevant output data!")
            print("*" * 157)
        else:
            print()
            print("Relevant output data comes!")
            print("*" * 157)

        var4 = dt.datetime.now().strftime("%Y-%m-%d %H-%M-%S")
        print("End Time: ", str(var4))

        print("=" * 157)
        print("End of nvl function!")
        print("=" * 157)

        print("3. Checking partition-by functionality!")

        srcJson_2 = [
            {"FirstName": "Satyaki", "LastName": "", "Sal": 1000},
            {"FirstName": "Satyaki", "LastName": "", "Sal": 700},
            {"FirstName": "Archi", "LastName": "Bose", "Sal": 500},
            {"FirstName": "Deb", "LastName": "", "Sal": 9500},
            {"FirstName": "Archi", "LastName": "Bose", "Sal": 4500},
        ]

        var5 = dt.datetime.now().strftime("%Y-%m-%d %H-%M-%S")
        print("Start Time: ", str(var5))

        GrList = ['FirstName', 'LastName']
        print("Partition By Columns::: ", str(GrList))
        grOperation = 'Max'
        print('Operation toe be performed: ', grOperation)
        strCandidateColumnName = 'Sal'
        print('Column Name on which the aggregate function will take place: ', strCandidateColumnName)

        # Invoking the partition by function - MAX
        tarJson_1 = t.partitionBy(srcJson_2, GrList, grOperation, strCandidateColumnName)

        print("*" * 157)
        print("Output Data: ")
        tarJsonFormat_1 = json.dumps(tarJson_1, indent=1)
        print(str(tarJsonFormat_1))
        print("*" * 157)

        if not tarJson_1:
            print()
            print("No relevant output data!")
            print("*" * 157)
        else:
            print()
            print("Relevant output data comes!")
            print("*" * 157)

        var6 = dt.datetime.now().strftime("%Y-%m-%d %H-%M-%S")
        print("End Time: ", str(var6))

        var7 = dt.datetime.now().strftime("%Y-%m-%d %H-%M-%S")
        print("Start Time: ", str(var7))

        grOperation_1 = 'Min'
        print('Operation toe be performed: ', grOperation_1)

        # Invoking the Partition By function - MIN
        tarJson_2 = t.partitionBy(srcJson_2, GrList, grOperation_1, strCandidateColumnName)

        print("*" * 157)
        print("Output Data: ")
        tarJsonFormat_2 = json.dumps(tarJson_2, indent=1)
        print(str(tarJsonFormat_2))
        print("*" * 157)

        if not tarJson_2:
            print()
            print("No relevant output data!")
            print("*" * 157)
        else:
            print()
            print("Relevant output data comes!")
            print("*" * 157)

        var8 = dt.datetime.now().strftime("%Y-%m-%d %H-%M-%S")
        print("End Time: ", str(var8))

        var9 = dt.datetime.now().strftime("%Y-%m-%d %H-%M-%S")
        print("Start Time: ", str(var9))

        grOperation_2 = 'Avg'
        print('Operation toe be performed: ', grOperation_2)

        # Invoking the Partition By function - Avg
        tarJson_3 = t.partitionBy(srcJson_2, GrList, grOperation_2, strCandidateColumnName)

        print("*" * 157)
        print("Output Data: ")
        tarJsonFormat_3 = json.dumps(tarJson_3, indent=1)
        print(str(tarJsonFormat_3))
        print("*" * 157)

        if not tarJson_3:
            print()
            print("No relevant output data!")
            print("*" * 157)
        else:
            print()
            print("Relevant output data comes!")
            print("*" * 157)

        var10 = dt.datetime.now().strftime("%Y-%m-%d %H-%M-%S")
        print("End Time: ", str(var10))

        var11 = dt.datetime.now().strftime("%Y-%m-%d %H-%M-%S")
        print("Start Time: ", str(var11))

        grOperation_3 = 'Sum'
        print('Operation toe be performed: ', grOperation_3)

        # Invoking the Partition By function - Sum
        tarJson_4 = t.partitionBy(srcJson_2, GrList, grOperation_3, strCandidateColumnName)

        print("*" * 157)
        print("Output Data: ")
        tarJsonFormat_4 = json.dumps(tarJson_4, indent=1)
        print(str(tarJsonFormat_4))
        print("*" * 157)

        if not tarJson_4:
            print()
            print("No relevant output data!")
            print("*" * 157)
        else:
            print()
            print("Relevant output data comes!")
            print("*" * 157)

        var12 = dt.datetime.now().strftime("%Y-%m-%d %H-%M-%S")
        print("End Time: ", str(var12))

        print("=" * 157)
        print("End of partition function!")
        print("=" * 157)

        print("4. Checking regular expression functionality!")
        print()

        var13 = dt.datetime.now().strftime("%Y-%m-%d %H-%M-%S")
        print("Start Time: ", str(var13))

        print('::Function Regex_Like:: ')
        print()

        tarColumn = 'FirstName'
        print('Target Column for Rexex_Like: ', tarColumn)
        inpPattern = r"\bSa"
        print('Input Pattern: ', str(inpPattern))

        # Invoking the regex_like function
        tarJson = t.regex_like(srcJson, tarColumn, inpPattern)

        print('End of Function Regex_Like!')
        print()

        print("*" * 157)
        print("Output Data: ")
        tarJsonFormat = json.dumps(tarJson, indent=1)
        print(str(tarJsonFormat))
        print("*" * 157)

        if not tarJson:
            print()
            print("No relevant output data!")
            print("*" * 157)
        else:
            print()
            print("Relevant output data comes!")
            print("*" * 157)

        var14 = dt.datetime.now().strftime("%Y-%m-%d %H-%M-%S")
        print("End Time: ", str(var14))

        var15 = dt.datetime.now().strftime("%Y-%m-%d %H-%M-%S")
        print("Start Time: ", str(var15))

        print('::Function Regex_Replace:: ')
        print()

        tarColumn = 'FirstName'
        print('Target Column for Rexex_Replace: ', tarColumn)
        inpPattern = r"\bSa"
        print('Input Pattern: ', str(inpPattern))
        replaceString = 'Ka'
        print('Replacing Character: ', replaceString)

        # Invoking the regex_replace function
        tarJson = t.regex_replace(srcJson, tarColumn, inpPattern, replaceString)

        print('End of Function Rexex_Replace!')
        print()

        print("*" * 157)
        print("Output Data: ")
        tarJsonFormat = json.dumps(tarJson, indent=1)
        print(str(tarJsonFormat))
        print("*" * 157)

        if not tarJson:
            print()
            print("No relevant output data!")
            print("*" * 157)
        else:
            print()
            print("Relevant output data comes!")
            print("*" * 157)

        var16 = dt.datetime.now().strftime("%Y-%m-%d %H-%M-%S")
        print("End Time: ", str(var16)) 

        var17 = dt.datetime.now().strftime("%Y-%m-%d %H-%M-%S")
        print("Start Time: ", str(var17))

        print('::Function Regex_Substr:: ')
        print()

        tarColumn = 'FirstName'
        print('Target Column for Regex_Substr: ', tarColumn)
        inpPattern = r"\bSa"
        print('Input Pattern: ', str(inpPattern))

        # Invoking the regex_substr function
        tarJson = t.regex_substr(srcJson, tarColumn, inpPattern)

        print('End of Function Regex_Substr!')
        print()

        print("*" * 157)
        print("Output Data: ")
        tarJsonFormat = json.dumps(tarJson, indent=1)
        print(str(tarJsonFormat))
        print("*" * 157)

        if not tarJson:
            print()
            print("No relevant output data!")
            print("*" * 157)
        else:
            print()
            print("Relevant output data comes!")
            print("*" * 157)

        var18 = dt.datetime.now().strftime("%Y-%m-%d %H-%M-%S")
        print("End Time: ", str(var18))

        print("=" * 157)
        print("End of regular expression function!")
        print("=" * 157)


    except ValueError:
        print("No relevant data to proceed!")

    except Exception as e:
        print("Top level Error: args:{0}, message{1}".format(e.args, e.message))

if __name__ == "__main__":
    main()

Let’s explain the key lines –

As of now, the source payload that it will support is mostly simple JSON.

As you can see, we’ve relatively started with the simple JSON containing an array of elements.

# Initializing the class
t = clsDnpr()

In this line, you can initiate the main library.

Let’s explore the different functions, which you can use on JSON.

1. Distinct:

Let’s discuss the distinct function on  JSON. This function can be extremely useful if you use NoSQL, which doesn’t offer any distinct features. Or, if you are dealing with or expecting your source with duplicate JSON inputs.

Let’s check our sample payload for distinct –

Sample_Payload_Distinct

Here is the basic syntax & argument that it is expecting –

distinct(Input Json) returnOutput Json

So, all you have to ensure that you are passing a JSON input string.

As per our example –

# Invoking the distinct function
tarJson = t.distinct(srcJson)

And, here is the output –

Distinct_Output

If you compare the source JSON. You would have noticed that there are two identical entries with the name “Satyaki” is now replaced by one unique entries.

Limitation: Currently, this will support only basic JSON. However, I’m working on it to support that much more complex hierarchical JSON in coming days.

2. NVL:

NVL is another feature that I guess platform like JSON should have. So, I built this library specially handles the NULL data scenario, where the developer may want to pass a default value in place of NULL.

Hence, the implementation of this function.

Here is the sample payload for this –

Sample_Payload_NVL

In this case, if there is some business logic that requires null values replaced with some default value for LastName say e.g. FNU. This function will help you to implement that logic.

Here is the basic syntax & argument that it is expecting –

nvl(
     Input Json, 
     Prospective Null Column Name, 
     Dafult Value in case of Null
   ) return Output Json

And, here is the code implementation –

strDef = ‘FNU’
print(“Default Value: “, strDef)
srcColName = ‘LastName’
print(‘Candidate Column for NVL: ‘, srcColName)

# Invoking the nvl function
tarJson_1 = t.nvl(srcJson_1, srcColName, strDef)

So, in the above lines, this code will replace the Null value with the “FNU” for the column LastName.

And, Here is the output –

NVL_Output

3. Partition_By:

I personally like this function as this gives more power to manipulate any data in JSON levels such as Avg, Min, Max or Sum. This might be very useful in order to implement some basic aggregation on the fly.

Here is the basic syntax & argument that it is expecting –

partition_by(
              Input Json, 
              Group By Column List, 
              Group By Operation, 
              Candidate Column Name, 
              Output Column Name
            ) return Output Json

Now, we would explore the sample payload for all these functions to test –

Sample_payload_PartitionBy

Case 1:

In this case, we’ll calculate the maximum salary against FirstName & LastName. However, I want to print the Location in my final JSON output.

So, if you see the sample data & let’s make it tabular for better understanding –

PartitionMAX_SourceTab

So, as per our business logic, our MAX aggregate would operate only on FirstName & LastName. Hence, the calculation will process accordingly.

In that case, the output will look something like –

PartitionMAX_FinTab

As you can see, from the above picture two things happen. It will remove any duplicate entries. In this case, Satyaki has exactly two identical rows. So, it removes one. However, as part of partition by clause, it keeps two entries of Archi as the location is different. Deb will be appearing once as expected.

Let’s run our application & find out the output –

MAX_PartitionBy

So, we meet our expectation.

Case 2:

Same, logic will be applicable for Min as well.

Hence, as per the table, we should expect the output as –

PartitionMIN_FinTab

And, the output of our application run is –

MIN_PartitionBy

So, this also come as expected.

Case 3:

Let’s check for average –

PartitionAVG_FinTab

The only thing I wanted to point out, as we’ve two separate entries for Satyaki. So, the average will contain the salary from both the value as rightfully so. Hence, the average of (1000+700)/2 = 850.

Let’s run our application –

AVG_PartitionBy

So, we’ve achieved our target.

Case 4:

Let’s check for Sum.

PartitionSUM_FinTab

Now, let’s run our application –

SUM_PartitionBy

In the next installment, we’ll be discussing the last function from this package i.e. Regular Expression in JSON.

I hope, you’ll like this presentation.

Let me know – if you find any special bug. I’ll look into that.

Till then – Happy Avenging!

Note: All the data posted here are representational data & available over the internet & for educational purpose only.

Building Azure Databricks Cluster installing desired packages & with a demo run (Time stone from Python Verse)

Today, I’ll be showing how to prepare a cluster in Azure Databricks from command prompt & will demonstrate any sample csv file process using Pyspark. This can be useful, especially when you want to customize your environment & need to install specific packages inside the clusters with more options.

This is not like any of my earlier posts, where my primary attention is on the Python side. At the end of this post, I’ll showcase one use of Pyspark script & how we can execute them inside Azure Data bricks.

Let’s roll the dice!

Step -1:

Type Azure Databricks in your search folder inside the Azure portal.

0. Azure Search

As shown in the red box, you have to click these options. And, it will take the application to new data bricks sign-in page.

Step -2:

Next step would be clicking the “Add” button. For the first time, the application will ask you to create a storage account associated with this brick.

1. Create Storage

After creation, the screen should look like this –

2.5. Azure-Data-Bricks Options

Now, click the Azure command-line & chose bash as your work environment –

2. After Creation

For security reason, I’ve masked the details.

After successful creation, this page should look like this –

3. Azure Databricks

Once, you click the launch workspace, it will take you to this next page –

4. Detailed Bricks

As you can see that, there are no notebook or python scripts there under Recents tab.

Step -3:

Let’s verify it from the command line shell environment.

5. Python-Env

As you can see, by default python version in bricks is 3.5.2.

Step -4:

Now, we’ll prepare one environment by creating a local directory under the cloud.

The directory that we’ll be creating is – “rndBricks.”

6. Creating Directory

Step -5:

Let’s create the virtual environment here –

Using “virtualenv” function, we’ll be creating the virtual environment & it should look like this –

7. Creating Python-VM

As you can see, that – this will create the first python virtual environment along with the pip & wheel, which is essential for your python environment.

After creating the VM, you need to update Azure CLI, which is shown in the next screenshot given below –

8. Installing Databricks CLI in Python-VM

Before you create the cluster, you need to first generate the token, which will be used for the cluster –

9.1. Generating Token

As shown in the above screen, the “red” marked area is our primary interest. The “green” box, which represents the account image that you need to click & then you have to click “User Settings” marked in blue. Once you click that, you can see the “purple” area, where you need to click the Generate new token button in case if you are doing it for the first time.

Now, we’ll be using this newly generated token to configure data bricks are as follows –

9.2. Configuring with Token

Make sure, you need to mention the correct zone, i.e. westus2/westus or any region as per your geography & convenience.

Once, that is done. You can check the cluster list by the following command (In case, if you already created any clusters in your subscription) –

10. Checking Clusters List

Since we’re building it from scratch. There is no cluster information showing here.

Step -6:

Let’s create the clusters –

11. Creating-Clusters-From-Command

Please find the command that you will be using are as follows –

databricks clusters create –json ‘{ “autoscale”: {“min_workers”: 2, “max_workers”: 8}, “cluster_name”: “pyRnd”, “spark_version”: “5.3.x-scala2.11”, “spark_conf”: {}, “node_type_id”: “Standard_DS3_v2”, “driver_node_type_id”: “Standard_DS3_v2”, “ssh_public_keys”: [], “custom_tags”: {}, “spark_env_vars”: {“PYSPARK_PYTHON”: “/databricks/python3/bin/python3”}, “autotermination_minutes”: 20, “enable_elastic_disk”: true, “cluster_source”: “UI”, “init_scripts”: [] }’

As you can see, you need to pass the information in JSON format. For your better understanding, please find the JSON in a proper format –

11.5. JSON

And, the raw version –

{
  "autoscale": {
    "min_workers": 2,
    "max_workers": 8
  },
  "cluster_name": "pyRnd",
  "spark_version": "5.3.x-scala2.11",
  "spark_conf": {},
  "node_type_id": "Standard_DS3_v2",
  "driver_node_type_id": "Standard_DS3_v2",
  "ssh_public_keys": [],
  "custom_tags": {},
  "spark_env_vars": {
    "PYSPARK_PYTHON": "/databricks/python3/bin/python3"
  },
  "autotermination_minutes": 20,
  "enable_elastic_disk": true,
  "cluster_source": "UI",
  "init_scripts": []
}

Initially, the cluster status will show from the GUI are as follows –

12. Cluster-Status-In-Progress

After a few minutes, this will show the running state –

13. Cluster-Running Status

Let’s check the detailed configuration once the cluster created –

14. Initial Cluster Details

Step -7:

We need to check the library section. This is important as we might need to install many dependant python package to run your application on Azure data bricks. And, the initial Libraries will look like this –

15. Libraries

You can install libraries into an existing cluster either through GUI or through shell command prompt as well. Let’s explore the GUI option.

GUI Option:

First, click the Libraries tab under your newly created clusters, as shown in the above picture. Then you need to click “Install New” button. This will pop-up the following windows –

16. Installing Libraries

As you can see, you have many options along with the possibilities for your python (marked in red) application as well.

Case 1 (Installing PyPi packages):

19. Installing through GUI

Note: You can either mention the specific version or just simply name the package name.

Case 2 (Installing Wheel packages):

16.5. Installing Wheel Libraries

As you can see, from the upload options, you can upload your local libraries & then click the install button to install the same.

UI Option:

Here is another way, you can install your python libraries using the command line as shown in the below screenshots –

17. Running & Installing Libraries - Alternate Options

Few things to notice. The first command shows the current running cluster list. Second, command updating your pip packages. And, the third command, install your desired pypi packages.

Please find the raw commands –

databricks clusters list

pip install -U pip

databricks libraries install –cluster-id “XXXX-XXXXX-leech896” –pypi-package “pandas” –pypi-repo “https://pypi.org/project/pandas/&#8221;

After installing, the GUI page under the libraries section will look like this –

18. Installed Libraries

Note that, for any failed case, you can check the log in this way –

20. Installation-In-progress

If you click on the marked red area, it will pop-up the detailed error details, which is as follows –

19.5. Error Details

So, we’re done with our initial set-up.

Let’s upload one sample file into this environment & try to parse the data.

Step -8:

You can upload your sample file as follows –

23.1. First Step

First, click the “data” & then click the “add data” marked in the red box.

You can import this entire csv data as tables as shown in the next screenshot –

23.2. Uploading Data Files

Also, you can create a local directory here based on your requirements are explained as –

24. Creating Local Directory For Process

Step -9:

Let’s run the code.

Please find the following snippet in PySpark for our test –

1. DBFromFile.py (This script will call the Bricks script & process the data to create an SQL like a table for our task.)

###########################################
#### Written By: SATYAKI DE        ########
#### Written On: 10-Feb-2019       ########
####                               ########
#### Objective: Pyspark File to    ########
#### parse the uploaded csv file.  ########
###########################################

# File location and type
file_location = "/FileStore/tables/src_file/customer_addr_20180112.csv"
file_type = "csv"

# CSV options
infer_schema = "false"
first_row_is_header = "true"
delimiter = ","

# The applied options are for CSV files. For other file types, these will be ignored.
df = spark.read.format(file_type) \
  .option("inferSchema", infer_schema) \
  .option("header", first_row_is_header) \
  .option("sep", delimiter) \
  .load(file_location)

display(df)

# Create a view or table

temp_table_name = "customer_addr_20180112_csv"

df.createOrReplaceTempView(temp_table_name)

%sql

/* Query the created temp table in a SQL cell */

select * from `customer_addr_20180112_csv`

From the above sample snippet, one can see that the application is trying to parse the source data by providing all the parsing details & then use that csv as a table in SQL.

Let’s check step by step execution.

25. Working With Uploaded File

So, until this step, you can see that the application has successfully parsed the csv data.

And, finally, you can view the data –

25.1. Second Option

As the highlighted blue box shows that the application is using this csv file as a table. So, you have many options to analyze the information flexibly if you are familiar with SQL.

After your job run, make sure you terminate your cluster. Otherwise, you’ll receive a large & expensive usage bill, which you might not want!

So, finally, we’ve done it.

Let me know what do you think.

Till then, Happy Avenging! 😀

Note: All the data posted here are representational data & available over the internet & for educational purpose only.

Building an Azure Function using Python (Crossover between Reality Stone & Time Stone in Python Verse)

Hi Guys!

Today, we’ll be discussing a preview features from Microsoft Azure. Building an Azure function using Python on it’s Linux/Ubuntu VM. Since this is a preview feature, we cannot implement this to production till now. However, my example definitely has more detailed steps & complete code guide compared to whatever available over the internet.

In this post, I will take one of my old posts & enhance it as per this post. Hence, I’ll post those modified scripts. However, I won’t discuss the logic in details as most of these scripts have cosmetic changes to cater to this requirement.

In this post, we’ll only show Ubuntu run & there won’t be Windows or MAC comparison.

Initial Environment Preparation:

  1. Set-up new virtual machine on Azure.
  2. Set-up Azure function environments on that server.

Set-up new virtual machine on Azure:

I’m not going into the details of how to create Ubuntu VM on Microsoft Azure. You can refer the steps in more information here.

After successful creation, the VM will look like this –

Azure VM - Ubuntu

Detailed information you can get after clicking this hyperlink over the name of the VM.

Azure-VM Basic Details

You have to open port 7071 for application testing from the local using postman.

You can get it from the network option under VM as follows –

Network-Configuration

Make sure that you are restricting these ports to specific network & not open to ALL traffic.

So, your VM is ready now.

To update Azure CLI, you need to use the following commands –

sudo apt-get update && sudo apt-get install –only-upgrade -y azure-cli

Set-up Azure function environments on that server:

To set-up the environment, you don’t have to go for Python installation as by default Ubuntu in Microsoft Azure comes up with desired Python version, i.e., Python3.6. However, to run the python application, you need to install the following app –

  1. Microsoft SDK. You will get the details from this link.
  2. Installing node-js. You will get the details from this link.
  3. You need to install a docker. However, as per Microsoft official version, this is not required. But, you can create a Docker container to distribute the python function in Azure application. I would say you can install this just in case if you want to continue with this approach. You will get the details over here. If you want to know details about the Docker. And, how you want to integrate python application. You can refer to this link.
  4. Your desired python packages. In this case, we’ll be modifying this post – “Encryption/Decryption, JSON, API, Flask Framework in Python (Crossover between Reality Stone & Time Stone in Python Verse).” We’ll be modifying a couple of lines only to cater to this functionality & deploying the same as an Azure function.
  5. Creating an Azure function template on Ubuntu. The essential detail you’ll get it from here. However, over there, it was not shown in detailed steps of python packages & how you can add all the dependencies to publish it in details. It was an excellent post to start-up your knowledge.

Let’s see these components status & very brief details –

Microsoft SDK:

To check the dot net version. You need to type the following commands in Ubuntu –

dotnet –info

And, the output will look like this –

DotNet-Version

Node-Js:

Following is the way to verify your node-js version & details –

node -v

npm -v

And, the output looks like this –

Node-Js

Docker:

Following is the way to test your docker version –

docker -v

And, the output will look like this –

Docker-Version

Python Packages:

Following are the python packages that we need to run & publish that in Azure cloud as an Azure function –

pip freeze | grep -v “pkg-resources” > requirements.txt

And, the output is –

Requirements

You must be wondered that why have I used this grep commands here. I’ve witnessed that on many occassion in Microsoft Azure’s Linux VM it produces one broken package called resource=0.0.0, which will terminate the deployment process. Hence, this is very crucial to eliminate those broken packages.

Now, we’re ready for our python scripts. But, before that, let’s see the directory structure over here –

Win_Vs_Ubuntu-Cloud

Creating an Azure Function Template on Ubuntu: 

Before we post our python scripts, we’ll create these following components, which is essential for our Python-based Azure function –

  • Creating a group:

              Creating a group either through Azure CLI or using a docker, you can proceed. The commands for Azure CLI is as follows –

az group create –name “rndWestUSGrp” –location westus

It is advisable to use double quotes for parameters value. Otherwise, you might land-up getting the following error – “Error: “resourceGroupName” should satisfy the constraint – “Pattern”: /^[-w._]+$/“.

I’m sure. You don’t want to face that again. And, here is the output –

CreateDeploymentGroup

Note that, here I haven’t used the double-quotes. But, to avoid any unforeseen issues – you should use double-quotes. You can refer the docker command from the above link, which I’ve shared earlier.

Now, you need to create one storage account where the metadata information of your function will be stored. You will create that as follows –

az storage account create –name cryptpy2019 –location westus –resource-group rndWestUSGrp –sku Standard_LRS

And, the output will look like this –

AccountCreate_1

Great. Now, we’ll create a virtual environment for Python3.6.

python3.6 -m venv .env
source .env/bin/activate

Python-VM

Now, we’ll create a local function project.

func init encPro

And, the output you will get is as follows –

Local-Function

Inside this directory, you’ll see the following files –

Local-Function-Details

You need to edit the host.json with these default lines –

{
 “version”: “2.0”,
 “extensionBundle”: {
                                       “id”: “Microsoft.Azure.Functions.ExtensionBundle”,
                                       “version”: “[1.*, 2.0.0)”
                                     }
}

And, the final content of these two files (excluding the requirements.txt) will look like this –

Configuration

Finally, we’ll create the template function by this following command –

func new

This will follow with steps finish it. You need to choose Python as your programing language. You need to choose an HTTP trigger template. Once you created that successfully, you’ll see the following files –

func_New

Note that, our initial function name is -> getVal.

By default, Azure will generate some default code inside the __init__.py. The details of those two files can be found here.

Since we’re ready with our environment setup. We can now discuss our Python scripts –

1. clsConfigServer.py (This script contains all the parameters of the server.)

###########################################
#### Written By: SATYAKI DE        ########
#### Written On: 10-Feb-2019       ########
####                               ########
#### Objective: Parameter File     ########
###########################################

import os
import platform as pl

# Checking with O/S system
os_det = pl.system()

class clsConfigServer(object):
    Curr_Path = os.path.dirname(os.path.realpath(__file__))

    if os_det == "Windows":
        config = {
            'FILE': 'acct_addr_20180112.csv',
            'SRC_FILE_PATH': Curr_Path + '\\' + 'src_file\\',
            'PROFILE_FILE_PATH': Curr_Path + '\\' + 'profile\\',
            'HOST_IP_ADDR': '0.0.0.0',
            'DEF_SALT': 'iooquzKtqLwUwXG3rModqj_fIl409vemWg9PekcKh2o=',
            'ACCT_NBR_SALT': 'iooquzKtqLwUwXG3rModqj_fIlpp1vemWg9PekcKh2o=',
            'NAME_SALT': 'iooquzKtqLwUwXG3rModqj_fIlpp1026Wg9PekcKh2o=',
            'PHONE_SALT': 'iooquzKtqLwUwXG3rMM0F5_fIlpp1026Wg9PekcKh2o=',
            'EMAIL_SALT': 'iooquzKtqLwU0653rMM0F5_fIlpp1026Wg9PekcKh2o='
        }
    else:
        config = {
            'FILE': 'acct_addr_20180112.csv',
            'SRC_FILE_PATH': Curr_Path + '/' + 'src_file/',
            'PROFILE_FILE_PATH': Curr_Path + '/' + 'profile/',
            'HOST_IP_ADDR': '0.0.0.0',
            'DEF_SALT': 'iooquzKtqLwUwXG3rModqj_fIl409vemWg9PekcKh2o=',
            'ACCT_NBR_SALT': 'iooquzKtqLwUwXG3rModqj_fIlpp1vemWg9PekcKh2o=',
            'NAME_SALT': 'iooquzKtqLwUwXG3rModqj_fIlpp1026Wg9PekcKh2o=',
            'PHONE_SALT': 'iooquzKtqLwUwXG3rMM0F5_fIlpp1026Wg9PekcKh2o=',
            'EMAIL_SALT': 'iooquzKtqLwU0653rMM0F5_fIlpp1026Wg9PekcKh2o='
        }

2. clsEnDec.py (This script is a lighter version of encryption & decryption of our previously discussed scenario. Hence, we won’t discuss in details. You can refer my earlier post to understand the logic of this script.)

###########################################
#### Written By: SATYAKI DE        ########
#### Written On: 25-Jan-2019       ########
#### Package Cryptography needs to ########
#### install in order to run this  ########
#### script.                       ########
####                               ########
#### Objective: This script will   ########
#### encrypt/decrypt based on the  ########
#### hidden supplied salt value.   ########
###########################################

from cryptography.fernet import Fernet
import logging

from getVal.clsConfigServer import clsConfigServer as csf

class clsEnDec(object):

    def __init__(self):
        # Calculating Key
        self.token = str(csf.config['DEF_SALT'])

    def encrypt_str(self, data, token):
        try:
            # Capturing the Salt Information
            t1 = self.token
            t2 = token

            if t2 == '':
                salt = t1
            else:
                salt = t2

            logging.info("Encrypting the value!")

            # Checking Individual Types inside the Dataframe
            cipher = Fernet(salt)
            encr_val = str(cipher.encrypt(bytes(data,'utf8'))).replace("b'","").replace("'","")

            strV1 = "Encrypted value:: " + str(encr_val)
            logging.info(strV1)

            return encr_val

        except Exception as e:
            x = str(e)
            print(x)
            encr_val = ''

            return encr_val

    def decrypt_str(self, data, token):
        try:
            # Capturing the Salt Information
            t1 = self.token
            t2 = token

            if t2 == '':
                salt = t1
            else:
                salt = t2

            logging.info("Decrypting the value!")

            # Checking Individual Types inside the Dataframe
            cipher = Fernet(salt)
            decr_val = str(cipher.decrypt(bytes(data,'utf8'))).replace("b'","").replace("'","")

            strV2 = "Decrypted value:: " + str(decr_val)
            logging.info(strV2)

            return decr_val

        except Exception as e:
            x = str(e)
            print(x)
            decr_val = ''

            return decr_val

3. clsFlask.py (This is the main server script that will the encrypt/decrypt class from our previous scenario. This script will capture the requested JSON from the client, who posted from the clients like another python script or third-party tools like Postman.)

###########################################
#### Written By: SATYAKI DE            ####
#### Written On: 25-Jan-2019           ####
#### Package Flask package needs to    ####
#### install in order to run this      ####
#### script.                           ####
####                                   ####
#### Objective: This script will       ####
#### encrypt/decrypt based on the      ####
#### supplied salt value. Also,        ####
#### this will capture the individual  ####
#### element & stored them into JSON   ####
#### variables using flask framework.  ####
###########################################

from getVal.clsConfigServer import clsConfigServer as csf
from getVal.clsEnDec import clsEnDecAuth

getVal = clsEnDec()

import logging

class clsFlask(object):
    def __init__(self):
        self.xtoken = str(csf.config['DEF_SALT'])

    def getEncryptProcess(self, dGroup, input_data, dTemplate):
        try:
            # It is sending default salt value
            xtoken = self.xtoken

            # Capturing the individual element
            dGroup = dGroup
            input_data = input_data
            dTemplate = dTemplate

            # This will check the mandatory json elements
            if ((dGroup != '') & (dTemplate != '')):

                # Based on the Group & Element it will fetch the salt
                # Based on the specific salt it will encrypt the data
                if ((dGroup == 'GrDet') & (dTemplate == 'subGrAcct_Nbr')):
                    xtoken = str(csf.config['ACCT_NBR_SALT'])

                    strV1 = "xtoken: " + str(xtoken)
                    logging.info(strV1)
                    strV2 = "Flask Input Data: " + str(input_data)
                    logging.info(strV2)

                    #x = cen.clsEnDecAuth()
                    ret_val = getVal.encrypt_str(input_data, xtoken)
                elif ((dGroup == 'GrDet') & (dTemplate == 'subGrName')):
                    xtoken = str(csf.config['NAME_SALT'])

                    strV1 = "xtoken: " + str(xtoken)
                    logging.info(strV1)
                    strV2 = "Flask Input Data: " + str(input_data)
                    logging.info(strV2)

                    #x = cen.clsEnDecAuth()
                    ret_val = getVal.encrypt_str(input_data, xtoken)
                elif ((dGroup == 'GrDet') & (dTemplate == 'subGrPhone')):
                    xtoken = str(csf.config['PHONE_SALT'])

                    strV1 = "xtoken: " + str(xtoken)
                    logging.info(strV1)
                    strV2 = "Flask Input Data: " + str(input_data)
                    logging.info(strV2)

                    #x = cen.clsEnDecAuth()
                    ret_val = getVal.encrypt_str(input_data, xtoken)
                elif ((dGroup == 'GrDet') & (dTemplate == 'subGrEmail')):
                    xtoken = str(csf.config['EMAIL_SALT'])

                    strV1 = "xtoken: " + str(xtoken)
                    logging.info(strV1)
                    strV2 = "Flask Input Data: " + str(input_data)
                    logging.info(strV2)

                    #x = cen.clsEnDecAuth()
                    ret_val = getVal.encrypt_str(input_data, xtoken)
                else:
                    ret_val = ''
            else:
                ret_val = ''

            # Return value
            return ret_val

        except Exception as e:
            ret_val = ''
            # Return the valid json Error Response
            return ret_val

    def getDecryptProcess(self, dGroup, input_data, dTemplate):
        try:
            xtoken = self.xtoken

            # Capturing the individual element
            dGroup = dGroup
            input_data = input_data
            dTemplate = dTemplate

            # This will check the mandatory json elements
            if ((dGroup != '') & (dTemplate != '')):

                # Based on the Group & Element it will fetch the salt
                # Based on the specific salt it will decrypt the data
                if ((dGroup == 'GrDet') & (dTemplate == 'subGrAcct_Nbr')):
                    xtoken = str(csf.config['ACCT_NBR_SALT'])

                    strV1 = "xtoken: " + str(xtoken)
                    logging.info(strV1)
                    strV2 = "Flask Input Data: " + str(input_data)
                    logging.info(strV2)

                    #x = cen.clsEnDecAuth()
                    ret_val = getVal.decrypt_str(input_data, xtoken)
                elif ((dGroup == 'GrDet') & (dTemplate == 'subGrName')):
                    xtoken = str(csf.config['NAME_SALT'])

                    strV1 = "xtoken: " + str(xtoken)
                    logging.info(strV1)
                    strV2 = "Flask Input Data: " + str(input_data)
                    logging.info(strV2)

                    #x = cen.clsEnDecAuth()
                    ret_val = getVal.decrypt_str(input_data, xtoken)
                elif ((dGroup == 'GrDet') & (dTemplate == 'subGrPhone')):
                    xtoken = str(csf.config['PHONE_SALT'])

                    strV1 = "xtoken: " + str(xtoken)
                    logging.info(strV1)
                    strV2 = "Flask Input Data: " + str(input_data)
                    logging.info(strV2)

                    #x = cen.clsEnDecAuth()
                    ret_val = getVal.decrypt_str(input_data, xtoken)
                elif ((dGroup == 'GrDet') & (dTemplate == 'subGrEmail')):
                    xtoken = str(csf.config['EMAIL_SALT'])

                    strV1 = "xtoken: " + str(xtoken)
                    logging.info(strV1)
                    strV2 = "Flask Input Data: " + str(input_data)
                    logging.info(strV2)

                    #x = cen.clsEnDecAuth()
                    ret_val = getVal.decrypt_str(input_data, xtoken)
                else:
                    ret_val = ''
            else:
                ret_val = ''

            # Return value
            return ret_val

        except Exception as e:
            ret_val = ''
            # Return the valid Error Response
            return ret_val

4. __init__.py (This autogenerated script contains the primary calling methods of encryption & decryption based on the element header & values after enhanced as per the functionality.)

###########################################
#### Written By: SATYAKI DE            ####
#### Written On: 08-Jun-2019           ####
#### Package Flask package needs to    ####
#### install in order to run this      ####
#### script.                           ####
####                                   ####
#### Objective: Main Calling scripts.  ####
#### This is an autogenrate scripts.   ####
#### However, to meet the functionality####
#### we've enhanced as per our logic.  ####
###########################################
__all__ = ['clsFlask']

import logging
import azure.functions as func
import json

from getVal.clsFlask import clsFlask

getVal = clsFlask()

def main(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Python Encryption function processed a request.')

    str_val = 'Input Payload:: ' + str(req.get_json())
    str_1 = str(req.get_json())

    logging.info(str_val)

    ret_val = {}
    DataIn = ''
    dGroup = ''
    dTemplate = ''
    flg = ''

    if (str_1 != ''):
        try:
            req_body = req.get_json()
            dGroup = req_body.get('dataGroup')

            try:
                DataIn = req_body.get('data')
                strV15 = 'If Part:: ' + str(DataIn)

                logging.info(strV15)

                if ((DataIn == '') | (DataIn == None)):
                    raise ValueError

                flg = 'Y'
            except ValueError:
                DataIn = req_body.get('edata')
                strV15 = 'Else Part:: ' + str(DataIn)
                logging.info(strV15)
                flg = 'N'
            except:
                DataIn = req_body.get('edata')
                strV15 = 'Else Part:: ' + str(DataIn)
                logging.info(strV15)
                flg = 'N'

            dTemplate = req_body.get('dataTemplate')

        except ValueError:
            pass

    strV5 = "Encrypt Decrypt Flag:: " + flg
    logging.info(strV5)

    if (flg == 'Y'):

        if ((DataIn != '') & ((dGroup != '') & (dTemplate != ''))):

            logging.info("Encryption Started!")
            ret_val = getVal.getEncryptProcess(dGroup, DataIn, dTemplate)
            strVal2 = 'Return Payload:: ' + str(ret_val)
            logging.info(strVal2)

            xval = json.dumps(ret_val)

            return func.HttpResponse(xval)
        else:
            return func.HttpResponse(
                 "Please pass a data in the request body",
                 status_code=400
            )
    else:

        if ((DataIn != '') & ((dGroup != '') & (dTemplate != ''))):

            logging.info("Decryption Started!")
            ret_val2 = getVal.getDecryptProcess(dGroup, DataIn, dTemplate)
            strVal3 = 'Return Payload:: ' + str(ret_val)
            logging.info(strVal3)

            xval1 = json.dumps(ret_val2)

            return func.HttpResponse(xval1)
        else:
            return func.HttpResponse(
                "Please pass a data in the request body",
                status_code=400
            )

In this script, based on the value of an flg variable, we’re calling our encryption or decryption methods. And, the value of the flg variable is set based on the following logic –

try:
    DataIn = req_body.get('data')
    strV15 = 'If Part:: ' + str(DataIn)

    logging.info(strV15)

    if ((DataIn == '') | (DataIn == None)):
        raise ValueError

    flg = 'Y'
except ValueError:
    DataIn = req_body.get('edata')
    strV15 = 'Else Part:: ' + str(DataIn)
    logging.info(strV15)
    flg = 'N'
except:
    DataIn = req_body.get('edata')
    strV15 = 'Else Part:: ' + str(DataIn)
    logging.info(strV15)
    flg = 'N'

So, if the application gets the “data” element then – it will consider the data needs to be encrypted; otherwise, it will go for decryption. And, based on that – it is setting the value.

Now, we’re ready to locally run our application –

func host start

And, the output will look like this –

StartingAzureFunction-Python
StartingAzureFunction-Python 2

Let’s test it from postman –

Encrypt:

Postman-Encrypt

Decrypt:

Postman-Decrypt

Great. Now, we’re ready to publish this application to Azure cloud.

As in our earlier steps, we’ve already built our storage account for the metadata. Please scroll to top to view that again. Now, using that information, we’ll make the function app with a more meaningful name –

az functionapp create –resource-group rndWestUSGrp –os-type Linux \
–consumption-plan-location westus –runtime python \
–name getEncryptDecrypt –storage-account cryptpy2019

CreatingFunctionPython

Let’s publish the function –

sudo func azure functionapp publish “getEncryptDecrypt” –build-native-deps

On many occassion, without the use of “–build-native-deps” might leads to failure. Hence, I’ve added that to avoid such scenarios.

Publishing-Function

Now, we need to test our first published complex Azure function with Python through postman –

Encrypt:

PubishedFuncPostmanEncrypt

Decrypt:

PubishedFuncPostmanDecrypt

Wonderful! So, it is working.

You can see the function under the Azure portal –

Deployed-Function

Let’s see some other important features of this function –

Monitor: You can monitor two ways. One is by clicking the monitor options you will get the individual requests level details & also get to see the log information over here –

Function-Monitor-Details-1

Clicking Application Insights will give you another level of detailed logs, which can be very useful for debugging. We’ll touch this at the end of this post with a very brief discussion.

Function-Monitor-Details-3.JPG

As you can see, clicking individual lines will show the details further.

Let’s quickly check the application insights –

Application-Insights-1

Application Insights will give you a SQL like an interface where you can get the log details of all your requests.

Application-Insights-2

You can expand the individual details for further information.

Application-Insights-3

You can change the parameter name & other details & click the run button to get all the log details for your debugging purpose.

So, finally, we’ve achieved our goal. This is relatively long posts. But, I’m sure this will help you to create your first python-based function on the Azure platform.

Hope, you will like this approach. Let me know your comment on the same.

I’ll bring some more exciting topic in the coming days from the Python verse.

Till then, Happy Avenging! 😀

Note: All the data posted here are representational data & available over the internet.

Combining the NoSQL(Cosmos DB) & traditional Azure RDBMS in Azure (Time stone solo from Python verse)

Hi Guys!

Today, our main objective is to extend our last post & blending two different kinds of data using Python.

Please refer the earlier post if you didn’t go through it – “Building Azure cosmos application.“.

What is the Objective?

In this post, our objective is to combine traditional RDBMS from the cloud with Azure’s NO SQL, which is, in this case, is Cosmos DB. And, try to forecast some kind of blended information, which can be aggregated further.

Examining Source Data.

No SQL Data from Cosmos:

Let’s check one more time the No SQL data created in our last post.

CosmosData

Total, we’ve created 6 records in our last post.

As you can see in red marked areas. From item, one can check the total number of records created. You can also filter out specific record using the Edit Filter blue color button highlighted with blue box & you need to provide the “WHERE CLAUSE” inside it.

Azure SQL DB:

Let’s create some data in Azure SQL DB.

But, before that, you need to create SQL DB in the Azure cloud. Here is the official Microsoft link to create DB in Azure. You can refer to it here.

I won’t discuss the detailed steps of creating DB here.

From Azure portal, it looks like –

Azure SQL DB Main Screen

Let’s see how the data looks like in Azure DB. For our case, we’ll be using the hrMaster DB.

Let’s create the table & some sample data aligned as per our cosmos data.

Azure SQL DB

We will join both the data based on subscriberId & then extract our required columns in our final output.

CombinedData

Good. Now, we’re ready for python scripts.

Python Scripts:

In this installment, we’ll be reusing the following python scripts, which is already discussed in my earlier post –

  • clsL.py
  • clsColMgmt.py
  • clsCosmosDBDet.py

So, I’m not going to discuss these scripts.

Before we discuss our scripts, let’s look out the directory structures –

Win_Vs_MAC

Here is the detailed directory structure between the Windows & MAC O/S.

1. clsConfig.py (This script will create the split csv files or final merge file after the corresponding process. However, this can be used as usual verbose debug logging as well. Hence, the name comes into the picture.)

##############################################
#### Written By: SATYAKI DE               ####
#### Written On: 25-May-2019              ####
#### Updated On: 02-Jun-2019              ####
####                                      ####
#### Objective: This script is a config   ####
#### file, contains all the keys for      ####
#### azure cosmos db. Application will    ####
#### process these information & perform  ####
#### various CRUD operation on Cosmos DB. ####
##############################################

import os
import platform as pl

class clsConfig(object):
    Curr_Path = os.path.dirname(os.path.realpath(__file__))
    db_name = 'rnd-de01-usw2-vfa-cdb'
    db_link = 'dbs/' + db_name
    CONTAINER1 = "RealtimeEmail"
    CONTAINER2 = "RealtimeTwitterFeedback"
    CONTAINER3 = "RealtimeHR"

    os_det = pl.system()
    if os_det == "Windows":
        sep = '\\'
    else:
        sep = '/'

    config = {
        'SERVER': 'xxxx-xxx.database.windows.net',
        'DATABASE_1': 'SalesForceMaster',
        'DATABASE_2': 'hrMaster',
        'DATABASE_3': 'statMaster',
        'USERNAME': 'admin_poc_dev',
        'PASSWORD': 'xxxxx',
        'DRIVER': '{ODBC Driver 17 for SQL Server}',
        'ENV': 'pocdev-saty',
        'ENCRYPT_FLAG': "yes",
        'TRUST_FLAG': "no",
        'TIMEOUT_LIMIT': "30",
        'PROCSTAT': "'Y'",
        'APP_ID': 1,
        'EMAIL_SRC_JSON_FILE': Curr_Path + sep + 'src_file' + sep + 'srcEmail.json',
        'TWITTER_SRC_JSON_FILE': Curr_Path + sep + 'src_file' + sep + 'srcTwitter.json',
        'HR_SRC_JSON_FILE': Curr_Path + sep + 'src_file' + sep + 'srcHR.json',
        'COSMOSDB_ENDPOINT': 'https://rnd-de01-usw2-vfa-cdb.documents.azure.com:443/',
        'CONFIG_TABLE': 'ETL_CONFIG_TAB',
        'COSMOS_PRIMARYKEY': "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXIsI00AxKXXXXXgg==",
        'ARCH_DIR': Curr_Path + sep + 'arch' + sep,
        'COSMOSDB': db_name,
        'COSMOS_CONTAINER1': CONTAINER1,
        'COSMOS_CONTAINER2': CONTAINER2,
        'COSMOS_CONTAINER3': CONTAINER3,
        'CONFIG_ORIG': 'Config_orig.csv',
        'ENCRYPT_CSV': 'Encrypt_Config.csv',
        'DECRYPT_CSV': 'Decrypt_Config.csv',
        'PROFILE_PATH': Curr_Path + sep + 'profile' + sep,
        'LOG_PATH': Curr_Path + sep + 'log' + sep,
        'REPORT_PATH': Curr_Path + sep + 'report',
        'APP_DESC_1': 'Feedback Communication',
        'DEBUG_IND': 'N',
        'INIT_PATH': Curr_Path,
        'SQL_QRY_1': "SELECT c.subscriberId, c.sender, c.orderNo, c.orderDate, c.items.orderQty  FROM RealtimeEmail c",
        'SQL_QRY_2': "SELECT c.twitterId, c.Twit, c.DateCreated, c.Country FROM RealtimeTwitterFeedback c WHERE c.twitterId=@CrVal",
        'DB_QRY': "SELECT * FROM c",
        'AZURE_SQL_1': "SELECT DISTINCT subscriberId, state, country, annualIncome, customerType FROM dbo.onboardCustomer",
        'COLLECTION_QRY': "SELECT * FROM r",
        'database_link': db_link,
        'collection_link_1': db_link + '/colls/' + CONTAINER1,
        'collection_link_2': db_link + '/colls/' + CONTAINER2,
        'collection_link_3': db_link + '/colls/' + CONTAINER3,
        'options': {
            'offerThroughput': 1000,
            'enableCrossPartitionQuery': True,
            'maxItemCount': 2
        }
    }

Here, we’ve added a couple of more entries compared to the last time, which points the detailed configuration for Azure SQL DB.

‘SERVER’: ‘xxxx-xxx.database.windows.net’,
‘DATABASE_1’: ‘SalesForceMaster’,
‘DATABASE_2’: ‘hrMaster’,
‘DATABASE_3’: ‘statMaster’,
‘USERNAME’: ‘admin_poc_dev’,
‘PASSWORD’: ‘xxxxx’,
‘DRIVER’: ‘{ODBC Driver 17 for SQL Server}’,
‘ENV’: ‘pocdev-saty’,
‘ENCRYPT_FLAG’: “yes”,
‘TRUST_FLAG’: “no”,
‘TIMEOUT_LIMIT’: “30”,
‘PROCSTAT’: “‘Y'”, 

Here, you need to supply your DB credentials accordingly.

2. clsDBLookup.py (This script will look into the Azure SQL DB & fetch data from the traditional RDBMS of Azure environment.)

#####################################################
#### Written By: SATYAKI DE                      ####
#### Written On: 25-May-2019                     ####
####                                             ####
#### Objective: This script will check &         ####
#### test the connection with the Azure          ####
#### SQL DB & it will fetch all the records      ####
#### name resied under the same DB of a table.   ####
#####################################################

import pyodbc as py
import pandas as p
from clsConfig import clsConfig as cdc

class clsDBLookup(object):
    def __init__(self, lkpTableName = ''):
        self.server = cdc.config['SERVER']
        self.database = cdc.config['DATABASE_1']
        self.database1 = cdc.config['DATABASE_2']
        self.database2 = cdc.config['DATABASE_3']
        self.username = cdc.config['USERNAME']
        self.password = cdc.config['PASSWORD']
        self.driver = cdc.config['DRIVER']
        self.env = cdc.config['ENV']
        self.encrypt_flg = cdc.config['ENCRYPT_FLAG']
        self.trust_flg = cdc.config['TRUST_FLAG']
        self.timeout_limit = cdc.config['TIMEOUT_LIMIT']
        self.lkpTableName = cdc.config['CONFIG_TABLE']
        self.ProcStat = cdc.config['PROCSTAT']
        self.AppId = cdc.config['APP_ID']

    def LookUpData(self):
        try:
            # Assigning all the required values
            server = self.server
            database = self.database1
            username = self.username
            password = self.password
            driver = self.driver
            env = self.env
            encrypt_flg = self.encrypt_flg
            trust_flg = self.trust_flg
            timout_limit = self.timeout_limit
            lkpTableName = self.lkpTableName
            ProcStat = self.ProcStat
            AppId = self.AppId

            # Creating secure connection
            str_conn = 'Driver=' + driver + ';Server=tcp:' + server + ',1433;' \
                       'Database=' + database + ';Uid=' + username + '@' + env + ';' \
                       'Pwd=' + password + ';Encrypt=' + encrypt_flg + ';' \
                       'TrustServerCertificate=' + trust_flg + ';Connection Timeout=' + timout_limit + ';'

            db_con_azure = py.connect(str_conn)

            query = " SELECT [ruleId] as ruleId, [ruleName] as ruleName, [ruleSQL] as ruleSQL, " \
                    " [ruleFlag] as ruleFlag, [appId] as appId, [DBType] as DBType, " \
                    " [DBName] as DBName FROM [dbo][" + lkpTableName + "] WHERE ruleFLag = " + ProcStat + " " \
                    " and appId = " + AppId + " ORDER BY ruleId "

            df = p.read_sql(query, db_con_azure)

            # Closing the connection
            db_con_azure.close()

            return df
        except Exception as e:
            x = str(e)
            print(x)
            df = p.DataFrame()

            return df

    def azure_sqldb_read(self, sql):
        try:
            # Assigning all the required values
            server = self.server
            database = self.database1
            username = self.username
            password = self.password
            driver = self.driver
            env = self.env
            encrypt_flg = self.encrypt_flg
            trust_flg = self.trust_flg
            timout_limit = self.timeout_limit
            lkpTableName = self.lkpTableName
            ProcStat = self.ProcStat
            AppId = self.AppId

            # Creating secure connection
            str_conn = 'Driver=' + driver + ';Server=tcp:' + server + ',1433;' \
                       'Database=' + database + ';Uid=' + username + '@' + env + ';' \
                       'Pwd=' + password + ';Encrypt=' + encrypt_flg + ';' \
                       'TrustServerCertificate=' + trust_flg + ';Connection Timeout=' + timout_limit + ';'

            # print("Connection Details:: ", str_conn)
            db_con_azure = py.connect(str_conn)

            query = sql

            df = p.read_sql(query, db_con_azure)

            # Closing the connection
            db_con_azure.close()

            return df
        except Exception as e:
            x = str(e)
            print(x)
            df = p.DataFrame()

            return df

Major lines to discuss –

azure_sqldb_read(self, sql):

Getting the source SQL supplied from the configuration script.

db_con_azure = py.connect(str_conn)

query = sql

df = p.read_sql(query, db_con_azure)

After creating a successful connection, our application will read the SQL & fetch the data & store that into a pandas dataframe and return the output to the primary calling function.

3. callCosmosAPI.py (This is the main script, which will call all the methods to blend the data. Hence, the name comes into the picture.)

##############################################
#### Written By: SATYAKI DE               ####
#### Written On: 25-May-2019              ####
#### Modified On 02-Jun-2019              ####
####                                      ####
#### Objective: Main calling scripts.     ####
##############################################

import clsColMgmt as cm
import clsCosmosDBDet as cmdb
from clsConfig import clsConfig as cf
import pandas as p
import clsLog as cl
import logging
import datetime
import json
import clsDBLookup as dbcon

# Disbling Warning
def warn(*args, **kwargs):
    pass

import warnings
warnings.warn = warn

def getDate(row):
    try:
        d1 = row['orderDate']
        d1_str = str(d1)
        d1_dt_part, sec = d1_str.split('.')
        dt_part1 = d1_dt_part.replace('T', ' ')

        return dt_part1
    except Exception as e:
        x = str(e)
        print(x)
        dt_part1 = ''

        return dt_part1

# Lookup functions from
# Azure cloud SQL DB

var = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")

def main():
    try:
        df_ret = p.DataFrame()
        df_ret_2 = p.DataFrame()
        df_ret_2_Mod = p.DataFrame()

        debug_ind = 'Y'

        # Initiating Log Class
        l = cl.clsLog()

        general_log_path = str(cf.config['LOG_PATH'])

        # Enabling Logging Info
        logging.basicConfig(filename=general_log_path + 'consolidated.log', level=logging.INFO)

        # Moving previous day log files to archive directory
        arch_dir = cf.config['ARCH_DIR']
        log_dir = cf.config['LOG_PATH']

        print("Archive Directory:: ", arch_dir)
        print("Log Directory::", log_dir)

        print("*" * 157)
        print("Testing COSMOS DB Connection!")
        print("*" * 157)

        # Checking Cosmos DB Azure
        y = cmdb.clsCosmosDBDet()
        ret_val = y.test_db_con()

        if ret_val == 0:
            print()
            print("Cosmos DB Connection Successful!")
            print("*" * 157)
        else:
            print()
            print("Cosmos DB Connection Failure!")
            print("*" * 157)
            raise Exception

        print("*" * 157)

        # Accessing from Azure SQL DB
        x1 = dbcon.clsDBLookup()
        act_df = x1.azure_sqldb_read(cf.config['AZURE_SQL_1'])

        print("Azure SQL DB::")
        print(act_df)
        print()

        print("-" * 157)

        # Calling the function 1
        print("RealtimeEmail::")

        # Fetching First collection data to dataframe
        print("Fethcing Comos Collection Data!")

        sql_qry_1 = cf.config['SQL_QRY_1']
        msg = "Documents generatd based on unique key"
        collection_flg = 1

        x = cm.clsColMgmt()
        df_ret = x.fetch_data(sql_qry_1, msg, collection_flg)

        l.logr('1.EmailFeedback_' + var + '.csv', debug_ind, df_ret, 'log')
        print('RealtimeEmail Data::')
        print(df_ret)
        print()

        # Checking execution status
        ret_val = int(df_ret.shape[0])

        if ret_val == 0:
            print("Cosmos DB Hans't returned any rows. Please check your queries!")
            print("*" * 157)
        else:
            print("Successfully fetched!")
            print("*" * 157)

        # Calling the 2nd Collection
        print("RealtimeTwitterFeedback::")

        # Fetching First collection data to dataframe
        print("Fethcing Cosmos Collection Data!")

        # Query using parameters
        sql_qry_2 = cf.config['SQL_QRY_2']
        msg_2 = "Documents generated based on RealtimeTwitterFeedback feed!"
        collection_flg = 2

        val = 'crazyGo'
        param_det = [{"name": "@CrVal", "value": val}]
        add_param = 2

        x1 = cm.clsColMgmt()
        df_ret_2 = x1.fetch_data(sql_qry_2, msg_2, collection_flg, add_param, param_det)

        l.logr('2.TwitterFeedback_' + var + '.csv', debug_ind, df_ret, 'log')
        print('Realtime Twitter Data:: ')
        print(df_ret_2)
        print()

        # Checking execution status
        ret_val_2 = int(df_ret_2.shape[0])

        if ret_val_2 == 0:
            print("Cosmos DB hasn't returned any rows. Please check your queries!")
            print("*" * 157)
        else:
            print("Successfuly row feteched!")
            print("*" * 157)

        # Merging NoSQL Data (Cosmos DB) with Relational DB (Azure SQL DB)
        df_Fin_temp = p.merge(df_ret, act_df, on='subscriberId', how='inner')

        df_fin = df_Fin_temp[['orderDate', 'orderNo', 'sender', 'state', 'country', 'customerType']]

        print("Initial Combined Data (From Cosmos & Azure SQL DB) :: ")
        print(df_fin)

        l.logr('3.InitCombine_' + var + '.csv', debug_ind, df_fin, 'log')

        # Transforming the orderDate as per standard format
        df_fin['orderDateM'] = df_fin.apply(lambda row: getDate(row), axis=1)

        # Dropping the old column & renaming the new column to old column
        df_fin.drop(columns=['orderDate'], inplace=True)
        df_fin.rename(columns={'orderDateM': 'orderDate'}, inplace=True)

        print("*" * 157)
        print()
        print("Final Combined & Transformed result:: ")
        print(df_fin)

        l.logr('4.Final_Combine_' + var + '.csv', debug_ind, df_fin, 'log')
        print("*" * 157)

    except ValueError:
        print("No relevant data to proceed!")

    except Exception as e:
        print("Top level Error: args:{0}, message{1}".format(e.args, e.message))

if __name__ == "__main__":
    main()

The key lines from this script –

def getDate(row):
    try:
        d1 = row['orderDate']
        d1_str = str(d1)
        d1_dt_part, sec = d1_str.split('.')
        dt_part1 = d1_dt_part.replace('T', ' ')

        return dt_part1
    except Exception as e:
        x = str(e)
        print(x)
        dt_part1 = ''

        return dt_part1

This function converts NoSQL date data type more familiar format.

NoSQL Date:
NoSQL_Date
Transformed Date:
Transformed Date
# Accessing from Azure SQL DB
x1 = dbcon.clsDBLookup()
act_df = x1.azure_sqldb_read(cf.config['AZURE_SQL_1'])

print("Azure SQL DB::")
print(act_df)
print()

Above lines are calling the Azure SQL DB method to retrieve the RDBMS data into our dataframe.

# Merging NoSQL Data (Cosmos DB) with Relational DB (Azure SQL DB)
df_Fin_temp = p.merge(df_ret, act_df, on='subscriberId', how='inner')

df_fin = df_Fin_temp[['orderDate', 'orderNo', 'sender', 'state', 'country', 'customerType']]

In these above lines, we’re joining the data retrieved from two different kinds of the database to prepare our initial combined dataframe. Also, we’ve picked only the desired column, which will be useful for us.

# Transforming the orderDate as per standard format
df_fin['orderDateM'] = df_fin.apply(lambda row: getDate(row), axis=1)

# Dropping the old column & renaming the new column to old column
df_fin.drop(columns=['orderDate'], inplace=True)
df_fin.rename(columns={'orderDateM': 'orderDate'}, inplace=True)

In the above lines, we’re transforming our date field, as shown above in one of our previous images by calling the getDate method.

Let’s see the directory structure of our program –

Win_Vs_MAC

Let’s see how it looks when it runs –

Windows:

Win_Run_1
Win_Run_2

MAC:

MAC_Run_1
MAC_Run_2

So, finally, we’ve successfully blended the data & make more meaningful data projection.

Following python packages are required to run this application –

pip install azure

pip install azure-cosmos

pip install pandas

pip install requests

pip install pyodbc

This application tested on Python3.7.1 & Python3.7.2 as well. As per Microsoft, their official supported version is Python3.5.

I hope you’ll like this effort.

Wait for the next installment. Till then, Happy Avenging. 😀

[Note: All the sample data are available/prepared in the public domain for research & study.]