This new series will evaluate the power of Stable Diffusion by building solutions from scratch as we progress & refine our prompts using Stable Diffusion & Python. This post opens new opportunities for IT companies & business start-ups looking to deliver solutions with better performance than the paid Stability AI API. This project is intended for advanced Python developers, data-science newbies exploring Stable Diffusion & AI evangelists.
In this series of posts, I'll focus on the Stable Diffusion API and a custom solution built with the Python-based SDK of Stable Diffusion.
But, before that, let us view the video that it generated from the prompt by using the third-party API:
And, let us understand the prompt that we supplied to create the above video –
Lighthouse on a cliff overlooking the ocean, dynamic ocean waves crashing against rocks, dramatic clouds moving across sky, photorealistic water movement, mist and ocean spray, wind-driven waves, atmospheric sky motion, natural fluid dynamics, realistic, detailed, 8k. Do not change the size & shape of the lighthouse & the field on top of which the Lighthouse built.
Isn’t it exciting?
However, I want to stress this point: the video generated by the Stable Diffusion (Stability AI) API was only able to partially apply the animation effect. Although the animation applies to the clouds, it doesn't apply to the waves. But, I must admit, the quality of the video is quite good.
Let us understand the code and how we run the solution; we can then compare its performance with the other solutions later in this series.
As you know, we're exploring the code base of the third-party API, which executes a series of API calls that create a video from the prompt.
CODE:
Let us understand some of the important snippets –
import os
import base64
import requests

# Module-level settings for the text-to-image call
api_host = 'https://api.stability.ai'
engine_id = 'stable-diffusion-v1-6'   # set this to the engine you want to use

class clsStabilityAIAPI:
    def __init__(self, STABLE_DIFF_API_KEY, OUT_DIR_PATH, FILE_NM, VID_FILE_NM):
        self.STABLE_DIFF_API_KEY = STABLE_DIFF_API_KEY
        self.OUT_DIR_PATH = OUT_DIR_PATH
        self.FILE_NM = FILE_NM
        self.VID_FILE_NM = VID_FILE_NM

    def delFile(self, fileName):
        try:
            # Deleting the intermediate image
            os.remove(fileName)
            return 0
        except Exception as e:
            x = str(e)
            print('Error: ', x)
            return 1

    def generateText2Image(self, inputDescription):
        try:
            STABLE_DIFF_API_KEY = self.STABLE_DIFF_API_KEY
            fullFileName = self.OUT_DIR_PATH + self.FILE_NM

            if STABLE_DIFF_API_KEY is None:
                raise Exception("Missing Stability API key.")

            response = requests.post(
                f"{api_host}/v1/generation/{engine_id}/text-to-image",
                headers={
                    "Content-Type": "application/json",
                    "Accept": "application/json",
                    "Authorization": f"Bearer {STABLE_DIFF_API_KEY}"
                },
                json={
                    "text_prompts": [{"text": inputDescription}],
                    "cfg_scale": 7,
                    "height": 1024,
                    "width": 576,
                    "samples": 1,
                    "steps": 30,
                },
            )

            if response.status_code != 200:
                raise Exception("Non-200 response: " + str(response.text))

            data = response.json()

            # With samples=1, this writes the single returned artifact
            for i, image in enumerate(data["artifacts"]):
                with open(fullFileName, "wb") as f:
                    f.write(base64.b64decode(image["base64"]))

            return fullFileName
        except Exception as e:
            x = str(e)
            print('Error: ', x)
            return 'N/A'

    def image2VideoPassOne(self, imgNameWithPath):
        try:
            STABLE_DIFF_API_KEY = self.STABLE_DIFF_API_KEY

            with open(imgNameWithPath, "rb") as imgFile:
                response = requests.post(
                    "https://api.stability.ai/v2beta/image-to-video",
                    headers={"authorization": f"Bearer {STABLE_DIFF_API_KEY}"},
                    files={"image": imgFile},
                    data={"seed": 0, "cfg_scale": 1.8, "motion_bucket_id": 127},
                )

            print('First Pass Response:')
            print(str(response.text))

            genID = response.json().get('id')
            return genID
        except Exception as e:
            x = str(e)
            print('Error: ', x)
            return 'N/A'

    def image2VideoPassTwo(self, genId):
        try:
            generation_id = genId
            STABLE_DIFF_API_KEY = self.STABLE_DIFF_API_KEY
            fullVideoFileName = self.OUT_DIR_PATH + self.VID_FILE_NM

            response = requests.get(
                f"https://api.stability.ai/v2beta/image-to-video/result/{generation_id}",
                headers={
                    'accept': "video/*",  # Use 'application/json' to receive base64 encoded JSON
                    'authorization': f"Bearer {STABLE_DIFF_API_KEY}"
                },
            )

            print('Retrieve Status Code: ', str(response.status_code))

            if response.status_code == 202:
                print("Generation in-progress, try again in 10 seconds.")
                return 5
            elif response.status_code == 200:
                print("Generation complete!")
                with open(fullVideoFileName, 'wb') as file:
                    file.write(response.content)
                print("Successfully Retrieved the video file!")
                return 0
            else:
                raise Exception(str(response.json()))
        except Exception as e:
            x = str(e)
            print('Error: ', x)
            return 1

Now, let us understand the code –
1. CLASS INSTANTIATION
This constructor is called when an object of the class is created. It initializes four properties:
- STABLE_DIFF_API_KEY: the API key for Stability AI services.
- OUT_DIR_PATH: the folder path where output files are saved.
- FILE_NM: the name of the generated image file.
- VID_FILE_NM: the name of the generated video file.
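For instance, instantiation could look like the sketch below. The key, folder, and file names are placeholders for illustration only, and the stand-in class here simply mirrors the constructor discussed above:

```python
class clsStabilityAIAPI:
    """Minimal stand-in mirroring the constructor from the post."""
    def __init__(self, STABLE_DIFF_API_KEY, OUT_DIR_PATH, FILE_NM, VID_FILE_NM):
        self.STABLE_DIFF_API_KEY = STABLE_DIFF_API_KEY   # Stability AI API key
        self.OUT_DIR_PATH = OUT_DIR_PATH                 # output folder
        self.FILE_NM = FILE_NM                           # intermediate image file name
        self.VID_FILE_NM = VID_FILE_NM                   # final video file name

# Placeholder values for illustration only
r1 = clsStabilityAIAPI("sk-demo-key", "output/", "lighthouse.png", "lighthouse.mp4")
print(r1.OUT_DIR_PATH + r1.FILE_NM)   # the full image path the class will write to
```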
2. delFile(fileName)
This method deletes the file specified by fileName.
- If successful, it returns 0.
- If an error occurs, it logs the error and returns 1.
3. generateText2Image(inputDescription)
This method generates an image based on a text description:
- Sends a request to the Stability AI text-to-image endpoint using the API key.
- Saves the resulting image to a file.
- Returns the file's path on success, or 'N/A' if an error occurs.
4. image2VideoPassOne(imgNameWithPath)
This method uploads an image to start the first phase of video creation:
- Sends the image to Stability AI's image-to-video endpoint.
- Logs the response and extracts the id (generation ID) for the next phase.
- Returns the id if successful, or 'N/A' on failure.
5. image2VideoPassTwo(genId)
This method retrieves the video created in the second phase using the genId:
- Checks the video-generation status at the Stability AI endpoint.
- If complete, saves the video file and returns 0.
- If still processing, returns 5.
- Logs and returns 1 for any errors.
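Putting the five methods together, the whole flow can be sketched as a small driver function. This is a hypothetical sketch, not the actual generateText2VideoAPI.py; the buildVideo name, the default maxRetryNo, and waitTime values are my own illustrative choices, and the code assumes an instance of clsStabilityAIAPI is passed in:

```python
import time

def buildVideo(r1, prompt, maxRetryNo=5, waitTime=10):
    """Sketch of the end-to-end flow: text -> image -> video, with polling."""
    imgPath = r1.generateText2Image(prompt)      # step 1: text -> image
    if imgPath == 'N/A':
        return 1
    gID = r1.image2VideoPassOne(imgPath)         # step 2: submit image-to-video job
    if gID in (None, 'N/A'):
        return 1
    time.sleep(waitTime)                         # give the job a head start
    for attempt in range(1, maxRetryNo + 1):
        z = r1.image2VideoPassTwo(gID)           # step 3: poll for the result
        if z == 0:
            r1.delFile(imgPath)                  # clean up the intermediate image
            return 0
        time.sleep(attempt * 2 * 15)             # back off before the next poll
    return 1
```

The driver returns 0 on success and 1 on any failure, mirroring the return conventions of the class methods themselves.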
As you can see, the code is pretty simple to understand, and we've taken all the necessary precautions against unforeseen network issues, or the case where the video is not yet ready after job submission, in the following lines of the main calling script (generateText2VideoAPI.py) –
waitTime = 10
time.sleep(waitTime)

# Failed case retry
retries = 1
success = False

try:
    while not success:
        try:
            z = r1.image2VideoPassTwo(gID)
        except Exception as e:
            print('Error: ', str(e))
            z = 1

        if z == 0:
            success = True
        else:
            wait = retries * 2 * 15
            str_R1 = "Retry failed! Waiting " + str(wait) + " seconds and retrying!"
            print(str_R1)
            time.sleep(wait)
            retries += 1

            # Checking maximum retries
            if retries >= maxRetryNo:
                raise Exception("Maximum number of retries exceeded.")
except Exception as e:
    print('Error: ', str(e))

And, let us see what the run looks like –

Let us understand the CPU utilization –

As you can see, CPU utilization is minimal since most tasks are at the API end.
So, we’ve done it. 🙂
Please find the next posts in this series on the topic below:
Enabling & Exploring Stable Diffusion – Part 2
Enabling & Exploring Stable Diffusion – Part 3
Please let me know your feedback after reviewing all the posts! 🙂
Note: All the data & scenarios posted here are representational, available over the internet & for educational purposes only. There is always room for improvement in this kind of model & the solution associated with it. I've shown the basic ways to achieve this for educational purposes only.