Real-time video summary assistance App – Part 2

As a continuation of the previous post, I would like to continue our discussion of the MCP implementation among the agents. But before that, I want to include the quick demo one more time to recap our objectives.

Let us recap the process flow –

Also, recall how the scripts are grouped, as posted in the previous post –

Message-Chaining Protocol (MCP) Implementation:

    clsMCPMessage.py
    clsMCPBroker.py

YouTube Transcript Extraction:

    clsYouTubeVideoProcessor.py

Language Detection:

    clsLanguageDetector.py

Translation Services & Agents:

    clsTranslationAgent.py
    clsTranslationService.py

Documentation Agent:

    clsDocumentationAgent.py
    
Research Agent:

    clsResearchAgent.py

Great! Now, we’ll continue with the main discussion.


def extract_youtube_id(youtube_url):
    """Extract YouTube video ID from URL"""
    youtube_id_match = re.search(r'(?:v=|\/)([0-9A-Za-z_-]{11}).*', youtube_url)
    if youtube_id_match:
        return youtube_id_match.group(1)
    return None

def get_youtube_transcript(youtube_url):
    """Get transcript from YouTube video"""
    video_id = extract_youtube_id(youtube_url)
    if not video_id:
        return {"error": "Invalid YouTube URL or ID"}
    
    try:
        transcript_list = YouTubeTranscriptApi.list_transcripts(video_id)
        
        # First try to get manual transcripts
        try:
            transcript = transcript_list.find_manually_created_transcript(["en"])
            transcript_data = transcript.fetch()
            print(f"Debug - Manual transcript format: {type(transcript_data)}")
            if transcript_data and len(transcript_data) > 0:
                print(f"Debug - First item type: {type(transcript_data[0])}")
                print(f"Debug - First item sample: {transcript_data[0]}")
            return {"text": transcript_data, "language": "en", "auto_generated": False}
        except Exception as e:
            print(f"Debug - No manual transcript: {str(e)}")
            # If no manual English transcript, try any available transcript
            try:
                available_transcripts = list(transcript_list)
                if available_transcripts:
                    transcript = available_transcripts[0]
                    print(f"Debug - Using transcript in language: {transcript.language_code}")
                    transcript_data = transcript.fetch()
                    print(f"Debug - Auto transcript format: {type(transcript_data)}")
                    if transcript_data and len(transcript_data) > 0:
                        print(f"Debug - First item type: {type(transcript_data[0])}")
                        print(f"Debug - First item sample: {transcript_data[0]}")
                    return {
                        "text": transcript_data, 
                        "language": transcript.language_code, 
                        "auto_generated": transcript.is_generated
                    }
                else:
                    return {"error": "No transcripts available for this video"}
            except Exception as e:
                return {"error": f"Error getting transcript: {str(e)}"}
    except Exception as e:
        return {"error": f"Error getting transcript list: {str(e)}"}

# ----------------------------------------------------------------------------------
# YouTube Video Processor
# ----------------------------------------------------------------------------------

class clsYouTubeVideoProcessor:
    """Process YouTube videos using the agent system"""
    
    def __init__(self, documentation_agent, translation_agent, research_agent):
        self.documentation_agent = documentation_agent
        self.translation_agent = translation_agent
        self.research_agent = research_agent
    
    def process_youtube_video(self, youtube_url):
        """Process a YouTube video"""
        print(f"Processing YouTube video: {youtube_url}")
        
        # Extract transcript
        transcript_result = get_youtube_transcript(youtube_url)
        
        if "error" in transcript_result:
            return {"error": transcript_result["error"]}
        
        # Start a new conversation
        conversation_id = self.documentation_agent.start_processing()
        
        # Process transcript segments
        transcript_data = transcript_result["text"]
        transcript_language = transcript_result["language"]
        
        print(f"Debug - Type of transcript_data: {type(transcript_data)}")
        
        # For each segment, detect language and translate if needed
        processed_segments = []
        
        try:
            # Make sure transcript_data is a list of dictionaries with text and start fields
            if isinstance(transcript_data, list):
                for idx, segment in enumerate(transcript_data):
                    print(f"Debug - Processing segment {idx}, type: {type(segment)}")
                    
                    # Extract text properly based on the type
                    if isinstance(segment, dict) and "text" in segment:
                        text = segment["text"]
                        start = segment.get("start", 0)
                    else:
                        # Try to access attributes for non-dict types
                        try:
                            text = segment.text
                            start = getattr(segment, "start", 0)
                        except AttributeError:
                            # If all else fails, convert to string
                            text = str(segment)
                            start = idx * 5  # Arbitrary timestamp
                    
                    print(f"Debug - Extracted text: {text[:30]}...")
                    
                    # Create a standardized segment
                    std_segment = {
                        "text": text,
                        "start": start
                    }
                    
                    # Process through translation agent
                    translation_result = self.translation_agent.process_text(text, conversation_id)
                    
                    # Update segment with translation information
                    segment_with_translation = {
                        **std_segment,
                        "translation_info": translation_result
                    }
                    
                    # Use translated text for documentation
                    if "final_text" in translation_result and translation_result["final_text"] != text:
                        std_segment["processed_text"] = translation_result["final_text"]
                    else:
                        std_segment["processed_text"] = text
                    
                    processed_segments.append(segment_with_translation)
            else:
                # If transcript_data is not a list, treat it as a single text block
                print(f"Debug - Transcript is not a list, treating as single text")
                text = str(transcript_data)
                std_segment = {
                    "text": text,
                    "start": 0
                }
                
                translation_result = self.translation_agent.process_text(text, conversation_id)
                segment_with_translation = {
                    **std_segment,
                    "translation_info": translation_result
                }
                
                if "final_text" in translation_result and translation_result["final_text"] != text:
                    std_segment["processed_text"] = translation_result["final_text"]
                else:
                    std_segment["processed_text"] = text
                
                processed_segments.append(segment_with_translation)
                
        except Exception as e:
            print(f"Debug - Error processing transcript: {str(e)}")
            return {"error": f"Error processing transcript: {str(e)}"}
        
        # Process the transcript with the documentation agent
        documentation_result = self.documentation_agent.process_transcript(
            processed_segments,
            conversation_id
        )
        
        return {
            "youtube_url": youtube_url,
            "transcript_language": transcript_language,
            "processed_segments": processed_segments,
            "documentation": documentation_result,
            "conversation_id": conversation_id
        }

Let us understand this step-by-step:

Part 1: Getting the YouTube Transcript

def extract_youtube_id(youtube_url):
    ...

This extracts the unique video ID from any YouTube link. 
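
For instance, a quick sanity check (the URLs below are purely illustrative) would behave like this:

print(extract_youtube_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ"))   # -> 'dQw4w9WgXcQ'
print(extract_youtube_id("https://youtu.be/dQw4w9WgXcQ"))                  # -> 'dQw4w9WgXcQ'
print(extract_youtube_id("not-a-valid-link"))                              # -> None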

def get_youtube_transcript(youtube_url):
    ...
  • This gets the actual spoken content of the video.
  • It tries to get a manual transcript first (created by humans).
  • If not available, it falls back to an auto-generated version (created by YouTube’s AI).
  • If nothing is found, it returns an error message such as: “No transcripts available for this video.”
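
A minimal sketch of calling it (the URL is illustrative; the exact segment format depends on the youtube-transcript-api version):

result = get_youtube_transcript("https://www.youtube.com/watch?v=dQw4w9WgXcQ")

if "error" in result:
    print("Could not fetch transcript:", result["error"])
else:
    # "text" holds the raw transcript segments, "language" the language code,
    # and "auto_generated" tells whether YouTube's AI produced the captions.
    print(result["language"], result["auto_generated"])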

Part 2: Processing the Video with Agents

class clsYouTubeVideoProcessor:
    ...

This is like the control center that tells each intelligent agent what to do with the transcript. Here are the detailed steps:

1. Start the Process

def process_youtube_video(self, youtube_url):
    ...
  • The system starts with a YouTube video link.
  • It prints a message like: “Processing YouTube video: [link]”

2. Extract the Transcript

  • The system runs the get_youtube_transcript() function.
  • If it fails, it returns an error (e.g., invalid link or no subtitles available).

3. Start a “Conversation”

  • The documentation agent begins a new session, tracked by a unique conversation ID.
  • Think of this like opening a new folder in a shared team workspace to store everything related to this video.

4. Go Through Each Segment of the Transcript

  • The spoken text is often broken into small parts (segments), like subtitles.
  • For each part:
    • It checks the text.
    • It finds out the time that part was spoken.
    • It sends it to the translation agent to clean up or translate the text.
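
To make that concrete, here is the shape every segment ends up in after this step (a hypothetical example; the actual values come from the transcript and the Translation Agent, and keys such as "translation" and "conversation_id" are omitted for brevity):

segment_with_translation = {
    "text": "Namaste, welcome to the session",        # original transcript text
    "start": 12.4,                                     # timestamp in seconds
    "translation_info": {                              # produced by the Translation Agent
        "original_text": "Namaste, welcome to the session",
        "language": {"language_code": "hi-IN"},        # assumed detection output
        "final_text": "Hello, welcome to the session",
    },
}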

5. Translate (if needed)

  • If the translation agent finds a better or translated version, it replaces the original.
  • Otherwise, it keeps the original.

6. Prepare for Documentation

  • After translation, the segment is passed to the documentation agent.
  • This agent might:
    • Summarize the content,
    • Highlight important terms,
    • Structure it into a readable format.

7. Return the Final Result

The system gives back a structured package with:

  • The video link
  • The original language
  • The transcript in parts (processed and translated)
  • A documentation summary
  • The conversation ID (for tracking or further updates)
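
Putting the pieces together, a minimal driver might look like the sketch below. It assumes the broker and the three agents can be constructed with the constructor signatures shown later in this post, and the agent ids and URL are illustrative:

broker = clsMCPBroker()
doc_agent = clsDocumentationAgent("documentation_agent", broker)
translation_agent = clsTranslationAgent("translation_agent", broker)
research_agent = clsResearchAgent("research_agent", broker)

processor = clsYouTubeVideoProcessor(doc_agent, translation_agent, research_agent)
result = processor.process_youtube_video("https://www.youtube.com/watch?v=dQw4w9WgXcQ")

if "error" in result:
    print(result["error"])
else:
    print("Language:", result["transcript_language"])
    print("Summary :", result["documentation"]["summary"])

Next, let us dive into the Documentation Agent itself –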

class clsDocumentationAgent:
    """Documentation Agent built with LangChain"""
    
    def __init__(self, agent_id: str, broker: clsMCPBroker):
        self.agent_id = agent_id
        self.broker = broker
        self.broker.register_agent(agent_id)
        
        # Initialize LangChain components
        self.llm = ChatOpenAI(
            model="gpt-4-0125-preview",
            temperature=0.1,
            api_key=OPENAI_API_KEY
        )
        
        # Create tools
        self.tools = [
            clsSendMessageTool(sender_id=self.agent_id, broker=self.broker)
        ]
        
        # Set up LLM with tools
        self.llm_with_tools = self.llm.bind(
            tools=[tool.tool_config for tool in self.tools]
        )
        
        # Setup memory
        self.memory = ConversationBufferMemory(
            memory_key="chat_history",
            return_messages=True
        )
        
        # Create prompt
        self.prompt = ChatPromptTemplate.from_messages([
            ("system", """You are a Documentation Agent for YouTube video transcripts. Your responsibilities include:
                1. Process YouTube video transcripts
                2. Identify key points, topics, and main ideas
                3. Organize content into a coherent and structured format
                4. Create concise summaries
                5. Request research information when necessary
                
                When you need additional context or research, send a request to the Research Agent.
                Always maintain a professional tone and ensure your documentation is clear and organized.
            """),
            MessagesPlaceholder(variable_name="chat_history"),
            ("human", "{input}"),
            MessagesPlaceholder(variable_name="agent_scratchpad"),
        ])
        
        # Create agent
        self.agent = (
            {
                "input": lambda x: x["input"],
                "chat_history": lambda x: self.memory.load_memory_variables({})["chat_history"],
                "agent_scratchpad": lambda x: format_to_openai_tool_messages(x["intermediate_steps"]),
            }
            | self.prompt
            | self.llm_with_tools
            | OpenAIToolsAgentOutputParser()
        )
        
        # Create agent executor
        self.agent_executor = AgentExecutor(
            agent=self.agent,
            tools=self.tools,
            verbose=True,
            memory=self.memory
        )
        
        # Video data
        self.current_conversation_id = None
        self.video_notes = {}
        self.key_points = []
        self.transcript_segments = []
        
    def start_processing(self) -> str:
        """Start processing a new video"""
        self.current_conversation_id = str(uuid.uuid4())
        self.video_notes = {}
        self.key_points = []
        self.transcript_segments = []
        
        return self.current_conversation_id
    
    def process_transcript(self, transcript_segments, conversation_id=None):
        """Process a YouTube transcript"""
        if not conversation_id:
            conversation_id = self.start_processing()
        self.current_conversation_id = conversation_id
        
        # Store transcript segments
        self.transcript_segments = transcript_segments
        
        # Process segments
        processed_segments = []
        for segment in transcript_segments:
            processed_result = self.process_segment(segment)
            processed_segments.append(processed_result)
        
        # Generate summary
        summary = self.generate_summary()
        
        return {
            "processed_segments": processed_segments,
            "summary": summary,
            "conversation_id": conversation_id
        }
    
    def process_segment(self, segment):
        """Process individual transcript segment"""
        text = segment.get("text", "")
        start = segment.get("start", 0)
        
        # Use LangChain agent to process the segment
        result = self.agent_executor.invoke({
            "input": f"Process this video transcript segment at timestamp {start}s: {text}. If research is needed, send a request to the research_agent."
        })
        
        # Update video notes
        timestamp = start
        self.video_notes[timestamp] = {
            "text": text,
            "analysis": result["output"]
        }
        
        return {
            "timestamp": timestamp,
            "text": text,
            "analysis": result["output"]
        }
    
    def handle_mcp_message(self, message: clsMCPMessage) -> Optional[clsMCPMessage]:
        """Handle an incoming MCP message"""
        if message.message_type == "research_response":
            # Process research information received from Research Agent
            research_info = message.content.get("text", "")
            
            result = self.agent_executor.invoke({
                "input": f"Incorporate this research information into video analysis: {research_info}"
            })
            
            # Send acknowledgment back to Research Agent
            response = clsMCPMessage(
                sender=self.agent_id,
                receiver=message.sender,
                message_type="acknowledgment",
                content={"text": "Research information incorporated into video analysis."},
                reply_to=message.id,
                conversation_id=message.conversation_id
            )
            
            self.broker.publish(response)
            return response
        
        elif message.message_type == "translation_response":
            # Process translation response from Translation Agent
            translation_result = message.content
            
            # Process the translated text
            if "final_text" in translation_result:
                text = translation_result["final_text"]
                original_text = translation_result.get("original_text", "")
                language_info = translation_result.get("language", {})
                
                result = self.agent_executor.invoke({
                    "input": f"Process this translated text: {text}\nOriginal language: {language_info.get('language', 'unknown')}\nOriginal text: {original_text}"
                })
                
                # Update notes with translation information
                for timestamp, note in self.video_notes.items():
                    if note["text"] == original_text:
                        note["translated_text"] = text
                        note["language"] = language_info
                        break
            
            return None
        
        return None
    
    def run(self):
        """Run the agent to listen for MCP messages"""
        print(f"Documentation Agent {self.agent_id} is running...")
        while True:
            message = self.broker.get_message(self.agent_id, timeout=1)
            if message:
                self.handle_mcp_message(message)
            time.sleep(0.1)
    
    def generate_summary(self) -> str:
        """Generate a summary of the video"""
        if not self.video_notes:
            return "No video data available to summarize."
        
        all_notes = "\n".join([f"{ts}: {note['text']}" for ts, note in self.video_notes.items()])
        
        result = self.agent_executor.invoke({
            "input": f"Generate a concise summary of this YouTube video, including key points and topics:\n{all_notes}"
        })
        
        return result["output"]

Let us understand the key methods in a step-by-step manner:

The Documentation Agent is like a smart assistant that watches a YouTube video, takes notes, pulls out important ideas, and creates a summary, almost like a professional note-taker trained to help educators, researchers, and content creators. It works with a team of other assistants, such as the Translation Agent and the Research Agent, and they all talk to each other through a messaging system.

1. Starting to Work on a New Video

    def start_processing(self) -> str
    

    When a new video is being processed:

    • A new project ID is created.
    • Old notes and transcripts are cleared to start fresh.

    2. Processing the Whole Transcript

    def process_transcript(...)
    

    This is where the assistant:

    • Takes in the full transcript (what was said in the video).
    • Breaks it into small parts (like subtitles).
    • Sends each part to the LLM (GPT-4) for analysis.
    • Collects the results.
    • Finally, a summary of all the main ideas is created.

    3. Processing One Transcript Segment at a Time

    def process_segment(self, segment)
    

    For each chunk of the video:

    • The assistant reads the text and timestamp.
    • It asks GPT-4 to analyze it and suggest important insights.
    • It saves that insight along with the original text and timestamp.

    4. Handling Incoming Messages from Other Agents

    def handle_mcp_message(self, message)
    

    The assistant can also receive messages from teammates (other agents):

    If the message is from the Research Agent:

    • It reads new information and adds it to its notes.
    • It replies with a thank-you message to say it got the research.

    If the message is from the Translation Agent:

    • It takes the translated version of a transcript.
    • Updates its notes to reflect the translated text and its language.

    This is like a team of assistants emailing back and forth to make sure the notes are complete and accurate.
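
    As a rough sketch (assuming doc_agent is the clsDocumentationAgent instance created earlier, and reusing the message fields seen in the code above with hypothetical content), an incoming research reply could be simulated like this:

incoming = clsMCPMessage(
    sender="research_agent",
    receiver="documentation_agent",
    message_type="research_response",
    content={"text": "FAISS is a similarity-search library developed by Facebook AI Research."},
    reply_to="id-of-the-original-request",            # hypothetical id
    conversation_id=doc_agent.current_conversation_id,
)

ack = doc_agent.handle_mcp_message(incoming)          # incorporates the research and publishes an acknowledgment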

    5. Summarizing the Whole Video

    def generate_summary(self)
    

    After going through all the transcript parts, the agent asks GPT-4 to create a short, clean summary — identifying:

    • Main ideas
    • Key talking points
    • Structure of the content

    The final result is clear, professional, and usable in learning materials or documentation.



    Next, let us look at the Research Agent –

    class clsResearchAgent:
        """Research Agent built with AutoGen"""
        
        def __init__(self, agent_id: str, broker: clsMCPBroker):
            self.agent_id = agent_id
            self.broker = broker
            self.broker.register_agent(agent_id)
            
            # Configure AutoGen directly with API key
            if not OPENAI_API_KEY:
                print("Warning: OPENAI_API_KEY not set for ResearchAgent")
                
            # Create config list directly instead of loading from file
            config_list = [
                {
                    "model": "gpt-4-0125-preview",
                    "api_key": OPENAI_API_KEY
                }
            ]
            # Create AutoGen assistant for research
            self.assistant = AssistantAgent(
                name="research_assistant",
                system_message="""You are a Research Agent for YouTube videos. Your responsibilities include:
                    1. Research topics mentioned in the video
                    2. Find relevant information, facts, references, or context
                    3. Provide concise, accurate information to support the documentation
                    4. Focus on delivering high-quality, relevant information
                    
                    Respond directly to research requests with clear, factual information.
                """,
                llm_config={"config_list": config_list, "temperature": 0.1}
            )
            
            # Create user proxy to handle message passing
            self.user_proxy = UserProxyAgent(
                name="research_manager",
                human_input_mode="NEVER",
                code_execution_config={"work_dir": "coding", "use_docker": False},
                default_auto_reply="Working on the research request..."
            )
            
            # Current conversation tracking
            self.current_requests = {}
        
        def handle_mcp_message(self, message: clsMCPMessage) -> Optional[clsMCPMessage]:
            """Handle an incoming MCP message"""
            if message.message_type == "request":
                # Process research request from Documentation Agent
                request_text = message.content.get("text", "")
                
                # Use AutoGen to process the research request
                def research_task():
                    self.user_proxy.initiate_chat(
                        self.assistant,
                        message=f"Research request for YouTube video content: {request_text}. Provide concise, factual information."
                    )
                    # Return last assistant message
                    return self.assistant.chat_messages[self.user_proxy.name][-1]["content"]
                
                # Execute research task
                research_result = research_task()
                
                # Send research results back to Documentation Agent
                response = clsMCPMessage(
                    sender=self.agent_id,
                    receiver=message.sender,
                    message_type="research_response",
                    content={"text": research_result},
                    reply_to=message.id,
                    conversation_id=message.conversation_id
                )
                
                self.broker.publish(response)
                return response
            
            return None
        
        def run(self):
            """Run the agent to listen for MCP messages"""
            print(f"Research Agent {self.agent_id} is running...")
            while True:
                message = self.broker.get_message(self.agent_id, timeout=1)
                if message:
                    self.handle_mcp_message(message)
                time.sleep(0.1)
    

    Let us understand the key methods in detail.

    1. Receiving and Responding to Research Requests

      def handle_mcp_message(self, message)
      

      When the Research Agent gets a message (like a question or request for info), it:

      1. Reads the message to see what needs to be researched.
      2. Asks GPT-4 to find helpful, accurate info about that topic.
      3. Sends the answer back to whoever asked the question (usually the Documentation Agent).
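
      A hedged sketch of how another agent might ask it a question (assuming research_agent is the clsResearchAgent instance; the message fields mirror those used in the code above, and reply_to is assumed to be optional):

request = clsMCPMessage(
    sender="documentation_agent",
    receiver="research_agent",
    message_type="request",
    content={"text": "What is LangChain and what is it typically used for?"},
    conversation_id="demo-conversation-1",            # hypothetical conversation id
)

response = research_agent.handle_mcp_message(request)
print(response.content["text"])                      # concise, factual answer produced via AutoGen

      Next, let us look at the Translation Agent –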

      class clsTranslationAgent:
          """Agent for language detection and translation"""
          
          def __init__(self, agent_id: str, broker: clsMCPBroker):
              self.agent_id = agent_id
              self.broker = broker
              self.broker.register_agent(agent_id)
              
              # Initialize language detector
              self.language_detector = clsLanguageDetector()
              
              # Initialize translation service
              self.translation_service = clsTranslationService()
          
          def process_text(self, text, conversation_id=None):
              """Process text: detect language and translate if needed, handling mixed language content"""
              if not conversation_id:
                  conversation_id = str(uuid.uuid4())
              
              # Detect language with support for mixed language content
              language_info = self.language_detector.detect(text)
              
              # Decide if translation is needed
              needs_translation = True
              
              # Pure English content doesn't need translation
              if language_info["language_code"] == "en-IN" or language_info["language_code"] == "unknown":
                  needs_translation = False
              
              # For mixed language, check if it's primarily English
              if language_info.get("is_mixed", False) and language_info.get("languages", []):
                  english_langs = [
                      lang for lang in language_info.get("languages", []) 
                      if lang["language_code"] == "en-IN" or lang["language_code"].startswith("en-")
                  ]
                  
                  # If the highest confidence language is English and > 60% confident, don't translate
                  if english_langs and english_langs[0].get("confidence", 0) > 0.6:
                      needs_translation = False
              
              if needs_translation:
                  # Translate using the appropriate service based on language detection
                  translation_result = self.translation_service.translate(text, language_info)
                  
                  return {
                      "original_text": text,
                      "language": language_info,
                      "translation": translation_result,
                      "final_text": translation_result.get("translated_text", text),
                      "conversation_id": conversation_id
                  }
              else:
                  # Already English or unknown language, return as is
                  return {
                      "original_text": text,
                      "language": language_info,
                      "translation": {"provider": "none"},
                      "final_text": text,
                      "conversation_id": conversation_id
                  }
          
          def handle_mcp_message(self, message: clsMCPMessage) -> Optional[clsMCPMessage]:
              """Handle an incoming MCP message"""
              if message.message_type == "translation_request":
                  # Process translation request from Documentation Agent
                  text = message.content.get("text", "")
                  
                  # Process the text
                  result = self.process_text(text, message.conversation_id)
                  
                  # Send translation results back to requester
                  response = clsMCPMessage(
                      sender=self.agent_id,
                      receiver=message.sender,
                      message_type="translation_response",
                      content=result,
                      reply_to=message.id,
                      conversation_id=message.conversation_id
                  )
                  
                  self.broker.publish(response)
                  return response
              
              return None
          
          def run(self):
              """Run the agent to listen for MCP messages"""
              print(f"Translation Agent {self.agent_id} is running...")
              while True:
                  message = self.broker.get_message(self.agent_id, timeout=1)
                  if message:
                      self.handle_mcp_message(message)
                  time.sleep(0.1)

      Let us understand the key methods in step-by-step manner:

      1. Understanding and Translating Text:

      def process_text(...)
      

      This is the core job of the agent. Here’s what it does with any piece of text:

      Step 1: Detect the Language

      • It tries to figure out the language of the input text.
      • It can handle cases where more than one language is mixed together, which is common in casual speech or subtitles.

      Step 2: Decide Whether to Translate

      • If the text is clearly in English, or it’s unclear what the language is, it decides not to translate.
      • If the text is mostly in another language, or English is detected with less than 60% confidence, it translates the text into English.

      Step 3: Translate (if needed)

      • If translation is required, it uses the translation service to do the job.
      • Then it packages all the information: the original text, detected language, the translated version, and a unique conversation ID.

      Step 4: Return the Results

      • If no translation is needed, it returns the original text and a note saying “no translation was applied.”
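
      Here is a small, illustrative sketch (assuming broker is the shared clsMCPBroker instance; the detected language codes depend on what clsLanguageDetector actually returns):

translation_agent = clsTranslationAgent("translation_agent", broker)

result = translation_agent.process_text("यह एक छोटा उदाहरण है")    # hypothetical Hindi input
print(result["language"])          # detected language information
print(result["final_text"])        # English text if a translation was applied, else the original

result_en = translation_agent.process_text("This sentence is already in English.")
print(result_en["translation"])    # -> {"provider": "none"}, assuming English is detected as en-IN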

      2. Receiving Messages and Responding

      def handle_mcp_message(...)
      

      The agent listens for messages from other agents. When someone asks it to translate something:

      • It takes the text from the message.
      • Runs it through the process_text function (as explained above).
      • Sends the translated (or original) result to the person who asked.

      Next, let us walk through the translation service that the agent relies on –

      class clsTranslationService:
          """Translation service using multiple providers with support for mixed languages"""
          
          def __init__(self):
              # Initialize Sarvam AI client
              self.sarvam_api_key = SARVAM_API_KEY
              self.sarvam_url = "https://api.sarvam.ai/translate"
              
              # Initialize Google Cloud Translation client using simple HTTP requests
              self.google_api_key = GOOGLE_API_KEY
              self.google_translate_url = "https://translation.googleapis.com/language/translate/v2"
          
          def translate_with_sarvam(self, text, source_lang, target_lang="en-IN"):
              """Translate text using Sarvam AI (for Indian languages)"""
              if not self.sarvam_api_key:
                  return {"error": "Sarvam API key not set"}
              
              headers = {
                  "Content-Type": "application/json",
                  "api-subscription-key": self.sarvam_api_key
              }
              
              payload = {
                  "input": text,
                  "source_language_code": source_lang,
                  "target_language_code": target_lang,
                  "speaker_gender": "Female",
                  "mode": "formal",
                  "model": "mayura:v1"
              }
              
              try:
                  response = requests.post(self.sarvam_url, headers=headers, json=payload)
                  if response.status_code == 200:
                      return {"translated_text": response.json().get("translated_text", ""), "provider": "sarvam"}
                  else:
                      return {"error": f"Sarvam API error: {response.text}", "provider": "sarvam"}
              except Exception as e:
                  return {"error": f"Error calling Sarvam API: {str(e)}", "provider": "sarvam"}
          
          def translate_with_google(self, text, target_lang="en"):
              """Translate text using Google Cloud Translation API with direct HTTP request"""
              if not self.google_api_key:
                  return {"error": "Google API key not set"}
              
              try:
                  # Using the translation API v2 with API key
                  params = {
                      "key": self.google_api_key,
                      "q": text,
                      "target": target_lang
                  }
                  
                  response = requests.post(self.google_translate_url, params=params)
                  if response.status_code == 200:
                      data = response.json()
                      translation = data.get("data", {}).get("translations", [{}])[0]
                      return {
                          "translated_text": translation.get("translatedText", ""),
                          "detected_source_language": translation.get("detectedSourceLanguage", ""),
                          "provider": "google"
                      }
                  else:
                      return {"error": f"Google API error: {response.text}", "provider": "google"}
              except Exception as e:
                  return {"error": f"Error calling Google Translation API: {str(e)}", "provider": "google"}
          
          def translate(self, text, language_info):
              """Translate text to English based on language detection info"""
              # If already English or unknown language, return as is
              if language_info["language_code"] == "en-IN" or language_info["language_code"] == "unknown":
                  return {"translated_text": text, "provider": "none"}
              
              # Handle mixed language content
              if language_info.get("is_mixed", False) and language_info.get("languages", []):
                  # Strategy for mixed language: 
                  # 1. If one of the languages is English, don't translate the entire text, as it might distort English portions
                  # 2. If no English but contains Indian languages, use Sarvam as it handles code-mixing better
                  # 3. Otherwise, use Google Translate for the primary detected language
                  
                  has_english = False
                  has_indian = False
                  
                  for lang in language_info.get("languages", []):
                      if lang["language_code"] == "en-IN" or lang["language_code"].startswith("en-"):
                          has_english = True
                      if lang.get("is_indian", False):
                          has_indian = True
                  
                  if has_english:
                      # Contains English - use Google for full text as it handles code-mixing well
                      return self.translate_with_google(text)
                  elif has_indian:
                      # Contains Indian languages - use Sarvam
                      # Use the highest confidence Indian language as source
                      indian_langs = [lang for lang in language_info.get("languages", []) if lang.get("is_indian", False)]
                      if indian_langs:
                          # Sort by confidence
                          indian_langs.sort(key=lambda x: x.get("confidence", 0), reverse=True)
                          source_lang = indian_langs[0]["language_code"]
                          return self.translate_with_sarvam(text, source_lang)
                      else:
                          # Fallback to primary language
                          if language_info["is_indian"]:
                              return self.translate_with_sarvam(text, language_info["language_code"])
                          else:
                              return self.translate_with_google(text)
                  else:
                      # No English, no Indian languages - use Google for primary language
                      return self.translate_with_google(text)
              else:
                  # Not mixed language - use standard approach
                  if language_info["is_indian"]:
                      # Use Sarvam AI for Indian languages
                      return self.translate_with_sarvam(text, language_info["language_code"])
                  else:
                      # Use Google for other languages
                      return self.translate_with_google(text)

      This Translation Service is like a smart translator that knows how to:

      • Detect what language the text is written in,
      • Choose the best translation provider depending on the language (especially for Indian languages),
      • And then translate the text into English.

      It supports mixed-language content (such as Hindi-English in one sentence) and uses either Google Translate or Sarvam AI, a translation service designed for Indian languages.

      Now, let us understand the key methods in a step-by-step manner:

      1. Translating Using Google Translate

      def translate_with_google(...)
      

      This function uses Google Translate:

      • It sends the text, asks for English as the target language, and gets a translation back.
      • It also detects the source language automatically.
      • If successful, it returns the translated text and the detected original language.
      • If there’s an error, it returns a message saying what went wrong.

      Best For: Non-Indian languages (like Spanish, French, Chinese) and content that is not mixed with English.
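
      For example, a successful call (assuming a valid Google API key is configured) comes back roughly in this shape:

service = clsTranslationService()
result = service.translate_with_google("Bonjour tout le monde")
# result would look something like:
# {
#     "translated_text": "Hello everyone",
#     "detected_source_language": "fr",
#     "provider": "google"
# }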

      2. Main Translation Logic

      def translate(self, text, language_info)
      

      This is the decision-maker. Here’s how it works:

      Case 1: No Translation Needed

      If the text is already in English or the language is unknown, it simply returns the original text.

      Case 2: Mixed Language (e.g., Hindi + English)

      If the text contains more than one language:

      • ✅ If one part is English → use Google Translate (it’s good with mixed languages).
      • ✅ If it includes Indian languages only → use Sarvam AI (better at handling Indian content).
      • ✅ If it’s neither English nor Indian → use Google Translate.

      The service checks how confident it is about each language in the mix and chooses the most likely one to translate from.

      Case 3: Single Language

      If the text is only in one language:

      • ✅ If it’s an Indian language (like Bengali, Tamil, or Marathi), use Sarvam AI.
      • ✅ If it’s any other language, use Google Translate.
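
      To make the routing concrete, here is a hedged sketch of the kind of language_info inputs translate() expects and which provider each would be routed to (the dictionaries are illustrative; the real ones come from clsLanguageDetector):

service = clsTranslationService()

# Single Indian language -> Sarvam AI
hindi_info = {"language_code": "hi-IN", "is_indian": True}
# service.translate("<Hindi sentence>", hindi_info)        -> translate_with_sarvam(...)

# Single non-Indian language -> Google Translate
french_info = {"language_code": "fr", "is_indian": False}
# service.translate("<French sentence>", french_info)      -> translate_with_google(...)

# Mixed Hindi + English -> Google Translate (English portions are preserved better)
mixed_info = {
    "language_code": "hi-IN",
    "is_indian": True,
    "is_mixed": True,
    "languages": [
        {"language_code": "hi-IN", "is_indian": True, "confidence": 0.55},
        {"language_code": "en-IN", "is_indian": False, "confidence": 0.45},
    ],
}
# service.translate("kal meeting hai, please join on time", mixed_info) -> translate_with_google(...)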

      So, we’ve done it.

      I’ve included the complete working solutions for you in the GitHub Link.

      We’ll cover detailed performance testing, optimized configurations & many other useful details in our next post.

      Till then, Happy Avenging! 🙂

      Enable OpenAI chatbot with the selected YouTube video content using LangChain, FAISS & YouTube data-API.

      Today, I’m very excited to demonstrate an effortless & new way to extract the transcript from YouTube videos & then answer the questions based on the topics selected by the users. In this post, the application first takes the user’s topic as input & then summarizes the relevant video content through useful advanced analytics with the help of LangChain & an OpenAI-based model.

      In this post, I’ve directly subscribed to OpenAI & I’m not using OpenAI from Azure. However, I’ll explore that in the future as well.
      Before I explain the process to invoke this new library, why not view the demo first & then discuss it?

      Demo

      Isn’t it very exciting? This will lead to a whole new ballgame, where one can get critical decision-making information from these human sources along with their traditional advanced analytical data.

      How will it help?

      Let’s say that, as per your historical data & analytics, the dashboard recommends prod-A, prod-B & prod-C as the top three products for potential top-performing brands. Meanwhile, you are getting alerts from the TV news about prod-B due to recent incidents. In that case, you don’t want to continue with the prod-B investment; you may instead find a new product, say prod-Z, that reduces the risk of your investment.


      What is LangChain?

      LangChain is a framework for developing applications powered by language models. We believe that the most powerful and differentiated applications will not only call out to a language model but will also be:

      1. Data-aware: connect a language model to other sources of data
      2. Agentic: allow a language model to interact with its environment

      The LangChain framework works around these principles.

      To know more about this, please click the following link.

      As you can see, this is one of the critical components in our solution, which will bind the OpenAI bot & it will feed the necessary data to provide the correct response.


      What is FAISS?

      Faiss is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that do not fit in RAM. It also has supporting code for evaluation and parameter tuning.

      Faiss is developed in C++ with complete wrappers for Python, and some of its most useful algorithms run on both CPU & GPU. It is developed by Facebook AI Research.

      To know more about this, please click the following link.


      FLOW OF EVENTS:

      Let us look at the flow diagram as it captures the sequence of events that unfold as part of the process.

      Here are the steps that will follow in sequence –

      • The application will first get the topic on which it needs to look from YouTube & find the top 5 videos using the YouTube data-API.
      • Once the application returns the list of video URLs from the above step, LangChain will drive the application to extract the transcripts from the videos & then split them into smaller chunks to keep the OpenAI calls affordable. During this time, it will invoke FAISS to create the document DBs.
      • Finally, it will send only the relevant chunks to OpenAI, along with your supplied template, which performs the final analysis on the small amount of data required for your query & gets the appropriate response at a lower cost.

      CODE:

      Why don’t we go through the code made accessible due to this new library for this particular use case?

      • clsConfigClient.py (This is the configuration script that holds all the input parameters.)


      ################################################
      #### Written By: SATYAKI DE ####
      #### Written On: 15-May-2020 ####
      #### Modified On: 28-May-2023 ####
      #### ####
      #### Objective: This script is a config ####
      #### file, contains all the keys for ####
      #### personal OpenAI-based video content ####
      #### enable bot. ####
      #### ####
      ################################################
import os
import platform as pl

class clsConfigClient(object):
    Curr_Path = os.path.dirname(os.path.realpath(__file__))

    os_det = pl.system()
    if os_det == "Windows":
        sep = '\\'
    else:
        sep = '/'

    conf = {
        'APP_ID': 1,
        'ARCH_DIR': Curr_Path + sep + 'arch' + sep,
        'PROFILE_PATH': Curr_Path + sep + 'profile' + sep,
        'LOG_PATH': Curr_Path + sep + 'log' + sep,
        'DATA_PATH': Curr_Path + sep + 'data' + sep,
        'MODEL_PATH': Curr_Path + sep + 'model' + sep,
        'TEMP_PATH': Curr_Path + sep + 'temp' + sep,
        'MODEL_DIR': 'model',
        'APP_DESC_1': 'LangChain Demo!',
        'DEBUG_IND': 'N',
        'INIT_PATH': Curr_Path,
        'FILE_NAME': 'Output.csv',
        'MODEL_NAME': 'gpt-3.5-turbo',
        'OPEN_AI_KEY': "sk-kfrjfijdrkidjkfjd9474nbfjfkfjfhfhf84i84hnfhjdbv6Bgvv",
        'YOUTUBE_KEY': "AIjfjfUYGe64hHJ-LOFO5u-mkso9pPOJGFU",
        'TITLE': "LangChain Demo!",
        'TEMP_VAL': 0.2,
        'PATH': Curr_Path,
        'MAX_CNT': 5,
        'OUT_DIR': 'data'
    }

      Some of the key entries from the above scripts are as follows –

      'MODEL_NAME': 'gpt-3.5-turbo',
      'OPEN_AI_KEY': "sk-kfrjfijdrkidjkfjd9474nbfjfkfjfhfhf84i84hnfhjdbv6Bgvv",
      'YOUTUBE_KEY': "AIjfjfUYGe64hHJ-LOFO5u-mkso9pPOJGFU",
      'TEMP_VAL': 0.2,

      From the above code snippet, one can understand that we need API keys for both YouTube & OpenAI. They have separate costs & usage, which I’ll share later in the post. Also, notice that the temperature is set to 0.2 (on a scale from 0 to 1), which means our AI bot will be consistent in its responses. And our application will use the gpt-3.5-turbo model for its analytical response.
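
      These values are consumed elsewhere exactly the way the scraper class does it below, for example:

from clsConfigClient import clsConfigClient as cf

model_name = cf.conf['MODEL_NAME']     # 'gpt-3.5-turbo'
temp_val = cf.conf['TEMP_VAL']         # 0.2, a low temperature that keeps responses consistent
max_cnt = int(cf.conf['MAX_CNT'])      # analyze the top 5 videos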

      • clsTemplate.py (Contains all the templates for OpenAI.)


      ################################################
      #### Written By: SATYAKI DE ####
      #### Written On: 27-May-2023 ####
      #### Modified On: 28-May-2023 ####
      #### ####
      #### Objective: This script is a config ####
      #### file, contains all the template for ####
      #### OpenAI prompts to get the correct ####
      #### response. ####
      #### ####
      ################################################
      # Template to use for the system message prompt
      templateVal_1 = """
      You are a helpful assistant that can answer questions about youtube videos
      based on the video's transcript: {docs}
      Only use the factual information from the transcript to answer the question.
      If you feel like you don't have enough information to answer the question, say "I don't know".
      Your answers should be verbose and detailed.
      """


      The above code is self-explanatory. Here, we’re defining the instructions that keep OpenAI’s responses within these guidelines.

      • clsVideoContentScrapper.py (Main class to extract the transcript from the YouTube videos & then answer the questions based on the topics selected by the users.)


      #####################################################
      #### Written By: SATYAKI DE ####
      #### Written On: 27-May-2023 ####
      #### Modified On 28-May-2023 ####
      #### ####
      #### Objective: This is the main calling ####
      #### python class that will invoke the ####
      #### LangChain of package to extract ####
      #### the transcript from the YouTube videos & ####
      #### then answer the questions based on the ####
      #### topics selected by the users. ####
      #### ####
      #####################################################
from langchain.document_loaders import YoutubeLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.chat_models import ChatOpenAI
from langchain.chains import LLMChain
from langchain.prompts.chat import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
)
from googleapiclient.discovery import build

import clsTemplate as ct
from clsConfigClient import clsConfigClient as cf

import os

###############################################
### Global Section ###
###############################################
open_ai_Key = cf.conf['OPEN_AI_KEY']
os.environ["OPENAI_API_KEY"] = open_ai_Key
embeddings = OpenAIEmbeddings(openai_api_key=open_ai_Key)

YouTube_Key = cf.conf['YOUTUBE_KEY']
youtube = build('youtube', 'v3', developerKey=YouTube_Key)

# Disabling Warning
def warn(*args, **kwargs):
    pass

import warnings
warnings.warn = warn

###############################################
### End of Global Section ###
###############################################

class clsVideoContentScrapper:
    def __init__(self):
        self.model_name = cf.conf['MODEL_NAME']
        self.temp_val = cf.conf['TEMP_VAL']
        self.max_cnt = int(cf.conf['MAX_CNT'])

    def createDBFromYoutubeVideoUrl(self, video_url):
        try:
            loader = YoutubeLoader.from_youtube_url(video_url)
            transcript = loader.load()

            text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
            docs = text_splitter.split_documents(transcript)

            db = FAISS.from_documents(docs, embeddings)
            return db

        except Exception as e:
            x = str(e)
            print('Error: ', x)
            return ''

    def getResponseFromQuery(self, db, query, k=4):
        try:
            """
            gpt-3.5-turbo can handle up to 4097 tokens. Setting the chunksize to 1000 and k to 4 maximizes
            the number of tokens to analyze.
            """

            mod_name = self.model_name
            temp_val = self.temp_val

            docs = db.similarity_search(query, k=k)
            docs_page_content = " ".join([d.page_content for d in docs])

            chat = ChatOpenAI(model_name=mod_name, temperature=temp_val)

            # Template to use for the system message prompt
            template = ct.templateVal_1

            system_message_prompt = SystemMessagePromptTemplate.from_template(template)

            # Human question prompt
            human_template = "Answer the following question: {question}"
            human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)

            chat_prompt = ChatPromptTemplate.from_messages(
                [system_message_prompt, human_message_prompt]
            )

            chain = LLMChain(llm=chat, prompt=chat_prompt)

            response = chain.run(question=query, docs=docs_page_content)
            response = response.replace("\n", "")
            return response, docs

        except Exception as e:
            x = str(e)
            print('Error: ', x)

            return '', ''

    def topFiveURLFromYouTube(self, service, **kwargs):
        try:
            video_urls = []
            channel_list = []
            results = service.search().list(**kwargs).execute()

            for item in results['items']:
                print("Title: ", item['snippet']['title'])
                print("Description: ", item['snippet']['description'])
                channel = item['snippet']['channelId']
                print("Channel Id: ", channel)

                # Fetch the channel name using the channel ID
                channel_response = service.channels().list(part='snippet', id=item['snippet']['channelId']).execute()
                channel_title = channel_response['items'][0]['snippet']['title']
                print("Channel Title: ", channel_title)
                channel_list.append(channel_title)

                print("Video Id: ", item['id']['videoId'])
                vidURL = "https://www.youtube.com/watch?v=" + item['id']['videoId']
                print("Video URL: " + vidURL)
                video_urls.append(vidURL)
                print("\n")

            return video_urls, channel_list

        except Exception as e:
            video_urls = []
            channel_list = []
            x = str(e)
            print('Error: ', x)

            return video_urls, channel_list

    def extractContentInText(self, topic, query):
        try:
            discussedTopic = []
            strKeyText = ''
            cnt = 0
            max_cnt = self.max_cnt

            urlList, channelList = self.topFiveURLFromYouTube(youtube, q=topic, part='id,snippet', maxResults=max_cnt, type='video')
            print('Returned List: ')
            print(urlList)
            print()

            for video_url in urlList:
                print('Processing Video: ')
                print(video_url)
                db = self.createDBFromYoutubeVideoUrl(video_url)
                response, docs = self.getResponseFromQuery(db, query)

                if len(response) > 0:
                    strKeyText = 'As per the topic discussed in ' + channelList[cnt] + ', '
                    discussedTopic.append(strKeyText + response)

                cnt += 1

            return discussedTopic
        except Exception as e:
            discussedTopic = []
            x = str(e)
            print('Error: ', x)

            return discussedTopic

      Let us understand the key methods step by step in detail –

      def topFiveURLFromYouTube(self, service, **kwargs):
          try:
              video_urls = []
              channel_list = []
              results = service.search().list(**kwargs).execute()
      
              for item in results['items']:
                  print("Title: ", item['snippet']['title'])
                  print("Description: ", item['snippet']['description'])
                  channel = item['snippet']['channelId']
                  print("Channel Id: ", channel)
      
                  # Fetch the channel name using the channel ID
                  channel_response = service.channels().list(part='snippet',id=item['snippet']['channelId']).execute()
                  channel_title = channel_response['items'][0]['snippet']['title']
                  print("Channel Title: ", channel_title)
                  channel_list.append(channel_title)
      
                  print("Video Id: ", item['id']['videoId'])
                  vidURL = "https://www.youtube.com/watch?v=" + item['id']['videoId']
                  print("Video URL: " + vidURL)
                  video_urls.append(vidURL)
                  print("\n")
      
              return video_urls, channel_list
      
          except Exception as e:
              video_urls = []
              channel_list = []
              x = str(e)
              print('Error: ', x)
      
              return video_urls, channel_list

      The above code fetches the most relevant YouTube video URLs, collects them into a list along with the corresponding channel names, & then returns both lists to the calling function.
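
      A hedged usage sketch, mirroring how extractContentInText invokes it (the search topic is hypothetical):

scrapper = clsVideoContentScrapper()
urls, channels = scrapper.topFiveURLFromYouTube(
    youtube,                    # the googleapiclient service built in the global section
    q='LangChain tutorial',     # hypothetical search topic
    part='id,snippet',
    maxResults=5,
    type='video'
)
print(urls)        # up to five video URLs
print(channels)    # the matching channel titles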

      def createDBFromYoutubeVideoUrl(self, video_url):
          try:
              loader = YoutubeLoader.from_youtube_url(video_url)
              transcript = loader.load()
      
              text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
              docs = text_splitter.split_documents(transcript)
      
              db = FAISS.from_documents(docs, embeddings)
              return db
      
          except Exception as e:
              x = str(e)
              print('Error: ', x)
              return ''

      The provided Python code defines the method createDBFromYoutubeVideoUrl, which creates a searchable database of text documents from the transcript of a YouTube video. Here’s the explanation in simple English:

      1. The function createDBFromYoutubeVideoUrl is defined with one argument: video_url.
      2. The function uses a try-except block to handle any potential exceptions or errors that may occur.
      3. Inside the try block, the following steps are performed:
      • First, it creates a YoutubeLoader object from the provided video_url. This object is responsible for interacting with the YouTube video specified by the URL.
      • The loader object then loads the transcript of the video, i.e., the text version of everything spoken in the video.
      • It then creates a RecursiveCharacterTextSplitter object with a specified chunk_size of 1000 and chunk_overlap of 100. This object splits the transcript into smaller chunks (documents) of text for easier processing or analysis. Each piece will be around 1000 characters long, and there will be an overlap of 100 characters between consecutive chunks.
      • The split_documents method of the text_splitter object will split the transcript into smaller documents. These documents are stored in the docs variable.
      • The FAISS.from_documents method is then called with docs and embeddings as arguments to create a FAISS (Facebook AI Similarity Search) index. This index is a database used for efficient similarity search and clustering of high-dimensional vectors, which in this case, are the embeddings of the documents. The FAISS index is stored in the db variable.
      • Finally, the db variable is returned, representing the created database from the video transcript.

      4. If an exception occurs during the execution of the try block, the code execution moves to the except block:

      • Here, it first converts the exception e to a string x.
      • Then it prints an error message.
      • Finally, it returns an empty string as an indication of the error.
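
      One detail worth calling out: the embeddings object passed to FAISS.from_documents is defined elsewhere in the class file and is not shown above. The sketch below assumes it is a LangChain OpenAIEmbeddings instance and shows how the returned index is queried later on; the URL and query strings are placeholders only.

      # A minimal sketch, assuming embeddings is an OpenAIEmbeddings instance
      # (the actual definition lives elsewhere in the class file).
      from langchain.document_loaders import YoutubeLoader
      from langchain.embeddings.openai import OpenAIEmbeddings
      from langchain.text_splitter import RecursiveCharacterTextSplitter
      from langchain.vectorstores import FAISS

      embeddings = OpenAIEmbeddings()   # requires OPENAI_API_KEY in the environment

      loader = YoutubeLoader.from_youtube_url('https://www.youtube.com/watch?v=<VIDEO_ID>')
      docs = RecursiveCharacterTextSplitter(
          chunk_size=1000, chunk_overlap=100
      ).split_documents(loader.load())

      db = FAISS.from_documents(docs, embeddings)            # vector index over transcript chunks
      similar_docs = db.similarity_search('What is the video about?', k=4)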

      def getResponseFromQuery(self, db, query, k=4):
            try:
                """
                gpt-3.5-turbo can handle up to 4097 tokens. Setting the chunksize to 1000 and k to 4 maximizes
                the number of tokens to analyze.
                """
      
                mod_name = self.model_name
                temp_val = self.temp_val
      
                docs = db.similarity_search(query, k=k)
                docs_page_content = " ".join([d.page_content for d in docs])
      
                chat = ChatOpenAI(model_name=mod_name, temperature=temp_val)
      
                # Template to use for the system message prompt
                template = ct.templateVal_1
      
                system_message_prompt = SystemMessagePromptTemplate.from_template(template)
      
                # Human question prompt
                human_template = "Answer the following question: {question}"
                human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)
      
                chat_prompt = ChatPromptTemplate.from_messages(
                    [system_message_prompt, human_message_prompt]
                )
      
                chain = LLMChain(llm=chat, prompt=chat_prompt)
      
                response = chain.run(question=query, docs=docs_page_content)
                response = response.replace("\n", "")
                return response, docs
      
            except Exception as e:
                x = str(e)
                print('Error: ', x)
      
                return '', ''

      The Python function getResponseFromQuery is designed to search a given database (db) for a specific query and then generate a response using a language model (possibly GPT-3.5-turbo). The answer is based on the content found and the particular question. Here is a simple English summary:

      1. The function getResponseFromQuery takes three parameters: db, query, and k. The k parameter is optional and defaults to 4 if not provided. db is the database to search, query is the question or prompt to analyze, and k is the number of similar documents to retrieve.
      2. The function initiates a try-except block for handling any errors that might occur.
      3. Inside the try block:
      • The function retrieves the model name and temperature value from the instance of the class this function is a part of.
      • The function then searches the db database for documents similar to the query and saves these in docs.
      • It concatenates the content of the returned documents into a single string docs_page_content.
      • It creates a ChatOpenAI object with the model name and temperature value.
      • It creates a system message prompt from a predefined template.
      • It creates a human message prompt, which is the query.
      • It combines these two prompts to form a chat prompt.
      • An LLMChain object is then created using the ChatOpenAI object and the chat prompt.
      • This LLMChain object is used to generate a response to the query using the content of the documents found in the database. The answer is then formatted by replacing all newline characters with empty strings.
      • Finally, the function returns this response along with the original documents.
      4. If any error occurs during these operations, the function goes to the except block where:
      • The error message is printed.
      • The function returns two empty strings to indicate an error occurred, and no response or documents could be produced.
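
      Note that the system template ct.templateVal_1 lives in a separate configuration module and is not reproduced here. Because chain.run() supplies both question and docs, the template presumably exposes a {docs} placeholder; the snippet below is only an illustrative guess at its shape, not the author's actual template.

      # Hypothetical stand-in for ct.templateVal_1 (the real template sits in the config module).
      # It must contain a {docs} placeholder, since chain.run(question=query, docs=docs_page_content)
      # injects the concatenated transcript chunks into the system prompt.
      templateVal_1 = """
      You are a helpful assistant that answers questions about YouTube videos
      based on the video transcript: {docs}

      Only use factual information from the transcript to answer the question.
      If you do not have enough information to answer, say "I don't know".
      """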

      def extractContentInText(self, topic, query):
          try:
              discussedTopic = []
              strKeyText = ''
              cnt = 0
              max_cnt = self.max_cnt
      
              urlList, channelList = self.topFiveURLFromYouTube(youtube, q=topic, part='id,snippet',maxResults=max_cnt,type='video')
              print('Returned List: ')
              print(urlList)
              print()
      
              for video_url in urlList:
                  print('Processing Video: ')
                  print(video_url)
                  db = self.createDBFromYoutubeVideoUrl(video_url)
      
                  response, docs = self.getResponseFromQuery(db, query)
      
                  if len(response) > 0:
                      strKeyText = 'As per the topic discussed in ' + channelList[cnt] + ', '
                      discussedTopic.append(strKeyText + response)
      
                  cnt += 1
      
              return discussedTopic
          except Exception as e:
              discussedTopic = []
              x = str(e)
              print('Error: ', x)
      
              return discussedTopic

      This Python function, extractContentInText, aims to extract relevant content from the transcripts of the top YouTube videos on a specific topic and to generate responses to a given query. Here’s a simple English explanation:

      1. The function extractContentInText is defined with topic and query as parameters.
      2. It begins with a try-except block to catch and handle any possible exceptions.
      3. In the try block:
      • It initializes several variables: an empty list discussedTopic to store the extracted information, an empty string strKeyText to keep specific parts of the content, a counter cnt initialized at 0, and max_cnt retrieved from the class instance to specify the maximum number of YouTube videos to consider.
      • It calls the topFiveURLFromYouTube function (defined previously) to get the URLs of the top videos on the given topic from YouTube. It also retrieves the list of channel names associated with these videos.
      • It prints the returned list of URLs.
      • Then, it starts a loop over each URL in the urlList.
        • For each URL, it prints the URL, then creates a database from the transcript of the YouTube video using the function createDBFromYoutubeVideoUrl.
        • It then uses the getResponseFromQuery function to get a response to the query based on the content of the database.
        • If the length of the response is greater than 0 (meaning there is a response), it forms a string strKeyText to indicate the channel that the topic was discussed on and then appends the answer to this string. This entire string is then added to the discussedTopic list.
        • It increments the counter cnt by one after each iteration.
      • Finally, after the loop completes, it returns the discussedTopic list, which now contains the relevant content extracted from the videos.
      4. If any error occurs during these operations, the function goes into the except block:
      • It first resets discussedTopic to an empty list.
      • Then it converts the exception e to a string and prints the error message.
      • Lastly, it returns the empty discussedTopic list, indicating that no content could be extracted due to the error.
      • testLangChain.py (Main Python script to extract the transcript from the YouTube videos & then answer the questions based on the topics selected by the users.)


      #####################################################
      #### Written By: SATYAKI DE                       ####
      #### Written On: 27-May-2023                      ####
      #### Modified On 28-May-2023                      ####
      ####                                              ####
      #### Objective: This is the main calling          ####
      #### python script that will invoke the           ####
      #### clsVideoContentScrapper class to extract     ####
      #### the transcript from the YouTube videos.      ####
      ####                                              ####
      #####################################################

      import clsL as cl
      from clsConfigClient import clsConfigClient as cf
      import datetime
      import textwrap

      import clsVideoContentScrapper as cvsc

      # Disabling Warning
      def warn(*args, **kwargs):
          pass

      import warnings
      warnings.warn = warn

      ######################################
      ### Get your global values        ####
      ######################################
      debug_ind = 'Y'

      # Initiating Logging Instances
      clog = cl.clsL()

      data_path = cf.conf['DATA_PATH']
      data_file_name = cf.conf['FILE_NAME']

      cVCScrapper = cvsc.clsVideoContentScrapper()

      ######################################
      #### Global Flag              ########
      ######################################

      def main():
          try:
              var = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
              print('*'*120)
              print('Start Time: ' + str(var))
              print('*'*120)

              #query = "What are they saying about Microsoft?"
              print('Please share your topic!')
              inputTopic = input('User: ')
              print('Please ask your questions?')
              inputQry = input('User: ')
              print()

              retList = cVCScrapper.extractContentInText(inputTopic, inputQry)
              cnt = 0

              for discussedTopic in retList:
                  finText = str(cnt + 1) + ') ' + discussedTopic
                  print()
                  print(textwrap.fill(finText, width=150))

                  cnt += 1

              r1 = len(retList)

              if r1 > 0:
                  print()
                  print('Successfully Scrapped!')
              else:
                  print()
                  print('Failed to Scrappe!')

              print('*'*120)
              var1 = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
              print('End Time: ' + str(var1))

          except Exception as e:
              x = str(e)
              print('Error: ', x)

      if __name__ == "__main__":
          main()

      Please find the key snippet –

      def main():
          try:
              var = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
              print('*'*120)
              print('Start Time: ' + str(var))
              print('*'*120)
      
              #query = "What are they saying about Microsoft?"
              print('Please share your topic!')
              inputTopic = input('User: ')
              print('Please ask your questions?')
              inputQry = input('User: ')
              print()
      
              retList = cVCScrapper.extractContentInText(inputTopic, inputQry)
              cnt = 0
      
              for discussedTopic in retList:
                  finText = str(cnt + 1) + ') ' + discussedTopic
                  print()
                  print(textwrap.fill(finText, width=150))
      
                  cnt += 1
      
              r1 = len(retList)
      
              if r1 > 0:
                  print()
                  print('Successfully Scrapped!')
              else:
                  print()
                  print('Failed to Scrappe!')
      
              print('*'*120)
              var1 = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
              print('End Time: ' + str(var1))
      
          except Exception as e:
              x = str(e)
              print('Error: ', x)
      
      if __name__ == "__main__":
          main()

      The above main application captures the topic from the user & then gives the user a chance to ask specific questions on that topic, invoking the main class to extract the transcripts from YouTube, feeding them as a source using LangChain & finally delivering the response. If a video yields no response, it is simply skipped.

      USAGE & COST FACTOR:

      Please find the OpenAI usage –

      Please find the YouTube API usage –


      So, finally, we’ve done it.

      I know that this post is relatively longer than my earlier posts. But I think you will get all the details once you go through it.

      You will get the complete codebase in the following GitHub link.

      I’ll bring some more exciting topics in the coming days from the Python verse. Please share & subscribe to my post & let me know your feedback.

      Till then, Happy Avenging! 🙂

      Note: All the data & scenarios posted here are representational, available over the internet, & for educational purposes only. Some of the images (except my photo) we’ve used are available over the net. We don’t claim ownership of these images. There is always room for improvement, especially in the prediction quality. The sample video is taken from Santrel Media & you can find the link here.