Matrix Business Intelligence & Analytics

🐍 What Can You Do with Python? Pretty Much Everything.
Python is one of the most versatile tools in the modern tech stack — it powers everything from machine learning models and financial dashboards to websites, automations, and creative content. Whether you're working with data, building a product, or automating a workflow, Python is your go-to language for turning ideas into real, scalable tools.

📊 Dive into analytics and business intelligence by cleaning, transforming, and visualizing data with Pandas, Plotly, or Power BI integrations.
🎛️ Automate repetitive tasks — like scraping websites, sending scheduled reports, or transforming spreadsheets — and free up hours every week.
🧠 Tap into AI and machine learning with libraries like scikit-learn, TensorFlow, or OpenAI’s API, letting you build anything from chatbots to image classifiers.
🎨 Create audio and visual content on the fly: Python can generate multilingual voiceovers, render images with DALL·E, or script and edit video timelines.
🌐 Build web apps and APIs with frameworks like Flask or FastAPI — perfect for dashboards, tools, or client-facing products.
💡 Power up your workflows by connecting to services like Microsoft Azure, Google Cloud, SharePoint, Slack, and Teams — all with a few lines of code.

Whether you're automating boring stuff, building clever tools for your business, analyzing huge datasets, or experimenting with the latest in AI, Python gives you a fast, flexible, and powerful way to get it done — and scale it when it works.

🧠 GPT Prompt Expander — Supercharge Your Content Creation

What it does:
This Python tool takes simple ideas from a CSV file — basic prompts like “write about investing” — and transforms them into rich, detailed content using the GPT-4 model. It automates the process of asking GPT questions, captures the responses, and saves the outputs in a clean CSV file with token usage stats for cost tracking.
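The script further down reads its prompts from a `prompt_text` column in `input_folder/prompts.csv`. As a minimal sketch of what that input could look like, the helper below writes a two-row sample file; the function name `write_sample_prompts` and the second sample prompt are illustrative assumptions, not part of the original tool.

```python
import csv
import os


def write_sample_prompts(directory="input_folder", file_name="prompts.csv"):
    """Create a small input CSV with the `prompt_text` header the script reads."""
    os.makedirs(directory, exist_ok=True)
    file_path = os.path.join(directory, file_name)
    sample_prompts = [
        "write about investing",
        "explain compound interest to a teenager",  # hypothetical example prompt
    ]
    with open(file_path, mode="w", newline="", encoding="utf-8") as file:
        writer = csv.DictWriter(file, fieldnames=["prompt_text"])
        writer.writeheader()
        for prompt in sample_prompts:
            writer.writerow({"prompt_text": prompt})
    return file_path
```

Calling `write_sample_prompts()` once from the project folder should give the expander a file it can read straight away.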

Why it’s useful:
Creating content from scratch is time-consuming, and vague prompts often lead to underwhelming results. This script bridges that gap by:

Turning rough concepts into ready-to-use paragraphs or scripts
Logging each prompt and GPT response so nothing gets lost
Making your creative process reproducible and efficient

🔍 Use Cases

Here’s how this script can be used in the real world:

Marketing teams: Feed it a CSV of product ideas or pain points, and get blog intros, email campaigns, or ad copy back — all at scale.
Educators and course creators: Expand basic lesson themes into rich topic outlines or module descriptions.
Social media planners: Generate weekly post drafts from one-word themes (e.g. “resilience”, “teamwork”, “compliance”).
Business analysts: Convert bullet points of observations into structured summaries or reports for stakeholders.
AI prompt engineers: Refine vague prompts into detailed ones by using GPT to rewrite them with context, tone, and examples.

🐍 GPT Prompt Expander - Code

## GPT Prompt

# Install Libraries
%pip install openai requests python-dotenv

# Import Modules (only the ones this script actually uses)
import csv
import os
from openai import OpenAI
from datetime import datetime
from dotenv import load_dotenv

print("All libraries are loaded correctly!")

# Load environment variables from .env file
load_dotenv()  # Load the .env file

# Instantiate OpenAI client
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# Send one prompt to the model and return (output, token_info)
def ask_chatgpt(prompt):
    try:
        response = client.chat.completions.create(
            model="gpt-4",  # Use the desired model
            messages=[{"role": "user", "content": prompt}]
        )
        output = response.choices[0].message.content

        # Token usage information
        total_tokens = response.usage.total_tokens
        prompt_tokens = response.usage.prompt_tokens
        completion_tokens = response.usage.completion_tokens
        return output, f"Total tokens used: {total_tokens}, Prompt tokens: {prompt_tokens}, Completion tokens: {completion_tokens}"
    except Exception as e:
        # Return a 2-tuple here as well, because the caller unpacks (output, token_info)
        return f"Error: {str(e)}", "Token info unavailable"

# Save ChatGPT output to CSV
def save_to_csv(prompts_outputs):
    directory = os.path.join(os.getcwd(), "output_folder")
    if not os.path.exists(directory):
        os.makedirs(directory)

    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    file_name = f"chatgpt_output_{timestamp}.csv"
    file_path = os.path.join(directory, file_name)

    # Save data to CSV
    with open(file_path, mode='w', newline='') as file:
        writer = csv.writer(file)
        writer.writerow(["Prompt", "Output", "Token Info"])
        for prompt, output, token_info in prompts_outputs:
            writer.writerow([prompt, output, token_info])

# Read prompts from CSV
def read_prompts_from_csv(file_name):
    prompts = []
    input_directory = os.path.join(os.getcwd(), "input_folder")
    if not os.path.exists(input_directory):
        os.makedirs(input_directory)

    file_path = os.path.join(input_directory, file_name)
    with open(file_path, mode='r') as file:
        reader = csv.DictReader(file)
        for row in reader:
            prompts.append(row['prompt_text'])  # Access prompt_text column
    return prompts

# Run ChatGPT for prompts and save outputs
def run_chatgpt_and_save(prompts):
    prompts_outputs = []
    for prompt in prompts:
        output, token_info = ask_chatgpt(prompt)

        # Print prompt and output
        print(f"\nPrompt: {prompt}")
        print(f"Full Output:\n{output}")

        # Capture all non-empty lines in the output
        output_lines = output.split('\n')
        valuable_lines = [line.strip() for line in output_lines if line.strip()]

        # Print and save valuable lines (one CSV row per non-empty line)
        print("Valuable Output:")
        for valuable_output in valuable_lines:
            print(valuable_output)
            prompts_outputs.append((prompt, valuable_output, token_info))

    save_to_csv(prompts_outputs)

## Execution Block

prompts = read_prompts_from_csv('prompts.csv')

run_chatgpt_and_save(prompts)
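Because the "Token Info" column is stored as a sentence rather than as numbers, cost tracking needs a small parsing step. The sketch below is one possible way to do it, assuming the exact wording written by `ask_chatgpt`; the helper names `parse_token_info` and `total_tokens_by_prompt` are hypothetical. Each output line is saved as its own row with the same token string repeated, so the totals are keyed by prompt to avoid counting one API call several times.

```python
import re

# Matches the token summary string written to the "Token Info" column
TOKEN_PATTERN = re.compile(
    r"Total tokens used: (\d+), Prompt tokens: (\d+), Completion tokens: (\d+)"
)


def parse_token_info(token_info):
    """Return (total, prompt, completion) counts, or None if the text does not match."""
    match = TOKEN_PATTERN.search(token_info)
    if match is None:
        return None
    total, prompt, completion = (int(group) for group in match.groups())
    return total, prompt, completion


def total_tokens_by_prompt(rows):
    """Map each prompt to its total token count, counting repeated rows once."""
    totals = {}
    for row in rows:
        parsed = parse_token_info(row["Token Info"])
        if parsed is not None:
            totals[row["Prompt"]] = parsed[0]
    return totals
```

The `rows` argument can be any iterable of dictionaries, for example a `csv.DictReader` opened on one of the generated output files. Note that two identical prompts in the same input file would collapse into one entry under this approach.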

🎨 GPT Visualiser — Turn Words into Images Automatically

What it does:
This script reads a list of prompts (by default, the "Output" column of the most recent CSV produced by the GPT Prompt Expander), sends each one to OpenAI’s DALL·E image generation API, downloads the resulting images, and saves them locally. It also appends a log row for each image (label, URL, prompt, and timestamp) for easy reference and reuse.

Why it’s useful:
Whether you're a content creator, educator, or business professional, visuals are crucial, but designing them manually is time-consuming. This tool runs AI image generation in batch: feed it a spreadsheet of creative prompts, and it returns one 1024x1024 image per prompt.

🧠 Use Cases

Marketing & Social Media: Batch-create images for blog headers, social posts, or ad creatives from content outlines or themes.
eLearning & Presentations: Generate unique visuals for slides, infographics, or course content based on lecture notes.
YouTube & Video Creators: Turn storyboards or narration prompts into scene artwork or thumbnails.
AI-Powered Storytelling: Combine it with a script generator to produce children’s books, comics, or concept art from narrative outlines.
Internal Prototypes: Visualize product ideas, logos, or UI concepts for pitch decks and stakeholder engagement.

🧩 How It Works (in Plain English)

“You give it a list of descriptions — like 'a futuristic library in the clouds' or 'a red koala playing chess' — and it fetches beautiful, AI-rendered images for each one. It saves everything for you, with timestamps and download history, so you don’t lose track of anything.”

⚙️ Tech-Savvy Extras

Compatible with CSV workflows: great for combining with other automated content generation tools
Saves timestamps and file tracking for audit or reuse
Flexible file naming using prompt snippets and time data
Works well with previous scripts (like the GPT Prompt Expander) for end-to-end idea-to-asset automation
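The "prompt snippet plus timestamp" naming scheme can be sketched as a small standalone function. This is an illustration rather than the script's exact code: the helper name `build_image_file_name`, the `extension` parameter, and the choice to replace every non-alphanumeric run with an underscore are assumptions made here so that prompts containing slashes or colons still yield valid file names.

```python
import re
from datetime import datetime


def build_image_file_name(prompt, now=None, max_chars=10, extension="png"):
    """Build a file name like GPT-<prompt snippet>-<timestamp>.<extension>."""
    now = now or datetime.now()
    # Keep the first few characters of the prompt and make them file-system safe
    snippet = re.sub(r"[^A-Za-z0-9]+", "_", prompt[:max_chars]).strip("_")
    timestamp = now.strftime("%Y%m%d_%H%M%S")
    return f"GPT-{snippet or 'image'}-{timestamp}.{extension}"
```

Passing a fixed `now` value makes the result reproducible, which is handy when testing; set `extension` to whatever format your downloads actually use.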

🐍 GPT Visualiser - Code

# Install Libraries
#%pip install openai requests python-dotenv

# Import Modules
import os
import csv
import requests
import openai  # Import OpenAI
from datetime import datetime
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# Set OpenAI API key
openai_api_key = os.getenv("OPENAI_API_KEY")
openai.api_key = openai_api_key

# Folder where images will be saved
output_folder = 'gptimage_output_folder'
if not os.path.exists(output_folder):
    os.makedirs(output_folder)

# Function to read prompts from the CSV file
def read_prompts_from_csv(file_name):
    prompts = []
    file_path = os.path.join(os.getcwd(), "output_folder", file_name)

    # Read the file and extract the prompts
    with open(file_path, mode='r') as file:
        reader = csv.DictReader(file)
        for row in reader:
            prompts.append(row['Output'])  # Assuming the "Output" column exists
    return prompts

# Function to generate images using OpenAI DALL·E API
def generate_image_with_gpt(prompt):
    try:
        # openai>=1.0 removed openai.Image.create; images.generate is its replacement
        response = openai.images.generate(
            prompt=prompt,
            n=1,
            size="1024x1024"
        )
        image_url = response.data[0].url
        print(f"Image generated: {image_url}")
        return image_url
    except Exception as e:
        print(f"Error generating image: {str(e)}")
        return None

# Function to download and save the image locally
def download_image(image_url, prompt):
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")

    # Create a file name based on the prompt and timestamp, replacing every
    # character that is not a letter or digit so the name is valid on disk
    safe_snippet = "".join(c if c.isalnum() else "_" for c in prompt[:10])
    # The image endpoint returns PNG data, so use a matching extension
    file_name = f"GPT-{safe_snippet}-{timestamp}.png"
    image_path = os.path.join(output_folder, file_name)

    # Download the image (the timeout stops a stalled download from hanging the run)
    response = requests.get(image_url, timeout=60)
    if response.status_code == 200:
        with open(image_path, 'wb') as img_file:
            img_file.write(response.content)
        print(f"Image saved as {image_path}")
    else:
        print(f"Failed to download image: {response.status_code}")

# Save image data to CSV with timestamp
def save_image_data(generation_id, image_url, prompt):
    directory = os.path.join(os.getcwd(), "gptimage_output_folder")
    file_path = os.path.join(directory, "output.csv")

    # Get the current timestamp
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")

    # Append image data to CSV
    with open(file_path, mode='a', newline='') as file:
        writer = csv.writer(file)
        writer.writerow([generation_id, image_url, prompt, timestamp])
    print(f"Image data saved to {file_path}")

# Function to get the nth last file in a directory (sorted by modification time)
def get_nth_last_file(directory, n=1):
    files = [f for f in os.listdir(directory) if os.path.isfile(os.path.join(directory, f))]
    files.sort(key=lambda x: os.path.getmtime(os.path.join(directory, x)), reverse=True)
    if len(files) < n:
        print(f"Not enough files in directory. Only {len(files)} available.")
        return None
    return files[n-1]

# Execution Block for ChatGPT Output Prompts
directory = os.path.join(os.getcwd(), "output_folder")
nth_last = 1  # Default to last file, change to nth as needed
file_name = get_nth_last_file(directory, nth_last)

if file_name:
    print(f"Using file: {file_name}")
    prompts = read_prompts_from_csv(file_name)
    for prompt in prompts:
        print(f"Generating image for prompt: {prompt}")
        image_url = generate_image_with_gpt(prompt)
        if image_url:
            download_image(image_url, prompt)
            save_image_data("GPT", image_url, prompt)
else:
    print("No file found.")

📄 Azure Vision OCR — Automated Text Extraction from Images and PDFs

What it does:
This Python script integrates with Microsoft Azure’s Computer Vision API to scan and extract text from the most recent file in a designated folder — including PDFs, screenshots, scanned documents, or images. It polls the Azure OCR engine, retrieves the results, and saves them to a timestamped .txt file, ready for analysis or reporting.

In plain terms:

“Drop an image or PDF into a folder. This script grabs it, runs it through Microsoft’s world-class OCR engine, and spits out all the text it finds. It even works with multi-page PDFs and screenshots. No manual typing, no fiddling.”

🧠 Use Cases

Regulatory Compliance: Automatically extract content from paper-based contracts, forms, or IDs for digital storage and audit readiness.
Invoice & Receipt Processing: Pull line items and totals from scanned invoices or expense receipts for bookkeeping or data entry workflows.
Knowledge Digitization: Convert training manuals, old scans, or handwritten notes into digital formats.
Risk & Surveillance: Scan printed market notices or physical memos and flag content for compliance or review.
Translation Pipelines: Combine this with a language detection or translation engine to convert scanned foreign-language content into English.
Legal Discovery: Process scanned evidence or court documents for searchability and internal tagging.

⚙️ Technical Highlights

Supports multiple file types: PDF, JPG, PNG, BMP, TIFF, GIF
Uses Azure Vision API v3.2 for accurate OCR
Automatically selects the most recent file for processing
Outputs clean, line-by-line text files with timestamps
Compatible with scalable workflows or RPA integrations
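The description above mentions that the script polls the OCR engine until the result is ready. A polling loop is safer with an upper time limit, so a stuck operation cannot hang the workflow. The following is a minimal, library-agnostic sketch of that idea: `poll_until_done` and its parameters are hypothetical names, and `fetch_status` stands in for whatever function performs the real status request and returns a dictionary with a `status` key.

```python
import time


def poll_until_done(fetch_status, interval_seconds=1.0, timeout_seconds=60.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it reports 'succeeded' or 'failed', or time runs out."""
    deadline = clock() + timeout_seconds
    while True:
        result = fetch_status()
        status = result.get("status")
        if status == "succeeded":
            return result
        if status == "failed":
            raise RuntimeError("Operation reported failure")
        if clock() >= deadline:
            raise TimeoutError("Operation did not finish before the timeout")
        # Any other status (for example 'running') means: wait and ask again
        sleep(interval_seconds)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised in tests without real waiting; in normal use the defaults are fine.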

🐍 Azure Vision OCR - Code

##MS VISION

# Install Libraries
%pip install openai requests python-dotenv langdetect
%pip install azure-cognitiveservices-speech
%pip install azure-cognitiveservices-vision-computervision

## Rest API Method
import os
import glob
import requests
from dotenv import load_dotenv
from datetime import datetime
import time

# Step 1: Load environment variables from .env file for API key and endpoint
load_dotenv()

# Retrieve the API key and base URL from environment variables
subscription_key = os.getenv("MS_VISION_API_KEY1")
base_url = os.getenv("VISION_TRANSLATION_URL")

# Ensure that the API key and endpoint are loaded properly
# (check before concatenating, otherwise a missing URL raises a TypeError)
if not subscription_key or not base_url:
    raise ValueError("API Key or Endpoint is missing! Check your .env file.")

endpoint = base_url.rstrip("/") + "/vision/v3.2/read/analyze"
print(f"Using Endpoint: {endpoint}")
# The subscription key is deliberately not printed, to keep secrets out of logs

# Step 2: Ensure the output folder exists, or create it
output_folder = 'azure_vision_output_folder'
if not os.path.exists(output_folder):
    os.makedirs(output_folder)

# Step 3: Function to find the most recent file (images, PDFs, screenshots) in the input folder
def get_most_recent_file(directory):
    # Handle various image and document formats, including PDFs
    patterns = ['*.jpg', '*.png', '*.jpeg', '*.tiff', '*.bmp', '*.gif', '*.pdf']
    files = []
    for pattern in patterns:
        files.extend(glob.glob(os.path.join(directory, pattern)))

    if not files:
        raise ValueError(f"No files found in the directory: {directory}")

    # Return the newest file by ctime (creation time on Windows,
    # last metadata change on Unix-like systems)
    return max(files, key=os.path.getctime)

# Step 4: Perform OCR on the file (image or PDF)
def perform_ocr(file_path):
    print(f"Reading file content from: {file_path}")
    with open(file_path, 'rb') as file_stream:
        # Set headers and send POST request to Azure Vision API
        headers = {
            'Ocp-Apim-Subscription-Key': subscription_key,
            'Content-Type': 'application/octet-stream'
        }
        response = requests.post(endpoint, headers=headers, data=file_stream)

    # Check if the request was accepted successfully
    if response.status_code == 202:
        operation_location = response.headers.get('Operation-Location')
        print(f"Operation URL: {operation_location}")
        return operation_location
    else:
        raise Exception(f"Error: {response.status_code}, {response.text}")

# Step 5: Poll the operation location to get OCR results
def get_ocr_results(operation_location, max_attempts=60):
    headers = {
        'Ocp-Apim-Subscription-Key': subscription_key
    }
    # Poll a bounded number of times so a stuck operation cannot loop forever
    for _ in range(max_attempts):
        # Poll the operation location to check status
        response = requests.get(operation_location, headers=headers)
        response.raise_for_status()
        result = response.json()
        status = result.get('status')

        # Check if the operation is complete
        if status == 'succeeded':
            return result['analyzeResult']
        elif status == 'failed':
            raise Exception("OCR operation failed.")

        # Wait for a moment before polling again
        print("Waiting for OCR operation to complete...")
        time.sleep(1)

    raise TimeoutError(f"OCR operation did not complete after {max_attempts} attempts.")

# Step 6: Save the OCR results to a text file
def save_ocr_results(results, output_folder):
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    output_file_name = f"ocr_output_{timestamp}.txt"
    output_file_path = os.path.join(output_folder, output_file_name)

    # Write OCR results to the output file, one detected line per row
    with open(output_file_path, 'w', encoding='utf-8') as output_file:
        for page in results['readResults']:
            for line in page['lines']:
                output_file.write(line['text'] + '\n')
                print(f"Detected Text: {line['text']}")

    print(f"OCR results saved successfully to {output_file_path}")

# Step 7: Main function to handle the workflow
def main():
    input_folder = 'azure_vision_input_folder'
    try:
        # Get the most recent file from the input folder (handles images, PDFs)
        latest_file = get_most_recent_file(input_folder)
        print(f"Processing the most recent file: {latest_file}")

        # Perform OCR by sending the file to Azure Vision API
        operation_location = perform_ocr(latest_file)

        # Poll the operation location to get the OCR results
        ocr_results = get_ocr_results(operation_location)

        # Save the OCR results to a text file
        save_ocr_results(ocr_results, output_folder)
    except ValueError as e:
        print(f"Error: {e}")
    except Exception as e:
        print(f"Unexpected error occurred: {e}")

# Run the script
if __name__ == "__main__":
    main()
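To see what the save step works with, it can help to run the page-and-line traversal on a small dictionary shaped like the result the script reads (`readResults`, then `lines`, then `text`). The sample below is hand-written for illustration, not real service output, and the helper name `extract_lines` is hypothetical.

```python
def extract_lines(analyze_result):
    """Return the recognised text lines, page by page, from an analyzeResult-style dict."""
    lines = []
    for page in analyze_result.get("readResults", []):
        for line in page.get("lines", []):
            lines.append(line["text"])
    return lines


# Hand-made sample with two pages, mirroring only the keys the script uses
sample_result = {
    "readResults": [
        {"page": 1, "lines": [{"text": "Invoice 001"}, {"text": "Total: 42.00"}]},
        {"page": 2, "lines": [{"text": "Thank you"}]},
    ]
}

print("\n".join(extract_lines(sample_result)))
```

Returning a list instead of writing straight to disk keeps the text available for later steps, such as translation or keyword flagging.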

🌍 Multilingual Voiceover Generator — Instantly Globalise Your Content

What it does:
This powerful Python script reads a written script file and uses ElevenLabs' advanced multilingual TTS API to generate natural-sounding voiceovers in up to 8 languages: English, Spanish, Portuguese, Hindi, Japanese, Arabic, Indonesian, and Russian. It automatically detects the latest content file, selects the voice, and outputs professionally voiced .mp3 files — ready for social media, YouTube, or eLearning use.

🧠 Use Cases

🌐 Social Media Multilingual Boost: Post your TikToks, Shorts, and Reels with audio narration in multiple languages — reach global audiences without re-recording.
📢 Marketing Campaigns: Deliver regional versions of product pitches, explainer videos, or launch announcements across APAC, EMEA, and LATAM markets.
🎓 eLearning Modules: Make educational material accessible in the learner’s native language — from compliance training to onboarding.
🗣️ AI Podcasts: Convert your English script into audio episodes in multiple languages for international syndication.
📹 YouTube Localization: Use it to voiceover subtitles or narration for foreign-language channels without hiring multiple VOs.
📖 Audiobooks and Storytelling: Instantly narrate stories in various languages for global audiences or multilingual children’s apps.

📦 Technical Highlights

Automatically scans the most recent script file
Uses voice IDs stored securely via environment variables for custom voice selection
Detects script language using langdetect if needed
Creates high-quality .mp3 files with timestamps and language codes
Fully integrated with ElevenLabs multilingual model (v2)
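The voice selection relies on a naming convention: a voice called "Charlotte" is looked up under the environment variable `VOICE_Charlotte`, with spaces in the name turned into underscores. A minimal sketch of that convention follows; the helper names and the placeholder ID `abc123` are assumptions for illustration, and the `environ` parameter is added here only so the lookup can be tried without touching real settings.

```python
import os


def voice_env_var_name(voice_name):
    """Turn a voice name into the variable name expected in the .env file."""
    return f"VOICE_{voice_name.replace(' ', '_')}"


def lookup_voice_id(voice_name, environ=os.environ):
    """Return the configured voice ID, or None when the variable is not set."""
    return environ.get(voice_env_var_name(voice_name))


# A matching .env entry would look like:  VOICE_Charlotte=<your voice id>
print(voice_env_var_name("Charlotte"))
print(lookup_voice_id("Charlotte", {"VOICE_Charlotte": "abc123"}))
```

Keeping IDs in the environment means the script can switch voices without code changes, and the IDs stay out of version control.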

🧑‍💼 In Plain English

“You write one script. This tool gives you eight versions of that script — each one spoken out loud in a different language, using realistic AI voices. It’s your voice, globally amplified.”

⚙️ Bonus Features

Heuristics to avoid duplicate runs if language-specific versions already exist
Modular voice settings (e.g., stability and similarity boost)
Supports API voice preview — see and select available ElevenLabs voices dynamically
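The duplicate-run heuristic depends on spotting language names inside file names (for example a `-Spanish-` segment). The sketch below shows one way such a check could work, under stated assumptions: the sample file names are invented, the shortened `LANGUAGES` mapping is illustrative, and `group_by_language` and `choose_run_mode` are hypothetical helpers rather than the script's own functions.

```python
import os

# Shortened, illustrative mapping of language codes to the names used in file names
LANGUAGES = {"es": "Spanish", "hi": "Hindi"}


def group_by_language(file_names, languages=LANGUAGES):
    """Split file names into {lang_code: [files]} plus a list of untagged files."""
    tagged = {}
    untagged = []
    for name in file_names:
        base = os.path.basename(name)
        for code, label in languages.items():
            if f"-{label}-" in base:
                tagged.setdefault(code, []).append(name)
                break
        else:
            # No language segment found, so treat it as the base script
            untagged.append(name)
    return tagged, untagged


def choose_run_mode(tagged, untagged, languages=LANGUAGES):
    """Decide whether to voice every language file or only the base script."""
    if all(code in tagged for code in languages):
        return "all_languages"
    if untagged:
        return "base_only"
    return "nothing_to_do"
```

For example, a folder holding `GPT-Spanish-1.txt`, `GPT-Hindi-1.txt`, and `script.txt` would fall into the "all_languages" branch, while a folder with only `script.txt` would produce a single voiceover.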

🐍 Multilingual Voiceover Generator - Code

import os
import requests
import glob
from datetime import datetime
from dotenv import load_dotenv
from langdetect import detect, LangDetectException

print("All libraries are loaded correctly!")

# Load environment variables from the .env file
load_dotenv()

# URL for fetching available voices
voices_url = "https://api.elevenlabs.io/v1/voices"

# Define constants and variables for the TTS request
XI_API_KEY = os.getenv("XI_API_KEY")  # Load the API key from the .env file
CHUNK_SIZE = 1024  # Size of chunks to read/write during streaming

# Define available languages, including Russian and English
languages = {
    "es": "Spanish",
    "pt": "Brazilian Portuguese",
    "hi": "Hindi",
    "ja": "Japanese",
    "ar": "Arabic",
    "id": "Indonesian",
    "ru": "Russian",
    "en": "English"  # Added English to the list
}

# Voice settings for Eleven Labs TTS
voice_settings = {
    "stability": 0.4,
    "similarity_boost": 0.8,
    "style": 0.0,
    "use_speaker_boost": True
}

# Function to get voice_id from .env using the voice name
def get_voice_id_by_name(voice_name):
    voice_env_var = f"VOICE_{voice_name.replace(' ', '_')}"  # Convert name to match .env format
    voice_id = os.getenv(voice_env_var)
    if voice_id:
        return voice_id
    else:
        print(f"Voice with name {voice_name} not found in the environment variables!")
        return None

# Ensure the eleven_output_folder exists, or create it
output_folder = r'C:\Users\Temp\pyproj\eleven_output_folder'  # Replace with your directory
if not os.path.exists(output_folder):
    os.makedirs(output_folder)

# Generate dynamic output file name based on voice_id, language, and current timestamp
def generate_output_file_name(voice_id, language, output_folder):
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")  # Get the current timestamp
    file_name = f"Eleven-{voice_id}-{language}-{timestamp}.mp3"
    return os.path.join(output_folder, file_name)

# Function to get the latest file for each language or the base (non-language) file
def get_latest_files(directory):
    files = glob.glob(os.path.join(directory, '*.txt'))

    # Check if any files are found
    if not files:
        raise ValueError(f"No .txt files found in the directory: {directory}")

    # Store the latest files for each language
    latest_files = {}

    # Identify most recent file for each language and for the base English file
    for lang_code, lang_name in languages.items():
        language_files = [file for file in files if f"-{lang_name}-" in file]
        if language_files:
            latest_files[lang_code] = max(language_files, key=os.path.getctime)

    # Also fetch non-language-specific file (for English or general use)
    non_language_files = [file for file in files if not any(f"-{lang_name}-" in file for lang_name in languages.values())]
    if non_language_files:
        latest_files["en"] = max(non_language_files, key=os.path.getctime)

    return latest_files

def read_file_content(file_path):
    with open(file_path, 'r', encoding='utf-8') as f:
        return f.read()

# Function to detect the language of the input text (for non-language-specific files)
def detect_language(text):
    try:
        language = detect(text)
        return language
    except LangDetectException:
        return "unknown"

# Function to send the request to ElevenLabs TTS API
def send_to_eleven_labs(api_key, text, voice_id, output_path, voice_settings):
    tts_url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}/stream"
    headers = {
        "Accept": "application/json",
        "xi-api-key": api_key
    }
    data = {
        "text": text,
        "model_id": "eleven_multilingual_v2",
        "voice_settings": voice_settings
    }

    response = requests.post(tts_url, headers=headers, json=data, stream=True)
    if response.ok:
        # Write the audio stream to disk chunk by chunk
        with open(output_path, "wb") as f:
            for chunk in response.iter_content(chunk_size=CHUNK_SIZE):
                f.write(chunk)
        print(f"Audio stream saved successfully to {output_path}.")
    else:
        print(f"Error: {response.status_code} - {response.text}")

# Main function to process the input files
def process_files_for_eleven_labs():
    # Fetch the latest files (both language-specific and base)
    latest_files = get_latest_files(r'C:\Users\Temp\pyproj\eleven_input_folder')

    voice_id = get_voice_id_by_name("Charlotte")  # Replace with your desired voice name
    if not voice_id:
        # Without a voice ID the request URL would be invalid, so stop early
        return

    # Heuristic: If all language files are available, process all. Otherwise, process the English/general file.
    if all(lang_code in latest_files for lang_code in languages.keys()):
        # Process all languages
        for lang_code, lang_name in languages.items():
            file_path = latest_files[lang_code]
            text = read_file_content(file_path)
            output_file_path = generate_output_file_name(voice_id, lang_code, output_folder)
            send_to_eleven_labs(XI_API_KEY, text, voice_id, output_file_path, voice_settings)
    elif "en" in latest_files:
        # Process only the non-language-specific (English) file
        file_path = latest_files["en"]
        text = read_file_content(file_path)
        detected_language = detect_language(text)
        output_file_path = generate_output_file_name(voice_id, detected_language, output_folder)
        send_to_eleven_labs(XI_API_KEY, text, voice_id, output_file_path, voice_settings)
    else:
        print("No valid files to process.")

# Run the process
if __name__ == "__main__":
    process_files_for_eleven_labs()

# Setup headers for the request
headers = {
    "Accept": "application/json",
    "xi-api-key": XI_API_KEY,  # Make sure you define XI_API_KEY
    "Content-Type": "application/json"
}

# Function to fetch and display available voices
def fetch_and_display_voices():
    response = requests.get(voices_url, headers=headers)
    if response.status_code == 200:
        voices_data = response.json()

        # Display all available voices with their voice_id
        print("Available Voices and their Voice IDs:")
        for voice in voices_data['voices']:
            print(f"Voice Name: {voice['name']} | Voice ID: {voice['voice_id']}")
        return voices_data['voices']
    else:
        print(f"Error: {response.status_code} - {response.text}")
        return None

# Fetch and display the available voices
fetch_and_display_voices()

🗣️ Azure Multilingual Voiceover Engine — Scalable Speech Synthesis with Microsoft TTS

What it does:
This script taps into Microsoft Azure’s Cognitive Services Speech API to automatically convert text files into natural-sounding voiceovers in up to 8 languages. It scans your input folder for the most recent scripts, detects the target language based on filename conventions, and generates clean, high-quality .mp3 voice files — ready for publishing, training, or social sharing.

🧠 Use Cases

🌍 Multilingual Video Narration: Perfect for YouTube, Shorts, TikToks, or Reels that need narration in multiple languages to expand reach.
📢 Marketing Personalisation: Quickly localize your product messages for international campaigns — same message, native voices.
📚 Corporate Training: Deliver multilingual compliance or onboarding content at scale, without hiring separate VOs.
📖 Audiobook & Podcast Adaptation: Create narrated versions of your blog posts, short stories, or newsletters in Spanish, Hindi, Arabic, and more.
🧾 Government & Accessibility Services: Convert legal documents, notices, or educational material into speech to improve access.

💡 Why It's Great

“This tool transforms your plain text into lifelike spoken audio in 8 different languages, using Microsoft’s AI voices — no recording studio, no delays, just fast, scalable narration.”

🌐 Supported Languages & Voices

English – Sonia (UK)
Spanish – Dalia (Mexico)
Portuguese / Brazilian Portuguese – Elza (Brazil)
Hindi – Ananya (India)
Arabic – Salma (Egypt)
Japanese – Mayu
Russian – Svetlana
Indonesian – Gadis

⚙️ Technical Features

Detects most recent .txt file per language for dynamic processing
Uses Azure Neural TTS voices with MP3 output (16 kHz, 32 kbps, mono)
Auto-generates filenames with timestamps for version control
Built-in logging and error handling for synthesis status
Easily extendable to more languages with a simple dictionary update
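Adding a language comes down to one new dictionary entry plus input files that follow the `MS-<Language>-...` naming pattern. The sketch below illustrates the idea with a trimmed voice map. The French entry is an assumption added for illustration (check the voice name against Azure's current voice list before relying on it), and `language_from_filename` and `pick_voice` are simplified, hypothetical stand-ins for the script's own functions.

```python
# Trimmed copy of the voice map, for illustration only
voices = {
    "English": "en-GB-SoniaNeural",
    "Spanish": "es-MX-DaliaNeural",
}

# Extending to a new language: one extra entry (assumed voice name, verify before use)
voices["French"] = "fr-FR-DeniseNeural"


def language_from_filename(file_name, default="English"):
    """Read the language segment from names shaped like MS-<Language>-<rest>."""
    parts = file_name.split("-")
    if file_name.startswith("MS-") and len(parts) > 2:
        return parts[1]
    return default


def pick_voice(file_name, voice_map):
    """Choose the voice for a file, falling back to English for unknown languages."""
    language = language_from_filename(file_name)
    return voice_map.get(language, voice_map["English"])


print(pick_voice("MS-French-20240101.txt", voices))
print(pick_voice("notes.txt", voices))
```

Falling back to the English voice keeps the run going when a file name carries a language that has no entry yet, at the cost of possibly reading foreign-language text with an English voice.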

🐍 Azure Multilingual Voiceover Engine - Code

##MS SPEECH

# Install Libraries
%pip install openai requests python-dotenv langdetect
%pip install azure-cognitiveservices-speech

import os
import azure.cognitiveservices.speech as speechsdk
import glob  # Import glob for file handling
from datetime import datetime
from dotenv import load_dotenv

# Load environment variables from the .env file
load_dotenv()

# Retrieve API keys and endpoint information from the environment
speech_key = os.getenv("MS_SPEECH_API_KEY1")  # Azure Speech API Key
service_region = "australiaeast"  # Set your Azure region

# Folders for input and output
input_folder = r'eleven_input_folder'
output_folder = r'speech_ms_output_folder'

# Ensure the output folder exists, or create it
if not os.path.exists(output_folder):
    os.makedirs(output_folder)

# Language-specific voices (matching full language names, including "Brazilian Portuguese")
voices = {
    "Spanish": "es-MX-DaliaNeural",  # Spanish (Mexico) - Dalia
    "Portuguese": "pt-BR-ElzaNeural",  # Portuguese (Brazil) - Elza
    "Brazilian Portuguese": "pt-BR-ElzaNeural",  # Maps Brazilian Portuguese to Elza as well
    "Hindi": "hi-IN-AnanyaNeural",  # Hindi (India) - Ananya
    "Arabic": "ar-EG-SalmaNeural",  # Arabic (Egypt) - Salma
    "Japanese": "ja-JP-MayuNeural",  # Japanese - Mayu
    "Russian": "ru-RU-SvetlanaNeural",  # Russian - Svetlana
    "Indonesian": "id-ID-GadisNeural",  # Indonesian - Gadis
    "English": "en-GB-SoniaNeural"  # English (UK) - Sonia (Default)
}

# Function to generate dynamic output file name
def generate_output_file_name(language, output_folder):
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")  # Get the current timestamp
    file_name = f"output_speech_{language}_{timestamp}.mp3"
    return os.path.join(output_folder, file_name)

# Function to read the content of the latest file
def read_file_content(file_path):
    print(f"Reading file content from: {file_path}")
    with open(file_path, 'r', encoding='utf-8') as f:
        content = f.read()
    print(f"Content read from file: {content[:100]}...")  # Print the first 100 characters for debugging
    return content

# Function to map language name from the filename
def get_language_from_filename(file_path):
    # Extract the language part from the filename, assuming format: MS-Language-...
    filename = os.path.basename(file_path)
    if "MS-" in filename:
        parts = filename.split("-")
        if len(parts) > 1:
            language_name = parts[1]  # This should give us the language name (like Russian, Spanish, etc.)
            print(f"Detected language from filename: {language_name}")
            return language_name
    return "English"  # Default to English if no match

# Function to find the most recent file for each language
def get_latest_files(directory):
    files = glob.glob(os.path.join(directory, '*.txt'))
    if not files:
        raise ValueError(f"No .txt files found in the directory: {directory}")

    latest_files = {}

    # The keys of `voices` are full language names (e.g. "Spanish"), not codes;
    # look for each name as a "-Name-" segment in the file name
    for language_name in voices:
        language_files = [file for file in files if f"-{language_name}-" in file]
        if language_files:
            # Get the most recent file for the language
            latest_files[language_name] = max(language_files, key=os.path.getctime)
            print(f"Latest file for {language_name}: {latest_files[language_name]}")

    # Find the most recent non-language-specific file (English/general) if available
    non_language_files = [file for file in files if not any(f"-{language_name}-" in file for language_name in voices)]
    if non_language_files:
        latest_files["English"] = max(non_language_files, key=os.path.getctime)
        print(f"Latest general file: {latest_files['English']}")

    return latest_files

# Function to process and synthesize speech using Azure Speech SDK
def synthesize_speech(text, language, output_path):
    print(f"Synthesizing speech for language: {language}")
    print(f"Using output file: {output_path}")

    # Set up the speech configuration
    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)

    # Set the appropriate voice for the language
    voice = voices.get(language, voices["English"])  # Default to English if no voice found
    speech_config.speech_synthesis_voice_name = voice
    print(f"Using voice: {voice}")

    # Set the audio output format to MP3
    speech_config.set_speech_synthesis_output_format(speechsdk.SpeechSynthesisOutputFormat.Audio16Khz32KBitRateMonoMp3)

    # Create a speech synthesizer
    audio_output = speechsdk.audio.AudioOutputConfig(filename=output_path)
    speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_output)

    # Perform the text-to-speech synthesis
    result = speech_synthesizer.speak_text_async(text).get()

    # Handle the results
    if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
        print(f"Speech synthesized successfully and saved to {output_path}")
    elif result.reason == speechsdk.ResultReason.Canceled:
        cancellation_details = result.cancellation_details
        print(f"Speech synthesis canceled: {cancellation_details.reason}")
        if cancellation_details.reason == speechsdk.CancellationReason.Error:
            print(f"Error details: {cancellation_details.error_details}")

# Main function to process text files and generate speech outputs
def process_files_for_ms_tts():
    try:
        # Get the latest text files for each language
        latest_files = get_latest_files(input_folder)

        # Process each file dynamically based on language
        for _, file_path in latest_files.items():
            text = read_file_content(file_path)

            # Detect language from filename
            language_name = get_language_from_filename(file_path)

            # Check if the detected language is supported
            if language_name not in voices:
                print(f"Warning: Language '{language_name}' not found in the voices mapping. Defaulting to English.")
                language_name = "English"

            # Generate the output file path
            output_file_path = generate_output_file_name(language_name, output_folder)

            # Synthesize speech for the current text file
            synthesize_speech(text, language_name, output_file_path)
    except Exception as e:
        print(f"Error processing files: {e}")

# Run the process
if __name__ == "__main__":
    process_files_for_ms_tts()

Loving This?

Reach out with any suggestions to make the project even better!