Girard AI - AI Automation Platform

The Voice feature provides speech synthesis and speech recognition, both powered by OpenAI. Use it to convert text into natural-sounding speech or to transcribe audio into text.

Overview

Voice offers two main capabilities:

Text-to-Speech (TTS) - Convert text to audio
Speech-to-Text (STT) - Transcribe audio to text

Text-to-Speech (TTS)

What is TTS?

Text-to-Speech converts written text into natural-sounding audio. Use it for:

Creating voiceovers
Generating audio content
Accessibility applications
Audio notifications
Content narration

Using TTS

Navigate to Voice in the sidebar
Select the Text to Speech tab
Enter your text in the input field
Choose a voice
Click Generate Speech
Download or play the audio

Available Voices

Voice	Description	Best For
Alloy	Neutral, balanced	General purpose
Echo	Warm, conversational	Podcasts, casual content
Fable	Narrative, storytelling	Audiobooks, stories
Onyx	Deep, authoritative	Professional, formal
Nova	Friendly, energetic	Marketing, upbeat content
Shimmer	Clear, expressive	Instructions, tutorials

Voice Examples

Alloy - A versatile voice suitable for any content type.

Echo - Best for conversational content that feels approachable.

Fable - Perfect for narrative content and storytelling.

Onyx - Ideal for professional presentations and formal content.

Nova - Great for energetic marketing and promotional material.

Shimmer - Excellent for clear instructional content.

TTS Settings

Setting	Options	Description
Voice	6 options	Choose the voice personality
Speed	0.25x - 4.0x	Adjust playback speed
Format	MP3, WAV	Output audio format
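
If you set these options programmatically, it can help to validate them before sending a request. The following is a minimal sketch using only the values from the tables above. The `TTSSettings` class, its defaults, and the `validate_settings` helper are hypothetical names for illustration and are not part of the Girard API.

```python
from dataclasses import dataclass

# Values copied from the Available Voices and TTS Settings tables.
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
FORMATS = {"mp3", "wav"}
MIN_SPEED, MAX_SPEED = 0.25, 4.0


@dataclass(frozen=True)
class TTSSettings:
    """Hypothetical container for the three TTS settings (defaults are assumptions)."""
    voice: str = "alloy"
    speed: float = 1.0
    audio_format: str = "mp3"


def validate_settings(settings: TTSSettings) -> list[str]:
    """Return a list of problems; an empty list means the settings look valid."""
    problems = []
    if settings.voice.lower() not in VOICES:
        problems.append(f"unknown voice: {settings.voice!r}")
    if not MIN_SPEED <= settings.speed <= MAX_SPEED:
        problems.append(f"speed {settings.speed} is outside {MIN_SPEED}-{MAX_SPEED}")
    if settings.audio_format.lower() not in FORMATS:
        problems.append(f"unsupported format: {settings.audio_format!r}")
    return problems
```

For example, `validate_settings(TTSSettings(voice="nova", speed=1.25))` returns an empty list, while a speed of 5.0 produces one problem message.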

Character Limits

Single request: Up to 4,096 characters
Longer content: Split into multiple requests
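
One way to split longer content is to cut at sentence boundaries so that no chunk ends mid-sentence. The sketch below assumes that approach; `split_for_tts` is a hypothetical helper, not a platform function. It collapses the whitespace between sentences to a single space.

```python
import re

MAX_CHARS = 4096  # per-request limit stated above


def split_for_tts(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Prefers sentence boundaries; a single sentence longer than the limit
    is hard-cut as a last resort.
    """
    # Split after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Hard-cut sentences that cannot fit in one request on their own.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent as its own TTS request, and the resulting audio files joined in order.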

Tips for Better TTS

Use Punctuation - Commas and periods create natural pauses
Spell Out Numbers - Write "twenty-five" instead of "25" when a number must be read clearly
Add Emphasis - Use ALL CAPS sparingly for emphasis
Preview First - Generate short samples before long content
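
The "Spell Out Numbers" tip can be automated for simple cases. The following sketch only handles standalone integers from 0 to 99 and deliberately leaves decimals, dollar amounts, and longer numbers such as years untouched. The function names are hypothetical, and other cases (percentages, for example) would need extra handling.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

# Match one or two digits that are not part of a longer number, a decimal,
# a dollar amount, or an identifier such as "v2".
STANDALONE_NUMBER = re.compile(r"(?<![\w.,$])\d{1,2}(?!\w|[.,]\d)")


def number_to_words(n: int) -> str:
    """Spell out an integer from 0 to 99."""
    if not 0 <= n <= 99:
        raise ValueError("only 0-99 is handled in this sketch")
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def spell_out_numbers(text: str) -> str:
    """Replace standalone one- or two-digit integers with words."""
    return STANDALONE_NUMBER.sub(lambda m: number_to_words(int(m.group())), text)
```

Run the text through `spell_out_numbers` before generating speech, then preview a short sample to confirm the result sounds right.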

Example Text

Welcome to Girard, your all-in-one AI platform.

Today, we're excited to show you how easy it is to create
amazing content using artificial intelligence.

Let's get started!

Speech-to-Text (STT)

What is STT?

Speech-to-Text transcribes audio into written text. Use it for:

Meeting transcription
Voice notes
Accessibility
Content creation
Audio analysis

Using STT

Navigate to Voice in the sidebar
Select the Speech to Text tab
Upload an audio file or record
Click Transcribe
Review and copy the text

Supported Audio Formats

Format	Extension	Max Size
MP3	.mp3	25 MB
MP4	.mp4, .m4a	25 MB
WAV	.wav	25 MB
WebM	.webm	25 MB
MPEG	.mpeg, .mpga	25 MB
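
Checking a file against this table before uploading avoids a failed request. A minimal sketch follows; `check_audio_file` is a hypothetical helper, and it assumes "25 MB" means 25 × 1024 × 1024 bytes, which the platform may count differently.

```python
from pathlib import Path

# Extensions from the Supported Audio Formats table.
SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".m4a", ".wav", ".webm", ".mpeg", ".mpga"}
# Assumption: 1 MB = 1024 * 1024 bytes.
MAX_BYTES = 25 * 1024 * 1024


def check_audio_file(name: str, size_bytes: int) -> list[str]:
    """Return reasons the file would likely be rejected; empty means it looks fine."""
    problems = []
    extension = Path(name).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if size_bytes > MAX_BYTES:
        size_mb = size_bytes / (1024 * 1024)
        problems.append(f"file is {size_mb:.1f} MB; the limit is 25 MB")
    return problems
```

For a file on disk, call it as `check_audio_file(path.name, path.stat().st_size)` with a `pathlib.Path` object.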

Recording Audio

Record directly in the browser:

Click the Record button
Grant microphone permission if prompted
Speak clearly into your microphone
Click Stop when finished
Review the recording
Click Transcribe

Transcription Features

Automatic Punctuation - Adds periods, commas, etc.
Speaker Detection - Identifies different speakers (beta)
Timestamps - Optional timestamp markers
Language Detection - Automatically detects language

Supported Languages

STT supports multiple languages including:

English (US, UK, AU)
Spanish
French
German
Italian
Portuguese
Japanese
Chinese (Mandarin)
And many more...

Tips for Better Transcription

Clear Audio - Minimize background noise
Speak Clearly - Enunciate words properly
Quality Mic - Use a good microphone
Optimal Distance - Stay neither too close to nor too far from the microphone
Steady Volume - Maintain consistent volume

Workflow Examples

Creating a Podcast Intro

Write your intro script
Choose an engaging voice (Nova or Echo)
Generate the audio
Download and add to your podcast

Welcome back to Tech Talk, the podcast where we dive deep
into the latest technology trends. I'm your host, and today
we have an exciting episode lined up for you.

Transcribing an Interview

Upload the interview audio file
Click Transcribe
Review the text for accuracy
Edit any errors
Export for your article

Creating Audio Notifications

Write short, clear messages
Use appropriate voice (Shimmer for instructions)
Generate audio clips
Integrate into your application

Your order has been confirmed. You'll receive a shipping
notification within 24 hours.

API Usage

TTS API Example

curl -X POST https://www.girardai.com/api/voice/tts \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello, world!",
    "voice": "alloy"
  }'
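
The same request can be built from Python with the standard library. This is a sketch under one stated assumption: the response format is not documented here, so `synthesize` assumes the endpoint answers with raw audio bytes. Both function names are hypothetical.

```python
import json
import urllib.request

TTS_URL = "https://www.girardai.com/api/voice/tts"  # endpoint from the curl example


def build_tts_request(text: str, voice: str, api_key: str) -> urllib.request.Request:
    """Build the same POST request as the curl example without sending it."""
    body = json.dumps({"text": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        TTS_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def synthesize(text: str, voice: str, api_key: str, out_path: str) -> None:
    """Send the request and save the response body to `out_path`.

    Assumption: the endpoint returns audio bytes. If it returns JSON
    (for example a download URL), parse the body instead of saving it.
    """
    request = build_tts_request(text, voice, api_key)
    with urllib.request.urlopen(request, timeout=60) as response:
        audio = response.read()
    with open(out_path, "wb") as handle:
        handle.write(audio)
```

Keeping request construction separate from sending makes it possible to inspect the headers and body without spending credits.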

STT API Example

curl -X POST https://www.girardai.com/api/voice/stt \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "audio=@recording.mp3"
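
The `-F` flag makes curl send a multipart/form-data upload. A rough standard-library equivalent in Python is sketched below. The field name `audio` and the endpoint come from the curl example; the helper names are hypothetical, and the shape of the response is not documented here, so the sketch stops at building the request.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path

STT_URL = "https://www.girardai.com/api/voice/stt"  # endpoint from the curl example


def encode_audio_form(filename: str, audio: bytes, boundary: str) -> bytes:
    """Encode one file as multipart/form-data under the field name "audio"."""
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="audio"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + audio + tail


def build_stt_request(path: str, api_key: str) -> urllib.request.Request:
    """Build the same upload as the curl example without sending it."""
    file_path = Path(path)
    boundary = uuid.uuid4().hex  # random separator unlikely to occur in the audio
    body = encode_audio_form(file_path.name, file_path.read_bytes(), boundary)
    return urllib.request.Request(
        STT_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Pass the result to `urllib.request.urlopen` to send it, then read the response body to get the transcription.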

Credit Usage

Action	Credits
TTS (per 1000 characters)	1
STT (per minute of audio)	2
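
To budget credits ahead of time, the rates above can be turned into a small estimator. The rounding behavior is an assumption: this sketch rounds partial units up to the next whole unit, while the platform may bill proportionally. The function names are hypothetical.

```python
import math

TTS_CREDITS_PER_1000_CHARS = 1  # from the Credit Usage table
STT_CREDITS_PER_MINUTE = 2      # from the Credit Usage table


def tts_credits(characters: int) -> int:
    """Estimate TTS credits, rounding up to whole 1,000-character blocks (assumption)."""
    return math.ceil(characters / 1000) * TTS_CREDITS_PER_1000_CHARS


def stt_credits(audio_seconds: float) -> int:
    """Estimate STT credits, rounding up to whole minutes (assumption)."""
    return math.ceil(audio_seconds / 60) * STT_CREDITS_PER_MINUTE
```

Under these assumptions, a full 4,096-character TTS request costs 5 credits and a 90-second recording costs 4 credits.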

Quality & Limitations

TTS Quality

Natural-sounding voices
Consistent pronunciation
Emotional expression
Multiple languages supported

TTS Limitations

Max 4,096 characters per request
No voice cloning
Limited custom pronunciation
English voices work best with English text

STT Accuracy

High accuracy for clear audio
Handles accents well
Automatic punctuation
Good with technical terms

STT Limitations

Background noise affects quality
Very fast speech may be less accurate
Some specialized terms may be missed
Max file size 25 MB

Best Practices

For TTS

Write for Speech
- Use conversational language
- Avoid complex sentences
- Read aloud before generating
Format Appropriately
- Break long text into paragraphs
- Use punctuation for pacing
- Consider the listener
Choose the Right Voice
- Match voice to content type
- Test multiple voices
- Consider your audience

For STT

Prepare Good Audio
- Use quality recording equipment
- Minimize background noise
- Ensure clear speech
Optimize Files
- Keep under 25 MB
- Use supported formats
- Trim unnecessary portions
Review Output
- Check for errors
- Verify technical terms
- Add formatting as needed

Troubleshooting

TTS Issues

No audio generated:

Check character count
Verify text input
Try a different voice

Pronunciation issues:

Spell out problematic words
Use phonetic spelling
Break up complex terms
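
Phonetic respellings can be applied automatically before text is sent to TTS, so the original script stays readable. The sketch below assumes a simple find-and-replace approach; `apply_pronunciations` and the example respellings are illustrative and not a platform feature.

```python
import re


def apply_pronunciations(text: str, overrides: dict[str, str]) -> str:
    """Replace whole words or phrases with phonetic respellings (case-insensitive)."""
    if not overrides:
        return text
    # Longest keys first so a phrase such as "Girard AI" wins over "Girard".
    keys = sorted(overrides, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(key) for key in keys) + r")(?!\w)",
        flags=re.IGNORECASE,
    )
    lookup = {key.lower(): value for key, value in overrides.items()}
    return pattern.sub(lambda match: lookup[match.group(1).lower()], text)
```

For example, with `{"SQL": "sequel", "nginx": "engine x"}` the sentence "Restart Nginx, then run the SQL script." becomes "Restart engine x, then run the sequel script.", while "MySQL" is left alone because only whole words match.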

STT Issues

Low accuracy:

Improve audio quality
Reduce background noise
Speak more clearly

Upload failed:

Check file size (25 MB maximum)
Verify file format
Try converting to MP3
