
Speech & Transcription @eftalyurtseven Updated 2/26/2026

Eachlabs Voice Audio OpenClaw Skill - ClawHub

This free ClawHub skill lets your AI agent automate EachLabs voice and audio workflows, covering speech and transcription tasks, so you don't have to build custom tools from scratch.

What this skill does

Text-to-speech, speech-to-text, voice conversion, and audio processing using EachLabs AI models. Supports ElevenLabs TTS, Whisper transcription with diarization, and RVC voice conversion. Use when the user needs TTS, transcription, or voice conversion.

Install

npx clawhub@latest install eachlabs-voice-audio

Full SKILL.md

name: eachlabs-voice-audio
description: Text-to-speech, speech-to-text, voice conversion, and audio processing using EachLabs AI models. Supports ElevenLabs TTS, Whisper transcription with diarization, and RVC voice conversion. Use when the user needs TTS, transcription, or voice conversion.

EachLabs Voice & Audio

Text-to-speech, speech-to-text transcription, voice conversion, and audio utilities via the EachLabs Predictions API.

Authentication

Header: X-API-Key: <your-api-key>

Set the EACHLABS_API_KEY environment variable. Get your key at eachlabs.ai.
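
A script can verify that the key is present before it sends any request. The following is a minimal sketch; the function name `require_api_key` is illustrative and not part of the skill.

```shell
# Hypothetical helper: fail early with a clear message if the key is missing.
require_api_key() {
  if [ -z "${EACHLABS_API_KEY:-}" ]; then
    echo "EACHLABS_API_KEY is not set; get a key at eachlabs.ai" >&2
    return 1
  fi
}

# Example usage (before any curl call):
#   export EACHLABS_API_KEY="<your-api-key>"
#   require_api_key || exit 1
```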

Available Models

Text-to-Speech

| Model | Slug | Best For |
| --- | --- | --- |
| ElevenLabs TTS | elevenlabs-text-to-speech | High quality TTS |
| ElevenLabs TTS w/ Timestamps | elevenlabs-text-to-speech-with-timestamp | TTS with word timing |
| ElevenLabs Text to Dialogue | elevenlabs-text-to-dialogue | Multi-speaker dialogue |
| ElevenLabs Sound Effects | elevenlabs-sound-effects | Sound effect generation |
| ElevenLabs Voice Design v2 | elevenlabs-voice-design-v2 | Custom voice design |
| Kling V1 TTS | kling-v1-tts | Kling text-to-speech |
| Kokoro 82M | kokoro-82m | Lightweight TTS |
| Play AI Dialog | play-ai-text-to-speech-dialog | Dialog TTS |
| Stable Audio 2.5 | stable-audio-2-5-text-to-audio | Text to audio |

Speech-to-Text

| Model | Slug | Best For |
| --- | --- | --- |
| ElevenLabs Scribe v2 | elevenlabs-speech-to-text-scribe-v2 | Best quality transcription |
| ElevenLabs STT | elevenlabs-speech-to-text | Standard transcription |
| Wizper with Timestamp | wizper-with-timestamp | Timestamped transcription |
| Wizper | wizper | Basic transcription |
| Whisper | whisper | Open-source transcription |
| Whisper Diarization | whisper-diarization | Speaker identification |
| Incredibly Fast Whisper | incredibly-fast-whisper | Fastest transcription |

Voice Conversion & Cloning

| Model | Slug | Best For |
| --- | --- | --- |
| RVC v2 | rvc-v2 | Voice conversion |
| Train RVC | train-rvc | Train custom voice model |
| ElevenLabs Voice Clone | elevenlabs-voice-clone | Voice cloning |
| ElevenLabs Voice Changer | elevenlabs-voice-changer | Voice transformation |
| ElevenLabs Voice Design v3 | elevenlabs-voice-design-v3 | Advanced voice design |
| ElevenLabs Dubbing | elevenlabs-dubbing | Video dubbing |
| Chatterbox S2S | chatterbox-speech-to-speech | Speech to speech |
| Open Voice | openvoice | Open-source voice clone |
| XTTS v2 | xtts-v2 | Multi-language voice clone |
| Stable Audio 2.5 Inpaint | stable-audio-2-5-inpaint | Audio inpainting |
| Stable Audio 2.5 A2A | stable-audio-2-5-audio-to-audio | Audio transformation |
| Audio Trimmer | audio-trimmer-with-fade | Audio trimming with fade |

Audio Utilities

| Model | Slug | Best For |
| --- | --- | --- |
| FFmpeg Merge Audio Video | ffmpeg-api-merge-audio-video | Merge audio with video |
| Toolkit Video Convert | toolkit | Video/audio conversion |

Prediction Flow

  1. Check the model: GET https://api.eachlabs.ai/v1/model?slug=<slug>. This validates that the model exists and returns the request_schema with the exact input parameters. Always do this before creating a prediction so that the inputs are correct.
  2. Create the prediction: POST https://api.eachlabs.ai/v1/prediction with the model slug, version "0.0.1", and an input object matching the schema.
  3. Poll GET https://api.eachlabs.ai/v1/prediction/{id} until status is "success" or "failed".
  4. Extract the output from the response.
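
The four steps above can be sketched as shell functions built on the same curl calls used in the examples below. This is a rough sketch under stated assumptions, not an official client: the function names, the grep/sed-based `json_field` helper, the default polling interval and attempt limit, and the `predictionID` field name in the commented usage are all assumptions. Check a real response and the model's request_schema before relying on any of them.

```shell
# Sketch of the prediction flow; all function names are illustrative.
API_BASE="https://api.eachlabs.ai/v1"

# Naive extraction of the first string field named $1 from JSON on stdin.
# Assumption: adequate for flat string fields such as "status"; use a real
# JSON parser for nested or non-string values.
json_field() {
  grep -o "\"$1\"[[:space:]]*:[[:space:]]*\"[^\"]*\"" | head -n 1 \
    | sed 's/^"[^"]*"[[:space:]]*:[[:space:]]*"//; s/"$//'
}

# Step 1: fetch the model record, which includes request_schema.
check_model() {
  curl -sS -H "X-API-Key: $EACHLABS_API_KEY" "$API_BASE/model?slug=$1"
}

# Step 2: create a prediction. $1 = model slug, $2 = JSON object for "input".
create_prediction() {
  curl -sS -X POST "$API_BASE/prediction" \
    -H "Content-Type: application/json" \
    -H "X-API-Key: $EACHLABS_API_KEY" \
    -d "{\"model\": \"$1\", \"version\": \"0.0.1\", \"input\": $2}"
}

# Step 3: poll until status is "success" or "failed".
# $1 = prediction id, $2 = seconds between polls (default 2),
# $3 = maximum number of polls (default 60). Defaults are arbitrary choices.
# Returns 0 on success (body on stdout), 1 on failure, 2 on timeout.
poll_prediction() {
  poll_id="$1"; poll_interval="${2:-2}"; poll_max="${3:-60}"
  poll_attempt=0
  while [ "$poll_attempt" -lt "$poll_max" ]; do
    poll_body=$(curl -sS -H "X-API-Key: $EACHLABS_API_KEY" "$API_BASE/prediction/$poll_id")
    poll_status=$(printf '%s' "$poll_body" | json_field status)
    case "$poll_status" in
      success) printf '%s\n' "$poll_body"; return 0 ;;
      failed)  printf '%s\n' "$poll_body" >&2; return 1 ;;
    esac
    poll_attempt=$((poll_attempt + 1))
    sleep "$poll_interval"
  done
  echo "timed out waiting for prediction $poll_id" >&2
  return 2
}

# Example usage (not run here; needs network access and a valid key):
#   check_model elevenlabs-text-to-speech      # inspect request_schema first
#   resp=$(create_prediction elevenlabs-text-to-speech \
#     '{"text": "Hello", "voice_id": "EXAVITQu4vr4xnSDxMaL"}')
#   id=$(printf '%s' "$resp" | json_field predictionID)  # field name is an assumption
#   poll_prediction "$id"                      # step 4: read the output from this body
```

Capping the number of polls keeps a stuck prediction from blocking a script indefinitely; tune the interval and limit to the model you are calling.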

Examples

Text-to-Speech with ElevenLabs

curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "elevenlabs-text-to-speech",
    "version": "0.0.1",
    "input": {
      "text": "Welcome to our product demo. Today we will walk through the key features.",
      "voice_id": "EXAVITQu4vr4xnSDxMaL",
      "model_id": "eleven_v3",
      "stability": 0.5,
      "similarity_boost": 0.7
    }
  }'

Transcription with ElevenLabs Scribe

curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "elevenlabs-speech-to-text-scribe-v2",
    "version": "0.0.1",
    "input": {
      "media_url": "https://example.com/recording.mp3",
      "diarize": true,
      "timestamps_granularity": "word"
    }
  }'

Transcription with Wizper (Whisper)

curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "wizper-with-timestamp",
    "version": "0.0.1",
    "input": {
      "audio_url": "https://example.com/audio.mp3",
      "language": "en",
      "task": "transcribe",
      "chunk_level": "segment"
    }
  }'

Speaker Diarization with Whisper

curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "whisper-diarization",
    "version": "0.0.1",
    "input": {
      "file_url": "https://example.com/meeting.mp3",
      "num_speakers": 3,
      "language": "en",
      "group_segments": true
    }
  }'

Voice Conversion with RVC v2

curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "rvc-v2",
    "version": "0.0.1",
    "input": {
      "input_audio": "https://example.com/vocals.wav",
      "rvc_model": "CUSTOM",
      "custom_rvc_model_download_url": "https://example.com/my-voice-model.zip",
      "pitch_change": 0,
      "output_format": "wav"
    }
  }'

Merge Audio with Video

curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "ffmpeg-api-merge-audio-video",
    "version": "0.0.1",
    "input": {
      "video_url": "https://example.com/video.mp4",
      "audio_url": "https://example.com/narration.mp3",
      "start_offset": 0
    }
  }'

ElevenLabs Voice IDs

The elevenlabs-text-to-speech model supports these voice IDs. Pass the raw ID string:

Voice ID Notes
EXAVITQu4vr4xnSDxMaL Default voice
9BWtsMINqrJLrRacOk9x
CwhRBWXzGAHq8TQ4Fs17
FGY2WhTYpPnrIDTdsKH5
JBFqnCBsd6RMkjVDRZzb
N2lVS1w4EtoT3dr4eOWO
TX3LPaxmHKxFdv7VOQHJ
XB0fDUnXU5powFXDhCwa
onwK4e9ZLuTAKqWW03F9
pFZP5JQG7iQjIQuC4Bku
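
Because the model expects the raw ID string, a script may want to reject anything that is not in the list above before sending a request. A minimal sketch follows; the variable and function names are illustrative, and the list is simply copied from the table above, so it will go stale if the supported voices change.

```shell
# Voice IDs copied from the table above (elevenlabs-text-to-speech).
KNOWN_VOICE_IDS="EXAVITQu4vr4xnSDxMaL 9BWtsMINqrJLrRacOk9x CwhRBWXzGAHq8TQ4Fs17
FGY2WhTYpPnrIDTdsKH5 JBFqnCBsd6RMkjVDRZzb N2lVS1w4EtoT3dr4eOWO
TX3LPaxmHKxFdv7VOQHJ XB0fDUnXU5powFXDhCwa onwK4e9ZLuTAKqWW03F9
pFZP5JQG7iQjIQuC4Bku"

# Hypothetical helper: succeed only if $1 exactly matches a listed voice ID.
# Relies on sh/bash word splitting of the unquoted list.
is_known_voice_id() {
  for known_id in $KNOWN_VOICE_IDS; do
    if [ "$known_id" = "$1" ]; then
      return 0
    fi
  done
  return 1
}

# Example usage:
#   is_known_voice_id "$VOICE_ID" || { echo "unknown voice ID" >&2; exit 1; }
```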

Parameter Reference

See references/MODELS.md for complete parameter details for each model.

Original URL: https://github.com/openclaw/skills/blob/main/skills/eftalyurtseven/eachlabs-voice-audio
