Voice Agent OpenClaw Skill - ClawHub
Do you want your AI agent to automate Voice Agent workflows? This free skill from ClawHub helps with speech & transcription tasks without building custom tools from scratch.
What this skill does
Local Voice Input/Output for Agents using the AI Voice Agent API.
Install
npx clawhub@latest install voice-agent
Full SKILL.md
| name | version | description | homepage | user invocable | disable model invocation |
|---|---|---|---|---|---|
| voice-agent | 1.1.0 | Local Voice Input/Output for Agents using the AI Voice Agent API. | https://github.com/ricardotrevisan/ai-conversational-skill | true | false |
Voice Agent
This skill allows you to speak and listen to the user using a local Voice Agent API. It is client-only and does not start containers or services. It uses local Whisper for Speech-to-Text transcription and AWS Polly for Text-to-Speech generation.
Prerequisite
Requires a running backend API at http://localhost:8000.
Backend setup instructions are in this repository:
- README.md
- walkthrough.md
- DOCKER_README.md
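As a quick way to confirm the prerequisite before using the skill, the following minimal sketch checks whether anything accepts TCP connections on localhost:8000. The function name `backend_reachable` is hypothetical and not part of the skill. The check only shows that a listener exists on that port, not that it is the Voice Agent API or that the API is healthy.

```python
import socket


def backend_reachable(host: str = "localhost", port: int = 8000, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper: it does not call any Voice Agent endpoint,
    it only attempts a plain TCP connection.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    if backend_reachable():
        print("Something is listening on localhost:8000.")
    else:
        print("Nothing is listening on localhost:8000; start the backend first.")
```

If the check fails, follow the setup documents listed above rather than trying to start services from the skill.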
Behavior Guidelines
- Audio First: When the user communicates via audio files, your PRIMARY mode of response is an audio file.
- Silent Delivery: When sending an audio response, DO NOT send a text explanation like "I sent an audio". Just send the audio file.
- Workflow:
  - User sends audio.
  - Use `transcribe` to read it.
  - You think of a response.
  - Use `synthesize` to generate the audio file.
  - You send the file.
  - STOP. Do not add text commentary.
- Failure Handling: If `health` fails or connection errors occur, do not attempt service management from this skill. Ask the user to start or fix the backend using the repository docs.
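The workflow in these guidelines could be sketched roughly as follows, using the client commands documented under Tools. This is an illustration under stated assumptions, not the skill's own implementation: it assumes `client.py transcribe` prints the transcript to stdout and that the script exits with a non-zero code on failure, neither of which is confirmed here. The names `client_cmd`, `run_client`, `reply_with_audio`, and `BackendError` are hypothetical, and `base_dir` stands in for `{baseDir}`.

```python
import subprocess


class BackendError(RuntimeError):
    """Raised when client.py fails (assumption: failure means non-zero exit code)."""


def client_cmd(base_dir: str, *args: str) -> list[str]:
    # Mirrors the documented form: python3 {baseDir}/scripts/client.py <command> ...
    return ["python3", f"{base_dir}/scripts/client.py", *args]


def run_client(base_dir: str, *args: str) -> str:
    result = subprocess.run(client_cmd(base_dir, *args), capture_output=True, text=True)
    if result.returncode != 0:
        # Per the guidelines: no service management here, surface the error instead.
        raise BackendError(result.stderr.strip() or "client.py failed")
    return result.stdout.strip()


def reply_with_audio(base_dir, audio_in, audio_out, think, run=run_client) -> str:
    """Transcribe the user's audio, build a reply, synthesize it, return the file path."""
    # Assumption: the transcript is whatever the transcribe command prints to stdout.
    transcript = run(base_dir, "transcribe", audio_in)
    reply_text = think(transcript)  # the agent's own reasoning step
    run(base_dir, "synthesize", reply_text, "--output", audio_out)
    # Only the audio path is returned: the reply is the file, with no text commentary.
    return audio_out
```

The `run` parameter exists only so the flow can be exercised without a live backend. On `BackendError`, the caller would ask the user to start or fix the backend instead of retrying.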
Tools
Transcribe File
To transcribe an audio file with local Whisper STT, run the client script with the transcribe command.
python3 {baseDir}/scripts/client.py transcribe "/path/to/audio/file.ogg"
Synthesize to File
To generate audio from text with AWS Polly TTS and save it to a file, run the client script with the synthesize command.
python3 {baseDir}/scripts/client.py synthesize "Text to speak" --output "/path/to/output.mp3"
Health Check
To check if the voice agent API is running and healthy:
python3 {baseDir}/scripts/client.py health
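One possible way to gate other commands on the health check is sketched below. It assumes the `health` command signals an unhealthy or unreachable backend through a non-zero exit code, which is not confirmed here. The names `health_ok` and `BACKEND_HELP` are hypothetical, and `base_dir` stands in for `{baseDir}`.

```python
import subprocess

# Hypothetical user-facing message for the failure-handling guideline.
BACKEND_HELP = (
    "The voice backend at http://localhost:8000 is not responding. "
    "Please start or fix it using the repository docs."
)


def health_ok(base_dir: str, runner=subprocess.run, timeout: float = 10.0) -> bool:
    """Return True if `client.py health` exits with code 0 (assumed to mean healthy)."""
    cmd = ["python3", f"{base_dir}/scripts/client.py", "health"]
    try:
        completed = runner(cmd, capture_output=True, text=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        # Interpreter or script missing, or the check hung: treat as unhealthy.
        return False
    return completed.returncode == 0
```

A caller might run `health_ok(base_dir)` before transcribing and, when it returns False, show `BACKEND_HELP` to the user without attempting to manage the service.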