Asterisk AI Voice Agent

(github.com)

49 points | by akrulino 3 hours ago

7 comments

wild_egg 58 minutes ago
The baseline configurations all note <2s and <3s times. I haven't tried any voice AI stuff yet but a 3s latency waiting on a reply seems rage inducing if you're actually trying to accomplish something.
Is that really where SOTA is right now?
- coderintherye 17 minutes ago
  Microsoft Foundry's realtime voice API (which itself is wrapping AI models from the major players) has response times in the milliseconds.
- wellthisisgreat 33 minutes ago
  No, there are models with sub-second latency for sure
looneysquash 29 minutes ago
That seems like bad news for Allison. Though I know she already had some TTS voices available, so maybe not.
eugene3306 27 minutes ago
I've created an Asterisk Codex Skill, but it turns out there is a ten-second timeout for scripts.
aftbit 1 hour ago
This opens up new possibilities for interactive phone services. Retro-futuristic for sure.
nextworddev 2 hours ago
Can I connect this to Twilio?
- kwindla 52 minutes ago
  One easy way to build voice agents and connect them to Twilio is the Pipecat open source framework. Pipecat supports a wide variety of network transports, including the Twilio MediaStream WebSocket protocol so you don't have to bounce through a SIP server. Here's a getting started doc.[1]
  (If you do need SIP, this Asterisk project looks really great.)
  Pipecat has 90 or so integrations with all the models/services people use for voice AI these days. NVIDIA, AWS, all the foundation labs, all the voice AI labs, most of the video AI labs, and lots of other people use/contribute to Pipecat. And there's lots of interesting stuff in the ecosystem, like the open source, open data, open training code Smart Turn audio turn detection model [2], and the Pipecat Flows state machine library [3].
  [1] - https://docs.pipecat.ai/guides/telephony/twilio-websockets [2] - https://github.com/pipecat-ai/smart-turn [3] - https://github.com/pipecat-ai/pipecat-flows/
  Disclaimer: I spend a lot of my time working on Pipecat. Also writing about both voice AI in general and Pipecat in particular. For example: https://voiceaiandvoiceagents.com/
  - nextworddev 4 minutes ago
    This is good stuff.
    In your opinion, how close is Pipecat + OSS to replacing proprietary infra from Vapi, Retell, Sierra, etc?
    P.S. Did you write this web guide?
- VladVladikoff 1 hour ago
  Technically yes, Twilio has SIP trunks.
krater23 2 hours ago
Please don't. I had a talk with a shitty AI bot on a FedEx line. It's absolute crap. Just give me a 'Type 1 for x, type 2 for y'. Then I don't need to guess what the possibilities are.
- EvanAnderson 1 hour ago
  Voice-controlled phone systems are hugely rage-inducing for me. I am often in a loud setting with background chatter. Muting my audio and using a touchtone keypad is so much more accurate and easy than having to find a quiet place and worrying that somebody is going to say something that the voice response system detects.
- 9x39 1 hour ago
  One problem is once you’re in deep building a phone IVR workflow beyond X or Y (yes, these are intentional), callers don’t care about some deep and featured input menu. They just mash 0 or pick a random option and demand a human finish the job and transfer them - understandably.
  When you’re committed to phone intent complexity (hell), the AI assisted options are sort of less bad since you don’t have to explain the menu to callers, they just make demands.
  - tartoran 38 minutes ago
    What if the goal is to keep gaslighting you until you give up your demands?
    - 9x39 24 minutes ago
      As we know, most voice agents at large companies are a calculated game to steer customers away from expensive human agents, but not always.
      Sort of like how Jira can be a streamlined tool or a prison of 50-step workflows, it's all up to the designer.
johnebgd 2 hours ago
I welcome the spam calls from our asterisk overlords.
- VladVladikoff 1 hour ago
  I’m honestly surprised it hasn’t been more prevalent yet. I still get call centre type spam calls where you can hear all the background noise of the rest of the call centre.
  - userbinator 48 minutes ago
    Is the background noise real, or is it also AI-generated to make you think that it's a human?
    - tartoran 35 minutes ago
      The background noise is a recording for sure. No AI needed; a background-noise audio file on a loop would do.
      - VladVladikoff 24 minutes ago
        Why though? It adds nothing positive, it only makes me sure it is a scam call.