Nume MacAroon @nm

**Hacker News** @h4ckernews@mastodon.social · Jul 15

Voxtral-Mini-3B-2507 – Open source speech understanding model

https://huggingface.co/mistralai/Voxtral-Mini-3B-2507

Hugging Face (huggingface.co): mistralai/Voxtral-Mini-3B-2507. We’re on a journey to advance and democratize artificial intelligence through open source and open science.

#HackerNews #OpenSource #SpeechRecognition

**Ai Orbit** @aiorbit@mastodon.social · Jul 14

Voice AI for All: How Transfer Learning & Synthetic Speech Unlock Inclusion https://aiorbit.app/voice-ai-for-all-how-transfer-learning-synthetic-speech-unlock-inclusion/

#VoiceAI #InclusiveAI #AssistiveTech #SpeechRecognition

Replied in thread

**Ecologia Digital** @josemurilo@mato.social · Jul 8

"#KarenHao only really gets her teeth into this point in the book’s epilogue, “How the Empire Falls.” She takes inspiration from #TeHiku, a #Māori AI #speechrecognition project. Te Hiku seeks to revitalize the #te_reo language through putting archived audio tapes of te reo speakers into an AI model, teaching new generations of Māori.
The tech has been developed on consent and active participation from the Māori community, and it is only licensed to organizations that respect Māori values"

**Jeremy Kahn** @trochee@dair-community.social · Jul 4

I don't know why they call it vibe coding

Replied in thread

**Debby** @debby@hear-me.social · Jul 3 *

@thelinuxEXP I really like Speech Note! It's a fantastic tool for quick and local voice transcription in multiple languages, created by @mkiol

It's incredibly handy for capturing thoughts on the go, conducting interviews, or making voice memos without worrying about language barriers. The app uses strictly locally running LLMs, and its ease of use makes it a standout choice for anyone needing offline transcription services.

I primarily use #WhisperAI for transcription and Piper for voice, but many other models are available as well.

It is available as a Flatpak and on GitHub: https://github.com/mkiol/dsnote

#TTS #transcription #TextToSpeech #translator #translation #offline #machinetranslation #sailfishos #SpeechSynthesis #SpeechRecognition #speechtotext #nmt #linux-desktop #stt #asr #flatpak-applications #SpeechNote

The image shows a screenshot of the "About" page for Speech Note 4.8.1. The page is structured with a dark gray header and a light gray body. The header includes a title "About" and a version number "4.8.1" with a subtitle "Note taking, reading and translating with Speech to Text, Text to Speech and Machine Translation." Below this, there is a section titled "Changes," followed by "About," which includes links to the project website and bug reporting pages on GitHub and GitLab, along with a support email address. The page also states that Speech Note is developed as an open-source project under the Mozilla Public License 2.0. The "Authors" section lists Michal Kosciesza as the copyright holder for the years 2021-2025. The "Translators" section lists several names, including Heimen Stoffels, Béranger Arnaud, and others. The "Libraries in use" section lists various libraries such as Qt, Coqui STT, Vosk, and others. The page has a "Close" button in the bottom right corner.

Provided by @altbot, generated privately and locally using Ovis2-8B

**Debby** @debby@hear-me.social · May 23 *

Excited to share Thorsten-Voice's YouTube channel!

Thorsten presents innovative TTS solutions and a variety of voice technologies, making it an excellent starting point for anyone interested in open-source text-to-speech. Whether you're a developer, accessibility advocate, or tech enthusiast, his channel offers valuable insights and resources. Don't miss out on this fantastic content!

Follow him here: @thorstenvoice
or on YouTube: https://www.youtube.com/@ThorstenMueller

YouTube · Thorsten-Voice: Guude! (hi, nice to see you) 👋, i'm Thorsten 😊. You like open source, privacy aware and local running voice technology? Me too 😎. You'll find cooking recipe like tutorials on TTS, STT, Voice Assistants, AI, ML and way more cool stuff here. So, hop on and join my amazing community 🥰. #opensource #voice #cloning #technology #news #tutorial #local #privacy #tech #tts #stt #voiceassistant #raspberrypi #smarthome #homeassistant
* My project website: https://www.Thorsten-Voice.de
* Me on GitHub: https://github.com/thorstenMueller

#Accessibility #FLOSS #TTS

Replied in thread

**Debby** @debby@hear-me.social · May 23 *

Guude @thorstenvoice, just found your channel and I'm impressed! Your work on TTS is fantastic and so important for accessibility in the FLOSS community. Keep it up! #AccessibilityMatters #FLOSS #TTS #OpenSource #Inclusivity #FOSS #Coqui #AI #CoquiAI #VoiceAssistant #Sprachassistent #VoiceTechnology #KünstlicheStimme #MachineLearning #Python #Rhasspy #TextToSpeech #VoiceTech #STT #SpeechSynthesis #SpeechRecognition #Sprachsynthese #ArtificialVoice #VoiceCloning #Spracherkennung #CoquiTTS #voice #a11y #ScreenReader

**Farooq | فاروق** @farooqkz@cr8r.gg · May 6

Yesterday, I ordered food online. However, it went a little off, so I contacted support. They called me, and for a moment I thought it was a bot or a recorded voice or something. And I hated it. Then I realized it was a human on the line.

I was planning to build an LLM + TTS + speech recognition setup and deploy it on an A311D, to see if I could practice a British accent with it. Now I'm rethinking what I want to do. The way we are going doesn't lead to a good destination. I would hate it if I had to talk to a voice-enabled chatbot as a support agent rather than a human.

And don't get me wrong: voice-enabled chatbots can have tons of good uses. But replacing humans with LLMs is not a good one, I think.

#LLM #AI #TTS

**Farooq | فاروق** @farooqkz@cr8r.gg · Apr 27 *

Now that my #wake_word_detection #research has borne fruit, I plan to continue working in the voice domain. I would love to train a #TTS model with a #British accent so I could use it to practice.

I was wondering if I could run inference on the #A311D #NPU. However, from skimming the papers for different models, getting reasonable inference performance on the A311D seems unlikely. Even training these models on my entry-level #IntelArc #GPU would be painful.

Maybe I could just fine-tune an existing model. I am also thinking about using #GeneticProgramming for some components of these TTS models to see if that gives better inference performance.

There are #FastSpeech2 and #SpeedySpeech, which look promising. I wonder how natural their accents will be. But they would be good starting points.

BTW, if anyone needs opensource models, I would love to work as a freelancer and have an #opensource job. Even if someone can just provide access to computation resources, that would be good.

#forhire #opensourcejob #job #hiring

#AI #VoiceAI #opensourceai

**Farooq | فاروق** @farooqkz@cr8r.gg · Feb 7

For learning languages, do you think it's a good idea to practice with an AI Speech Recognition and an AI Speech Synthesis engine?

I'm specifically interested in British English and German.

#AI #ML #LanguageLearning

**The Conversation U.S.** @TheConversationUS@newsie.social · Feb 5

Speech recognition systems struggle with accents and dialects, risking problems in critical fields like healthcare and emergency services. Imagine calling 911 and the AI used to screen out non-emergency calls can’t understand you.

A Spanish language professor explains: https://theconversation.com/sorry-i-didnt-get-that-ai-misunderstands-some-peoples-words-more-than-others-239281 #AI #speechrecognition

The Conversation: ‘Sorry, I didn’t get that’: AI misunderstands some people’s words more than others
