

#speechrecognition


"#KarenHao only really gets her teeth into this point in the book’s epilogue, “How the Empire Falls.” She takes inspiration from #TeHiku, a #Māori AI #speechrecognition project. Te Hiku seeks to revitalize the #te_reo language through putting archived audio tapes of te reo speakers into an AI model, teaching new generations of Māori.
The tech has been developed on consent and active participation from the Māori community, and it is only licensed to organizations that respect Māori values"


@thelinuxEXP I really like Speech Note! It's a fantastic tool for quick and local voice transcription in multiple languages, created by @mkiol

It's incredibly handy for capturing thoughts on the go, conducting interviews, or making voice memos without worrying about language barriers. The app uses strictly locally running LLMs, and its ease of use makes it a standout choice for anyone needing offline transcription services.

I primarily use #WhisperAI for transcription and Piper for voice, but many other models are available as well.

It is available as a Flatpak and on GitHub: github.com/mkiol/dsnote

#TTS #transcription #TextToSpeech #translator translation #offline #machinetranslation #sailfishos #SpeechSynthesis #SpeechRecognition #speechtotext #nmt #linux-desktop #stt #asr #flatpak-applications #SpeechNote

🌟 Excited to share Thorsten-Voice's YouTube channel! 🎥 🗣️🔊 ♿ 💬

Thorsten presents innovative TTS solutions and a variety of voice technologies, making it an excellent starting point for anyone interested in open-source text-to-speech. Whether you're a developer, accessibility advocate, or tech enthusiast, his channel offers valuable insights and resources. Don't miss out on this fantastic content! 🎬

Follow him here: @thorstenvoice
or on YouTube: youtube.com/@ThorstenMueller

YouTube: Thorsten-Voice
Guude! (hi, nice to see you) 👋, i'm Thorsten 😊. You like open source, privacy aware and local running voice technology? Me too 😎. You'll find cooking recipe like tutorials on TTS, STT, Voice Assistants, AI, ML and way more cool stuff here. So, hop on and join my amazing community 🥰.
#opensource #voice #cloning #technology #news #tutorial #local #privacy #tech #tts #stt #voiceassistant #raspberrypi #smarthome #homeassistant
* My project website: https://www.Thorsten-Voice.de
* Me on GitHub: https://github.com/thorstenMueller

Yesterday, I ordered food online. However, it went a little wrong, so I contacted support. They called me, and for a moment I thought it was a bot or a recorded voice or something. And I hated it. Then I realized it was a human on the line.

I was planning to build an LLM + TTS + speech recognition setup and deploy it on an A311D, to see if I could practice a British accent with it. Now I'm rethinking what I want to do. The way we are going doesn't lead to a good destination. I would hate to have to talk to a voice-enabled chatbot as a support agent rather than a human.

And don't get me wrong. Voice-enabled chatbots can have tons of good uses. But replacing humans with LLMs is not a good one, I don't think.

#LLM #AI #TTS

Now that my #wake_word_detection #research has borne fruit, I plan to continue working in the voice domain. I would love to train a #TTS model with a #British accent so I could use it to practice.

I was wondering if I could run inference on the #A311D #NPU. However, as I skim papers on different models, running inference on the A311D with reasonable performance seems unlikely. Even training these models on my entry-level #IntelArc #GPU would be painful.

Maybe I could just fine-tune an existing model. I am also thinking about using #GeneticProgramming for some components of these TTS models to see whether that improves inference performance.

There are #FastSpeech2 and #SpeedySpeech, which look promising. I wonder how natural their accents will be, but they would be good starting points.

BTW, if anyone needs open-source models, I would love to work as a freelancer and have an #opensource job. Even if someone could just provide access to compute resources, that would be good.

#forhire #opensourcejob #job #hiring

Speech recognition systems struggle with accents and dialects, risking problems in critical fields like healthcare and emergency services. Imagine calling 911 and the AI used to screen out non-emergency calls can’t understand you.

A Spanish language professor explains: theconversation.com/sorry-i-di #AI #speechrecognition

The Conversation: ‘Sorry, I didn’t get that’: AI misunderstands some people’s words more than others
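
For anyone wondering how "misunderstands some people more than others" is usually put into numbers: the standard metric for speech recognition is word error rate (WER), the word-level edit distance between what was said and what the system heard, divided by the number of words actually said. Below is a minimal sketch, not taken from the article; the function name and the example sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts of the same spoken sentence.
spoken = "call an ambulance now"
heard = "call and ambulance"
print(word_error_rate(spoken, heard))
```

Comparing this number across groups of speakers (same sentences, different accents) is one common way to show that a system works better for some people than for others.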

#UnplugTrump - Tip 5:
Say goodbye to Alexa and other voice assistants that listen in on and analyze your conversations. Instead, use a privacy-friendly alternative such as OpenVoiceOS, an open-source voice assistant that is developed by an active community and runs on a Raspberry Pi. That way you stay in control of your data.

Today in my web browsing history: The tech sector waxing lyrical about #OpenAI's upgrades to #ChatGPT, which include #SpeechRecognition and #SpeechSynthesis capabilities. Meanwhile Wikipedia, which was scraped to train many LLMs, begs for donations.

All of the content we've placed online has been mined, processed, refined, and is being sold back to us.

For many of us, that's a new experience.

But I suspect that for those from the global South, it's a repeat of centuries of colonisation.

Reposting my #introduction after the SDF database crash—

Hi everybody! I'm Will and I enjoy all things #languages and #code. My day job is in #nlp (natural language processing #nlproc) and #speechrecognition for language education. In grad school I worked on #Bayesian #pragmatics with #deeplearning.

I speak English natively, #español #Spanish / #中文 #Chinese / #العربية #Arabic passably, and lots of others poorly. Talk to me in your language!