In its quest to develop AI that can understand a range of different dialects, Meta has created an AI model, SeamlessM4T, that can translate and transcribe close to 100 languages across text and speech.

Meta claims that SeamlessM4T, available in open source along with SeamlessAlign, a new translation dataset, represents a “significant breakthrough” in the field of AI-powered speech-to-speech and speech-to-text.

“Our single model provides on-demand translations that enable people who speak different languages to communicate more effectively,” Meta writes in a blog post shared with TechCrunch. “SeamlessM4T implicitly recognizes the source languages without the need for a separate language identification model.”

SeamlessM4T is something of a spiritual successor to Meta’s No Language Left Behind, a text-to-text machine translation model, and Universal Speech Translator, one of the few direct speech-to-speech translation systems to support the Hokkien language. And it builds on Massively Multilingual Speech, Meta’s framework that provides speech recognition, language identification and speech synthesis tech across more than 1,100 languages.

Meta isn’t the only one investing resources in developing sophisticated AI translation and transcription tools.

Beyond the wealth of commercial services and open source models already available from Amazon, Microsoft, OpenAI and a number of startups, Google is creating what it calls the Universal Speech Model, a part of the tech giant’s larger effort to build a model that can understand the world’s 1,000 most-spoken languages. Mozilla, meanwhile, spearheaded Common Voice, one of the largest multi-language collections of voices for training automatic speech recognition algorithms.

But SeamlessM4T is among the more ambitious efforts to date to combine translation and transcription capabilities into a single model. In developing it, Meta says that it scraped publicly available text (on the order of “tens of billions” of sentences) and speech (4 million hours) from the web.

In an interview with TechCrunch, Juan Pino, a research scientist at Meta’s AI research division and a contributor on the project, wouldn’t reveal the exact sources of the data, saying only that there was “a variety” of them.

Not every content creator agrees with the practice of leveraging public data to train models that could be used commercially. Some have filed lawsuits against companies building AI tools on top of publicly available data, arguing that the vendors should be compelled to provide credit, if not compensation, and clear ways to opt out.

But Meta claims that the data it mined, which might contain personally identifiable information, the company admits, wasn’t copyrighted and came primarily from open source or licensed sources.

Whatever the case, Meta used the scraped text and speech to create the training dataset for SeamlessM4T, called SeamlessAlign.