Speechmatics

The most accurate and inclusive speech-to-text API ever released. Speechmatics exists to understand every voice. Speechmatics is a registered trademark.

Speechmatics offers its speech-to-text API engine for solution and service providers to integrate into their stack, irrespective of their industry or use case. Businesses around the world use Speechmatics to accurately understand and transcribe human speech into text regardless of demographic, age, gender, accent, dialect or location. Speechmatics is based in Cambridge, UK; Denver, USA; Chennai, India; and Brno, Czech Republic.

29/12/2023

2023 has been quite a busy year for us here at Speechmatics. As a final round-up ahead of the New Year, we thought we’d share some highlights:

2 billion parameters used by our Self Supervised Learning model in Ursa, launched earlier this year - https://bit.ly/3tBp5sR

6,050 YEARS of audio transcribed this year (or 52,970,961 hours if you prefer bigger numbers)

49 languages we can now transcribe, with recent additions of Persian and Spanish Bilingual, meaning you can now transcribe the languages spoken by over half the world’s population. https://bit.ly/471MVvA

41% uplift in Norwegian accuracy achieved this year.

35 languages we can now translate to and from English, opening up new markets and opportunities.

18% improvement in real-time speaker diarization accuracy, giving you incredible multi-speaker transcripts.

17% reduction in errors on average across all languages this year (and we’re not done yet).

17 events we attended this year with IBC being a highlight.

7.3x improvement in the time it takes to transcribe long files (over 20 mins), slashing our Real Time Factor (RTF) to just 0.04.

4th in a list of EU AI companies compiled by Crunchbase, alongside Mistral, Synthesia and Stability. https://bit.ly/481I8vs

4 weeks was all it took to add accurate, reliable Persian to the Speechmatics roster, adding 110 million new voices we can understand. https://bit.ly/3GOzYL9

4 new Speech Capabilities launched: Summaries, Chapters, Topics and Sentiment. Combining the power of accurate ASR with LLMs now empowers you to do more than ever with your media.

3 new videos from The.Shed, where we show prototypes of products and features that can be built using our Capabilities. https://bit.ly/3NDadkv

1 new bold vision for the future of Speech Technology with Speech Intelligence. https://bit.ly/3tBoWWl
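
For anyone who wants to sanity-check two of the figures above, here is a minimal Python sketch. It assumes the usual definition of Real Time Factor (processing time divided by audio duration), which the post itself does not spell out, and a 365-day year. The function names are hypothetical and are not part of any Speechmatics API.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: a 365-day year


def hours_to_years(hours: float) -> float:
    """Convert a number of hours into 365-day years."""
    return hours / HOURS_PER_YEAR


def processing_seconds(audio_seconds: float, rtf: float) -> float:
    """Estimate processing time, assuming RTF = processing time / audio duration."""
    if audio_seconds < 0 or rtf < 0:
        raise ValueError("audio_seconds and rtf must be non-negative")
    return audio_seconds * rtf


if __name__ == "__main__":
    years = hours_to_years(52_970_961)
    print(f"{years:.0f} years of audio")

    one_hour = 60 * 60  # seconds of audio
    seconds = processing_seconds(one_hour, 0.04)
    print(f"{seconds:.0f} s to transcribe one hour of audio at RTF 0.04")
```

Under those assumptions, 52,970,961 hours works out to roughly 6,047 years (rounded to 6,050 above), and an hour-long file at an RTF of 0.04 would take about 144 seconds to transcribe.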

28/12/2023

Here are 10 of our most popular pieces of content from 2023:

In a world of increasingly costly machine learning model deployments, ensuring accurate GPU operation timing is key to resource optimization. In this technical blog post, we explore best practices to achieve this in PyTorch - https://bit.ly/4avL8Ss

When it comes to speech-to-text, accuracy goes hand-in-hand with accessibility. In this blog we explore how Ursa stacks up for people from underrepresented groups - https://bit.ly/4auDojG

Accuracy Team Lead, John Hughes, explains how and why Word Error Rate as an accuracy measure is outdated and often misaligned with human judgment.
https://bit.ly/4axDZkM

2023 was definitely the year that LLMs and ChatGPT captured the public imagination. This is our technical guide to how GPT-4 works - https://bit.ly/487X4IC

Our launch blog for Summaries, which allows businesses to extract more value from audio & video content - https://bit.ly/483vA6E

On the theme of launches... Back in April, we announced our ability to translate in real-time - https://bit.ly/484ENM8

Two exciting new senior hires joined Speechmatics this year, with Trevor Back as Chief Product Officer and Usman Gulfaraz as Chief Revenue Officer -
https://bit.ly/4ax2sq5

Hyperscalers, like Microsoft, offer convenient ASR solutions, but their generic models have limitations for Contact Center as a Service (CCaaS) vendors. Here we explore their limitations - https://bit.ly/3vabrxr

Seven examples of speech recognition and areas where speech-to-text technology makes a valuable difference - https://bit.ly/4az6PRy

And finally, a big one for Speechmatics and the future of our space. CEO Katy Wigdahl introduces Speech Intelligence, foundational speech technology for the AI era - https://bit.ly/3uVjfmH

22/12/2023

How does Speechmatics 'learn' a new language?

Strangely, in much the same way people do. Here's how 👇

Step 1: The Lessons
We start by training our models in a similar fashion to taking a language class or course.

We immerse the models in the language and watch as they pick it up over time. Like a good linguist, large models are able to soak up the data and learn from it very quickly. Others may need to see a lot more data, or see the same data a few times, to meet our required levels of accuracy.

Step 2: The Mock Exams

As with learning a language, a couple of practice tests are a great way to solidify your learning. And just like in real life, if you have a friend who speaks the language fluently, you might be brave enough to engage them in conversation, with them providing pointers and feedback on where you are making mistakes. For us, one such friend was a customer to whom we could send some of our models for feedback.

Step 3: The Final Exam
In real life, the final exam cannot be something you've seen before, or be too closely tied to your teacher's approach. It is an exam, after all. Our evaluation data is our 'final' exam.

This evaluation dataset comes from different sources that we trust highly, and we use this third set for testing the models once they're built to see how well we've done and benchmark ourselves in the market.
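
The three steps above map loosely onto the familiar training / validation / held-out evaluation split. The sketch below is only an illustration of that idea in Python, not a description of how Speechmatics builds its datasets. The `split_by_source` function, the `source` field and the 10% validation fraction are all assumptions made for the example.

```python
import random


def split_by_source(clips, eval_sources, val_frac=0.1, seed=0):
    """Split clips into (train, validation, final_exam) lists.

    Clips whose source is in eval_sources are held out entirely, so the
    'final exam' never overlaps with anything seen during the lessons.
    """
    final_exam = [c for c in clips if c["source"] in eval_sources]
    rest = [c for c in clips if c["source"] not in eval_sources]

    # Shuffle a copy deterministically, then carve off the 'mock exam' set.
    random.Random(seed).shuffle(rest)
    n_val = max(1, round(len(rest) * val_frac)) if rest else 0
    validation, train = rest[:n_val], rest[n_val:]
    return train, validation, final_exam


if __name__ == "__main__":
    # Hypothetical clip metadata: 20 clips from two training sources,
    # plus 5 clips from a separate, trusted evaluation source.
    clips = [{"id": i, "source": s}
             for i, s in enumerate(["a"] * 10 + ["b"] * 10 + ["trusted"] * 5)]
    train, validation, final_exam = split_by_source(clips, {"trusted"})
    print(len(train), len(validation), len(final_exam))
```

Holding out whole sources, rather than random clips, is one simple way to make sure the final exam is not something the model has already seen.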

To read the full behind-the-scenes of how we added Persian in just 4 weeks, follow the link 👉 👉 👉 -

Lifting the lid on how Speechmatics' speech recognition learnt the Persian language.

21/12/2023

Continuing our mission to understand every voice, we recently added our 49th language to the roster of those we support for transcription: Persian.

Read the inside story on how we added Persian to Speechmatics in just 4 weeks, without compromising accuracy.

Lifting the lid on how Speechmatics' speech recognition learnt the Persian language.

18/12/2023

2023 was the year speech technology got an upgrade.

Read more about Speech Intelligence, which has the potential to fundamentally change how we interact with technology and unleash the biggest new source of business value in years -

A New Chapter for AI and Understanding Every Voice

15/12/2023

Podcasters, take a load off your mind and plate, with Chapters. Turn your latest episode into short chapter summaries with automatic headings and timestamps so your audience can find their favourite parts -

In this episode of The.Shed, we've created a bot that can create easy-to-read chapters of your favourite (long) podcasts. Everything is generated automatical...

15/12/2023

We use best-in-class machine learning models to identify optimal places for chapter markers based on topic changes in media files. We call this Chapters, the latest Speech Intelligence capability from Speechmatics. See the full web page here - https://bit.ly/47HB7PM

Speechmatics can now automatically detect natural transition points in spoken content, divide files into distinct chapters and then summarize the content within each chapter. This makes it effortless to create navigable chapters for videos, podcast episodes, audiobooks, lectures, and other long-form media. Instead of a single unbroken stream of sentences, transcripts can now be split into chapters at points where the media naturally moves from one topic to the next.

We even give those chapters headings, automatically. This removes the need to do this manually and satisfies that natural urge to make content easier to parse, but also gives you grouped information to use in other workflows.

We've designed it such that you do less work processing and iterating on the transcript, and can focus on what you do best, making that transcript useful and actionable in your product.

Besides the intuitive reasons for wanting to make content digestible, there are statistical engagement benefits for doing so too. For EdTech companies, longer videos have higher dropout rates among students. For media distribution companies and content creators, longer videos lose viewers.

Read the full blog here - https://bit.ly/3upJtxl

Sign up to Speechmatics to try out Chapters today - https://bit.ly/3GAyRyz

Speechmatics can now automatically detect natural transition points in spoken content to divide files into digestible, summarized chapters.

Address

Unit 296 Cambridge Science Park, Milton Road
Cambridge
CB4 0WD

Opening Hours

Monday 9am - 5:30pm
Tuesday 9am - 5:30pm
Wednesday 9am - 5:30pm
Thursday 9am - 5:30pm
Friday 9am - 5:30pm

Telephone

+441223794497
