AudioMeta Speech-to-Text


Real-time Speech-to-Text Engine for Media Metadata

AudioMeta® is Zipreel Inc.’s real-time speech-to-text engine. It accepts HTTP media streams as input and generates text (metadata) as output. At present, its accuracy is sufficient to support indexing and search of the input media content using the generated text.
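As an illustration of the indexing use case, here is a minimal Python sketch that reads a DFXP (timed text) document and builds a word-to-timestamp index. The sample document, the `http://www.w3.org/ns/ttml` namespace, and the `build_index` helper are assumptions made for the example; the exact markup AudioMeta® emits may differ.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical DFXP sample; real AudioMeta output may use different markup.
SAMPLE_DFXP = """<tt xmlns="http://www.w3.org/ns/ttml" xml:lang="en">
  <body>
    <div>
      <p begin="00:00:01.000" end="00:00:03.500">Welcome to the evening news</p>
      <p begin="00:00:04.000" end="00:00:06.000">Markets closed higher today</p>
    </div>
  </body>
</tt>"""


def build_index(dfxp_text):
    """Map each lowercase word to the begin times of the cues containing it."""
    root = ET.fromstring(dfxp_text)
    index = defaultdict(list)
    for elem in root.iter():
        # Match <p> cues regardless of which XML namespace the document uses.
        if elem.tag.rsplit("}", 1)[-1] != "p":
            continue
        begin = elem.get("begin", "")
        text = "".join(elem.itertext()).lower()
        # Use a set so a word repeated within one cue is recorded once.
        for word in set(re.findall(r"[a-z0-9']+", text)):
            index[word].append(begin)
    return dict(index)


if __name__ == "__main__":
    index = build_index(SAMPLE_DFXP)
    print(index.get("markets", []))
```

Looking up a word in the returned dictionary gives the cue start times at which it was spoken, which is enough to seek into the source media from a search result.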

Each AudioMeta® service instance handles one HTTP audio stream in real time to perform speech-to-text conversion. On a single server, the number of AudioMeta® service instances that can run in real time is dictated by the peak RAM bandwidth and RAM clock rate. Each input stream needs 10-12 GB/s of RAM bandwidth. Intuitive REST APIs make workflow integration a piece of cake!
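The bandwidth figure above allows a rough capacity estimate. The sketch below divides a server's peak RAM bandwidth by the per-stream requirement. The function name and the 64 GB/s server figure are hypothetical, and the estimate ignores RAM clock rate, CPU load, and other bottlenecks, so treat the result as an upper bound rather than a guarantee.

```python
import math


def max_realtime_instances(peak_bandwidth_gbs, per_stream_gbs=12.0):
    """Upper bound on concurrent streams given peak RAM bandwidth in GB/s.

    per_stream_gbs defaults to 12.0, the conservative end of the
    10-12 GB/s range quoted for a single input stream.
    """
    if peak_bandwidth_gbs < 0 or per_stream_gbs <= 0:
        raise ValueError("bandwidth values must be positive")
    return math.floor(peak_bandwidth_gbs / per_stream_gbs)


if __name__ == "__main__":
    # Hypothetical server with 64 GB/s of peak RAM bandwidth.
    print(max_realtime_instances(64.0, 12.0))  # conservative estimate
    print(max_realtime_instances(64.0, 10.0))  # optimistic estimate
```

Sizing against the 12 GB/s end of the range leaves headroom for streams that turn out to be more demanding.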

AudioMeta® Specifications

File I/O
Inputs: MP3, FLV, AAC, MPEG2-TS, MP4
Outputs: Distribution Format Exchange Profile (DFXP)
Outputs: MPEG2-TS, MP4, Microsoft Smooth Streaming, Apple HLS, MPEG DASH
Compression Standards
MP3, AAC
Audio: Not typically required but can be supported if needed
Platforms
Xeon 55XX, Xeon 56XX, E3-26xx, E3-12xx Blades, 1U, 2U, 3U with redundant power options
Mix and match resolutions, frame rates and bit rates – very flexible output configurations