【Position Highlights】

Cutting-edge Technology: Focus on speech large language models and next-generation ASR technologies, driving innovation in speech understanding and generation.
Core Impact: Contribute directly to the company's flagship products in speech intelligence and audio processing.
Expert Collaboration: Work alongside world-class researchers and engineers in speech AI and large language models.
Comprehensive Coverage: Research spans ASR, speech generation, speaker diarization, audio codecs, and speech-language model integration.

【Job Responsibilities】

Design and develop advanced speech algorithms based on large-scale models (Transformers, Conformers, Whisper-like architectures, and Speech LLMs).
Lead or contribute to R&D and deployment in the following key areas:
Automatic Speech Recognition (ASR): Develop robust end-to-end ASR systems for multilingual and multi-accent scenarios; optimize streaming ASR with ultra-low latency; implement context-aware and personalized speech recognition.
Speech Generation: Advance zero-shot TTS with speaker adaptation; develop controllable speech synthesis with emotion and prosody modeling; create voice conversion and cross-lingual speech generation systems.
Speaker Diarization: Build state-of-the-art speaker diarization systems for multi-speaker scenarios; develop joint ASR and diarization models; implement real-time speaker tracking and identification.
Speech Large Language Models: Design and train speech-text multimodal LLMs; develop speech understanding models with reasoning capabilities; create unified models for multiple speech tasks.

Senior/Staff Speech Algorithm Engineer (AIGC Focus)

Zoom