The Speech Team within the Siri organization drives major advances in speech recognition, speech synthesis, and speech-to-speech models for features deeply embedded throughout Apple’s ecosystem. Our team owns accurate, private, on-device speech recognition models across a range of systems-on-chip and hardware platforms with diverse compute constraints, enabling prominent production user experiences.
This team drives core technology advances while fulfilling major production needs, including developing speech-to-speech experiences and the underlying multimodal foundation model technology for current and future speech-enabled features across Apple’s software, hardware, and services ecosystem. This allows for cutting-edge applied research anchored in Apple-specific production needs, while improving speech interaction experiences for Apple’s customers around the world. Our technology powers speech interaction on iOS, watchOS, visionOS, macOS, and tvOS, including Siri, Dictation, and various speech-enabled Apple Intelligence features.
Machine Learning Research Manager, Speech and Multimodal Modeling
Apple • Onsite • Cambridge • Full Time