How Siri Works
Initial Sounds
First, the sounds of your speech are
immediately encoded into a compact
digital form that preserves their
information.
Language Comprehenders
The iPhone sends this signal over the
Internet to a cloud-based model made up
of a series of language comprehenders.
Speech Evaluation
Simultaneously, a local model evaluates
the speech on the device. Then, together
with the cloud models, the device
determines whether it needs the network
to proceed or can handle the request
locally. For example, sending a text or
searching the web requires a network
query, while playing a song or setting an
alarm can be handled locally. If the
request is deemed local, the cloud is no
longer used.
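
The local-versus-network decision described above can be sketched in a few lines of Python. This is only an illustration, not how Siri is actually implemented: the intent names, the `Route` type, and the `choose_route` function are all hypothetical, and the rule that unknown requests fall back to the network is an assumption.

```python
from enum import Enum


class Route(Enum):
    LOCAL = "local"
    NETWORK = "network"


# Hypothetical intent names, based on the examples given in the text.
LOCAL_INTENTS = {"play_song", "set_alarm"}
NETWORK_INTENTS = {"send_text", "search_web"}


def choose_route(intent: str) -> Route:
    """Decide whether a request can stay on the device."""
    if intent in LOCAL_INTENTS:
        return Route.LOCAL
    # Assumption: anything not known to be local is sent to the network.
    return Route.NETWORK


if __name__ == "__main__":
    for intent in ("set_alarm", "search_web"):
        print(intent, "->", choose_route(intent).value)
```

In this sketch, a request routed to `Route.LOCAL` would never touch the cloud models again, which mirrors the last sentence of the section.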
Language to Letters
Using both local and server models, the
device recognizes which letters
correspond to which parts of the speech.
Once the speech has been converted into
letters (vowels and consonants), a
language model can estimate the words
that make up the speech. Then, the system
creates a list of possible
interpretations of what your speech might
mean.
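
As a rough illustration of how a list of possible interpretations might be built and ranked, the sketch below combines candidate words for each part of the speech and scores every combination. Everything here is assumed for the example: the `WORD_PROBS` table stands in for a language model, its numbers are made up, and `rank_interpretations` is a hypothetical helper, not part of any Apple API.

```python
from itertools import product
from math import prod

# Toy stand-in for a language model: made-up word probabilities.
WORD_PROBS = {"set": 0.6, "sat": 0.1, "an": 0.5, "and": 0.3, "alarm": 0.9}


def rank_interpretations(slots, word_probs=WORD_PROBS):
    """Score every combination of candidate words, best first.

    `slots` holds, for each part of the speech, the words that the
    recognized letters could plausibly spell.
    """
    scored = []
    for words in product(*slots):
        # Multiply per-word probabilities; unknown words score 0.
        score = prod(word_probs.get(word, 0.0) for word in words)
        scored.append((" ".join(words), score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    slots = [["set", "sat"], ["an", "and"], ["alarm"]]
    for text, score in rank_interpretations(slots):
        print(f"{score:.3f}  {text}")
```

A real language model would score words in context rather than independently; the independent product is used here only to keep the example short.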
Final Action
After this, it is all downhill. From the
list of possible interpretations, the one
with the highest confidence is used. The
computer determines the intent of the
speech and performs the corresponding
function on the iPhone. If your speech is
too ambiguous at any point during the
process, the computer will defer acting
and first confirm that the intent it
determined is correct.
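
The selection step can be sketched as picking the highest-scoring interpretation and flagging it for confirmation when the score is too low. The threshold value and the `pick_interpretation` function are hypothetical, chosen only to illustrate the idea of deferring on ambiguous speech.

```python
# Hypothetical cutoff below which the result is treated as ambiguous.
CONFIDENCE_THRESHOLD = 0.5


def pick_interpretation(candidates, threshold=CONFIDENCE_THRESHOLD):
    """Return (best_text, needs_confirmation).

    `candidates` is a list of (text, confidence) pairs.
    """
    if not candidates:
        # Nothing was understood, so the intent must be confirmed.
        return None, True
    best_text, best_score = max(candidates, key=lambda item: item[1])
    return best_text, best_score < threshold


if __name__ == "__main__":
    clear = [("set an alarm", 0.9), ("sat an alarm", 0.05)]
    unclear = [("call Ann", 0.4), ("call Dan", 0.35)]
    print(pick_interpretation(clear))
    print(pick_interpretation(unclear))
```

When `needs_confirmation` is true, a caller would hold off on performing the action until the intent has been confirmed.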
