Letter from CEO
Confidential document.
Partners only.
Automatic Speech Recognition, Text To Speech
& Voice Compression as parts of operating platforms.
Technical and Marketing aspects.
Konstantin Lamin
CEO
Speereo Software
2009
Speech recognition technology will very soon become an inseparable part of human interaction with
machines. It will make that interaction easier and more comfortable. This is especially true of mobile
platforms, where standard interfaces are of limited use.
It is obvious that any platform's success is based on the quantity, quality, novelty and openness of
the software written for that platform. These qualities are attainable only if the platform owner has a
sound, transparent and reliable technology policy, and if the financial and marketing needs of software
developers are taken into consideration.
There are many platforms available today, and developers simply will not support one with an uncertain
future and low chances of software distribution. This situation forces platform owners to step outside their
traditional boundaries: the majority of them are hardware manufacturers, while the main growth takes place
in the software market. That is why I believe that any platform's success will strongly depend on the ability
of its top management to adapt to the new rules of the game. No new processors, hardware designs, cameras,
buttons or materials can influence a platform's success the way developer support does.
Therefore the time has come for platform owners to support ASR/TTS/VC developers. The lack of a clearly
defined strategy in this area makes distribution of these technologies difficult and dilutes the effort put
into their development.
We believe that the best way to implement speech recognition technology is to include its libraries in
the platform. This will make it possible:
To use speech engines in standard platform applications. ASR/TTS/VC libraries will become
open to platform developers;
To include a good number of ASR/TTS/VC applications in the software preinstalled on the
mobile device by default;
To give access to ASR/TTS/VC to all developers that work within the platform;
To free developers working with the platform from the paperwork and payment hassle tied to
ASR/TTS/VC licensing;
To use the same standard ASR/TTS/VC libraries within all applications, thereby saving RAM and
ROM;
To make the final move to a unified paradigm of voice interfaces across applications.
Standard libraries and an SDK will give additional support to that;
To concentrate all the development experience gained, support effort made and consulting
materials produced for ASR/TTS/VC application development in one place instead of dispersing
them among many partners;
To include the ASR/TTS/VC modules and a list of preinstalled applications in the platform cost. This
will allow the ASR/TTS/VC developer, i.e. Speereo, to cut the time and costs involved in selling and
licensing ASR/TTS/VC engines to a multitude of third-party developers, and therefore to reduce licensing
prices. Lower licensing prices will in turn cut the costs that third-party developers bear in producing
innovative software. The platform will therefore gain many necessary applications and, as a result,
grow in device sales;
To include ASR/TTS/VC in the platform SDK. By taking such a strategic step the platform owner
chooses to support one technology partner. That will allow the owner to plan medium- and long-term
strategy with lower risks and to invest in technology development (additional languages for the speech
recognition engine, for example). In this way the platform owner will develop the platform market and
give it a direction for the future. Such a responsible approach will make the platform grow and be
attractive to third-party developers for long-term partnership.
The next step in interface development will once again show who has been left behind: the platform
owners that did not include ASR/TTS/VC in their platforms. It is not our business to tell platform owners
how to develop strategically. We are simply pointing out one of the possibilities and making ourselves useful.
Speereo offers:
0.3-0.5 EUR per license per device
ASR/TTS/VC libraries with the right to use them in third-party applications on the given device; support
and training of third-party developers by Speereo included.
0.5-1.2 EUR per package
ASR/TTS/VC libraries with the right to use them in third-party applications on the given device; support
and training of third-party developers by Speereo; plus a starter application package (launcher, organizer,
translator).
In the case of a licensing contract and technological partnership, we are ready to take on responsibility
for technology support and development in accordance with the platform owner's strategy (compatibility
with new devices, for example).
Alternative partnerships
The "Investment → Application development → Sales to end users" model is not that attractive for us
because of high development costs (application development in our field is more complex than average).
A return on investment would be possible if several segments of the retail market were covered. This is
possible, but strategically it means slower technological development and fewer resources attracted. Our
main specialty is speech recognition and its use.
The "Investment → Licensing to third-party developers" model also has drawbacks. Lacking knowledge of
ASR/TTS/VC technologies and their quality, third-party developers bring in additional risks. Serious
knowledge of the subject requires a substantial investment of resources. Third-party developers will not
know whether a given ASR/TTS/VC engine supports the platform at all. Moreover, development costs will be
added to the ASR/TTS/VC license price, and the profitability of such a project will become very doubtful.
It is worth mentioning that application developers are generally not very skilled in the ASR/TTS/VC field.
Someone must set the trend, and doing so takes a great deal of work. We have already
developed a basic application package that will get users and developers used to speech recognition
technologies. Why then go down the same road twice? It is better to use the accumulated experience and
move forward.
Konstantin Lamin
CEO