import java.io.InputStream;

/**
- * The IBM Text to Speech service provides capabilities to synthesize text into natural-sounding speech in a variety of
+ * ### Service Overview
+ * The IBM Text to Speech service provides a Representational State Transfer (REST) Application Programming Interface
+ * (API) that uses IBM's speech-synthesis capabilities to synthesize text into natural-sounding speech in a variety of
* languages, dialects, and voices. The service currently synthesizes text from US English, UK English, French, German,
* Italian, Japanese, Spanish, or Brazilian Portuguese into audio spoken in a male or female voice (the service supports
* only a single gender for some languages). The audio is streamed back to the client with minimal delay.
+ * ### API Overview
+ * The Text to Speech service consists of the following related endpoints:
+ * * `/v1/voices` provides information about the voices available for synthesized speech.
+ * * `/v1/synthesize` synthesizes written text to audio speech.
+ * * `/v1/pronunciation` returns the pronunciation for a specified word. The `/v1/pronunciation` method is currently
+ * beta functionality.
+ * * `/v1/customizations` and `/v1/customizations/{customization_id}` let users create custom voice models, which are
+ * dictionaries of words and their translations for use in speech synthesis. All `/v1/customizations` methods are
+ * currently beta functionality.
+ * * `/v1/customizations/{customization_id}/words` and `/v1/customizations/{customization_id}/words/{word}` let users
+ * manage the words in a custom voice model.
+ *
+ *
+ * **Note about the Try It Out feature:** The `Try it out!` button lets you experiment with the methods of the API by
+ * making actual cURL calls to the service. The feature is **not** supported for use with the `POST /v1/synthesize`
+ * method. For examples of calls to this method, see the [Text to Speech API
+ * reference](http://www.ibm.com/watson/developercloud/text-to-speech/api/v1/).
+ * ### API Usage
+ * The following information provides details about using the service to synthesize audio:
+ * * **Audio formats:** The service supports a number of audio formats (MIME types). For more information about audio
+ * formats and sampling rates, including links to a number of Internet sites that provide technical and usage details
+ * about the different formats, see [Specifying an audio
+ * format](https://console.bluemix.net/docs/services/text-to-speech/http.html#format).
+ * * **SSML:** Many methods refer to the Speech Synthesis Markup Language (SSML), an XML-based markup language that
+ * provides annotations of text for speech-synthesis applications; for example, many methods accept or produce
+ * translations that use an SSML-based phoneme format. See [Using
+ * SSML](https://console.bluemix.net/docs/services/text-to-speech/SSML.html) and [Using IBM
+ * SPR](https://console.bluemix.net/docs/services/text-to-speech/SPRs.html).
+ * * **Word translations:** Many customization methods accept or return sounds-like or phonetic translations for words.
+ * A phonetic translation is based on the SSML format for representing the phonetic string of a word. Phonetic
+ * translations can use standard International Phonetic Alphabet (IPA) representation:
+ *
+ * <phoneme alphabet="ipa" ph="təmˈɑto"></phoneme>
+ *
+ * or the proprietary IBM Symbolic Phonetic Representation (SPR):
+ *
+ * <phoneme alphabet="ibm" ph="1gAstroEntxrYFXs"></phoneme>
+ *
+ * For more information about customization and about sounds-like and phonetic translations, see [Understanding
+ * customization](https://console.bluemix.net/docs/services/text-to-speech/custom-intro.html).
+ * * **GUIDs:** The pronunciation and customization methods accept or return a Globally Unique Identifier (GUID). For
+ * example, customization IDs (specified with the `customization_id` parameter) and service credentials are GUIDs. GUIDs
+ * are hexadecimal strings that have the format `xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx`.
+ * * **WebSocket interface:** The service also offers a WebSocket interface as an alternative to its HTTP REST interface
+ * for speech synthesis. The WebSocket interface supports both plain text and SSML input, including the SSML
+ * <mark> element and word timings. See [The WebSocket
+ * interface](https://console.bluemix.net/docs/services/text-to-speech/websockets.html).
+ * * **Authentication:** You authenticate to the service by using your service credentials. You can use your credentials
+ * to authenticate via a proxy server that resides in IBM Cloud, or you can use your credentials to obtain a token and
+ * contact the service directly. See [Service credentials for Watson
+ * services](https://console.bluemix.net/docs/services/watson/getting-started-credentials.html) and [Tokens for
+ * authentication](https://console.bluemix.net/docs/services/watson/getting-started-tokens.html).
+ * * **Custom voice model ownership:** In all cases, you must use service credentials created for the instance of the
+ * service that owns a custom voice model to use the methods described in this documentation with that model. For more
+ * information, see [Ownership of custom voice
+ * models](https://console.bluemix.net/docs/services/text-to-speech/custom-models.html#customOwner).
+ * * **Request Logging:** By default, all Watson services log requests and their results. Data is collected only to
+ * improve the Watson services. If you do not want to share your data, set the header parameter
+ * `X-Watson-Learning-Opt-Out` to `true` for each request. Data is collected for any request that omits this header. See
+ * [Controlling request logging for Watson
+ * services](https://console.bluemix.net/docs/services/watson/getting-started-logging.html).
+ *
+ * The service does not log data (words and translations) that are used to build custom voice models; your training
+ * data is never used to improve the service's base models. The service does log data when a custom model is used with a
+ * synthesize request; you must set the `X-Watson-Learning-Opt-Out` request header to prevent logging for synthesize
+ * requests. For more information, see [Request logging and data
+ * privacy](https://console.bluemix.net/docs/services/text-to-speech/custom-models.html#customLogging).
*
* For more information about the service and its various interfaces, see [About Text to
* Speech](https://console.bluemix.net/docs/services/text-to-speech/index.html).
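Review note: the GUID format described in the comment above (`xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx`, hexadecimal) lends itself to a quick client-side check before a `customization_id` is sent to the service. The sketch below is only an illustration and is not part of the SDK; the class name `GuidCheck` and method `isGuid` are hypothetical, and it assumes that both upper- and lowercase hexadecimal digits are acceptable, which the comment does not state.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the SDK: checks the 8-4-4-4-12 hexadecimal
// layout that the Javadoc gives for GUIDs.
final class GuidCheck {
    // Assumption: hex digits may be upper- or lowercase.
    private static final Pattern GUID = Pattern.compile(
        "[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}");

    // Returns true only when the whole string matches the GUID layout.
    static boolean isGuid(String value) {
        return value != null && GUID.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(isGuid("123e4567-e89b-12d3-a456-426614174000")); // prints "true"
        System.out.println(isGuid("not-a-guid"));                           // prints "false"
    }
}
```

A check like this only validates the shape of the identifier; whether the ID refers to an existing custom voice model is still decided by the service.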
@@ -132,13 +201,14 @@ public ServiceCall<Voices> listVoices() {
* Streaming speech synthesis of the text in the body parameter.
*
* Synthesizes text to spoken audio, returning the synthesized audio stream as an array of bytes. Identical to the
- * `GET` method but passes longer text in the body of the request, not with the URL. Text size is limited to 5 KB. If
- * a request includes invalid query parameters, the service returns a `Warnings` response header that provides
- * messages about the invalid parameters. The warning includes a descriptive message and a list of invalid argument
- * strings. For example, a message such as `\"Unknown arguments:\"` or `\"Unknown url query arguments:\"` followed by
- * a list of the form `\"invalid_arg_1, invalid_arg_2.\"` The request succeeds despite the warnings. **Note about the
- * Try It Out feature:** The `Try it out!` button is **not** supported for use with the the `POST /v1/synthesize`
- * method. For examples of calls to the method, see the [Text to Speech API
+ * `GET` method but passes longer text in the body of the request, not with the URL. Text size is limited to 5 KB.
+ * (For the `audio/l16` format, you can optionally specify `endianness=big-endian` or `endianness=little-endian`; the
+ * default is little endian.) If a request includes invalid query parameters, the service returns a `Warnings`
+ * response header that provides messages about the invalid parameters. The warning includes a descriptive message and
+ * a list of invalid argument strings. For example, a message such as `\"Unknown arguments:\"` or `\"Unknown url query
+ * arguments:\"` followed by a list of the form `\"invalid_arg_1, invalid_arg_2.\"` The request succeeds despite the
+ * warnings. **Note about the Try It Out feature:** The `Try it out!` button is **not** supported for use with the
+ * `POST /v1/synthesize` method. For examples of calls to the method, see the [Text to Speech API
* reference](http://www.ibm.com/watson/developercloud/text-to-speech/api/v1/).
*
* @param synthesizeOptions the {@link SynthesizeOptions} containing the options for the call
@@ -147,9 +217,7 @@ public ServiceCall<Voices> listVoices() {
  public ServiceCall<InputStream> synthesize(SynthesizeOptions synthesizeOptions) {
    Validator.notNull(synthesizeOptions, "synthesizeOptions cannot be null");
    RequestBuilder builder = RequestBuilder.post("/v1/synthesize");
-   if (synthesizeOptions.accept() != null) {
-     builder.header("Accept", synthesizeOptions.accept());
-   }
+   builder.header("Accept", synthesizeOptions.accept());
    if (synthesizeOptions.voice() != null) {
      builder.query("voice", synthesizeOptions.voice());
    }
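Review note: the updated Javadoc says that `audio/l16` output can be requested with `endianness=big-endian` or `endianness=little-endian`, with little endian as the default. A caller reading the returned bytes has to decode them with the matching byte order. The following is a minimal sketch of that decoding using only `java.nio`; the class name `Pcm16Decoder` and its `decode` method are hypothetical and not part of the SDK, and the sketch assumes the bytes are raw 16-bit signed PCM samples with no container header.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical helper, not part of the SDK: turns raw 16-bit PCM bytes into samples.
final class Pcm16Decoder {
    // bigEndian should mirror the endianness requested from the service;
    // false corresponds to the documented little-endian default.
    static short[] decode(byte[] audio, boolean bigEndian) {
        ByteOrder order = bigEndian ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN;
        // A trailing odd byte, if any, is ignored.
        short[] samples = new short[audio.length / 2];
        ByteBuffer.wrap(audio).order(order).asShortBuffer().get(samples);
        return samples;
    }

    public static void main(String[] args) {
        byte[] raw = {0x01, 0x00, (byte) 0xFF, 0x7F};
        System.out.println(Arrays.toString(decode(raw, false))); // prints "[1, 32767]"
        System.out.println(Arrays.toString(decode(raw, true)));  // prints "[256, -129]"
    }
}
```

The same four bytes yield very different sample values under the two byte orders, which is why a mismatch between the requested endianness and the decoder typically shows up as noise rather than as an error.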