Speech gapic client library #1012

Merged 29 commits on Jul 14, 2017
Commits
aa0de94
Migrate quickstart to GAPIC client library
dizcology Jun 13, 2017
c695c82
Migrate transcribe to GAPIC client library
dizcology Jun 13, 2017
4777dc6
Migrate transcribe_async to GAPIC client library
dizcology Jun 14, 2017
e77f5f8
Migrate transcribe_streaming to GAPIC client library
dizcology Jun 14, 2017
199a748
clean up
dizcology Jun 20, 2017
51c4d01
clean up
dizcology Jun 21, 2017
0de6c7c
Import from google.cloud.speech
dizcology Jun 26, 2017
a594c70
update transcribe samples
dizcology Jun 27, 2017
9129caf
import in alphabetic order
dizcology Jun 27, 2017
4db0f45
remove unused variable
dizcology Jun 29, 2017
f09dfec
use strings instead of enums
dizcology Jun 29, 2017
66d53aa
restructure code
dizcology Jun 30, 2017
99b2e79
comment on sreaming requests
dizcology Jul 5, 2017
c7d1ad7
import style
dizcology Jul 6, 2017
ce0d25d
flake
dizcology Jul 7, 2017
3196c73
correct indent
dizcology Jul 11, 2017
d5acd7c
migrate transcribe_streaming_mic to gapic
dizcology Jul 11, 2017
cb40b7f
update google-cloud-speech version requirement
dizcology Jul 11, 2017
34ce758
addressing review comments
dizcology Jul 11, 2017
0955793
at the end of the audio stream, put None to signal to the generator
dizcology Jul 11, 2017
e355325
flake
dizcology Jul 12, 2017
a5f4c35
addressing github review comments
dizcology Jul 12, 2017
73d2b79
add region tags for migration guide
dizcology Jul 13, 2017
39f9b6b
update README
dizcology Jul 13, 2017
efe110c
rst format
dizcology Jul 13, 2017
1f4cda6
bullet
dizcology Jul 13, 2017
bd32ab4
addressing PR review comments
dizcology Jul 13, 2017
1f861ee
use enums
dizcology Jul 13, 2017
8fa2982
remove a word
dizcology Jul 13, 2017
4 changes: 4 additions & 0 deletions speech/cloud-client/README.rst
@@ -5,6 +5,10 @@ Google Cloud Speech API Python Samples

This directory contains samples for Google Cloud Speech API. The `Google Cloud Speech API`_ enables easy integration of Google speech recognition technologies into developer applications. Send audio and receive a text transcription from the Cloud Speech API service.

- See the `migration guide`_ for information about migrating to Python client library v0.27.

.. _migration guide: https://cloud.google.com/speech/docs/python-client-migration




6 changes: 6 additions & 0 deletions speech/cloud-client/README.rst.in
@@ -9,6 +9,12 @@ product:
recognition technologies into developer applications. Send audio and receive
a text transcription from the Cloud Speech API service.


- See the `migration guide`_ for information about migrating to Python client library v0.27.


.. _migration guide: https://cloud.google.com/speech/docs/python-client-migration

setup:
- auth
- install_deps
22 changes: 15 additions & 7 deletions speech/cloud-client/quickstart.py
@@ -21,10 +21,16 @@ def run_quickstart():
     import os

     # Imports the Google Cloud client library
+    # [START migration_import]
     from google.cloud import speech
+    from google.cloud.speech import enums
+    from google.cloud.speech import types
+    # [END migration_import]

     # Instantiates a client
-    speech_client = speech.Client()
+    # [START migration_client]
+    client = speech.SpeechClient()
+    # [END migration_client]

     # The name of the audio file to transcribe
     file_name = os.path.join(
@@ -35,14 +41,16 @@ def run_quickstart():
     # Loads the audio into memory
     with io.open(file_name, 'rb') as audio_file:
         content = audio_file.read()
-        sample = speech_client.sample(
-            content,
-            source_uri=None,
-            encoding='LINEAR16',
-            sample_rate_hertz=16000)
+        audio = types.RecognitionAudio(content=content)
+
+    config = types.RecognitionConfig(
+        encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
+        sample_rate_hertz=16000,
+        language_code='en-US')

     # Detects speech in the audio file
-    alternatives = sample.recognize('en-US')
+    response = client.recognize(config, audio)
+    alternatives = response.results[0].alternatives

     for alternative in alternatives:
         print('Transcript: {}'.format(alternative.transcript))
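
Read without the removed lines, the migrated quickstart reduces to roughly the following. This is a sketch assembled only from calls that appear in this PR, assuming google-cloud-speech==0.27.0 as pinned in requirements.txt. The function name `transcribe_local_file` and the choice to return the transcripts instead of printing them are illustrative and not part of the PR.

```python
import io


def transcribe_local_file(file_name):
    """Return the transcripts for a local LINEAR16, 16 kHz audio file.

    Hypothetical helper: the name and the return value are illustrative.
    The API calls are the ones added to quickstart.py in this PR.
    """
    # Imports are kept inside the function, as in the samples in this PR.
    from google.cloud import speech
    from google.cloud.speech import enums
    from google.cloud.speech import types

    client = speech.SpeechClient()

    # Load the whole audio file into memory.
    with io.open(file_name, 'rb') as audio_file:
        content = audio_file.read()

    audio = types.RecognitionAudio(content=content)
    config = types.RecognitionConfig(
        encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code='en-US')

    response = client.recognize(config, audio)

    # As in the PR, only the first result is read, so this assumes the
    # response contains at least one result.
    alternatives = response.results[0].alternatives
    return [alternative.transcript for alternative in alternatives]
```

The audio and the configuration are now separate objects passed to the client, where the 0.26 code bundled both into a `sample` and called `recognize` on it.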
2 changes: 1 addition & 1 deletion speech/cloud-client/requirements.txt
@@ -1 +1 @@
-google-cloud-speech==0.26.0
+google-cloud-speech==0.27.0
45 changes: 31 additions & 14 deletions speech/cloud-client/transcribe.py
@@ -31,33 +31,50 @@
 def transcribe_file(speech_file):
     """Transcribe the given audio file."""
     from google.cloud import speech
-    speech_client = speech.Client()
+    from google.cloud.speech import enums
+    from google.cloud.speech import types
+    client = speech.SpeechClient()

+    # [START migration_sync_request]
+    # [START migration_audio_config_file]
     with io.open(speech_file, 'rb') as audio_file:
         content = audio_file.read()
-        audio_sample = speech_client.sample(
-            content=content,
-            source_uri=None,
-            encoding='LINEAR16',
-            sample_rate_hertz=16000)

-    alternatives = audio_sample.recognize('en-US')
+    audio = types.RecognitionAudio(content=content)
+    config = types.RecognitionConfig(
+        encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
+        sample_rate_hertz=16000,
+        language_code='en-US')
+    # [END migration_audio_config_file]
+
+    # [START migration_sync_response]
+    response = client.recognize(config, audio)
+    # [END migration_sync_request]
+    alternatives = response.results[0].alternatives
+
     for alternative in alternatives:
         print('Transcript: {}'.format(alternative.transcript))
+    # [END migration_sync_response]


 def transcribe_gcs(gcs_uri):
     """Transcribes the audio file specified by the gcs_uri."""
     from google.cloud import speech
-    speech_client = speech.Client()
+    from google.cloud.speech import enums
+    from google.cloud.speech import types
+    client = speech.SpeechClient()

+    # [START migration_audio_config_gcs]
+    audio = types.RecognitionAudio(uri=gcs_uri)
+    config = types.RecognitionConfig(
+        encoding=enums.RecognitionConfig.AudioEncoding.FLAC,
+        sample_rate_hertz=16000,
+        language_code='en-US')
+    # [END migration_audio_config_gcs]
+
-    audio_sample = speech_client.sample(
-        content=None,
-        source_uri=gcs_uri,
-        encoding='FLAC',
-        sample_rate_hertz=16000)
+    response = client.recognize(config, audio)
+    alternatives = response.results[0].alternatives

-    alternatives = audio_sample.recognize('en-US')
     for alternative in alternatives:
         print('Transcript: {}'.format(alternative.transcript))
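
Both functions read `response.results[0].alternatives`, which raises `IndexError` if `results` is empty and ignores any results after the first. A minimal sketch of a more defensive traversal follows, assuming only the `results` and `alternatives` attributes used in this PR. `iter_alternatives` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for a real response so the example runs without the API.

```python
from types import SimpleNamespace


def iter_alternatives(response):
    """Yield every alternative from every result in a recognize response."""
    for result in response.results:
        for alternative in result.alternatives:
            yield alternative


# Stand-in objects shaped like the response used in the samples; a real
# response would come from client.recognize(config, audio).
empty = SimpleNamespace(results=[])
two_results = SimpleNamespace(results=[
    SimpleNamespace(alternatives=[SimpleNamespace(transcript='hello')]),
    SimpleNamespace(alternatives=[SimpleNamespace(transcript='world')]),
])

print([a.transcript for a in iter_alternatives(empty)])        # []
print([a.transcript for a in iter_alternatives(two_results)])  # ['hello', 'world']
```

Whether this matters for a short sample is a judgment call; it seems most relevant when the service returns no results or more than one.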

54 changes: 30 additions & 24 deletions speech/cloud-client/transcribe_async.py
@@ -30,63 +30,69 @@
 def transcribe_file(speech_file):
     """Transcribe the given audio file asynchronously."""
     from google.cloud import speech
-    speech_client = speech.Client()
+    from google.cloud.speech import enums
+    from google.cloud.speech import types
+    client = speech.SpeechClient()

+    # [START migration_async_request]
     with io.open(speech_file, 'rb') as audio_file:
         content = audio_file.read()
-        audio_sample = speech_client.sample(
-            content,
-            source_uri=None,
-            encoding='LINEAR16',
-            sample_rate_hertz=16000)

-    operation = audio_sample.long_running_recognize('en-US')
+    audio = types.RecognitionAudio(content=content)
+    config = types.RecognitionConfig(
+        encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
+        sample_rate_hertz=16000,
+        language_code='en-US')
+
+    # [START migration_async_response]
+    operation = client.long_running_recognize(config, audio)
+    # [END migration_async_request]

+    # Sleep and poll operation.done()
     retry_count = 100
-    while retry_count > 0 and not operation.complete:
+    while retry_count > 0 and not operation.done():
         retry_count -= 1
         time.sleep(2)
-        operation.poll()

-    if not operation.complete:
+    if not operation.done():
         print('Operation not complete and retry limit reached.')
         return

-    alternatives = operation.results
+    alternatives = operation.result().results[0].alternatives
     for alternative in alternatives:
         print('Transcript: {}'.format(alternative.transcript))
         print('Confidence: {}'.format(alternative.confidence))
-    # [END send_request]
+    # [END migration_async_response]


 def transcribe_gcs(gcs_uri):
     """Asynchronously transcribes the audio file specified by the gcs_uri."""
     from google.cloud import speech
-    speech_client = speech.Client()
+    from google.cloud.speech import enums
+    from google.cloud.speech import types
+    client = speech.SpeechClient()

-    audio_sample = speech_client.sample(
-        content=None,
-        source_uri=gcs_uri,
-        encoding='FLAC',
-        sample_rate_hertz=16000)
+    audio = types.RecognitionAudio(uri=gcs_uri)
+    config = types.RecognitionConfig(
+        encoding=enums.RecognitionConfig.AudioEncoding.FLAC,
+        sample_rate_hertz=16000,
+        language_code='en-US')

-    operation = audio_sample.long_running_recognize('en-US')
+    operation = client.long_running_recognize(config, audio)

     retry_count = 100
-    while retry_count > 0 and not operation.complete:
+    while retry_count > 0 and not operation.done():
         retry_count -= 1
         time.sleep(2)
-        operation.poll()

-    if not operation.complete:
+    if not operation.done():
         print('Operation not complete and retry limit reached.')
         return

-    alternatives = operation.result().results[0].alternatives
     for alternative in alternatives:
         print('Transcript: {}'.format(alternative.transcript))
         print('Confidence: {}'.format(alternative.confidence))
-    # [END send_request_gcs]


 if __name__ == '__main__':
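
The polling loop is duplicated in `transcribe_file` and `transcribe_gcs`: check `operation.done()` up to 100 times with a 2 second pause, then give up. One way to factor it out is sketched below. `wait_until_done`, its injectable `sleep` parameter, and `FakeOperation` are hypothetical names introduced here so the loop can be exercised without calling the API; only the `done()` method is taken from the PR.

```python
import time


def wait_until_done(operation, retries=100, delay_seconds=2.0,
                    sleep=time.sleep):
    """Poll operation.done() up to `retries` times.

    Returns True if the operation finished, False if the retry limit was
    reached first. `sleep` is injectable so tests need not actually wait.
    """
    while retries > 0 and not operation.done():
        retries -= 1
        sleep(delay_seconds)
    return operation.done()


class FakeOperation(object):
    """Hypothetical stand-in that reports done after a number of checks."""

    def __init__(self, checks_until_done):
        self._remaining = checks_until_done

    def done(self):
        if self._remaining > 0:
            self._remaining -= 1
            return False
        return True


def no_sleep(seconds):
    """Replacement for time.sleep that returns immediately."""


print(wait_until_done(FakeOperation(3), retries=10, sleep=no_sleep))  # True
print(wait_until_done(FakeOperation(5), retries=2, sleep=no_sleep))   # False
```

With a real operation, the caller would print the "retry limit reached" message when the helper returns False and otherwise read `operation.result()` as the samples do.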
43 changes: 31 additions & 12 deletions speech/cloud-client/transcribe_streaming.py
@@ -29,20 +29,39 @@
 def transcribe_streaming(stream_file):
     """Streams transcription of the given audio file."""
     from google.cloud import speech
-    speech_client = speech.Client()
+    from google.cloud.speech import enums
+    from google.cloud.speech import types
+    client = speech.SpeechClient()

+    # [START migration_streaming_request]
     with io.open(stream_file, 'rb') as audio_file:
-        audio_sample = speech_client.sample(
-            stream=audio_file,
-            encoding=speech.encoding.Encoding.LINEAR16,
-            sample_rate_hertz=16000)
-        alternatives = audio_sample.streaming_recognize('en-US')
-
-        for alternative in alternatives:
-            print('Finished: {}'.format(alternative.is_final))
-            print('Stability: {}'.format(alternative.stability))
-            print('Confidence: {}'.format(alternative.confidence))
-            print('Transcript: {}'.format(alternative.transcript))
+        content = audio_file.read()
+
+    # In practice, stream should be a generator yielding chunks of audio data.
+    stream = [content]
+    requests = (types.StreamingRecognizeRequest(audio_content=chunk)
+                for chunk in stream)
+
+    config = types.RecognitionConfig(
+        encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
+        sample_rate_hertz=16000,
+        language_code='en-US')
+    streaming_config = types.StreamingRecognitionConfig(config=config)
+
+    # streaming_recognize returns a generator.
+    # [START migration_streaming_response]
+    responses = client.streaming_recognize(streaming_config, requests)
+    # [END migration_streaming_request]
+
+    for response in responses:
+        for result in response.results:
+            print('Finished: {}'.format(result.is_final))
+            print('Stability: {}'.format(result.stability))
+            alternatives = result.alternatives
+            for alternative in alternatives:
+                print('Confidence: {}'.format(alternative.confidence))
+                print('Transcript: {}'.format(alternative.transcript))
+    # [END migration_streaming_response]


 if __name__ == '__main__':
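
The sample wraps the whole file in a one-element list and notes that, in practice, `stream` should be a generator yielding chunks of audio data. A minimal sketch of such a generator over a binary file object is shown below. `read_chunks` and the 32 KiB default chunk size are assumptions made for illustration; neither comes from the PR or from the client library.

```python
import io


def read_chunks(audio_file, chunk_size=32 * 1024):
    """Yield successive chunks of bytes from a binary file object."""
    while True:
        chunk = audio_file.read(chunk_size)
        if not chunk:
            # An empty read means end of file.
            return
        yield chunk


# In-memory stand-in for an opened audio file.
audio = io.BytesIO(b'\x00' * 70000)
print([len(chunk) for chunk in read_chunks(audio)])  # [32768, 32768, 4464]
```

In the sample, `stream = [content]` could then become `stream = read_chunks(audio_file)`. Because the generator reads lazily, the responses would have to be consumed while the file is still open, that is, inside the `with` block.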