Speech2Text-NLP-Text2Speech
Speech-to-Text, Natural Language Understanding and Text-to-Speech on IBM Cloud
To convert an audio file to text, we use the IBM Watson Speech-to-Text service, which returns a confidence level and the start and end times for each recognized word. We use the official Watson Python SDK to interact with the APIs.
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.dates as md
import time
import json
%matplotlib notebook
How to get your own API key for IBM Watson AI/ML Services:
Create a free account at IBM Cloud, either through the IBM Academic Initiative or at https://cloud.ibm.com/registration/
Log in to IBM Cloud and navigate to the Watson AI/ML services page in the Catalog: https://cloud.ibm.com/catalog?category=ai#services
In the IBM Cloud catalog, find the Speech to Text service under the “AI / Machine Learning” category, and then, as in the screenshot below, click “Create”.
Once you have created the service, you can now access it via your dashboard: https://cloud.ibm.com/dashboard/
To get your API Key, follow the instructions in the screenshot below. Go to “Service credentials”, click to expand “Auto-generated service credentials” and copy apikey to this Python notebook.
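Pasting the apikey directly into a notebook makes it easy to leak when the notebook is shared. A minimal sketch of one alternative is to read the key from an environment variable. The variable name WATSON_STT_APIKEY and the helper load_api_key are hypothetical, chosen here for illustration, and the placeholder value is set only so that the sketch runs on its own.

```python
import os


def load_api_key(env_var: str) -> str:
    """Return the API key stored in the given environment variable."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your IBM Cloud apikey"
        )
    return key


# Hypothetical variable name. The placeholder is set only so this sketch runs;
# in practice the variable would be exported before starting Jupyter.
os.environ.setdefault("WATSON_STT_APIKEY", "paste-your-apikey-here")
stt_apikey = load_api_key("WATSON_STT_APIKEY")
print("API key loaded:", bool(stt_apikey))
```

The returned string can then be passed to IAMAuthenticator in place of a literal key.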
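Play input audio file
The download below uses urllib.request, which is part of the Python standard library and needs no installation. The urllib3 package installed here is a separate library that the Watson SDK depends on.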
!pip install urllib3
Requirement already satisfied: urllib3 in c:\users\roman\anaconda3\lib\site-packages (1.26.4)
Download the audio file to the local directory.
import urllib.request
file_url = "http://analytics.romanko.ca/data/"
filename = "sample.wav"
# Send a browser-like User-agent header with every request made through urllib
opener = urllib.request.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:66.0) Gecko/20100101 Firefox/66.0')]
urllib.request.install_opener(opener)
urllib.request.urlretrieve(file_url + filename, filename)
(‘sample.wav’,
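Installing an opener changes the headers for every later urllib call in the session. As a sketch of an alternative, the header can be attached to a single request instead. The helper names build_request and download are hypothetical; the URL and User-agent string are the ones used above. Nothing is downloaded until download is called.

```python
import urllib.request

USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:66.0) "
    "Gecko/20100101 Firefox/66.0"
)


def build_request(url: str, user_agent: str = USER_AGENT) -> urllib.request.Request:
    """Create a request that carries its own User-agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def download(url: str, path: str, timeout: float = 30.0) -> str:
    """Fetch url and write the response body to path."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        data = resp.read()
    with open(path, "wb") as out:
        out.write(data)
    return path


# Building the request does not touch the network.
request = build_request("http://analytics.romanko.ca/data/sample.wav")
print(request.full_url)
print(request.get_header("User-agent"))
```

Calling download(file_url + filename, filename) would then replace the urlretrieve call, assuming the server is reachable.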
Play the audio file.
import IPython.display
IPython.display.Audio(filename)
Speech-to-Text
!pip install ibm-watson --upgrade
Requirement already satisfied: ibm-watson in c:\users\roman\anaconda3\lib\site-packages (5.3.0)
Requirement already satisfied: requests<3.0,>=2.0 in c:\users\roman\anaconda3\lib\site-packages (from ibm-watson) (2.27.1)
Requirement already satisfied: ibm-cloud-sdk-core==3.*,>=3.3.6 in c:\users\roman\anaconda3\lib\site-packages (from ibm-watson) (3.13.2)
Requirement already satisfied: python-dateutil>=2.5.3 in c:\users\roman\anaconda3\lib\site-packages (from ibm-watson) (2.8.1)
Requirement already satisfied: websocket-client==1.1.0 in c:\users\roman\anaconda3\lib\site-packages (from ibm-watson) (1.1.0)
Requirement already satisfied: PyJWT<3.0.0,>=2.0.1 in c:\users\roman\anaconda3\lib\site-packages (from ibm-cloud-sdk-core==3.*,>=3.3.6->ibm-watson) (2.1.0)
Requirement already satisfied: urllib3<2.0.0,>=1.26.0 in c:\users\roman\anaconda3\lib\site-packages (from ibm-cloud-sdk-core==3.*,>=3.3.6->ibm-watson) (1.26.4)
Requirement already satisfied: six>=1.5 in c:\users\roman\anaconda3\lib\site-packages (from python-dateutil>=2.5.3->ibm-watson) (1.15.0)
Requirement already satisfied: charset-normalizer~=2.0.0 in c:\users\roman\anaconda3\lib\site-packages (from requests<3.0,>=2.0->ibm-watson) (2.0.10)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\roman\anaconda3\lib\site-packages (from requests<3.0,>=2.0->ibm-watson) (2020.12.5)
Requirement already satisfied: idna<4,>=2.5 in c:\users\roman\anaconda3\lib\site-packages (from requests<3.0,>=2.0->ibm-watson) (2.10)
from ibm_watson import SpeechToTextV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
# Instantiate the service using your credentials
speech_to_text = SpeechToTextV1(
    authenticator=IAMAuthenticator('RsmNELiZjbNSZ6SPggYHbvcum-7XibSj3qS3mH3KSP27')
)
# Request per-word timestamps and confidence along with the transcript
with open(filename, "rb") as audio_file:
    result = speech_to_text.recognize(audio=audio_file, content_type='audio/wav', timestamps=True, word_confidence=True).get_result()
{'result_index': 0,
 'results': [{'final': True,
   'alternatives': [{'transcript': 'thunderstorms could produce large hail isolated tornadoes and heavy rain ',
     'confidence': 1.0,
     'timestamps': [['thunderstorms', 1.45, 2.32],
      ['could', 2.32, 2.55],
      ['produce', 2.55, 3.01],
      ['large', 3.01, 3.35],
      ['hail', 3.35, 3.79],
      ['isolated', 3.79, 4.42],
      ['tornadoes', 4.42, 5.03],
      ['and', 5.03, 5.27],
      ['heavy', 5.27, 5.61],
      ['rain', 5.61, 6.15]],
     'word_confidence': [['thunderstorms', 1.0],
      ['could', 1.0],
      ['produce', 1.0],
      ['large', 1.0],
      ['hail', 1.0],
      ['isolated', 1.0],
      ['tornadoes', 1.0],
      ['and', 1.0],
      ['heavy', 1.0],
      ['rain', 0.99]]}]}]}
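The timestamps and word_confidence lists are parallel, so they can be zipped into one row per word. The sketch below copies the values from the output above into a literal so that it runs without calling the service; word_table is a hypothetical helper name.

```python
# Values copied from the first alternative of the recognition result above.
alternative = {
    "timestamps": [
        ["thunderstorms", 1.45, 2.32], ["could", 2.32, 2.55],
        ["produce", 2.55, 3.01], ["large", 3.01, 3.35],
        ["hail", 3.35, 3.79], ["isolated", 3.79, 4.42],
        ["tornadoes", 4.42, 5.03], ["and", 5.03, 5.27],
        ["heavy", 5.27, 5.61], ["rain", 5.61, 6.15],
    ],
    "word_confidence": [
        ["thunderstorms", 1.0], ["could", 1.0], ["produce", 1.0],
        ["large", 1.0], ["hail", 1.0], ["isolated", 1.0],
        ["tornadoes", 1.0], ["and", 1.0], ["heavy", 1.0], ["rain", 0.99],
    ],
}


def word_table(alt):
    """Merge per-word timestamps and confidences into a list of row dicts."""
    rows = []
    for (word, start, end), (_, conf) in zip(alt["timestamps"], alt["word_confidence"]):
        rows.append({
            "word": word,
            "start": start,
            "end": end,
            "duration": round(end - start, 2),  # seconds spent on this word
            "confidence": conf,
        })
    return rows


words = word_table(alternative)
for w in words:
    print(f"{w['word']:<14}{w['start']:>6.2f}{w['end']:>6.2f}{w['confidence']:>6.2f}")

# Words the service was less than fully confident about
uncertain = [w["word"] for w in words if w["confidence"] < 1.0]
print("Below full confidence:", uncertain)
```

Since pandas is already imported in this notebook, pd.DataFrame(words) would turn the same rows into a table for plotting or filtering.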
transcript = result["results"][0]["alternatives"][0]["transcript"]
print(transcript)
thunderstorms could produce large hail isolated tornadoes and heavy rain
# Locally save the results for later use
with open("speech_to_text_res.txt", 'w+') as f:
    f.write(json.dumps(result))
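Saving the response as JSON means it can be reloaded later without spending another API call. The following sketch shows the round trip on a trimmed copy of the result above; it writes to the system temporary directory, which is an assumption made here so the example does not overwrite the notebook's own file.

```python
import json
import os
import tempfile

# Trimmed copy of the recognition result shown above.
result = {
    "result_index": 0,
    "results": [{
        "final": True,
        "alternatives": [{
            "transcript": "thunderstorms could produce large hail isolated tornadoes and heavy rain ",
            "confidence": 1.0,
        }],
    }],
}

path = os.path.join(tempfile.gettempdir(), "speech_to_text_res.txt")

# Write the result as JSON text
with open(path, "w", encoding="utf-8") as f:
    json.dump(result, f)

# Read it back into a Python dict
with open(path, encoding="utf-8") as f:
    restored = json.load(f)

print(restored == result)
print(restored["results"][0]["alternatives"][0]["transcript"])
```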
Natural Language Understanding of Text
For natural language processing of the text, we use the IBM Watson Natural Language Understanding service.
from ibm_watson import NaturalLanguageUnderstandingV1 as NLU
from ibm_watson.natural_language_understanding_v1 import Features, EntitiesOptions, KeywordsOptions
nlu = NLU(
    authenticator=IAMAuthenticator('8hm4tu1zVZ4MNWhdEG-LXNiUfTN0DUAqoaqtHUN5mwr3'),
    version='2021-08-01'
)
# Request up to two entities and two keywords, each with emotion and sentiment scores
response = nlu.analyze(text=transcript, features=Features(entities=EntitiesOptions(emotion=True, sentiment=True, limit=2), keywords=KeywordsOptions(emotion=True, sentiment=True, limit=2))).get_result()
print(json.dumps(response, indent=2))
{
  "usage": {
    "text_units": 1,
    "text_characters": 73,
    "features": 2
  },
  "language": "en",
  "keywords": [
    {
      "text": "large hail",
      "sentiment": {
        "score": -0.779156,
        "label": "negative"
      },
      "relevance": 0.994203,
      "emotion": {
        "sadness": 0.31748,
        "joy": 0.242799,
        "fear": 0.21567,
        "disgust": 0.020892,
        "anger": 0.063758
      },
      "count": 1
    },
    {
      "text": "heavy rain",
      "sentiment": {
        "score": -0.779156,
        "label": "negative"
      },
      "relevance": 0.917787,
      "emotion": {
        "sadness": 0.31748,
        "joy": 0.242799,
        "fear": 0.21567,
        "disgust": 0.020892,
        "anger": 0.063758
      },
      "count": 1
    }
  ],
  "entities": []
}
for keyword in response['keywords']:
    print(keyword['text'], "-", keyword['sentiment']['label'], "sentiment")
large hail - negative sentiment
heavy rain - negative sentiment
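Each keyword in the response also carries a relevance score and per-emotion scores. As a sketch, the keywords can be ranked by relevance and reduced to their strongest emotion. The literal below copies the keywords from the response shown above so the example runs without the service; summarize is a hypothetical helper name.

```python
# Keywords copied from the NLU response shown above.
keywords = [
    {
        "text": "large hail",
        "sentiment": {"score": -0.779156, "label": "negative"},
        "relevance": 0.994203,
        "emotion": {"sadness": 0.31748, "joy": 0.242799, "fear": 0.21567,
                    "disgust": 0.020892, "anger": 0.063758},
        "count": 1,
    },
    {
        "text": "heavy rain",
        "sentiment": {"score": -0.779156, "label": "negative"},
        "relevance": 0.917787,
        "emotion": {"sadness": 0.31748, "joy": 0.242799, "fear": 0.21567,
                    "disgust": 0.020892, "anger": 0.063758},
        "count": 1,
    },
]


def summarize(keyword):
    """Reduce one keyword entry to its sentiment, relevance and strongest emotion."""
    emotion, strength = max(keyword["emotion"].items(), key=lambda item: item[1])
    return {
        "text": keyword["text"],
        "label": keyword["sentiment"]["label"],
        "score": keyword["sentiment"]["score"],
        "relevance": keyword["relevance"],
        "top_emotion": emotion,
        "top_emotion_score": strength,
    }


# Most relevant keyword first
summaries = sorted((summarize(k) for k in keywords),
                   key=lambda s: s["relevance"], reverse=True)
for s in summaries:
    print(f"{s['text']}: {s['label']} ({s['score']:.2f}), "
          f"relevance {s['relevance']:.2f}, strongest emotion {s['top_emotion']}")
```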
Text-to-Speech
To convert text to an audio file, we use the IBM Watson Text-to-Speech service.
from ibm_watson import TextToSpeechV1
# Text to Speech
text_to_speech = TextToSpeechV1(
    authenticator=IAMAuthenticator('-d2lUSD5et_Nn9NXi8Hd1_SF2Xvi2ZsnFGc88fpP75vq')
)
response_text = 'I detected this keyword "' + response['keywords'][1]['text'] + '" with ' + response['keywords'][1]['sentiment']['label'] + ' sentiment'
audio_data = text_to_speech.synthesize(response_text, accept="audio/wav").get_result().content
# Write inside a with-block so the file is flushed and closed before playback
with open("output.wav", "wb") as output_audio_file:
    output_audio_file.write(audio_data)
print(response_text)
I detected this keyword "heavy rain" with negative sentiment
IPython.display.Audio(“output.wav”)
More complex examples of Speech-to-Text
filename1 = '0001.wav'
urllib.request.urlretrieve(file_url + filename1, filename1)
audio_file1 = open(filename1, "rb")
result1 = speech_to_text.recognize(audio_file1, content_type="audio/wav").get_result()
print("\n")
print(result1["results"][0]["alternatives"][0]["transcript"])
several tornadoes touched down as a line of severe thunderstorms swept through Colorado on Sunday
filename2 = 'en-US_Broadband_sample1.wav'
urllib.request.urlretrieve(file_url + filename2, filename2)
audio_file2 = open(filename2, "rb")
result2 = speech_to_text.recognize(audio_file2, content_type="audio/wav").get_result()
print("\n")
print(result2["results"][0]["alternatives"][0]["transcript"])
so thank you very much for coming David it’s good to have you here good as my pleasure Michael glad to be with you how real is artificial intelligence the question of how real is artificial intelligence is a complex one on I would say %HESITATION if if we define artificial intelligence is the ability of a machine on its own to understand large volumes of data to reason that data with a purpose to to predict the future and then tell you continue to learn and get better that is happening today in certain fields how far in the continuum is IBM Watson in operability artificial intelligence yes so so first of all once once it’s actually intelligent it will no longer be artificial so we’re moving to the point that these systems increasingly understand enormous volumes of data
{'result_index': 0,
 'results': [{'final': True,
   'alternatives': [{'transcript': "so thank you very much for coming David it's good to have you here good as my pleasure Michael glad to be with you how real is artificial intelligence the question of how real is artificial intelligence is a complex one on I would say %HESITATION if if we define artificial intelligence is the ability of a machine on its own to understand large volumes of data to reason that data with a purpose to to predict the future and then tell you continue to learn and get better that is happening today in certain fields how far in the continuum is IBM Watson in operability artificial intelligence yes so so first of all once once it's actually intelligent it will no longer be artificial so we're moving to the point that these systems increasingly understand enormous volumes of data ",
     'confidence': 0.99}]}]}
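The examples above read only results[0]. Longer recordings may come back as several entries in results, and the transcript can contain %HESITATION markers like the one above. A minimal sketch that joins the first alternative of every entry and optionally drops those markers follows; full_transcript is a hypothetical helper, and the two-entry sample is made up in the same shape as the responses shown here.

```python
def full_transcript(result, drop_hesitations=True):
    """Join the top alternative of every entry in result['results']."""
    parts = []
    for chunk in result.get("results", []):
        alternatives = chunk.get("alternatives", [])
        if alternatives:
            parts.append(alternatives[0]["transcript"].strip())
    text = " ".join(parts)
    if drop_hesitations:
        # Remove the hesitation marker token wherever it appears
        text = " ".join(word for word in text.split() if word != "%HESITATION")
    return text


# Made-up response with two entries, shaped like the results above.
sample = {
    "result_index": 0,
    "results": [
        {"final": True,
         "alternatives": [{"transcript": "is a complex one on I would say %HESITATION ",
                           "confidence": 0.99}]},
        {"final": True,
         "alternatives": [{"transcript": "if we define artificial intelligence ",
                           "confidence": 0.98}]},
    ],
}

print(full_transcript(sample))
```

Applied to the notebook's own data, full_transcript(result2) would return the transcript above without the %HESITATION token.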