Speech Service: Speech-to-Text Sample (from-file code)

Picking up from yesterday, today let's look at how the sample calls the API and uses the SDK! (Open index.html and token.php.)
https://ithelp.ithome.com.tw/upload/images/20201012/20130663i6aijp6xfZ.png

We'll skip the HTML and DOM parts for today.

<!-- Speech SDK reference sdk. -->
<script src="microsoft.cognitiveservices.speech.sdk.bundle.js"></script>

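First, the SDK's JS file is included.
Next comes a function that requests token.php, which in turn calls the API to obtain a token.
This function is called at the very end of the SDK code further down.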

<!-- Speech SDK Authorization token -->
  <script>
  // Note: Replace the URL with a valid endpoint to retrieve
  //       authorization tokens for your subscription.
  var authorizationEndpoint = "token.php";

  function RequestAuthorizationToken() {
    if (authorizationEndpoint) {
      var a = new XMLHttpRequest();
      a.open("GET", authorizationEndpoint);
      a.setRequestHeader("Content-Type", "application/x-www-form-urlencoded");
      a.send("");
      a.onload = function() {
          var token = JSON.parse(atob(this.responseText.split(".")[1]));
          serviceRegion.value = token.region;
          authorizationToken = this.responseText;
          subscriptionKey.disabled = true;
          subscriptionKey.value = "using authorization token (hit F5 to refresh)";
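          // Pass the decoded payload as a separate argument; concatenating it
          // to a string would only print "[object Object]".
          console.log("Got an authorization token:", token);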
      }
    }
  }
  </script>

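The authorizationEndpoint variable holds the path to token.php
(if token.php sits in the same folder at the same level, the sample code needs no change).
The RequestAuthorizationToken function
uses XMLHttpRequest to GET the token that token.php obtained
and stores the whole response text in the authorizationToken variable.
It also reads the region from the token (by base64-decoding the token's middle dot-separated segment and parsing it as JSON) and puts it into serviceRegion.value.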
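The same request can also be sketched with fetch instead of XMLHttpRequest. This is only an illustration, not the sample's code: the helper names decodeTokenPayload and requestAuthorizationToken and the injectable fetchImpl parameter are assumptions made up here, and the base64url handling is an extra precaution the sample does not include (it calls atob() on the segment directly).

```javascript
// Decode the payload (second dot-separated segment) of the token.
// Assumption: the segment may use the base64url alphabet, so it is
// mapped to standard base64 and padded before being passed to atob().
function decodeTokenPayload(token) {
  var segment = token.split(".")[1];
  if (!segment) {
    throw new Error("token does not look like header.payload.signature");
  }
  var base64 = segment.replace(/-/g, "+").replace(/_/g, "/");
  while (base64.length % 4 !== 0) {
    base64 += "=";
  }
  return JSON.parse(atob(base64));
}

// Hypothetical helper: GET the token text from the endpoint and return it
// together with the region found in its payload.
// fetchImpl is optional and only exists so the function can be tested.
async function requestAuthorizationToken(endpoint, fetchImpl) {
  var doFetch = fetchImpl || fetch;
  var response = await doFetch(endpoint);
  if (!response.ok) {
    throw new Error("token request failed with status " + response.status);
  }
  var tokenText = (await response.text()).trim();
  var payload = decodeTokenPayload(tokenText);
  return { token: tokenText, region: payload.region };
}
```

In the page, the resolved object would be used the same way as in the sample: token goes into authorizationToken and region goes into serviceRegion.value.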

Now it's time to use the SDK.

// Speech SDK USAGE
  
    // status fields and start button in UI
    var phraseDiv;
    var startRecognizeOnceAsyncButton;

    // subscription key and region for speech services.
    var subscriptionKey, serviceRegion;
    var authorizationToken;
    var SpeechSDK;
    var recognizer;

    document.addEventListener("DOMContentLoaded", function () {
      startRecognizeOnceAsyncButton = document.getElementById("startRecognizeOnceAsyncButton");
      subscriptionKey = document.getElementById("subscriptionKey");
      serviceRegion = document.getElementById("serviceRegion");
      phraseDiv = document.getElementById("phraseDiv");

      startRecognizeOnceAsyncButton.addEventListener("click", function () {
        startRecognizeOnceAsyncButton.disabled = true;
        phraseDiv.innerHTML = "";

        // if we got an authorization token, use the token. Otherwise use the provided subscription key
        var speechConfig;
        if (authorizationToken) {
          speechConfig = SpeechSDK.SpeechConfig.fromAuthorizationToken(authorizationToken, serviceRegion.value);
        } else {
          if (subscriptionKey.value === "" || subscriptionKey.value === "subscription") {
            alert("Please enter your Microsoft Cognitive Services Speech subscription key!");
            return;
          }
          speechConfig = SpeechSDK.SpeechConfig.fromSubscription(subscriptionKey.value, serviceRegion.value);
        }

        speechConfig.speechRecognitionLanguage = "zh-TW";
        var audioConfig  = SpeechSDK.AudioConfig.fromDefaultMicrophoneInput();
        recognizer = new SpeechSDK.SpeechRecognizer(speechConfig, audioConfig);

        recognizer.recognizeOnceAsync(
          function (result) {
            startRecognizeOnceAsyncButton.disabled = false;
            phraseDiv.innerHTML += result.text;
            window.console.log(result);

            recognizer.close();
            recognizer = undefined;
          },
          function (err) {
            startRecognizeOnceAsyncButton.disabled = false;
            phraseDiv.innerHTML += err;
            window.console.log(err);

            recognizer.close();
            recognizer = undefined;
          });
      });

      if (!!window.SpeechSDK) {
        SpeechSDK = window.SpeechSDK;
        startRecognizeOnceAsyncButton.disabled = false;

        document.getElementById('content').style.display = 'block';
        document.getElementById('warning').style.display = 'none';

        // in case we have a function for getting an authorization token, call it.
        if (typeof RequestAuthorizationToken === "function") {
            RequestAuthorizationToken();
        }
      }
    });

The code starts by declaring a few variables.
They hold the UI's DOM elements, the key and region the API needs (already used above), the SDK object, and so on.

// status fields and start button in UI
    var phraseDiv;
    var startRecognizeOnceAsyncButton;

    // subscription key and region for speech services.
    var subscriptionKey, serviceRegion;
    var authorizationToken;
    var SpeechSDK;
    var recognizer;

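Next, if an authorization token was obtained successfully,
the SDK's fromAuthorizationToken() is used to build speechConfig.
Otherwise the code falls back to fromSubscription() with the subscription key typed into the page, and shows an alert if no key was entered.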

// if we got an authorization token, use the token. Otherwise use the provided subscription key
        var speechConfig;
        if (authorizationToken) {
          speechConfig = SpeechSDK.SpeechConfig.fromAuthorizationToken(authorizationToken, serviceRegion.value);
        } else {
          if (subscriptionKey.value === "" || subscriptionKey.value === "subscription") {
            alert("Please enter your Microsoft Cognitive Services Speech subscription key!");
            return;
          }
          speechConfig = SpeechSDK.SpeechConfig.fromSubscription(subscriptionKey.value, serviceRegion.value);
        }
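The branching above can be pulled out into a small function, which makes it easier to reuse and to test. This is only a sketch: createSpeechConfig is a name invented here, the SDK object is passed in as a parameter, and an Error is thrown instead of calling alert(). The two SDK calls, SpeechConfig.fromAuthorizationToken and SpeechConfig.fromSubscription, are the ones the sample already uses.

```javascript
// Hypothetical helper: build a SpeechConfig from a token if one exists,
// otherwise from the subscription key. Throws when neither is usable.
function createSpeechConfig(sdk, authorizationToken, subscriptionKey, region) {
  if (authorizationToken) {
    return sdk.SpeechConfig.fromAuthorizationToken(authorizationToken, region);
  }
  // "subscription" is the placeholder value the sample page treats as empty.
  if (!subscriptionKey || subscriptionKey === "subscription") {
    throw new Error("Please enter your Microsoft Cognitive Services Speech subscription key!");
  }
  return sdk.SpeechConfig.fromSubscription(subscriptionKey, region);
}
```

Inside the click handler this could be called as createSpeechConfig(SpeechSDK, authorizationToken, subscriptionKey.value, serviceRegion.value), with a try/catch that shows the error message to the user.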

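Then the recognition language is set. The default is US English (en-US).
To recognize Chinese, change it to "zh-TW".
For other languages, see https://docs.microsoft.com/zh-tw/azure/cognitive-services/speech-service/language-support
So far I haven't found a way to mix different languages.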

speechConfig.speechRecognitionLanguage = "zh-TW";
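If the language should come from user input (for example a dropdown) rather than being hard-coded, a small guard like the one below could keep unexpected values out. This is a sketch only: resolveRecognitionLanguage and the SUPPORTED_LANGUAGES list are invented for illustration, and the list contains just the two locales mentioned in this article.

```javascript
// Hypothetical allow-list; extend it with locales from the language-support page.
var SUPPORTED_LANGUAGES = ["en-US", "zh-TW"];

// Return the requested locale if it is on the list, otherwise fall back
// to en-US, the default mentioned above.
function resolveRecognitionLanguage(requested, supported) {
  var list = supported || SUPPORTED_LANGUAGES;
  return list.indexOf(requested) !== -1 ? requested : "en-US";
}
```

The result would then be assigned to speechConfig.speechRecognitionLanguage exactly as in the sample line above.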

The audio config then comes from the SDK's fromDefaultMicrophoneInput(),
which means the computer's default microphone is used.

var audioConfig  = SpeechSDK.AudioConfig.fromDefaultMicrophoneInput();

With both configs set up, the recognizer can be initialized.

recognizer = new SpeechSDK.SpeechRecognizer(speechConfig, audioConfig);

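The sample uses asynchronous one-shot recognition,
which stops recognizing once it detects a stretch of silence.
In this code, the recognized text is appended to the text area on the web page. Both the success and the error callback re-enable the button and close the recognizer.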

recognizer.recognizeOnceAsync(
          function (result) {
            startRecognizeOnceAsyncButton.disabled = false;
            phraseDiv.innerHTML += result.text;
            window.console.log(result);

            recognizer.close();
            recognizer = undefined;
          },
          function (err) {
            startRecognizeOnceAsyncButton.disabled = false;
            phraseDiv.innerHTML += err;
            window.console.log(err);

            recognizer.close();
            recognizer = undefined;
          });
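Since recognizeOnceAsync takes a success callback and an error callback, it can be wrapped in a Promise so the cleanup is written only once. The following is a sketch under that assumption: recognizeOnce is a name made up here, and the only recognizer methods it relies on are recognizeOnceAsync and close, both of which appear in the sample.

```javascript
// Hypothetical wrapper: run one-shot recognition and settle a Promise with
// the result. The recognizer is closed in both the success and error paths.
function recognizeOnce(recognizer) {
  return new Promise(function (resolve, reject) {
    recognizer.recognizeOnceAsync(
      function (result) {
        recognizer.close();
        resolve(result);
      },
      function (err) {
        recognizer.close();
        // Normalize to an Error so callers always get a message and a stack.
        reject(err instanceof Error ? err : new Error(String(err)));
      }
    );
  });
}
```

In the click handler, the resolved result's text property would be appended to phraseDiv and the button re-enabled, just as the two callbacks in the sample do.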
