[Android] Speech recognition without a pop-up dialog, using SpeechRecognizer

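In most cases, speech recognition brings up a "listening" dialog. In a recent project, however, the screen had to keep showing real-time messages, so the requirement was that no system UI may cover the app's own screen while speech recognition is running.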

To meet this requirement, we use SpeechRecognizer to start listening ourselves and receive the recognition results directly.

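1. First, declare the record-audio and internet permissions in AndroidManifest.xml, since we need the microphone and need to let Google do the recognition.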

<uses-permission android:name="android.permission.INTERNET"/>
<uses-permission android:name="android.permission.RECORD_AUDIO"/>

Then, for Android 11 and above, add a queries tag outside the application tag (as a direct child of manifest).

...
...

<queries>
  <intent>
    <action android:name="android.speech.RecognitionService" />
  </intent>
</queries>

<application
  android:allowBackup="true"
  android:icon="@mipmap/ic_launcher"
...
...

2. Back in the main program, first check for and obtain the permission:

Request the permission:

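if (ContextCompat.checkSelfPermission(this, Manifest.permission.RECORD_AUDIO)
    != PackageManager.PERMISSION_GRANTED) { // only ask when not granted yet
  ActivityCompat.requestPermissions(Test12Activity.this,
    new String[]{Manifest.permission.RECORD_AUDIO},
    REQUEST_RECORD_PERMISSION);
}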

Check the result of the request:

private static final int REQUEST_RECORD_PERMISSION = 100;

@Override
public void onRequestPermissionsResult(int requestCode, @NonNull String[] permissions, @NonNull int[] grantResults) {
  super.onRequestPermissionsResult(requestCode, permissions, grantResults);
  switch (requestCode) {
    case REQUEST_RECORD_PERMISSION:
      if (grantResults.length > 0 && grantResults[0] == PackageManager.PERMISSION_GRANTED) {
        // Permission granted
      } else {
        Toast.makeText(this, "Permission Denied!", Toast.LENGTH_SHORT).show();
      }
  }
}
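The grant check above can also be pulled out into a small helper, which makes it easier to reuse and to test off-device. This is only a sketch: PermissionResults and allGranted are hypothetical names, and the constant simply mirrors PackageManager.PERMISSION_GRANTED (0) so the class compiles without the Android SDK.

```java
// Hypothetical helper, not part of the Android SDK.
final class PermissionResults {
    // Mirrors PackageManager.PERMISSION_GRANTED; use the real constant inside an app.
    static final int PERMISSION_GRANTED = 0;

    private PermissionResults() {}

    /** Returns true only if the array is non-empty and every entry is granted. */
    static boolean allGranted(int[] grantResults) {
        // An empty array means the request was interrupted and should count as not granted.
        if (grantResults == null || grantResults.length == 0) {
            return false;
        }
        for (int result : grantResults) {
            if (result != PERMISSION_GRANTED) {
                return false;
            }
        }
        return true;
    }
}
```

Inside onRequestPermissionsResult, the condition could then read PermissionResults.allGranted(grantResults).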

3. Create the SpeechRecognizer object:

First, have the class implement RecognitionListener:

public class Test12Activity extends AppCompatActivity implements RecognitionListener

After adding implements, the IDE will complain until we fill in the required methods:

@Override
public void onReadyForSpeech(Bundle bundle) {
  Log.i(LOG_TAG, "onReadyForSpeech");
}

@Override
public void onBeginningOfSpeech() {
  Log.i(LOG_TAG, "onBeginningOfSpeech");
}

@Override
public void onRmsChanged(float rmsdB) {
  Log.i(LOG_TAG, "onRmsChanged: " + rmsdB);
}

@Override
public void onBufferReceived(byte[] bytes) {
  Log.i(LOG_TAG, "onBufferReceived: " + bytes.length + " bytes");
}

@Override
public void onEndOfSpeech() {
  Log.i(LOG_TAG, "onEndOfSpeech");
}

@Override
public void onError(int i) {
  String errorMessage = getErrorText(i);
  Log.d(LOG_TAG, "FAILED " + errorMessage);
  tvResult.setText(errorMessage);
}

@Override
public void onResults(Bundle results) {
  Log.i(LOG_TAG, "onResults");
  ArrayList<String> matches = results
      .getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
  // matches can be null when the bundle carries no results
  String text = (matches == null) ? "" : TextUtils.join("\n", matches);
  Log.i(LOG_TAG, "Recognized text: " + text);
  tvResult.setText(text);
}

@Override
public void onPartialResults(Bundle bundle) {
  Log.i(LOG_TAG, "onPartialResults");
}

@Override
public void onEvent(int i, Bundle bundle) {
  Log.i(LOG_TAG, "onEvent");
}

public static String getErrorText(int errorCode) {
  String message;
  switch (errorCode) {
    case SpeechRecognizer.ERROR_AUDIO:
      message = "Audio recording error";
      break;
    case SpeechRecognizer.ERROR_CLIENT:
      message = "Client side error";
      break;
    case SpeechRecognizer.ERROR_INSUFFICIENT_PERMISSIONS:
      message = "Insufficient permissions";
      break;
    case SpeechRecognizer.ERROR_NETWORK:
      message = "Network error";
      break;
    case SpeechRecognizer.ERROR_NETWORK_TIMEOUT:
      message = "Network timeout";
      break;
    case SpeechRecognizer.ERROR_NO_MATCH:
      message = "No match";
      break;
    case SpeechRecognizer.ERROR_RECOGNIZER_BUSY:
      message = "RecognitionService busy";
      break;
    case SpeechRecognizer.ERROR_SERVER:
      message = "error from server";
      break;
    case SpeechRecognizer.ERROR_SPEECH_TIMEOUT:
      message = "No speech input";
      break;
    default:
      message = "Didn't understand, please try again.";
      break;
  }
  return message;
}

It looks like a lot of lines, but these are just the status callbacks fired during recognition, so that we can handle each situation accordingly.
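As one example of handling a status accordingly, an app that wants to keep listening might restart recognition after "soft" errors such as silence or no match. The sketch below is an assumption about one reasonable policy, not something the original flow requires. RecognitionErrors and shouldRestartListening are hypothetical names, and the constants mirror the values of the corresponding SpeechRecognizer.ERROR_* constants so the class compiles without the Android SDK.

```java
// Hypothetical helper: decides which errors are worth an automatic restart.
final class RecognitionErrors {
    // Mirror SpeechRecognizer.ERROR_* values; use the real constants inside an app.
    static final int ERROR_NETWORK = 2;
    static final int ERROR_SPEECH_TIMEOUT = 6;
    static final int ERROR_NO_MATCH = 7;
    static final int ERROR_INSUFFICIENT_PERMISSIONS = 9;

    private RecognitionErrors() {}

    /** Assumed policy: restart only when the user was silent or nothing matched. */
    static boolean shouldRestartListening(int errorCode) {
        switch (errorCode) {
            case ERROR_SPEECH_TIMEOUT:
            case ERROR_NO_MATCH:
                return true;
            default:
                // Network, permission and other errors need user action, so do not loop.
                return false;
        }
    }
}
```

In onError, the app could call speech.startListening(recognizerIntent) again when this returns true, and show the error message otherwise.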

Here tvResult is a TextView on the screen that displays the message.

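onRmsChanged reports the current sound level (an RMS value in dB), which can drive an animation (for example, feeding a ProgressBar to make a level indicator).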
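To feed a ProgressBar, the dB value first has to be mapped to a 0 to 100 range. A minimal sketch follows, assuming the values fall roughly between -2 and 10 dB. That range is an assumption, not something the API guarantees, so check the values logged on your own device. RmsLevel and toProgress are hypothetical names.

```java
// Hypothetical helper: maps an RMS dB reading to a 0..100 progress value.
final class RmsLevel {
    // Assumed range of rmsdB; tune these after looking at real log output.
    static final float ASSUMED_MIN_DB = -2f;
    static final float ASSUMED_MAX_DB = 10f;

    private RmsLevel() {}

    static int toProgress(float rmsdB) {
        // Clamp first so out-of-range readings cannot overshoot the bar.
        float clamped = Math.max(ASSUMED_MIN_DB, Math.min(ASSUMED_MAX_DB, rmsdB));
        float fraction = (clamped - ASSUMED_MIN_DB) / (ASSUMED_MAX_DB - ASSUMED_MIN_DB);
        return Math.round(fraction * 100f);
    }
}
```

Inside onRmsChanged, the result could be passed to setProgress() on a ProgressBar whose max is 100.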

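The most important one, onResults, delivers the recognition results. In this example we get up to 3 candidates (set by EXTRA_MAX_RESULTS), ordered by the recognizer's confidence, highest first.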
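If the app only needs one answer, it can take the first candidate. The results bundle may also carry an optional float array under SpeechRecognizer.CONFIDENCE_SCORES, where -1 means the score is unavailable. The sketch below shows one way to combine the two. RecognitionResults, topCandidate and the minConfidence threshold are hypothetical, and the thresholding policy is an assumption.

```java
import java.util.List;

// Hypothetical helper: picks the best candidate from an onResults() bundle.
final class RecognitionResults {
    private RecognitionResults() {}

    /**
     * Returns the first (highest-ranked) candidate, or null when there is none
     * or when its confidence score is known and below minConfidence.
     */
    static String topCandidate(List<String> matches, float[] scores, float minConfidence) {
        if (matches == null || matches.isEmpty()) {
            return null;
        }
        // Scores are optional; a negative score means "unavailable", so accept the result.
        boolean scoreKnown = scores != null && scores.length > 0 && scores[0] >= 0f;
        if (scoreKnown && scores[0] < minConfidence) {
            return null;
        }
        return matches.get(0);
    }
}
```

In onResults, matches would come from getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION) and scores from getFloatArray(SpeechRecognizer.CONFIDENCE_SCORES).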

Next, finish the SpeechRecognizer initialization:

private SpeechRecognizer speech = null;
private Intent recognizerIntent;
private static final String LOG_TAG = "Test12Activity";

// Initialization, e.g. in onCreate()
speech = SpeechRecognizer.createSpeechRecognizer(this);
Log.i(LOG_TAG, "isRecognitionAvailable: " + SpeechRecognizer.isRecognitionAvailable(this)); // whether speech recognition is available
speech.setRecognitionListener(this);
recognizerIntent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
recognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, "zh-TW"); // language to listen for (BCP 47 tag)
recognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
recognizerIntent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 3); // maximum number of results
recognizerIntent.putExtra(RecognizerIntent.EXTRA_SPEECH_INPUT_MINIMUM_LENGTH_MILLIS, 5000); // minimum listening time in ms

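Note: During testing, SpeechRecognizer.isRecognitionAvailable(this) sometimes returned false. After investigating, the cause turned out to be Google's speech service: my test device runs a custom ROM without Google Assistant installed. After installing Google Assistant (which also installs the Google app and Voice Search), recognition was reported as available. So on unusual devices (for example, those sold in mainland China) speech recognition may be unavailable, and it is best to add this check and notify the user.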

4. Start speech recognition:

Start recognition wherever you need it (here, when a button is pressed):

btnStart.setOnClickListener(v -> {
  speech.startListening(recognizerIntent);
});

Just call speech.startListening(), passing in the Intent configured above.

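To stop listening, call speech.stopListening(); the audio captured so far is still recognized. To abandon the session entirely, call speech.cancel().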
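Pressing the button again while a session is still active can lead to an ERROR_RECOGNIZER_BUSY callback. One way to avoid that is to track the session state yourself. The sketch below is a plain-Java guard under that assumption. ListeningGuard and its methods are hypothetical names, and it relies on SpeechRecognizer being used from the main thread only, so it does no locking.

```java
// Hypothetical helper: remembers whether a recognition session is in progress.
final class ListeningGuard {
    private boolean listening = false;

    /** Returns true if the caller should call startListening(); false if a session is active. */
    boolean tryStart() {
        if (listening) {
            return false;
        }
        listening = true;
        return true;
    }

    /** Call from onResults() and onError(), which both end a session. */
    void markFinished() {
        listening = false;
    }

    boolean isListening() {
        return listening;
    }
}
```

In the click listener, speech.startListening(recognizerIntent) would then run only when tryStart() returns true.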

5. Release resources:

Once the whole flow is finished, don't forget to release the resources:

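@Override
protected void onDestroy() {
  // Release the recognizer when the Activity itself goes away, so it stays
  // usable if the Activity is only stopped and later restarted.
  if (speech != null) {
    speech.destroy();
    Log.i(LOG_TAG, "destroy");
  }
  super.onDestroy();
}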

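That completes a speech recognition flow with no system UI. This approach makes the whole recognition process easier to control and fits in better with the app.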

-END-
