Android Speech Recognition Confidence Levels

When the Android speech-to-text functionality translates audio waves to text, is it possible to determine the 'confidence level' of the spoken text? For example, if someone speaks too far away from the mic and the Android device picks up distorted sound, would it output the translated text along with a low confidence score to indicate it isn't sure how accurate that particular translation is?

If you are implementing RecognitionListener, examine this code clip from my onResults method:
@Override
public void onResults(Bundle results) {
    String LOG = "SpeechRecognizerActivity";
    Log.d(LOG, "onResults");
    ArrayList<String> strlist = results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
    float[] confidence = results.getFloatArray(SpeechRecognizer.CONFIDENCE_SCORES);
    for (int i = 0; i < strlist.size(); i++) {
        Log.d(LOG, "result=" + strlist.get(i));
    }
    Log.d(LOG + " result", strlist.get(0));
    if (confidence != null) {
        if (confidence.length > 0) {
            Log.d(LOG + " confidence", String.valueOf(confidence[0]));
        } else {
            Log.d(LOG + " confidence score not available", "unknown confidence");
        }
    } else {
        Log.d(LOG, "confidence not found");
    }
}
You won't see anything unless you add this to your recognizer intent:
iSpeechIntent.putExtra(RecognizerIntent.EXTRA_CONFIDENCE_SCORES, true);

Yes. In the returned Bundle, there is a float array keyed by SpeechRecognizer.CONFIDENCE_SCORES. From the docs:
Key used to retrieve a float array from the Bundle passed to the onResults(Bundle) and onPartialResults(Bundle) methods. The array should be the same size as the ArrayList provided in RESULTS_RECOGNITION, and should contain values ranging from 0.0 to 1.0, or -1 to represent an unavailable confidence score.
Confidence values close to 1.0 indicate high confidence (the speech recognizer is confident that the recognition result is correct), while values close to 0.0 indicate low confidence.
This value is optional and might not be provided.
Please note that it is not guaranteed to be there. Check for it and use it if present; if it is absent, you have no way of knowing how reliable the result is.
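For reference, a minimal sketch of kicking off recognition with confidence scores requested (assuming the RECORD_AUDIO permission is granted and that listener implements RecognitionListener with the onResults shown above):
// Build the recognition intent; EXTRA_CONFIDENCE_SCORES is the key piece here
Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
        RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
intent.putExtra(RecognizerIntent.EXTRA_CONFIDENCE_SCORES, true);

// Create the recognizer, attach the listener, and start listening
SpeechRecognizer recognizer = SpeechRecognizer.createSpeechRecognizer(this);
recognizer.setRecognitionListener(listener);
recognizer.startListening(intent);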

Unable to iterate over SegmentList while more than one match is found

I'm modifying the pocketsphinx android demo to test continuous keyword spotting based on a keyword list and relative thresholds.
When the onResult method of my implementation of edu.cmu.pocketsphinx.RecognitionListener is called, the string hypothesis.getHypstr() will contain the list of possible matches.
I read here that to get every single match and its weight it is possible to do this:
for (Segment seg : recognizer.getDecoder().seg()) {
    System.out.println(seg.getWord() + " " + seg.getProb());
}
However, when my code runs it never iterates over the segments, as if the SegmentList were empty, even though hypothesis.getHypstr() shows more than one match.
To reproduce the case I'm using this keyword list with very low thresholds, so that multiple matches are easily found:
rainbow /1e-50/
about /1e-50/
blood /1e-50/
energies /1e-50/
My onPartialResult method does nothing, while:
public void onEndOfSpeech() {
    switchSearch(KWS_SEARCH);
}

public void onResult(Hypothesis hypothesis) {
    if (hypothesis != null) {
        for (Segment seg : recognizer.getDecoder().seg()) {
            // No iteration is done here!!!
            Log.d("onResult", seg.getWord() + " " + seg.getProb());
        }
        String text = hypothesis.getHypstr();
        makeText(getApplicationContext(), text, Toast.LENGTH_SHORT).show();
    }
}
For example, if I say "energies" then hypothesis.getHypstr() is "blood about energies blood", but no iteration is done over the SegmentList; I can see this by putting a breakpoint at the beginning of the onResult method.
Any suggestions?
Thanks
There is a threading issue here. The onResult message is delivered after the recognizer has already been restarted in switchSearch, so the hypothesis is cleared and the query for results returns nothing.
You can put this code inside switchSearch, before the recognizer is restarted; then it will work fine:
private void switchSearch(String searchName) {
    boolean wasRunning = recognizer.stop();
    if (wasRunning) {
        for (Segment seg : recognizer.getDecoder().seg()) {
            Log.d("!!!! ", seg.getWord());
        }
    }
    // If we are not spotting, start listening with timeout (10000 ms or 10 seconds).
    if (searchName.equals(KWS_SEARCH))
        recognizer.startListening(searchName);
    else
        recognizer.startListening(searchName, 10000);
    String caption = getResources().getString(captions.get(searchName));
    ((TextView) findViewById(R.id.caption_text)).setText(caption);
}
If you use only keyword spotting, you can also put this code inside onPartialResult, which is invoked as soon as the keyphrase is detected rather than when silence is detected. That makes the reaction faster. You do not need onEndOfSpeech or onResult in pure keyword spotting.
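As a rough sketch of that pure keyword-spotting setup (reusing the demo's recognizer field and KWS_SEARCH constant; the restart-on-detection pattern is my reading of the advice above):
public void onPartialResult(Hypothesis hypothesis) {
    if (hypothesis == null)
        return;
    // The keyphrase was spotted; react immediately, without waiting for silence
    String text = hypothesis.getHypstr();
    makeText(getApplicationContext(), text, Toast.LENGTH_SHORT).show();
    // Restart the keyword-spotting search so detection continues
    recognizer.stop();
    recognizer.startListening(KWS_SEARCH);
}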

onLeScan callback returns oddly positive RSSI values

Using Bluetooth Low Energy (BLE) scanning on Android, I noticed that sometimes the RSSI values are incorrect.
My code simply calls the start scan function:
mBluetoothAdapter.startLeScan(mLeScanCallback);
and then I read the results in the callback and save them to a file:
private static BluetoothAdapter.LeScanCallback mLeScanCallback =
        new BluetoothAdapter.LeScanCallback() {
    @Override
    public void onLeScan(final BluetoothDevice device, final int rssi, final byte[] scanRecord) {
        String objScanRec = bytesToHex(scanRecord);
        outStr = rssi + ";" + objScanRec + ";" + device.getName() + ";" + beaconLocation + ";\n";
        try {
            Raw_log.write(outStr);
            Raw_log.flush();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
};
The problem is that I read positive RSSI values, even though the beacon is at a fixed distance.
E.g. with the beacon 30 cm from the phone (or smartwatch) I read values around -45, which are realistic, but I also read values around +80 or +100 (which are not realistic); those account for around 20% of the measurements.
Is there something I'm missing?
Thanks
Thanks for your help. I figured out it's a problem related only to the Samsung Gear Live. I came up with this solution:
if (rssi > 0) {
    rssi = rssi - 128;
}
I've tested the solution and it works fine: the corrected values are now in line with the negative readings. For example, a raw sequence of
-44 -45 -43 84 82
becomes, after correction:
-44 -45 -43 -44 -46
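Folded into the scan callback from the question, the correction might look like this sketch (the 128 offset is specific to the behaviour observed on the Samsung Gear Live above):
@Override
public void onLeScan(final BluetoothDevice device, final int rssi, final byte[] scanRecord) {
    int correctedRssi = rssi;
    // On this device, positive readings appear to be offset by 128,
    // so shift them back into the expected negative dBm range.
    if (correctedRssi > 0) {
        correctedRssi -= 128;
    }
    Log.d("BLEScan", "rssi=" + correctedRssi + " device=" + device.getAddress());
}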
This is definitely not normal. I have never seen an RSSI value in that callback be positive. Typical values are from -30 to -120.
I suspect there is something wrong with the way the data are written out to the log, or read back. What happens if you just do a regular Log.d(TAG, "rssi=" + rssi)? Do you ever see positive values? If so, can you share an excerpt, along with the model of the device doing the detecting and the device being detected?

Interpreting BluetoothGatt Value from Light Sensor

I am writing an app which receives values from the light sensor of a BLE device, and I am trying to determine exactly what it is that I am receiving. I want the lux value provided by the sensor, but I am concerned that it needs conversion, and I do not know what this sensor's unit of measure is. For example, the unit for an Android phone's light sensor is SI lux. This should be easy enough, but for this sensor the specs do not say.
Here is the code which is giving me output:
case MSG_LIGHT:
    characteristic = (BluetoothGattCharacteristic) msg.obj;
    if (characteristic.getValue() == null) {
        Log.w(TAG, "Error obtaining light value");
        return;
    }
    int formatlgt1 = BluetoothGattCharacteristic.FORMAT_SINT8;
    Log.i(LIGHT, "Light RawValue1 " + characteristic.getIntValue(formatlgt1, 0));
    Log.i(LIGHT, "Light RawValue2 " + characteristic.getIntValue(formatlgt1, 1));
    Log.w(LIGHT, "Light UUID " + characteristic.getUuid());
    Log.w(LIGHT, "Light Stored Value " + characteristic.getValue());
    Log.w(LIGHT, "Light Descriptors " + characteristic.getDescriptors());
    Log.d(LIGHT, "Light Characteristic " + characteristic);
    updateLightValues(characteristic);
    break;
Simple enough: just read the sensor and give me the various outputs from that sensor at the time of reading. Here is the output:
Light RawValue1 4
Light RawValue2 9
Light UUID 0000aa91-0000-1000-8000-00805f9b34fb
Light Stored Value [B@431d30b0
Light Descriptors [android.bluetooth.BluetoothGattDescriptor@4300e508, android.bluetooth.BluetoothGattDescriptor@4300eaf8]
Light Characteristic android.bluetooth.BluetoothGattCharacteristic@43002b10
I interpret the measurement to be RawValues 1 & 2, but I am logging everything stored in the characteristic to help. The problem is that the Stored Value is [B@431d30b0, which is beyond me. According to the description from the manufacturer, the first byte is the HILUX at address 00x03 and the second is LOLUX at address 00x04, with a default value of 00:00.
What am I looking at here, and where am I going wrong? Where I am hurting is in my understanding of what I am reading; I can't seem to find a good search context to learn about it.
Thanks
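For what it's worth, [B@431d30b0 is not sensor data at all: it is Java's default toString() for a byte[] (an array type tag plus an identity hash). To see the actual bytes, dump the array; and if the manufacturer's HILUX/LOLUX description means a 16-bit value with the high byte first, the two bytes could be combined like this (a sketch; the byte order is an assumption to check against the spec):
byte[] value = characteristic.getValue();
// Print the raw bytes instead of the array's identity hash
Log.w(LIGHT, "Light Stored Value " + java.util.Arrays.toString(value));
// Combine HILUX (first byte) and LOLUX (second byte) into one 16-bit reading;
// assumes big-endian order - swap the indices if the spec says otherwise
int lux = ((value[0] & 0xFF) << 8) | (value[1] & 0xFF);
Log.i(LIGHT, "Light combined value " + lux);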

Is my target selection AI efficient?

Quick question: I am developing a top-down 2D platformer game with lots of enemies in the map (at least a hundred spawn at the start of each level). Each enemy uses an AI that searches the map for objects with a specified tag, sorts them into a list by distance, and then reacts to the object closest to it.
My code works, but if the machine the game is running on is slow, the game lags. I want to be able to port my game to Android and iOS, which have low-end specs.
In pursuit of putting less strain on the CPU, is there a better way to write my AI?
Here is my code:
void Start () {
    FoodTargets = new List<Transform>(); // my list
    SelectedTarget = null; // the target the enemy reacts to
    myTransform = transform;
    AddAllFood ();
}

public void AddAllFood()
{
    GameObject[] Foods = GameObject.FindGameObjectsWithTag("Object");
    foreach (GameObject enemy in Foods)
        AddTarget (enemy.transform);
}

public void AddTarget(Transform enemy)
{
    // classrating is an attribute each enemy has that determines its identity
    // (e.g. whether it is a plant, a herbivore or a carnivore)
    if (enemy.GetComponent<ClassRatingScript>().classrating != 1) {
        FoodTargets.Add (enemy); // adds the object to the list
    }
}

// This is how I sort by distance. Is this the fastest and most efficient way to do it?
private void SortTargetsByDistance()
{
    FoodTargets.Sort (delegate(Transform t1, Transform t2) {
        return Vector3.Distance(t1.position, myTransform.position)
            .CompareTo(Vector3.Distance(t2.position, myTransform.position));
    });
}

private void TargetEnemy() // called every few frames (see the per-frame block below)
{
    if (SelectedTarget == null) {
        SortTargetsByDistance ();
        SelectedTarget = FoodTargets [1];
    } else {
        SortTargetsByDistance ();
        SelectedTarget = FoodTargets [1];
    }
}
And this block runs every frame:
// optimizer increments every frame and resets to 0 on the 3rd frame;
// only then is TargetEnemy called.
if (optimizer <= 2) {
    optimizer++;
} else {
    TargetEnemy ();
    // the rest are attributes the AI considers when reacting to its target
    targetmass = SelectedTarget.GetComponent<MassScript> ().mass;
    targetclass = SelectedTarget.GetComponent<ClassRatingScript> ().classrating;
    mass = this.GetComponent<MassScript> ().mass;
    classrating = this.GetComponent<ClassRatingScript> ().classrating;
    distance = Vector3.Distance (transform.position, SelectedTarget.transform.position);
    optimizer = 0;
}
Is there a more optimized way of doing this? Your help will be much appreciated. Thanks in advance!
I'm not awfully familiar with C# or Unity, but I would look very carefully at what sorting algorithm your sorting method is using. If all you want is the closest GameObject, then sorting isn't necessary.
The fastest general-purpose sorting algorithms, such as Quicksort, are O(n*log(n)). That is to say, the time it takes to sort a collection of n objects is bounded by some constant multiple of n*log(n). If you just want the k closest objects, where k << n, you can instead perform k iterations of the Bubble Sort algorithm, which has time complexity O(k*n), much better than before.
However, if you only need the single closest object, then just find it directly without sorting. In C#, using the question's own fields, that could look like:
float smallestDistance = Mathf.Infinity;
Transform closestObject = null;
foreach (Transform t in FoodTargets) {
    float d = Vector3.Distance(t.position, myTransform.position);
    if (d < smallestDistance) {
        smallestDistance = d;
        closestObject = t;
    }
}
This extremely simple algorithm has time complexity O(n).

SpannableStringBuilder limited to 9,999 characters?

My app reads large amounts of data from text file assets and displays them on-screen in a TextView. (The largest is ~450k.) I read the file in line by line into a SpannableStringBuilder (since there is some metadata I remove, such as section names). This approach has worked without complaint in the two years I've had the app on the market (over 7k active device installs), so I know the code is reasonably correct.
However, I recently got a report from a user on an LG Lucid (LGE VS840 4G, Android 2.3.6) that the text is truncated. From log entries, my app only got 9,999 characters into the buffer. Is this a known issue with SpannableStringBuilder? Are there other recommended ways to build a large Spannable buffer? Any suggested workarounds?
Other than keeping a separate expected length that I update each time I append to the SpannableStringBuilder, I don't even have a good way to detect the error, since the append interface returns the object, not an error!
My code that reads in the data is:
currentOffset = 0;
try {
    InputStream is = getAssets().open(filename);
    BufferedReader br = new BufferedReader(new InputStreamReader(is));
    ssb.clear();
    jumpOffsets.clear();
    ArrayList<String> sectionNamesList = new ArrayList<String>();
    sectionOffsets.clear();
    int offset = 0;
    while (br.ready()) {
        String s = br.readLine();
        if (s.length() == 0) {
            ssb.append("\n");
            ++offset;
        } else if (s.charAt(0) == '\013') {
            jumpOffsets.add(offset);
            String name = s.substring(1);
            if (name.length() > 0) {
                sectionNamesList.add(name);
                sectionOffsets.add(offset);
                if (showSectionNames) {
                    ssb.append(name);
                    ssb.append("\n");
                    offset += name.length() + 1;
                }
            }
        } else {
            if (!showNikud) {
                // Remove nikud based on Unicode character ranges
                // Does not replace combined characters (\ufb20-\ufb4f)
                // See
                // http://en.wikipedia.org/wiki/Unicode_and_HTML_for_the_Hebrew_alphabet
                s = s.replaceAll("[\u05b0-\u05c7]", "");
            }
            if (!showMeteg) {
                // Remove meteg based on Unicode character ranges
                // Does not replace combined characters (\ufb20-\ufb4f)
                // See
                // http://en.wikipedia.org/wiki/Unicode_and_HTML_for_the_Hebrew_alphabet
                s = s.replaceAll("\u05bd", "");
            }
            ssb.append(s);
            ssb.append("\n");
            offset += s.length() + 1;
        }
    }
    sectionNames = sectionNamesList.toArray(new String[0]);
    currentFilename = filename;
    Log.v(TAG, "ssb.length()=" + ssb.length() +
            ", daavenText.getText().length()=" +
            daavenText.getText().length() +
            ", showNikud=" + showNikud +
            ", showMeteg=" + showMeteg +
            ", showSectionNames=" + showSectionNames +
            ", currentFilename=" + currentFilename
            );
} catch (IOException e) {
    Log.e(TAG, "Error reading " + filename, e);
}
After looking over the interface, I plan to replace the showNikud and showMeteg cases with InputFilters.
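For example, a minimal sketch of the nikud case as an InputFilter (my reading of that plan; the meteg case would be analogous):
InputFilter nikudFilter = new InputFilter() {
    @Override
    public CharSequence filter(CharSequence source, int start, int end,
                               Spanned dest, int dstart, int dend) {
        StringBuilder out = new StringBuilder(end - start);
        for (int i = start; i < end; i++) {
            char c = source.charAt(i);
            if (c < '\u05b0' || c > '\u05c7') { // drop the nikud range
                out.append(c);
            }
        }
        // Returning null tells the framework to accept the input unchanged
        return (out.length() == end - start) ? null : out;
    }
};
ssb.setFilters(new InputFilter[] { nikudFilter });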
Is this a known issue with SpannableStringBuilder?
I see nothing in the source code to suggest a hard limit on the size of a SpannableStringBuilder. Given your experience, my guess is that this is a problem particular to that device, due to a stupid decision by an engineer at the device manufacturer.
Any suggested workarounds?
If you are distributing through the Google Play Store, block this device in your console.
Or, don't use one massive TextView, but instead use several smaller TextView widgets in a ListView (so they can be recycled), perhaps one per paragraph. This should have the added benefit of reducing your memory footprint.
Or, generate HTML and display the content in a WebView.
After writing (and having the user run) a test app, it appears that his device has this arbitrary limit for SpannableStringBuilder, but not for StringBuilder or StringBuffer. I tested a quick change that reads into a StringBuilder and then creates a SpannableString from the result. Unfortunately, that means I can't create the spans until the text is fully read in.
I will have to consider using multiple TextView objects in a ListView, as well as using Html.fromHtml, to see if either works better for my app's long-term plans.
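A sketch of that workaround, reduced to its essentials (exception handling and the filtering/offset bookkeeping from the original loop are omitted; daavenText is the TextView from the question):
// Read the asset into a plain StringBuilder, which the device handles correctly
StringBuilder sb = new StringBuilder();
BufferedReader br = new BufferedReader(new InputStreamReader(getAssets().open(filename)));
String line;
while ((line = br.readLine()) != null) {
    sb.append(line);
    sb.append('\n');
}
br.close();
// Only now wrap the text as a Spannable; spans can be attached from this point on
SpannableString spannable = new SpannableString(sb);
daavenText.setText(spannable);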
