Good day! For a work task, I needed to implement a noise suppressor when working with a user's recording. The problem is that although everything was implemented according to the instructions on developer.android.com (https://developer.android.com/reference/android/media/audiofx/NoiseSuppressor), it feels like NS simply does not work. What could be the problem?
When implementing NS, we accordingly added a check for the availability of NS (NoiseSuppressor.isAvailable()) and, if it is available, we bind NS to the audio session number (create(int audioSession)). In the same "if", after binding to the session number, I enable the resulting suppressor with setEnabled(true) (as it applies to AudioEffect), but when testing there is no difference between the default recording and recording with NS.
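For reference, here is a minimal sketch of the setup described above, assuming recorder is an already-configured AudioRecord (the variable names are illustrative, not from the original post):

NoiseSuppressor suppressor = null;
if (NoiseSuppressor.isAvailable()) {
    suppressor = NoiseSuppressor.create(recorder.getAudioSessionId());
    if (suppressor != null) {
        // setEnabled() returns AudioEffect.SUCCESS (0) on success; logging the
        // return value is an easy way to see whether the effect really engaged
        int status = suppressor.setEnabled(true);
        Log.i("NS", "setEnabled returned " + status);
    }
}
// ... record as usual, then free the effect when done:
if (suppressor != null) {
    suppressor.release();
}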
How can I specify a custom wakeword name (eg "stack overflow" or "party time") in the spokestack-android configuration? I'm looking for something like:
SpeechPipeline pipeline = new SpeechPipeline.Builder()
.setProperty("wakeword", "stack overflow")
//...
.build();
Update: You can train your own wakeword (without writing code, just providing audio samples) with a Maker subscription. When they're finished training, you can download and configure the custom wake word the same way you set up the default wake word.
Currently, Spokestack Android only supports wakeword detection via a binary classifier, so we only recognize "Spokestack". In theory, this could be done via Android's platform ASR, with the caveat that the user would constantly be interrupted by Google Assistant-style audible dings as the ASR request times out and gets restarted, so it'd only be useful for informal demos, not real apps.
That said, it's theoretically possible, so feel free to open an issue, and it might show up in a future version if we get enough demand for it.
I can't seem to find anything related to finding out what application got audio focus. I can correctly determine from my application what type of focus change it was, but not from any other application. Is there any way to determine what application received focus?
"What am I wanting to do?"
I have managed to record internal sound whether it be music or voice. If I am currently recording audio no matter the source, I want to determine what application took the focus over to determine what my application need's to do next.
Currently I am using the AudioManager.OnAudioFocusChangeListener for my application to stop recording internal sounds once the focus changes, but I want the application's name that gained the focus.
Short Answer: There's no good solution... and Android probably intended it this way.
Explanation:
Looking at the source code, AudioManager has no APIs (even hidden APIs) for checking who has audio focus. AudioManager wraps calls to AudioService, which holds the real audio state. The API that AudioService exposes through its Stub when AudioManager binds to it also has no method for querying the current audio focus. Thus, even through reflection or system-level permissions, you won't be able to get the information you want.
If you're curious how the focus changes are kept track of, you can look at MediaFocusControl whose instance is a member variable of AudioService here.
Untested Hacky Heuristic:
You might be able to get some useful information by looking at UsageStats timestamps. Once you have the apps that were used within, say, ~500 ms of your losing audio focus, you can cross-check them against apps holding audio permissions. You can follow this post to get the permissions of any installed app.
This is clearly a heuristic and could require some tuning. It also requires the user to grant your app access to usage stats. Mileage may vary.
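An untested sketch of that heuristic, assuming the user has granted the PACKAGE_USAGE_STATS special access (the 500 ms window is the tuning knob mentioned above):

UsageStatsManager usm =
        (UsageStatsManager) context.getSystemService(Context.USAGE_STATS_SERVICE);
long focusLostAt = System.currentTimeMillis();
UsageEvents events = usm.queryEvents(focusLostAt - 500, focusLostAt);
UsageEvents.Event event = new UsageEvents.Event();
while (events.hasNextEvent()) {
    events.getNextEvent(event);
    if (event.getEventType() == UsageEvents.Event.MOVE_TO_FOREGROUND) {
        // candidate app; cross-check it against apps holding RECORD_AUDIO
        // or other audio-related permissions before trusting it
        Log.i("FocusHeuristic", "candidate: " + event.getPackageName());
    }
}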
Look at the MediaController class (new in Lollipop, available in the compatibility library for older versions).
There are these two methods that look interesting:
https://developer.android.com/reference/android/media/session/MediaController.html#getPackageName()
https://developer.android.com/reference/android/media/session/MediaController.html#getSessionActivity()
getPackageName supposedly returns the current session's package name:
http://androidxref.com/5.1.1_r6/xref/frameworks/base/media/java/android/media/session/MediaController.java#397
getSessionActivity gives you a PendingIntent with an activity to start (if one is supplied), where you could get the package as well.
Used together with your audio listener, and a broadcast receiver for the phone state to detect whether the phone is currently ringing, you might be able to use this to get more fine-grained detection than you currently have. As Trevor Carothers pointed out above, there is no way to get the general app with audio focus.
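For illustration, a hedged sketch of how those two methods can be reached via MediaSessionManager; note that getActiveSessions() requires an enabled notification listener, and MyNotificationListener is a placeholder class name, not something from the answer above:

MediaSessionManager msm =
        (MediaSessionManager) context.getSystemService(Context.MEDIA_SESSION_SERVICE);
List<MediaController> controllers = msm.getActiveSessions(
        new ComponentName(context, MyNotificationListener.class));
for (MediaController controller : controllers) {
    // the package behind each active media session
    Log.i("Sessions", "package: " + controller.getPackageName());
}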
You can use dumpsys audio to find out who is using audio focus. You can also look into the output of dumpsys media_session.
And if you want to find out who is playing music, you can use dumpsys media.audio_flinger; that is the command I switched to myself.
According to the Release Notes (of July 8), the docs for the Sender, and the updated answer to this question, the Styled Media Receiver of Google Cast now supports Closed Captioning and subtitle tracks.
However, when I tell the Default or the Styled Media Receiver to show a text track, nothing happens. It does not even load the .vtt from the server, as I can see in the logs.
I can tell the receiver app got the text tracks just fine, but even with the Android example app, the subtitles never show up. According to all the logs they are being sent, and the receiver app is told to show them, but they never appear; they are never even loaded.
The MediaTrack is being created as follows:
new MediaTrack.Builder(2, MediaTrack.TYPE_TEXT)
        .setName("Deutsch")
        .setSubtype(MediaTrack.SUBTYPE_CAPTIONS)
        .setContentId("https://example.com/video/caption_de.vtt")
        .setContentType("text/vtt")
        .setLanguage("de")
        .build();
I have checked three times that the file exists and is served with the content type text/vtt. But that does not matter, as the file is never even requested by the player. I have tried both MediaTrack.SUBTYPE_CAPTIONS and MediaTrack.SUBTYPE_SUBTITLES.
So I need to know, is this claimed support of CC in the Styled Media Receiver simply a lie? Or is there some undocumented trick required to make it possible?
If there is still a custom receiver required, I would like to know how to convert the example player to support subtitles, as it doesn't seem to support them either.
First, I suggest you change your wording in future posts (re: "..is simply a lie.."); that is not appropriate at all. Secondly, it works, and you can test that with the CastVideos-android app (or the iOS variation of it, for that matter); the first three videos have CC. Lastly, we have documentation on that subject on our documentation site (https://developers.google.com/cast/docs/android_sender, under "Using the Tracks API").
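For completeness, a hedged sketch of the sender-side flow from the "Using the Tracks API" documentation, using the Cast v2 RemoteMediaPlayer; apiClient and remoteMediaPlayer are assumed to be an already-connected setup, germanTrack is the MediaTrack built in the question, and the track ID 2 matches that builder:

List<MediaTrack> tracks = new ArrayList<>();
tracks.add(germanTrack);
MediaInfo mediaInfo = new MediaInfo.Builder("https://example.com/video/video.mp4")
        .setStreamType(MediaInfo.STREAM_TYPE_BUFFERED)
        .setContentType("video/mp4")
        .setMediaTracks(tracks)
        .build();
remoteMediaPlayer.load(apiClient, mediaInfo, true);
// once the load completes, activate the text track by its ID:
remoteMediaPlayer.setActiveMediaTracks(apiClient, new long[] {2});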
I have read the Android APIs and searched the internet about declaring a custom audioSessionId, using it to initialize an AudioFx class, and then assigning that hardcoded audioSessionId to my MediaPlayer or AudioTrack.
This method would allow me to create an AudioFx first and later attach a new MediaPlayer or AudioTrack to this audioSessionId.
I'm currently able to use this method on Android 2.3.6, but on Android 4.x I run into errors saying that initialization fails; on other ICS/Jelly Bean devices the failure is silent, but calling a function later leads to exceptions.
Samsung Galaxy S II [Android 4.0.3]: [Issue no longer happens with Android 4.0.4]
E/AudioEffect(13250): set(): AudioFlinger could not create effect, status: -38
E/AudioEffects-JNI(13250): AudioEffect initCheck failed -5
E/AudioEffect-JAVA(13250): Error code -5 when initializing AudioEffect.
W/WrapEqualizer(13250): createEqualizer() -> Effect library not loaded
Motorola Xoom [Android 4.1.2]
It seems to fail silently after the constructor; then, calling getProperties(), it crashes.
java.lang.RuntimeException: AudioEffect: set/get parameter error
at android.media.audiofx.AudioEffect.checkStatus(AudioEffect.java:1247)
at android.media.audiofx.Equalizer.getProperties(Equalizer.java:532)
Nexus 4 [Android 4.2.1]
Using audioSessionId=0 everything works fine, but with any other number the device reports the following silent error every time I try to change the preset or band level, or turn Bass Boost or Virtualizer ON. The effect ID reported differs depending on the FX I'm trying to modify.
W/AudioPolicyManagerBase(165): unregisterEffect() unknown effect ID 1381
Update 08/11/12:
I'm able to use audioSessionId 0. I know it's deprecated, but it works when using the permission <uses-permission android:name="android.permission.MODIFY_AUDIO_SETTINGS" />. Should I be using the AudioFx with audio session id 0?
You should look at: this
Apparently it is an unsolved issue that came up in ICS, and probably wasn't solved in JB either.
Should I be using the AudioFx with the audio session id 0?
It will probably work in some cases, but don't count on it to continue to do so on future Android versions. You'll already be compromising interoperability between your app and other apps on Jellybean. Just take a look at what the AudioFlinger does when an effect is enabled:
// suspend all effects in AUDIO_SESSION_OUTPUT_MIX when enabling any effect on
// another session. This gives the priority to well behaved effect control panels
// and applications not using global effects.
// Enabling post processing in AUDIO_SESSION_OUTPUT_STAGE session does not affect
// global effects
if ((sessionId != AUDIO_SESSION_OUTPUT_MIX) && (sessionId != AUDIO_SESSION_OUTPUT_STAGE)) {
setEffectSuspended_l(NULL, enabled, AUDIO_SESSION_OUTPUT_MIX);
}
I know this issue. If somebody wants to try a workaround, do this:
Equalizer eq = null;
// ...
// in any function, before initialization, do this:
if (eq != null) {
    eq.release();
}
eq = new Equalizer(0, audioSessionId);
Try it once.
Other than session 0, which is the "deprecated global session", my understanding of the AudioFlinger code is that sessions are only created for classes that actually do audio I/O, that is, AudioRecord, AudioTrack, MediaPlayer, etc. You should create these classes first, then get their session ID, and then attach the effect.
Any other value you supply for the session ID will correspond to an audio session that does not exist, and so it will fail.
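A minimal sketch of that order of operations, using a MediaPlayer and a placeholder raw resource R.raw.some_track:

MediaPlayer player = MediaPlayer.create(context, R.raw.some_track);
int sessionId = player.getAudioSessionId();  // a real, existing session
Equalizer eq = new Equalizer(0, sessionId);  // priority 0
eq.setEnabled(true);
player.start();
// ... later, release in reverse order:
eq.release();
player.release();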
CONTEXT: My application is sending sentences to whatever TTS engine the user has. Sentences are user-generated and may contain punctuation.
PROBLEM: Some users report that the punctuation is read aloud (TTS says "comma" etc) on SVOX, Loquendo and possibly others.
QUESTION:
Should I strip all punctuation?
Should I transform the punctuation using this kind of API?
Should I let the TTS engine deal with the punctuation?
The same user who sees the problem with Loquendo does not have this problem with another Android application called FBReader, so I guess the third option is not the right thing to do.
I had the same problem with one of my apps.
The input string was:
Next alarm in 10 minutes,it will be 2:45 pm
and the TTS engine would say:
Next alarm in 10 minutes comma it will be 2:45 pm.
The problem was fixed just by adding a space after the comma like this:
Next alarm in 10 minutes, it will be 2:45 pm
This is a stupid mistake, and maybe your problem is more complicated than that, but it worked for me. :)
So, you're worried about what back-alley-acquired text-to-speech engine the user might happen to have selected as their default... presumably because you don't want your app to look bad due to this engine's unknown/bad behavior. Understandable.
The (good) fact is, though, that the TTS's behavior is not actually your responsibility unless you decide to embed an engine in the app itself (Difficulty: Hard, Recommended? No).
Engines can and should be presumed to adhere to Android rules and behaviors dictated here... and presumed to supply their own sufficient set of configuration options in the Android system settings (home\settings\language&locale\TTS) which may or may not include pronunciation options. The user should also be presumed intelligent enough to install an engine that they are satisfied with.
It is a slippery slope to take on the job of anticipating and "correcting" for unknown and unwanted engine behaviors (at least in engines that you haven't tested yourself).
A SIMPLE AND GOOD OPTION (Difficulty: Easy):
Make a setting in your app: "ignore punctuation."
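A minimal sketch of what that setting could do, stripping punctuation before the text reaches the engine (the preference key "ignore_punctuation" is a placeholder name):

String prepareForTts(String sentence, SharedPreferences prefs) {
    if (prefs.getBoolean("ignore_punctuation", false)) {
        // replace punctuation with spaces, then collapse runs of whitespace
        sentence = sentence.replaceAll("\\p{Punct}+", " ")
                           .replaceAll("\\s+", " ")
                           .trim();
    }
    return sentence;
}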
A BETTER OPTION (Difficulty: Medium):
Do the above, but only show the "ignore punctuation" setting-option if the engine you have detected on the user's device is prone to this issue.
Also, one thing to note is that there are many, many differences between engines (whether they use embedded voices vs online, response time, initialization time, reliability/adherence to Android specs, behavior across Android API levels, behavior across their own version history, the quality of voices, not to mention language capability)... differences that may be even more important to users than whether or not punctuation is pronounced.
You say "My application is sending sentences to whatever TTS engine the user has." Well... "That's yer problem right there." Why not give the user a choice on what engine to use?
And that leads us to...
AN EVEN BETTER OPTION (Difficulty: Hard and Good! [in my humble opinion]):
Decide on some "known-good" engines your app will "support," starting with Google and Samsung. I would guess that there are less than 5% of devices out there these days that don't have either of those engines on them.
Study and test these engines as much as possible across all Android API levels that you plan to support... at least in as far as whether they pronounce punctuation or not.
Over time, test more engines if you like, and add them to your supported engines in subsequent app updates.
Run an algorithm when your app starts that detects which engines are installed, then use that info against your own list of supported engines:
private ArrayList<String> whatEnginesAreInstalled(Context context) {
    final Intent ttsIntent = new Intent();
    ttsIntent.setAction(TextToSpeech.Engine.ACTION_CHECK_TTS_DATA);
    final PackageManager pm = context.getPackageManager();
    final List<ResolveInfo> list = pm.queryIntentActivities(ttsIntent, PackageManager.GET_META_DATA);
    ArrayList<String> installedEngineNames = new ArrayList<>();
    for (ResolveInfo r : list) {
        String engineName = r.activityInfo.applicationInfo.packageName;
        installedEngineNames.add(engineName);
        // just logging the version number out of interest
        String version = "null";
        try {
            version = pm.getPackageInfo(engineName,
                    PackageManager.GET_META_DATA).versionName;
        } catch (Exception e) {
            Log.i("XXX", "could not read version for " + engineName);
        }
        Log.i("XXX", "we found an engine: " + engineName);
        Log.i("XXX", "version: " + version);
    }
    return installedEngineNames;
}
In your app's settings, present all engines that you've decided to support as options (even if not currently installed). This could be a simple group of RadioButtons with titles corresponding to the different engine names. If the user selects one that isn't installed, notify them of that and give them the option of installing it with an intent.
Save the user's selected engine name (String) in SharedPreferences, and use their selection as the last argument of the TextToSpeech constructor any time you need a TTS in your app.
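A short sketch of that last step; the preference key "selected_engine" and the Google engine default are placeholder choices, while the three-argument TextToSpeech constructor taking an engine package name is standard:

SharedPreferences prefs = PreferenceManager.getDefaultSharedPreferences(context);
String enginePackage = prefs.getString("selected_engine", "com.google.android.tts");
TextToSpeech tts = new TextToSpeech(context, status -> {
    if (status == TextToSpeech.SUCCESS) {
        // engine is ready; safe to call tts.speak(...) from here on
    }
}, enginePackage);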
If the user has some weird engine installed, present it as a choice also, even if it is unrecognized/unsupported, but inform them that they have selected an unknown/untested engine.
If the user selects an engine that is supported but is known to pronounce punctuation (bad), then upon selection of that engine, have an alert dialog pop up warning the user about that, explaining that they can turn this bad behavior off with the "ignore punctuation" setting referred to already.
SIDE-NOTES:
Don't let the SVOX/PICO (emulator) engine get you too worried -- it has many flaws and is not even designed or guaranteed to run on Android above API ~20, but it is still included in emulator images up to API ~24, resulting in "unpredictable results" that don't reflect reality. I have yet to see this engine on any real hardware device made within the last seven years or so.
Since you say that "sentences are user generated," I would be more worried about solving the problem of what language they are going to be typing in! I'll look out for a question on that! :)