The latest firmware updates on the Pixel Buds Pro have enabled head tracking for spatial audio. Do you know if Google offers an API in Android to read data from the Pixel Buds IMU (i.e. angles/orientation data), like Apple does for AirPods (the CoreMotion API)?
I imagine it would be via Android's Sensors HAL API but I haven't found anything for reading this data in the documentation.
Also, do you know whether, via the Android 13 spatializer
ro.audio.spatializer_enabled=true
it is possible to fix the position of a sound source?
Related
I just updated to Android 13, and one of the advertised features is spatial audio, but when I access the Spatializer class to check if it's available, I get false. Also, in the settings there is nothing regarding spatial audio. Is it already implemented or not?
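For what it's worth, here is a minimal sketch (Java) of querying the platform Spatializer: AudioManager#getSpatializer() exists from API 32, and the head-tracker query from API 33. As far as I can tell this only reports capability, availability, and enablement; it does not expose the IMU angles or let you place an individual sound source.

import android.content.Context;
import android.media.AudioManager;
import android.media.Spatializer;
import android.os.Build;
import android.util.Log;

public final class SpatializerCheck {

    /** Logs what the platform Spatializer reports for the current device/headset combination. */
    public static void report(Context context) {
        if (Build.VERSION.SDK_INT < Build.VERSION_CODES.S_V2) {
            return; // Spatializer was added in API 32
        }
        AudioManager audioManager = (AudioManager) context.getSystemService(Context.AUDIO_SERVICE);
        Spatializer spatializer = audioManager.getSpatializer();

        // NONE means the device has no spatializer effect at all.
        boolean supported =
                spatializer.getImmersiveAudioLevel() != Spatializer.SPATIALIZER_IMMERSIVE_LEVEL_NONE;
        boolean available = spatializer.isAvailable(); // usable with the current audio routing
        boolean enabled = spatializer.isEnabled();     // user/system toggle

        boolean headTracker = false;
        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.TIRAMISU) {
            headTracker = spatializer.isHeadTrackerAvailable(); // API 33+
        }

        Log.d("SpatializerCheck", "supported=" + supported + " available=" + available
                + " enabled=" + enabled + " headTracker=" + headTracker);
    }
}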
We are creating a game for Android and iOS where we want to modify the story based on relaxation/stress levels.
We are thinking about using heart rate sensors like the Polar chest-strap sensors and similar devices.
I was wondering if anyone knows of a repo we could include in our app to handle the Bluetooth part of the sensors. We don't have experience with accessing hardware; we implemented some connections and readings, but the connection is unstable: sometimes it connects, other times the connection is lost or the sensor is not found...
Thanks in advance
Take a look at the developer page at https://www.rookmotion.com/api. It is an existing heart rate monitoring app, but it also has an SDK for third-party developers.
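If you would rather avoid a third-party SDK: most chest straps (Polar's included, as far as I know) expose the standard Bluetooth Heart Rate service, so a plain BluetoothGatt client is an option. A rough, untested sketch follows; scanning, runtime permissions (BLUETOOTH_CONNECT, or location on older APIs), and reconnect/back-off logic are omitted.

import android.bluetooth.*;
import android.content.Context;
import android.util.Log;
import java.util.UUID;

// Sketch of subscribing to the standard Bluetooth Heart Rate service (0x180D).
public class HeartRateClient {
    private static final UUID HR_SERVICE = UUID.fromString("0000180d-0000-1000-8000-00805f9b34fb");
    private static final UUID HR_MEASUREMENT = UUID.fromString("00002a37-0000-1000-8000-00805f9b34fb");
    private static final UUID CCC_DESCRIPTOR = UUID.fromString("00002902-0000-1000-8000-00805f9b34fb");

    public void connect(Context context, BluetoothDevice device) {
        device.connectGatt(context, false, new BluetoothGattCallback() {
            @Override
            public void onConnectionStateChange(BluetoothGatt gatt, int status, int newState) {
                if (newState == BluetoothProfile.STATE_CONNECTED) {
                    gatt.discoverServices();
                } else if (newState == BluetoothProfile.STATE_DISCONNECTED) {
                    gatt.close(); // reconnect/back-off logic would go here
                }
            }

            @Override
            public void onServicesDiscovered(BluetoothGatt gatt, int status) {
                BluetoothGattCharacteristic hr =
                        gatt.getService(HR_SERVICE).getCharacteristic(HR_MEASUREMENT);
                gatt.setCharacteristicNotification(hr, true);
                BluetoothGattDescriptor ccc = hr.getDescriptor(CCC_DESCRIPTOR);
                ccc.setValue(BluetoothGattDescriptor.ENABLE_NOTIFICATION_VALUE);
                gatt.writeDescriptor(ccc); // ask the sensor to push notifications
            }

            @Override
            public void onCharacteristicChanged(BluetoothGatt gatt, BluetoothGattCharacteristic c) {
                byte[] value = c.getValue();
                int flags = value[0] & 0xFF;
                // Bit 0 of the flags byte: 0 = UINT8 heart rate, 1 = UINT16.
                int bpm = (flags & 0x01) == 0
                        ? (value[1] & 0xFF)
                        : ((value[1] & 0xFF) | ((value[2] & 0xFF) << 8));
                Log.d("HeartRateClient", "bpm=" + bpm);
            }
        });
    }
}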
I am an Android developer living with hearing impairment, and I am currently exploring the option of making a speech-to-text app with the SpeechRecognizer API in Android. Closed-captioning telephones and Innocaption are not available in my home country. Potential applications might be things like captioning during telephone calls.
https://developer.android.com/reference/android/speech/SpeechRecognizer.html
The API is meant for capturing voice commands, not for real-time live transcription. I am able to implement it as a service, but I constantly need to restart it after it has delivered a result or a partial result, which is not feasible in a conversational setting (words get lost while the service is restarting).
Do note that I don't need 100% accuracy for this app. Many hearing-impaired people find it helpful to have some context of the conversation to help them along, so I don't actually need comments about how this is not going to be accurate.
Is there a way to implement SpeechRecognizer in a continuous mode? I can create a TextView that constantly updates itself when new text is returned from the service. If this API is not what I should be looking at, is there any recommendation? I tested CMUSphinx but found that it is too dependent on blocks of phrases/sentences, so it is not likely to work for the kind of application I have in mind.
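For reference, this is roughly what my current restart-based approach looks like: the recognizer is immediately restarted from onResults/onError, which only shortens the gap rather than removing it (sketch; RECORD_AUDIO permission and the actual TextView wiring are assumed to be handled elsewhere).

import android.content.Context;
import android.content.Intent;
import android.os.Bundle;
import android.speech.RecognitionListener;
import android.speech.RecognizerIntent;
import android.speech.SpeechRecognizer;
import java.util.ArrayList;

public class ContinuousTranscriber {
    private final SpeechRecognizer recognizer;
    private final Intent intent;

    public ContinuousTranscriber(Context context) {
        recognizer = SpeechRecognizer.createSpeechRecognizer(context);
        intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
        intent.putExtra(RecognizerIntent.EXTRA_PARTIAL_RESULTS, true);

        recognizer.setRecognitionListener(new RecognitionListener() {
            @Override public void onPartialResults(Bundle partial) {
                show(partial); // interim text while the user is still speaking
            }
            @Override public void onResults(Bundle results) {
                show(results);
                start();       // restart immediately for the next utterance
            }
            @Override public void onError(int error) {
                start();       // e.g. ERROR_NO_MATCH / ERROR_SPEECH_TIMEOUT: just restart
            }
            // remaining callbacks left empty for brevity
            @Override public void onReadyForSpeech(Bundle params) {}
            @Override public void onBeginningOfSpeech() {}
            @Override public void onRmsChanged(float rmsdB) {}
            @Override public void onBufferReceived(byte[] buffer) {}
            @Override public void onEndOfSpeech() {}
            @Override public void onEvent(int eventType, Bundle params) {}
        });
    }

    public void start() {
        recognizer.startListening(intent);
    }

    private void show(Bundle bundle) {
        ArrayList<String> texts =
                bundle.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
        if (texts != null && !texts.isEmpty()) {
            // append texts.get(0) to the TextView here
        }
    }
}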
I am a deaf software developer, so I can chime in. I've been monitoring the state of the art of speech-to-text APIs, and the APIs have now become "good enough" to provide operatorless relay/captioning services for CERTAIN kinds of phone conversations with people using a telephone in quiet settings. For example, I get 98% transcription accuracy with my spouse's voice with Apple Siri's realtime transcription (iOS 8).
I was able to jerry-rig phone captioning by routing the sound out of one phone into a second iPhone on which I press the microphone button (on the pop-up keyboard), and successfully captioned a telephone conversation with ~95% accuracy at 250 words per minute (faster than Sprint Captioned Telephone and Hamilton Captioned Telephone), at least until the one-minute cutoff time.
Thus, I declare computer-based voice recognition practical for phone calls with family members (of the type you call frequently in quiet environments), where you can at least coach them to move to a quiet place to allow captioning to work properly (with >95% accuracy). Since iOS 8 got released, we REALLY need this, so we don't have to rely on relay operators or captioned telephones. Sprint Captioned Telephone lags badly during fast speech, while Apple Siri keeps up, so I can conduct more natural telephone conversations with my jerry-rigged two-iOS-device Apple Siri "realtime captioned telephone" setup.
Some cellphones transmit audio in a higher-definition manner, so it works well between two iPhones (one iPhone's speaker piped into another iPhone's Siri running in iOS 8 continuous mode). That's assuming you're on G.722.2 (AMR-WB), as when running two iPhones on the same carrier that supports the high-definition audio telephony standard. It works perfectly when piped through Siri -- roughly as good as speaking in front of the phone, for the same human voice (assuming the other end is speaking into the phone in a quiet environment).
Google and Apple need to open up their speech-to-text APIs to assistive applications, pronto, because operatorless telephone transcription is finally practical, at least when calling family members (good voices, coached to be in a quiet environment when receiving the call). The continuous-recognition time limit also needs to be removed in this situation.
Google's recognizer is not going to work with telephone-quality audio anyway; you need to build a captioning service using CMUSphinx yourself.
You probably didn't configure CMUSphinx properly; it should be OK for large-vocabulary transcription. The only thing you should take care of is to use the telephony 8 kHz acoustic model (not the wideband model) together with a generic language model.
For the best accuracy it's probably worth moving the processing to a server: you can set up a PBX to take the calls and transcribe the audio there instead of hoping to do everything on a limited device.
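For reference, the basic pocketsphinx-android setup from the demo project looks roughly like the sketch below; the model directory names are placeholders for whichever 8 kHz telephony acoustic model and language model you actually bundle with the app.

import android.content.Context;
import edu.cmu.pocketsphinx.Assets;
import edu.cmu.pocketsphinx.RecognitionListener;
import edu.cmu.pocketsphinx.SpeechRecognizer;
import edu.cmu.pocketsphinx.SpeechRecognizerSetup;
import java.io.File;
import java.io.IOException;

public final class SphinxSetup {
    // "en-us-8khz", "cmudict-en-us.dict" and "en-us.lm.bin" are placeholder
    // file names for the telephony acoustic model, dictionary and language model.
    public static SpeechRecognizer create(Context context, RecognitionListener listener)
            throws IOException {
        File assetsDir = new Assets(context).syncAssets();
        SpeechRecognizer recognizer = SpeechRecognizerSetup.defaultSetup()
                .setAcousticModel(new File(assetsDir, "en-us-8khz"))
                .setDictionary(new File(assetsDir, "cmudict-en-us.dict"))
                .getRecognizer();
        recognizer.addListener(listener);
        recognizer.addNgramSearch("phone_call", new File(assetsDir, "en-us.lm.bin"));
        return recognizer; // caller invokes startListening("phone_call")
    }
}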
It is true that the SpeechRecognizer API documentation claims that
The implementation of this API is likely to stream audio to remote servers to perform speech recognition. As such this API is not intended to be used for continuous recognition, which would consume a significant amount of battery and bandwidth.
This bit of text was added a year ago (https://android.googlesource.com/platform/frameworks/base/+/2921cee3048f7e64ba6645d50a1c1705ef9658f8). However, no changes were made to the API at the time, i.e. the API remained the same. Also, I don't really see anything specific to networking and battery drain in the API documentation. So, go ahead and implement a recognizer (maybe based on CMUSphinx) and make it accessible via this API.
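A skeleton of such a recognizer exposed through the framework might look like the sketch below; the engine itself (CMUSphinx or anything else) is only hinted at in comments, and the service still has to be declared in the manifest with the android.speech.RecognitionService intent filter.

import android.content.Intent;
import android.os.Bundle;
import android.os.RemoteException;
import android.speech.RecognitionService;
import android.speech.SpeechRecognizer;
import java.util.ArrayList;

// Skeleton of a custom RecognitionService; clients reach it through the
// ordinary SpeechRecognizer API once it is installed and selected.
public class SphinxRecognitionService extends RecognitionService {

    @Override
    protected void onStartListening(Intent recognizerIntent, Callback callback) {
        try {
            callback.readyForSpeech(new Bundle());
            // Start the offline engine here and forward its hypotheses:
            ArrayList<String> hypotheses = new ArrayList<>();
            hypotheses.add("partial hypothesis goes here");
            Bundle results = new Bundle();
            results.putStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION, hypotheses);
            callback.partialResults(results);
        } catch (RemoteException e) {
            // the client went away
        }
    }

    @Override
    protected void onStopListening(Callback callback) {
        // stop capturing audio and deliver final results via callback.results(...)
    }

    @Override
    protected void onCancel(Callback callback) {
        // abort recognition and release the microphone
    }
}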
The project objective is something like this:
We would have a mobile robot with an on-board GPS unit.
Using that GPS, we want to track the robot's position and trace it on an Android cellphone
(as they provide the best interface with Google services).
Not only that:
we also want to control the robot from the Android cellphone.
Is it possible to send a control signal from the Android cellphone to the robot so that the robot can move according to that signal?
How can we make a connection between the Android cellphone and the robot's on-board GPS?
(We are somewhat new to robotics.)
Any better ideas or suggestions are most welcome.
Check out the Dension WiRC module:
WiRC module by Dension
It works for RC-controlled platforms by sending a pulse-width-modulated pulse train, allowing you to control servos and electronic speed controllers. There are 8 channels. I'm using it to control 2 tracks and a pan/tilt turret, and it works great. I emailed the support team, and they sent me an iPhone project, which got me up and running in a matter of hours. The WiRC kit comes with a camera, so I can drive my robot remotely via Wi-Fi.
In terms of GPS, I did a test on an iPhone under a clear sky, and the GPS signal drifts badly. The accuracy is indeed somewhere between 30 and 50 feet, which is not enough to track the position of a small robot precisely. I will post a screenshot of my experiment.
Check out this screenshot: I'm walking along the white paths on the map with the phone in my shirt pocket. Every second it places a pin on the map. You can see how badly the red pins deviate from the white path. This is 30 to 50 feet off the path. For a 2-foot-long robot, this is major trouble. If it tries to correct its path with GPS at that resolution, it is likely to become very confused.
I've seen the differential-drive equations on Wikipedia (a motor with slit encoders, counting the number of slits that pass the encoder in a certain interval of time). This may help correct the GPS, but it requires additional hardware.
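For illustration, the usual differential-drive odometry update from encoder ticks looks roughly like this; the wheel radius, ticks per revolution, and wheel base are parameters you would measure on your own robot.

// Rough sketch of the standard differential-drive odometry update.
public class Odometry {
    private final double wheelRadius;   // meters
    private final double ticksPerRev;   // encoder slits per wheel revolution
    private final double wheelBase;     // distance between the two wheels, meters

    private double x, y, heading;       // pose estimate

    public Odometry(double wheelRadius, double ticksPerRev, double wheelBase) {
        this.wheelRadius = wheelRadius;
        this.ticksPerRev = ticksPerRev;
        this.wheelBase = wheelBase;
    }

    /** Update the pose from the tick counts seen since the last call. */
    public void update(int leftTicks, int rightTicks) {
        double distPerTick = 2.0 * Math.PI * wheelRadius / ticksPerRev;
        double dLeft = leftTicks * distPerTick;
        double dRight = rightTicks * distPerTick;

        double dCenter = (dLeft + dRight) / 2.0;        // forward travel
        double dTheta = (dRight - dLeft) / wheelBase;   // change in heading

        x += dCenter * Math.cos(heading + dTheta / 2.0);
        y += dCenter * Math.sin(heading + dTheta / 2.0);
        heading += dTheta;
    }
}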
I am working on something similar to this.
I am trying to make an autonomous robot capable of moving itself based on some extensive robotics algorithms, but you certainly don't need that.
But I think it would be better for you to mount the Android phone on the robot and then control it with your laptop via Wi-Fi or any other medium.
Mounting an Android phone has many advantages, such as:
Having a good GPS, with no extra work needed to integrate it with other hardware and software.
Access to other hardware like the accelerometer, proximity sensor, gravity sensor, etc., which can be useful in many ways.
There is now a lot of material on making robots based on Android. Here is Cellbots:
they work on making robots from Android devices and controlling them remotely from laptops or other Android devices.
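As a rough illustration of the mounted-phone idea, reading the phone's own GPS fixes and handing them to whatever Wi-Fi link you use to reach the laptop could look like this; permission handling is reduced to a single check, and the transport itself is left as a callback.

import android.Manifest;
import android.content.Context;
import android.content.pm.PackageManager;
import android.location.Location;
import android.location.LocationListener;
import android.location.LocationManager;
import android.os.Bundle;

// Sketch: the phone mounted on the robot reports its own GPS fixes.
public class RobotGps {

    public interface FixSink { void onFix(double lat, double lon, float accuracyMeters); }

    public static void start(Context context, FixSink sink) {
        if (context.checkSelfPermission(Manifest.permission.ACCESS_FINE_LOCATION)
                != PackageManager.PERMISSION_GRANTED) {
            return; // request ACCESS_FINE_LOCATION first
        }
        LocationManager lm = (LocationManager) context.getSystemService(Context.LOCATION_SERVICE);
        // One fix per second at most; call this from the main thread.
        lm.requestLocationUpdates(LocationManager.GPS_PROVIDER, 1000L, 0f, new LocationListener() {
            @Override public void onLocationChanged(Location location) {
                sink.onFix(location.getLatitude(), location.getLongitude(), location.getAccuracy());
            }
            @Override public void onStatusChanged(String provider, int status, Bundle extras) {}
            @Override public void onProviderEnabled(String provider) {}
            @Override public void onProviderDisabled(String provider) {}
        });
    }
}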
To read sensor data on an Android platform (i.e. accelerometer, gyroscope, magnetometer, barometer, GPS), people on the internet talk about two ways to acquire such data:
The primary way: reading the data using the Android SDK via Java.
The second way: reading the data using the Android NDK.
What about communicating with the sensors directly via SPI, I2C, or UART, without using the SDK or the NDK? I understand that I'd be burdened with understanding the communication protocol of the sensors and reading the specific registers from which I can acquire the data more efficiently. Is this possible?
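For comparison, the "primary way" above is only a few lines through the SDK; the HAL and the underlying bus protocol (I2C/SPI) stay hidden behind SensorManager.

import android.content.Context;
import android.hardware.Sensor;
import android.hardware.SensorEvent;
import android.hardware.SensorEventListener;
import android.hardware.SensorManager;

// Reading the accelerometer through the SDK's SensorManager.
public class AccelerometerReader implements SensorEventListener {
    private final SensorManager sensorManager;

    public AccelerometerReader(Context context) {
        sensorManager = (SensorManager) context.getSystemService(Context.SENSOR_SERVICE);
    }

    public void start() {
        Sensor accel = sensorManager.getDefaultSensor(Sensor.TYPE_ACCELEROMETER);
        sensorManager.registerListener(this, accel, SensorManager.SENSOR_DELAY_GAME);
    }

    public void stop() {
        sensorManager.unregisterListener(this);
    }

    @Override
    public void onSensorChanged(SensorEvent event) {
        float ax = event.values[0]; // m/s^2 along the device's x axis
        float ay = event.values[1];
        float az = event.values[2];
        // use the values here
    }

    @Override
    public void onAccuracyChanged(Sensor sensor, int accuracy) { }
}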
In theory it is possible, Walid. If you throw enough time and money at most technical problems, solutions become possible. But I would have to ask why anyone would want to do it that way?
It would be like saying "I'm pretty sure I can drive my car, inverted. I'll operate the accelerator and brake with my hands, and I'll add a couple of extra mirrors to reflect the windshield view down to me. And I'll steer with my legs. Don't ask me how I'll operate the horn!" It's just doing it at a goofy level.
You'd surely need details of the individual chips, which means you'd need to tear your XOOM apart - that kind of implementation info is not published. Not because it's a big secret, but because it keeps costs down if manufacturers don't publish info that 100% of consumers don't need.
Bottom line: there are more productive uses of your energy and brainpower.
Peter