JeremyFr79 228 Posted June 1, 2015 Ok, so my in-laws got an X1 DVR from Comcast/Xfinity today, and all I gotta say is the voice controls are the SHIT! Is there any hope of maybe seeing something like this with Emby, say within the mobile apps? This would push usability over the top! Wishful thinking, I'm sure, but I can't be the only one here who thinks this would be insanely awesome. Or... maybe it's just the whiskey talking tonight, I dunno lol
CashMoney 94 Posted June 1, 2015 Not sure about mobile apps, but @chef has written a plugin for using the Kinect for voice control, and VoxCommando also has a Media Browser/Emby plugin, both of which work well.
chef 3810 Posted June 1, 2015 Adding the System.Speech namespace to a plugin wouldn't be difficult. The question is: is it worthwhile? There is a good chance that we will see an Xbox One app at some point, so adding functionality like this to a plugin or client app seems redundant, though not impossible. The Kinect app is old now, and open-air speech commands aren't very good with the old-generation Kinect. But you wouldn't need to use a Kinect. A plugin or client app could easily use any microphone to control Emby with speech.
chef 3810 Posted June 1, 2015 Not sure if Luke's question was directed at me, but... Unless the user had some sort of remote with a microphone in it (e.g. the Amulet voice remote), perhaps the only other practical microphone would be the one in everybody's pocket: their cell phone. In that case, .NET namespaces are not practical. The Xbox One Kinect has a decent microphone array, but not everyone has an Xbox (even though I think everyone should have one lol). Speech technology needs a big investment in hardware. Not sure where one would begin. If it is .NET/Windows tech that someone wants, I know those namespaces and all the tricks. But again, it's a big investment that would reach a limited number of users. Any thoughts?
Luke 42079 Posted June 1, 2015 yea I was looking for examples of what we could do with the mobile apps, because we can certainly capture voice input.
Koleckai Silvestri 1154 Posted June 1, 2015 With the new Roku models, the 3 has a microphone in the remote. I am guessing the audio is processed on some remote site and not by the Roku itself. The voice recognition in the Amazon Fire uses Amazon's servers for the same purpose. If you have an older Roku, or your new device doesn't have a microphone built into the remote, you can use the Roku app to do voice recognition. This is processed by the voice recognition APIs on the device. These may hit their respective servers as well (Siri, Cortana, Google Now, Amazon). I use it all the time on my older Roku 3 device, but it should work on any of them. For Emby to handle this through an app, it would need to tie into those same voice recognition APIs. I am sure they will differ depending on the OS of the device, but basically they will process the speech and return text. Edited June 1, 2015 by Koleckai Silvestri
JeremyFr79 228 Posted June 1, 2015 Author Well, for instance, on this "X1" system my in-laws got, you can voice search media. You could say "show me NCIS" and it pulls up NCIS from your recorded content, the guide, on demand, etc. Or you say "watch CBS" and it automatically tunes to your CBS affiliate, and so on and so forth. I hate hate hate cable company provided equipment, but damn if that wasn't the funnest shit to use. So in my head I'm thinking something along the lines of this... You log into your Android app or iOS or whatever and select "cast". Once you're "casting", you are provided with a "mic" button which you could press and say something like "watch CBS" or "watch Chappie", and it would then cast the content (provided it found it) to your device, something along those lines. Or for music, "listen to Pink Floyd", etc. It just seems this could be a killer feature. All my wife could do after we left the in-laws last night was talk about how cool it was to talk to your remote and have the cable box do it, and it worked damn near flawlessly at that.
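For anyone wondering what the step after recognition might look like, here is a minimal TypeScript sketch of turning recognized text such as "watch CBS" or "listen to Pink Floyd" into a structured command. It assumes the speech engine has already returned plain text. The `VoiceCommand` type, the `rules` table, and `parseVoiceCommand` are made up for illustration and are not part of Emby or its API.

```typescript
// Hypothetical command shapes; none of these names come from Emby itself.
type VoiceCommand =
  | { action: "play"; mediaType: "video" | "audio"; query: string }
  | { action: "search"; query: string };

interface CommandRule {
  pattern: RegExp;
  build: (query: string) => VoiceCommand;
}

// Each rule pairs a spoken prefix from the thread with the command it produces.
const rules: CommandRule[] = [
  {
    pattern: /^listen to\s+(.+)$/i,
    build: (query) => ({ action: "play", mediaType: "audio", query }),
  },
  {
    pattern: /^watch\s+(.+)$/i,
    build: (query) => ({ action: "play", mediaType: "video", query }),
  },
  {
    pattern: /^show me\s+(.+)$/i,
    build: (query) => ({ action: "search", query }),
  },
];

// Turn recognized text into a command, or null when no rule matches.
function parseVoiceCommand(transcript: string): VoiceCommand | null {
  // Collapse stray whitespace so "watch   CBS " still matches.
  const text = transcript.trim().replace(/\s+/g, " ");
  for (const rule of rules) {
    const match = rule.pattern.exec(text);
    const query = match ? match[1] : undefined;
    if (query !== undefined) {
      return rule.build(query);
    }
  }
  return null;
}
```

Unrecognized phrases come back as `null`, so the app can show a "didn't catch that" message instead of guessing.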
chef 3810 Posted June 1, 2015 I agree it is a killer feature. It would also be awesome to have endpoints for this stuff in the API. If you capture voice with a phone, which device would do the recognition? The server? The phone?
Luke 42079 Posted June 1, 2015 In reply to JeremyFr79's description of the X1 above: do you press a button before talking to it, or is it continuous?
JeremyFr79 228 Posted June 1, 2015 Author Luke said: "Do you press a button before talking to it, or is it continuous?" You press and hold a button on the remote while you're issuing the command, then let go once you're done talking.
chef 3810 Posted June 1, 2015 chef said: "If you capture voice with a phone, which device would do the recognition? The server? The phone?" Because if you bit-stream the audio back to the server with the larger processor... Well... That would be pretty cool stuff.
mediacowboy 438 Posted June 1, 2015 This would be awesome. You could have it integrated into the remote control part of so many of the interfaces.
Luke 42079 Posted June 2, 2015 if any of you run the server dev build, I've added this to the web interface. Chrome is currently the only browser that supports the SpeechRecognition API (although maybe Edge?). it only accepts a few commands at this point, and only in English. so clearly, it's going to take a few months to build up a library of commands. an upcoming update to Android will have it, and iOS later on too. but I see what you mean about the fun factor
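As a rough idea of what capturing one spoken command in the browser can look like, here is a minimal TypeScript sketch using the Web Speech API. `SpeechRecognition` and the prefixed `webkitSpeechRecognition` are the real browser constructors. The `RecognitionLike` interface, `getRecognitionConstructor`, and `listenOnce` are names invented for this example, and this is not how Emby's web interface actually does it.

```typescript
// Minimal shape of the Web Speech API members used below. Declared locally
// because not every TypeScript setup ships typings for this API.
interface RecognitionLike {
  lang: string;
  continuous: boolean;
  interimResults: boolean;
  onresult: ((event: any) => void) | null;
  onerror: ((event: any) => void) | null;
  start(): void;
}

// Chrome exposes the constructor with a webkit prefix; browsers without
// support expose neither name, in which case this returns null.
function getRecognitionConstructor(): (new () => RecognitionLike) | null {
  const scope = globalThis as any;
  return scope.SpeechRecognition ?? scope.webkitSpeechRecognition ?? null;
}

// Listen for a single phrase and hand the recognized text to onText.
// Returns false when the browser has no speech recognition support.
function listenOnce(
  onText: (text: string) => void,
  onError?: (reason: string) => void
): boolean {
  const Recognition = getRecognitionConstructor();
  if (Recognition === null) {
    return false;
  }
  const recognition = new Recognition();
  recognition.lang = "en-US"; // English only, matching the post above
  recognition.continuous = false; // stop after one phrase (push-to-talk style)
  recognition.interimResults = false; // only deliver the final result
  recognition.onresult = (event) => {
    // First alternative of the first result is the best guess.
    onText(event.results[0][0].transcript);
  };
  recognition.onerror = (event) => {
    if (onError) {
      onError(String(event.error));
    }
  };
  recognition.start();
  return true;
}
```

In a page this would typically be wired to a microphone button, for example `listenOnce((text) => console.log(text))`, with the `false` return value used to hide the button in unsupported browsers.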
JeremyFr79 228 Posted June 2, 2015 Author Luke said: "if any of you run the server dev build I've added this to the web interface." DAMN that was fast! This is the shit that makes me love you guys!
mediacowboy 438 Posted June 2, 2015 What are some of the commands it supports already?
Luke 42079 Posted June 2, 2015 There are only about 6 of them, and then it displays 4 random examples. So it's going to take quite some time to build up the list. Plus we'll want the examples to be context sensitive and related to what you were doing before you clicked the microphone. So for example, if you're controlling another device with a mobile app, we can have it give examples of how to do that.
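The "4 random examples, filtered by context" idea could be sketched like this in TypeScript. Everything here is an assumption for illustration: the `VoiceContext` values, the phrases in `exampleCommands`, and the `pickExamples` helper are invented and do not reflect the actual command list in the dev build.

```typescript
// Hypothetical contexts the user might be in when clicking the microphone.
type VoiceContext = "library" | "remoteControl";

interface ExampleCommand {
  phrase: string;
  contexts: VoiceContext[];
}

// Illustrative phrases only; the real command list is not shown in the thread.
const exampleCommands: ExampleCommand[] = [
  { phrase: "Show me NCIS", contexts: ["library"] },
  { phrase: "Watch CBS", contexts: ["library"] },
  { phrase: "Watch Chappie", contexts: ["library", "remoteControl"] },
  { phrase: "Listen to Pink Floyd", contexts: ["library", "remoteControl"] },
  { phrase: "Show me my movies", contexts: ["library"] },
  { phrase: "Play Chappie on the living room TV", contexts: ["remoteControl"] },
  { phrase: "Pause", contexts: ["remoteControl"] },
];

// Return up to `count` distinct example phrases that fit the given context.
// The random source is injectable so the selection can be tested.
function pickExamples(
  context: VoiceContext,
  count: number = 4,
  random: () => number = Math.random
): string[] {
  const pool = exampleCommands
    .filter((example) => example.contexts.indexOf(context) !== -1)
    .map((example) => example.phrase);

  // Fisher-Yates shuffle so every phrase has an equal chance of being shown.
  for (let i = pool.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    const tmp = pool[i] as string;
    pool[i] = pool[j] as string;
    pool[j] = tmp;
  }
  return pool.slice(0, count);
}
```

Calling `pickExamples("remoteControl")` would then only surface phrases tagged for controlling another device.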
JeremyFr79 228 Posted June 2, 2015 Author I tried the dev release, but it still breaks my Live TV using ServerWMC, so I had to go back to the beta builds for now, as I have other people who use my system for TV. Looks awesome so far, even with the little bit you have. This could seriously be one of Emby's killer features that sets you guys well ahead of the competition if it matures nicely (not that you're not already ahead of the competition in my eyes).
Xzener 729 Posted June 2, 2015 This is great. Is it possible to use this while remote mirroring, Luke?
Angelblue05 4132 Posted June 2, 2015 This is some really exciting stuff. I think if we are able to use our phones for the voice capture, I will have no choice but to use this badass feature!
Luke 42079 Posted June 3, 2015 Yea, the eventual goal is to control other apps. Even now it will do that with the play commands if you have another player already selected.
chef 3810 Posted June 3, 2015 How are you finding the Google recognition in comparison to the Windows version? When I built the speech recognition before, I used the same context idea. When a user was looking at their TV shows, I would only add specific commands from their library, adding each Media Browser tier to the command list as they went. The list became lengthy after drilling into their episodes or songs, and lengthy command lists can cause misrecognitions. I built functions to remove commands from the list if the user went back to a series tier, and completely rebuilt the recognizer if the user switched libraries. On Windows it is faster to destroy a speech recognizer on a separate thread and build a new one than it is to remove commands. Those are my only tips for a speech recognition app on Windows. Maybe useless info, but I thought I would share anyway. Edited June 3, 2015 by chef
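The tiered command list chef describes can be sketched as a small data structure. This TypeScript version is only an illustration of the bookkeeping (add phrases when drilling down, drop them when going back, start over on a library switch). It does not use System.Speech or any real recognizer, and `CommandVocabulary` and its methods are hypothetical names.

```typescript
interface VocabularyTier {
  name: string;
  phrases: string[];
}

// Tracks which phrases the recognizer should currently listen for.
class CommandVocabulary {
  private tiers: VocabularyTier[] = [];

  // Drilling down (library -> series -> season) adds that tier's phrases.
  enterTier(name: string, phrases: string[]): void {
    this.tiers.push({ name, phrases: phrases.slice() });
  }

  // Going back up drops the deepest tier so the list stays short.
  leaveTier(): void {
    this.tiers.pop();
  }

  // Switching libraries throws everything away and starts from a fresh
  // list, mirroring the "rebuild instead of remove" tip above.
  switchLibrary(name: string, phrases: string[]): void {
    this.tiers = [{ name, phrases: phrases.slice() }];
  }

  // Flattened list of every phrase that is valid right now.
  activePhrases(): string[] {
    let all: string[] = [];
    for (const tier of this.tiers) {
      all = all.concat(tier.phrases);
    }
    return all;
  }
}
```

A client would feed `activePhrases()` to whatever recognizer it uses each time the user navigates, keeping the active list as short as the current screen allows.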
Luke 42079 Posted June 3, 2015 IE doesn't support this, so I have no Windows version to compare to.