Next Xbox One update will let you help improve speech recognition

With the next system update, Microsoft is asking fans to provide their own voice samples to help improve its voice recognition software.

"Xbox, watch A&E... No, not AMC. A&E. No, go back. Wait, Xbox. Stop. Ugh, Xbox go home."

When Xbox One's voice commands work, they can be rather magical. However, there are times when Xbox simply doesn't understand what you're saying--and that can be frustrating. With the next system update, Microsoft is asking fans to provide their own voice samples to help improve its voice recognition software.

Through a setting described as "completely optional," Xbox One owners will be able to share their speech data with Microsoft. The company promises all data "will be used for product improvement only," and says that "the more voice samples we have to input into our algorithms, the better and more responsive Xbox One can be to our fans." To opt in, you'll have to go to Settings, then Privacy & Online Safety, then "allow" Share Voice Data. We're probably not going to opt in, mostly because we don't want to hurt Microsoft's feelings with all the cussin' we do at home.

The next update will also introduce a sound mixer for Snap mode. With this new feature, you'll be able to control the volume levels of the two simultaneously running apps--so you can make live TV much louder than your Titanfall session, for example.

From The Chatty

  • reply
    May 5, 2014 9:00 AM

    Andrew Yoon posted a new article, Next Xbox One update will let you help improve speech recognition.

    With the next system update, Microsoft is asking fans to provide their own voice samples to help improve its voice recognition software.

    • reply
      May 5, 2014 9:01 AM

      I'm sure this won't backfire at all.

      • reply
        May 5, 2014 9:15 AM

        "Xbox go fuck yourself!"

      • reply
        May 5, 2014 9:25 AM

        They did this on the 360. It was a little app that ran you through a bunch of crazy words and you got some avatar rewards for doing it.

      • reply
        May 5, 2014 10:52 AM

        It won't. They'll have to have a fleet of interns listening to every single sample to prevent bad samples from being used.

      • reply
        May 5, 2014 12:45 PM

        I dunno, I saw Raoul's video.

    • reply
      May 5, 2014 9:22 AM

      That's cool; I'm glad MS is finally doing this.

      • reply
        May 5, 2014 9:26 AM

        I still don't understand WHY they have to do this. They have all that Cortana data, all the data from the Kinect 1, so why did this thing seem like they just started from square one with no fucking data?

        • reply
          May 5, 2014 9:36 AM

          I'd guess a combination of not enough time (to polish the experience since they did a complete change of the authentication method on the Xbox before launch), and internal secrecy? That's all I can guess.

          • reply
            May 5, 2014 9:38 AM

            Yea, it just seems like a huge waste of resources to rework ALL that voice data...

            • reply
              May 5, 2014 11:02 AM

              where does it say that's happening?

              • reply
                May 5, 2014 1:05 PM

                It doesn't, I'm assuming a lot of shit because that's what we do here.

        • reply
          May 5, 2014 10:53 AM

          I don't think any of that data was actually sent to MS's servers. If you ran that tuning app, then yes, but that was a very small sample set.

        • reply
          May 5, 2014 1:03 PM

          Speech recognition is an incredibly difficult problem. For one thing a large portion of the data depends on the mic you have, so gathering data from a different device may not help as much as you think.

          Moreover, the improvements in speech recognition we've seen over the last few years have been mostly due to new machine learning techniques that require a *boatload* of data to train. We're at the point right now where basically the algorithms are the same for everyone working on this (details differ), but whoever has more data will get better results. Nuance (the Dragon folks, also used by Siri) and Google both have tons and tons of data, and MS is trying to catch up.

          • reply
            May 5, 2014 1:13 PM

            Also, you can actually use *unlabeled* data to help speech recognition in a lot of cases. So even if the Kinect doesn't understand you, or gets it wrong, or you just spout fucking nonsense, it can still use that in the initial phases to sort of "tune" the neural nets to extract the salient features of voice, and then once it's done that it can use the labeled data on top to make it actually recognize specific things.

            (If you're a nerd, check out restricted Boltzmann machines for the unsupervised part, and deep neural nets for the context.)
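
A rough sketch of the two-phase idea in the comment above, for anyone who wants to poke at it. It is a toy, not anything Microsoft is known to run: the data is made-up six-bit patterns standing in for audio features, and every name here (`TinyRBM`, `train_classifier`, `run_demo`) is hypothetical. Phase one fits a small restricted Boltzmann machine to unlabeled samples using one-step contrastive divergence; phase two trains a plain logistic regression on the learned hidden features using only two labeled examples.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


class TinyRBM:
    """Toy restricted Boltzmann machine over binary vectors (illustrative only)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.w = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_vis = [0.0] * n_visible
        self.b_hid = [0.0] * n_hidden

    def hidden_probs(self, v):
        return [sigmoid(self.b_hid[j] + sum(v[i] * self.w[i][j] for i in range(len(v))))
                for j in range(len(self.b_hid))]

    def visible_probs(self, h):
        return [sigmoid(self.b_vis[i] + sum(h[j] * self.w[i][j] for j in range(len(h))))
                for i in range(len(self.b_vis))]

    def cd1_step(self, v0, lr=0.1):
        # One step of contrastive divergence (CD-1); no labels are involved.
        h0 = self.hidden_probs(v0)
        h0_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.w[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
            self.b_vis[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.b_hid[j] += lr * (h0[j] - h1[j])


def train_classifier(features, labels, epochs=200, lr=0.5):
    # Supervised phase: logistic regression on the learned features.
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            err = y - sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b


def run_demo(seed=0):
    rng = random.Random(seed)
    pattern_a = [1, 1, 1, 0, 0, 0]  # stand-in for "one kind of utterance"
    pattern_b = [0, 0, 0, 1, 1, 1]  # stand-in for "another kind"

    def noisy(p):
        # Flip each bit with 10% probability to imitate messy input.
        return [bit ^ 1 if rng.random() < 0.1 else bit for bit in p]

    # Phase 1: unsupervised pretraining on unlabeled, noisy samples.
    unlabeled = [noisy(rng.choice([pattern_a, pattern_b])) for _ in range(40)]
    rbm = TinyRBM(n_visible=6, n_hidden=3, rng=rng)
    for _ in range(50):
        for v in unlabeled:
            rbm.cd1_step(v)

    # Phase 2: a handful of labeled examples on top of the learned features.
    labeled = [(pattern_a, 1), (pattern_b, 0)]
    feats = [rbm.hidden_probs(v) for v, _ in labeled]
    w, b = train_classifier(feats, [y for _, y in labeled])

    def predict(v):
        h = rbm.hidden_probs(v)
        return sigmoid(b + sum(wi * hi for wi, hi in zip(w, h)))

    return predict(pattern_a), predict(pattern_b)
```

Calling `run_demo()` returns the classifier's probability for each of the two clean patterns. The point is only the shape of the pipeline: the bulk of the data needs no labels, and the labeled set can be tiny. Real speech systems stack many such layers and work on acoustic features rather than six bits.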

    • reply
      May 5, 2014 9:27 AM

      Fuck this. All I want is an HBO Go update. God damn, is that so hard to do...

      • reply
        May 5, 2014 9:35 AM

        Maybe they are going to do a universal binary? I don't understand what's taken it so long either...

      • reply
        May 5, 2014 10:51 AM

        One word: Licensing

        Probably has very little to do with MS and everything to do with how much HBO wants MS to pay for it even though MS doesn't get any direct revenue from it (I said Direct).

      • reply
        May 5, 2014 11:04 AM

        yes.

    • reply
      May 5, 2014 4:57 PM

      So if all the southerners opt in and we slobs from NJ opt out, does that mean the South will "win" with better speech recognition?