Moving on from VR, TV makers flock to launch "smart voice"

OFweek smart home network news: As the VR/AR boom gradually cooled, the TV industry found a new hot topic in 2017: intelligent voice technology. It is often billed as artificial intelligence; put simply, it is voice interaction on the TV. Although the concept is not new, it is no coincidence that nearly every TV manufacturer's recent spring launch event focused almost exclusively on the intelligent voice technology in its televisions. Why is intelligent voice technology taking off on smart TVs right now? The question is worth exploring.

Has intelligent voice's moment really come?

A popular saying in investment circles holds that you should invest in the "third-glance beauty": beauties who impress at the first or second glance come with a high threshold and a high cost, and only the third-glance beauty belongs to the general public. Applied to products, it means that a technology usually has to reach its third generation before it is widely accepted by the public and can survive for the long term.

Artificial intelligence, from the cybernetics and early neural networks of the 1950s to today's AlphaGo and Master, is now in its third stage of development. The first wave died down at the end of the 1950s, and by the 1970s the National Science Foundation no longer supported it. In the 1980s and 1990s the field was active again, but many cognitive scientists strongly opposed the then-popular "physical symbol system hypothesis" and held that a body is a necessary condition for reasoning. Together with cuts in research funding, this brought the second wave to an end. Today's is the third wave, and in theory it offers larger opportunities.

A third-generation technology should be mature enough to reach the consumer market. Next, let's look at how TV manufacturers are applying intelligent voice technology.

TCL: At its launch event, TCL highlighted the artificial intelligence assistant "Little T", which it describes in terms of perception, recognition, service, and learning. "Little T" is the product of data and resource sharing among TCL Group, Tencent, and Ali in artificial intelligence and cloud services.

Changhong: Launched AI Center, a television-centric artificial intelligence platform. Besides cooperating with IBM, HKUST, and others, Changhong has reportedly also formed an "artificial intelligence industry alliance" with Dolby, Tencent, Vantage, Tsinghua University, Xi'an Jiaotong University, Microsoft, and the Chinese Academy of Sciences.

Micro-whale: Micro-Whale Technology launched its high-end Micro-Whale Smart Voice TV 2.0 product, the A series, and announced that Micro-Whale is entering the 2.0 era. It has cooperated with HKUST News, MIT Media Lab, Microsoft, and others on voice remote control, multimedia interaction, and face recognition.

LeTV: LeTV has equipped its Super TV with voice capabilities from the start. Super TV voice technology has moved from partnerships to in-house research and development. It covers not only speech recognition and semantic analysis but also self-developed text-to-speech (TTS) synthesis, and it is now fully online.

Almost every manufacturer stresses that its speech recognition has risen from a feature to artificial intelligence, and that behind it stands a large team working closely with well-known voice technology companies, artificial intelligence platforms, and research institutions. The progress of smart voice in the television industry is hard to deny. But a burst of activity does not mean that the technology and the business are mature.

How difficult is speech recognition?

Why, after developing for so long, is intelligent speech technology still unable to recognize speech accurately? We first need to understand how speech recognition works.

Sound is a wave, much like other waves found in nature. To analyze it, you must first divide the waveform into many small segments. Just as a video consists of many frames, each made up of many pixels, speech can also be divided into many frames. The general process of speech recognition can be summarized as follows:

Acquisition: Segment the sound wave into frames.

Encoding: Turn each unit-length segment of speech into a multidimensional vector (its content information).

Training: Learn to classify speech from data instead of hand-written rules, using a database and a model so that the system can learn on its own (a dialect requires a separately built system).

Decoding: Use the trained model to recognize speech by classifying new speech vectors.

Feedback: Play back the result through the device.
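
The first two steps can be illustrated with a minimal Python sketch. It is not any manufacturer's implementation: the function names, the 16 kHz sample rate, the 25 ms frame with a 10 ms hop, and the two toy features (log energy and zero-crossing rate) are all assumptions chosen for illustration. Real systems use much richer features.

```python
import math

def split_into_frames(samples, frame_len, hop):
    """Acquisition: cut a list of samples into overlapping fixed-length frames."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames

def encode_frame(frame):
    """Encoding: map one frame to a toy vector [log energy, zero-crossing rate]."""
    energy = sum(x * x for x in frame) / len(frame)
    # Count sign changes between neighbouring samples.
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return [math.log(energy + 1e-10), crossings / (len(frame) - 1)]

# Stand-in for recorded speech: one second of a 440 Hz tone at 16 kHz (assumed values).
sample_rate = 16000
signal = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]

# 25 ms frames (400 samples) taken every 10 ms (160 samples).
frames = split_into_frames(signal, frame_len=400, hop=160)
features = [encode_frame(f) for f in frames]
print(len(frames), "frames, each encoded as a vector of length", len(features[0]))
```

The resulting list of vectors is what the training and decoding steps would consume: a model is fitted on vectors from labelled speech and then used to classify vectors from new speech.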

The process looks simple, but every step has its difficulties and many uncontrollable factors. On the one hand, the recognition rate drops significantly under complex conditions such as local dialects, background noise, and varying speaking speed, none of which follow regular patterns. On the other hand, training and test conditions do not match exactly: if you train on recordings from the People's Broadcasting Station, how many users in practice speak like broadcasters?

All of this only scratches the surface. The biggest difficulty is semantic understanding. Even a human handed a paragraph with no context will not necessarily understand it, and artificial intelligence fares even worse. Depending on the microphone, noise, accent, and conversation content, an AI system may respond differently. In essence, it has no awareness, and it lacks a sufficient understanding of human language.

Voice interaction on television also faces a practical dilemma: response speed. Imagine asking your TV a question. Even if the answer is accurate, if you have to wait two or three seconds for it, would you still want to keep talking to it?

To sum up, whether the cause is recognition algorithms in urgent need of a revolution, shortcomings in speech engineering, or insufficient hardware performance, intelligent voice is still far from a smooth road. Only because it arrived in this era can its immaturity be overlooked and its growing pains tolerated, because it is developing fast enough.

Besides being smart enough, what else is needed?

Today's smart voice is not perfect, but on the TV platform, does it really need to be that smart?

What is a TV mainly for? Nothing more than three things: search, on-demand viewing, and playback control. Deeply integrating online and offline speech recognition toolkits and keeping them updated is basically enough to meet users' needs.

But if the TV serves as an AI control center, it will be used far more frequently, and the demands on smart voice will be much higher. One thing, though, will never change as the essential attribute of a smart TV: providing users with enough content and services.

If a television lacks sufficient modules, features, content, and services, users lose the motivation to use voice. And if the smart platforms of different home appliances cannot be connected under a unified control protocol, users will find only limited use for smart voice.

Getting users to actually use intelligent voice takes more than excellent speech recognition, which is only a small part of it. What is fundamental is services that solve users' real problems in the home. For example, the face recognition, children's education, and similar features that many manufacturers have added to their televisions give voice technology a scenario in which to show its value.

Conclusion: Artificial intelligence has practical meaning and room to improve only when it is in constant interaction. So in the era of the Internet of Things, where modes of intelligent interaction keep changing, the timing is right for intelligent voice technology, and it is not unreasonable that many people see it as the next big trend. However, the value of all technology comes from serving people. How to use intelligent voice to meet people's needs across the home is the question brand manufacturers urgently need to consider while developing the technology.
