
What core technology do the AI robocall companies exposed by CCTV's 315 Gala use?

Latest update time:2019-03-17


The early telecom industry's "call you to death" harassment services and the black market in number-changing software have evolved into today's harassing calls from AI robots.

Text | Zhao Chenxi

The annual CCTV 315 Gala is the most nerve-racking night of the year for companies. Last night, the CCTV 315 program team exposed violations in many industries: medical waste, unsafe spicy strips, the tricks behind "local" eggs, unhygienic sanitary products, the many scams in home appliance after-sales service, and more. The industrial chains behind them are huge and shocking. Among them, the exposure of harassing calls made by intelligent robots drew particular attention.

Everyone receives sales calls in daily life, pitching real estate, bank loans, training institutions, education, cars, and so on. Most people may not realize, however, that the caller may not be a real person but an AI robot. The scheme works in four steps. First, a probe box identifies mobile phones connected to the wireless network. Second, it collects each phone's MAC address without the user's knowledge. Third, the MAC address is matched against big data and converted into a mobile phone number. Finally, an AI robot that imitates a human places the outbound call.

These probe boxes are widely deployed in public places such as shopping malls, supermarkets, office buildings, and convenience stores, and they are well concealed. CCTV named a number of companies across the entire industry chain, which spans robot harassment calls, big data marketing, and probe boxes. The companies are:

  • Yige Technology Co., Ltd.

  • Shaanxi Yilongxinke Artificial Intelligence Technology Co., Ltd.

  • Zhongke Zhilian Technology Co., Ltd.

  • Biho Technology Co., Ltd.

  • Shengya Technology Co., Ltd.

  • Samoyed Internet Financial Technology Co., Ltd.

  • Shenzhen Miaodi Technology Co., Ltd.

  • Shanghai Zhizi Information Technology Co., Ltd.

  • Lingwo Network Technology Co., Ltd.

  • Fortune Technology Co., Ltd.

  • Hangzhou DiJin Network Technology Co., Ltd.

According to CCTV's 315 program, a single company can make more than 4 billion calls a year. In the telecommunications industry, nuisance calls have never been eradicated. The problem touches on network security, the networks of different carriers, Internet access to the telephone network, the responsibilities of the calling and called parties, and many other aspects. In recent years, as new technologies have emerged and iterated, the early "call you to death" services and the black market in number-changing software have evolved into today's AI robot nuisance calls, and the technology keeps being upgraded.
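To put the 4 billion figure in perspective, a quick back-of-the-envelope calculation helps. The sketch below assumes the calls are spread evenly over a 365-day year, which the program did not state, so it only illustrates the scale.

```python
# Assumption (not from the program): calls are spread evenly over 365 days.
CALLS_PER_YEAR = 4_000_000_000
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds

calls_per_second = CALLS_PER_YEAR / SECONDS_PER_YEAR
calls_per_day = CALLS_PER_YEAR / 365

print(f"{calls_per_second:.0f} calls per second")
print(f"{calls_per_day:,.0f} calls per day")
```

Under that assumption, the figure works out to roughly 127 calls every second, or about 11 million calls a day.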



Analysis of similar cases abroad

Do you remember Google I/O 2018, the annual developer conference Google held in California? Alongside new products and updates such as Android P, Gmail, Gboard, and TPUv3, Google added Duplex to its personal assistant, Google Assistant. Duplex can call restaurants, hair salons, and other businesses to make appointments on a user's behalf.

The demonstrations at the conference showed that Duplex can not only talk with people in a natural, fluent voice without being noticed as a machine, but also handle unexpected situations. For example, it can use interjections such as "hmm" and "uh-huh" in its replies, understand the context of the conversation, and proactively offer information. Of course, Google is not the only company in the world to have achieved this seemingly magical effect.

Microsoft subsequently issued a technical statement of its own:

The significance of full-duplex voice technology is that it can turn "human-computer interaction" into "human-computer communication." The difference of a single word carries enormous value.

On April 4 this year, we officially released Full Duplex Sensory in the United States and China simultaneously, and predicted that the industry would recognize the value of this technology and move faster in this direction. We are very happy to see more and more industry peers joining us.

In fact, the first full-duplex voice call with an artificial intelligence in human history took place not in the United States but in China. We are honored to dedicate this crown to our motherland. Since August 2016, Microsoft (Asia) Internet Engineering Academy has enabled XiaoIce to complete more than 600,000 calls with human users, all initiated by the users themselves.

Today, we are releasing an actual recording of a phone call that took place two years ago, and we dedicate it as precious material to Chinese speakers all over the world.

The core technology behind Google Duplex is a recurrent neural network (RNN) built with TensorFlow Extended (TFX). To achieve high accuracy, Google trained Duplex's RNN on anonymized phone conversation data. The network takes as input the text output of Google's automatic speech recognition (ASR), together with features from the audio, the conversation history, and conversation parameters (such as the service to be booked and the current time). Google trained a separate understanding model for each task, although some training corpora are shared across tasks. Finally, Google used TFX's hyperparameter optimization to further improve the model.

Incoming speech is first processed by the automatic speech recognition (ASR) system, and the resulting text is fed into the RNN together with the context data and other inputs. The response text the network generates is then read aloud by the text-to-speech (TTS) system.
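The flow of a single conversational turn can be sketched as follows. This is a minimal illustration only, not Google's implementation: the names `DialogueState`, `run_turn`, `toy_asr`, `toy_respond`, and `toy_tts` are hypothetical, and the three toy functions are simple stand-ins for the trained ASR, RNN, and TTS components.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class DialogueState:
    """Context handed to the response model alongside the recognized text."""
    task: str                                   # e.g. the service to be booked
    history: List[str] = field(default_factory=list)


def run_turn(
    audio: bytes,
    state: DialogueState,
    asr: Callable[[bytes], str],
    respond: Callable[[str, DialogueState], str],
    tts: Callable[[str], bytes],
) -> bytes:
    """One turn: speech -> text -> response text -> speech."""
    heard = asr(audio)
    state.history.append(f"caller: {heard}")
    reply = respond(heard, state)
    state.history.append(f"system: {reply}")
    return tts(reply)


# Toy stand-ins so the sketch runs; real systems use trained models here.
def toy_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def toy_respond(text: str, state: DialogueState) -> str:
    if "what time" in text.lower():
        return f"I'd like to book {state.task} at 7 pm."
    return "Could you repeat that, please?"


def toy_tts(text: str) -> bytes:
    return text.encode("utf-8")


state = DialogueState(task="a table for two")
out = run_turn(b"What time would you like?", state, toy_asr, toy_respond, toy_tts)
print(out.decode("utf-8"))
```

Passing the three stages in as functions keeps the turn logic separate from the models, so each stage could be swapped out without changing the flow.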

Google combines a concatenative TTS engine with a generative TTS engine (which uses Tacotron and WaveNet) to control intonation according to the situation. The system can also produce filler words (such as "hmm" and "uh"), which makes the voice sound more natural.

Filler words are added when the concatenative TTS engine has to join speech units that differ widely, or when a pause needs to be inserted. They let the system signal in a natural way "Yes, I'm listening" or "I'm still thinking about it" (people often use fillers while thinking as they speak). Google's user studies also confirmed that people find conversations with fillers more familiar and natural. At the same time, the system's latency must match what people expect in conversation. In some cases the system even uses a fast approximate model, which allows it to respond with a latency of under 100 ms.
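The idea of covering a delay with a filler word can be sketched as a simple rule. This is an assumption-laden illustration, not the logic Google uses: the 500 ms threshold, the function names `pick_filler` and `speak`, and the specific filler strings are all hypothetical.

```python
from typing import Optional

# Hypothetical threshold: beyond this delay a silent gap may feel unnatural.
FILLER_THRESHOLD_MS = 500


def pick_filler(expected_latency_ms: int, is_listening: bool) -> Optional[str]:
    """Return a filler to speak while the real reply is prepared, or None."""
    if is_listening:
        return "mm-hmm"      # signals "Yes, I'm listening"
    if expected_latency_ms > FILLER_THRESHOLD_MS:
        return "hmm"         # signals "I'm still thinking about it"
    return None


def speak(reply: str, expected_latency_ms: int) -> str:
    """Prefix the reply with a filler only when the wait would be noticeable."""
    filler = pick_filler(expected_latency_ms, is_listening=False)
    return f"{filler}... {reply}" if filler else reply


print(speak("Sure, 7 pm works.", expected_latency_ms=80))
print(speak("Let me check.", expected_latency_ms=900))
```

A fast reply is spoken directly, while a slow one is preceded by a filler so the other party does not hear dead air.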

Microsoft's technical statement also suggests that its full-duplex voice interaction technology, Full-Duplex Voice, is technically very similar to Google's Duplex. The difference is that the generation model Microsoft uses is an LSTM (itself a type of RNN), while Google describes its model as an RNN.

As Microsoft said, "In fact, the first full-duplex voice call with an artificial intelligence in human history took place not in the United States but in China." Whether one looks at Google's or Microsoft's application scenarios, the original purpose of studying "human-computer communication" was a good one: to free people from monotonous, simple, unskilled labor. In China, however, some companies now use AI-based full-duplex voice calls in gray areas, resulting in a flood of harassing calls. So what technologies do the exposed Chinese companies actually use?



Experts explain the technology and ethical standards behind it

To find out, Leifeng.com interviewed Wang Shijin, deputy director of the iFlytek AI Research Institute. Wang told Leifeng.com that AI conversational robots are a type of human-computer interaction system used mainly in service scenarios. Behind them are several core AI technologies, including speech recognition, semantic understanding, conversational question answering, speech synthesis, and knowledge graphs. They also require engineering support such as process control, telephone exchange platforms, and communication lines.

The telephone is a typical human-computer interaction scenario, alongside WeChat, web pages, apps, and others. Interaction over the telephone is real-time and two-way, the audio quality of the telephone channel is relatively poor, and voice is the only information carrier, so the technical complexity is generally high.

The companies exposed in China generally have no core AI technology of their own, and their system back ends often call the open platform capabilities of other AI companies. From a technical point of view, the intelligent voice technology used by telemarketing robots is very basic: it mainly replaces a human agent's speech with computer playback and calls some simple speech recognition functions.

Moreover, these companies often play back pre-recorded human voice clips rather than use speech synthesis, which is not intelligent but is simpler and cheaper. At present, Google, Microsoft, and Chinese companies such as iFlytek and Alibaba have relatively comprehensive core AI capabilities, and telephone conversation robots are a typical application of those capabilities.

iFlytek's telephone robot technology is currently used mainly in scenarios such as industry customer service, telephone ordering, and logistics ordering. It focuses on intelligent services, improving efficiency and reducing costs, and has significant application value. For customers who actually purchase its services, iFlytek states in the agreement that outbound calls must not be used for illegal purposes such as nuisance calls, and that the service will be terminated immediately if such use is discovered. After inquiries, many telephone sales robot companies on the market that claim to "use iFlytek's services" were found not to be iFlytek customers at all; they were merely using iFlytek's name.

China's economy is developing rapidly, and society and the public are relatively tolerant of applications of emerging technologies. Driven by commercial interests, ethical problems in the application of technology therefore arise relatively easily. We believe that telemarketing robots built specifically to make nuisance calls are not a technical issue but a matter of social ethics.

If AI technology is compared to a weapon, its ultimate effect depends on who uses it and how. The pursuit of commercial interests should not harm the interests of others, including their commercial interests and rights such as personal privacy. We should pursue a win-win business logic. This requires society and the industry to jointly advocate the concept of value creation, and to strengthen regulation and supervision through more laws and regulations.

In November last year, the Ministry of Industry and Information Technology announced the "Work Plan for Promoting the Special Action of Comprehensive Rectification of Nuisance Calls", which cracked down on nuisance calls and set strict rules. With the rapid development and application of artificial intelligence, the usability of telephone conversation robots has improved greatly. They have been adopted quickly in fields such as intelligent services, finance, logistics, and medical care, and have produced substantial social and economic benefits.

Wang Shijin believes such systems should be used first in service communication scenarios that involve a great deal of repetitive manual work, freeing people to do more valuable things. Examples include customer service and consulting in intelligent services, finance, education, and medical care, such as couriers confirming delivery details with customers, or hospitals and communities making routine follow-up calls to patients.



Summary

Leifeng.com believes that artificial intelligence is not only a science and an industry but also something that touches every aspect of social life. It is very likely to change the employment structure, affect law and social ethics, infringe on personal privacy, and challenge the norms of international relations. The security risks and challenges involved, the question of how to develop AI in a safe, reliable, and controllable way, and the ethical constraints behind it have long been a concern of countries around the world.

During this year's Two Sessions, Baidu CEO Robin Li also proposed that, from the perspectives of society, government, and the public, we need to consider what should and should not be done, and what is good and what is bad, in the development of artificial intelligence. Rules and forecasts should be made as early as possible to keep artificial intelligence from developing in a bad direction.

- END -
