Today's headlines
HEADLINE NEWS
OpenAI's 7-year safety veteran, Chinese scientist Lilian Weng, officially announces her resignation and may return to China
On November 9, Lilian Weng, head of OpenAI's safety systems team, announced that she will leave on November 15, ending her nearly 7-year career at OpenAI.
In her resignation letter, Weng said it was a difficult choice and expressed her attachment to the OpenAI team and its achievements.
Weng has contributed to several key areas since joining OpenAI in early 2017. She first worked on robotics and deep reinforcement learning research, laying the foundation for the later GPT-4 and safety systems. She then led the applied AI research team, working to improve the practicality and safety of the OpenAI API. After the release of GPT-4 in particular, she took charge of OpenAI's safety systems. She said the team has made progress in model safety, adversarial robustness, and jailbreak defense, setting new standards for the industry.
Not long ago, she appeared at the 2024 Bilibili Super Science Night event and delivered a keynote speech, "AI Safety and the Way of 'Cultivation'", which sparked heated discussion across the Internet. For a while, whether Weng would choose to return to China after leaving also became a focus of public attention. (IT Home, Quantum Bit)
Domestic Information
DOMESTIC NEWS
Yan Shuicheng, famous AI scholar and chief scientist of Tiangong Intelligence, leaves Kunlun Wanwei
AI Technology Review has exclusively learned that Yan Shuicheng, a top international scholar in the field of AI, has recently left Kunlun Wanwei.
On September 1, 2023, Kunlun Wanwei officially announced that Yan Shuicheng would serve as chief scientist of Kunlun Wanwei and Tiangong Intelligence. At the time, Kunlun Wanwei said Yan would help the company establish its "2050" global research institute, with centers in Singapore, London and Silicon Valley.
In 2015, Yan Shuicheng moved to industry, serving as vice president of 360 Group, dean of its Artificial Intelligence Research Institute, and chief scientist. In 2019, he joined Yitu Technology as chief technology officer. In 2021, he returned to Singapore to join Sea Group and founded Sea AI Lab, leaving in early 2023.
Before joining Kunlun Wanwei, Yan Shuicheng had accumulated rich achievements in machine learning, computer vision and multimedia. He is a fellow of the Singapore Academy of Engineering and has been elected an AAAI Fellow, ACM Fellow, IEEE Fellow and IAPR Fellow. He was named a Thomson Reuters Highly Cited Researcher eight times and is a leader in China's computer vision field.
He completed his undergraduate, master's and doctoral studies at Peking University (1995-2004). After joining Microsoft Research Asia (MSRA) as an intern in 2001 during his doctorate, conducting AI research under Dr. Zhang Hongjiang, he devoted many years to the field and achieved outstanding results. (For more information, see Exclusive | Famous AI scholar and Tiangong Intelligence chief scientist Yan Shuicheng leaves Kunlun Wanwei)
Optimizing the generation of short-drama high points: Kunlun Wanwei's AI short-drama platform SkyReels will launch in the United States on December 10
On November 10, Kunlun Wanwei announced that its AI short drama platform SkyReels will be launched in the United States on December 10.
According to the official introduction, in terms of script generation, SkyReels has enriched its popular creative templates, and the script-generation capability of its large model has been greatly improved, especially the generation of a short drama's high points; current output can reach an A or even S rating in manual script scoring.
In terms of character generation, the R&D team has added an AI actor library and built actor attribute tags, using multimodal large-model capabilities to intelligently help users find the image that best fits the script's characters.
In terms of video generation, SkyReels' video generation success rate increased by 21%. For BGM and TTS matching, SkyReels built an actor voice library with emotions and a short-drama BGM library, raising matching accuracy by 35%.
The SkyReels team says the platform can generate a script, characters, storyboards and a complete two-minute short drama within 10 minutes. (IT Home)
China's first white paper focusing on AI innovation, self-discipline and governance in the technology industry was released, calling for the development of "human-centered intelligence"
On November 10, Lenovo Group, the Shanghai Jiao Tong University Artificial Intelligence Research Institute, ESG30 and others jointly authored China's first technology-industry report on AI innovation, self-discipline and governance, "Human-centered Intelligence: A View of Technology Development in the Era of Human-machine Symbiosis", released at the ESG session of the 15th Caixin Summit. At the meeting, the "Human-centered Intelligence Development and Governance Initiative", jointly initiated by Caixin Think Tank, ESG30, Lenovo Group, the Shanghai Jiao Tong University Artificial Intelligence Research Institute, Tencent Research Institute and the United Nations Industrial Development Organization, was also launched.
A total of 25 companies and institutions, including SenseTime, Siemens Healthcare, Ping An Health, iFlytek, Ant Digital, Tianhong Fund, 4Paradigm, BiRen Technology and Pony.ai, joined the initiative as founding members. The initiative aims to encourage leading institutions across industries to jointly steer AI development in a more human-centered, responsible and sustainable direction. (Titanium Media APP)
ByteDance's self-developed video generation model Seaweed is now available for use
ByteDance's AI content platform Jimeng AI announced that Seaweed, a video generation model developed in-house by ByteDance, is now officially available to platform users. After logging in, users can try it by selecting "Video S2.0" under the "Video Generation" function.
At the end of September, ByteDance officially announced its entry into AI video, releasing two video generation models from the Doubao family, Seaweed and Pixeldance, and opening small-scale beta tests to creators and enterprise customers through Jimeng AI and Volcano Engine respectively.
The Seaweed model opened this time is the standard version, which can generate a high-quality 5-second AI video in 60 seconds. Jimeng AI also revealed that Pro versions of both Seaweed and Pixeldance will open for use in the near future. The Pro models can achieve natural, coherent multi-shot action and complex interactions among multiple subjects, overcoming the consistency problem of multi-shot switching: when the shot changes, the subject, style and atmosphere all remain consistent. They also suit the aspect ratios of devices ranging from cinema and TV screens to computers and mobile phones. (IT Home)
Say goodbye to "silent movies": Zhipu releases New Qingying, which can generate 10-second, 4K, 60fps videos with built-in sound effects
On November 8, the Zhipu technical team released and open-sourced the latest version of its video model, CogVideoX v1.5. Compared with the original model, CogVideoX v1.5 can generate 5/10-second, 768P, 16fps videos, and its I2V (image-to-video) model supports any aspect ratio, greatly improving image-to-video quality and complex semantic understanding.
According to the official introduction, CogVideoX v1.5 will also be launched on the "Qingying" platform, and combined with the newly launched CogSound sound effect model, "New Qingying" will have the following features:
- Quality improvement: significantly enhanced capabilities in image-to-video quality, aesthetic performance, motion plausibility, and semantic understanding of complex prompts.
- Ultra-high-definition resolution: supports generation of 10-second, 4K, 60fps ultra-HD videos.
- Variable ratio: supports any aspect ratio to adapt to different playback scenarios.
- Multi-channel output: the same command/image can generate 4 videos at once.
- AI video with sound effects: New Qingying can generate sound effects that match the picture. (IT Home)
Baidu may launch smart glasses with built-in AI assistant
Recently, there was news that Baidu may launch smart glasses with a built-in Xiaodu AI assistant and plans to show them at the 2024 Baidu World Conference on November 12. Baidu later announced that it would release a "new AI species" at the event, which outside observers consider very likely to be the rumored smart glasses.
According to people familiar with the matter, the smart glasses Xiaodu is about to launch will have a built-in camera, support photo and video capture, and offer voice interaction built on Baidu's Wenxin foundation model. The product may also be priced below the US$299 (approximately RMB 2,139) of Ray-Ban Meta smart glasses and is expected to launch as early as the beginning of next year. (Sanyi Life)
Kuaishou's "KeLing AI" independent application is launched on Apple App Store, supporting the generation of videos and pictures
Kuaishou recently launched a standalone "KeLing AI" application on the Apple App Store, further strengthening its AI content-creation presence on mobile. "KeLing AI" has now formed a multi-platform product matrix spanning a web version, the app, a mini program and an overseas version. According to the official introduction, "KeLing AI" is a new-generation creative productivity platform that provides video and picture generation and editing, built on Kuaishou's self-developed "KeLing" and "KeTu" large models.
Currently, "KeLing AI" supports video lengths of 5 and 10 seconds, with 10 seconds the maximum, in line with comparable products. By contrast, Douyin's "Jimeng AI" offers more duration options, including 3, 6, 9 and 12 seconds, and adds a camera-movement function, giving users greater creative flexibility. (IT Home)
Expert: "AI+quantum computing" is an important branch of future computing
On November 9, Jin Shi, a member of the European Academy of Sciences and a foreign member of the European Academy of Humanities and Sciences, said in Chongqing that the combination of AI and quantum computing is an important branch of the future computing field, and some countries have already made plans for it. "Quantum computing is designed using the principles of quantum mechanics, while artificial intelligence (AI) relies on learning from large amounts of data. Quantum computing, as a new paradigm, can address AI's high resource consumption."
"At present, exploring application scenarios based on quantum computers and quantum cloud platforms has gradually become a hot topic in the industry," Jin Shi said, adding that quantum computing can provide more efficient solutions for industries such as financial technology, big data, weather forecasting, biomedicine, energy and transportation. Taking finance as an example, he said quantum computing can help financial companies develop evaluation and optimization solutions, optimize stock portfolios, and assess option risks. (China News Service)
The large model Xiao Ai has upgraded its capabilities, and multiple devices support the "Music Q&A" function
Recently, Xiaomi officially announced that the music capabilities of its large-model Xiao Ai have been upgraded, with multiple devices now supporting a "Music Q&A" function. According to reports, the function builds on the AI capability upgrade and supports queries about song information, searches of professional music knowledge, and more. After upgrading the Xiao Ai large model to the latest version, users can try the new function on their phones and in-car terminals.
Xiao Ai received a major version update at the end of July this year, fully upgrading to "Large Model Xiao Ai" with support for natural Q&A, image editing, and in-car wake-up protection, covering core device categories such as phones, tablets, TVs, speakers, and cars. At the Xiaomi 15 series and Xiaomi Pengpai OS 2 launch event in October, Super Xiao Ai officially debuted. The new Super Xiao Ai can help users "remember ID photos" and "remember schedules", intelligently extracts on-screen content with deletable local storage, and is billed as offering "one-step access to complex workflows, usable across devices". (IT Home)
Authoritative Chinese large-model list for October released: SenseTime's SenseNova wins the gold medal
Recently, SuperCLUE, a Chinese large-model benchmark, released its "October 2024 Chinese Large Model Benchmark Evaluation Report". SenseTime's SenseNova ("Daily New") model performed well in this evaluation, ranking in the first echelon of domestic large models by total score and winning the gold medal.
The evaluation covered 23 domestic large models, assessing them across three dimensions (liberal arts, science, and Hard tasks) with more than 2,900 questions in total. SenseChat 5.5 performed well across multiple tasks, especially language understanding and safety, and also did well in logical reasoning and coding. More notably, it ranks in the domestic first echelon in both precise instruction following and high-level reasoning within the Hard tasks, demonstrating strong complex-reasoning ability. (New Intelligence)
International News
FOREIGN NEWS
AI content battle: OpenAI wins first round of copyright dispute with news organizations
On November 9, it was reported that OpenAI had won the first round of its copyright dispute with Raw Story and AlterNet.
Previously, three American news websites, Raw Story, The Intercept and AlterNet, sued Microsoft and OpenAI, accusing their chatbots of plagiarizing news website articles for training AI. These news websites all stated that OpenAI's chatbot ChatGPT (Microsoft's Copilot also uses this technology) plagiarized articles on their websites during training and did not display "author, title, copyright or terms of use information" when generating content.
New York federal judge Colleen McMahon dismissed the lawsuit filed by Raw Story and AlterNet on the grounds that the plaintiffs failed to prove a cognizable injury. Unlike other publications, Raw Story and AlterNet did not claim that OpenAI infringed their copyrights; instead they accused OpenAI of violating the Digital Millennium Copyright Act (DMCA). The judge held that "it seems unlikely that ChatGPT outputs plagiarized content from [their] articles," and suggested that rather than suing over the removal of copyright-management information, the plaintiffs should seek compensation for the use of their content in developing ChatGPT.
Although the lawsuit was dismissed, Raw Story and AlterNet have no intention of giving up. Their lawyer Matt Topic said they are confident they can address the court's concerns by amending the complaint and will continue to pursue their legal rights. (IT Home)
Jensen Huang: the possibility and prospects of scaling AI computing clusters to 1 million chips
Recently, Jensen Huang revealed in an interview that AI computing clusters may in the future scale to 1 million chips. He said: "No law of physics prevents this goal from being reached."
Huang also described a "Hyper Moore's Law", under which AI computing power will double or triple every year, far outpacing the doubling every two years described by traditional Moore's Law. Such a pace would not only drive revolutionary changes in hardware but is also likely to have a disruptive impact on algorithms and applications.
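As a quick back-of-the-envelope check on the gap between those growth rates (a sketch of the arithmetic, not figures Huang cited), compounding over a decade gives:

```python
# Compounded capability growth over 10 years under each claim.
years = 10
moore = 2 ** (years / 2)   # classic Moore's law: doubles every 2 years -> ~32x
hyper2 = 2 ** years        # "double every year" -> 1024x
hyper3 = 3 ** years        # "triple every year" -> 59049x
print(f"Moore: {moore:.0f}x, 2x/year: {hyper2}x, 3x/year: {hyper3}x")
```

Even the conservative end of the claim implies roughly 30x more compute growth per decade than the traditional curve.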
Huang emphasized the importance of hardware-software co-design. In his view, no single technical breakthrough can by itself keep up with AI's ever-growing demand for computing power, so co-design will become key: machine learning and AI have already significantly changed our computing model, and data-center design in turn needs comprehensive innovation and optimization. Huang also cited Nvidia's progress in its collaboration with xAI, where a 100,000-GPU H100 super-cluster was built in just 19 days.
Faced with challenges in capital, energy and the supply chain, Huang firmly believes they can be overcome. In the next two or three years, he said, scientific breakthroughs and technological progress will increasingly center on AI; that is the trend of future technological development. (Sohu.com)
Google Gemini 2.0 may be released soon with faster response speed
Google is reportedly planning a 2.0 update for its large language model Gemini. Some users have already seen a new model labeled Gemini 2.0 in the AI model selection interface and run preliminary tests. Compared with the current Gemini 1.5 Pro, Gemini 2.0 responds faster. The model does not yet appear fully mature, however: initial reports say it failed the basic "strawberry test" (counting the letter "r" in "strawberry"), which other models pass easily.
Google has not yet responded to the news, and no release date has been set. (Pinwan News)
Harvard's new ChatGPT-like cancer diagnosis AI is published in Nature, with an accuracy rate of up to 96%
Recently, scientists from Harvard Medical School and other institutions developed a multifunctional AI cancer diagnosis model called CHIEF (Clinical Histopathology Imaging Evaluation Foundation), which was published in Nature on September 4. It is worth mentioning that CHIEF is the first model that can predict patient prognosis and has been validated in multiple international patient groups.
The new CHIEF model has ChatGPT-like flexibility: it can perform multiple tasks and identify areas needing special attention for different cancer types. By reading digital slides of tumor tissue, it can detect cancer cells and infer a tumor's genetic characteristics from the cell features visible in the image. It can also predict patient survival rates across multiple cancer types and accurately map features of the tissue surrounding the tumor (the tumor microenvironment), features that relate to a patient's response to standard treatments such as surgery, chemotherapy, radiotherapy and immunotherapy. Going further, CHIEF has the potential to generate new insights: it has discovered specific tumor characteristics not previously thought to be related to patient survival.
The research team noted that these findings further demonstrate that AI can help clinicians efficiently and accurately assess cancer, including identifying patients who may not respond well to standard cancer therapies.
Mistral releases content moderation API: supports 11 languages including Chinese and classifies content into 9 categories including hate speech
On November 9, it was reported that Mistral AI had launched a new content moderation API to meet growing demand for a safe online environment. The moderation API is based on a fine-tuned Ministral 8B model and can classify content into 9 categories, including hate speech, violence, and personal data leakage.
The moderation API supports 11 languages (Chinese, Arabic, English, French, German, Italian, Japanese, Korean, Portuguese, Russian and Spanish) and can process both raw text and conversation content. Mistral also launched the Mistral Batch API, designed for companies that need to process large amounts of data; it allows asynchronous content processing, and Mistral claims it can reduce processing costs by 25%. The feature has attracted companies seeking to streamline operations and further consolidates Mistral's competitiveness in the market. (IT Home)
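For illustration, a minimal sketch of calling such a moderation endpoint from Python might look like the following. The endpoint path, model alias, and response fields here are assumptions for demonstration, not confirmed details of Mistral's API; consult the official API reference before relying on them.

```python
import os
import requests

# Hypothetical moderation request; path, model alias, and response shape
# are assumptions, not confirmed details of Mistral's API.
resp = requests.post(
    "https://api.mistral.ai/v1/moderations",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "mistral-moderation-latest",
        "input": ["User-generated text to screen for policy violations."],
    },
    timeout=30,
)
resp.raise_for_status()
for result in resp.json().get("results", []):
    # Expected: per-category flags, e.g. hate speech, violence, personal data.
    print(result.get("categories"))
```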
Outlook will launch AI-personalized dynamic themes
Microsoft's Outlook email client now offers custom AI-generated themes for Copilot subscribers. The new feature, called "Copilot Themes", works in Outlook for Windows, macOS, mobile and the web, and is available to users of Copilot Pro or Microsoft 365 Copilot.
To create a custom theme, Outlook users can use a location or weather type as a starting point and then select an art style. When choosing a location, Outlook users can either select their own location or choose from more than 100 curated destinations. Themes can also be set to automatically refresh at specific intervals.
Copilot themes are available in Outlook's appearance settings, and they work like other Copilot features in Outlook, including email summaries and AI-assisted features for drafting emails or creating meeting invitations.
Google DeepMind research appears on the Nature cover again: an invisible watermark leaves AI-generated text nowhere to hide
Recently, a study published by Google DeepMind appeared on the cover of Nature. The researchers developed a watermarking scheme called SynthID-Text, already deployed in DeepMind's own Gemini, that tags AI-generated text while remaining imperceptible to readers. To avoid degrading the quality of LLM-generated text, SynthID-Text uses a novel sampling algorithm (tournament sampling). It achieves a higher detection rate than existing methods and can be configured to trade off text quality against watermark detectability. (New Intelligence)
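To make the tournament-sampling idea concrete, here is a toy sketch (an illustration of the selection mechanism, not DeepMind's implementation): each candidate token drawn from the model receives a keyed pseudorandom score, candidates are eliminated in pairwise tournaments, and the survivor is emitted; detection then checks whether a text's tokens score higher on average than unwatermarked text would.

```python
import hashlib

def g_score(token: str, context: tuple, key: str) -> float:
    """Keyed pseudorandom score in [0, 1) for a candidate token."""
    digest = hashlib.sha256(f"{key}|{context}|{token}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def tournament_sample(candidates: list, context: tuple, key: str) -> str:
    """Run pairwise tournaments on keyed scores over candidates drawn from
    the LLM's distribution; the winner is emitted, gently biasing output
    toward high-scoring tokens while only choosing among plausible ones."""
    pool = list(candidates)
    while len(pool) > 1:
        survivors = [
            a if g_score(a, context, key) >= g_score(b, context, key) else b
            for a, b in zip(pool[0::2], pool[1::2])
        ]
        if len(pool) % 2:            # odd candidate gets a bye
            survivors.append(pool[-1])
        pool = survivors
    return pool[0]

def mean_score(tokens: list, key: str) -> float:
    """Detection statistic: watermarked text averages above ~0.5."""
    return sum(
        g_score(t, tuple(tokens[:i]), key) for i, t in enumerate(tokens)
    ) / len(tokens)
```

In the published scheme the scoring function runs over a sliding context window and detection aggregates scores statistically; the sketch only conveys why the watermark can be detected without being visible in the text.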
Google's AI video editing app Google Vids is now available: it helps you write scripts, edit videos, find materials, etc.
On November 9, Google published a blog post announcing the official launch of the Google Vids app for Google Workspace users. Google Vids integrates the Gemini model, which can help users create slides, write video scripts, find material from Shutterstock, and storyboard an entire video. Drawing on Gemini, users only need to enter a prompt to generate a preliminary storyboard; after the user selects a style, Gemini automatically assembles a video draft, including recommended scenes, text, scripts and background music.
In addition, users can start their creations from a variety of templates, add animations, transitions and effects, use a library of copyright-free content or import media directly from Google Drive and Google Photos.
The UK will legislate next year to prevent AI risks, mainly targeting "cutting-edge models" such as ChatGPT
On November 8, according to foreign media reports, the UK plans to legislate next year to strengthen safeguards against potential AI risks. Peter Kyle, the country's science and technology minister, said the UK's current voluntary AI testing agreement "works well and is a good framework", but the upcoming AI bill will turn such agreements with major developers into legal obligations, and the government will also invest in infrastructure to support the development of the AI industry.
The bill will be introduced in the current parliament and would make the UK's AI Safety Institute a body independent of government, acting "solely in the interests of the British people".
The legislation would reportedly target ChatGPT-style “cutting-edge” models — state-of-the-art systems developed by a handful of companies that are capable of generating text, image, and video content.
Kyle also promised investment in advanced computing to help the UK develop its own AI and large language models (LLMs). The British government was previously criticized for cancelling funding for the University of Edinburgh's exascale supercomputer project, which had been promised £800 million (currently about RMB 7.421 billion) in government support. (IT Home)
Beating human artists? Humanoid robot Ai-Da's first painting sold at auction for over $1 million
At a Sotheby's auction in New York, a painting by the humanoid robot Ai-Da, titled "AI God. Portrait of Alan Turing," sold for $1.08 million, far exceeding the pre-auction estimate of $120,000 to $180,000. The 2.2-meter (7.5-foot) painting depicts the famous mathematician Alan Turing, who was a key figure in breaking Nazi codes during World War II and a pioneer in early computer science.
Ai-Da is the world's first hyper-realistic robot artist with the ability to speak. She said at the auction that her work aims to stimulate dialogue about emerging technologies, especially the ethical and social impacts of artificial intelligence and computers. Turing's portrait not only shows the mystery of technology, but also prompts the audience to think about the deep-seated issues behind technological progress. (New Intelligence)
Meta open-sources the MobileLLM family of small language models: designed for smartphones, now spanning 125M to 1.5B parameters
On November 8, Meta announced the official open source of the MobileLLM family of small language models that can run on smartphones, and added three different parameter versions to the series of models: 600M, 1B and 1.5B.
Meta researchers said the MobileLLM family is designed for smartphones. The models use a streamlined architecture and introduce the SwiGLU activation function and a grouped-query attention mechanism, balancing efficiency and performance. The models are also said to be fast to train: Meta researchers reported that, training on 1 trillion tokens in a server environment with 32 Nvidia A100 80G GPUs, the 1.5B version took only 18 days and the 125M version only 3 days.
Judging from the results, the accuracy of the MobileLLM 125M and 350M models in zero-shot common sense understanding tasks is 2.7% and 4.3% higher than that of State of the Art (SOTA) models such as Cerebras, OPT, and BLOOM. (IT Home)
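Both architectural ingredients named above are well documented in the literature; a minimal PyTorch sketch of each (illustrative shapes and sizes, not Meta's actual configuration) is:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLUFFN(nn.Module):
    """Feed-forward block with the SwiGLU activation:
    down(silu(gate(x)) * up(x))."""
    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.gate = nn.Linear(dim, hidden, bias=False)
        self.up = nn.Linear(dim, hidden, bias=False)
        self.down = nn.Linear(hidden, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(F.silu(self.gate(x)) * self.up(x))

# Grouped-query attention shares each key/value head across a group of
# query heads, shrinking the KV cache. With 8 query heads and 2 KV heads,
# each KV head serves 4 query heads:
q = torch.randn(1, 8, 16, 64)   # (batch, query heads, seq len, head dim)
k = torch.randn(1, 2, 16, 64)   # fewer K/V heads than query heads
v = torch.randn(1, 2, 16, 64)
k, v = (t.repeat_interleave(4, dim=1) for t in (k, v))  # expand to 8 heads
out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([1, 8, 16, 64])
```

Sharing K/V heads is what lets small on-device models keep attention quality while cutting memory traffic, matching the efficiency goals Meta describes.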
AI revives the Beatles' last song "Now and Then": competing with Beyonce and others at the Grammys
On November 10, the nominees for the 67th Grammy Awards in 2025 were announced. Nearly 50 years after the legendary band The Beatles disbanded, their last song "Now and Then" received two Grammy nominations with the help of AI. The song will compete with contemporary pop singers Beyonce, Charli XCX, Billie Eilish and Taylor Swift for Record of the Year, and was also nominated for Best Rock Performance against competitors including Green Day, Pearl Jam and The Black Keys.
It is understood that "Now and Then" was originally a demo recorded by John Lennon in the late 1970s, but it was not completed in the end. After Lennon's death, his widow Yoko Ono gave two tapes to Paul McCartney in 1994, one of which included "Grow Old With Me" and "Now And Then". The other three Beatles decided to remake "Now And Then", but because John Lennon's vocals could not be perfectly extracted, the plan was temporarily shelved. In 2022, director Peter Jackson and the recording engineer used machine learning algorithms to separate John Lennon's voice from the original demo of "Now and Then", allowing other members of the band to continue to participate in completing the song.
Although "Now and Then" was created using machine learning, it still falls within the scope of the Grammys' AI rules. Current guidelines state that "only human creators are eligible to submit for review, nomination or winning of the Grammy Awards," but works that contain "elements" of AI material are eligible to enter applicable categories. (Fast Technology)
AI "electronic tongue" debuts, detecting taste and food safety
Researchers at Pennsylvania State University have developed an artificial intelligence-based "electronic tongue" that can accurately identify the acidity and freshness of food, and even detect harmful substances.
It is understood that the researchers used an ion-sensitive field-effect transistor (ISFET) as a "tongue" to sense the taste by collecting ion information in the liquid and converting it into electrical signals. Subsequently, artificial intelligence (artificial neural network) played the role of the taste cortex to process and interpret these signals.
This ability to "learn autonomously" enables the "electronic tongue" to distinguish between similar soft drinks or coffee mixes, detect whether milk is diluted, identify spoiled juice, and even detect the presence of harmful per- and polyfluoroalkyl substances (PFAS) in water.
In addition, the researchers also used a method called Shapley Additive Explanations to analyze the decision-making process of neural networks. This method helps scientists better understand the decision-making mechanism of artificial intelligence and improve its transparency and explainability. (IT Home)
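As a rough illustration of that last step (invented data and feature names, not the Penn State team's pipeline), SHAP can attribute a small classifier's predictions back to individual sensor channels:

```python
import numpy as np
import shap                                    # pip install shap
from sklearn.neural_network import MLPClassifier

# Simulated ISFET-style readings: 4 ion-signal channels per sample.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(int)  # toy "fresh vs. spoiled" label

model = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                      random_state=0).fit(X, y)

# Model-agnostic Shapley attribution of predicted probabilities to each
# input channel, using a background sample as the baseline.
explainer = shap.KernelExplainer(model.predict_proba, X[:50])
shap_values = explainer.shap_values(X[:5])
# shap_values holds per-class, per-sample, per-channel contributions;
# large values on a channel mean it pushed the prediction for that class.
print(np.shape(shap_values))
```

Attributions like these are the kind of transparency the researchers describe: they show which sensor channels drove a given "spoiled" or "contaminated" verdict.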