Important news
NEWS REMINDER
- Tongyi Qianwen's three main models have been reduced in price again, with the highest reduction of 85%
- Alibaba CEO Wu Yongming: AI's greatest imagination is not in the mobile phone screen, but in changing the physical world
- Nvidia expands AI territory again, reportedly spends $165 million to acquire startup OctoAI
- Musk's brain-computer interface is approved, and the blind may see again
- OpenAI appoints former Coursera exec Leah Belsky as general manager of education
Today's headlines
HEADLINE NEWS
Tongyi Qianwen's three main models have been reduced in price again, with the highest reduction of 85%
At the Yunqi Conference on September 19, Alibaba Cloud CTO Zhou Jingren released Qwen2.5, the new generation of Tongyi Qianwen open-source models. The flagship model, Qwen2.5-72B, outperformed Llama 405B and once again took the top spot among open-source models worldwide. Qwen2.5 covers large language models, multimodal models, mathematical models, and code models in multiple sizes. Each size comes in a base version, an instruction-following version, and a quantized version. More than 100 models are available in total, setting a new industry record.
Zhou Jingren announced that Tongyi's flagship model Qwen-Max has been fully upgraded, with performance close to GPT-4o. The backend models of Tongyi's official website and Tongyi APP have been switched to Qwen-Max, which will continue to provide free services to all users. Users can also call Qwen-Max's API through Alibaba Cloud Bailian Platform.
Notably, after the first substantial price cut in May, the three main Tongyi Qianwen models on the Alibaba Cloud Bailian platform have been reduced in price again. The price of Qwen-Turbo dropped by 85% to as low as 0.3 yuan per million tokens, and Qwen-Plus and Qwen-Max were cut by a further 80% and 50% respectively. Qwen-Plus, whose reasoning ability is on par with GPT-4, can be applied to complex tasks and is positioned as the best choice for balancing quality, speed, and cost. After the price cut, Qwen-Plus is described as the most cost-effective model in the industry, priced 84% below comparable models of the same scale. At the same time, the Alibaba Cloud Bailian platform is giving all new users more than 50 million free tokens and a quota of 4,500 free image generations. (Bianniushi)
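As a quick check of the arithmetic, the pre-cut Qwen-Turbo price implied by these figures can be worked out as follows. This is only a sketch: the function name is hypothetical, and it assumes the 85% reduction applies to the same pricing tier as the quoted 0.3 yuan figure, which the report does not state explicitly.

```python
def price_before_cut(price_after: float, cut_fraction: float) -> float:
    """Return the pre-cut price implied by a post-cut price and a fractional reduction."""
    if not 0 <= cut_fraction < 1:
        raise ValueError("cut_fraction must be in [0, 1)")
    return price_after / (1 - cut_fraction)


# Figures quoted in the report: an 85% cut leaves 0.3 yuan per million tokens.
# Assumption: both numbers refer to the same pricing tier.
implied_old_price = price_before_cut(0.3, 0.85)
print(round(implied_old_price, 2))
```

Under that assumption, the earlier price works out to roughly 2 yuan per million tokens.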
Domestic Information
DOMESTIC NEWS
Alibaba CEO Wu Yongming: AI's greatest imagination is not in the mobile phone screen, but in changing the physical world
On September 19, Wu Yongming, CEO of Alibaba Group and Chairman and CEO of Alibaba Cloud Intelligence Group, delivered a keynote speech at the 2024 Yunqi Conference. He said that over the past 22 months AI has developed faster than in any other period in history, but that we are still in the early stages of the AGI revolution. In his view, the greatest room for imagination in generative AI lies not in creating one or two new super apps on the mobile phone screen, but in taking over the digital world and changing the physical world.
Wu Yongming's core views are as follows:
- AI is developing faster than in any other period in history, but we are still in the early stages of the AGI revolution.
- The investment threshold for the next stage of advanced models is in the tens of billions or hundreds of billions of US dollars.
- The greatest room for imagination in generative AI lies not in creating one or two new super apps on the mobile phone screen, but in taking over the digital world and changing the physical world.
- Robotics will be the next industry to undergo a major change. In the future, all objects that can move will become intelligent robots.
- In the future, almost all software and hardware will have reasoning capabilities, and their computing cores will shift to a model that relies mainly on GPU-based AI computing power, supplemented by traditional CPU computing.
- Over the past year, Alibaba Cloud has invested in a large amount of new AI computing power, but it is still far from meeting the strong demand from customers.
- People tend to overestimate a new technological revolution in the short term and underestimate it in the long term. It keeps growing while people doubt it, and those who hesitate miss the big trend. (Bianniushi)
The world's first multimodal geographic science model "Kunyuan" was released, created by the Chinese Academy of Sciences
On September 19, the Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, the Institute of Tibetan Plateau Research, Chinese Academy of Sciences, the Institute of Automation, Chinese Academy of Sciences and other institutions officially released the world's first multimodal geographic science model "Kunyuan" (Sigma Geography).
According to reports, "Kunyuan" is a domain-specific large language model focused on geographic science, with the specialist ability to handle geography-related questions. The R&D team has built a high-quality corpus covering the full spectrum of geography, trained a large language model for geographic science, and developed an intelligent guidance platform for geographic research. As a result, "Kunyuan" is described as "understanding geography", "precisely matching maps", "knowing people's hearts", and "intelligent maps". It can answer professional geography questions, analyze geographic literature, query geographic data resources, mine and analyze geographic data, and draw thematic maps. (IT Home)
Transsion and MediaTek jointly build an artificial intelligence joint laboratory to focus on AI technology innovation on mobile phones
Recently, the artificial intelligence joint laboratory jointly built by Transsion Holdings and MediaTek was unveiled in Shenzhen. The two parties will integrate the superior technical resources in the field of artificial intelligence to accelerate the application and popularization of AI technology in smart terminals. Zhang Qi, senior vice president of Transsion Holdings, Shi Tuanwei, general manager of TEX AI Center, Dr. Lu Zhongli, deputy general manager of MediaTek's computing and artificial intelligence technology group, and Li Shaoding, assistant manager of the wireless product software development department, jointly unveiled the laboratory.
According to an official introduction from Transsion Holdings, the newly established joint laboratory will focus on application innovation on mobile phones in areas such as large language models, intelligent agents, AI voice, and imaging. It will provide more on-device deployment and optimization solutions for generative AI, and the two companies will jointly explore AI services and mobile application scenarios for the general public.
It is reported that Transsion has already made some moves in large models and interconnection technology, AIGC imaging technology, and AI voice technology for less widely spoken languages. Transsion has released TECNO AI and launched Ella, a new generation of smart assistant. (IT Home)
BAAI (Zhiyuan) launches MemoRAG, a next-generation retrieval-augmented large model framework
According to an official announcement from the Beijing Academy of Artificial Intelligence (BAAI), BAAI and the Gaoling School of Artificial Intelligence of Renmin University of China have jointly launched MemoRAG, a next-generation retrieval-augmented generation (RAG) framework based on long-term memory. It aims to extend RAG from handling only simple QA tasks to coping with complex general-purpose tasks.
MemoRAG proposes a new RAG paradigm in three stages: clue generation based on memory, information acquisition guided by those clues, and content generation based on the retrieved fragments. This enables accurate information acquisition in complex scenarios, especially with vaguely expressed queries and highly unstructured knowledge. Under this paradigm, MemoRAG has shown great potential on knowledge-intensive domain tasks in real-world settings such as law, medical care, education, and code. The MemoRAG technical report has been published on arXiv, and the code has been open-sourced. (Pinwan News)
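The three stages described above can be illustrated with a toy sketch. This is not MemoRAG's implementation: the "memory" here is a simple word co-occurrence count standing in for a long-context memory model, the generator is a placeholder, and every function name and the sample corpus are hypothetical.

```python
import re
from typing import List, Set

STOPWORDS = {"the", "a", "of", "in", "was", "by", "that", "what", "with", "happened", "later"}


def terms(text: str) -> Set[str]:
    """Lowercase words of a text, minus a tiny stopword list."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def memory_clues(query: str, corpus: List[str], k: int = 3) -> List[str]:
    """Stage 1 (toy memory): terms that co-occur with the query's words in the corpus."""
    q = terms(query)
    counts = {}
    for passage in corpus:
        t = terms(passage)
        if q & t:
            for word in t - q:
                counts[word] = counts.get(word, 0) + 1
    # Most frequent first; ties broken alphabetically for determinism.
    return sorted(counts, key=lambda w: (-counts[w], w))[:k]


def retrieve(clues: List[str], corpus: List[str], k: int = 2) -> List[str]:
    """Stage 2: rank passages by how many clue terms they contain."""
    clue_set = set(clues)
    order = sorted(range(len(corpus)),
                   key=lambda i: (-len(clue_set & terms(corpus[i])), i))
    return [corpus[i] for i in order[:k]]


def generate(query: str, evidence: List[str]) -> str:
    """Stage 3: placeholder generator that simply cites the retrieved fragments."""
    return "Q: " + query + "\nEvidence: " + " | ".join(evidence)


corpus = [
    "The contract was signed by the tenant in March.",
    "The tenant later disputed the deposit clause of the contract.",
    "The court ruled that the deposit clause was unenforceable.",
    "Weather in the region was mild that spring.",
]
query = "What happened with the tenant?"
clues = memory_clues(query, corpus)
evidence = retrieve(clues, corpus)
print(generate(query, evidence))
```

In this toy run, the vague query never mentions the court ruling, yet the clue terms lead retrieval to that passage, which is the kind of behavior the clue-guided design is meant to enable.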
Xinliu 2.0 adds a new Discovery section and focuses on AI-native high-quality content
On September 19, Xinliu, an AI assistant built on large models, launched version 2.0. The release adds "Featured Content on the Home Page" and "Discovery" sections, offering users AI-native illustrated content and Q&A with intelligent recommendations to help them solve problems in work and life. The official website and app versions of Xinliu have been updated simultaneously.
It is understood that Xinliu 2.0 covers encyclopedia-style categories such as efficiency tools, food and leisure, travel and hobbies, sports and health, and home health. Users can browse illustrated content for life and work through the home page or the Discovery section, and can also put questions to Xinliu through AI Q&A. The Xinliu app has also launched voice dialogue, so users can ask questions and talk with Xinliu by voice. (Jiemian News)
Alibaba International releases the latest version of multi-modal large model Ovis
The Alibaba International AI team announced the release of Ovis, a multimodal large model. Ovis reportedly performs well in mathematical reasoning question answering, object recognition, text extraction, and complex task decision-making. For example, it can accurately answer math questions, identify flower varieties, extract text in multiple languages, and even recognize handwriting and complex math formulas. The data, models, training code, and inference code of Ovis 1.0 and 1.5 are all open source and reproducible. The weights of Ovis1.6-Gemma2-9B in the Ovis1.6 series have also been open-sourced. (36Kr)
APUS Qihuang large model released, leading the medical industry into the AI era
Recently, the global artificial intelligence company Kirin Hesheng Network Technology Co., Ltd. (hereinafter referred to as: APUS) announced the launch of its research and development results - APUS Qihuang (Medical) Model. This move not only demonstrates APUS's artificial intelligence technology strength in the field of medical health, but also indicates that the medical industry is about to usher in a new wave of digital transformation.
The APUS Qihuang model is built on a general-purpose large model with 210 billion parameters independently developed by APUS, and is optimized through precise pruning. It was trained through deep learning on 600 billion high-quality medical knowledge data, which took about three months to complete. With the help of the APUS Qihuang model, patients can enjoy more convenient and personalized medical services. Whether for online consultation or remote diagnosis and treatment, APUS says the model can ensure that users receive timely and professional responses. For patients living in remote areas or with limited mobility, this means access to the same quality of medical services as urban residents.
In the future, APUS will continue to increase its investment in the research and development of AI medical products, and is committed to building an open, shared, cooperative and win-win smart medical ecosystem, allowing AI medical care to protect health. (Jiemian News)
Fudan University proposes an innovative prompt trading model
The Multimedia and Intelligent Security Team of Fudan University has proposed an innovative prompt trading (PBT) scenario and an online pricing mechanism designed to maximize profits for consumers, the platform, and sellers.
The PBT system consists of the platform, consumers, and sellers who provide prompts in various categories. The platform acts as a data trading broker and aggregates prompts into packages. After the consumer pays, the platform keeps part of the payment as a service fee and passes the rest to the seller. Selecting prompt categories whose quality is unknown is modeled as a combinatorial multi-armed bandit (CMAB) problem, in which the highest-quality categories are chosen through a greedy search strategy. The goal is to maximize the total estimated quality of the selected categories over T iterations. A three-stage hierarchical Stackelberg (HS) game is then introduced to find the optimal incentive strategy. Consumers, the platform, and sellers are treated as first-, second-, and third-level leaders respectively, and the optimal incentive strategy is derived by backward induction. The profit functions of each party are defined, and each party's optimal strategy is derived through theorems and proofs. The design is intended to adapt better to a future buyer's market. The researchers believe the model could reshape the AI content creation ecosystem and improve creation efficiency. (ITSoul)
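The category-selection step can be sketched as a small combinatorial bandit. This is a generic illustration rather than the Fudan team's algorithm: it assumes a UCB-style confidence index, Bernoulli quality feedback, and a fixed number k of categories picked per round, none of which is specified in the report. All names and numbers are hypothetical.

```python
import math
import random
from typing import List


def run_cmab(true_quality: List[float], k: int, rounds: int, seed: int = 0) -> List[int]:
    """Each round, greedily pick the k categories with the highest UCB index,
    observe noisy 0/1 quality feedback, and update the running estimates.
    Returns how many times each category was selected."""
    rng = random.Random(seed)
    n = len(true_quality)
    pulls = [0] * n      # times each category has been selected
    mean = [0.0] * n     # running estimate of each category's quality

    for t in range(1, rounds + 1):
        def index(i: int) -> float:
            if pulls[i] == 0:
                return float("inf")  # force every category to be tried once
            return mean[i] + math.sqrt(1.5 * math.log(t) / pulls[i])

        # Greedy step: take the k largest indices (ties broken by position).
        chosen = sorted(range(n), key=lambda i: (-index(i), i))[:k]
        for i in chosen:
            reward = 1.0 if rng.random() < true_quality[i] else 0.0
            pulls[i] += 1
            mean[i] += (reward - mean[i]) / pulls[i]
    return pulls


# Hypothetical qualities for five prompt categories, unknown to the platform.
counts = run_cmab([0.9, 0.8, 0.3, 0.2, 0.1], k=2, rounds=2000)
print(counts)
```

Over many rounds the selection counts concentrate on the categories with the highest true quality, while the confidence bonus keeps occasionally re-testing the others.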
Xpeng Motors accelerates end-to-end autonomous driving and deepens AI computing cooperation with Alibaba Cloud
On September 19, He Xiaopeng, chairman of Xpeng Motors, drove the P7+, billed as the "world's first AI car", to the 2024 Yunqi Conference. The car is equipped with an industry-leading end-to-end large model. It is understood that over the past two years, the AI computing power jointly built by Xpeng Motors and Alibaba Cloud has grown more than fourfold. He Xiaopeng said Xpeng will continue to deepen its AI computing cooperation with Alibaba Cloud and accelerate its end-to-end large model in order to raise both the ceiling and the floor of autonomous driving.
In May this year, Xpeng Motors became the first company in China to bring end-to-end autonomous driving into mass production and quickly rolled it out nationwide. The industry generally believes that demand for computing power for end-to-end intelligent driving will keep expanding, and that an investment of hundreds of millions of yuan is only the entry ticket. To maintain its first-mover advantage, Xpeng Motors announced that it will invest 3.5 billion yuan in research and development each year, of which 700 million yuan will go to computing power for training. It will also continue to deepen cooperation with Alibaba Cloud to accelerate the deployment of end-to-end large models. (Gronhui)
Shanghai Jiao Tong University and Tencent open-source SaRA, which balances original generation ability and downstream tasks
SaRA is an efficient fine-tuning method for pre-trained diffusion models. It gives the model the ability to handle downstream tasks by fine-tuning the ineffective parameters in the pre-trained model. SaRA significantly reduces memory overhead and code complexity, and fine-tuning can be enabled by modifying only one line of training code. The method's core innovations are parameter importance analysis, sparse low-rank training, a progressive parameter adjustment strategy, and an unstructured back-propagation strategy. SaRA has been validated in extensive experiments on multiple downstream tasks, including base model capability improvement, downstream data fine-tuning, image customization, and controllable video generation. The results show that SaRA not only improves the base model's generation ability on the original task, but also balances learning the downstream task against preserving the pre-training prior, achieving superior fine-tuning results. (AI generates the future)
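The central idea of updating only the ineffective parameters can be illustrated in a few lines. This is a simplified sketch, not SaRA's code: it assumes that "ineffective" means weights whose absolute value falls below a threshold, and it leaves out the low-rank constraint, the progressive adjustment strategy, and the unstructured back-propagation mentioned above. The function names, weights, gradients, and threshold are all hypothetical.

```python
from typing import List


def small_magnitude_mask(weights: List[float], threshold: float) -> List[bool]:
    """Mark weights whose absolute value is below the threshold as trainable."""
    return [abs(w) < threshold for w in weights]


def masked_sgd_step(weights: List[float], grads: List[float],
                    mask: List[bool], lr: float) -> List[float]:
    """Apply a gradient step only where the mask is True; freeze everything else."""
    return [w - lr * g if m else w for w, g, m in zip(weights, grads, mask)]


# Toy "pre-trained" weights: two large (kept frozen) and two near zero (fine-tuned).
weights = [0.8, -0.001, 0.0005, -1.2]
grads = [0.5, 0.5, -0.5, 0.5]

mask = small_magnitude_mask(weights, threshold=0.01)
updated = masked_sgd_step(weights, grads, mask, lr=0.1)
print(mask)
print(updated)
```

Because the large weights never change, the behavior they encode is preserved, while the near-zero weights are free to absorb what the downstream task requires.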
International News
FOREIGN NEWS
Nvidia expands AI territory again, reportedly spends $165 million to acquire startup OctoAI
Artificial intelligence (AI) chip giant Nvidia is reportedly interested in expanding its presence and plans to acquire Seattle startup OctoAI, and is currently in deep negotiations. OctoAI mainly sells software that can improve the efficiency of AI model operations. According to a letter sent by the company to shareholders, Nvidia proposed to acquire the company for approximately US$165 million, hoping to enhance its software and cloud computing service capabilities through this acquisition.
Nvidia and OctoAI have worked closely together before, with Nvidia providing OctoAI with its latest chips in advance so that the company could test how to efficiently run AI models on Nvidia chips.
However, OctoAI also has close cooperation with Nvidia's competitors such as Amazon Web Services, AMD and Qualcomm. Therefore, analysts believe that if Nvidia acquires OctoAI, it will face scrutiny from the US Department of Justice. (China Business Times)
Musk's brain-computer interface is approved, and the blind may see again
Recently, Neuralink, the brain-computer interface company owned by Musk, announced that "Blindsight", its experimental implant designed to restore vision, has received Breakthrough Device Designation from the U.S. Food and Drug Administration (FDA).
Blindsight is implanted invasively in the brain and stimulates the visual cortex directly with external electrical signals, allowing blind people to sense light, that is, to perceive where a point of light sits in the field of vision. After the patient reports the perceived positions to the researchers, the researchers deliver a designed combination of stimulation signals to form an image pattern within the patient's field of vision. Neuralink's innovation lies in making the implant wireless and increasing the number of implanted electrodes. Musk has previously stated that Neuralink's short-term goal is to help paralyzed people type with their minds. In the future, it may also enable paralyzed people to walk and blind people to see, and ultimately realize "human-machine symbiosis."
The designation is not only an important step for Neuralink, but also gives it earlier access to FDA support, which can accelerate its research and development and the market approval process. "Breakthrough Device Designation" is a special status the FDA grants to certain medical devices with potentially breakthrough treatment, diagnosis, or monitoring functions. Typically, these devices treat or diagnose life-threatening diseases. (Head Technology)
Qualcomm responds to European Court antitrust penalty: We disagree with the ruling and have always complied with EU competition law
In response to the European Court of Justice's antitrust ruling on September 18 that Qualcomm had abused its dominant market position and set a fine of approximately 238.7 million euros, a Qualcomm spokesperson responded on the morning of September 19 that Qualcomm respectfully disagreed with the ruling and the Commission's decision and "believed that we have always complied with EU competition law."
On September 18, the European General Court, Europe's second highest court, affirmed the European Commission's antitrust penalty against Qualcomm in 2019 and reduced the fine to 238,732,659 euros (about 265.5 million U.S. dollars, 1.88 billion yuan), lower than the 242 million euros previously imposed by the Commission on Qualcomm. The penalty was mainly because Qualcomm sold its baseband chipsets to two customers at below-cost prices between 2009 and 2011 to curb rival British mobile software company Icera.
In the Icera case, the court held that the European Commission had substantiated its decision with direct and indirect evidence and that the antitrust ruling against Qualcomm was valid. With regard to the amount of the fine, however, the court held that the Commission had deviated from the approach set out in its 2006 guidelines. Therefore, exercising its unlimited jurisdiction, the court reduced Qualcomm's fine to 238.7 million euros. (Titanium Media AGI)
OpenAI appoints former Coursera exec Leah Belsky as general manager of education
On September 19, OpenAI announced the appointment of Leah Belsky as its first general manager of education, signaling its commitment to expanding its AI products to more schools and classrooms.
Belsky, who previously worked as an executive at Coursera, will be responsible for expanding OpenAI's relationship with the education community, including K-12, higher education, and continuing education. Belsky will help implement OpenAI's solutions in the educational process. Her new position is to handle the relationship between OpenAI and schools and work with internal teams to develop product, policy, and marketing plans. This cross-functional approach aims to build alliances with universities and develop AI solutions for educational applications.
Since its launch in 2022, ChatGPT has received widespread attention from students, many of whom use the tool in their courses. In response, OpenAI launched ChatGPT Edu in May, a version built specifically for educational institutions with additional features and pricing options. Belsky's appointment coincides with OpenAI's growing collaboration with academic institutions around the world. Institutions that have adopted ChatGPT Edu include the University of Oxford, Arizona State University, and Columbia University.
At the same time, OpenAI plans to hold a meeting with presidents and provosts of top universities in October. According to the announcement, the meeting will focus on how to properly integrate artificial intelligence into teaching and research. This move is in line with OpenAI's goal of supporting educators in applying artificial intelligence best practices in educational institutions. (Bianniushi)
The prototype of "Terminator" will be launched, and Nvidia predicts that the field of robotics will usher in the "GPT-3 moment" in the next 2-3 years
According to a report by foreign media on September 18, Nvidia senior scientist Jim Fan predicted that in the next 2-3 years, there will be major breakthroughs in related research in the field of robotics, but he also admitted that it will take longer for robots to enter daily life.
In an interview, Fan said he expects a "GPT-3 moment" in robotics: a breakthrough in robotics foundation models with an impact comparable to that of GPT-3 in language processing. He believes that, in theory, a capable humanoid robot could perform any task a human can, and predicts that the ecosystem for humanoid robot hardware will be ready in two to three years.
NVIDIA uses a combination of three data types when developing robot artificial intelligence: Internet data, simulated data, and real-world robot data. Dr. Fan emphasized the advantages and disadvantages of each approach and believed that their combination is the key to success. NVIDIA is developing technologies such as "Eureka", which uses language models to generate reward functions for robot training to automate the process.
In addition to the real world, Fan's team is also studying AI agents for virtual environments such as video games. He found similarities between these fields and is committed to developing a unified model that can control both virtual and physical agents in the long term. (IT Home)
OpenAI and T-Mobile partner on AI customer experience platform
According to foreign media reports, T-Mobile and OpenAI will build a new customer service system driven by artificial intelligence agents.
Customer service has become one of the leading commercial uses of generative AI, alongside coding and marketing. T-Mobile is creating a customer service platform called IntentCX that will draw on OpenAI’s technology, including OpenAI’s API and its latest o1 model. The companies say the model has shown promise in analyzing customer service call transcripts and identifying pain points that can be better addressed.
"One of the many things we're excited about with the next generation of models is what we can do for personalization," OpenAI CEO Sam Altman said at the T-Mobile event, noting that the o1 model is still in its early stages and will be significantly improved in the coming years. "But even in the next few months, as we move from o1 preview to o1, you'll see it get better."
OpenAI adds an automatic mode to ChatGPT that selects the appropriate AI model based on the complexity of the prompt
It is reported that OpenAI has launched an "Auto" mode for ChatGPT for all users across multiple devices. Once a user switches to Auto, ChatGPT automatically selects the most suitable AI model based on the complexity of the prompt entered: for complex prompts it uses the most advanced model, and for simpler prompts it uses a faster model to save time.
Many users have reported that in most cases they prefer to interact with the most advanced models. However, in some specific scenarios, being able to choose to optimize for speed is seen as a valuable feature.
Fal.ai receives $23 million in funding from investors including a16z, focusing on media generation AI models
Fal.ai (short for “Features and labels”) is a developer-oriented platform focused on AI-generated audio, video, and images. On September 19, the company announced that it had received $23 million in financing from investors including Andreessen Horowitz (a16z), Black Forest Labs co-founder Robin Rombach, and Perplexity CEO Aravind Srinivas.
The financing was divided into two rounds: $14 million came from a Series A round led by Kindred Ventures, and the remaining $9 million came from a previously undisclosed seed round led by a16z.
Fal offers two products: private managed compute and workflows for running models, and an open source model API for generating images, audio, and video. Fal is the first platform to host Black Forest Labs' Flux model, which provides image generation services for X's controversial chatbot Grok. In addition to Perplexity and corporate customers in the retail and e-commerce fields, popular generative AI applications Photoroom, Freepik, and PlayHT are also using Fal's services. (Saasverse)
Anthropic hints at new Claude AI desktop app
According to foreign media reports, Anthropic is preparing to launch a new Claude AI desktop application, internally called "Claude Nest". Anthropic has recently added a download button to the Claude AI web interface, but has not yet released the download link.
In addition to the desktop app, Anthropic is also developing a new feature for artifacts that may allow users to export their artifacts directly into VSCode through a standalone extension. This feature may be similar to the artifact remixing operation, generating a URL through deep linking to open in VSCode. The extension will then obtain data from the URL to simplify the development process. This enhancement is expected to be a useful addition to developers using Claude for projects. (IT Home)
YouTube will launch an AI "one-stop service" that can generate ideas, titles, and complete videos
At the "Made on YouTube" special event held on September 18th local time, Google announced that it would bring a series of AI-related features to YouTube, which is expected to change the way videos are produced and even the videos themselves.
It is reported that Google has brought a new "Inspiration" tab to the YouTube Creator Center, which is driven by AI and its main function is to "tell" creators what they should make - recommending video concepts, providing titles and thumbnails, and even writing video outlines. YouTube positions it as a "useful brainstorming tool", but users can also use the tool to build entire video projects.
YouTube has also launched a tool called Veo, which integrates Google's DeepMind video model and can generate various video backgrounds through AI, as well as create complete video clips.
The above two features are being rolled out "slowly" and will be available to creators at the end of this year or early next year. YouTube will also launch other AI features, such as the "automatic dubbing" feature that can convert videos into multiple languages to serve more creators and languages; there are also AI tools that allow creators to interact with fans through the new community section of the App. (IT Home)
Zenlayer fully upgrades its hyper-connected network to empower AI development in Asia
On September 19, Zenlayer, a distributed edge cloud service provider with hyperconnectivity at its core and global coverage, announced a comprehensive upgrade of its software-defined network (SDN) in Asia. The upgrade aims to provide extremely low-latency, ultra-high-bandwidth network connection services for core AI computing clusters in Asia, especially Southeast Asia. This important move demonstrates Zenlayer's determination to promote the development of the artificial intelligence industry in Asia.
The upgraded Zenlayer hyperconnected network will be centered in Singapore, with high-speed connections to nearly 60 data centers in Indonesia, Japan, Thailand, Vietnam, the Philippines, Malaysia, and elsewhere, to meet the region's surging data transmission needs driven by AI training and inference. The upgraded network has a total capacity of nearly 100 Tbps in Asia, and its backbone network will use 800GB single-fiber bandwidth technology to ensure high-speed and stable data transmission between AI computing clusters.
Joe Zhu, founder and CEO of Zenlayer, said: "Asia will play a vital role in the development of the artificial intelligence industry. But to achieve this development, there must be a specially built infrastructure to support it. Our artificial intelligence network will become a solid foundation for the future development of Asia's digital economy, helping our customers, especially Chinese companies going overseas, to continue to be at the forefront of AI innovation." (Jiemian News)
AI successfully enters Hollywood, video generation platform Runway reaches cooperation with Lionsgate
On September 18, AI video generation platform Runway announced a partnership with Lionsgate Entertainment, under which the company will use Lionsgate's film catalog to train a custom video model.
Michael Burns, vice chairman of Lionsgate, said the company's filmmakers, directors and other creative talents will get the model to enhance their work efficiency. "Lionsgate has an outstanding creative team with a clear vision of how AI can help their work, and we are excited to help them turn their ideas into reality."
Runway said it is considering how to license the model as a template so that individual creators can build and train their own custom models. Notably, Runway is the first generative AI company to cooperate openly with a major Hollywood studio. The day before Lionsgate reached its deal with Runway, California signed a bill restricting the use of AI digital replicas in film and television projects. Runway is still facing a lawsuit in which the company is accused of using copyrighted works to train models without permission. (Pinwan News)