China Galaxy Securities: Paying for large-model training data may become a trend; watch for valuation recovery opportunities in the publishing sector

Zhitong Finance ·  Dec 15, 2023 00:34

Most companies in the publishing industry hold rich digital text and image resources, which can serve as important data sets for large-model training at home and abroad.

The Zhitong Finance app learned that China Galaxy Securities has published a research report saying that most publishing companies hold rich digital text and image resources, which can serve as important data sets for large-model training at home and abroad. The copyright and IP resources of publishing companies are expected to draw attention from AI large-model developers at home and abroad. In addition, the overall valuation of the publishing industry is currently relatively low. The report recommends focusing on companies in the industry that hold a large amount of high-quality content and unique materials that can be digitized: Zhongyuan Media (000719.SZ), China Publishing (601949.SH), Phoenix Media (601928.SH), CITIC Publishing (300788.SZ), Shandong Publishing (601019.SH), etc.

Incident: Recently, the news and publishing giant Axel Springer signed an agreement with OpenAI, the developer of ChatGPT, becoming the first publisher in the world to partner with OpenAI to further integrate journalism with artificial intelligence technology. ChatGPT users will receive summaries of news reports from Axel Springer's brands, along with attribution and links to the original reports, improving the responses of OpenAI's models in its chatbot. Axel Springer will also provide content from its media brands as training data for OpenAI's large language models, to help train GPT-4, an artificial intelligence model owned by OpenAI.

The views of China Galaxy Securities are as follows:

Payment for large-model training data is expected to generate new revenue:

The agreement between OpenAI and Axel Springer indicates that AI companies will need to pay media brands when using their content for large-model training, which means that paying data providers for intellectual property rights may become an industry trend. For publishing companies with high-quality data resources, this payment model is expected to help them turn existing dormant copyright resources into high-quality data sets that serve large AI model developers, thus creating new sources of revenue growth.

Large-model training data is in high demand, which favors overseas copyright licensing:

According to a report published by the News/Media Alliance, the data sets used to train popular artificial intelligence models rely "significantly" more heavily on publisher content, which is weighted from 5 times to nearly 100 times more heavily than general web content. The brokerage believes that, assuming payment for training data sets becomes a major trend, the demand for high-quality data sets for overseas large-model training will be met through overseas copyright licensing. Publishing companies with high-quality and scarce data resources are therefore expected to open up new sources of business growth by licensing their copyrights overseas.

Copyright policies for large-model training data have been introduced, highlighting the value of high-quality training data:

Policy documents promoting the development of AI technology were issued in many places during the year, such as "Beijing's Certain Measures to Promote the Innovation and Development of General Artificial Intelligence" and the "Shenzhen Action Plan to Accelerate the Development Level of Artificial Intelligence in Shenzhen", all of which mention "high-quality data sets." Furthermore, the "Interim Administrative Measures on Generative Artificial Intelligence Services", jointly issued by seven departments including the Cyberspace Administration of China, stipulate that generative AI service providers must not infringe on the intellectual property rights of others. The brokerage believes that, with AI policies now being introduced in quick succession, high-quality data sets and the copyright of training data are receiving attention, and the value of high-quality training data will become more prominent in the future.

Risk warning: the risk that overseas copyright licensing progresses more slowly than expected; the risk of changes in publishing market policies; the risk that large-model development progresses more slowly than expected.

Disclaimer: This content is for informational and educational purposes only and does not constitute a recommendation or endorsement of any specific investment or investment strategy.