
ABEJA's LLM development project proposal selected for the “Post-5G Information Communication System Infrastructure Enhancement Research and Development Project / Post-5G Information Communication System Development” publicly solicited by NEDO

ABEJA · Feb 1, 10:00

ABEJA Co., Ltd. (Headquarters: Minato-ku, Tokyo; Representative Director and CEO: Yosuke Okada; hereinafter “ABEJA”) is pleased to announce that its proposal to develop a general-purpose LLM*2 that “implements a rich world” through collaboration between humans and AI has been selected for the “Post-5G Information Communication System Infrastructure Enhancement Research and Development Project*1 / Post-5G Information Communication System Development” publicly solicited by the New Energy and Industrial Technology Development Organization (NEDO).

ABEJA plans to receive grants of 700 million yen, mainly for computational resources required to build an LLM.

In this project, ABEJA will conduct research and development on a Japanese LLM and peripheral technologies (RAG*3, Agent*4), aiming to dramatically improve accuracy and computational cost performance, both of which are essential for the social implementation of LLMs.

We will also release the developed LLM, source code, development know-how, and other results as appropriate, in order to promote the utilization of LLMs, accelerate AI technology innovation across society, and contribute to developing the next generation of researchers and engineers.

For commercialization, we plan to offer the results widely together with the “ABEJA LLM Series,” which has been provided on the digital EMS “ABEJA Platform” since 2023. The business model assumes a distribution model*6 for open source software (OSS)*5, in which the support needed to utilize the released LLM is provided for a fee.

ABEJA has been conducting research and development on LLMs, a type of generative AI, since 2018, and since March 2023 has provided the “ABEJA LLM Series” to client companies on the ABEJA Platform. To realize LLM implementation at client companies, we have broadened our support to cover everything from strategy formulation to business process design and in-process operation, and we continue to expand these services while advancing our LLM research and development.

We believe that this newly adopted project is a meaningful initiative toward realizing our management philosophy of “implementing a rich world” and will help accelerate the implementation of LLMs across society as a whole.

Currently, companies around the world are launching a variety of initiatives aimed at capturing the enormous value generated by generative AI, centered on LLMs. The LLM market is indeed expected to expand rapidly: the dialogue AI business in Japan is projected to grow from 14 billion yen in fiscal 2023 to 690.5 billion yen under an optimistic scenario (average annual growth rate of 165.0%, CAGR 2023-2027; source: Seed Planning Co., Ltd., “Current Status and Future Prospects of the 2023 Dialogue AI Business”), and ABEJA’s own scenario anticipates a market of 200 billion yen.

While LLM utilization is expected to bring major changes in industrial structure, using LLMs currently requires large-scale computational resources, so the scope of application becomes restricted once return on investment is taken into account; this is one factor hindering the social implementation of LLMs. Typical issues faced by LLMs are the “knowledge cutoff,” which prevents them from reflecting the latest or updated information, and “hallucination,” the generation of inaccurate information not based on facts. These arise because an LLM’s knowledge is built from vast amounts of “existing” data and because of the LLM’s characteristic of learning even the incompleteness and misinformation present in its training data. To improve LLM accuracy, it is essential to eliminate data containing incorrect or biased information and to train on accurate, reliable data. One countermeasure is “fine-tuning,” in which an already trained LLM is further trained on a new dataset, but each run consumes large computational resources and requires considerable cost and time. For this reason, in practice its application has been limited to a subset of enterprise companies. OpenAI announced a fine-tuning feature for “GPT-3.5 Turbo” in 2023, but the data it can handle is limited to 4,096 tokens and files of 50 MB or less, leaving issues with practicality.

“RAG (Retrieval-Augmented Generation)” is a method regarded as promising for solving these issues. RAG is a technology that links an LLM with external databases and information sources (hereinafter “external data”), allowing the LLM to generate answers that incorporate knowledge from the external data. Highly accurate answers about external data become possible simply by replacing the external data, without fine-tuning each time. In addition, by optimizing the “Agent,” the LLM becomes able to autonomously plan and execute necessary actions, such as calling APIs and tools, based on the input.
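For illustration, the RAG flow described above can be sketched as follows. This is a minimal sketch, not ABEJA's implementation: the embed() and generate() functions are placeholders standing in for a real sentence encoder and the developed LLM.

```python
# Minimal RAG sketch: retrieve the external documents most similar to the
# question, then have the LLM answer using them as context.
from typing import List
import numpy as np

def embed(texts: List[str]) -> np.ndarray:
    """Placeholder embedding: hashed bag-of-words vectors. Swap in a real encoder."""
    vecs = np.zeros((len(texts), 256))
    for i, text in enumerate(texts):
        for token in text.lower().split():
            vecs[i, hash(token) % 256] += 1.0
    return vecs

def generate(prompt: str) -> str:
    """Placeholder for the LLM call (local model or API)."""
    return f"[LLM answer grounded in the prompt below]\n{prompt}"

def rag_answer(question: str, documents: List[str], top_k: int = 2) -> str:
    # 1. Retrieve: rank external documents by cosine similarity to the question.
    doc_vecs = embed(documents)
    q_vec = embed([question])[0]
    sims = doc_vecs @ q_vec / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec) + 1e-9)
    context = "\n".join(documents[i] for i in np.argsort(-sims)[:top_k])
    # 2. Augment and generate: the LLM answers using only the retrieved context,
    #    so updating the external data updates the answers without fine-tuning.
    prompt = (f"Answer using only the references below.\n"
              f"References:\n{context}\n\nQuestion: {question}\nAnswer:")
    return generate(prompt)

print(rag_answer("When was the adoption announced?",
                 ["The adoption was announced in February 2024.",
                  "The company was founded in 2012."]))
```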

ABEJA believes that improving accuracy through RAG and optimizing the Agent will raise computational cost performance, bring economic rationality and an expandable scope of application, and strongly promote the social implementation of LLMs. We also believe there is still room for technological progress in RAG as currently practiced, and we will realize pioneering, highly practical methods by working on the LLM and its peripheral technologies (RAG, Agent) in an integrated manner. For standalone LLM research and development, existing open-source LLMs will be used as benchmarks, with the goal of achieving top scores on all JGLUE*7 items at the time of release.
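The “Agent” behavior referred to above, in which the LLM autonomously decides on and executes actions such as tool and API calls, can likewise be pictured as a minimal loop. The JSON reply format, the tool set, and the scripted llm() stub below are illustrative assumptions rather than ABEJA’s actual design.

```python
# Minimal agent loop sketch: the LLM repeatedly chooses an action (a tool call
# or a final answer), the program executes it, and the observation is fed back.
import json
from typing import Callable, Dict

_scripted_replies = iter([
    '{"action": "calculator", "input": "700 * 1000000"}',
    '{"action": "final", "input": "The grant of 700 million yen is 700,000,000 yen."}',
])

def llm(prompt: str) -> str:
    """Placeholder for the LLM: replays a canned plan, one JSON action per call."""
    return next(_scripted_replies)

# Hypothetical tools the agent is allowed to call.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda query: f"(stub) search results for: {query}",
    "calculator": lambda expr: str(eval(expr)),  # illustration only; never eval untrusted input
}

def run_agent(task: str, max_steps: int = 5) -> str:
    history = f"Task: {task}\n"
    for _ in range(max_steps):
        step = json.loads(llm(history + "Reply with the next action as JSON."))
        if step["action"] == "final":
            return step["input"]                             # the model decided it can answer
        observation = TOOLS[step["action"]](step["input"])   # execute the chosen tool
        history += f"Action: {step['action']}({step['input']}) -> {observation}\n"
    return "Stopped: step limit reached."

print(run_agent("Express the 700 million yen grant in yen."))
```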

Through these efforts, ABEJA also aims for Japan to play an important role in the international AI field and to establish a new standard for information processing technology in the international community.

By providing society with the LLM, source code, development know-how, and other results obtained through this research and development, ABEJA will promote the social implementation of LLMs, increase the number of companies and organizations utilizing generative AI, dramatically accelerate AI technology innovation across society, and help develop the next generation of researchers and engineers, striving to realize its corporate philosophy of “implementing a rich world.”

Business Overview

Publicly solicited project name

Post-5G Information Communication System Infrastructure Enhancement Research and Development Project/Post-5G Information Communication System Development

Name of the proposed project

Research and development of a general-purpose LLM as the basis for specialized models, toward the social implementation of LLMs

Implementation period

February 2024 to August 2024

Purpose

・Research and development of a Japanese LLM and peripheral technologies (RAG, Agent), with an eye on general-purpose use, for the social implementation of LLMs

・Disclosure of the deliverables obtained through research and development (LLM, source code, development know-how, etc.) to promote the utilization of generative AI, accelerate AI technology innovation in society, and develop the next generation of researchers and engineers

・Enabling Japan to play an important role in the international AI field and establish a new standard for information processing technology in the international community

Overview

・Research and development of a general-purpose LLM as the basis for specialization

- Achieve top scores in evaluations benchmarked against existing open-source LLMs

- Improve the accuracy of peripheral technologies (RAG, Agent) and promote data utilization

・For social implementation, pursue development tied to our own business while also disclosing and providing deliverables such as selected models and know-how

- Broadly offer the LLM and peripheral technologies (RAG, Agent) developed through this research together with the services we currently provide

- Publish deliverables (source code, models, development know-how) obtained through research and development

NEDO publication details

Adoption results publication page URL: https://www.nedo.go.jp/koubo/IT3_100304.html

■ Overall Overview Diagram (image)

■ Implementation schedule

About terms

*1 Post-5G Information Communication System Infrastructure Enhancement Research and Development Project: A project to develop core technologies aimed at strengthening the development and manufacturing base of post-5G information communication systems in Japan. A post-5G information communication system refers to a communication system compatible with post-5G, with further enhanced functions such as ultra-low latency and massive simultaneous connections compared to 5th-generation mobile communication systems (5G). https://www.meti.go.jp/policy/mono_info_service/joho/post5g/index.html

*2 LLM: An abbreviation of Large Language Model. Large language models are one area of generative AI.

*3 RAG: An abbreviation of Retrieval-Augmented Generation. A technology that links an LLM with external databases and information sources. By using this technology, the LLM can generate highly accurate answers that incorporate knowledge from those external databases and information sources.

*4 Agent: A technology that enables autonomous planning and execution of actions. By using this technology, the LLM can autonomously make decisions and plan and execute actions such as calling APIs and tools based on the input. This makes it possible to autonomously produce answers using external data not included in the training data.

*5 Open source software (OSS): A general term for software whose source code can be used, examined, reused, modified, extended, and redistributed free of charge, regardless of the user's purpose.

*6 Distribution model: A business model in which an OSS provider or other community provides, for a fee, support for the maintenance, bug fixes, security, and other updates required by products incorporating the OSS. For this commercialization, ABEJA assumes a “Red Hat Enterprise Linux (RHEL)”-style approach.

*7 JGLUE: A set of datasets for measuring general language understanding ability in Japanese. LLMs are evaluated from a variety of perspectives.

■ About ABEJA Co., Ltd.

Under its management philosophy of “implementing a rich world,” ABEJA develops a “digital platform business” that transforms the core business processes of client companies based on the “ABEJA Platform,” and continues to achieve growth in business profits. We have promoted research and development on the ABEJA Platform since our founding in 2012, and to date have realized digital transformation for more than 300 companies across a wide variety of industries and business categories. Furthermore, using advanced know-how and approaches such as “Human in the Loop,” we realize the “human-AI collaboration” essential to digital transformation, transform customers’ core operations strategically and efficiently, and also work on innovating business models.

Headquarters: 2nd Floor, Bizflex Azabujuban, 1-14 Mita, Minato-ku, Tokyo

Established: September 10, 2012

Representative: Representative Director and CEO Yosuke Okada

Business: Digital platform business

URL: https://abejainc.com
