AIGC Scenario Analysis and Research on Technology Roadmap of Internet Industry Application

2023-11-06 01:17 Xiongfei Ren, Lili Tong, Jia Zeng, Chen Zhang

China Communications, 2023, Issue 10

Xiongfei Ren, Lili Tong, Jia Zeng, Chen Zhang

1 Beijing University of Posts and Telecommunications, Beijing 100876, China

2 Faculty of Education, Beijing Normal University, Beijing 100875, China

*The corresponding author, email: tonglili@bnu.edu.cn

Abstract: The explosive popularity of ChatGPT is considered a milestone in the normalization of artificial intelligence applications in education. On the technical side, cross-modal AI generation applications based on human feedback are accelerating. On the business side, the scenarios that realize interactive functions are constantly being enriched. This paper reviews the evolution of AIGC, examines the current coexistence of commercial acceleration and technical concerns in AI education applications, analyzes the application of AIGC in education across 7 subdivided fields, and analyzes the optimization direction of application cases from the perspective of a perception-cognition-creation technology maturity matrix. The 3 recommendations and 2 follow-up research directions proposed are intended to promote the scientific application of artificial intelligence in education in the AIGC period.

Keywords: AIGC; artificial intelligence education application; large model

In today’s era, the scientific and technological revolution and the industrial revolution, with digital technology as their core driving force, are profoundly affecting the transformation and upgrading of all walks of life, spawning a series of new needs, new scenarios and new models [1]. In 2022, AI-generated content (AIGC) technology, represented by ChatGPT, triggered heated debate across the Internet and set off a new wave of data-driven artificial intelligence. ChatGPT (Chat Generative Pre-trained Transformer) is an artificial intelligence chatbot developed by OpenAI. It exhibits emergent capabilities such as interactive question answering, text writing, code generation, and music creation, which can assist in completing tasks across multiple scenarios [2]. At present, AIGC has become a new engine for the innovation and development of digital content. It has been applied in many fields such as media, e-commerce, entertainment, and film and television, and has achieved remarkable results. Its application in education has also attracted much attention. Against the background of the digital transformation of education, exploring paths for integrating technology and education is the general trend. With its powerful text production capacity, AIGC can not only provide personalized learning programs for the classroom, but also assist teachers in teaching evaluation and support the development of digital educational resources, which is key to promoting the high-quality development of education. On this basis, this study explores the application status and prospective trends of AIGC in education, with the aim of helping to build an intelligent ecosystem of human-computer interaction.

I. THE CONNOTATION OF AIGC

AI-generated content (AIGC), also known as generative AI, is the latest form of content production. Content production has evolved through Professional Generated Content (PGC), User Generated Content (UGC), and, in the initial stage of AIGC, AI-assisted content production. It is now moving towards a more intelligent stage in which AI produces the content itself. AIGC overcomes the shortcomings of PGC and UGC in quality and yield, and is expected to become the mainstream mode of content production in the future.

In terms of the academic understanding of AIGC, the main views in China are as follows. The China Academy of Information and Communications Technology holds that AIGC originated from deep learning technology and the rapidly growing demand for digital content. The term covers the content generated by artificial intelligence, the method of content generation, and the technologies for automatic content generation, and AIGC will become a new engine for the development of digital content [3]. Based on the characteristics of the Web 3.0 era, scholars in Internet technology research and development point out that AIGC, by using natural language processing and natural language generation technology, has become one of the important ways of content creation in the Web 3.0 era [4]. Based on the evolution of digital resources, scholars in library and information science have summarized three development stages of AIGC: based on entity twins, based on learning and creation, and based on real-time autonomous generation. Driven by industry, AIGC technology has ushered in explosive growth with the rise of the Metaverse [5]. Comparing AIGC with UGC and PGC, researchers in various industries believe that although AIGC has many advantages in content production, it will not completely replace UGC and PGC. Rather, AIGC will become an important part of content production in the future and, together with UGC and PGC, will provide content support for the development and operation of the Metaverse [6].

Research on AI-generated content outside China can be traced back to the last century. In 1966, computer scientist Joseph Weizenbaum of the Massachusetts Institute of Technology developed “Eliza” [7], the world’s first chatbot capable of human-machine dialogue. Eliza used simple pattern matching to respond to users by simulating the language of a chat, creating the effect of human dialogue and laying the foundation for later AI-generated content. It is widely considered to be the beginning of computer language understanding and generation. Before Ian J. Goodfellow et al. proposed generative adversarial networks (GAN) [8], AIGC was limited by algorithmic bottlenecks and could not directly generate content. A GAN uses a game between two deep neural network models to generate highly complex data. GAN-based technology is widely used in the generation and conversion of multimedia data such as video and audio, which has greatly promoted the application of AIGC in different fields. Since then, deep learning algorithms have continued to iterate, and AI-generated content has developed rapidly. In 2021, OpenAI proposed DALL-E, which is mainly used to generate images from text [9].

Today, AIGC uses AI algorithms, natural language processing, and computer vision to generate content in multiple modalities, including text, images, audio, and video, and to support flexible interaction. While significantly improving intelligence, AIGC also carries potential risks, including possible bias, logical errors in generated content, and the blurring of boundaries between human-generated and machine-generated content [10]. This paper explores the appropriate application of the various subdivided types of AIGC in education, and explores a scientific mechanism for guiding its advantages and disadvantages.

II. THE DEVELOPMENT OPPORTUNITIES AND CHALLENGES OF ARTIFICIAL INTELLIGENCE EDUCATION APPLICATION

In 2022, China fully implemented the strategic action of digital education and put forward the “3C” concept of Connection, Content and Cooperation. Following the principle of “application is king, service first, simple and efficient, safe operation”, many typical applications and resources have been integrated into the national intelligent education public service platform, and the world’s largest education and teaching resource library has been basically built [11]. In 2023, the education authorities are promoting the coordinated development of digital education, digital technology, digital humanities and digital ethics, and steadily advancing the theoretical and practical guidance of digital education in China [12].

The application of artificial intelligence in education is one of the important ways to implement the strategic action of educational digitization. In the new round of application enthusiasm caused by ChatGPT and the competition in large language model research, the current opportunities and challenges must be clearly recognized. On the one hand, driven by market demand and technological diversity, commercial attention remains high. On the other hand, the scale of underlying parameters required to achieve intelligence is huge, and the resulting models do not reliably show the factual consistency and logical consistency that characterize the development of real social cognition. How to incorporate such models into educational applications requires in-depth research and scientific advancement.

2.1 Opportunity 1: The Capital Response Is Enthusiastic and the Business Process Is Rapid

In July 2019, Microsoft invested 1 billion dollars in OpenAI for the first time. Subsequently, investment institutions such as Sequoia Capital, Cornerstone Management and Tiger Global Management entered the market. OpenAI had completed six rounds of financing before Microsoft confirmed a further multibillion-dollar investment in 2023. Concerns about the lack of clarity in its business model were eased by the opening of the first 70 ChatGPT plugins in May 2023. These third-party plugins cover daily needs such as food, clothing, housing, social networking, work and study, bringing ChatGPT closer to a 24-hour intelligent personal assistant. The plugins closest to the field of education include edX (finding courses and content from top universities) and Speak (language learning and translation).

On the other hand, according to statistics, as of March 15, 2023, ChatGPT daily activity exceeded 58.37 million, and the number of weekly active users was 3.3181 million. The main reasons why more and more users are trying and applying the model are that the experience is smoother, the replies are more accurate, and the ability to trace context is stronger.

2.2 Opportunity 2: The Technical Route Is Diversified and the Industrial Ecology Can Be Expected

The ability of ChatGPT to follow human intentions comes from the accumulation of several technologies, including machine learning, neural networks, and the Transformer model. Its training approach has three typical characteristics:

(1) Using a single large model. The “large” in large models refers to more model parameters, larger data processing capacity, and a greater volume of training computation. Compared with previous small models, which had poor interpretability and were limited to specific application scenarios, this round of intelligent application upgrades relies on a single master model for natural language understanding and generation, yet it shows stronger capabilities in semantic understanding, reasoning, collaboration, etc. [13]. The boom of ChatGPT confirms the importance of parameter growth and training data volume to AI models, which is qualitatively different from the previous understanding of artificial intelligence applications [14].

(2) Using few-shot learning. With few-shot learning, a pre-trained AI model can acquire broader generalization ability without a large amount of labeled training data [15]. The authors’ team is undertaking the National Natural Science Foundation of China project ‘Research on Key Technologies for the Generalization of Human-Computer Collaborative Learning Ability Based on Domain Adaptation Algorithms’ (Project No. 62277002), which is also committed to meeting the wider need for few-shot training in artificial intelligence education applications.

(3) Using supervised fine-tuning with human feedback. Compared with the previous generation, the main change in ChatGPT is the use of a human feedback mechanism in fine-tuning [16], which aligns the ability of the language model with human needs and values so that it better follows user instructions. In essence, both large models and few-shot learning aim to improve training efficiency, but the key technology that enables ChatGPT to produce accurate and reasonable results is the addition of human feedback.

ChatGPT mainly involves the following key technologies and architectures [17]: (1) The Transformer model, which is the basis for the efficient operation of the model and has strong parallel computing ability. (2) Reinforcement Learning from Human Feedback (RLHF), the core technology for improving content generation. Through three key steps, namely supervised fine-tuning, reward model construction, and proximal policy optimization (PPO) [18,19], it generates text that meets users’ needs. (3) Instruction tuning, which is designed to assist in generating high-quality text and help the model better understand instruction tasks [20]. (4) Chain-of-thought technology, which assists in completing complex reasoning tasks, especially complex logical reasoning problems [21].
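The reward-model and PPO steps named above can be illustrated by the loss terms commonly associated with them. The following is a minimal sketch in plain Python, not the implementation used for ChatGPT: the function names, the toy reward values, and the clipping range of 0.2 are assumptions chosen for illustration.

```python
import math


def reward_model_loss(r_chosen: float, r_rejected: float) -> float:
    """Pairwise ranking loss for a reward model.

    A human labeler preferred one response over another; the loss is small
    when the reward model scores the preferred response higher.
    loss = -log(sigmoid(r_chosen - r_rejected))
    """
    margin = r_chosen - r_rejected
    # log(1 + exp(-margin)) is algebraically the same quantity.
    return math.log1p(math.exp(-margin))


def ppo_clipped_objective(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """Clipped surrogate objective of PPO for a single sample.

    ratio     = pi_new(action) / pi_old(action)
    advantage = how much better the action was than expected
    eps       = clipping range (0.2 is an illustrative assumption)
    """
    clipped_ratio = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    # Hypothetical reward scores for a preferred and a rejected reply.
    print(reward_model_loss(2.0, 0.5))   # preferred reply scored higher: low loss
    print(reward_model_loss(0.5, 2.0))   # preferred reply scored lower: high loss
    print(ppo_clipped_objective(ratio=1.5, advantage=1.0))
    print(ppo_clipped_objective(ratio=0.5, advantage=-1.0))
```

The clipping keeps each policy update close to the previous policy, so a single batch of feedback cannot push the model arbitrarily far from the supervised fine-tuned starting point.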

Supported by the above three features and key technologies and architectures, AIGC can achieve better content generation results based on language, graphics and foundation models. The AIGC industrial ecology is continuously evolving and upgrading in multimodal interactive functions such as graphics, audio, and video, laying a commercial foundation for multiple scenarios. Cross-modal generation technology is also expected to become a turning point in truly realizing cognitive and decision intelligence. For example, OthersideAI, CopyAI and JasperAI offer AI functions for writing emails, producing advertising copy, and providing ideas; Lexica, Playground, etc. offer image generation; Tabnine, GitHub Copilot, etc. offer code generation; and in financial scenarios, applications such as intelligent customer service, intelligent marketing, and intelligent operation tools such as financial exhibition assistants have appeared.

2.3 Challenge 1: Large Models Develop Rapidly but Carry Hidden Logical Dangers

Against the background of intensified international competition in general-purpose large models, domestic large models are emerging in an endless stream, and the R&D cycle is continuously shortening. Table 1 shows incomplete statistics for the past two years. Internet companies such as Baidu, Tencent and Huawei, as well as universities and research and development institutions, have become the main force in large model research and development. In terms of the number of models, according to incomplete statistics, 79 large models with more than 1 billion parameters have been released in China, more than half of them open source, mainly concentrated in Beijing and Guangdong. In terms of development direction, work focuses mainly on natural language processing and multimodal fields, with less attention paid to fields such as computer vision and intelligent voice. In terms of industrialization route, domestic large model development pays more attention to practical deployment, and many companies adopt a coordinated ‘general basic large model + industry large model’ approach to promote the application of large models in related industries. In addition, research has also been carried out in China in vertical fields such as biomedicine and remote sensing meteorology to seize opportunities for future development.

Table 1. The release of some domestic large models.

At present, ChatGPT has been applied in many fields such as advertising and marketing, medical health, and intelligent office work. For example, GPT-4 scored in the top 10% on a simulated bar examination and is comparable to experienced doctors in medical diagnosis [22]. However, large model technology is in its infancy, and many problems arise in use. For example, in December 2022, a Tsinghua University team evaluated the degree of gender bias in GPT-2 and found that GPT-2 predicted teachers to be male with a probability of 70.59% and doctors to be male with a probability of 64.03%, showing a gender stereotype in the AI. In March 2023, many Twitter users reported that other people’s chat records appeared in the chat history bar on the left side of the ChatGPT page, in some cases including users’ names, credit card numbers, e-mail addresses and other sensitive data, resulting in the leakage of user information. In addition, AI image recognition tends to identify people in a kitchen as women even when the person is male, indicating a serious ‘bias’ in the value orientation behind the AI system. At the same time, different AIGC applications have exposed hidden dangers in factual consistency and logical consistency.

Example 1: Fact consistency failure case (see Figure 1)

Figure 1. Fact consistency failure case.

Description: Two children, Chloe and Alexander, went for a walk. They both saw a dog and a tree. Alexander also saw a cat and pointed it out to Chloe. She went to pet the cat. Who saw the cat first?

Failed answer: ChatGPT wrongly answered that both children saw the cat at the same time, and could not give the correct answer, which is that “Alexander” saw it first.

Example 2: Logical consistency failure case (see Figure 2)

Figure 2. Logical consistency failure case.

Figure 3. A schematic diagram of the relationship between model parameter scale and its performance accuracy.

Logical question: Is the number of letters in the word “prime” prime? Think about it carefully and show your steps.

Failed reasoning: The word has 5 letters and 5 is prime, but ChatGPT gave a wrong answer supported by illogical reasoning.
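The expected reasoning for Example 2 is short enough to check mechanically. The sketch below counts the letters in the word and tests the count for primality by trial division; the helper name `is_prime` is an assumption introduced for illustration.

```python
def is_prime(n: int) -> bool:
    """Return True if n is a prime number (simple trial division)."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True


word = "prime"
letter_count = len(word)           # step 1: count the letters
answer = is_prime(letter_count)    # step 2: test the count for primality
print(f"'{word}' has {letter_count} letters; prime: {answer}")
```

Both steps are deterministic, which is why a wrong answer here points to a failure of logical consistency and not to missing knowledge.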

It can be seen that AIGC has brought great changes to the information industry, but the risks it brings cannot be ignored [23]. The following key problems urgently need to be faced and solved: model training may involve users’ sensitive data, causing data leakage and privacy problems; the decision-making process and prediction principles inside the model are difficult to explain, and the soundness of the generated content remains open to question; the training and inference of large models consume a great deal of energy and aggravate environmental problems; practices such as data cleaning and parameter setting need to be refined to improve the computational performance and accuracy of the model; and AI algorithms exhibit discrimination, unfairness and value bias, so the audit and optimization of related content must be strengthened.

2.4 Challenge 2: Computational Intelligence Is Ahead, Cognitive Intelligence Is Lagging Behind

The breakthrough in the ChatGPT boom is that a large pre-trained model does not require additional high-cost data annotation. It uses the many layers, many connections, and many parameters of deep learning networks to achieve excellent intent recognition and language comprehension at a parameter scale in the billions. This is consistent with the general principle that quantitative change leads to qualitative change. In the authors’ team’s previous technical evaluation of pre-trained models, algorithm performance followed a pattern: the larger the data size and the more parameters in the pre-trained model, the higher the output accuracy.

With many foreign technology giants investing heavily in pre-trained models and AI-generated content, ChatGPT can be regarded as a milestone artificial intelligence product in terms of underlying data, core technology, and user experience. At the same time, however, we need to recognize clearly that the current underlying data is based on an information graph, that is, massive Internet information is the main source, and that because of its large scale, “emergent intelligence” coexists with “factual consistency and logical consistency problems”. In academia, some teams are studying general models based on event logic graphs, and the research team at Beijing Normal University is working with universities such as Beijing University of Posts and Telecommunications to advance industrial artificial intelligence education applications towards a ‘cognitive graph’ that reflects the laws of education.

III. THE SUBDIVISION AND APPLICATION OF AIGC

AIGC is a key driving force in upgrading the paradigm of computing education. Driven by powerful algorithms, students’ learning, teachers’ teaching, school education and so on present a new ecology of human-machine integration [24]. Because of AIGC’s diversified technical routes, its expected applications in education will also be rich, covering at least 7 segments: text, image, audio, video, cross-modal, strategy, and virtual human. A better understanding of the core technologies and application scenarios in these more fine-grained practical fields will help create a leading edge in science and technology in the AIGC period.

3.1 AIGC-Text Generation Applications

Text generation includes non-interactive and interactive text applications. For non-interactive text, exploration based on structured data is encouraged in order to generate structured text for specific scene types, such as campus management regulations and regional education governance notices. Exploration based on unstructured data, which allows greater openness and freedom, is also encouraged in order to generate text that combines a common basic framework with personalization, such as teachers’ lesson preparation notes and media communication copy. For interactive text, context interaction based on cognitive rules can be promoted for students, such as learning-companion Q&A robots and skill game robots, and context interaction based on subject development can be promoted for teachers, such as document library robots and robots for operations in high-risk environments. Text generation applications can not only generate structured notices efficiently and greatly improve teaching efficiency, but also generate thought-provoking content, provide innovative materials and resources for students and teachers, promote the generalization of knowledge to more fields, and broaden approaches to problem solving.

3.2 AIGC-Image Generation Applications

Image generation is divided into image editing tools and autonomous image generation tools. Image editing tools change or copy the style of a picture, modify part of a picture according to a prompt, or add new elements to it, and can support scenarios such as intelligent art education and diversified special education. For autonomous image generation tools, pilot applications that generate creative or functional images are advocated, along with the ability to transform 2D images into 3D models using base models of different complexity, such as generative adversarial networks and Transformer networks. In actual teaching, image generation applications can provide students with a more personalized and interactive learning experience. For example, they can generate charts, diagrams, and flow charts from students’ descriptions to help students better understand and memorize knowledge points; generate images of simulated or virtual experiments to create new teaching models and promote the move from combined physical and cyber space to a fused virtual-real space [25]; and generate visual learning tools, jigsaw puzzles, map reading exercises and other learning aids to help enhance students’ observation, logical thinking and problem-solving abilities.

3.3 AIGC-Audio Generation Applications

Audio generation includes intelligent dubbing, text-to-specific-voice generation, and music generation. Attention should be paid to the still unsolved problem that data for audio generation tasks is difficult to label, and to breakthroughs on how the granularity of data labeling affects the controllability of audio generation. In intelligent dubbing, input speech or text is converted into speech in the voice of a given target speaker, for example in the post-production dubbing of digital course resources. In text-to-specific-voice generation, text is input and the speech of a specific speaker is output, for example for language learning where foreign language teachers are in short supply. In music generation, AI automatically generates music from an opening melody or a text description, for example in music classes in basic education and the training of art students in higher education. In special education, course content, teaching materials or exercises are converted into audio so that visually impaired students can obtain information and participate in learning through hearing. Audio generation promotes multi-sensory learning and enhances students’ development in language, hearing and creativity.

3.4 AIGC-Video Generation Applications

Video generation includes video attribute editing, automatic video editing, and partial video editing. By cutting the video at the frame level, each frame can be processed individually. Video attribute editing promotes the integration of video quality repair, insertion of specific content, video beautification and similar technologies with teaching management. Automatic video editing promotes the integration of service capabilities such as AI-based detection of video clips, generation of trailers and promotional videos, and library-school collaboration. Partial video editing promotes the use of AI to dynamically edit content within a video, for example replacing local content to better meet the needs of learners in different regions and at different stages of schooling, which reduces the cost of remaking an entire series of digital teaching resources. Video generation involves more technologies and resources and is currently at the research stage. Its application can bring a higher-quality, more diverse and more vivid learning experience to education.

3.5 AIGC-Cross-Modal Generation Applications

Cross-modal generation includes text-to-image, text-to-video, and image/video-to-text generation. Cross-modal generation technology is a turning point in truly realizing cognitive and decision intelligence. In text-to-image generation, the focus is on accelerating the deployment of the technology, strengthening the review of creative image databases for educational use, and highlighting the principle of adaptive cognition. In text-to-video generation, attention should be paid to the room for improvement in video duration, clarity, and logical coherence, and applications involving long videos, high definition, and reasoning scenes should be promoted. Image/video-to-text generation covers visual question answering systems, automatic subtitling, etc. The essence of cross-modal technology is to organize, reconstruct and reconfigure a variety of resources, realize the conversion and integration of information across different perceptual modes [26], and build a bridge for data between modes, so as to make up for missing information and enrich its expression [27]. In education, cross-modal generative applications can fragment, label, and convert resources according to the real-time needs of teachers and students, then provide resource representations in different arrangements and combinations, generate educational resources that match the intentions of teachers and students, and add educational value in the process of sharing and reuse.

3.6 AIGC-Strategy Generation Applications

Strategy generation is the process by which AI proposes solutions for specific problems and scenarios. In the natural sciences, it can be explored and applied to biogeographic deduction and to the upgrading of programs that integrate medicine and education; in the social sciences, it can be explored and applied to practical teaching scenarios such as the iteration of project management strategies and enterprise sand-table simulation. For students’ learning, the application can draw on professional subject knowledge to provide solutions to problems and, by combining learning needs with the level of knowledge mastery, provide dynamic learning strategy support and personalized learning suggestions and guidance. For teachers’ teaching, the application can generate teaching methods, textbook suggestions and evaluation strategies in line with the characteristics of the subject, and provide a variety of options and support services for lesson preparation. Strategy generation applications provide solutions and optimization paths for problems in the teaching process, play the role of a “think tank”, and provide important support for educational decision-making.

3.7 AIGC-Virtual Human Generated Applications

A virtual human is a comprehensive product that exists in the non-physical world (such as pictures, videos, live broadcasts, integrated servers, VR) and has multiple human characteristics. The exploration and application of service-oriented virtual humans should be promoted. Service-oriented virtual humans can reduce costs and improve the efficiency of teaching management and regional governance, and in-depth pilots are encouraged in repetitive, mechanical, explanatory and customized human work. For example, virtual digital humans can be applied in library scenarios, relying on proprietary knowledge data sets in the library field to provide users with targeted and personalized answers and to help users use library services more efficiently and conveniently [28]. The exploration and application of identity-based virtual humans should also be promoted. Building on the IP attributes of the Internet, they can avoid the hidden danger that real people in online learning environments use virtual identities to make their online behavior untraceable, and help achieve a clean, high-quality digital learning environment. Compared with ordinary intelligent question answering robots, virtual digital humans have a capacity for empathy: they can express emotions more realistically, recognize and respond to emotions, and provide more accurate services [29].

IV. TECHNOLOGY MATURITY MATRIX AND APPLICATION CASES

Early AIGC technology belonged mainly to the traditional pre-deep-learning stage based on templates or rules. With the rapid development of deep neural networks, AIGC entered the stage of deep learning. This stage involves algorithms such as convolutional neural networks and deep variational autoencoders, which have now become important tools in the field of AIGC. Multimodal large-scale neural networks integrate multiple data inputs for analysis and processing and can handle complex scene data, opening a new stage of AIGC technology. The mature application fields of each AIGC technical direction are shown in Table 2.

Table 2. Application of AIGC technology.

4.1 Perception

Perception ability is one of the key technologies by which AIGC assists data acquisition, and large visual models are an effective means of improving AIGC perception.

The application of large visual models has become mature. Alexandria et al. used smart devices to collect data on learners’ learning behaviors, facial expressions, and physiological indicators, and used a Java programming learning system to predict and adjust learners’ learning outcomes [30]. The system can mark and annotate conversations between instructors and learners, and monitor learners’ facial expressions and mental states to better understand learners’ cognitive and emotional states. Xu Zhenguo et al. proposed a learner emotion recognition method based on deep learning [31]. They built a large-scale learner emotion database and divided learner emotions into seven types: normal, happy, angry, sad, panic, concentration and boredom. This method performs better than traditional methods and can be applied to smart learning environments to improve learner models and achieve emotional interaction. These applications are designed to improve the experience of students and teachers in order to achieve more efficient education. By using real-time perception technology, the system can perceive the user’s learning behavior and judge the learning state in real time, which helps to improve teaching efficiency and learning outcomes. At the same time, the system can automatically identify learners’ expressions, providing a new way to achieve emotional interaction in online education [32]. This interaction makes the teaching process more vivid and attractive, and has significant advantages in arousing students’ interest in and enthusiasm for learning.

4.2 Cognition

Cognitive ability is an intermediate-level ability built on perception. AI cognitive models are often used in teaching tasks such as question-answering robots, intelligent tutors, and automatic question generation. Dialogue-based education systems use artificial intelligence and natural language processing technology to give learners a way to interact with computers.

The application of large language models is constantly being optimized. In recent years, some intelligent English learning systems have adopted the form of dialogue robots that treat learners as their "English partners". The ChatGPT system that triggered the discussion boom is a landmark creation in the field of dialogue robots. Its basic component is the Transformer model, and its core function is to use the context information in the text sequence effectively to generate the next word. In 2017, Google first proposed the Transformer network model. Subsequently, Google proposed the pre-trained language model BERT, which is based on the Transformer network structure and achieved new state-of-the-art results in multiple natural language processing tasks [33]. In the GPT series models, the input text sequence is embedded into a vector space of a specific dimension, and the context information is gradually learned through multiple layers of Transformer decoder modules. This pre-training and fine-tuning approach has proven very effective in natural language processing tasks [34], making the GPT model series one of the most advanced solutions for language generation and text tasks.
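How a Transformer uses the preceding context when generating the next word can be illustrated with a minimal sketch of causal scaled dot-product self-attention. The sketch makes strong simplifying assumptions: a single attention head, no learned projection matrices, and made-up two-dimensional embeddings. The function names are hypothetical and do not come from any GPT implementation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def causal_self_attention(queries, keys, values):
    """Scaled dot-product attention where position i sees only positions 0..i."""
    dim = len(queries[0])
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        # Causal mask: only keys at positions <= i are visible.
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(dim)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # Context vector: attention-weighted sum of the visible values.
        context = [
            sum(w * values[j][k] for j, w in enumerate(weights))
            for k in range(len(values[0]))
        ]
        outputs.append(context)
        all_weights.append(weights)
    return outputs, all_weights

# Toy embeddings for a three-token sequence (made-up numbers).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = causal_self_attention(tokens, tokens, tokens)
```

In a full model, the context vector at the last position would pass through further layers and a vocabulary-sized output layer to score candidate next words; that part is omitted here.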

4.3 Creation

Creative ability is a high-level application ability in AIGC technology and covers both single-modal and multimodal generation.

This field is the current research frontier and the next direction of development. Single-modal technology analyzes and recognizes one kind of information (images, sounds, or text) and is widely used in intelligent learner diagnosis and personalized teaching. Multimodal technology uses two or more data types for analysis and processing, helping teachers to understand students more comprehensively and to design teaching content and methods in a personalized way. In 2021, the OpenAI team unveiled the deep learning model CLIP (Contrastive Language-Image Pre-Training) [35]. CLIP was regarded as one of the most advanced image classification models at the time and serves as a basis for AI painting. The model focuses on general image classification, learns a large number of visual and semantic features, and can accurately judge how well an image corresponds to a text prompt. Radford et al. proposed a method that combines neural descriptors with reinforcement learning models to learn painting tasks, decouple complex textures, and generate painting stroke sequences [36]. Training with this method requires neither the experience of human painters nor stroke tracking data to achieve a good painting effect. In text writing and poetry creation, AIGC can also help students produce drafts, detect and correct grammatical errors, and offer optimization suggestions and feedback that make their writing clearer and smoother.
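Judging how well an image corresponds to each of several text prompts, as described above for CLIP, can be sketched as a cosine similarity between embeddings followed by a softmax. The sketch assumes that an image encoder and a text encoder have already produced the embedding vectors; the three-dimensional vectors, the prompts, and the function names are hypothetical, and the temperature value is only an illustrative choice.

```python
import math

def normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def match_prompts(image_embedding, text_embeddings, temperature=0.07):
    """Return one probability per prompt from temperature-scaled cosine similarity."""
    image = normalize(image_embedding)
    logits = []
    for text_embedding in text_embeddings:
        text = normalize(text_embedding)
        cosine = sum(a * b for a, b in zip(image, text))
        logits.append(cosine / temperature)
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up embeddings; real ones would come from trained encoders.
prompts = ["a photo of a cat", "a photo of a dog"]
image_vec = [0.9, 0.1, 0.0]
prompt_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
probs = match_prompts(image_vec, prompt_vecs)
best_prompt = prompts[probs.index(max(probs))]
```

A smaller temperature sharpens the distribution, so the best-matching prompt receives almost all of the probability mass.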

V. RESEARCH OUTLOOK

Supported by the three core elements of data, algorithms, and computing power, AIGC's ability to generate multimodal content such as text, code, images, and video continues to improve, and it will continue to promote the application of artificial intelligence technology in education. Based on an understanding of the connotation of AIGC, this paper analyzes two opportunities and two challenges facing the new round of AI education applications represented by ChatGPT. On this basis, it elaborates on the seven subdivisions of AIGC and analyzes their educational application scenarios.

Given the continuous deepening of strategic actions for educational digitization and the steadily advancing maturity of intelligent technology, follow-up research directions on the AIGC track can include:

In-depth research and development of large models. Model credibility, customization, private data utilization, enhanced multimodal interaction capability, improved training efficiency, and chain-of-thought construction are all research points through which large models can continue to exhibit emergent intelligence.

Follow-up of large model evaluation technology. Model classification standards, ways of utilizing user feedback, data quality measurement, and measurement of the value of emotional companionship experiences are all research points for the orderly management of multiple classes of large models [37].

Based on the study of the seven AIGC subdivisions in this paper, the specific recommendations for AI educational applications cover three aspects:

First, promote the application of artificial intelligence in education by closely following the cognitive graph. The integration of artificial intelligence and education should be promoted from the standpoint of cultivating all kinds of talents at all levels and fostering cognitive development, including specific work such as quality audits of digital educational resources, evaluation of algorithm design, and online and offline cognitive tracking evaluation. ChatGPT, which is based on an information graph, and ChatGPT-like commercial applications based on event logic graphs are being optimized and will be upgraded to AIGC educational applications that closely follow the cognitive graph.

Second, highlight the scenario demand orientation of teaching, management, and evaluation. Technology development and demand-driven development complement each other, with the closed loop of teaching, management, and evaluation activities serving as the application traction. Rich, reasonable, safe, and credible scenario experiments should be designed to form a scientific exploration of human-machine collaboration that supports talent training.

Third, coordinate objective data from intelligent systems with subjective human feedback to jointly support application evolution. This means mining the objective process data generated when large models are applied; paying attention to different types of data analysis results, such as user feedback and machine-generated output; supporting research on standards for classifying large-scale model applications and for the diversified education scenarios of various disciplines at various academic stages; and supporting education administrative departments in the development and standards-based governance of intelligent applications entering the campus. The objective data in AIGC and the subjective experience of educational practitioners should be effectively coordinated to jointly promote the scientific evolution of artificial intelligence education applications.

AIGC has great potential for repositioning future forms of learning, breaking the boundaries of the perceived world, building an open and shared resource supply chain, reshaping the training process of innovative talents, and building a new form of future education. It is a key force in promoting educational equity. However, while AIGC brings opportunities, it also faces a series of challenges that require the full attention and joint efforts of technical and application stakeholders at all levels. As the field moves from responding to problems after the fact to predicting scenario needs in advance, technology and demand remain the two core intrinsic drivers of technology development. It is necessary to attend to the internal needs of educational development, provide appropriate intelligent technology support or guidance, offer differentiated "personalization + certainty" services for all kinds of teachers and students, meet the dynamic needs of each user, and help implement the education digitization strategy.

ACKNOWLEDGEMENT

This work was supported by the 2022 National Natural Science Foundation of China project "Research on key technology of generalization of human-computer collaborative learning ability based on domain adaptation algorithm" (No. 62277002).
