1 Introduction

With the development of digital media technology and new interactive technology, image transmission has made revolutionary progress in production efficiency, image performance, transmission and preservation. Image resolutions are becoming higher, digital colors brighter and wider in gamut, and shooting techniques ever more convenient. However, image art creation and public viewing methods still abide by the traditional image laws and rules, and no great change has taken place despite the emergence of new image technology. Nor has the advent of new technology changed the traditional narrative modes that form the core of traditional filmmaking. In the more than 100 years of image art’s development, narrative structures and methods have settled into a fixed pattern that is difficult to innovate in form, while the emergence of Virtual Reality technology makes a new evolution of digital image art possible.

The arrival of the Virtual Reality image era will be a major subversion of image production, no longer confined to traditional image expression and a fixed viewing position. The technological change of the VR image will break the “fourth wall” of drama described by the French Enlightenment thinker Denis Diderot. The core of this change is not only that 360-degree immersive imaging replaces the traditional planar picture, but also that the interactive function is combined with mature digital media presentation methods. At present, virtual reality interactive images are still in an initial stage: the technology is that of “new media”, but in terms of artistic output the works are still the art of “traditional media”. This is precisely the bottleneck that digital image creators face when applying new technologies to creation. What Virtual Reality image creation needs is a revolution in form and a change in the mode of production, and this change cannot stop at subverting or upgrading the “audio-visual” aspect of the VR image; the “interactive” function must find a new narrative form on this new platform. Consequently, in creating Virtual Reality interactive images, designers and creators need to deeply understand and break through three key points of film creation: the form of the script, the guidance of the image narrative, and the exploration of interactive modes.

2 Non-linear Plays and Circular Narrative Structure of Interactive Image Narrative

The script creation of traditional video plays is generally based on a single timeline, with a relatively closed structure, attaching great importance to a unified ending and emphasizing the logical relationship of cause and effect. With the rapid development of commercial films, this kind of script creation has become even more pronounced. Tracing it back to Aristotle’s “Three Unities” in Poetics [1], this narrative rule has been in use for nearly 2,000 years. It has formed a complete system of dramatic aesthetics, countless filmmakers have tested it, and audiences have been delighted by it. Throughout the development of the image, media creators have constantly challenged traditional narrative logic, proposing many breakthroughs in non-linear narrative: breaking the linear image mode, shifting the image narrative across multiple viewpoints, and using the synchronicity of space to cut the continuity of dramatic time. However, such “non-linearity” manifests more as the director’s personal narrative technique, for example inverted storytelling, fragmentary flashbacks, or changing perspectives, than as a breakthrough at the level of the play itself. Such artistic expressions usually appear only locally and occasionally, not macroscopically in the structure of the film play. On the whole, the narrative of most video works still belongs to linear narrative. The non-linear narrative proposed in this paper refers to a narrative that departs from the traditional narrative law and the classical drama law in the whole drama structure, does not follow the usual space-time sequence or logic, and instead appears in a fragmented, discrete form. Its characteristics can be summarized as a multi-perspective, non-sequential, contingent, fragmented, multi-branch interactive circular narrative structure.

As shown in Fig. 1, in traditional non-linear script creation, the choices provided at each plot point can form two or more branches, such as AB, ABC and ABCD. It should be noted, however, that each additional plot point multiplies the creator’s workload, making it grow geometrically. Therefore, on the one hand it is necessary to control the number of plot points, and on the other it is vital to merge or interrupt branch clues in time (an interrupt ends the story of that clue), so that the clues in the binary-tree structure stay within a reasonable range; a minimal data-structure sketch of this branch bookkeeping follows Fig. 1. This interactive narrative structure was used in the British interactive film Late Shift, which offers 40 plot points to choose from, and whose audience achieves a strong experience of controlling the narrative trend of the film through interactive means. However, because of the number of branches, the total length of this interactive film exceeds 240 min, far longer than any single narrative thread (Fig. 1).

Fig. 1. Dendrogram of the nonlinear narrative
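To make this branch bookkeeping concrete, the following is a minimal, engine-agnostic C++ sketch of the structure above; the names (PlotNode, clipId, countPaths) are illustrative and not taken from any particular toolchain. A node offers its choices as children, an interrupt flag ends a clue, and shared children model branches that merge back into the main plot as in Fig. 2.

```cpp
#include <memory>
#include <string>
#include <vector>

// Minimal sketch of the branching play structure (hypothetical names).
// Each plot point offers 2-4 choices; a branch either continues, merges
// back into the main plot (shared child node), or is interrupted.
struct PlotNode {
    std::string clipId;                          // video segment for this beat
    std::vector<std::shared_ptr<PlotNode>> next; // AB / ABC / ABCD choices
    bool isEnding = false;                       // interrupt: this clue ends here
};

// Count the distinct viewing paths from a node: this is the quantity
// that grows geometrically and must be contained by merging or
// interrupting branches.
int countPaths(const std::shared_ptr<PlotNode>& node) {
    if (!node || node->isEnding || node->next.empty()) return 1;
    int paths = 0;
    for (const auto& child : node->next) paths += countPaths(child);
    return paths;
}
```

Keeping countPaths small, by merging branches into shared nodes or ending them early, is exactly the containment of workload described above.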

Having understood what the non-linear narrative of traditional images is, a question arises: must every linear plot come to an end, or can the film play restart its cycle when viewers feel the storyline they selected is unsatisfying? In interactive video creation, it is difficult for viewers to accept an interrupted plot or a “dead end”. Therefore, on the basis of non-linear narration, the plot route should be designed so that each subplot returns to the main plot after its selection, allowing the development of every subplot to flow back into the main line. As shown in Fig. 2, each tree branch node is a choice node where the audience participates in the interaction, and only one branch node is the end point of the main plot. By designing a reasonable loop for the branch plots, a complete non-linear interactive narrative structure becomes an interactive circular narrative structure.

Fig. 2. Dendrogram of the nonlinear circular narrative

The plot of the suspense movie Black Hollow Cage fits the story structure of non-linear circular narration well, and viewers can make plot-affecting choices in a VR interactive environment. If a choice does not lead toward the main plot direction arranged by the director, viewers experience the plot again up to the interactive selection point; in suspense plots especially, such circular development does not bore viewers but strengthens the suspense and their dramatic participation. Another classic case of a circular plot structure is the film Accro, whose plot setting and story features are similar to Black Hollow Cage: the characters are trapped in a time loop and can only escape it by taking the right action, thus reaching the end of the main plot arranged by the director. Based on this interactive circular narrative structure, in Virtual Reality interactive image creation subplots can be different directions of the traditional story, or events experienced by different characters in different spaces during the same period. Viewers can thus choose memories, dreams, parallel timelines, spatial crossings and so on at the nodes, forming a multi-viewpoint narrative structure within a single interactive film. A viewer can enter the story from a certain point of view and has the interactive option of jumping between several points of view; the abstract narrative model is shown in Fig. 2. This interactive narrative is to some extent similar to parallel montage, and the audience’s jump between perspectives is itself a cut. The VR interactive film therefore hands the editing right to the audience instead of letting the director control everything as in traditional film; the advantage is that the audience commands the viewpoint of the film-watching experience. Shifting this control from director to viewer enhances the audience’s immersion and sense of experience, and this is the future direction of VR image development.

3 Sensory Guidance and Picture Optimization of VR Interactive Image Narrative

Whereas a traditional image is designed as a 16:9 planar picture, the visual input of a VR image is a 360-degree spherical camera. The virtual reality interactive image thus differs from the traditional image in its field of view: it requires a 360-degree spherical surrounding image, whose picture area is more than 6 times larger than that of the traditional image. Besides designing the 360-degree full-motion video itself, the creator must design gyroscope-based lens control, plus interactive triggers and trigger modes in the interactive engine. Together these achieve immersive visual effects, a free and controllable lens experience, and interactive scenarios. VR images can simulate the visual experience of the human eyes while emphasizing depth of scene, so virtual reality interactive images can create a specific visual experience and a unique display effect. However, precisely because of this fully immersive free-lens characteristic and special shooting technique, the controllability of the camera becomes weak. Excessive liberalization of the audience’s lens control makes it easy for the picture to drift away from the focus of the plot or to miss important film content emphasized by the script and the director.
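The gyroscope-based lens control just mentioned reduces, at its simplest, to mapping head yaw and pitch onto a view direction. The following is a minimal, engine-agnostic C++ sketch with illustrative names:

```cpp
#include <cmath>

// Minimal sketch of gyroscope-driven lens control: head yaw/pitch
// (radians, read from the headset's gyroscope) become a unit view
// direction used to sample the 360-degree sphere.
struct Vec3 { float x, y, z; };

Vec3 viewDirection(float yaw, float pitch) {
    return { std::cos(pitch) * std::sin(yaw),    // right component
             std::sin(pitch),                    // up component
             std::cos(pitch) * std::cos(yaw) };  // forward component
}
```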

Points of interest have always been a very important design element. In traditional image design, film artists can steer the audience’s interest through composition, character costume, scene scheduling, and so on. In a 360-degree panoramic image, however, many “hidden frames” affect the visual focus in the virtual reality environment, and the scenes and environments beyond the viewer’s horizon or main line of sight make it all the more critical to control visual guidance and place the points of interest when making VR films. To guide the audience’s visual center toward the designed “points of interest” or core picture content, the designer needs to build the image’s narrative guidance around the three human sensory systems of vision, hearing, and touch. VR designers can thus achieve sensory narrative guidance of the virtual reality interactive image through in-depth design of the core guiding points of the visual picture, the orientation guiding points of music and sound effects, and auxiliary tactile guiding points.

3.1 The Core Guiding Narration of Visual Images

As is well known, the shooting technique of 360-degree panoramic images is simple: a single panoramic camera placement suffices. However, lighting layout and scene scheduling for 360-degree full-motion video are very difficult, the sense of staging is weak, and the core role is not prominent. Even if the lighting crew arranges the scene carefully, it is extremely hard to keep lighting devices out of the audience’s view. Here we propose using the interactive engine to optimize the artistic effect of 360-degree full-motion video. Designers can edit the video in real time in the engine, enhancing visual guidance and adjusting the color, purity, brightness and contrast of the core narrative points. An even more important advantage of the engine is real-time lighting triggers that can be activated by viewers’ actions. The popular interactive engines on the market, Unreal Engine 4 and Unity3D, both offer 360-degree panoramic image input and playback, and the construction method is relatively simple and friendly: attach the photographed 360-degree full-motion video to the inner surface of a sphere in the engine, place the audience-controlled character exactly at the sphere’s center, and realize real-time turning control of the camera through the Blueprint system. This achieves 360-degree full-motion video viewing in the engine and completes the interactive input of the VR image, which is only the first step.
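The sphere projection at the heart of this setup can be sketched as the standard equirectangular lookup below (plain C++, illustrative names). It assumes the footage uses the common equirectangular format and that the direction vector is unit length:

```cpp
#include <cmath>

// Sketch of the sphere-projection step: given a unit view direction
// from the sphere's center, find the equirectangular (u, v) texture
// coordinate of the 360-degree video frame.
struct UV { float u, v; };

UV equirectangularUV(float dx, float dy, float dz) {
    const float PI = 3.14159265358979f;
    float yaw   = std::atan2(dx, dz);      // -PI..PI around the sphere
    float pitch = std::asin(dy);           // -PI/2..PI/2 up/down
    return { yaw / (2.0f * PI) + 0.5f,     // u: 0..1 across the frame
             0.5f - pitch / PI };          // v: 0..1 top to bottom
}
```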

The next step is the secondary lighting of interactive scenes inside the sphere image. There are three specific lighting types: static lighting adjustment, dynamic lighting guidance, and real-time interactive light tracking. For static lighting, visual creators arrange preset points according to the lights in the video, find the specific coordinate positions where lights appear on the sphere, and use physical lights created in the engine to relight and color-adjust the image: designing spotlights or complementary-color spotlights to guide the key decryption items in suspense films, arranging character spotlights for the core characters of the plot, adjusting the color of lights on scenes, and so on. These lights are all bound to fixed coordinate positions to relight the screen regions that need adjustment; they are not adjusted in real time as the viewer’s viewing angle changes.

Dynamic light guidance is a new lighting layout method that differs from traditional image production. By programming the lights in the interactive engine, the angle between the viewer’s visual center and the main subject of the scene is calculated, so that light color can be adjusted in real time according to that angle. Even though the viewer freely controls the 360-degree camera in the VR sphere video, interactive lights adjust color and brightness in real time in line with the field of view and focal length, simulating the real light response of the eye’s pupil; a sketch of this angle-driven adjustment follows Fig. 3. Another unique feature of interactive engine lighting is real-time dynamic tracking of light positions. The simplest and most intuitive case is the VR animated film Buggy Night. The film is a dark scene overall, and the clue running through the whole piece is a spotlight: wherever the spotlight moves, the audience actively follows it toward the intended plot direction. Although this work is not interactive, it can be called a VR image with interactive features (Fig. 3).

Fig. 3. Animation, Buggy Night, 2016
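Here is a minimal C++ sketch of the angle-driven intensity described above. It is engine-agnostic; in Unreal Engine 4 this logic would live in a light’s Blueprint or tick function, and all names and the intensity curve are illustrative assumptions:

```cpp
#include <algorithm>
#include <cmath>

// Intensity rises as the viewer's gaze approaches the subject,
// simulating the pupil's response. gaze and toSubject are unit vectors.
struct Vec3 { float x, y, z; };

float dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

float lightIntensity(const Vec3& gaze, const Vec3& toSubject,
                     float minIntensity, float maxIntensity) {
    float angle = std::acos(std::clamp(dot(gaze, toSubject), -1.0f, 1.0f));
    float t = 1.0f - angle / 3.14159265f;   // 1 when aligned, 0 when opposite
    return minIntensity + t * (maxIntensity - minIntensity);
}
```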

In addition to real-time lighting, real-time color adjustment is another advantage of interactive engine editing. The visual guidance of the VR animated image Henry is outstanding: “contrast” is frequently used to attract the audience’s attention and keep their interest on the “main line” of the story. For example, small patches of colorful moving elements are added to largely static scenes, and glowing dynamic objects are set in a dark environment. Moreover, a function called “look at” was added early in Henry’s design: the protagonist, a little hedgehog, changes his gaze direction with the viewer’s movement and sometimes produces different expressions. This function suggests a new direction for guidance, namely real-time AI color and light processing that follows the viewer’s visual point so as to achieve visual guidance.

Moreover, beyond real-time editing in post, designers can add visual obstacles, hide or weaken the parts of the VR interactive image that carry no narrative information, and emphasize a 180-degree viewing angle. The horizontal visual angle of the human eyes reaches at most about 188°, and the binocular overlap is about 124°; that is, the audience’s comfortable visual acceptance range is roughly 124–180°. In early VR panoramic image design, designers kept asking how to make the audience experience more of the 360-degree content, but it is precisely this over-quantity of visual information (a 360-degree visual image) that prevents the audience from following the story line designed by the director. Here we propose mixing 180-degree imagery into the full-motion video to aid narration: post-process the half of the image that is irrelevant to the narrative, weakening it or even placing obstructions, so that viewers focus on the useful information, the visual center of the audience is guided to where the drama develops, and a large amount of narratively useless information is blocked from filling the audience’s field of vision. In the classic VR panoramic animation Henry, the designers set the VR observation position in a small, closed indoor environment with the camera very close to a wall, so that viewers naturally turn toward the main scene with their backs to the wall; the wall leads viewers to attend to the part of the scene where the plot occurs and serves as a visual obstruction prop.
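One simple way to realize this weakening is an azimuth-based dimming mask, sketched below in plain C++. The falloff band and the 30% brightness floor are assumptions for illustration, not values from the text:

```cpp
#include <algorithm>
#include <cmath>

// Pixels whose azimuth lies within the ~124-degree comfort zone around
// the narrative forward direction stay at full brightness; beyond it
// they fade toward a dimmed floor, hiding non-narrative content.
float dimFactor(float azimuthDeg, float narrativeForwardDeg) {
    float offset = std::fabs(azimuthDeg - narrativeForwardDeg);
    if (offset > 180.0f) offset = 360.0f - offset;   // shortest angular distance
    const float fullDeg = 62.0f;   // half of the 124-degree comfort zone
    const float zeroDeg = 90.0f;   // half of the 180-degree narrative hemisphere
    float t = (offset - fullDeg) / (zeroDeg - fullDeg);
    return 1.0f - 0.7f * std::clamp(t, 0.0f, 1.0f);  // fade to 30% brightness
}
```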

3.2 Narrative Guidance of Auditory Feeling, Music and Sound Effects

The sound design of VR images is an extremely complex system. Overall sound perception includes non-directional and directional sound information, and the recording and subsequent position tracking of multi-track music, sound, dialogue, narration and other sound information must be implemented in the interactive engine. At present, Google, Valve and Facebook all provide VR audio SDKs that let developers embed 3D audio and enhance the immersive VR experience, but the production of 360-degree dynamic surround audio source files still relies on traditional techniques.

The first traditional method of producing 360-degree dynamic surround records 3D audio through “dummy head” binaural recording equipment; it depends heavily on the hardware, cannot be post-produced, and the equipment costs as much as 34,000. The second is to use an extremely complex plug-in editor (such as Panorama 5), which requires the audio editor to reconstruct the 3D scene mentally and adjust the XYZ values of the audio source by hand. Compared with the first method, the second yields richer 3D audio content, but the plug-in still demands extremely complicated work. Of course, the characters in a story move, and their sound must move with them; whether to set a different XYZ at each time node, and whether to move at constant speed or accelerate, are also worth exploring, and a sketch of such keyframe interpolation follows Fig. 4. Even when designers set it all up and check by ear, audiences cannot accurately recognize the XYZ of the sound source (Fig. 4).

Fig. 4. Graphic of 360° Audio
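The keyframed source path just discussed can be sketched as linear interpolation between time-stamped XYZ keys (plain C++, illustrative names; applying an easing curve to the factor f would give the accelerating variant). The function assumes the key list is non-empty and sorted by time:

```cpp
#include <vector>

// The editor sets XYZ at time nodes; playback interpolates between
// them at constant speed (linear interpolation).
struct AudioKey { float time, x, y, z; };

AudioKey sourceAt(const std::vector<AudioKey>& keys, float t) {
    if (t <= keys.front().time) return keys.front();
    if (t >= keys.back().time)  return keys.back();
    for (size_t i = 1; i < keys.size(); ++i) {
        if (t <= keys[i].time) {
            const AudioKey &a = keys[i - 1], &b = keys[i];
            float f = (t - a.time) / (b.time - a.time);
            return { t, a.x + f * (b.x - a.x),
                        a.y + f * (b.y - a.y),
                        a.z + f * (b.z - a.z) };
        }
    }
    return keys.back();
}
```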

Here we recommend a paradigm for editing 360-degree dynamic surround sound in real time within the VR environment, namely Sound Flare, the VR surround audio editor from Mint Muse. Sound Flare allows 3D audio content to be edited visually in VR scenes: wearing a VR headset, content producers can drag audio files directly on the software interface they see. Besides the basic functions of adjusting duration, editing and volume, the distance of a sound can be changed along its path by placing and dragging keyframes, so that moving characters stay matched with their sound effects. Each sound material is visualized as a source ball, providing intuitive editing and letting the audio editor design a playback track by dragging, without entering XYZ coordinate values. While editing audio, the VR image content can run simultaneously, so the correctness of the audio design is judged in real time; everything becomes efficient and fast, and what you see is what you get. 3D audio edited in Sound Flare can be exported and embedded into VR content to achieve spatial sound. Up to 7 audio clips can be loaded into the track at once and arranged in time by dragging them forward and backward. Each clip can be positioned in 3D space and immediately heard from that location, and its volume, mute or solo can be adjusted in the 3D audio editor. Once the audio is placed, the mix can be saved as a WAV file for listening or further editing in other software.

Mint Muse Technology has said that Sound Flare is still a very early version: at present it produces binaural 3D audio (HRTF and directional spatialization) and supports only the HTC Vive, while Ambisonics (ambient stereo) will be added in later versions to achieve fully immersive spatial sound. With this technical support in place, the designer needs a new design approach to plot guidance: in the interactive engine, the guiding sound source is bound to a fixed position on the spherical image, and the left and right volumes are adjusted in real time according to the included angle between the viewer’s visual direction and the sound source, with a smaller included angle yielding a larger volume (Fig. 5); a sketch of this angle-to-volume mapping follows the figure.

Fig. 5. Working interface of Sound Flare
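A minimal, engine-agnostic C++ sketch of that mapping (illustrative names and curves): the yaw difference between gaze and source drives both the overall volume and the left/right split.

```cpp
#include <algorithm>
#include <cmath>

// Overall volume grows as the gaze turns toward the bound source;
// the signed yaw offset pans the source between the two ears.
struct Gains { float left, right; };

// gazeYaw and sourceYaw in radians around the sphere's vertical axis.
Gains guidanceGains(float gazeYaw, float sourceYaw) {
    const float PI = 3.14159265f;
    float diff = sourceYaw - gazeYaw;
    while (diff >  PI) diff -= 2.0f * PI;            // wrap to -PI..PI
    while (diff < -PI) diff += 2.0f * PI;
    float volume = 1.0f - std::fabs(diff) / PI;      // 1 when facing the source
    float pan = std::clamp(diff / (0.5f * PI), -1.0f, 1.0f); // + means source on the right
    return { volume * 0.5f * (1.0f - pan),
             volume * 0.5f * (1.0f + pan) };
}
```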

3.3 The Assistance of Tactile Sensation in Guiding Narration

With the development of the VR market, somatosensory recognition technologies and products are constantly being innovated and updated: tactile handles, bracelets, tactile clothing, somatosensory seats, and so on. At present the four major VR headset lines, Oculus, Sony PSVR, HTC Vive and Samsung HMD, all adopt virtual reality handles as the standard interaction mode: two separate hand controllers tracked in six degrees of freedom (three rotational and three translational), with buttons and vibration feedback. Such a device is clearly intended for highly specialized game applications (and light consumer applications), which can also be read as a business strategy, since the early consumers of VR displays are basically game players. The advantage of such a highly specialized, simplified interactive device is that it can be used very freely in applications such as games; the drawback is that it cannot adapt to a wider range of application scenarios.

The types of tactile experience that can currently be realized include simulated touch, simulated contact such as wind and smoke, and rough temperature sensing, most of which are vibration-based. The most common realization technique is the vibration motor, which generates the vibration sensation familiar from ordinary interactive handles, while muscle electrical stimulation systems simulate many contact experiences through micro-currents. The VR boxing device Impacto illustrates this: Impacto combines tactile (vibration) feedback with electrical muscle stimulation, which drives muscle contraction by current, to simulate the actual feeling of a hit. The combination of the two gives people the illusion of hitting an opponent in the game, because the device produces a “sense of impact” similar to real boxing at just the right moment. However, insiders dispute this project, as the current level of biotechnology cannot use muscle electrical stimulation to simulate real feeling with high fidelity. A researcher working on a pain-relief physiotherapy instrument noted that many problems remain in simulating real sensations with electrical muscle stimulation, because the nerve channel is a delicate and complex structure and precise stimulation from outside the skin is unlikely, though it is possible to stimulate the muscles crudely to move as feedback. This paper holds that the development of VR images and the audience’s experience needs can no longer be satisfied by a simple vibrating handle: the steadily improving visual realism of VR images does not match the old vibration touch, so VR image designers must keep following the latest research and development in haptic technology (Fig. 6).

Fig. 6. VR Boxing Template, Impacto

4 Story Building, Interactive Image Operation and Selection Behavior Optimization of VR Interactive Images

4.1 Level Construction of the Story of VR Interactive Images

Unlike the post-editing of traditional digital images with effects software, a VR interactive video is no longer a simple VR player that runs a video from beginning to end while viewers merely control the camera angle. In VR interactive images, the material of the VR 360-degree panoramic video must first be spherically stitched to ensure the integrity of every frame. Mistika-SGO is a relatively mature VR stitching tool in this field and is used for 360-degree video stitching, cutting, editing and color adjustment, cutting the video into tree-structured fragments for level management. Editors need to cut the VR video into clips rich in hierarchical relationships, dividing the plot into several progressive levels around the starting point of the main plot and the director’s design. The fragmented video segments are then managed hierarchically according to the plot-tree logic diagram, the videos of the different levels are structured using the Blueprint function of the Unreal Engine 4 interactive engine, and they are connected to Unreal Engine 4’s GUI system to accept interactive click input into the tree of subplots, thus completing the interactive functions of video playback and selection in the engine. This simple change of input terminal is a great innovation in the editing method and logical composition of the entire VR image.
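A minimal C++ sketch of this level-managed playback follows (illustrative names; in practice the wiring goes through Blueprint and the GUI system). Each level holds the clips of one plot layer, and the viewer’s choice selects the clip of the next level:

```cpp
#include <string>
#include <vector>

// One plot layer of the tree: the fragment files managed at this level.
struct Level {
    std::vector<std::string> clips;
};

struct InteractivePlayer {
    std::vector<Level> levels;        // tree layers, root first
    size_t currentLevel = 0;
    std::string currentClip;

    // Called when the viewer picks branch `choice` at an interactive node.
    // Returns false when the main plot has ended or the choice is invalid.
    bool advance(size_t choice) {
        if (currentLevel + 1 >= levels.size()) return false;
        const Level& next = levels[currentLevel + 1];
        if (choice >= next.clips.size()) return false;
        currentClip = next.clips[choice];  // engine now streams this clip
        ++currentLevel;
        return true;
    }
};
```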

VR’s interactive narrative is unlike traditional timeline narrative: at an interactive node the plot pauses in a static or short looping picture. Once the audience completes a choice, the plot develops along the chosen branch, and the picture moves from the static loop into the next level of video files. The VR 360-degree Stereo Panoramic Movie Capture plug-in of the Unreal Engine 4 interactive engine is used to acquire images; VR editing is performed in the Visual Studio code editor to process the images into left-eye and right-eye pictures, and the output of the VR head-mounted glasses is set to combine them automatically into a single image projected onto a 360-degree sphere, forming a spherical player.

4.2 The Optimization of Operation and Selection Behaviors of Interactive Images

After building the hierarchical story structure of the VR virtual reality work, effective and reasonable interaction methods are needed. This article recommends using the Blueprint function of Unreal Engine 4 to add interactive triggers and selection buttons. For VR interactive images, the most common interaction is choosing how the content proceeds; the engine must recognize and respond to the audience’s selection behavior while achieving seamless video transitions that de-emphasize the act of selecting. At present the most mature and common interaction is visual direction tracing: one of the key technologies in VR is real-time tracking of vision through the gyroscope’s displacement information. The further development of this field is eye tracking. Oculus founder Palmer Luckey once called it “the heart of VR”, because detecting the position of the human eye allows the best 3D effect for the current viewing angle, makes the VR helmet display more natural with lower latency, and greatly increases playability. Moreover, because eye tracking reveals the real gaze point of the eyes, the depth of field at the viewpoint position on the virtual object can be derived. Eye tracking is therefore considered by most VR practitioners an important technological breakthrough for solving the problem of helmet vertigo in virtual reality.

In VR interactive image creation, a tracing point can be added at the viewer’s visual center, and the direction tracking of the VR helmet can control the user’s tracing-point selection in the VR image and even the character’s direction of travel. However, if direction tracking alone adjusts the travel direction, the viewer may be unable to turn fully around: users do not always sit on a swivel chair that rotates 360° and often have limited space. Many designers use other buttons on the handle to reset the origin or adjust the direction with a rocker, but the problem remains. Taking the direction of the user’s face as the walking direction matches steering with vision and greatly enhances immersion, but it increases fatigue during interaction and weakens comfort. Even so, selection by visual tracing point is the most comfortable direction-tracing selection method in current experience tests; the interactive image work Late Shift uses direction tracing points for plot selection.
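A common way to weaken the selection action with such a tracing point is gaze dwell: the option confirms only after the point rests on it briefly. The following is a minimal, engine-agnostic C++ sketch; the 1.5 s threshold is an assumed value for illustration:

```cpp
// Confirms a choice only after the viewer's tracing point has rested
// on the same option for a short dwell time.
struct GazeSelector {
    int   focusedOption = -1;   // option currently under the tracing point
    float dwellSeconds  = 0.0f;
    static constexpr float kDwellThreshold = 1.5f;  // assumed comfortable dwell

    // Call every frame. Returns the confirmed option index, or -1
    // while the viewer is still dwelling (or looking at nothing).
    int update(int optionUnderGaze, float deltaSeconds) {
        if (optionUnderGaze != focusedOption) {
            focusedOption = optionUnderGaze;    // gaze moved: restart the timer
            dwellSeconds = 0.0f;
            return -1;
        }
        if (focusedOption < 0) return -1;
        dwellSeconds += deltaSeconds;
        return dwellSeconds >= kDwellThreshold ? focusedOption : -1;
    }
};
```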

Secondly, gesture and motion tracking is an interaction mode that fits participatory behavior and weakens the selection action. Gesture tracking can be implemented in two ways: the first is optical tracking, such as the Leap Motion depth sensor; the second is wearing a data glove with sensors on the hand. The advantage of optical tracking lies in its low entry threshold and flexible scenes; users need not put anything on their hands. Integrating optical hand tracking directly on all-in-one mobile VR headsets as an interaction mode for mobile scenes is quite feasible in the future, but the disadvantage is that the sensor’s coverage area, and hence the usable space, is limited. The data glove, usually fitted with inertial sensors, tracks the movement of the user’s fingers and even the whole arm. Its advantage is that it has no field-of-view limitation and can fully integrate feedback mechanisms (such as vibration, buttons and touch) on the device. Its drawbacks are a high entry threshold, since the user must put the device on and take it off, and limited peripheral scenarios, much as a mouse is unlikely to be used in many mobile settings. However, none of these problems poses an absolute technical barrier; it is entirely conceivable that highly integrated, ring-like simplified data gloves will appear in the VR industry, carried and used at any time, which can be understood as an upgrade of handle control.

However, both techniques apply only to digital video works with particular themes, because tracking the real viewer’s gestures requires a matching virtual hand action in the virtual picture in real time, and adding a virtual body to the image can make viewers uncomfortable. For some types of work the sense of interaction is nevertheless strong: opening a door in horror films, triggering events in suspense and decryption films, simulated driving in science fiction films, and so on.

The last interaction mode, still in the exploration phase, is voice interaction. In VR, huge amounts of information flood users; they pay no attention to instructions at the visual center and look around to find and explore. Graphical instructions given at such moments would interfere with their immersive experience, so the best channel is voice, which does not disturb the image world being watched. Voice interaction with the VR world feels natural to users and is ubiquitous: there is no need to turn the head and search, and users can communicate from anywhere, in any direction.

Owing to the variety of languages and of speakers’ tone and intonation, the hardest task for the creator in building voice interaction is optimizing the voice database and entering the specific words, and the various waveform patterns of those words, needed in the VR image, so as to achieve smooth and accurate speech recognition. The practical optimization for developers is to use an open speech recognition service, such as the Xunfei (iFLYTEK) platform, Baidu Cloud or Tencent Cloud, all of which offer SDKs and REST APIs; the REST API provides the creator with a common HTTP interface for uploading a single voice file of up to 60 s. Finally, the selection is triggered at choice time through Unreal’s native C++ code, so that the speech-driven selection scenario in the virtual image can unfold.
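The last step, turning a recognized transcript into a plot selection, can be sketched as a keyword match (plain C++; keywords and the trigger hook are illustrative, and the actual recognition happens in the platform SDK or REST API, not here):

```cpp
#include <string>
#include <vector>

// Maps a transcript returned by the speech service to a branch choice.
struct VoiceBranch {
    std::vector<std::string> keywords;  // phrases that select this branch
    int branchIndex;
};

int matchBranch(const std::string& transcript,
                const std::vector<VoiceBranch>& branches) {
    for (const auto& b : branches)
        for (const auto& kw : b.keywords)
            if (transcript.find(kw) != std::string::npos)
                return b.branchIndex;   // engine then plays this subplot
    return -1;                          // no recognizable command
}
```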

5 Conclusion

As a mainstream art form, the image has never been separable from the support of technical media; looking back over its development, one can conclude that every major change in the image is closely tied to the progress and integration of technology. VR technology has now been recognized by the market and has attracted the attention of art creators in many fields. For digital image creators it is therefore necessary to explore the essential innovations brought by the new VR technology, to quantify creative ideas technically, and to sort out the production logic. Can an image that collides with “interaction” still be regarded as a movie? There is no final answer yet, but the creative exploration of image art on this new media platform has not stopped. Perhaps the result is a form between the traditional film and the game, inheriting film’s audio-visual language, narrative techniques and performance techniques while adding the audience’s interactive experience, so that the audience controls the progress and direction of the narrative to a certain extent. The author believes future image creators will work out the radically innovative modes and standards of image creation for this new era.