1 Introduction

The rapid development of the Internet has led to the emergence of Web2.0, whose main feature is its emphasis on individualization, and user-generated content (UGC) has grown alongside it. As a result, the data on the Internet are growing exponentially, and users increasingly depend on the Internet to obtain information. Personalized recommendation systems emerged to help users cope with this volume of information. However, existing information recommendation systems are limited to the content of the information itself and ignore differences in its emotional expression.

This research aims to propose an approach to the emotional semantic classification and recommendation of information based on an emotional cognitive model. Taking news Internet products as an example, it uses variable control to explore the differences between different emotional expressions of information with the same content. Based on the differences found, the researchers propose a quantitative coordinates of emotional cognition system.

The quantitative coordinates of emotional cognition system constructed in this research integrates users’ differences in emotional expression and cognition into the information recommendation mechanism in a quantifiable way.

This research proposes an optimization mechanism based on the quantitative coordinates of emotional cognition system. The optimization mechanism has the following advantages:

  • It integrates users’ subjective evaluation of information expression into the mechanism, which fills a gap and further enriches the information recommendation mechanism.

  • It brings differences in users’ emotional expression and cognition into the information recommendation system. These factors make the recommendation system more comprehensive and diversified, diversify the similarity analysis between users, and ultimately make personalized recommendations more precise and effective.

  • It adds emotional factors to the information recommendation system, which contributes data to the Internet along another dimension and expands the value of that data.

The rest of the paper is organized as follows. Section 2 presents an overview of emotion, UGC and intelligent recommendation systems and their current state. Section 3 uses news Internet products as an example to explore the differences between different emotional expressions of information with the same content through variable control, sample association analysis and statistical data analysis. Section 4 presents the information recommendation optimization mechanism based on the quantitative coordinates of emotional cognition system. Section 5 presents a verification experiment showing that the mechanism is practical and effective for information classification and recommendation. Section 6 gives the summary and outlook.

2 Desktop Research

Emotion.

Emotion refers to the subjective feelings or experiences of the individual [1]. Emotional experience refers to the individual’s subjective experience of emotion [2]. Emotion is a part of attitude and is consistent with the inner feelings and intentions in attitude. It is a relatively complex and stable physiological evaluation and experience [3, 4]. For the same thing, different people have different emotional feedback and experience [2].

UGC.

With the development of the Internet, the interactive role of Internet users has become more prominent. The user is not only a browser of online content but also a creator of it [5].

UGC (User Generated Content) arose with the concept of Web2.0, whose main feature is its emphasis on personalization. It is not a specific business but a new way for users to use the Internet, shifting from downloading only to both downloading and uploading [6, 7].

Recommendation System.

A recommendation system is a subclass of information filtering system that seeks to predict the “rating” or “preference” that a user would give to an item [8, 9]. It is a product of the development of the Internet and e-commerce, and serves as a high-level business intelligence platform based on massive data mining, providing personalized information services and decision support to customers [10]. In recent years, many successful examples of large-scale recommender systems have emerged. For example, MovieLens recommends movies for users [11], Amazon recommends books, audio-visual resources and other products for users [12], and VERSIFI and TOUTIAO recommend news for users [13]. Meanwhile, personalized recommendation systems have gradually become a research hotspot in academia [14, 15, 16, 17] (Fig. 1).

Fig. 1. Intelligent recommendation system

Current research on personalized intelligent recommendation systems is limited to users’ objective data and subjective behavior data, and to the topic classification of the content itself [18].

  • From the user side, personalized recommendation is based on user behavior (likes, comments and so on), user relationships (common friends, etc.), user interests, etc.

  • From the content side, personalized recommendation is based on related content (keyword association, same topic), popular content (hot content recommendation) and so on.

However, research on users’ emotional cognition is still lacking. Therefore, the main purpose of this study is to explore the relationship between different expressions of the same content and the different emotional cognition of users.

3 Exploratory Research

Intelligent recommendation systems are applied in many scenarios, such as e-commerce (Amazon, Taobao), multimedia software (MovieLens) and news apps (Versifi, Toutiao).

This paper studies the relationship between content and users’ emotional cognition, so we chose news apps as the research object for the following reasons:

  • Personalized intelligent recommendation systems are widely used in news apps, and news apps consist mainly of text content, which makes them representative.

  • Because the main product of a news app is text content, it is convenient for studying the relationship between the content and the user’s emotional cognition.

  • News apps are now widely used in daily life, so the results are practical and extensible.

  • Data in news apps are rich and easy to obtain.

3.1 Data Collection

News in news apps is classified into categories such as sports, society, technology, military, health, education, women, real estate, culture, automobile, finance and economics, international, entertainment, food and tourism.

The researchers chose three of these categories: sports, technology and society, each of which contains a large amount of news and has a low correlation with the others. For each category, the researchers collected 20 news items that express the same content in different ways. The data come from common Chinese news apps and the Internet.

3.2 Sample Process

The researchers processed the 60 collected news items by extracting their text content (including the title) and printing each on the same white paper with a uniform font, font size and word spacing. They then numbered the items on the back of each sheet, A1 to A20 (Technology), B1 to B20 (Sports) and C1 to C20 (Society), and used them as the experimental samples for the following steps. A photo of the processed samples is shown in Fig. 2.

Fig. 2. Photo of the processed samples

3.3 Correlation Analysis

Subsequently, the researchers carried out a correlation analysis for each kind of news (20 samples each) and examined the results in the derived Euclidean distance model, in order to identify the key factors behind differences in users’ cognition and emotion.

Correlation Score

Under the hypothesis that the samples collected for each kind of news are extensive and representative, three professionals were invited to grade the correlation between every pair of the 20 samples in each kind of news. Scores are on a 9-point scale ranging from ‘significant negative correlation’ to ‘significant positive correlation’ [19]. The scoring criteria are based on the professionals’ understanding and cognition of each news content. Part of the scoring result is shown in Fig. 3.

Fig. 3. Partial screenshot of the sample correlation scores

Multidimensional Scaling Analysis

The correlations in the 3 groups were each scored by 3 professionals, giving 9 sets of correlation scores. The researchers labeled these data sets Group1-A, Group1-B, Group1-C, Group2-A, Group2-B, Group2-C, Group3-A, Group3-B and Group3-C.

SPSS was used to analyze the sample correlation matrices through multidimensional scaling analysis and factor analysis. The reliability analysis results are as follows (Fig. 4).

Fig. 4. (a) Reliability statistics of professional Group1-A; (b) reliability statistics of professional Group1-B; (c) reliability statistics of professional Group1-C; (d) reliability statistics of professional Group2-A; (e) reliability statistics of professional Group2-B; (f) reliability statistics of professional Group2-C; (g) reliability statistics of professional Group3-A; (h) reliability statistics of professional Group3-B; (i) reliability statistics of professional Group3-C.

From the three sets of correlation scores for each kind of news, the researchers selected the two with the highest reliability as the subjects of the follow-up experiment: Group1-B, Group2-A, Group2-B, Group2-C, Group3-A and Group3-C. The reliabilities of the 6 selected groups were all above 0.7. According to the prevailing view on SPSS reliability analysis, a reliability coefficient of 0.7 or above means that the data may need some revision but are still of value [20, 21]. On this basis, the researchers continued to analyze the data by multidimensional scaling analysis.

An N-dimensional space is built from the correlation coefficients in the matrix, and the Euclidean distance formula (1), written here for three dimensions, is used to calculate the spatial distance between two samples. The closer two samples are, the more similar they can be considered.

$$ {\text{Euclid}}\left( {1,2} \right) = \sqrt[2]{{\left( {{\text{x}}_{1} - {\text{x}}_{2} } \right)^{2} + \left( {{\text{y}}_{1} - {\text{y}}_{2} } \right)^{2} + \left( {{\text{z}}_{1} - {\text{z}}_{2} } \right)^{2} }} $$
(1)
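
As a minimal sketch of formula (1), the distance between two samples can be computed for any number of dimensions as follows. The function name and the coordinate values are illustrative assumptions and do not come from the study's data.

```python
import math

def euclidean_distance(p, q):
    """Euclidean distance between two samples given as equal-length coordinate tuples."""
    if len(p) != len(q):
        raise ValueError("samples must have the same number of dimensions")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Hypothetical 3-D coordinates for two news samples (illustrative values only).
sample_1 = (1.0, 2.0, 2.0)
sample_2 = (0.0, 0.0, 0.0)
print(euclidean_distance(sample_1, sample_2))  # 3.0
```

A smaller value means the two samples lie closer together in the space and can be considered more similar.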

Multidimensional scaling analysis makes the spatial distribution of all samples visible. The results of the multidimensional scaling analysis are as follows (Fig. 5).

Fig. 5. (a) Multidimensional scaling analysis of Group2-A; (b) multidimensional scaling analysis of Group1-B; (c) multidimensional scaling analysis of Group2-C; (d) multidimensional scaling analysis of Group3-A; (e) multidimensional scaling analysis of Group2-B; (f) multidimensional scaling analysis of Group3-C.

Because the selected samples differ only in how the news is expressed, and in light of the multidimensional scaling results, the researchers conjectured that the main factors affecting users’ cognitive differences lie along two dimensions: emotional (more popular and entertaining in expression) versus rational (more rigorous and official in expression), and simple versus rich (the latter uses more related content to support the same content).

4 Quantitative Coordinates of Emotional Cognition System

The current personalized recommendation systems in news apps rely on users’ objective data and subjective behavioral data to judge and recommend the topics and columns that users are interested in.

However, through the exploratory experiments above, the researchers found that different expressions of the same news content can cause substantial emotional and cognitive differences among users.

At the same time, after text analysis, the researchers conjectured that the main factors affecting users’ cognitive differences lie along two dimensions: emotional (more popular and entertaining in expression) versus rational (more rigorous and official in expression), and simple versus rich (the latter uses more related content to support the same content).

Therefore, the quantitative coordinates of emotional cognition system is put forward to optimize the personalized recommendation system in news apps, making it more targeted and accurate and improving users’ cognitive efficiency and experience when using the product. The system is shown in Fig. 6.

Fig. 6. Quantitative coordinates of emotional cognition system

4.1 News Data

Editor’s Upload.

Traditionally, when an editor uploads a news item, the editor uploads the content and chooses the section of the news. In this system, the editor also needs to score the news on the emotional-rational and simple-rich dimensions to obtain its coordinates of emotional cognition. A news item then has three main parameters: content, section tag and coordinates of emotional cognition. The initial coordinates of emotional cognition of news item i are denoted Ci0.

User’s Feedback to Optimize.

After users read the news, some of them are randomly selected to score the news on the emotional-rational and simple-rich dimensions, yielding a set of coordinates of emotional cognition C = {Ci1, Ci2, …, Cin}. The scores are then processed according to the relevance between the users’ behavior and the news, as follows (Fig. 7).

Fig. 7. Relevance of users’ behavior and the news

The processed scores are then combined with the original coordinates of emotional cognition.

As users continue to read and score news, the coordinates of emotional cognition of the news are continually adjusted and optimized.
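
The paper does not give an explicit update rule, so the following is only a sketch of one possible way to combine the editor's initial coordinates Ci0 with user scores. It assumes a weighted average in which each user score carries a relevance weight between 0 and 1 derived from that user's behavior. The function name, the weights and the numeric values are all hypothetical.

```python
def update_coordinates(initial, user_scores, editor_weight=1.0):
    """Blend the editor's initial coordinates with relevance-weighted user scores.

    initial      -- (x, y) coordinates C_i0 assigned by the editor
    user_scores  -- list of ((x, y), relevance) pairs; relevance is a
                    hypothetical weight in [0, 1] derived from user behavior
    editor_weight -- assumed weight of the editor's own score
    """
    total_weight = editor_weight
    x_sum = editor_weight * initial[0]
    y_sum = editor_weight * initial[1]
    for (x, y), relevance in user_scores:
        x_sum += relevance * x
        y_sum += relevance * y
        total_weight += relevance
    return (x_sum / total_weight, y_sum / total_weight)

# Illustrative values: x is the emotional-rational axis, y the simple-rich axis.
c_i0 = (2.0, -1.0)
feedback = [((3.0, 0.0), 1.0), ((1.0, -2.0), 0.5)]
print(update_coordinates(c_i0, feedback))
```

Under this assumption, scores from users whose behavior is more relevant to the news item move the coordinates further than scores from less relevant users.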

4.2 User Data

Basic Information.

Users need to enter some basic information when they register in a news app, such as sex, age, hobbies, etc.

Dynamic Information.

In addition to the basic information, technologies that have developed alongside the Internet, such as the global positioning system, caching, cloud computing and big data, can automatically obtain objective dynamic information about the user. For example, the global positioning system provides the user’s dynamic location, which can be used to recommend local news.

Behavior Data.

When users use a news app, they produce a large amount of behavior data that is correlated with the news, as shown in Fig. 7.

From the relevance between users and news, we can identify the news that users are interested in, based on the three main parameters of the news mentioned above.

In addition, user behavior data reveal further details, for example whether a user is interested in different kinds of news at different times.

4.3 Personalized Recommendation

User Similarity Computing.

User similarity computing plays a very important role in collaborative filtering systems, user recommendation systems as well as social network services [22].

The input is a score matrix of m users for n news items. From user ratings we can find neighbor users whose interests are similar to those of the current user, and then recommend new news resources to the user according to these neighbors. The method for finding neighbor users is user similarity computing. Common similarity measures include the Pearson correlation coefficient and cosine similarity.
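
The neighbor-based procedure described above can be sketched as follows. This is an illustrative example and not the paper's implementation: it assumes cosine similarity over rating dictionaries (unrated items count as 0), and it scores unseen news by the similarity-weighted ratings of the k nearest neighbors. The user names, news identifiers and ratings are hypothetical.

```python
import math

def cosine(r_a, r_u):
    """Cosine similarity between two rating dicts; unrated items count as 0."""
    dot = sum(r_a[i] * r_u[i] for i in set(r_a) & set(r_u))
    norm_a = math.sqrt(sum(v * v for v in r_a.values()))
    norm_u = math.sqrt(sum(v * v for v in r_u.values()))
    return dot / (norm_a * norm_u) if norm_a and norm_u else 0.0

def recommend(ratings, target, k=2, top_n=3):
    """Rank news the target user has not rated, using the k most similar users."""
    sims = [(cosine(ratings[target], r), user)
            for user, r in ratings.items() if user != target]
    neighbors = sorted(sims, reverse=True)[:k]
    scores = {}
    for sim, user in neighbors:
        for item, rating in ratings[user].items():
            if item not in ratings[target]:
                # Weight each neighbor's rating by that neighbor's similarity.
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical user -> {news id: rating} data.
ratings = {
    "alice": {"n1": 5, "n2": 4},
    "bob":   {"n1": 5, "n2": 4, "n3": 5},
    "carol": {"n1": 1, "n4": 5},
}
print(recommend(ratings, "alice", k=1))  # ['n3']
```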

Pearson correlation coefficient. Let \( I_{au} \) denote the set of items rated by both user a and user u. The Pearson correlation coefficient measures the similarity between a and u over this set:

$$ {\text{sim}}\left( {{\text{a}},{\text{u}}} \right) = \frac{{\mathop \sum \nolimits_{{i \in I_{au} }} \left( {r_{a,i} - \overline{{r_{a} }} } \right)\left( {r_{u,i} - \overline{{r_{u} }} } \right)}}{{\sqrt {\mathop \sum \nolimits_{{i \in I_{au} }} \left( {r_{a,i} - \overline{{r_{a} }} } \right)^{2} \mathop \sum \nolimits_{{i \in I_{au} }} \left( {r_{u,i} - \overline{{r_{u} }} } \right)^{2} } }} $$
(2)

Here, \( r_{a,i} \) and \( r_{u,i} \) denote the ratings of item i by users a and u respectively, and \( \overline{{r_{a} }} \) and \( \overline{{r_{u} }} \) denote the average ratings given by users a and u.
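
A minimal sketch of formula (2) is given below. It assumes ratings are stored as dictionaries from item identifier to score, and it takes each user's mean over the co-rated items only, which is one common convention (the text does not say which mean is intended). The function name and sample ratings are illustrative.

```python
import math

def pearson_similarity(ratings_a, ratings_u):
    """Pearson correlation over the items both users rated (the set I_au).

    ratings_a, ratings_u -- dicts mapping item id -> rating.
    Means are taken over the co-rated items (an assumption).
    Returns 0.0 when the correlation is undefined.
    """
    common = sorted(set(ratings_a) & set(ratings_u))
    if len(common) < 2:
        return 0.0
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_u = sum(ratings_u[i] for i in common) / len(common)
    num = sum((ratings_a[i] - mean_a) * (ratings_u[i] - mean_u) for i in common)
    den = math.sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common)
                    * sum((ratings_u[i] - mean_u) ** 2 for i in common))
    return num / den if den else 0.0

# Hypothetical ratings: user u rates every co-rated item twice as high as user a.
user_a = {"n1": 1, "n2": 2, "n3": 3}
user_u = {"n1": 2, "n2": 4, "n3": 6, "n4": 1}
print(pearson_similarity(user_a, user_u))  # 1.0
```

Because the measure subtracts each user's mean, two users who rank items in the same way receive a high similarity even if one of them rates consistently higher.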

Cosine similarity. A user’s scores for n items are regarded as a scoring vector in the n-dimensional item space. \( \vec{a} \) and \( \vec{u} \) denote the scoring vectors of users a and u respectively. The similarity between users is then measured by the cosine of the angle between their scoring vectors.

$$ {\text{sim}}\left( {{\text{a}},{\text{u}}} \right) = \cos \left( {\vec{a},\vec{u}} \right) = \frac{{\vec{a} \cdot \vec{u}}}{{\left\| {\vec{a}} \right\| \left\| {\vec{u}} \right\|}} $$
(3)
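
Cosine similarity, that is, the dot product of the two scoring vectors divided by the product of their norms, can be sketched as follows. The function name and vectors are illustrative assumptions. Returning 0.0 for an all-zero vector is a convention chosen here, since the cosine is undefined in that case.

```python
import math

def cosine_similarity(vec_a, vec_u):
    """Cosine of the angle between two equal-length scoring vectors."""
    if len(vec_a) != len(vec_u):
        raise ValueError("scoring vectors must have the same length")
    dot = sum(x * y for x, y in zip(vec_a, vec_u))
    norm_a = math.sqrt(sum(x * x for x in vec_a))
    norm_u = math.sqrt(sum(y * y for y in vec_u))
    if norm_a == 0 or norm_u == 0:
        return 0.0  # convention for an all-zero vector (assumption)
    return dot / (norm_a * norm_u)

# Hypothetical scoring vectors pointing in the same direction.
print(cosine_similarity([3, 4, 0], [6, 8, 0]))  # 1.0
```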

Computing user similarity not only helps recommend new news resources to users, but also helps optimize user data and news data, enriching the data of the whole recommendation mechanism.

The Match Between the News Data and the User Data.

News is recommended to the user based on the match between user data and news data, which is the most common form of recommendation system. In the mechanism described above, the match uses both the section tag and the emotion score, which should make the recommendation more accurate.
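
One possible reading of this matching step is sketched below. It is not taken from the paper. It assumes each news item carries a section tag and two-dimensional coordinates of emotional cognition, and that a user profile holds preferred sections and preferred coordinates (for instance, averaged from news the user engaged with). The field names `id`, `section` and `coords`, and all values, are hypothetical.

```python
import math

def rank_news(news_items, preferred_sections, preferred_coords):
    """Keep news in the user's preferred sections, nearest coordinates first."""
    candidates = [n for n in news_items if n["section"] in preferred_sections]
    # math.dist gives the Euclidean distance between two points (Python 3.8+).
    return sorted(candidates,
                  key=lambda n: math.dist(n["coords"], preferred_coords))

# Hypothetical news items with section tags and emotional-cognition coordinates.
news = [
    {"id": "A1", "section": "technology", "coords": (2.0, 1.0)},
    {"id": "A2", "section": "technology", "coords": (-2.0, -1.0)},
    {"id": "B1", "section": "sports", "coords": (2.0, 1.0)},
]
ranked = rank_news(news, {"technology"}, (1.5, 1.0))
print([n["id"] for n in ranked])  # ['A1', 'A2']
```

In this sketch the section tag acts as a filter and the emotion coordinates act as a ranking signal. Other combinations, such as a weighted score over both, would be equally compatible with the description.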

5 Experimental Verification

5.1 Experiment Setting

The researchers conjectured that the main factors affecting users’ cognitive differences lie along two dimensions: emotional (more popular and entertaining in expression) versus rational (more rigorous and official in expression), and simple versus rich (the latter uses more related content to support the same content).

In this section, the same professionals used a Likert scale to quantify the emotional coordinates of the samples. The researchers also classified the samples using the sample correlation scores above. The classification results are compared with the emotional coordinates of the samples to see whether they are highly consistent, which would verify the conjecture described in the previous section.

5.2 Experiment Process

Likert Scale.

Likert scale is a psychometric scale commonly involved in research that employs questionnaires. It is the most widely used approach to scaling responses in survey research, such that the term (or more accurately the Likert-type scale) is often used interchangeably with rating scale [23].

To verify the conjecture described in the previous section, the researchers designed the following 7-point Likert scale (Fig. 8).

Fig. 8. 7-point Likert scale

Then the professionals scored the same samples on the Likert scale.

Based on the professionals’ scores, the researchers plotted the coordinates of the scoring results for the 6 groups (Fig. 9).

Fig. 9. (a) Emotional coordinates of Group2-A; (b) emotional coordinates of Group1-B; (c) emotional coordinates of Group2-C; (d) emotional coordinates of Group3-A; (e) emotional coordinates of Group2-B; (f) emotional coordinates of Group3-C.

Using dendrograms based on the sample correlation scores, the researchers clustered the samples as follows (Fig. 10).

Fig. 10. (a) Dendrogram and sample clustering of Group2-A; (b) dendrogram and sample clustering of Group1-B; (c) dendrogram and sample clustering of Group2-C; (d) dendrogram and sample clustering of Group3-A; (e) dendrogram and sample clustering of Group2-B; (f) dendrogram and sample clustering of Group3-C.
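
The clustering behind a dendrogram can be illustrated with a small agglomerative procedure. The paper does not state which linkage method was used, so this sketch assumes single linkage (the distance between two clusters is the smallest pairwise distance between their members). The function name and the distance matrix are hypothetical.

```python
def agglomerative_clusters(dist, n_clusters):
    """Merge the two closest clusters (single linkage) until n_clusters remain.

    dist -- symmetric matrix (list of lists) of pairwise sample distances.
    Returns a list of clusters, each a sorted list of sample indices.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: smallest distance between any two members.
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters

# Hypothetical distances between four samples: 0 and 1 are close, 2 and 3 are close.
dist = [
    [0.0, 1.0, 6.0, 7.0],
    [1.0, 0.0, 5.0, 6.0],
    [6.0, 5.0, 0.0, 2.0],
    [7.0, 6.0, 2.0, 0.0],
]
print(agglomerative_clusters(dist, 2))  # [[0, 1], [2, 3]]
```

Recording the distance at which each merge happens would give the heights of the dendrogram branches. Cutting the tree at a chosen height corresponds to stopping at a given number of clusters.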

The multidimensional scaling results with clustering are shown in Fig. 11.

Fig. 11. (a) Multidimensional scaling with clustering of Group2-A; (b) multidimensional scaling with clustering of Group1-B; (c) multidimensional scaling with clustering of Group2-C; (d) multidimensional scaling with clustering of Group3-A; (e) multidimensional scaling with clustering of Group2-B; (f) multidimensional scaling with clustering of Group3-C.

The clustering results were then overlaid on the emotional coordinates obtained above, as follows (Fig. 12).

Fig. 12. (a) Emotional coordinates with clustering of Group2-A; (b) emotional coordinates with clustering of Group1-B; (c) emotional coordinates with clustering of Group2-C; (d) emotional coordinates with clustering of Group3-A; (e) emotional coordinates with clustering of Group2-B; (f) emotional coordinates with clustering of Group3-C.

The researchers then compared the emotional coordinates with the clustering results based on the sample correlation scores and with the multidimensional scaling figures with clustering.

There is a high degree of consistency between them, which supports the view that the system described above is practical and effective for information ranking and recommendation. It is therefore reasonable to use the emotional coordinates to optimize information recommendation.

6 Conclusion

This paper presents an initial design of an information recommendation optimization mechanism based on the quantitative coordinates of emotional cognition system, developed through variable control, sample association analysis and statistical data analysis. Its core idea is to integrate users’ emotional expression and cognition into the original information recommendation mechanism in order to optimize it.

At the same time, drawing on behavioral psychology, users’ behaviors provide abundant data for the modeling and make the model more representative. In addition, users’ subjective evaluation of information expression is integrated into the mechanism, which fills a gap and further enriches the information recommendation mechanism.

However, the factors that affect users’ cognition and efficiency of use are likely to be multi-dimensional at the level of emotional expression and cognition, not just the two dimensions shown in the quantitative coordinates of emotional cognition system. Therefore, more samples covering various fields and topics, and more subjects, including both professionals and ordinary users, are needed to identify the multidimensional factors more accurately.

The theory of this research can foreseeably be applied to other Internet products in areas such as business and social communication. The research methods and results may also be applied in psychology, sociology and other fields, where they can play a guiding and testing role.