1 Introduction

The Japanese government announced that digital textbooks will be introduced in K12 schools by 2020 [1]. The policy focuses on introducing digital textbook technologies. Digital textbooks are being introduced for the following reasons:

  (1) Interactive multimedia: Digital textbooks can respond to learners’ actions by presenting embedded content such as text, images, animations, video, audio, and video games.

  (2) Portability: Paper-based textbooks can be bulky and heavy, whereas digital textbooks can be read on a mobile device, giving learners access to more than a hundred textbooks.

  (3) Functionality: Learners can use functions such as highlighting, bookmarks, zoom-in, zoom-out, and copy and paste.

For the reasons stated above, the majority of educational institutions have introduced digital textbook systems in schools and universities, but little attention has been paid to analyzing and visualizing the collected logs to improve learning and teaching.

To tackle this issue, Ogata et al. [2] introduced a digital textbook system in universities and analyzed the logs it collected. These logs reveal which textbook pages learners browsed, but not which zones within a page they browsed. Mouri et al. [3, 4] proposed a scratching method to collect this zone-level information. The method detects each section of a slide in a digital textbook and hides it with a mask. The learner then clicks the masks one by one to remove them while browsing the textbook contents. By recording these clicks, the method can collect information on the zones browsed by the learner. However, this method was found to reduce learning achievement and system usability because a large number of zones were hidden.

Therefore, we propose a grouping method based on the layout information of the slides in order to identify the appropriate zones to hide with masks. The rest of this paper is structured as follows. Section 2 describes previous work on digital textbook systems. Section 3 describes the proposed system. Finally, Sect. 4 presents our conclusions and future work.

2 Digital Textbook Systems

In many countries, policies on digital textbooks focus only on introducing the technology to K12 schools [5,6,7]. For example, the digital textbook system BookLooper was developed by Kyocera Maruzen Systems Integration Co., Ltd [8]. The system provides a cloud service in which the digital textbooks are managed under digital rights management, and learners can download the digital textbooks using BookLooper. However, because the company manages the collected logs, an analyst who wants to analyze the log data must first download them, which makes real-time analysis difficult. If the analyst could analyze the logs in real time, it would be possible to support each learner in accordance with his/her browsing status.

To solve the problem, Mouri et al. [3, 4] developed a digital textbook system called SEA (Smart E-textbook Application) to collect the digital textbook logs. Figure 1 shows the interface of SEA.

Fig. 1. Interface of SEA [4]

A learner can read the digital textbooks in a web browser at any time and place. The system supports six functions: Next, Prev, Bookmark, Highlight, Memo, and Search. Clicking the Bookmark button saves the current page as a favorite for easier future access. When the learner finds words, phrases, or sentences important, s/he can Highlight them. When the learner clicks the Memo button, s/he can post a note about the target words.

Unlike previous digital textbook systems [3, 4, 9,10,11], SEA automatically hides the text in the digital textbooks with masks before the learner browses it. If the learner clicks on a masked text, the system removes the mask and the text appears. At that moment, the system records the x- and y-coordinates of the click position. From these coordinates, the system can identify which zone of the page the learner was browsing. Table 1 shows the differences between the previous digital textbook systems and SEA.

Table 1. Difference between previous digital textbook systems and SEA
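
The click-to-zone lookup described in this section can be sketched as follows. This is a minimal illustration and not SEA's actual implementation: the `Mask` class, the `zone_at` function, and the assumption that every masked zone is an axis-aligned rectangle are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Mask:
    """Hypothetical masked zone: an axis-aligned rectangle on the page."""
    zone_id: int
    x1: int  # left edge
    y1: int  # top edge
    x2: int  # right edge
    y2: int  # bottom edge

    def contains(self, x: int, y: int) -> bool:
        # Edges are treated as part of the mask (an assumption).
        return self.x1 <= x <= self.x2 and self.y1 <= y <= self.y2


def zone_at(masks: Sequence[Mask], x: int, y: int) -> Optional[int]:
    """Return the id of the first mask containing the click, or None."""
    for mask in masks:
        if mask.contains(x, y):
            return mask.zone_id
    return None


# Two masks stacked vertically; a click at (40, 35) falls in the second one.
masks = [Mask(1, 0, 0, 100, 20), Mask(2, 0, 30, 100, 50)]
clicked = zone_at(masks, 40, 35)
```

A click that lands between masks returns `None`, which a logger could record as a click outside every zone.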

3 Proposed System

3.1 Overview

In this section, we propose a system for grouping texts in the slide based on the layout information of the slide in order to improve scratching. Figure 2 shows an overview of our system.

  1. Input. Teachers prepare lecture slides with PowerPoint. When the slides are uploaded to SEA (the digital textbook system), SEA gathers the character area information and the slide images. The details of the input are described in Sect. 3.2.

  2. System. Our system executes text processing and image classification.

    a. Text processing consists of two processes: (1) Single-line processing and (2) Multiple-line processing. The proposed system determines the relations among texts through (1) and (2). The details of these processes are described in Sect. 3.3.

    b. To process the image, the system extracts shapes from the slide image and detects the corners of each shape. Based on the positions of the corners, the system detects arrows and balloons. The details of the image processing are described in Sect. 3.3.

  3. Output. The system groups texts based on the results of 2.a and 2.b. The output is shown to teachers, who confirm the relations between texts in a slide and can apply scratching to the important areas of the slide.

Fig. 2. Overview of our system

3.2 Input

We used two types of input data: slide images and information of the character areas. A slide image includes sentences, images and shape objects. As shown in Fig. 3, the information of the character area consists of a single character and the upper-left and bottom-right coordinates (x1, y1), (x2, y2) of the rectangle surrounding the character in the slide image.

Fig. 3. Information of a character area
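
One possible in-memory representation of a character area is sketched below. The class name `CharArea` and its field names are assumptions made for illustration. The source only states that each record holds a single character and the two corner coordinates (x1, y1) and (x2, y2).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharArea:
    """One character and its bounding rectangle in slide-image pixels."""
    char: str
    x1: int  # upper-left x
    y1: int  # upper-left y
    x2: int  # bottom-right x
    y2: int  # bottom-right y

    @property
    def width(self) -> int:
        return self.x2 - self.x1

    @property
    def height(self) -> int:
        return self.y2 - self.y1


# Example record with made-up coordinates.
area = CharArea("A", 10, 20, 22, 40)
```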

3.3 System

Processing the Information of Character Areas

Integration processing consists of (a) Single-line processing and (b) Multiple-line processing.

  (a) Single-line processing

Figure 4 shows the Single-line processing.

Fig. 4. Single-line processing

  (1) In the slide image, search for the top-left character area that is not yet included in any text area.

  (2) If such a character area exists, create a new text area that consists of that character area, together with a search window whose size is identical to that of the character area.

  (3) Place the search window so that its top-left point coincides with the top-right point of the current text area.

  (4) If the top-left point of another character area lies inside the search window, extend the text area so that it contains that character area and go back to (3). Otherwise, the Single-line processing terminates.
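
Steps (1) to (4) can be sketched as follows. This is an illustrative reading, not the authors' code. It assumes that "top-left" means ordering by (y1, x1), that the window edges count as inside, and that the procedure restarts from step (1) until every character belongs to a text area. The names `Box` and `single_line_areas` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Box:
    x1: int
    y1: int
    x2: int
    y2: int


def single_line_areas(chars: List[Box]) -> List[Box]:
    """Merge horizontally adjacent character areas into text areas."""
    # Assumed ordering: the top-left character comes first.
    remaining = sorted(chars, key=lambda c: (c.y1, c.x1))
    areas: List[Box] = []
    while remaining:
        # Steps (1)-(2): seed a text area; the window takes the seed's size.
        seed = remaining.pop(0)
        win_w, win_h = seed.x2 - seed.x1, seed.y2 - seed.y1
        area = seed
        while True:
            # Step (3): window top-left sits on the area's top-right point.
            wx1, wy1 = area.x2, area.y1
            wx2, wy2 = wx1 + win_w, wy1 + win_h
            # Step (4): look for a character whose top-left is in the window.
            hit = next((c for c in remaining
                        if wx1 <= c.x1 <= wx2 and wy1 <= c.y1 <= wy2), None)
            if hit is None:
                break
            remaining.remove(hit)
            area = Box(min(area.x1, hit.x1), min(area.y1, hit.y1),
                       max(area.x2, hit.x2), max(area.y2, hit.y2))
        areas.append(area)
    return areas


# Three adjacent characters, one distant character, and a second line.
chars = [Box(0, 0, 10, 10), Box(11, 0, 21, 10), Box(22, 0, 32, 10),
         Box(100, 0, 110, 10), Box(0, 20, 10, 30), Box(12, 20, 22, 30)]
lines = single_line_areas(chars)
```

Because the window is only one character wide, the distant character at x = 100 starts its own text area instead of joining the first line.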

  (b) Multiple-line processing

This process decides whether vertically adjacent text areas are related. Figures 5 and 6 show the Multiple-line processing.

Fig. 5. Multiple-line processing

Fig. 6. Itemized multiline processing

Multiple-line processing (Fig. 5):

  (1) If there are two vertically adjacent text areas as shown in Fig. 5(1), select the upper text area.

  (2) Place the search window so that its top-left point coincides with the bottom-left point of the upper text area.

  (3) Note that the size of the search window is identical to the size of the character area of the top-left character of the upper text area.

  (4) Search for a text area below the upper text area. If an item symbol is included in the search window, the lower text area is judged not to belong to the upper text area. If no text area is detected, the processing terminates.

  (5) Otherwise, that is, if a text area is detected, integrate the upper and lower text areas into a new text area, and go back to (1).

Itemized multiple-line processing (Fig. 6):

  (1) If there are two vertically adjacent text areas as shown in Fig. 6(1), select the upper text area.

  (2) Place the search window so that its top-left point coincides with the bottom-right point of the top-left character of the upper text area.

  (3) Note that the size of the search window is identical to the size of the character area of the top-left character of the upper text area.

  (4) Search for a text area below the upper text area. If an item symbol is included in the search window, the lower text area is judged not to belong to the upper text area. If no text area is detected, the processing terminates.

  (5) Otherwise, that is, if a text area is detected, integrate the upper and lower text areas into a new text area, and go back to (1).
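
Both variants can be sketched with a single function that differs only in where the search window is placed. This is a rough illustration under several assumptions that the source does not state: a lower text area counts as detected when its top-left point lies in the window, the window is placed at the current bottom edge of the merged area so that a third line can still be reached, and the item symbols are those in the hypothetical `ITEM_SYMBOLS` set. The names `TextArea` and `merge_lines` are also hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical set of item symbols; the source does not enumerate them.
ITEM_SYMBOLS = {"•", "・", "-", "*"}


@dataclass(frozen=True)
class TextArea:
    x1: int
    y1: int
    x2: int
    y2: int
    first_char: str  # top-left character of the area
    char_w: int      # width of that character's area
    char_h: int      # height of that character's area


def merge_lines(areas: List[TextArea], itemized: bool = False) -> List[TextArea]:
    """Merge vertically adjacent text areas that belong together."""
    pending = sorted(areas, key=lambda a: (a.y1, a.x1))
    merged: List[TextArea] = []
    while pending:
        upper = pending.pop(0)  # step (1): take the upper text area
        while True:
            # Steps (2)-(3): the window has the size of the first character.
            # For itemized text it is shifted right past that character.
            wx1 = upper.x1 + (upper.char_w if itemized else 0)
            wy1 = upper.y2  # assumption: current bottom edge of the area
            wx2, wy2 = wx1 + upper.char_w, wy1 + upper.char_h
            lower = next((a for a in pending
                          if wx1 <= a.x1 <= wx2 and wy1 <= a.y1 <= wy2), None)
            # Step (4): stop if nothing is found or a new item starts here.
            if lower is None or lower.first_char in ITEM_SYMBOLS:
                break
            # Step (5): integrate the two areas and repeat.
            pending.remove(lower)
            upper = TextArea(min(upper.x1, lower.x1), upper.y1,
                             max(upper.x2, lower.x2), max(upper.y2, lower.y2),
                             upper.first_char, upper.char_w, upper.char_h)
        merged.append(upper)
    return merged


# Plain text: two wrapped lines followed by a new bullet item.
plain = merge_lines([TextArea(0, 0, 100, 10, "T", 10, 10),
                     TextArea(0, 12, 80, 22, "c", 10, 10),
                     TextArea(0, 24, 50, 34, "•", 10, 10)])

# Itemized text: a bullet line, its indented continuation, and the next bullet.
items = merge_lines([TextArea(0, 0, 100, 10, "•", 10, 10),
                     TextArea(12, 12, 90, 22, "m", 10, 10),
                     TextArea(0, 24, 60, 34, "•", 10, 10)], itemized=True)
```

In the itemized case the shifted window catches the indented continuation line but not the next bullet, which starts further left.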

Shape Object Classification

To extract shape objects such as circles, squares, and triangles from the slide, this study carries out the following image processing.

  (1) Generate a binary image by binarizing the slide image. After binarization, label each shape object.

  (2) Classify the shape objects by using a corner detection technique [12]. This paper describes how to classify arrows.

  (3) Judge whether the shape object is an arrow by using the pattern in Fig. 7. That is, overlay the shape object onto the 3 × 3 zones shown in Fig. 7 and check whether the number of corners in each zone equals the corresponding integer in Fig. 7.

    Fig. 7. A 3 × 3 detection pattern for a right arrow using horizontal and vertical lines

  (4) If the shape object is not judged to be an arrow in (3), judge whether it is an arrow by using the pattern in Fig. 8. That is, overlay the shape object onto the 2 × 2 zones shown in Fig. 8 and check whether the number of corners in each zone equals the corresponding integer in Fig. 8. For example, in Fig. 9, each red point marks a corner. From the result, we can identify both the arrow and its direction.

    Fig. 8. A 2 × 2 detection pattern for a right arrow using diagonal lines

    Fig. 9. An example of detection of a right arrow (Color figure online)
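
The zone-counting test in steps (3) and (4) can be sketched as below, taking the detected corner points as given. The pattern `RIGHT_ARROW_3X3` is a made-up example for a block-shaped right arrow and does not reproduce the integers in Fig. 7. The function names and the rule that points on the far edge fall into the last zone are likewise assumptions.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
BBox = Tuple[float, float, float, float]  # x1, y1, x2, y2 of the shape

# Made-up 3 x 3 pattern for a block-shaped right arrow; NOT the Fig. 7 values.
RIGHT_ARROW_3X3 = [[0, 1, 0],
                   [2, 2, 1],
                   [0, 1, 0]]


def zone_counts(corners: Sequence[Point], bbox: BBox, n: int) -> List[List[int]]:
    """Count corners in each cell of an n x n grid laid over the bounding box."""
    x1, y1, x2, y2 = bbox  # assumed non-degenerate: x2 > x1 and y2 > y1
    counts = [[0] * n for _ in range(n)]
    for x, y in corners:
        # Points on the far edge are clamped into the last zone.
        col = min(int((x - x1) * n / (x2 - x1)), n - 1)
        row = min(int((y - y1) * n / (y2 - y1)), n - 1)
        counts[row][col] += 1
    return counts


def matches_any(corners: Sequence[Point], bbox: BBox,
                patterns: Sequence[Sequence[Sequence[int]]]) -> bool:
    """Try each pattern in order, as steps (3) and (4) do with two patterns."""
    return any(zone_counts(corners, bbox, len(p)) == [list(r) for r in p]
               for p in patterns)


# Seven corners of a block arrow pointing right, and four corners of a square.
bbox = (0, 0, 90, 60)
arrow = [(0, 22), (0, 38), (55, 22), (55, 38), (55, 0), (55, 60), (90, 30)]
square = [(0, 0), (90, 0), (0, 60), (90, 60)]
is_arrow = matches_any(arrow, bbox, [RIGHT_ARROW_3X3])
```

Passing a list of patterns lets a 2 × 2 pattern act as a fallback after the 3 × 3 one, and rotated copies of a pattern could encode the other arrow directions.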

Output

Figure 10 shows the output of our system. The slide consists of three groups of texts. The two-line texts were successfully integrated. In addition, there are three nested subgroups of texts, which are related by the up and down arrows; by detecting the arrows, the system also successfully integrated them into one group. The system can output the grouping result as a two-dimensional array whose size is equal to the resolution of the input image. Each element of the array holds a non-negative integer that represents a group number. That is, if a point in the slide belongs to the group whose group number is 2, the corresponding element of the array has the value 2. If the point does not belong to any group, the element has the value 0.

Fig. 10. Example of output of the system
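
The array output described above could be produced as in the following sketch. It assumes that each group is given as a list of rectangles with inclusive pixel coordinates, which the source does not specify. The names `Rect` and `render_groups` are hypothetical.

```python
from typing import List, Sequence, Tuple

Rect = Tuple[int, int, int, int]  # x1, y1, x2, y2, all inclusive (assumed)


def render_groups(width: int, height: int,
                  groups: Sequence[Sequence[Rect]]) -> List[List[int]]:
    """Return a height x width array: 0 for no group, k for the k-th group."""
    labels = [[0] * width for _ in range(height)]
    for number, rects in enumerate(groups, start=1):
        for x1, y1, x2, y2 in rects:
            # Clip each rectangle to the image before filling it.
            for y in range(max(y1, 0), min(y2, height - 1) + 1):
                for x in range(max(x1, 0), min(x2, width - 1) + 1):
                    labels[y][x] = number
    return labels


# A 6 x 4 image with two groups; group 2 is made of two rectangles.
labels = render_groups(6, 4, [[(0, 0, 1, 1)], [(3, 1, 4, 2), (5, 3, 5, 3)]])
```

Indexing is `labels[y][x]`, so looking up the group of a click is a single array access.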

4 Conclusions and Future Works

In this paper, we proposed a system for grouping areas in a slide of a digital textbook based on the layout information of the slide. The input of the system is the character area information and the slide images. The system detects the relationships among texts using the character area information: based on the position of each character, it decides the positions and sizes of the text areas. In addition, the system extracts shape objects from the slide images. Among the shape objects, arrows convey the context of text groups. Therefore, the system classifies arrows, using the positions of the corners in each shape object extracted from the slide image, to extract the context of the text groups in the slide. In other words, the system groups texts in the slide using the relations among the texts and the shape objects.

By grouping texts in a slide, the system helps to extract the appropriate zones to hide with masks. Note that the system does not extract the appropriate zones themselves. Grouping the texts prevents hiding too many zones, but it does not indicate which zones should be hidden. To solve this problem, it is necessary to define a priority for masking.