1 Introduction

With the popularization of mobile technologies and mobile social networking, the use of mobile social networking applications (MSNAs) has become one of the most popular activities on smartphones and tablets. In 2016, We Are Social [1] released its report “Global Digital Snapshot”, a comprehensive survey of global Internet, social networking and mobile usage. According to the report, the total number of mobile social media users has reached 1.97 billion, a 17% increase over 2015, accounting for 27% of the world’s population. Meanwhile, with the development of big data platforms, users’ private data is increasingly recognized as an asset, and social networking applications have become primary and highly vulnerable targets. MSNAs generate a large amount of private user data, yet it remains uncertain whether application developers handle this data carefully. Therefore, a deeper study of privacy leakage and of the protection mechanisms for private information in MSNAs is necessary.

Privacy leakage on the Android platform has already drawn wide attention. Since the Android security model is based on access control, many solutions focus on the analysis and optimization of permissions. VetDroid is a dynamic analysis platform that reconstructs a fine-grained access control mechanism and detects the sensitive behaviors of Android applications [2]. Shebaro et al. [3] presented a context-based access control system, through which applications can be dynamically granted or revoked certain permissions based on the specific context. Nauman et al. [4] put forward a fine-grained, user-centric, privacy-preserving permission framework that allows users to selectively grant permissions to installed applications. Wu et al. [5] proposed an access control scheme that prevents permission leakage at the application level and gives developers better control over the security of components. However, the user-based access control approaches in all these studies are considered impractical for data leakage detection, since they require frequent user participation.

Other studies focus on detecting and protecting a specific type of private data. CHEX is a static analysis tool that automatically vets Android applications for component hijacking vulnerabilities, thus protecting users’ private data [6]. Tan et al. [7] proposed Chips, a context-based run-time access control system that applies fine-grained access control to the photos handled by applications. Naveed et al. [8] performed a comprehensive study of privacy leakage through external devices attached to Android phones, and proposed an approach for managing external devices connected via Bluetooth, NFC, etc. Rahman et al. [9] and Fawaz et al. [10] studied the leakage of geographic location information and proposed corresponding protection strategies.

Besides the above-mentioned approaches, other researchers track the leakage of private data by modifying the Android framework or the source code of applications. SplitDroid segregated the sensitive components of an application based on the Linux Container mechanism for isolated execution and privacy protection [11]. Tripp and Rubin [12] established a quantitative and probabilistic dual judgment model using Bayes’ theorem and solved the problem of privacy judgment according to the environment of diffusion points. TaintDroid modified the Android virtual machine and the interpreter to provide complete dynamic taint tracking [13]. Furthermore, researchers have also put forward more optimized schemes based on TaintDroid, such as PasDroid [14] and Styx [15]. Cui et al. [16] and Zhang and Yin [17] rewrote the bytecode of Android applications to insert the corresponding privacy detection strategies, realizing privacy tracking and leakage detection by means of repackaging.

Unfortunately, most existing schemes target a single type of permission or private data and are therefore ill-suited to privacy tracking and leakage detection in MSNAs, which often involve many types of sensitive system permissions and private data. Moreover, strategies based on TaintDroid [14, 15] need to modify the Android framework, and the bytecode-rewriting approaches [16, 17] need to repackage the applications, which not only increases the experimental cost but also considerably affects the efficiency of the applications.

To address the security problems caused by privacy leakage, we propose Xposed-based Detecting-Cache-File, namely “X-Decaf”, a leakage detection framework for MSNAs, together with an auto-protection method named ATFed (Automatic Transparent File Encryption/Decryption). The detection framework uses taint tracking and the Xposed framework to monitor the cache files generated during application run-time and to organize the suspected leakage paths, from which the leakage rank of sensitive data is evaluated. ATFed is then applied to the application to protect private data automatically, while keeping low coupling with the Android system and a low impact on the original MSNA. The main contributions of this paper are:

  • We undertake a study of privacy data and privacy leakage. As for the MSNAs, some attack scenarios are listed and a more detailed definition of privacy leakage standard is given.

  • Based on the taint tracking and Xposed framework, we propose an MSNA’s privacy leakage detection framework and conduct in-depth evaluation of security and performance on popular MSNAs.

  • We design an auto privacy data protection mechanism named ATFed, which automatically encrypts and decrypts the privacy cache file without the modification of Android framework and the involvement of application developers.

The rest of this paper is organized as follows. In Sect. 2, we define the privacy data and leakage standard on social applications. In Sect. 3, we present a detailed system design and implementation of X-Decaf. Section 4 presents the evaluations and discusses the experimental results. In Sect. 5, ATFed is proposed as a mitigation for the cache file privacy problem, and its effectiveness and overhead are evaluated in Sect. 6. Section 7 concludes the whole paper.

2 Privacy Data and Leakage

2.1 Definition of Privacy Data

Since MSNAs generally produce tons of data in various types during users’ interaction, privacy data related to the users’ private information can be involved frequently. It can be seen from the analysis result of 50 kinds of Android social networking applications in the market as shown in Fig. 1 that 6 types of data, i.e. pictures, video, voice, geographical location, contact, phone calls and SMSs are mainly involved in MSNAs and may cause potential vulnerabilities of data security. Furthermore, since media data such as images, video and voice often produce numerous unprotected cache files on storage, privacy data are defined as media privacy data in this paper, and we mainly focus on this kind of privacy data leakage.

Fig. 1. Types of privacy data involved in MSNAs

2.2 Definition of Privacy Leakage

The Android system creates a separate file directory for each application under /data/data/, and the data generated during an application’s run-time is kept in this default directory. Android itself provides security mechanisms to guard this directory: when an application accesses a file stored under /data/data, the access first passes the UID/GID-based DAC security checks and then the MAC check of SEAndroid. However, these checks are not strong enough once a phone is rooted and malware breaks through the Android security mechanisms to obtain users’ private data. In addition, improper handling may cause an application’s run-time data to be stored outside its /data/data directory: (1) developers may call system APIs (e.g. getExternalStorageDirectory()) without regard for the security mechanisms, so that cache files or other documents created at run-time end up in a public directory; (2) Android devices differ in their hardware, for example, a device with limited internal storage may rely on an SD card extension, so that some cache files have to be stored in the public directory of the SD card. If the application applies no additional protection such as encryption or obfuscation, any application on such a device can access the files in this directory, which causes privacy leakage. Traditionally, privacy leakage refers to an application remotely collecting and disseminating users’ private information without explicit notification. In this paper, we study privacy leakage from a new perspective and mainly discuss applications that, owing to design defects and the lack of a proper file management strategy, create cache files containing users’ private data.

To better understand social applications and analyze their risk of privacy leakage, the X-Decaf framework is designed to monitor the cache files of MSNAs during their run-time for subsequent detection.

2.3 Standard of Privacy Leakage

We define the standard of privacy leakage based on the following three aspects:

Cache File Path.

As mentioned in Sect. 2.2, cache files are generated in several kinds of paths on the Android system, which are listed in Table 1.

Table 1. Cache file path

As the file storage paths can reflect the access permissions to a certain extent in Android system, we list some attack scenarios respectively.

DATA_PRI: Only the application itself can access the file path, the attackers cannot obtain the files without root permission;

DATA_PUB: Any application can access the file path and even maliciously tamper with the file when it is globally readable;

SD_PUB, SD_PRI: The attackers can easily tamper with a file in the SD card file directory, whether it is private or public.

Note that the files in the SD_PRI directory can be deleted with the “CLEAR DATA” function under Android Settings, whereas the files in SD_PUB cannot be deleted this way unless the application manages them itself. We distinguish files in the SD_PRI directory from those in the SD_PUB directory because the distinction reflects irregular practices by developers.
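The four path categories can be illustrated with a small classifier. This is only a sketch under stated assumptions: the directory prefixes below are guesses at typical locations (Table 1 remains the authoritative definition), and `classify_cache_path`, `PathClass` and the `world_readable` flag are hypothetical names introduced for illustration.

```python
from enum import Enum
from pathlib import PurePosixPath


class PathClass(Enum):
    DATA_PRI = "DATA_PRI"
    DATA_PUB = "DATA_PUB"
    SD_PRI = "SD_PRI"
    SD_PUB = "SD_PUB"


# Assumed prefixes; real devices may mount external storage elsewhere.
INTERNAL_ROOT = PurePosixPath("/data/data")
SD_ROOTS = (PurePosixPath("/sdcard"), PurePosixPath("/storage/emulated/0"))


def _is_under(path, root):
    """True if `path` lies inside `root` (component-wise prefix match)."""
    return path.parts[:len(root.parts)] == root.parts


def classify_cache_path(path, package, world_readable=False):
    """Map a cache file path to one of the four path categories."""
    p = PurePosixPath(path)
    if _is_under(p, INTERNAL_ROOT / package):
        # Inside the application's own directory: private unless the
        # file was created globally readable.
        return PathClass.DATA_PUB if world_readable else PathClass.DATA_PRI
    for sd in SD_ROOTS:
        # Assumption: the per-application directory on external storage
        # is the one wiped by "CLEAR DATA".
        if _is_under(p, sd / "Android" / "data" / package):
            return PathClass.SD_PRI
        if _is_under(p, sd):
            return PathClass.SD_PUB
    raise ValueError(f"path outside the assumed storage roots: {path}")
```

For instance, under these assumptions a file such as `/sdcard/Example/a.jpg` would fall into SD_PUB, while the same file under `/sdcard/Android/data/<package>/` would fall into SD_PRI.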

Cache File Protection Status.

According to the attack scenarios above, the file directory security mechanisms provided by the Android system alone cannot guarantee the security of the data generated during application run-time. Files can be better protected with additional measures such as obfuscation or encryption: even if an attacker breaks through the Android system protection and gains access to the files, restoring the processed files still incurs a high cost. The protection status of cache files should therefore be taken into consideration, as it provides an important basis for analyzing privacy leakage.

Cache File Life-Cycle.

The cache file life-cycle covers the stages a cache file passes through, from generation, transfer and storage to deletion, during all of which it is exposed to attack. It is therefore important to understand the life-cycle of cache files. We summarize the possible life-cycle scenarios as follows:

  • Case 1: The cache files are generated during the application run-time, and deleted after the application exits;

  • Case 2: The cache files still exist after the application exits, but the application provides related functions such as “clear cache” which can be executed to delete the cache files.

  • Case 3: The cache files still exist and cannot be deleted even with function “clear cache” provided by the application itself.
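The three cases can be told apart from two observations made during a test run. The helper below is a minimal sketch; `lifecycle_case` and its two boolean parameters are hypothetical names, not part of X-Decaf.

```python
def lifecycle_case(exists_after_exit, exists_after_clear_cache):
    """Return the life-cycle case (1, 2 or 3) of a cache file.

    exists_after_exit:        the file is still on storage after the
                              application has exited.
    exists_after_clear_cache: the file is still on storage after the
                              application's own "clear cache" function ran.
    """
    if not exists_after_exit:
        return 1  # Case 1: deleted when the application exits
    if not exists_after_clear_cache:
        return 2  # Case 2: survives exit, removable via "clear cache"
    return 3      # Case 3: survives even "clear cache"
```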

In view of the three aspects mentioned above, we define a standard for cache file privacy leakage according to their storage path and life-cycle. The overview of privacy leakage criteria is shown in Table 2. Note that our analysis only focuses on unprotected cache files and the protected cache files are regarded as safe with the default protection applied by Android security mechanism, so there are no leakages.

Table 2. Standard of privacy leakage

Based on this standard and on common attack scenarios on the current Android platform, privacy leakage is divided into 5 levels: NO_LK, MILD_LK, MEDIUM_LK, SEVERE_LK, and SP_SEVERE_LK. The levels are defined as follows.

NO_LK: No cache files or only protected cache files created during application run-time;

MILD_LK: An attacker can obtain privacy data only with root permission;

MEDIUM_LK: An attacker can obtain privacy data by monitoring the application run-time actions;

SEVERE_LK: An attacker can use folder tools to view files or Android File APIs directly to get privacy data;

SP_SEVERE_LK: Application cannot delete cache files even by its own “clear cache” function.
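One way to read the five levels as a decision procedure is sketched below. The mapping is an assumption inferred from the level descriptions (private paths need root, transient public files need run-time monitoring, persistent public files can be browsed directly); Table 2 remains the authoritative rule, and `leakage_level` is a hypothetical helper.

```python
def leakage_level(path_class, lifecycle_case, protected):
    """Assign a leakage level to one cache file.

    path_class:     "DATA_PRI", "DATA_PUB", "SD_PRI", "SD_PUB",
                    or None if no cache file was created.
    lifecycle_case: 1, 2 or 3, following the life-cycle cases.
    protected:      True if the file is obfuscated or encrypted.
    """
    if path_class is None or protected:
        return "NO_LK"
    if path_class == "DATA_PRI":
        # Assumed: only reachable with root permission.
        return "MILD_LK"
    # Assumed mapping for publicly reachable paths.
    by_case = {
        1: "MEDIUM_LK",     # exists only while the application runs
        2: "SEVERE_LK",     # persists, but "clear cache" removes it
        3: "SP_SEVERE_LK",  # persists even after "clear cache"
    }
    return by_case[lifecycle_case]
```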

X-Decaf will study the links between private data and cache files within the application based on the above definition and privacy leakage standard, and perform analysis of privacy leakage effectively and efficiently.

3 Detection of Cache File Privacy Leakage

3.1 X-Decaf Overview

X-Decaf exploits the characteristics of the Android system, taint tracking technology and the Xposed framework to detect leakage paths and private data in the cache files of MSNAs on the Android platform. It combines static and dynamic analysis of the application to achieve high detection precision with low impact; it is not tightly coupled with the Android system, nor does it need to modify the application. X-Decaf contains three components, i.e. sensitive library, taint tracking and cache file analysis, as shown in Fig. 2.

Fig. 2. Components of X-Decaf

The functions of the three components are:

Sensitive Library.

After analyzing a large number of social applications on the market and collecting statistics on the APIs they call, we select the system APIs involved in generating and spreading sensitive data to form a sensitive function library for X-Decaf.

Taint Tracking.

Taint tracking mainly consists of two parts. A dynamic tracking module first requests from the sensitive library the sensitive functions for a specific type of private data; these functions become the detection targets and are monitored by the corresponding Hook Module generated with the Xposed framework. The monitoring results are then recorded by the taint marking module, which produces file-based taint marks combining the source privacy data type, the source file name and the appropriate strategies.

Cache File Analysis.

First, manual verification performs the corresponding detection based on the definitions and standards of privacy leakage. Second, policy judgment, i.e. an automated monitoring script, inspects all taint-marked cache files generated during the taint tracking phase. Based on the criteria, the policy judgment module monitors the status of the tainted cache files and outputs the leakage report.

3.2 X-Decaf Framework Architecture

We present the overall work-flow of X-Decaf to detect file-based privacy leakage during application run-time. It takes the following major steps as shown in Fig. 3.

Fig. 3. X-Decaf architecture

Establishing Sensitive Library.

In the early stage of X-Decaf design, we collected and analyzed more than 100 MSNAs on the market to study the relationship between private data and Android system APIs. We found that applications usually request specific permissions and call similar system APIs when generating, obtaining and disseminating private data. For example, photo data involves APIs for the camera service, system gallery reading, image compression, file I/O, etc. With the help of decompilation tools such as Apktool and IDA Pro, we filter the system APIs related to the creation and propagation of sensitive data and eventually establish a sensitive library for X-Decaf.

As we mainly analyze the cache files of social applications, the sensitive library covers three types of system APIs, for voice, pictures and video, all of which carry large amounts of users’ private information. The sensitive library publishes its sensitive API strategy in an open XML format so that automatic detection scripts can readily consume the sensitive functions. Figure 4 lists part of the system sensitive APIs for voice data in the sensitive library.

Fig. 4. Sensitive API of voice (part)

Here is a more detailed description of the sensitive API strategy.

  • Sensitive-policy tag. It is the root tag for specifying a privacy policy. The attribute privacy-type specifies the type of privacy data.

  • Uses-permission tag. It describes the corresponding permissions for this kind of privacy, and is similar to the declared permission in Android system.

  • Class-info tag. It describes the name of class which contains sensitive functions, and requires a class-name attribute for the class name.

  • Method-info tag. It describes the specific information for this sensitive method. Two attributes can be used for description: method-name and method-args.
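To make the tag structure concrete, the sketch below parses a small policy document into a list of hook targets. The XML content is invented for illustration and is not the content of Fig. 4: the nesting of method-info inside class-info, the `name` attribute on uses-permission and the comma-separated format of method-args are all assumptions, and `load_policy` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical policy for voice data; the real policy is shown in Fig. 4.
EXAMPLE_POLICY = """
<sensitive-policy privacy-type="voice">
  <uses-permission name="android.permission.RECORD_AUDIO"/>
  <class-info class-name="android.media.MediaRecorder">
    <method-info method-name="setOutputFile" method-args="java.lang.String"/>
    <method-info method-name="start" method-args=""/>
  </class-info>
</sensitive-policy>
"""


def load_policy(xml_text):
    """Parse a sensitive API policy into a plain dictionary."""
    root = ET.fromstring(xml_text.strip())
    hooks = []
    for cls in root.iter("class-info"):
        for method in cls.iter("method-info"):
            raw_args = method.get("method-args", "")
            # Assumed format: comma-separated fully qualified type names.
            args = tuple(a.strip() for a in raw_args.split(",") if a.strip())
            hooks.append((cls.get("class-name"), method.get("method-name"), args))
    return {
        "privacy_type": root.get("privacy-type"),
        "permissions": [p.get("name") for p in root.iter("uses-permission")],
        "hooks": hooks,
    }
```

Each `(class, method, args)` triple is the kind of information a hook generator would need in order to locate the method to monitor.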

Dynamic Tracking.

The sensitive functions contained in the sensitive library are the main detection targets during the social application’s run-time. The Dynamic Tracking module first requests from the sensitive library the sensitive functions for a specific type of private data, then takes these functions as detection targets and generates the corresponding Hook Module with the Xposed framework. After the Hook Module has been loaded into the phone system, the Dynamic Tracking module can monitor the application’s actions during its run-time.

Taint Marking.

Taint Marking module is mainly used for cache file filtering and taint marking. The major steps are presented as follows:

Cache File Filter.

In the stage of dynamic tracking, X-Decaf monitors system I/O operations. Since an application performs a very large number of I/O operations at run-time, without file filtering the taint marking module would mark many unrelated items and consequently degrade the efficiency of the entire system. To improve the accuracy of the taint marks and reduce the impact on the running application, X-Decaf filters cache files with a fine-grained strategy, illustrated here for photo data. First, a cache file with the suffix “.jpg”, “.jpeg” or “.bmp” is classified as a sensitive cache file directly. For unusual file extensions, X-Decaf further inspects the data stream (for example, the header of a “.jpg” data stream contains the “JFIF” marker). In addition, if the application itself has already protected the cache files (using obfuscation, encryption, etc.), this status is also recorded in the mark. Thus, whatever form the cache files take, X-Decaf can filter the data streams and trace all privacy leakage files.
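The two-stage photo filter (suffix first, then data stream) might look like the following sketch. The byte signatures are standard file-format facts: a JPEG stream starts with the bytes FF D8 FF, a JFIF file carries the ASCII string "JFIF" at offset 6, and a BMP file starts with "BM". The function name `looks_like_photo` is hypothetical.

```python
IMAGE_SUFFIXES = (".jpg", ".jpeg", ".bmp")


def looks_like_photo(file_name, head):
    """Decide whether a cache file should be treated as photo data.

    file_name: name of the cache file being written.
    head:      the first bytes of its data stream (16 bytes suffice).
    """
    # Stage 1: a known image suffix is accepted directly.
    if file_name.lower().endswith(IMAGE_SUFFIXES):
        return True
    # Stage 2: inspect the data stream of files with unusual extensions.
    if head[:3] == b"\xff\xd8\xff":   # JPEG start-of-image marker
        return True
    if head[6:10] == b"JFIF":         # JFIF identifier in the APP0 segment
        return True
    if head[:2] == b"BM":             # BMP signature (short, may over-match)
        return True
    return False
```

A two-byte signature such as "BM" is weak on its own, so a real filter would probably validate more of the header before tagging a file.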

Taint Mark.

After filtering sensitive cache files, X-Decaf marks each file through the taint marking mechanism by adding a TAG to the file name. The TAG contains 3 attributes: privacy type, file hash and file protection status, where 1 denotes a protected file and 0 an unprotected one. For example, an unprotected photo cache file named “cache.tmp” becomes “cache_photo_hash_0.tmp” after taint marking. This strategy brings the following benefits:

  • The data flow and control flow of the target application remain unchanged;

  • Taint TAG can associate all the corresponding cache files with applications’ data flow;

  • Each same-origin cache file’s protection status will be monitored.
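The file-name TAG can be sketched as a pair of helpers that build and parse the tagged name. The choice of SHA-256 and the truncation of the digest to 8 hexadecimal characters are assumptions made for the example; the text only states that a file hash is part of the TAG. `tainted_name` and `parse_taint_tag` are hypothetical names.

```python
import hashlib
from pathlib import PurePosixPath


def tainted_name(path, privacy_type, content, protected):
    """Return the path with the taint TAG inserted before the suffix."""
    p = PurePosixPath(path)
    digest = hashlib.sha256(content).hexdigest()[:8]  # assumed hash format
    flag = 1 if protected else 0
    return str(p.with_name(f"{p.stem}_{privacy_type}_{digest}_{flag}{p.suffix}"))


def parse_taint_tag(path):
    """Recover the TAG attributes from a tagged file name."""
    stem = PurePosixPath(path).stem
    # Split from the right so underscores in the original name survive.
    base, privacy_type, digest, flag = stem.rsplit("_", 3)
    return {
        "base": base,
        "privacy_type": privacy_type,
        "hash": digest,
        "protected": flag == "1",
    }
```

Because the hash is derived from the content, cache files copied from the same source data can be linked to one another by comparing this field.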

Manual Verification.

X-Decaf marks a series of cache files with taint TAG. After that, we will analyze these files manually. Firstly, we must verify if a social application has managed cache file’s life-cycle by manual tests as described in Sect. 2.3. Secondly, based on the above-mentioned standards, we must monitor the cache file by its storage path and protection status to figure out whether it exists or has been removed during the test, and then perform the appropriate policy in the next stage.

Policy Judgment.

Policy judgment cooperates with manual verification. Policy judgment runs as an automated monitoring script, monitoring changes of protection status, file path, life-cycle of these cache files with taint TAG, and finally outputs a leakage report according to privacy leakage criteria.

3.3 X-Decaf Analysis

Compared with the existing detection tools, X-Decaf has the following advantages:

A Lower System Coupling and No Application Modification.

Most dynamic taint tracking tools and privacy leakage detection frameworks, such as TaintDroid and its derivatives, require modification of the Android system. Other existing tools modify the application directly through bytecode rewriting and strategy insertion. However, with the development of tamper-resistance and signature mechanisms, the cost of repackaging applications keeps increasing. X-Decaf requires no modification of the Android platform or of applications; instead, it takes advantage of system APIs and analyzes the correlation between private data and system APIs to detect privacy leakage.

Multiple Types of Privacy Detection.

The sensitive library provides the common system APIs related to privacy leakage. Therefore, when facing different types of private data, X-Decaf can flexibly choose a variety of strategies and monitor the corresponding sensitive functions simultaneously.

Multi-lateral Application Test.

Existing studies show that applications usually share similar API calls to operate on private data. Therefore, X-Decaf can easily monitor multiple applications on the market simultaneously for certain types of privacy leakage.

4 Experimental Results of Privacy Leakage Detection

In this section, we discuss how we perform privacy leakage detection on the most popular MSNAs on the Android platform, including WeChat, Mobile QQ, Weibo, Yixin, Momo and Wumi, and evaluate the effectiveness, accuracy and efficiency of X-Decaf when handling voice, photo and video files. The evaluation is performed on Nexus 5 and Nexus 6 devices running Android 5.1.1, and the detailed analysis results are presented below.

4.1 Vertical Analysis of Privacy Leakage

We first use X-Decaf to conduct a vertical analysis of WeChat, the most popular MSNA in China, for privacy data leakage involving voice, images, video, etc. The analysis focuses on data leakage through cache files and counts the number of leakage paths. As shown in Table 3, X-Decaf accurately tracks source data through the stages of application run-time, including generation, transfer and propagation. For image data, for example, three cache files are generated from the same source data: *.jpg for a copy of the original image, th_* for a small thumbnail and th_*hd for a large thumbnail. In our tests, X-Decaf accurately analyzed the application’s run-time data, with no false negatives or false positives for any privacy path.

Table 3. Privacy leakage of WeChat

4.2 Horizontal Analysis of Privacy Leakage

We use X-Decaf to analyze the most popular MSNAs in the Android platform and detect whether they suffer from some kinds of leakages while handling the voice, photo and video files. The leakage reports are shown in Table 4.

Table 4. Leakage report of leakage rank and leakage path number

4.3 Privacy Leakage Score

We define a grading rule, shown in Table 5, that evaluates both the rank and the quantity of leakage by considering the type of privacy data and the number of leakage paths. The score is calculated as:

$$ C = \sum_{k=0}^{4} V_{k} \cdot n $$
(1)

where C is the leakage score of the application for a certain type of privacy data, V_k is the score assigned to leakage level k, and n is the number of leakage paths for that type of privacy data. Based on Table 5, the privacy leakage scores of mainstream social applications for voice, photo and video data can be computed; the results are shown in Fig. 5.
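A small worked example may help in reading Eq. (1). The sketch assumes that n is counted per leakage level, so that each level's path count is weighted by its own score, and it uses placeholder weights of 0 to 4; the actual values of V_k are those given in Table 5. `leakage_score` and `HYPOTHETICAL_WEIGHTS` are illustrative names.

```python
LEVELS = ("NO_LK", "MILD_LK", "MEDIUM_LK", "SEVERE_LK", "SP_SEVERE_LK")

# Placeholder weights V_0..V_4; Table 5 defines the real ones.
HYPOTHETICAL_WEIGHTS = dict(zip(LEVELS, (0, 1, 2, 3, 4)))


def leakage_score(path_counts, weights=HYPOTHETICAL_WEIGHTS):
    """Compute C as the weighted sum of leakage path counts.

    path_counts: mapping from leakage level to the number of leakage
                 paths detected at that level for one data type.
    """
    return sum(weights[level] * path_counts.get(level, 0) for level in LEVELS)


# Two SEVERE_LK paths and one SP_SEVERE_LK path: 3*2 + 4*1.
photo_paths = {"SEVERE_LK": 2, "SP_SEVERE_LK": 1}
photo_score = leakage_score(photo_paths)
```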

Table 5. Leakage score rules

Fig. 5. Leakage score of some major MSNAs

The leakage score reflects how well an application manages the cache files that contain private data. It is not surprising that WeChat has the highest leakage score for all three types of private data, given its abundant functions and complex services.

Furthermore, we use X-Decaf to analyze photo leakage in the 50 most popular MSNAs. Figure 6 shows that only 4% of the MSNAs do not involve photo data; the other 96% exhibit leakage at the SEVERE_LK level or above, and 74% of the applications are at SP_SEVERE_LK.

Fig. 6. Leakage statistic of photo in 50 MSNAs

These statistics illustrate that current application developers often give little consideration to users’ privacy leakage and fail to comply with development specifications.

4.4 Impact on MSNAs

To better understand the performance impact of running X-Decaf during privacy leakage detection, we measure its influence on the application using DDMS (Dalvik Debug Monitor Server), provided by the Android SDK. DDMS provides a “method profiling” tool to monitor the run-time behavior of an application process on the device. The tool can dynamically analyze an application without its source code and report all the Java methods involved, including the execution time of each method, the number of calls and the overall proportion of time consumed, which can be used to analyze the performance of the social applications while debugging.

We perform studies on WeChat for performance influence through the following five typical test scenarios while running the X-Decaf framework.

  • Test 1: In the non-chat screen of WeChat, execute several similar click operations and analyze the overall performance of X-Decaf that influences WeChat;

  • Test 2: Call the camera API to take and send a picture, analyze the performance influence that X-Decaf may have in the photographing process;

  • Test 3: Send nine images continuously to analyze the performance impact of X-Decaf to the picture sending process;

  • Test 4: Send three small videos of six seconds continuously, analyze the performance impact of X-Decaf to the video sending process;

  • Test 5: Send three small videos of six seconds and nine images continuously, analyze the performance influence of X-Decaf to the sending process.

Each test is run more than 20 times under the same WiFi network. Moreover, the cache files generated during a test are cleared when the test finishes, in order to limit the influence of manual testing, network conditions and the cache data of the previous test. The test results in Fig. 7 show that, without modifying the application, X-Decaf imposes only a very low overhead on application performance.

Fig. 7. Performance impact on WeChat

5 Auto-protection of Cache File

ATFed (Automatic Transparent File Encryption/Decryption) offers a general framework that protects cache files from leaking the user’s private information by transparently encrypting and decrypting private data at run-time, without modifying the underlying Android framework or involving the application developers. That is, ATFed transparently encrypts the application’s cache files through API hooking on the caller side. For instance, a photo taken in the MSNA is encrypted, without the application’s involvement, while being written into a cache file. When the data is read back from storage, ATFed decrypts it accordingly.

To realize the hook process, file I/O operations at both the Java level and the native level should be considered. Specifically, most Java file I/O APIs eventually go through the underlying native libraries, such as libjavacore, so in the implementation of ATFed, the APIs inside this native library are detoured to our hooks, where data is encrypted when written to cache files and decrypted when read. For example, the Java file stream APIs (e.g. FileInputStream and FileOutputStream) eventually invoke the APIs inside the IoBridge class, which in turn call native functions inside the native library libjavacore. These native functions are dynamically linked to the file I/O APIs in libc, including open, read and write. To make things easier, we hook the APIs on the caller side.

Figure 8 depicts the architecture of ATFed which consists of a hook module and a crypto module.

Fig. 8. Architecture of ATFed

When ATFed starts, the register module parses the ELF file and obtains the information needed to execute the hook; the hook module then provides the hook functions and keeps a backup of the original methods; finally, the uninstall module removes the hook functions and restores the original methods after the whole process has executed. When the sensitive data of an application is being saved as a cache file, the hook module in ATFed first takes control of the file I/O operation, and the crypto module is then called for transparent encryption. Accordingly, when the encrypted cache files of an application are being read, the hook module again takes control of the file I/O operation first, and the crypto module then performs the transparent decryption.
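The install, backup and uninstall life-cycle described above can be mimicked in a few lines. The sketch below is only an analogy: a Python dictionary of callables stands in for the table of function addresses, whereas ATFed patches entries in a native library. `HookModule` and its methods are hypothetical names.

```python
class HookModule:
    """Replace entries of a function table and restore them later."""

    def __init__(self, table):
        self._table = table   # stand-in for the table of function addresses
        self._backup = {}     # symbol -> original function

    def install(self, symbol, make_replacement):
        """Hook `symbol`; `make_replacement` receives the original function."""
        original = self._table[symbol]
        # Keep the first original only, so repeated installs stay reversible.
        self._backup.setdefault(symbol, original)
        self._table[symbol] = make_replacement(original)

    def uninstall(self):
        """Restore every hooked entry to its original function."""
        self._table.update(self._backup)
        self._backup.clear()


if __name__ == "__main__":
    table = {"write": lambda data: data}
    hooks = HookModule(table)
    # The replacement transforms the data, then delegates to the original.
    hooks.install("write", lambda original: lambda data: original(data[::-1]))
    print(table["write"](b"abc"))
    hooks.uninstall()
    print(table["write"](b"abc"))
```

Passing the original function to the replacement is what lets the hook do its extra work and then fall through to the unmodified behavior.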

The major processes are given as follows:

Preparation.

Determine the APIs (such as read and write) that the system calls to implement file access operations, and the dynamic link library in which these APIs reside. This work mainly relies on analysis of the Android source code at both the Java level and the native level; as a result, we hook the open, close, read and write functions in libjavacore.so;

Start Hook.

When a hooked function in the corresponding dynamic link library is called, the hook module redirects the call to the specified function, because we have modified the function addresses in the GOT (Global Offset Table), which stores the addresses of global variables and functions. In this way, calls to the original read or write methods in the Java core code are replaced by calls to the new read or write implemented in ATFed. We realize this part in the hook module with the GOT hook technique, which analyzes the ELF-format dynamic link library to obtain the String Table, Symbol Table and Relocation Table and to locate the addresses of the hooked functions in the GOT;

Protection.

The replaced open and close methods mark cache files in the specified path, and protection is realized in the replaced read and write methods, which decrypt the file data on read operations and encrypt the file data before write operations. Note that the encryption algorithm can be replaced to balance security and efficiency.
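The effect of the replaced read and write methods can be sketched with a whole-file wrapper around RC4, the algorithm adopted in the evaluation in Sect. 6. This is a simplification under stated assumptions: it encrypts complete files rather than individual read and write calls at arbitrary offsets, key management is left out, and `write_protected` and `read_protected` are hypothetical helpers. RC4 is symmetric, so the same routine encrypts and decrypts.

```python
def rc4(key, data):
    """Apply the RC4 stream cipher to `data` with `key` (both bytes)."""
    # Key-scheduling algorithm: permute the state using the key.
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    # Pseudo-random generation: XOR each byte with the keystream.
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


def write_protected(path, plaintext, key):
    """What a replaced write does: encrypt, then store the ciphertext."""
    with open(path, "wb") as handle:
        handle.write(rc4(key, plaintext))


def read_protected(path, key):
    """What a replaced read does: load the ciphertext, then decrypt."""
    with open(path, "rb") as handle:
        return rc4(key, handle.read())
```

The application keeps working with plaintext on both sides, while only ciphertext reaches storage. Since the text notes that the algorithm is replaceable, a stronger cipher could be substituted for `rc4` without changing the two wrappers.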

As a practical solution for securing sensitive cache file data, ATFed can be provided as part of the X-Decaf framework to monitor applications running on users’ devices. When a cache file is generated and stored in an unsafe directory, ATFed transparently encrypts the file and decrypts it when the cache file is opened and read. ATFed can also be provided as a native library to application developers: when releasing an application, a developer can include this library and call its protection functions. Finally, ATFed can be provided as a wrapper service that gives an application enhanced capabilities for protecting sensitive cache file data. The application is repackaged before it is released, or users upload the application to be wrapped and then download and install the enhanced version on their phones.

6 Experimental Results of Auto-protection of Cache File

To assess the compatibility and effectiveness of ATFed as well as its performance overhead, we conduct several tests on a Nexus 5 running Android 4.4.4 and 5.1.1; the detailed analysis results are presented below.

For compatibility and effectiveness, we install a test apk that performs file I/O operations on the phones. The apk is run under the protection of our system on both the Dalvik virtual machine and the Android runtime (ART). We then interact with it manually and use the “adb pull” command to pull all the files in a certain path and check whether they are encrypted. The evaluation shows that the apk executes normally on both Dalvik and ART, and that all these files are encrypted successfully.

To evaluate the performance overhead of ATFed, we use this apk to encrypt and decrypt ten files with sizes ranging from 1 megabyte to 500 megabytes, 5 times each, and calculate the average time used for each operation. The algorithm we adopted is RC4. Table 6 shows the experimental results for the Dalvik run-time on Android 4.4.4 and Table 7 those for ART on Android 5.1.1.

Table 6. Average time consumed in reading and writing operations in Dalvik
Table 7. Average time consumed in reading and writing operations

The transparent encryption and decryption process clearly consumes additional time and affects application run-time to a certain extent, but the overhead is acceptable, because cache files in MSNAs are much smaller than most of the test files used in our experiments. For example, an image or voice cache file in WeChat is typically several hundred kilobytes or smaller, while a typical video file is around 2 megabytes. More specifically, a 10 s video file in WeChat is just 1.7 MB, and encrypting or decrypting this cache file takes less than 0.1 s. Moreover, the overhead in practice is likely to be lower still, since real-world applications are not as I/O-intensive as our tests.

7 Conclusion

By analyzing MSNAs from the Chinese domestic market, we find that the developers of current MSNAs fail to consider the protection of users’ private data in the development process, which runs contrary to common expectation. One would expect MSNAs, as the most popular mobile applications, to take the protection of user privacy into account from the beginning of development. However, our experimental results show that most MSNAs suffer from media privacy leakage. We therefore propose the ATFed mechanism to protect the cache files generated by MSNAs, so that the files are stored as ciphertext and can be regarded as protected cache files created during the run-time of the MSNA.