research-article

CPR: cross platform binary code reuse via platform independent trace program

Published: 10 July 2017

Abstract

The rapid growth of the Internet of Things (IoT) has recently created a number of new platforms. Unfortunately, this variety of IoT devices causes platform fragmentation, which makes software development on such devices challenging. In particular, existing programs cannot simply be reused on such devices because they rely on particular underlying hardware and software interfaces, which we call platform dependencies. In this paper, we present CPR, a novel technique that synthesizes a platform-independent program from a platform-dependent program. Specifically, we leverage an existing system called PIEtrace, which generates a platform-independent trace program. The generated trace program is platform independent, but it can only reproduce a specific execution path. Hence, we develop an algorithm to merge a set of platform-independent trace programs and synthesize a general program that can take multiple inputs. The synthesized platform-independent program is representative of the merged trace programs, and the results it produces are correct if no exceptions occur. Our evaluation on 15 real-world applications shows that CPR is highly effective at reusing existing binaries across platforms.
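The idea of merging single-path trace programs into one program that accepts multiple inputs can be illustrated with a toy sketch. This is not CPR's algorithm or PIEtrace's trace format: the event tuples, the `PREDICATES` and `STATEMENTS` tables, and the `merge` and `run` helpers are all hypothetical, and the merge here is a plain prefix tree over recorded events. It only shows why a merged program covers the union of the recorded paths and why an input that leaves those paths has to surface as an exception.

```python
# Toy illustration only: event names and helpers are hypothetical,
# not taken from CPR or PIEtrace.

class UnseenPath(Exception):
    """The input took a branch outcome that none of the merged traces covered."""

# Semantics of the toy events, keyed by label.
PREDICATES = {"x<0": lambda env: env["x"] < 0}
STATEMENTS = {
    "y=-x": lambda env: env.update(y=-env["x"]),
    "y=x": lambda env: env.update(y=env["x"]),
    "ret=y+1": lambda env: env.update(ret=env["y"] + 1),
}

# Each trace records one execution path: branch outcomes plus executed statements.
TRACE_NEG = [("branch", "x<0", True), ("stmt", "y=-x"), ("stmt", "ret=y+1")]
TRACE_POS = [("branch", "x<0", False), ("stmt", "y=x"), ("stmt", "ret=y+1")]

def merge(traces):
    """Merge traces into a prefix tree of nested dicts keyed by event."""
    root = {}
    for trace in traces:
        node = root
        for event in trace:
            node = node.setdefault(event, {})
    return root

def run(program, x):
    """Execute the merged program on input x, following recorded branches."""
    env = {"x": x}
    node = program
    while node:
        # Assumes the traced program is deterministic, so all children of a
        # node share the same kind and label and differ only in branch outcome.
        kind, name = next(iter(node))[:2]
        if kind == "stmt":
            STATEMENTS[name](env)
            node = node[("stmt", name)]
        else:
            key = ("branch", name, PREDICATES[name](env))
            if key not in node:
                raise UnseenPath(f"no trace covers {name} == {key[2]}")
            node = node[key]
    return env["ret"]
```

With both traces merged, `run(merge([TRACE_NEG, TRACE_POS]), -3)` returns 4 and the same program returns 6 for input 5. With only `TRACE_NEG` merged, input 5 raises `UnseenPath`, which mirrors the abstract's condition that results are correct if no exceptions occur. A prefix tree duplicates the shared suffix `ret=y+1`; a realistic merge would also need to rejoin paths and handle loops.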

ISSTA'17, July 2017, Santa Barbara, CA, USA
Yonghwi Kwon, Weihang Wang, Yunhui Zheng, Xiangyu Zhang, and Dongyan Xu


Published In

ISSTA 2017: Proceedings of the 26th ACM SIGSOFT International Symposium on Software Testing and Analysis
July 2017
447 pages
ISBN: 9781450350761
DOI: 10.1145/3092703
  • General Chair: Tevfik Bultan
  • Program Chair: Koushik Sen
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Binary-analysis
  2. Binary-reuse
  3. Cross-platform
  4. Reverse-engineering

Qualifiers

  • Research-article

Conference

ISSTA '17

Acceptance Rates

Overall Acceptance Rate 58 of 213 submissions, 27%

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 13
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 30 Jan 2025

Cited By

  • (2024) DeLink: Source File Information Recovery in Binaries. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, pp. 1009-1021. DOI: 10.1145/3650212.3680338. Online publication date: 11-Sep-2024
  • (2024) PTGFI: A Prompt-Based Two-Stage Generative Framework for Function Name Inference. 2024 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 3839-3844. DOI: 10.1109/SMC54092.2024.10830989. Online publication date: 6-Oct-2024
  • (2022) Improving cross-platform binary analysis using representation learning via graph alignment. Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis, pp. 151-163. DOI: 10.1145/3533767.3534383. Online publication date: 18-Jul-2022
  • (2022) APT Attribution for Malware Based on Time Series Shapelets. 2022 IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), pp. 769-777. DOI: 10.1109/TrustCom56396.2022.00108. Online publication date: Dec-2022
  • (2019) CustomPro: Network Protocol Customization Through Cross-Host Feature Analysis. Security and Privacy in Communication Networks, pp. 67-85. DOI: 10.1007/978-3-030-37231-6_4. Online publication date: 11-Dec-2019
  • (2018) TOSS. Proceedings of the 2018 Workshop on Forming an Ecosystem Around Software Transformation, pp. 1-7. DOI: 10.1145/3273045.3273048. Online publication date: 15-Oct-2018
