Current commercial anti-virus software detects a virus only after the virus has appeared and caused damage. Motivated by the standard signature-based technique for detecting viruses and by a recent successful text classification method, we explore the idea of automatically detecting new malicious code using a collected dataset of benign and malicious code. We obtained an accuracy of 100% on the training data and 98% in 3-fold cross-validation.
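The abstract does not give the n-gram length, the profile size, or the distance measure, so the following is only a minimal sketch of how n-gram-based classification of binaries might look, not the authors' implementation. It assumes byte trigrams, a profile made of the most frequent n-grams with normalized frequencies, and a squared relative-difference distance; the function names (`ngram_profile`, `profile_distance`, `classify`) and the toy byte strings are hypothetical.

```python
from collections import Counter


def ngram_profile(data: bytes, n: int = 3, top: int = 100) -> dict:
    """Return the `top` most frequent byte n-grams with normalized frequencies.

    The values n=3 and top=100 are illustrative assumptions.
    """
    counts = Counter(data[i:i + n] for i in range(len(data) - n + 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: c / total for gram, c in counts.most_common(top)}


def profile_distance(p: dict, q: dict) -> float:
    """Sum of squared relative frequency differences over the union of n-grams.

    An n-gram missing from one profile counts as frequency 0 there.
    This distance is an assumed choice for the sketch.
    """
    dist = 0.0
    for gram in set(p) | set(q):
        a, b = p.get(gram, 0.0), q.get(gram, 0.0)
        # a + b > 0 because the n-gram occurs in at least one profile
        dist += ((a - b) / ((a + b) / 2)) ** 2
    return dist


def classify(sample: bytes, benign: dict, malicious: dict,
             n: int = 3, top: int = 100) -> str:
    """Label a sample by the nearer of the two class profiles."""
    s = ngram_profile(sample, n, top)
    if profile_distance(s, malicious) < profile_distance(s, benign):
        return "malicious"
    return "benign"


# Toy stand-ins for the collected benign and malicious code (made up here).
benign_profile = ngram_profile(b"hello world " * 20)
malicious_profile = ngram_profile(b"\x90\x90\xeb\xfe" * 20)

label = classify(b"\x90\x90\xeb\xfe\x90\x90\xeb\xfe",
                 benign_profile, malicious_profile)
print(label)
```

In this sketch each class profile is built once from the training data, and a new file is assigned to whichever class profile lies closer. Evaluating it with 3-fold cross-validation would mean building the profiles from two thirds of the files and classifying the remaining third, rotating the held-out fold three times.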
Citation:
Tony Abou-Assaleh, Nick Cercone, Vlado Kešelj, Ray Sweidan, "N-Gram-Based Detection of New Malicious Code," 28th Annual International Computer Software and Applications Conference, Workshops and Fast Abstracts (COMPSAC'04), vol. 2, pp. 41-42, 2004.