Section: New Software and Platforms


Evolving-PAcker Classifier

Keywords: Packer classification - Incremental learning - Clustering - Malware - Obfuscation

Functional Description: E-PAC is an evolving packer classifier that identifies the class of the packer used in each of a batch of packed binaries given as input. The software can identify both known packer classes and new, previously unseen packer classes. After each update, the classifier updates itself with the predicted packer classes.

The software is based on a semi-supervised machine learning system composed of an offline phase and an online phase. In the offline phase, a set of features is extracted from a collection of packed binaries provided with their ground truth labels, then a density-based clustering algorithm (DBSCAN) is used to group similar packers together with respect to a distance measure. In this step, the similarity threshold is tuned so that the resulting clusters best fit the set of labels provided.
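
The offline tuning step can be illustrated with a minimal sketch. It is not E-PAC's implementation: the two-dimensional feature vectors, the Euclidean distance, the Rand index used as agreement score, the candidate thresholds, and the names `dbscan`, `rand_index`, `tune_eps`, `packer_A` and `packer_B` are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist

NOISE = -1  # label for samples that belong to no cluster


def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one cluster id (or NOISE) per point."""
    labels = [None] * len(points)

    def neighbors(i):
        # Includes the point itself, as in the usual DBSCAN definition.
        return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later become a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
        cluster += 1
    return labels


def rand_index(pred, truth):
    """Fraction of sample pairs on which clustering and labels agree."""
    pairs = list(combinations(range(len(pred)), 2))
    agree = 0
    for a, b in pairs:
        same_pred = pred[a] == pred[b] and pred[a] != NOISE  # noise is never "together"
        agree += same_pred == (truth[a] == truth[b])
    return agree / len(pairs)


def tune_eps(points, truth, candidates, min_pts=2):
    """Pick the threshold whose clusters best match the ground truth labels."""
    return max(candidates, key=lambda eps: rand_index(dbscan(points, eps, min_pts), truth))


# Toy feature vectors for two hypothetical packer classes.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
truth = ["packer_A"] * 3 + ["packer_B"] * 3
best_eps = tune_eps(points, truth, candidates=[0.5, 1.5, 20.0])
```

On this toy data, the smallest threshold leaves every sample as noise and the largest one merges both classes, so the intermediate value is selected.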

In the online phase, the software repeats the same feature extraction and distance computation on the incoming packed samples, then uses a customized version of the incremental clustering algorithm DBSCAN to classify them: each sample is either assigned to a known packer class, used to form a new packer class, or provisionally left unclassified (the notion of noise in DBSCAN).
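
The three possible outcomes for an incoming sample can be sketched as follows. This is a deliberately simplified stand-in for the customized incremental DBSCAN mentioned above, not the algorithm itself: cluster merging and re-evaluation of core points are omitted, a sample joins the cluster of its nearest clustered neighbor, and the class name `OnlineClusterer`, the Euclidean distance and the toy vectors are assumptions.

```python
from math import dist

NOISE = -1  # label for samples left provisionally unclassified


class OnlineClusterer:
    """Simplified incremental assignment on top of existing clusters."""

    def __init__(self, points, labels, eps, min_pts):
        self.points = list(points)
        self.labels = list(labels)
        self.eps = eps
        self.min_pts = min_pts
        self.next_id = max(self.labels, default=-1) + 1  # id for the next new class

    def add(self, sample):
        """Classify one incoming sample and fold it into the model."""
        near = [i for i, p in enumerate(self.points) if dist(p, sample) <= self.eps]
        clustered = [i for i in near if self.labels[i] != NOISE]
        if clustered:
            # Known class: take the label of the nearest clustered neighbor.
            nearest = min(clustered, key=lambda i: dist(self.points[i], sample))
            label = self.labels[nearest]
        elif len(near) + 1 >= self.min_pts:
            # Enough nearby unclassified samples: they form a new class together.
            label = self.next_id
            self.next_id += 1
            for i in near:
                self.labels[i] = label
        else:
            label = NOISE  # stays unclassified until more samples arrive
        self.points.append(sample)
        self.labels.append(label)
        return label


# Start from one cluster (id 0) produced by a hypothetical offline phase.
clf = OnlineClusterer([(0, 0), (0, 1), (1, 0)], [0, 0, 0], eps=1.5, min_pts=2)
first = clf.add((1, 1))     # close to cluster 0
second = clf.add((10, 10))  # isolated, so left as noise
third = clf.add((10, 11))   # pairs up with the previous noise sample
```

Because every classified sample is stored with its label, the state after each call becomes the baseline for the next one, which mirrors the self-evolving behavior described in this section.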

The clusters formed after each update serve as a baseline for the application to self-evolve.