Classification process #23

harelber · 2022-11-26T20:38:51Z

Hi Hugo,
Regarding Adagio, do you remember how to run the classification process on the extracted graphs?
I extracted the graphs using the -f/-p flags. However, there is no documentation on how to actually train and test an ML model on these graphs. I assume that the common directory holds the relevant functions, but I can't find the sequence in which to call them.
I would be happy if you could add some instructions or a script for this.

hgascon · 2022-11-29T16:11:12Z

Hi @harelber, you can instantiate an Analysis object:

In [1]: from adagio.core.analysis import Analysis

In [2]: Analysis?
Init signature:
Analysis(
    dirs,
    labels,
    split,
    max_files=0,
    max_node_size=0,
    precomputed_matrix='',
    y='',
    fnames='',
)
Docstring:      A class to run a classification experiment
Init docstring:
The Analysis class allows to load sets of pickled graph objects
from different directories where the objects in each directory
belong to different classes. It also provides the methods to run
different types of classification experiments by training and
testing a linear classifier on the feature vectors generated
from the different graph objects.

:dirs: A list with directories including types of files for
    classification e.g. <[MALWARE_DIR, CLEAN_DIR]> or just
    directories with samples from different malware families
:labels: The labels assigned to samples in each directory.
    For example a number or a string.
:split: The percentage of samples used for training (value
    between 0 and 1)
:precomputed_matrix: name of file if a data or kernel matrix
    has already been computed.
:y: If precomputed_matrix is True, a pickled and gzipped list
    of labels must be provided.
:returns: an Analysis object with the dataset as a set of
    properties and several functions to train, test, evaluate
    or run a learning experiment iteratively.

And then run the different experiments in the Analysis class:

In [3]: a = Analysis(...)
[...]
In [4]: a.run_linear_experiment(...)

There are other helper functions in the Analysis class that you can experiment with. Let me know if that works.
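
As a rough sketch of the steps above, the required constructor arguments could be assembled like this. The directory names, the 0/1 labels, the 0.8 split and the build_analysis_args helper are all hypothetical placeholders, and passing one label per directory is an assumption based on the docstring. The arguments of run_linear_experiment are not shown in this thread, so that call is left elided.

```python
import os


def build_analysis_args(malware_dir, clean_dir, train_fraction=0.8):
    """Collect the three required Analysis arguments in a dict.

    Hypothetical helper, not part of adagio. It assumes one label per
    directory, given in the same order as the directories.
    """
    if not 0 < train_fraction < 1:
        raise ValueError("split must be a value between 0 and 1")
    return {
        "dirs": [malware_dir, clean_dir],  # directories of pickled graphs
        "labels": [1, 0],                  # assumed: 1 = malware, 0 = clean
        "split": train_fraction,           # fraction of samples used for training
    }


# Placeholder paths: replace with the directories produced by the extraction step.
args = build_analysis_args("graphs/malware", "graphs/clean")

try:
    from adagio.core.analysis import Analysis
except ImportError:
    Analysis = None  # adagio is not importable in this environment

# Only build the object when adagio is available and the directories exist.
if Analysis is not None and all(os.path.isdir(d) for d in args["dirs"]):
    a = Analysis(**args)
    # a.run_linear_experiment(...)  # arguments elided in the thread;
    # inspect them with `a.run_linear_experiment?` in IPython.
```

The optional arguments (max_files, max_node_size, precomputed_matrix, y, fnames) are left at their defaults here.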
