This repository was archived by the owner on Apr 26, 2019. It is now read-only.

kefth/secloud-taint

TaintClassify

SeCloud project on classifying node.js sinks and sources. Based on the OWASP list of JavaScript vulnerabilities. Inspired by the paper by Rasthofer et al., "A Machine-learning Approach for Classifying and Categorizing Android Sources and Sinks".

The classification app can be found in the secloudapp folder.

JSON data format

Data is extracted from multiple files downloaded from node.js and located in the json folder.

Currently only the 'textRaw' and 'params' fields are taken into account. These are aggregated in data.json.

Format is as follows:

{
    "cl": 0,
    "params": [
        "value",
        "message"
    ],
    "textRaw": "assert(value[, message])"
}

The "cl" field refers to the class. There are three classes in this dataset:

    neither:    0
    source:     1
    sink:       2

For an unknown class:

cl: -1
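
The label codes above can be decoded with a small lookup table. The following is a minimal sketch, not the project's code: the names CLASS_NAMES and class_counts are hypothetical, and data.json is assumed to hold a JSON array of records in the format shown earlier (an inline sample stands in for the file, and its second record is invented for illustration).

```python
import json
from collections import Counter

# Label codes as listed in this README; -1 marks an unknown class.
CLASS_NAMES = {0: "neither", 1: "source", 2: "sink", -1: "unknown"}


def class_counts(records):
    """Count how many records fall into each named class."""
    return Counter(CLASS_NAMES[record["cl"]] for record in records)


# Inline sample standing in for data.json (assumed to be a JSON array).
# The second record is a made-up example, not taken from the dataset.
sample = json.loads("""
[
  {"cl": 0, "params": ["value", "message"], "textRaw": "assert(value[, message])"},
  {"cl": -1, "params": ["path"], "textRaw": "hypothetical.example(path)"}
]
""")

counts = class_counts(sample)
```

Counting per class is a quick way to see how unbalanced the labels are before training anything on them.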

The Python file that handles parsing is processJSON.py.
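
As a rough illustration of this kind of parsing, the sketch below walks a documentation file and keeps only 'textRaw' and the parameter names. It is not processJSON.py itself. It assumes that method entries are dicts with "type": "method" and that parameter names sit under "signatures" and "params"; the functions iter_methods and to_record are hypothetical, and the inline doc stands in for a downloaded file.

```python
def iter_methods(node):
    """Yield every dict that looks like a method entry, at any nesting depth."""
    if isinstance(node, dict):
        if node.get("type") == "method" and "textRaw" in node:
            yield node
        for value in node.values():
            yield from iter_methods(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_methods(item)


def to_record(method):
    """Reduce a method entry to the fields kept in data.json, labelled unknown."""
    params = []
    for signature in method.get("signatures", []):
        for param in signature.get("params", []):
            name = param.get("name")
            if name and name not in params:
                params.append(name)
    return {"cl": -1, "params": params, "textRaw": method["textRaw"]}


# Inline stand-in for one downloaded file (layout is an assumption).
doc = {
    "modules": [{
        "textRaw": "Assert",
        "methods": [{
            "textRaw": "assert(value[, message])",
            "type": "method",
            "name": "assert",
            "signatures": [{"params": [{"name": "value"}, {"name": "message"}]}],
        }],
    }]
}

records = [to_record(method) for method in iter_methods(doc)]
```

Each new record starts with cl set to -1, so it can be hand-annotated later as neither, source, or sink.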

Features

For the handcrafted features used as input, see helperJSON.py.

Currently, features are binary (whether a feature is present) and are extracted from method names. They are based on the OWASP list of JavaScript vulnerabilities; for example, get is usually a source of information. There are 15 such features.
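
Binary features of this kind can be sketched as one flag per keyword. This is an illustration only: the KEYWORDS list is hypothetical and much shorter than the 15 features in helperJSON.py, and the functions method_name and binary_features are assumptions rather than the project's API.

```python
# Hypothetical keyword list; the project's actual 15 features are in helperJSON.py.
KEYWORDS = ["get", "read", "write", "send", "exec"]


def method_name(text_raw):
    """Return the bare method name from a signature like 'fs.readFile(path)'."""
    head = text_raw.split("(", 1)[0]
    return head.split(".")[-1].strip()


def binary_features(text_raw):
    """Return one 0/1 flag per keyword: 1 if the keyword occurs in the method name."""
    name = method_name(text_raw).lower()
    return [1 if keyword in name else 0 for keyword in KEYWORDS]


read_flags = binary_features("fs.readFile(path[, options], callback)")
get_flags = binary_features("http.get(options[, callback])")
assert_flags = binary_features("assert(value[, message])")
```

Substring matching is deliberately crude: it flags readFile for "read", but it would also flag any unrelated name that happens to contain a keyword.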

Issues

  • The dataset is small, with 265 hand-annotated examples.
  • The handcrafted features do not cover all possible cases of a source or a sink in Node.js, so some valuable information for classification is missing.
