NetFlow-Based-ML-NIPS on Docker

Most of the documentation, much like the base ELK implementation, is from Anthony Lapenna (AKA deviantony) and is used under his MIT License.

Elastic Stack version

Based on the official Docker images from Elastic:


Installation

Clone this repository onto the Docker host that will run the stack with the command below:

git clone https://github.com/JacDM/ML-Based-NIPS.git

Navigate to the directory

cd ML-Based-NIPS

Run the setup container and wait for Elasticsearch to start. If the setup container fails, run it again and Elasticsearch should be up.

docker-compose up setup

The next command starts all the other containers. Run it only after the previous step, as they all depend on Elasticsearch:

docker-compose up
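If you would rather script the wait than watch the logs, the following is a minimal sketch. It assumes the default http://localhost:9200 endpoint and the elastic/changeme credentials mentioned in this README; is_elasticsearch_up and wait_until are hypothetical helper names and are not part of this repository.

```python
import base64
import time
import urllib.error
import urllib.request


def is_elasticsearch_up(url="http://localhost:9200", user="elastic",
                        password="changeme", timeout=5.0):
    """Return True if Elasticsearch answers an authenticated request with HTTP 200."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error such as 401.
        return False


def wait_until(probe, attempts=30, delay=2.0, sleep=time.sleep):
    """Call probe() up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if probe():
            return True
        sleep(delay)
    return False
```

Calling wait_until(is_elasticsearch_up) before docker-compose up would then block until Elasticsearch responds or the attempts run out.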

Machine Learning & Deep Learning Experimentation

To run the Jupyter notebook, download the dataset and extract it into a folder called NF-ToN-IoT-v2 within the Data Sets folder. You can then run the notebook up to the Decision Tree section, at which point the modified and reduced dataset is saved, so you don't have to run that section again. You can also skip that step and use the already preprocessed files by running the second part, running the For Neural Networks cell only when testing ANNs.

Running the NIPS

By default the NIPS should start automatically. If it takes too long or doesn't start, you can execute:

python app.py

to start the app manually. Alternatively, run it on your host system independently of the Compose stack, which is my preferred method as you don't have to fiddle with global Docker variables.

Setting up NIPS Parameters

from elasticsearch import Elasticsearch

esClient = Elasticsearch('http://localhost:9200', basic_auth=["elastic", "changeme"])
hostIP = "127.0.0.1"  # Used to open an SSH client to your host OS to add firewall rules
username = "NIPS"  # Replace with your username
password = "passwd"  # Replace with your password
consoleLogLevel = 1  # Replace with the desired log level
nipsMode = 1  # Enables IP blocking if an attack is detected
timeFrame = "5m"  # Defines how far back to look when querying Elasticsearch
modelsPATH = "app/config/models/"  # Path where the models are stored

These parameters are hardcoded, and they are the ones used if you run the script on your host OS. If you do run the script on your host, also make sure all packages listed in app/Dockerfile are installed there, and that sklearn is version 1.4.1-post1.

However, if you run it via Docker, you don't need to worry about the packages as they are installed automatically.
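One way to avoid editing the script when switching between host and Docker runs is to read the same parameters from environment variables, falling back to the defaults listed above. This is only a sketch: the NIPS_* variable names and the NipsConfig/load_config names are hypothetical and are not defined anywhere in this repository.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class NipsConfig:
    es_url: str
    host_ip: str
    username: str
    password: str
    console_log_level: int
    nips_mode: int
    time_frame: str
    models_path: str


def load_config(env=None):
    """Build a NipsConfig from a mapping, using the README defaults as fallbacks."""
    env = os.environ if env is None else env
    return NipsConfig(
        # All NIPS_* names below are assumptions made for this example.
        es_url=env.get("NIPS_ES_URL", "http://localhost:9200"),
        host_ip=env.get("NIPS_HOST_IP", "127.0.0.1"),
        username=env.get("NIPS_USERNAME", "NIPS"),
        password=env.get("NIPS_PASSWORD", "passwd"),
        console_log_level=int(env.get("NIPS_LOG_LEVEL", "1")),
        nips_mode=int(env.get("NIPS_MODE", "1")),
        time_frame=env.get("NIPS_TIME_FRAME", "5m"),
        models_path=env.get("NIPS_MODELS_PATH", "app/config/models/"),
    )
```

For example, load_config({"NIPS_MODE": "0"}) would give a configuration with IP blocking switched off and every other value left at its default.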

Setting up SSH

For the NIPS to work you also need to set up an SSH server on the host. For Windows you can follow the Tutorial.
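The exact firewall command that app.py sends over SSH is not shown in this README. As an illustration only, the sketch below builds a Windows netsh rule that blocks inbound traffic from one address; build_block_command and the rule name are hypothetical. Validating the address first matters because the string ends up in a shell command on the host.

```python
import ipaddress


def build_block_command(ip):
    """Return a netsh command that blocks inbound traffic from `ip`."""
    # ip_address raises ValueError on anything that is not a plain IPv4/IPv6 address,
    # which keeps stray shell characters out of the command.
    address = ipaddress.ip_address(ip)
    rule_name = f"NIPS block {address}"
    return (
        "netsh advfirewall firewall add rule "
        f'name="{rule_name}" dir=in action=block remoteip={address}'
    )
```

The resulting string could then be executed through whichever SSH client the app opens to hostIP.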

Setting up nProbe

Warning

nProbe requires a licence to generate flows. Contact the ntop team to request a free educational licence. I cannot share mine as it is linked to my VM and will not work on another machine.

Install nProbe from the ntop website, set up the licence file, and run the command below after navigating to the bin folder of the install directory:

.\nprobe /c -i 'Intel(R) PRO/1000 MT Network Connection #3' -V 9 -b 1 -t 60 -o
1 --export-template -T " %IPV4_SRC_ADDR %L4_SRC_PORT %IPV4_DST_ADDR
%L4_DST_PORT %PROTOCOL %FLOW_DURATION_MILLISECONDS %TCP_WIN_MAX_IN
%DURATION_OUT %MAX_TTL %L7_PROTO %SRC_TO_DST_AVG_THROUGHPUT %SHORTEST_FLOW_PKT
%MIN_IP_PKT_LEN %TCP_WIN_MAX_OUT %OUT_BYTES %FIRST_SWITCHED %LAST_SWITCHED "

-V specifies the NetFlow format used (version 9 here), -t is the timeout for active conversations, and -T specifies the features we want to be sent in each flow.
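Because the -T template is long and easy to mistype, it can help to keep the field names in one list and generate the string from it. The sketch below does only that; FLOW_FIELDS mirrors the fields in the command above, and build_template is a hypothetical helper name.

```python
# Field names copied from the nProbe command in this README, in the same order.
FLOW_FIELDS = [
    "IPV4_SRC_ADDR", "L4_SRC_PORT", "IPV4_DST_ADDR", "L4_DST_PORT",
    "PROTOCOL", "FLOW_DURATION_MILLISECONDS", "TCP_WIN_MAX_IN",
    "DURATION_OUT", "MAX_TTL", "L7_PROTO", "SRC_TO_DST_AVG_THROUGHPUT",
    "SHORTEST_FLOW_PKT", "MIN_IP_PKT_LEN", "TCP_WIN_MAX_OUT", "OUT_BYTES",
    "FIRST_SWITCHED", "LAST_SWITCHED",
]


def build_template(fields):
    """Join field names into the space-separated %NAME form passed to -T."""
    return " ".join(f"%{name}" for name in fields)
```

build_template(FLOW_FIELDS) yields the value to place between the quotes after -T.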

Setting up Kibana

While Kibana works out of the box, you need to set up a few things so that flows reach the Python app and visualization is easy. First, head to the Discover panel and create a new data view.

Elastiflow view setup

After ElastiFlow sends some flows to Elasticsearch, enter elastiflow* in the index pattern field and select @timestamp from the dropdown for the timestamp field. Name the view elastiflow* as well and save it. If the pattern matches some sources, everything is working; if not, nProbe is not sending data to Elasticsearch. If you don't see any data, try changing the time frame at the top right of the page.
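To check from outside Kibana whether any flows have arrived recently, you can count documents in the elastiflow* indices over the same look-back window the NIPS uses (timeFrame, "5m" by default). The sketch below assumes the default URL and credentials from this README and uses the Elasticsearch _count endpoint; the function names are hypothetical.

```python
import base64
import json
import urllib.request


def recent_flows_query(time_frame="5m"):
    """Build a range query matching documents newer than now minus time_frame."""
    return {"query": {"range": {"@timestamp": {"gte": f"now-{time_frame}"}}}}


def count_recent_flows(base_url="http://localhost:9200", index="elastiflow*",
                       user="elastic", password="changeme", time_frame="5m"):
    """Return how many flow documents were indexed within the time frame."""
    body = json.dumps(recent_flows_query(time_frame)).encode()
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        f"{base_url}/{index}/_count",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json",
                 "Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["count"]
```

A count of zero with a running stack suggests the problem is upstream, in nProbe or ElastiFlow, rather than in the Kibana time picker.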

Elastiflow data View

To get the visualizations like the one below:

Elastiflow Sankey Diagram

it's a little more involved:

  • Open the hamburger menu, scroll down and click Management.
  • Scroll down the list to the Kibana section and click Saved Objects.
  • From this page, hit the Import button, select the file located at kibana\Dashboard.ndjson in this repo and import all 400 or so objects.
  • Next, navigate to Data Views within the Kibana section of the left-hand menu.
  • Three new flow views will appear, named following the pattern elastiflow-X-ecs*. Click on each one, click the edit button and change the index pattern to elastiflow*.
  • Now when you open any of the dashboards, some elements may be broken because some Information Elements (IEs) parsed by ElastiFlow have different names. Edit those elements and choose an appropriate feature as a replacement.
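Before importing, you may want to see what the ndjson file contains. The sketch below counts saved objects per type from an iterable of ndjson lines. It assumes each saved object is a JSON line with a "type" key and that any trailing export summary line has none; count_saved_objects is a hypothetical helper.

```python
import json
from collections import Counter


def count_saved_objects(lines):
    """Count saved objects per type in an iterable of ndjson lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Assumption: real saved objects carry a "type" key, summary lines do not.
        if "type" in record:
            counts[record["type"]] += 1
    return counts
```

Passing an open handle on kibana/Dashboard.ndjson to count_saved_objects gives a quick breakdown of dashboards, visualizations and data views to compare with what Kibana reports after the import.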

Elastiflow

Warning

ElastiFlow requires the free trial version to collect and parse nProbe Information Elements. The current credentials will expire on 11/05/2024; after that you can go to ElastiFlow's website and obtain another account ID and licence key.

Contents

  1. Requirements
  2. Usage
  3. Configuration
  4. Extensibility
  5. JVM tuning
  6. Going further

Requirements

Host setup

Note

Especially on Linux, make sure your user has the required permissions to interact with the Docker daemon.

By default, the stack exposes the following ports:

  • 2055: ElastiFlow Netflow input
  • 9200: Elasticsearch HTTP
  • 9300: Elasticsearch TCP transport
  • 5601: Kibana
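A quick way to confirm the TCP services are reachable is a plain connect test, sketched below. Port 2055 is left out on the assumption that the NetFlow input listens on UDP, where a TCP connect would say nothing useful. TCP_PORTS, port_open and check_stack are hypothetical names.

```python
import socket

# TCP ports from the list above; the NetFlow input (2055) is assumed to be UDP.
TCP_PORTS = {
    9200: "Elasticsearch HTTP",
    9300: "Elasticsearch TCP transport",
    5601: "Kibana",
}


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_stack(host="127.0.0.1"):
    """Map each service name to whether its port accepts TCP connections."""
    return {name: port_open(host, port) for port, name in TCP_PORTS.items()}
```

Printing check_stack() after docker-compose up shows at a glance which service has not come up yet.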

Warning

Elasticsearch's bootstrap checks were purposely disabled to facilitate the setup of the Elastic stack in development environments. For production setups, we recommend users to set up their host according to the instructions from the Elasticsearch documentation: Important System Configuration.

Docker Desktop

Windows

If you are using the legacy Hyper-V mode of Docker Desktop for Windows, ensure File Sharing is enabled for the C: drive.

macOS

Warning

While the NIPS can technically run on a Mac, this has not been tested and is not supported. Additionally, NIPS Mode, which blocks IPs, will not work as there is no command implemented for it, so set the Docker environment variables and Dockerfile defaults in the app folder accordingly.

The default configuration of Docker Desktop for Mac allows mounting files from /Users/, /Volumes/, /private/, /tmp and /var/folders exclusively. Make sure the repository is cloned in one of those locations or follow the instructions from the documentation to add more locations.

Usage

Note

The instructions below come straight from deviantony and are meant for troubleshooting; you should not need any of them.

Important

Platinum features are enabled by default for a trial duration of 30 days. After this evaluation period, you will retain access to all the free features included in the Open Basic license seamlessly, without manual intervention required, and without losing any data. Refer to the How to disable paid features section to opt out of this behaviour.

Warning

You must rebuild the stack images with docker-compose build whenever you switch branch or update the version of an already existing stack.

Note

You can also run all services in the background (detached mode) by appending the -d flag to the docker-compose up command.

Give Kibana about a minute to initialize, then access the Kibana web UI by opening http://localhost:5601 in a web browser and use the following (default) credentials to log in:

  • user: elastic
  • password: changeme

Note

Upon the initial startup, the elastic and kibana_system Elasticsearch users are initialized with the values of the passwords defined in the .env file ("changeme" by default). The first one is the built-in superuser; the second is used by Kibana to communicate with Elasticsearch. This task is only performed during the initial startup of the stack. To change users' passwords after they have been initialized, please refer to the instructions in the next section.

Initial setup

Setting up user authentication

Note

Refer to Security settings in Elasticsearch to disable authentication.

Warning

Starting with Elastic v8.0.0, it is no longer possible to run Kibana using the bootstrapped privileged elastic user.

The "changeme" password set by default for all aforementioned users is insecure. For increased security, we will reset the passwords of all aforementioned Elasticsearch users to random secrets.

  1. Reset passwords for default users

    The commands below reset the passwords of the elastic and kibana_system users. Take note of them.

    docker-compose exec elasticsearch bin/elasticsearch-reset-password --batch --user elastic
    docker-compose exec elasticsearch bin/elasticsearch-reset-password --batch --user kibana_system

    If the need for it arises (e.g. if you want to collect monitoring information through Beats and other components), feel free to repeat this operation at any time for the rest of the built-in users.

  2. Replace usernames and passwords in configuration files

    Replace the password of the elastic user inside the .env file with the password generated in the previous step. Its value isn't used by any core component, but extensions use it to connect to Elasticsearch.

    Note: In case you don't plan on using any of the provided extensions, or prefer to create your own roles and users to authenticate these services, it is safe to remove the ELASTIC_PASSWORD entry from the .env file altogether after the stack has been initialized.

    Replace the password of the kibana_system user inside the .env file with the password generated in the previous step. Its value is referenced inside the Kibana configuration file (kibana/config/kibana.yml).

    See the Configuration section below for more information about these configuration files.

  3. Restart Python and Kibana to re-connect to Elasticsearch using the new passwords

    docker-compose up -d python kibana
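Step 2 above can be scripted if you rotate passwords often. The sketch below rewrites one KEY='value' entry in the text of a .env file. ELASTIC_PASSWORD is the entry named in this section; any other key you pass, such as a kibana_system password entry, is an assumption about how your .env is laid out. set_env_value is a hypothetical helper and does not escape single quotes inside the value.

```python
def set_env_value(text, key, value):
    """Return .env text with `key` set to `value`, appending the entry if missing."""
    lines = text.splitlines()
    replaced = False
    for index, line in enumerate(lines):
        if line.startswith(f"{key}="):
            lines[index] = f"{key}='{value}'"
            replaced = True
    if not replaced:
        lines.append(f"{key}='{value}'")
    return "\n".join(lines) + "\n"
```

Read the .env file, pass its contents through set_env_value with the password printed in step 1, write the result back, then restart the services as in step 3.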

Note

Learn more about the security of the Elastic stack at Secure the Elastic Stack.

Cleanup

Elasticsearch data is persisted inside a volume by default.

In order to entirely shutdown the stack and remove all persisted data, use the following Docker Compose command:

docker-compose down -v

Version selection

This repository stays aligned with the latest version of the Elastic stack. The main branch tracks the current major version (8.x).

To use a different version of the core Elastic components, simply change the version number inside the .env file. If you are upgrading an existing stack, remember to rebuild all container images using the docker-compose build command.

Important

Always pay attention to the official upgrade instructions for each individual component before performing a stack upgrade.

Older major versions are also supported on separate branches:

Configuration

Important

Configuration is not dynamically reloaded, you will need to restart individual components after any configuration change.

How to configure Elasticsearch

The Elasticsearch configuration is stored in elasticsearch/config/elasticsearch.yml.

You can also specify the options you want to override by setting environment variables inside the Compose file:

elasticsearch:

  environment:
    network.host: _non_loopback_
    cluster.name: my-cluster

Please refer to the following documentation page for more details about how to configure Elasticsearch inside Docker containers: Install Elasticsearch with Docker.

How to configure Kibana

The Kibana default configuration is stored in kibana/config/kibana.yml.

You can also specify the options you want to override by setting environment variables inside the Compose file:

kibana:

  environment:
    SERVER_NAME: kibana.example.org

Please refer to the following documentation page for more details about how to configure Kibana inside Docker containers: Install Kibana with Docker.

How to configure Logstash

The Logstash configuration is stored in logstash/config/logstash.yml.

You can also specify the options you want to override by setting environment variables inside the Compose file:

logstash:

  environment:
    LOG_LEVEL: debug

Please refer to the following documentation page for more details about how to configure Logstash inside Docker containers: Configuring Logstash for Docker.

How to disable paid features

You can cancel an ongoing trial before its expiry date — and thus revert to a basic license — either from the License Management panel of Kibana, or using Elasticsearch's start_basic Licensing API. Please note that the second option is the only way to recover access to Kibana if the license isn't either switched to basic or upgraded before the trial's expiry date.

Changing the license type by switching the value of Elasticsearch's xpack.license.self_generated.type setting from trial to basic (see License settings) will only work if done prior to the initial setup. After a trial has been started, the loss of features from trial to basic must be acknowledged using one of the two methods described in the first paragraph.
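For reference, the start_basic call mentioned in the first paragraph is a single authenticated POST. The sketch below only builds the request; it assumes the default URL and credentials from this README, and start_basic_request is a hypothetical helper name. The acknowledge=true parameter is what confirms the loss of trial features.

```python
import base64
import urllib.request


def start_basic_request(base_url="http://localhost:9200", user="elastic",
                        password="changeme"):
    """Build the POST request that switches the cluster to a basic license."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        f"{base_url}/_license/start_basic?acknowledge=true",
        method="POST",
        headers={"Authorization": f"Basic {token}"},
    )
```

Sending it is then a matter of passing the result to urllib.request.urlopen and inspecting the JSON response.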

How to scale out the Elasticsearch cluster

Follow the instructions from the Wiki: Scaling out Elasticsearch

How to re-execute the setup

To run the setup container again and re-initialize all users for which a password was defined inside the .env file, simply "up" the setup Compose service again:

$ docker-compose up setup
 ⠿ Container docker-elk-elasticsearch-1  Running
 ⠿ Container docker-elk-setup-1          Created
Attaching to docker-elk-setup-1
...
docker-elk-setup-1  | [+] User 'monitoring_internal'
docker-elk-setup-1  |    ⠿ User does not exist, creating
docker-elk-setup-1  | [+] User 'beats_system'
docker-elk-setup-1  |    ⠿ User exists, setting password
docker-elk-setup-1 exited with code 0

How to reset a password programmatically

If for any reason you are unable to use Kibana to change the password of your users (including built-in users), you can use the Elasticsearch API instead and achieve the same result.

In the example below, we reset the password of the elastic user (notice "/user/elastic" in the URL):

curl -XPOST -D- 'http://localhost:9200/_security/user/elastic/_password' \
    -H 'Content-Type: application/json' \
    -u elastic:<your current elastic password> \
    -d '{"password" : "<your new password>"}'
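The same call can be made from Python, which may be convenient since the NIPS app is already written in it. This sketch mirrors the curl command using only the standard library; password_reset_request is a hypothetical helper name and the default URL is the one used throughout this README.

```python
import base64
import json
import urllib.request


def password_reset_request(username, current_password, new_password,
                           base_url="http://localhost:9200", auth_user="elastic"):
    """Build the POST request that changes `username`'s password."""
    body = json.dumps({"password": new_password}).encode()
    token = base64.b64encode(f"{auth_user}:{current_password}".encode()).decode()
    return urllib.request.Request(
        f"{base_url}/_security/user/{username}/_password",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json",
                 "Authorization": f"Basic {token}"},
    )
```

Pass the returned request to urllib.request.urlopen to apply the change, then update the .env file and restart the dependent services.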

Extensibility

How to add plugins

To add plugins to any ELK component you have to:

  1. Add a RUN statement to the corresponding Dockerfile (e.g. RUN logstash-plugin install logstash-filter-json)
  2. Add the associated plugin code configuration to the service configuration (e.g. Logstash input/output)
  3. Rebuild the images using the docker-compose build command

How to enable the provided extensions

A few extensions are available inside the extensions directory. These extensions provide features which are not part of the standard Elastic stack, but can be used to enrich it with extra integrations.

The documentation for these extensions is provided inside each individual subdirectory, on a per-extension basis. Some of them require manual changes to the default ELK configuration.

JVM tuning

How to specify the amount of memory used by a service

The startup scripts for Elasticsearch and Logstash can append extra JVM options from the value of an environment variable, allowing the user to adjust the amount of memory that can be used by each component:

Service Environment variable
Elasticsearch ES_JAVA_OPTS
Logstash LS_JAVA_OPTS

To accommodate environments where memory is scarce (Docker Desktop for Mac has only 2 GB available by default), the Heap Size allocation is capped by default in the docker-compose.yml file to 512 MB for Elasticsearch and 256 MB for Logstash. If you want to override the default JVM configuration, edit the matching environment variable(s) in the docker-compose.yml file.

For example, to increase the maximum JVM Heap Size for Logstash:

logstash:

  environment:
    LS_JAVA_OPTS: -Xms1g -Xmx1g

When these options are not set:

  • Elasticsearch starts with a JVM Heap Size that is determined automatically.
  • Logstash starts with a fixed JVM Heap Size of 1 GB.
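When editing these variables it is easy to change -Xms and forget -Xmx, or the reverse. Elastic's guidance is generally to set both to the same value, so a small check like the sketch below can catch a mismatch before you restart the stack. heap_settings and heap_is_fixed are hypothetical helper names.

```python
import re


def heap_settings(java_opts):
    """Extract the -Xms and -Xmx values from a JAVA_OPTS string."""
    return dict(re.findall(r"-(Xms|Xmx)(\d+[kKmMgG]?)", java_opts))


def heap_is_fixed(java_opts):
    """Return True when both flags are present and set to the same value."""
    settings = heap_settings(java_opts)
    return "Xms" in settings and settings.get("Xms") == settings.get("Xmx")
```

For the example above, heap_is_fixed("-Xms1g -Xmx1g") is True, while a string that sets only one of the two flags is reported as not fixed.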

How to enable a remote JMX connection to a service

As for the Java Heap memory (see above), you can specify JVM options to enable JMX and map the JMX port on the Docker host.

Update the {ES,LS}_JAVA_OPTS environment variable with the following content (I've mapped the JMX service on the port 18080, you can change that). Do not forget to update the -Djava.rmi.server.hostname option with the IP address of your Docker host (replace DOCKER_HOST_IP):

logstash:

  environment:
    LS_JAVA_OPTS: -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.port=18080 -Dcom.sun.management.jmxremote.rmi.port=18080 -Djava.rmi.server.hostname=DOCKER_HOST_IP -Dcom.sun.management.jmxremote.local.only=false
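Since the option string is long and the port appears twice, generating it avoids the two values drifting apart. The sketch below reproduces the options shown above for a given host IP and port; jmx_java_opts is a hypothetical helper name.

```python
def jmx_java_opts(host_ip, port=18080):
    """Build the JMX-related JVM options for the given Docker host IP and port."""
    options = [
        "-Dcom.sun.management.jmxremote",
        "-Dcom.sun.management.jmxremote.ssl=false",
        "-Dcom.sun.management.jmxremote.authenticate=false",
        f"-Dcom.sun.management.jmxremote.port={port}",
        f"-Dcom.sun.management.jmxremote.rmi.port={port}",
        f"-Djava.rmi.server.hostname={host_ip}",
        "-Dcom.sun.management.jmxremote.local.only=false",
    ]
    return " ".join(options)
```

The returned string is what you would paste as the value of ES_JAVA_OPTS or LS_JAVA_OPTS in the Compose file. Note that these options disable SSL and authentication, so they are only suitable for a trusted development network.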

Going further

Plugins and integrations

See the following Wiki pages: