Computer Engineering Master's Degree final project @ University of Rome 'Tor Vergata'
Author: Francesco Marino
Academic Year: 2019/2020
ServerlessFlowBench is a framework that allows users to:
- deploy serverless functions to Amazon Web Services, Google Cloud Platform and OpenWhisk (already defined functions are available),
- deploy serverless function compositions to Amazon Web Services, Google Cloud Platform and OpenWhisk (as before, already defined compositions are available),
- perform HTTP benchmarks on deployed functions and compositions.
- Java Development Kit (JDK) version 8 (recommended) or newer
- Docker Desktop with the following Docker images installed:
  - amazon/aws-cli with tag 2.0.60
  - google/cloud-sdk with tag 316.0.0
  - francescom412/ow-utils-complete with tag 63a5498
  - influxdb with tag 1.8.2
  - grafana/grafana with tag 6.5.0
  - mysql with tag 8.0.17
  - bschitter/alpine-with-wrk2 with tag 0.1
- Amazon Web Services valid account that can access the following services:
- Google Cloud Platform valid account with, at least, the following enabled:
- OpenWhisk running deployment
- [OPTIONAL] Azure valid and active account with, at least, the following enabled:
Please note: the framework remains usable even with just 1 or 2, out of the 3, serverless platform(s) available.
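As a convenience, the image list above can be turned into the corresponding docker pull commands. The following is a minimal sketch, not part of the framework: the names REQUIRED_IMAGES and pull_commands are hypothetical, while the image names and tags are copied from the list above.

```python
# Image names and tags copied from the prerequisites list.
REQUIRED_IMAGES = {
    "amazon/aws-cli": "2.0.60",
    "google/cloud-sdk": "316.0.0",
    "francescom412/ow-utils-complete": "63a5498",
    "influxdb": "1.8.2",
    "grafana/grafana": "6.5.0",
    "mysql": "8.0.17",
    "bschitter/alpine-with-wrk2": "0.1",
}


def pull_commands(images):
    """Return one `docker pull <image>:<tag>` command per required image."""
    return [f"docker pull {name}:{tag}" for name, tag in images.items()]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before being run in a shell.
    for command in pull_commands(REQUIRED_IMAGES):
        print(command)
```

The script only prints the commands, so nothing is downloaded until they are run explicitly.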
Folder containing files needed for a container based execution of the project architecture.
- docker-compose.yml used to describe and deploy the project support architecture,
- grafana_storage folder, to be added if not present, used to store Grafana container content and implement persistence,
- influx_storage folder, to be added if not present, used to store InfluxDB container content and implement persistence,
- mysql_storage folder, to be added if not present, used to store MySQL container content and implement persistence,
- grafana_dashboards folder used to store Grafana dashboards needed to show benchmarks results.
- MySQL: a relational database used to keep track of every entity deployed to the cloud in order to be able to reach and, eventually, delete each of them.
- InfluxDB: a time series database used to keep track of benchmarks' results, each of them with the right test performance date and time.
- Grafana: a visualization tool used to show the benchmark results stored in InfluxDB in clear, explanatory dashboards.
The docker-compose.yml file lists the credentials needed to access the service containers:
- MySQL:
  - username: root,
  - password: password.
- InfluxDB:
  - username: root,
  - password: password.
- Grafana:
  - username: root,
  - password: password.
In order to import the dashboards saved in grafana_dashboards into Grafana, once the Docker compose environment is up:
- connect to http://localhost:3000,
- log in using the Grafana username and password,
- select the "setting" panel,
- choose "datasources" and add a new datasource,
- choose InfluxDB as datasource, set http://influx-db:8086 (or replace "influx-db" with your InfluxDB Docker container name) as url, select your database (the name can be set using the config.properties file located in the project root) and insert the InfluxDB credentials (please note: an error message appears if the database does not exist yet; make sure to insert the correct name and ignore the error, as the information becomes consistent once the first measurements are inserted),
- select the "+" tab,
- choose the "import" option,
- select every dashboard inside the grafana_dashboards directory.
Folder containing examples of serverless functions and compositions created and benchmarked by the author.
Here is the list of the functionalities realized:
- basic_composition: composition realized just calling two different functions.
  - latency_test: JSON response generator.
  - cpu_test: big number factorization.
- memory_test: dynamic array allocation and filling.
- face_recognition: detection of face and anger in an image.
  - image_recognition: detection of faces.
  - anger_detection: detection of anger if face found.
- cycle_translator: translation of sentences from any language to English (OpenWhisk version not realized).
  - loop_controller: utility to manage more sentence translation at a time.
  - language_detection: detection of the sentence language.
  - sentence_translation: translation to English language.
  - translation_logger: translation logging in a cloud bucket.
Each of them has been realized for Python, Java and Node.js (JavaScript) in different versions, one for each tested provider.
- aws folder containing functionalities meant to be deployed to Amazon Web Services:
  - java containing Java AWS version of the functionalities,
  - node containing Node.js AWS version of the functionalities,
  - python containing Python AWS version of the functionalities,
  - orchestration_handler folder containing a Python handler to execute and return result of compositions.
- gcloud folder containing functionalities meant to be deployed to Google Cloud Platform:
  - java containing Java Google Cloud version of the functionalities,
  - node containing Node.js Google Cloud version of the functionalities,
  - python containing Python Google Cloud version of the functionalities,
  - orchestration_handler folder containing a Python handler to execute and return result of compositions.
- openwhisk folder containing functionalities meant to be deployed to OpenWhisk:
This directory contains, in its subdirectories, Java code for Serverless Composition Performance Project execution; further details are provided in the next section.
This entire part of the project was developed using JetBrains' IntelliJ IDEA, so it is recommended to open it with this IDE for better code navigation.
The main folder contains the class ServerlessFlowBenchMain.java, the application entry point that allows the user to:
- deploy serverless functions,
- deploy serverless compositions,
- optionally deploy elements needed by the previous entities to work (e.g. cloud buckets),
- perform benchmarks on functions and compositions,
- deploy serverless functions that collect information about their execution environment,
- remove every entity previously deployed.
This package contains classes for shell commands execution grouped by functionality type.
In the main folder there are:
- CommandExecutor.java, an abstract class providing common functions needed for shell command execution,
- CommandUtility.java, an abstract class providing common functions and elements needed for shell command building,
- StreamGobbler.java used to collect the output of shell commands during their execution.
- BenchmarkCommandExecutor.java needed to execute load benchmarks, cold start benchmarks and collect results,
- BenchmarkCommandUtility.java needed to build shell commands for load benchmarks execution using wrk2,
- output_parsing package containing utilities to parse benchmarks results:
  - BenchmarkCollector.java needed to parse wrk2 benchmarks results,
  - BenchmarkStats.java needed to collect wrk2 benchmarks results.
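To give an idea of what parsing wrk2 results involves, here is a minimal Python sketch that extracts the average latency and the request rate from a wrk-style summary. It is not the logic of BenchmarkCollector.java, which is written in Java and is not reproduced here: the function name parse_wrk_summary and the SAMPLE text are hypothetical, and the sample values are made up for illustration, not measured.

```python
import re

# Conversion factors from the time units printed by wrk to milliseconds.
_UNIT_TO_MS = {"us": 0.001, "ms": 1.0, "s": 1000.0, "m": 60000.0}

# Illustrative summary in the layout printed by wrk/wrk2 (values are made up).
SAMPLE = """\
Running 30s test @ http://example.com
  2 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.15ms    1.02ms  20.11ms   85.32%
  60018 requests in 30.00s, 19.81MB read
Requests/sec:   2000.12
Transfer/sec:    676.14KB
"""


def parse_wrk_summary(output):
    """Extract average latency (in ms) and requests per second from the summary."""
    latency = re.search(r"^\s*Latency\s+([\d.]+)(us|ms|s|m)\s", output, re.MULTILINE)
    rate = re.search(r"^Requests/sec:\s+([\d.]+)", output, re.MULTILINE)
    if latency is None or rate is None:
        raise ValueError("unrecognized wrk2 output")
    return {
        "avg_latency_ms": float(latency.group(1)) * _UNIT_TO_MS[latency.group(2)],
        "requests_per_sec": float(rate.group(1)),
    }
```

Calling parse_wrk_summary(SAMPLE) returns a dictionary with the two values, which is the kind of record a statistics holder such as BenchmarkStats.java could carry.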
- DockerException.java raised when a Docker daemon execution related error occurs,
- DockerExecutor.java needed to check Docker containers correct configuration, Docker images presence and Docker composition running.
- AmazonCommandUtility.java used to create Amazon Web Services CLI shell commands,
- GoogleCommandUtility.java used to create Google Cloud Platform CLI shell commands,
- OpenWhiskCommandUtility.java used to create OpenWhisk CLI shell commands,
- BucketsCommandExecutor.java used to execute cloud buckets related commands,
- CompositionCommandExecutor.java used to execute serverless compositions related commands,
- FunctionCommandExecutor.java used to execute serverless functions related commands,
- TablesCommandExecutor.java used to execute cloud NoSQL storage related commands,
- IllegalNameException.java raised when an attempt is made to assign a malformed name to a resource,
- output_parsing package containing utilities to parse command outputs:
  - ReplyCollector.java used to collect console command execution output,
  - URLFinder.java used to collect deployment url from console command execution output,
- security package containing security utilities:
  - GoogleAuthClient.java used to authenticate Google Cloud Workflows [BETA] executions urls.
This package contains classes needed for external databases interaction.
- InfluxClient.java used to export benchmark results to the time series database InfluxDB.
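For readers unfamiliar with how InfluxDB 1.x stores points, the sketch below builds a single point in InfluxDB line protocol. This only illustrates the data model and is not necessarily how InfluxClient.java writes data; the function name to_line_protocol and the measurement, tag and field names in the usage note are hypothetical.

```python
def _escape_key(value):
    """Escape commas, equals signs and spaces in tag keys, tag values and field keys."""
    return str(value).replace(",", r"\,").replace("=", r"\=").replace(" ", r"\ ")


def _escape_measurement(value):
    """Escape commas and spaces in measurement names."""
    return str(value).replace(",", r"\,").replace(" ", r"\ ")


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Build one point: measurement,tag=value field=value timestamp.

    Field values are written as floats; timestamp_ns is in nanoseconds.
    """
    tag_part = "".join(
        f",{_escape_key(key)}={_escape_key(value)}" for key, value in sorted(tags.items())
    )
    field_part = ",".join(
        f"{_escape_key(key)}={float(value)}" for key, value in fields.items()
    )
    return f"{_escape_measurement(measurement)}{tag_part} {field_part} {timestamp_ns}"
```

For example, to_line_protocol("latency_test", {"provider": "aws"}, {"avg_latency_ms": 2.15}, 1600000000000000000) produces the line latency_test,provider=aws avg_latency_ms=2.15 1600000000000000000, where the trailing timestamp is what lets each result keep the date and time of its test.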
- CloudEntityData.java used to collect functions, compositions, bucket and NoSQL table information,
- DAO.java, an abstract class providing common information and methods needed by database access objects,
- FunctionalityURL.java used to collect resource deployment url,
- MySQLConnect.java used to connect and disconnect MySQL database,
- daos package containing database access objects implementations:
  - BucketsRepositoryDAO.java needed for cloud buckets' persistence management,
  - CompositionsRepositoryDAO.java needed for serverless compositions' persistence management,
  - FunctionsRepositoryDAO.java needed for serverless functions' persistence management,
  - TablesRepositoryDAO.java needed for cloud NoSQL tables' persistence management.
This package contains classes needed for configuration purposes.
- ComposeManager.java used to automatically obtain the Docker images used inside docker-compose.yml,
- PropertiesManager.java used to get configuration parameters from the config.properties file stored in the project root (further details provided in following sections).
Authentication files related to user's active services required to run the application (the ones used in the development process were excluded using .gitignore file for privacy related reasons).
A file named credentials is required in serverless_functions/aws/.aws; it should contain the AWS account access key and secret. This file has the following structure:
[default]
aws_access_key_id=xxxxxxxxxx
aws_secret_access_key=xxxxxxxxxx
It can be downloaded from AWS Console → My Security Credentials (in the account menu) → Access Keys → New Access Key.
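Before launching the framework, it can be useful to check that the credentials file has the structure shown above. The following is a minimal sketch using Python's standard configparser; the function name read_aws_credentials is hypothetical and the check is not part of the framework.

```python
import configparser


def read_aws_credentials(text, profile="default"):
    """Return (access key id, secret access key) from the content of a credentials file."""
    # Interpolation is disabled so that a '%' inside a secret is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section(profile):
        raise ValueError(f"profile [{profile}] not found")
    section = parser[profile]
    required = ("aws_access_key_id", "aws_secret_access_key")
    missing = [key for key in required if not section.get(key)]
    if missing:
        raise ValueError("missing entries: " + ", ".join(missing))
    return section["aws_access_key_id"], section["aws_secret_access_key"]
```

The function takes the file content as a string, so it can be fed with the text read from serverless_functions/aws/.aws/credentials.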
A file named credentials.json is required in serverless_functions/gcloud/.credentials; it should contain the information related to a Google Cloud Platform service account. This file has the following structure:
{
  "type": "service_account",
  "project_id": "id of the Google Cloud Platform project",
  "private_key_id": "xxxxxxxxxxxxxxx",
  "private_key": "-----BEGIN PRIVATE KEY-----\nxxxxxxxxxxxxxxxxxx\n-----END PRIVATE KEY-----\n",
  "client_email": "xxxxxxxxxx@xxxxx.xxx",
  "client_id": "xxxxxxxxxxxxxxx",
  "auth_uri": "https://xxxxxxxxxxxxx",
  "token_uri": "https://xxxxxxxxxx",
  "auth_provider_x509_cert_url": "https://xxxxxxxxxx",
  "client_x509_cert_url": "https://xxxxxxxxxxx"
}
It can be downloaded from Google Cloud Platform Console → API and services (in the side menu) → Credentials → Service accounts (selecting the one with desired authorizations) → New key.
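A quick way to verify that a downloaded credentials.json exposes the fields listed above is sketched below. The names REQUIRED_KEYS and missing_service_account_keys are hypothetical; the key list is copied from the structure shown above, and the check itself is not part of the framework.

```python
import json

# Keys copied from the credentials.json structure shown above.
REQUIRED_KEYS = (
    "type",
    "project_id",
    "private_key_id",
    "private_key",
    "client_email",
    "client_id",
    "auth_uri",
    "token_uri",
    "auth_provider_x509_cert_url",
    "client_x509_cert_url",
)


def missing_service_account_keys(text):
    """Return the expected keys that are absent from the given JSON document."""
    data = json.loads(text)
    return [key for key in REQUIRED_KEYS if key not in data]
```

An empty list means every expected key is present; the sketch does not check that the values themselves are valid.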
These files are needed only if the user needs to execute benchmarks on OpenWhisk for the originally defined anger detection workflows. Since each file is specific to a single function, several copies of this information are needed. The strings needed to fill these files can be found from Azure Console → Resources (in the side menu) → Choose the specific Cognitive Service resource → Keys and endpoints.
In serverless_functions/openwhisk/java/face_recognition/anger_detection/src/main/java/anger_detection and serverless_functions/openwhisk/java/face_recognition/image_recognition/src/main/java/image_recognition, a file named AzureConfig.java is required, with the following structure:
public class AzureConfig {
    protected static String endpoint = "xxxxxxxxxx";
    protected static String key = "xxxxxxxxxx";
}
In serverless_functions/openwhisk/node/face_recognition/anger_detection and serverless_functions/openwhisk/node/face_recognition/image_recognition, a file named azureconfig.js is required, with the following structure:
module.exports = {
    endpoint: "xxxxxxxxxx",
    key: "xxxxxxxxxx"
};
In serverless_functions/openwhisk/python/face_recognition/anger_detection and serverless_functions/openwhisk/python/face_recognition/image_recognition, a file named azureconfig.py is required, with the following structure:
endpoint = "xxxxxxxxxx"
key = "xxxxxxxxxx"
A file named config.properties is required in the project root, with the following structure (filled with valid, current information):
docker_compose_dir=absolute_path_to:docker_env
mysql_ip=localhost ['localhost' to use Docker compose MySQL instance]
mysql_port=3306
mysql_user=xxxxxxx
mysql_password=xxxxxxx
mysql_dbname=xxxxxxx
influx_ip=localhost ['localhost' to use Docker compose InfluxDB instance]
influx_port=8086
influx_user=xxxxxxx
influx_password=xxxxxxx
influx_dbname=xxxxxxx
google_cloud_auth_json_path=absolute_path_to:credentials.json
google_cloud_cli_container_name=gcloud-cli
google_cloud_stage_bucket=name_of_stage_bucket_in_Google_Cloud_Platform
aws_auth_folder_path=absolute_path_to:credentials
aws_lambda_execution_role=arn:xxxxxxx
aws_step_functions_execution_role=arn:xxxxxxx
openwhisk_host=xxx.xxx.xxx.xxx
openwhisk_auth=xxxxxxx
openwhisk_ignore_ssl=True [or False if OpenWhisk is deployed on a SSL certified endpoint]
google_handler_function_path=absolute_path_to:serverless_functions/gcloud/orchestration_handler
aws_handler_function_path=absolute_path_to:serverless_functions/aws/orchestration_handler
Please note: in order to successfully execute the provided functions on AWS, the Lambda role needs access to Comprehend, Translate, Rekognition, S3 and Step Functions; the Step Functions role needs access to Lambda only.
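The following minimal sketch reads a config.properties text and reports which of the keys listed above are missing or empty. The names REQUIRED, load_properties and missing_keys are hypothetical, and treating every key as mandatory is an assumption: when only one or two platforms are used, some entries may legitimately be left out.

```python
# Keys copied from the config.properties structure shown above.
REQUIRED = (
    "docker_compose_dir",
    "mysql_ip", "mysql_port", "mysql_user", "mysql_password", "mysql_dbname",
    "influx_ip", "influx_port", "influx_user", "influx_password", "influx_dbname",
    "google_cloud_auth_json_path", "google_cloud_cli_container_name",
    "google_cloud_stage_bucket",
    "aws_auth_folder_path", "aws_lambda_execution_role",
    "aws_step_functions_execution_role",
    "openwhisk_host", "openwhisk_auth", "openwhisk_ignore_ssl",
    "google_handler_function_path", "aws_handler_function_path",
)


def load_properties(text):
    """Parse simple key=value lines, skipping blank lines and # or ! comments."""
    props = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, separator, value = line.partition("=")
        if separator:
            props[key.strip()] = value.strip()
    return props


def missing_keys(props):
    """Return the listed keys that are absent or empty."""
    return [key for key in REQUIRED if not props.get(key)]
```

The parser covers only the plain key=value form used above, not the full Java properties syntax (line continuations, escapes, colon separators).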
This section's purpose is to explain how to create packages ready for deployment to the different service providers.
The .jar file to deploy can be easily created using the project management tool Maven.
Here is an example of the pom.xml file.
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>GROUP_ID</groupId>
<artifactId>PROJECT_NAME</artifactId>
<version>VERSION</version>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
</properties>
<dependencies>
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-lambda-java-core</artifactId>
<version>x.x.x</version>
</dependency>
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-lambda-java-events</artifactId>
<version>x.x.x</version>
</dependency>
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-lambda-java-log4j2</artifactId>
<version>x.x.x</version>
</dependency>
<dependency>
<groupId>javax.json</groupId>
<artifactId>javax.json-api</artifactId>
<version>x.x.x</version>
</dependency>
<dependency>
<groupId>javax.json.bind</groupId>
<artifactId>javax.json.bind-api</artifactId>
<version>x.x.x</version>
</dependency>
<dependency>
<groupId>org.glassfish</groupId>
<artifactId>javax.json</artifactId>
<version>x.x.x</version>
</dependency>
<dependency>
<groupId>com.google.code.gson</groupId>
<artifactId>gson</artifactId>
<version>x.x.x</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>x.x.x</version>
<configuration>
<createDependencyReducedPom>false</createDependencyReducedPom>
</configuration>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>x.x.x</version>
<configuration>
<source>1.8</source>
<target>1.8</target>
</configuration>
</plugin>
</plugins>
</build>
</project>
In order to create a Node.js zipped package:
- define the package.json file with every needed dependency (an example can be found at the end of this subsection),
- install every needed dependency using npm inside a folder named node_modules placed in the Node.js project root,
- put package.json file, node_modules folder and .js code files inside a .zip archive ready to be deployed.
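The last step above can be sketched in Python with the standard zipfile module. This is only an illustration under the assumption that the dependencies have already been installed in node_modules; the function name build_node_archive is hypothetical and this is not the content of the generate_archives.sh script.

```python
import zipfile
from pathlib import Path


def build_node_archive(project_root, archive_path):
    """Zip package.json, the node_modules folder and every .js file of a project."""
    root = Path(project_root)
    archive = Path(archive_path).resolve()
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as bundle:
        for path in sorted(root.rglob("*")):
            # Skip directories and the archive itself if it lives inside the project.
            if not path.is_file() or path.resolve() == archive:
                continue
            relative = path.relative_to(root)
            inside_modules = relative.parts[0] == "node_modules"
            if inside_modules or relative.name == "package.json" or relative.suffix == ".js":
                # Store paths relative to the project root so files sit at the archive root.
                bundle.write(path, relative.as_posix())
    return archive
```

Entries are stored relative to the project root, so package.json and the .js files end up at the top level of the archive rather than inside a wrapping folder.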
Here is an example of the package.json file.
{
  "name": "PROJECT_NAME",
  "version": "VERSION",
  "description": "PROJECT_DESCRIPTION",
  "main": "index.js",
  "author": "PROJECT_AUTHOR",
  "license": "ISC",
  "dependencies": {
    "dependency_name": "^x.x.x"
  }
}
Please note: package creation for AWS Node.js example functions can be automatically performed running the generate_archives.sh script.
In order to create a Python zipped package:
- install every needed dependency using pip inside the Python project root,
- put every dependency installed and the .py files inside a .zip archive ready to be deployed.
Please note:
- In the common case where the function only needs to communicate with AWS services, a .zip archive containing just the .py files is sufficient.
- Package creation for AWS Python example functions can be automatically performed running the generate_archives.sh script.
For Google Cloud Platform no archive creation is needed.
The project to deploy can be easily created using Maven; to perform the deployment, it is enough to pass the project root path to the deployment utility.
Here is an example of the pom.xml file needed for Google Cloud Functions deployment.
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>GROUP_ID</groupId>
<artifactId>PROJECT_NAME</artifactId>
<version>VERSION</version>
<properties>
<maven.compiler.target>1.8</maven.compiler.target>
<maven.compiler.source>1.8</maven.compiler.source>
</properties>
<dependencies>
<dependency>
<groupId>com.google.cloud.functions</groupId>
<artifactId>functions-framework-api</artifactId>
<version>x.x.x</version>
</dependency>
<dependency>
<groupId>javax.json</groupId>
<artifactId>javax.json-api</artifactId>
<version>x.x.x</version>
</dependency>
<dependency>
<groupId>javax.json.bind</groupId>
<artifactId>javax.json.bind-api</artifactId>
<version>x.x.x</version>
</dependency>
<dependency>
<groupId>org.glassfish</groupId>
<artifactId>javax.json</artifactId>
<version>x.x.x</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>x.x.x</version>
<configuration>
<excludes>
<exclude>.google/</exclude>
</excludes>
</configuration>
</plugin>
</plugins>
</build>
</project>
In order to create a Node.js package to deploy:
- define the package.json file with every needed dependency (an example can be found in the Amazon Web Services Node.js section),
- put package.json file and .js code files inside the project root to deploy and pass its absolute path to the deployment utility.
In order to create a Python package to deploy:
- put every needed .py file in the package root,
- create a requirements.txt file in the package root with every needed dependency.
The deployment process is similar to the ones for Node.js and Java in Google Cloud Platform.
Here is an example of the requirements.txt file needed for Google Cloud Functions deployment.
dependency-name==x.x.x
dependency-name==x.x.x
dependency-name==x.x.x
...
The .jar file to deploy can be created, again, using Maven.
Here is an example of the pom.xml file.
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>GROUP_ID</groupId>
<artifactId>PROJECT_NAME</artifactId>
<version>VERSION</version>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
</properties>
<dependencies>
<dependency>
<groupId>com.google.code.gson</groupId>
<artifactId>gson</artifactId>
<version>x.x.x</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>x.x.x</version>
<configuration>
<createDependencyReducedPom>false</createDependencyReducedPom>
</configuration>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>x.x.x</version>
<configuration>
<source>1.8</source>
<target>1.8</target>
</configuration>
</plugin>
</plugins>
</build>
</project>
In order to create a Node.js zipped package:
- define the package.json file with every needed dependency (an example can be found in the Amazon Web Services Node.js subsection),
- install every needed dependency using npm inside a folder named node_modules placed in the Node.js project root,
- put package.json file, node_modules folder and .js code files inside a .zip archive ready to be deployed.
Please note: package creation for OpenWhisk Node.js example functions can be automatically performed running the generate_archives.sh script.
In order to create a Python zipped package:
- create the entry point file in the Python project root and name it __main__.py,
- create a virtual environment,
- install every needed dependency using pip inside the Python project root,
- put the virtualenv directory and the .py files inside a .zip archive ready to be deployed.
In order to create a virtual environment execute the following command starting from the Python project root:
$ virtualenv virtualenv
In order to install dependencies execute the following commands starting from the Python project root:
$ source virtualenv/bin/activate
(virtualenv) $ pip install dependency-name
(virtualenv) $ pip install dependency-name
...
Please note: package creation for OpenWhisk Python example functions can be automatically performed running the generate_archives.sh script.
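Once an archive has been built, its layout can be checked against the steps above. The following is a minimal sketch with a hypothetical function name, check_openwhisk_python_archive; it only verifies that __main__.py and a virtualenv directory sit at the archive root, and does not prove that the archive will deploy.

```python
import zipfile


def check_openwhisk_python_archive(archive_path):
    """Return a list of layout problems found in the archive (empty if none)."""
    with zipfile.ZipFile(archive_path) as bundle:
        names = bundle.namelist()
    problems = []
    # The entry point must be named __main__.py and sit at the archive root.
    if "__main__.py" not in names:
        problems.append("__main__.py is missing from the archive root")
    # The virtual environment directory is expected at the archive root as well.
    if not any(name.startswith("virtualenv/") for name in names):
        problems.append("virtualenv directory is missing from the archive root")
    return problems
```

A common cause of a failed check is zipping the project folder itself rather than its content, which nests every entry one level too deep.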