Let’s start our first Marathon service and then present some common pitfalls in the troubleshooting section.
Basic setup
Marathon can be downloaded directly from its website, https://mesosphere.github.io/marathon/.
Let’s deploy Marathon in ~/apps/marathon-0.6.0.
Starting the services
Start the Marathon server:
Check that the services are OK:
- The Marathon console is available at http://localhost:8080/. There should be no active Marathon service yet.
Repository
To deploy the same application on multiple nodes, an easy approach is a shared repository. Each version of the app and its dependencies must be available in the deployment repository.
Here is a simple structure for a Simple Web Service called SWS:
An easy way to deploy this repository:
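As one possible way to expose such a repository over HTTP, here is a minimal Python sketch built on the standard library's http.server module. Port 8000 matches the repository URL used later in this post; the function name and the directory argument are illustrative assumptions, not necessarily the exact command used here.

```python
import functools
import http.server


def make_repository_server(directory, port=8000):
    """Build (but do not start) an HTTP server exposing `directory`."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # Bind on all interfaces so that other nodes can fetch the artifacts.
    return http.server.HTTPServer(("", port), handler)
```

To serve the repository, call make_repository_server(".") from the repository root and then serve_forever() on the returned server; stop it with Ctrl-C.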
Check the repository is available:
Test the application locally:
Open a browser at http://127.0.0.1:8091/; you should see a hello message.
Deploy with Marathon
We are ready to start playing with Marathon. Let’s deploy our first app server with Marathon:
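For readers who prefer scripting the deployment, here is a minimal Python sketch of such a request. It assumes Marathon's REST API accepts a POST on /v2/apps at the console address used in this post; the helper names, the command line, and the cpus and mem values are illustrative assumptions rather than the exact values used here.

```python
import json
import urllib.request

MARATHON_URL = "http://127.0.0.1:8080"  # Marathon address used in this post


def build_app_definition(app_id, cmd, uris, instances=1, cpus=0.1, mem=128):
    """Return an app definition as a plain dict (cpus/mem defaults are assumed)."""
    return {
        "id": app_id,
        "cmd": cmd,
        "uris": list(uris),  # always a JSON array, even for a single artifact
        "instances": instances,
        "cpus": cpus,
        "mem": mem,
    }


def build_create_request(app, base_url=MARATHON_URL):
    """Prepare (but do not send) the POST request that creates the app."""
    return urllib.request.Request(
        url=base_url + "/v2/apps",
        data=json.dumps(app).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values: the command line is an assumption for illustration.
app = build_app_definition(
    app_id="sws1",
    cmd="java -jar spray-test-assembly-0.1.jar",
    uris=["http://127.0.0.1:8000/SWS/v1-first_revision/spray-test-assembly-0.1.jar"],
)
request = build_create_request(app)
# To actually submit it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to print and inspect the JSON payload before anything reaches Marathon.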
We can access the service directly from the Marathon console.
If we need to deploy a second instance:
- Go to the Marathon console at http://127.0.0.1:8080/; the console is displayed.
- Click the Scale button; the scale window is displayed.
- Select 2 instances and press OK; the Marathon console is updated with 2 instances.
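The same scaling can probably be scripted. The sketch below assumes that Marathon's REST API accepts a PUT on /v2/apps/<id> with a new instance count; the function name is hypothetical, and the request is only prepared, not sent.

```python
import json
import urllib.request


def build_scale_request(app_id, instances, base_url="http://127.0.0.1:8080"):
    """Prepare (but do not send) a PUT request changing the instance count."""
    if instances < 0:
        raise ValueError("instances must not be negative")
    return urllib.request.Request(
        url=f"{base_url}/v2/apps/{app_id}",
        data=json.dumps({"instances": instances}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Scale the app deployed earlier to 2 instances.
scale_request = build_scale_request("sws1", 2)
# To actually submit it: urllib.request.urlopen(scale_request)
```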
Troubleshooting
Task identifier does not support uppercase
Let’s use an UPPERCASE identifier (SWS1), starting the task with:
OK, we should have used sws1 as the task identifier.
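A small client-side check can catch this before the request is sent. The pattern below is a simplified assumption (lowercase letters, digits, dots and hyphens), not Marathon's exact validation rule; the only fact established here is that uppercase identifiers are rejected.

```python
import re

# Assumed, simplified rule: lowercase letters, digits, dots and hyphens,
# starting and ending with a letter or a digit.
_APP_ID_PATTERN = re.compile(r"[a-z0-9]([a-z0-9.-]*[a-z0-9])?")


def is_valid_app_id(app_id):
    """Return True if `app_id` satisfies the assumed identifier rule."""
    return _APP_ID_PATTERN.fullmatch(app_id) is not None


def normalize_app_id(app_id):
    """Lowercase `app_id` and fail loudly if it is still not acceptable."""
    candidate = app_id.lower()
    if not is_valid_app_id(candidate):
        raise ValueError(f"cannot normalize app id: {app_id!r}")
    return candidate
```

With this helper, normalize_app_id("SWS1") gives back the lowercase identifier, while an identifier containing spaces or other unexpected characters raises an error instead of reaching Marathon.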
Repository is down when starting a task
(We have killed the repository, that is, the Python server process in this case.)
Start the task with:
“null%”, so no error. Everything looks good, right? Hmm, no. Why is the task not started in the Marathon console?
If we have a look at the Marathon console logs, there are no details:
If we look at the Mesos console, we can see that the framework is staging and repeatedly failing.
But if we have a look at the failed task's error log, it’s clearer:
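Since Marathon reports no error when the repository is down, a pre-flight check on the artifact URIs can save some digging. Here is a minimal sketch using only the Python standard library; the function names are hypothetical, and it assumes the repository answers HEAD requests (Python's simple HTTP server does).

```python
import urllib.error
import urllib.request


def uri_is_reachable(uri, timeout=5.0):
    """Return True if a HEAD request on `uri` succeeds with a 2xx status."""
    try:
        request = urllib.request.Request(uri, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status or malformed URL.
        return False


def unreachable_uris(uris, timeout=5.0):
    """Return the subset of `uris` that could not be fetched."""
    return [uri for uri in uris if not uri_is_reachable(uri, timeout)]
```

Note that this only checks reachability from the machine running the script; each slave node still needs its own access to the repository.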
So bring the repository back up, and we finally get a running task.
Not providing the uris correctly
Marathon expects the parameter "uris": [ "http://127.0.0.1:8000/SWS/v1-first_revision/spray-test-assembly-0.1.jar" ]. What happens if we try the following command:
The Marathon console shows the task switching between the STAGING and FAILED states.
When we check the framework error log:
When we check the Framework STDOUT:
No file is downloaded or installed. The task starts fine, but the java process terminates with status = 1 because the jar file was never downloaded.
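Because a malformed artifact list only shows up as a failing task, it can help to validate the app definition before submitting it. The following sketch checks that the definition carries a "uris" key holding a list of non-empty strings; the function name and the messages are assumptions for illustration.

```python
def check_uris_field(app):
    """Return a list of problems found in the `uris` field of an app definition."""
    problems = []
    if "uris" not in app:
        problems.append('missing "uris" key')
        return problems
    uris = app["uris"]
    if not isinstance(uris, list):
        problems.append('"uris" must be a JSON array, got ' + type(uris).__name__)
        return problems
    for index, uri in enumerate(uris):
        if not isinstance(uri, str) or not uri.strip():
            problems.append(f"uris[{index}] is not a non-empty string")
    return problems
```

An empty result means the field has the expected shape; anything else is worth fixing before the task starts flapping between states.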
An invalid command
Let’s use the (in)famous blah command:
In the Mesos console we get:
There is another variant when the task executes on another node: Marathon tries to start the task as the same user that started the Marathon process. If that user does not exist on the slave node, we get something like:
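To rule out this variant, the account can be checked on each slave node. Here is a minimal sketch relying on Python's pwd module (Unix only); the function name is hypothetical, and the script has to be run on the node being checked.

```python
import pwd


def user_exists(username):
    """Return True if `username` is a known account on this machine."""
    try:
        pwd.getpwnam(username)
    except KeyError:
        # getpwnam raises KeyError when no such entry exists.
        return False
    return True
```

Running it with the name of the user that started Marathon tells us whether tasks can be launched under that account on the node.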
Important:
- Be sure you know how to access the Mesos logs: without them, you’re lost.
- Make sure your repository is accessible from all the nodes.
- Always use the "uris": [ "url1" ] syntax.
- Always ensure the *nix accounts are created on all the slave nodes.