Reference

The Engine ML command-line interface (CLI) lets you interact with runs, notebooks, and reservations. The sections below describe every command and its options.

add-ssh-key PATH_TO_KEY

Add SSH authorization to push to and pull from the Engine ML git server.

If you do not have an SSH public/private key pair, generate one first.

Keys are associated with your user and must be unique. Do not share your private keys.
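If you need to create a key pair first, a minimal sketch using OpenSSH's ssh-keygen could look like the following. The helper name ensure_ssh_key, the 4096-bit RSA key type, and the empty default passphrase are assumptions made for illustration, not Engine ML requirements.

```shell
# Hypothetical helper: create an RSA key pair at KEY_PATH unless one already exists.
# Usage: ensure_ssh_key KEY_PATH [PASSPHRASE]
ensure_ssh_key() {
  key_path="$1"
  passphrase="${2-}"   # assumption: empty passphrase unless one is given
  if [ -f "$key_path" ]; then
    echo "Key already exists: $key_path"
    return 0
  fi
  mkdir -p "$(dirname "$key_path")"
  # -q: quiet, -t/-b: key type and size, -N: passphrase, -f: output file
  ssh-keygen -q -t rsa -b 4096 -N "$passphrase" -f "$key_path"
}
```

For example, ensure_ssh_key ~/.ssh/id_rsa writes the private key to ~/.ssh/id_rsa and the public key to ~/.ssh/id_rsa.pub. The public key is the file you pass to engine add-ssh-key.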

$ engine add-ssh-key ~/.ssh/id_rsa.pub
Successfully added key: ~/.ssh/id_rsa.pub
The following entry should be prepended to your ~/.ssh/config:
Host *.app.engineml.com
IdentityFile ~/.ssh/id_rsa.pub
User git

autocomplete

To autocomplete engine commands, install the autocomplete script. You must source the generated file to enable autocomplete.

info

Autocomplete is only supported by bash and zsh.

$ engine autocomplete > ~/.engine/autocomplete
$ source ~/.engine/autocomplete

To enable autocomplete every time you launch your shell, add source ~/.engine/autocomplete to your .bashrc or .zshrc.

$ echo "source ~/.engine/autocomplete" >> ~/.bashrc
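Running the echo command above more than once appends duplicate entries. A small sketch that adds the line only when it is missing follows; the helper name and its optional rc-file argument are illustrative and not part of the engine CLI.

```shell
# Hypothetical helper: append the autocomplete source line to an rc file only once.
# Usage: enable_engine_autocomplete [RC_FILE]   (defaults to ~/.bashrc)
enable_engine_autocomplete() {
  rc_file="${1:-$HOME/.bashrc}"
  line='source ~/.engine/autocomplete'
  # grep -F: fixed string, -x: match the whole line, -q: no output
  if ! grep -qxF "$line" "$rc_file" 2>/dev/null; then
    echo "$line" >> "$rc_file"
  fi
}
```

zsh users would call enable_engine_autocomplete ~/.zshrc instead.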

clean-cache RUN_ID

Remove the data associated with a local run. engine clean-cache all removes data for all local runs.

Data is stored at ~/.engine/local/RUN_ID.
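To see how much disk space each local run occupies before cleaning, you can inspect the cache directory directly. This sketch relies only on the ~/.engine/local/RUN_ID layout described above; the helper name and its optional directory argument are illustrative.

```shell
# Hypothetical helper: print the disk usage of every locally cached run.
# Usage: cached_run_sizes [CACHE_DIR]   (defaults to ~/.engine/local)
cached_run_sizes() {
  cache_dir="${1:-$HOME/.engine/local}"
  [ -d "$cache_dir" ] || { echo "No local cache at $cache_dir"; return 0; }
  for run_dir in "$cache_dir"/*/; do
    [ -d "$run_dir" ] || continue          # the glob matched nothing
    size="$(du -sh "$run_dir" | cut -f1)"  # du prints "SIZE<TAB>PATH"
    echo "$size $(basename "$run_dir")"
  done
}
```

Each output line is a size followed by a RUN_ID, which you can then pass to engine clean-cache.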

config REPOSITORY

engine config OWNER/PROJECT prints a stub run configuration for the current repository. By default, engine config assumes this configuration is for local runs.

-t/--type

Either local or remote. local generates a local run configuration. remote generates a remote run configuration.

dashboard

Opens the web interface for Engine ML.

docs

Opens the web interface for Engine ML documentation.

get-files RUN_ID PATTERN

You can download files with the get-files command.

$ engine get-files adaptive-strut '*'
✔ Found 16 files (Uncompressed size: 2.9G) matching "*" for run adaptive-strut
Downloading compressed archive: 18%|█▊ | 521M/2.85G [02:51<11:28, 3.66MB/s]

Using list-files with the same PATTERN will show a preview of the files that will be downloaded.

See File Patterns for more details on the PATTERN argument.
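Quote PATTERN so that your shell passes it to engine unchanged instead of expanding it against local files. A small sketch that previews the matching files and then downloads them with the same pattern follows; the helper name is illustrative.

```shell
# Hypothetical helper: preview matching files, then download them.
# Usage: preview_and_get RUN_ID PATTERN
preview_and_get() {
  run_id="$1"
  pattern="$2"
  # Double quotes keep the pattern intact when it is passed on to engine.
  engine list-files "$run_id" "$pattern" || return 1
  engine get-files "$run_id" "$pattern"
}
```

For example: preview_and_get adaptive-strut '*'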

info

Prints the current user, CLI version, documentation URL, and website address.

list

You can view all of your currently running runs, notebooks (coming soon), and reservations (coming soon) with engine list. Runs are sorted with the most recent at the top.

$ engine list
┌──────────────────────┬──────────────────────────────────┬───────────────────────┬────────────────┬────────────────┐
│ Run │ Run Details │ Configuration │ Scalars │ Commit Message │
├──────────────────────┼──────────────────────────────────┼───────────────────────┼────────────────┼────────────────┤
│ ID: outer-converter │ Status: Running │ GPUs: 2x K80 │ step: 100 │ Engine ML Test │
│ Commit: af1464 │ Created: 10 minutes ago │ dataset: kitti/object │ acc: 0.511678 │ 2 GPUs │
│ Branch: demo │ Training: 3 minutes, 28 seconds │ model: avod_model │ loss: 0.256536 │ │
└──────────────────────┴──────────────────────────────────┴───────────────────────┴────────────────┴────────────────┘

Adding a run, notebook, or reservation ID after list (e.g. engine list sparking-converter) shows details for that ID.

-a / --active-runs

Only show active runs. Runs that are canceled, failed, or finished will not be displayed.

-n INTEGER / --num INTEGER

Shows INTEGER number of runs in the list.

$ engine list -n 1
┌──────────────────────┬──────────────────────────────────┬───────────────────────┬────────────────┬────────────────┐
│ Run │ Run Details │ Configuration │ Scalars │ Commit Message │
├──────────────────────┼──────────────────────────────────┼───────────────────────┼────────────────┼────────────────┤
│ ID: outer-converter │ Status: Canceled │ GPUs: 2x K80 │ step: 160 │ Engine ML Test │
│ Commit: af1464 │ Created: 24 minutes ago │ dataset: kitti/object │ acc: 0.611678 │ 2 GPUs │
│ Branch: demo │ Training: 15 minutes, 28 seconds │ model: avod_model │ loss: 0.156536 │ │
│ │ Finished: 8 minutes ago │ │ │ │
└──────────────────────┴──────────────────────────────────┴───────────────────────┴────────────────┴────────────────┘
$ engine list -n 2
┌──────────────────────┬──────────────────────────────────┬───────────────────────┬────────────────┬────────────────┐
│ Run │ Run Details │ Configuration │ Scalars │ Commit Message │
├──────────────────────┼──────────────────────────────────┼───────────────────────┼────────────────┼────────────────┤
│ ID: outer-converter │ Status: Canceled │ GPUs: 2x K80 │ step: 160 │ Engine ML Test │
│ Commit: af1464 │ Created: 24 minutes ago │ dataset: kitti/object │ acc: 0.611678 │ 2 GPUs │
│ Branch: demo │ Training: 15 minutes, 28 seconds │ model: avod_model │ loss: 0.156536 │ │
│ │ Finished: 8 minutes ago │ │ │ │
├──────────────────────┼──────────────────────────────────┼───────────────────────┼────────────────┼────────────────┤
│ ID: adaptive-pushrod │ Status: Canceled │ GPUs: 8x K80 │ step: 160 │ Engine ML Test │
│ Commit: d4ea15 │ Created: 1 day ago │ dataset: kitti/object │ acc: 0.911592 │ 8 GPUs │
│ Branch: demo │ Training: 36 minutes, 24 seconds │ model: avod_model │ loss: 0.062201 │ │
│ │ Finished: 1 day ago │ │ │ │
└──────────────────────┴──────────────────────────────────┴───────────────────────┴────────────────┴────────────────┘

--no-reservations

Do not display reservations. By default, reservations are shown.

--no-runs

Do not display runs. By default, runs are shown.

list-files RUN_ID PATTERN

Lists file names and sizes for any files saved to eml.data.output_dir() during a run.

$ engine list-files adaptive-strut '**'
DIR avod_cars_example/
4.5K avod_cars_example/avod_cars_example.config
DIR avod_cars_example/checkpoints/
DIR avod_cars_example/logs/
DIR avod_cars_example/logs/train/
DIR avod_cars_example/logs/train/2019-04-02 21:59:36.211486/
6.9M avod_cars_example/logs/train/2019-04-02 21:59:36.211486/events.out.tfevents.1554242377.ip-192-168-103-46.ec2.internal
418.0B checkpoint
581.0M checkpoint-0-00000000.data-00000-of-00001
14.3K checkpoint-0-00000000.index
3.6M checkpoint-0-00000000.meta
581.0M checkpoint-1000-00001000.data-00000-of-00001
14.3K checkpoint-1000-00001000.index
3.6M checkpoint-1000-00001000.meta
581.0M checkpoint-2000-00002000.data-00000-of-00001
14.3K checkpoint-2000-00002000.index
3.6M checkpoint-2000-00002000.meta
581.0M checkpoint-3000-00003000.data-00000-of-00001
14.3K checkpoint-3000-00003000.index
3.6M checkpoint-3000-00003000.meta
581.0M checkpoint-4000-00004000.data-00000-of-00001
14.3K checkpoint-4000-00004000.index
3.6M checkpoint-4000-00004000.meta

Using get-files with the same PATTERN will download a zip file containing all files in the output of this command.

See File Patterns for more details on the PATTERN argument.

--no-size

Do not show file sizes. This can sometimes improve performance for very large directories.

login TOKEN

Log in to Engine ML. You can obtain your TOKEN (API key) from your settings page.

logout

Log out of Engine ML.

run RUN_CONFIG_PATH

engine run RUN_CONFIG_PATH is the primary way to launch runs on Engine ML. Runs can be launched locally or remotely (coming soon). The first argument to engine run is the file path to a run configuration file. You can generate a stub run config with engine config. More examples for local and remote run configurations can be seen here (local) and here (remote).

If you do not specify RUN_CONFIG_PATH, engine run defaults to a file called engine.yaml at the root of your repository.

$ engine run local.yaml
[ENGINE ML] Updating engine remote
[ENGINE ML] Your Run ID is: floor-hood
[ENGINE ML] Website: https://app.engineml.com/jobs/floor-hood
Train Epoch: 1/1 Loss: 2.309322
Train Epoch: 1/1 Loss: 2.304794
Train Epoch: 1/1 Loss: 2.288168
Train Epoch: 1/1 Loss: 2.286560
Train Epoch: 1/1 Loss: 2.313777
Train Epoch: 1/1 Loss: 2.248435
Train Epoch: 1/1 Loss: 2.203526
Train Epoch: 1/1 Loss: 2.172441
Test set: Average loss: 2.0197, Accuracy: 42.00%
[ENGINE ML] Your run is finished
[ENGINE ML] Syncing remaining logs to Engine ML...
[ENGINE ML] sending incremental file list
[ENGINE ML] Syncing remaining metrics to Engine ML...
[ENGINE ML] sending incremental file list
[ENGINE ML] Syncing remaining metadata to Engine ML...
[ENGINE ML] sending incremental file list
[ENGINE ML] ./
[ENGINE ML] eml_model/
[ENGINE ML] eml_model/config.json
[ENGINE ML]
[ENGINE ML] 164 100% 0.00kB/s 0:00:00
[ENGINE ML] 164 100% 0.00kB/s 0:00:00 (xfr#1, to-chk=0/3)
[ENGINE ML] Syncing remaining outputs to Engine ML...
[ENGINE ML] sending incremental file list

info

During an engine run, you may see lines prefixed with [ENGINE ML]. These messages are for informational purposes only and will not be captured.

Reproducibility and Analytics

Every engine run command tracks code, logs, system metrics, and files.

Engine ML uses git to track code changes. To reduce friction when launching a run, any unstaged changes are automatically committed. If you do not specify a commit message with the -m flag, one is generated automatically.

Example Generated Commit Message
commit 57ce99bf71c11c6d8569596d1d7ae9d8fca9c22a
Date: Wed Feb 10 12:00:00 2020 -0800
Modified files: pytorch/mnist/pytorch_mnist.py, tf/mnist/engine_build.sh

tip

Engine ML also supports standard git add and git commit workflows. See the Getting Started dialog on your project's details page for the git remote.

After the commit, your branch is automatically pushed to a git remote where it is tagged with the Run ID. Common git commands continue to work with these Run IDs.

$ git diff metering-gasoline floor-hood
diff --git a/tf/mnist/engine.yaml b/tf/mnist/engine.yaml
index c39f2e6..a647e6d 100644
--- a/tf/mnist/engine.yaml
+++ b/tf/mnist/engine.yaml
@@ -1,7 +1,7 @@
apiVersion: "3.0.0"
build: engine_build.sh
-command: python tf/mnist/tensorflow_mnist.py --data-dir /data/mnist
+command: python pytorch/mnist/pytorch_mnist.py --data-dir /data/mnist
dataBucket: datasets.us-east-1.engineml.com
dataBucketSubdirectory: /mnist
environment:
MY_SECRET_ENV_VAR: foo
WIDGET_TYPE: gizmo
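Because each Run ID is a git tag, it can be used wherever git expects a revision. The sketch below lists only the files that differ between two runs. It assumes the tags are present in your local clone (otherwise fetch them with git fetch --tags), and the helper name is illustrative.

```shell
# Hypothetical helper: list the files that changed between two runs.
# Usage: run_changed_files RUN_ID_A RUN_ID_B
run_changed_files() {
  # Run IDs are git tags, so they can be passed directly as revisions.
  git diff --name-only "$1" "$2"
}
```

For example: run_changed_files metering-gasoline floor-hood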

-o / --override KEY VALUE

Modify settings in your run configuration.

$ engine run -o command 'echo hello' local.yaml
[ENGINE ML] Updating engine remote
[ENGINE ML] Modified files: tf/mnist/engine.yaml
[ENGINE ML] Your Run ID is: speedy-stick
[ENGINE ML] Website: https://app.engineml.com/jobs/speedy-stick
hello
[ENGINE ML] Your run is finished
[ENGINE ML] Syncing remaining logs to Engine ML...
[ENGINE ML] sending incremental file list
[ENGINE ML] Syncing remaining metrics to Engine ML...
[ENGINE ML] sending incremental file list
[ENGINE ML] Syncing remaining metadata to Engine ML...
[ENGINE ML] sending incremental file list
[ENGINE ML] Syncing remaining outputs to Engine ML...
[ENGINE ML] sending incremental file list

To override nested settings like environment, use -o environment ENV=VAL.

-m / --commit-message STRING

Use a custom commit message instead of an autogenerated one.
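The -o and -m options can be useful together when launching several variations of one configuration. The following is a sketch under stated assumptions: it assumes both options can be combined in a single invocation, train.py and its --lr flag are hypothetical stand-ins for your own training command, and the helper name is illustrative.

```shell
# Hypothetical helper: launch one run per learning rate.
# Usage: launch_lr_sweep RUN_CONFIG_PATH LR [LR ...]
launch_lr_sweep() {
  config_path="$1"
  shift
  for lr in "$@"; do
    # train.py and --lr are placeholders for your own command.
    engine run \
      -o command "python train.py --lr $lr" \
      -m "Sweep: lr=$lr" \
      "$config_path" || return 1
  done
}
```

For example: launch_lr_sweep local.yaml 0.1 0.01. The loop stops at the first run that exits with an error.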

stop RUN/NOTEBOOK/RESERVATION_ID

Stops remote runs (coming soon), notebooks (coming soon), and reservations (coming soon).

Local runs will be placed in the canceled state, but will continue running. This can be used to force a local run to enter the canceled state.

sync-cache RUN_ID

Sync a run's logs, configuration metadata, system metrics, and files saved on the local machine to Engine ML. engine sync-cache all syncs data for all local RUN_IDs.

tip

If you lose network connection when running locally, executing sync-cache after the run has completed will update the run's data stored on Engine ML.

Data is stored at ~/.engine/local/RUN_ID.

--clean

After syncing, delete the local copies of the files.

tag RUN_ID TAG

Tags provide an alias for a group of experiments. For example, you might want a tag called best-model for your team's model with the lowest validation loss. You may also want to group models that share a common structure, for example with a resnet-50 or inception-v2 tag.

Experiments aren't restricted to a single tag. If your best model uses ResNet-50, you could tag the experiment with both best-model and resnet-50.

$ engine tag adaptive-strut my-tag
Tagging adaptive-strut with tag "my-tag"...

-c / --color

The tag will appear as the specified color in the dashboard.

$ engine tag helpful-magneto red-tag -c red
Tagging helpful-magneto with tag "red-tag"...

For a full list of colors, run engine tag-colors.

-d / --delete

Deletes a tag, if it exists.

$ engine tag -d adaptive-strut my-tag
Deleting tag "my-tag" from run adaptive-strut...
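Since a run can carry several tags, a short loop saves repeating the command. This sketch calls engine tag once per tag, using only the RUN_ID TAG form documented above; the helper name is illustrative.

```shell
# Hypothetical helper: apply several tags to one run.
# Usage: tag_run RUN_ID TAG [TAG ...]
tag_run() {
  run_id="$1"
  shift
  for tag in "$@"; do
    engine tag "$run_id" "$tag" || return 1
  done
}
```

For example: tag_run adaptive-strut best-model resnet-50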

tail RUN_ID

View your run's logs. By default, this command only shows the last 10 lines of the logs from replica 0. stderr will appear red unless the --no-color flag is enabled.

$ engine tail adaptive-strut
First layer weights: (3, 3, 3, 32) [-0.01514911 -0.09147684 -0.08772741 -0.05053544 0.09287488]
Step 4160, Total Loss 1.676, Time Elapsed 15.583 s
First layer weights: (3, 3, 3, 32) [-0.01512622 -0.09169471 -0.08770902 -0.05086816 0.09276406]
Step 4170, Total Loss 1.535, Time Elapsed 17.184 s
First layer weights: (3, 3, 3, 32) [-0.01529853 -0.09192348 -0.08704228 -0.05095764 0.09282026]
Step 4180, Total Loss 1.267, Time Elapsed 13.575 s
First layer weights: (3, 3, 3, 32) [-0.01554769 -0.09195881 -0.08685017 -0.05112816 0.09311653]
Step 4190, Total Loss 0.120, Time Elapsed 14.800 s
First layer weights: (3, 3, 3, 32) [-0.01578166 -0.09185258 -0.08646027 -0.05092932 0.09335227]
Step 4200, Total Loss 1.898, Time Elapsed 14.845 s

-n INTEGER

The number of lines to seek backwards before showing logs. If INTEGER is negative, logs are shown from the beginning.

--replica INTEGER

Stream the logs of a single replica. By default, stream logs from replica 0.

--download-all

Downloads all logs from all replicas to a folder. This option and the -n option are mutually exclusive.

tip

If your run crashes and replica 0 does not have any logs that indicate an error, one of your other replicas might have crashed.

$ engine tail adaptive-strut --download-all
Downloading logs for run adaptive-strut
Logs saved to directory "adaptive-strut-logs-190410_013642"
$ tail adaptive-strut-logs-190410_013642/*.log
==> adaptive-strut-logs-190410_013642/0.log <==
First layer weights: (3, 3, 3, 32) [-0.01514911 -0.09147684 -0.08772741 -0.05053544 0.09287488]
Step 4170, Total Loss 1.535, Time Elapsed 17.184 s
First layer weights: (3, 3, 3, 32) [-0.01529853 -0.09192348 -0.08704228 -0.05095764 0.09282026]
==> adaptive-strut-logs-190410_013642/1.log <==
First layer weights: (3, 3, 3, 32) [-0.01514911 -0.09147684 -0.08772741 -0.05053544 0.09287488]
Step 4170, Total Loss 1.521, Time Elapsed 17.124 s
First layer weights: (3, 3, 3, 32) [-0.01529853 -0.09192348 -0.08704228 -0.05095764 0.09282026]
Step 4180, Total Loss 1.267, Time Elapsed 13.565 s
First layer weights: (3, 3, 3, 32) [-0.01554769 -0.09195881 -0.08685017 -0.05112816 0.09311653]
Traceback (most recent call last):
File "train.py", line 101, in <module>
IndexError: list index out of range
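With many replicas, reading each log by eye is tedious. Assuming the crash is a Python exception, as in the output above, a sketch like the following lists the downloaded log files that contain a traceback. The helper name is illustrative.

```shell
# Hypothetical helper: list downloaded log files that contain a Python traceback.
# Usage: find_crashed_replicas LOG_DIR
find_crashed_replicas() {
  log_dir="$1"
  # -l prints only the names of files with at least one match.
  grep -l 'Traceback (most recent call last)' "$log_dir"/*.log
}
```

For example: find_crashed_replicas adaptive-strut-logs-190410_013642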

-f / --follow

Follow logs from a run as they are produced.

version

Prints the current CLI version.