Reference

The Engine command line interface (CLI) lets you interact with runs, notebooks, and reservations. This page describes every command and its options.

add-ssh-key PATH_TO_KEY

Adds SSH authorization so that you can push to and pull from the Engine git server.

If you do not have an SSH public/private key pair, generate one first (for example, with ssh-keygen).

Keys are associated with your user and must be unique, so do not share your private keys.

$ engine add-ssh-key ~/.ssh/id_rsa.pub
Successfully added key: ~/.ssh/id_rsa.pub
The following entry should be prepended to your ~/.ssh/config:
Host *.app.engineml.com
IdentityFile ~/.ssh/id_rsa.pub
User git
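The entry is prepended rather than appended, presumably because ssh uses the first value it finds for each option. The following is a minimal sketch of prepending text to an existing config file without losing its contents. The function name prepend_ssh_entry is hypothetical and not part of the Engine CLI; the entry itself is read from stdin, so paste exactly what the CLI printed.

```shell
# Hypothetical helper: prepend the text read from stdin to an ssh config file.
# Usage: prepend_ssh_entry CONFIG_FILE < entry.txt
prepend_ssh_entry() {
  config_file="$1"
  tmp_file="$(mktemp)" || return 1
  # New entry first, then a blank line, then whatever was already there.
  cat - > "$tmp_file"
  if [ -s "$config_file" ]; then
    printf '\n' >> "$tmp_file"
    cat "$config_file" >> "$tmp_file"
  fi
  mkdir -p "$(dirname "$config_file")"
  # Overwrite in place so an existing file keeps its permissions.
  cat "$tmp_file" > "$config_file"
  rm -f "$tmp_file"
}
```

For example, save the entry printed by add-ssh-key to a file named entry.txt and run prepend_ssh_entry ~/.ssh/config < entry.txt.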

autocomplete

To autocomplete engine commands in your shell, generate the autocomplete script and then source the generated file.

info

Autocomplete is only supported in bash and zsh.

$ engine autocomplete > ~/.engine/autocomplete
$ source ~/.engine/autocomplete

If you want autocomplete every time you launch your shell, add source ~/.engine/autocomplete to your .bashrc or .zshrc.

$ echo "source ~/.engine/autocomplete" >> ~/.bashrc
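Running the echo command above more than once leaves duplicate lines in the file. The sketch below appends the line only when it is not already present. The function name add_line_once is hypothetical and not part of the Engine CLI.

```shell
# Hypothetical helper: append LINE to RC_FILE only if that exact line is absent.
# Usage: add_line_once RC_FILE LINE
add_line_once() {
  rc_file="$1"
  line="$2"
  # -q: quiet, -x: match the whole line, -F: treat LINE as a fixed string.
  if ! grep -q -xF -e "$line" "$rc_file" 2>/dev/null; then
    printf '%s\n' "$line" >> "$rc_file"
  fi
}
```

For zsh, this could be called as add_line_once ~/.zshrc "source ~/.engine/autocomplete".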

clean-cache RUN_ID

Remove the data associated with a local run.

Data is stored at ~/.engine/local/RUN_ID.
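Because each run has its own directory under ~/.engine/local, you can check what is cached, and how much space it uses, before cleaning. A minimal sketch, assuming only the directory layout described above. The function name list_cached_runs and its optional directory argument are hypothetical.

```shell
# Hypothetical helper: print the size and run ID of each locally cached run.
# The default location comes from the text above; the optional argument only
# exists so the function can be pointed at another directory.
list_cached_runs() {
  cache_dir="${1:-$HOME/.engine/local}"
  [ -d "$cache_dir" ] || return 0
  for run_dir in "$cache_dir"/*/; do
    # Skip the unexpanded glob when the cache is empty.
    [ -d "$run_dir" ] || continue
    size="$(du -sh "$run_dir" | cut -f1)"
    printf '%s\t%s\n' "$size" "$(basename "$run_dir")"
  done
}
```

Each output line holds a size and a run ID separated by a tab, and the run ID can be passed to engine clean-cache.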

-a / --all

Removes data for all runs cached on your local machine.

$ engine clean-cache -a
Removing all local files affiliated with run(s): ethical-voltage, strongest-magneto

config REPOSITORY

engine config OWNER/PROJECT prints a stub run configuration for the current repository.

tip

Make sure you track the generated config with git add after creating it.

-t/--type

The type of run configuration to generate: local or remote. The default is local.

dashboard

Opens the web interface for Engine.

docs

Opens the web interface for Engine documentation.

get-files RUN_ID PATTERN

Downloads the files from run RUN_ID that match PATTERN.

$ engine get-files adaptive-strut '*'
✔ Found 16 files (Uncompressed size: 2.9G) matching "*" for run adaptive-strut
Downloading compressed archive: 18%|█▊ | 521M/2.85G [02:51<11:28, 3.66MB/s]

Using list-files with the same PATTERN will show a preview of the files that will be downloaded.

See File Patterns for more details on the PATTERN argument.
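Since list-files previews what get-files will download, the two can be chained so that a large download starts only after you have seen the file list. A minimal sketch of that workflow follows. The function name preview_then_get and the ENGINE_CMD variable are hypothetical and not part of the Engine CLI.

```shell
# Hypothetical helper: preview matching files, then download after confirmation.
# ENGINE_CMD is an illustrative override (not an Engine setting) for dry runs,
# e.g. ENGINE_CMD=echo prints the commands instead of running them.
preview_then_get() {
  run_id="$1"
  pattern="$2"
  "${ENGINE_CMD:-engine}" list-files "$run_id" "$pattern" || return 1
  # The prompt goes to stderr so it does not mix with command output.
  printf 'Download these files? [y/N] ' >&2
  read -r answer
  case "$answer" in
    y|Y) "${ENGINE_CMD:-engine}" get-files "$run_id" "$pattern" ;;
    *)   echo "Skipped download." ;;
  esac
}
```

For example, preview_then_get adaptive-strut '*' lists the matching files and waits for a y before downloading. Quote the pattern so that your shell does not expand it.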

info

Prints the current user, CLI version, documentation URL, and website address.

list

engine list shows your runs, notebooks (coming soon), and reservations (coming soon). Runs are sorted with the most recent at the top.

$ engine list
┌──────────────────────┬──────────────────────────────────┬───────────────────────┬────────────────┬────────────────┐
│ Run │ Run Details │ Configuration │ Scalars │ Commit Message │
├──────────────────────┼──────────────────────────────────┼───────────────────────┼────────────────┼────────────────┤
│ ID: outer-converter │ Status: Running │ GPUs: 2x K80 │ step: 100 │ Engine ML Test │
│ Commit: af1464 │ Created: 10 minutes ago │ dataset: kitti/object │ acc: 0.511678 │ 2 GPUs │
│ Branch: demo │ Training: 3 minutes, 28 seconds │ model: avod_model │ loss: 0.256536 │ │
└──────────────────────┴──────────────────────────────────┴───────────────────────┴────────────────┴────────────────┘

Adding a run, notebook, or reservation ID after list (e.g. engine list sparking-converter) shows details for that ID.

-a / --active-runs

Only show active runs. Runs that are canceled, failed, or finished will not be displayed.

-n INTEGER / --num INTEGER

Limits the list to INTEGER runs.

$ engine list -n 1
┌──────────────────────┬──────────────────────────────────┬───────────────────────┬────────────────┬────────────────┐
│ Run │ Run Details │ Configuration │ Scalars │ Commit Message │
├──────────────────────┼──────────────────────────────────┼───────────────────────┼────────────────┼────────────────┤
│ ID: outer-converter │ Status: Canceled │ GPUs: 2x K80 │ step: 160 │ Engine ML Test │
│ Commit: af1464 │ Created: 24 minutes ago │ dataset: kitti/object │ acc: 0.611678 │ 2 GPUs │
│ Branch: demo │ Training: 15 minutes, 28 seconds │ model: avod_model │ loss: 0.156536 │ │
│ │ Finished: 8 minutes ago │ │ │ │
└──────────────────────┴──────────────────────────────────┴───────────────────────┴────────────────┴────────────────┘
$ engine list -n 2
┌──────────────────────┬──────────────────────────────────┬───────────────────────┬────────────────┬────────────────┐
│ Run │ Run Details │ Configuration │ Scalars │ Commit Message │
├──────────────────────┼──────────────────────────────────┼───────────────────────┼────────────────┼────────────────┤
│ ID: outer-converter │ Status: Canceled │ GPUs: 2x K80 │ step: 160 │ Engine ML Test │
│ Commit: af1464 │ Created: 24 minutes ago │ dataset: kitti/object │ acc: 0.611678 │ 2 GPUs │
│ Branch: demo │ Training: 15 minutes, 28 seconds │ model: avod_model │ loss: 0.156536 │ │
│ │ Finished: 8 minutes ago │ │ │ │
├──────────────────────┼──────────────────────────────────┼───────────────────────┼────────────────┼────────────────┤
│ ID: adaptive-pushrod │ Status: Canceled │ GPUs: 8x K80 │ step: 160 │ Engine ML Test │
│ Commit: d4ea15 │ Created: 1 day ago │ dataset: kitti/object │ acc: 0.911592 │ 8 GPUs │
│ Branch: demo │ Training: 36 minutes, 24 seconds │ model: avod_model │ loss: 0.062201 │ │
│ │ Finished: 1 day ago │ │ │ │
└──────────────────────┴──────────────────────────────────┴───────────────────────┴────────────────┴────────────────┘

--no-reservations

Do not display reservations. Reservations are displayed by default.

--no-runs

Do not display runs. Runs are displayed by default.

list-files RUN_ID PATTERN

Lists file names and sizes for any files saved to eml.data.output_dir() during a run.

$ engine list-files adaptive-strut '**'
DIR avod_cars_example/
4.5K avod_cars_example/avod_cars_example.config
DIR avod_cars_example/checkpoints/
DIR avod_cars_example/logs/
DIR avod_cars_example/logs/train/
DIR avod_cars_example/logs/train/2019-04-02 21:59:36.211486/
6.9M avod_cars_example/logs/train/2019-04-02 21:59:36.211486/events.out.tfevents.1554242377.ip-192-168-103-46.ec2.internal
418.0B checkpoint
581.0M checkpoint-0-00000000.data-00000-of-00001
14.3K checkpoint-0-00000000.index
3.6M checkpoint-0-00000000.meta
581.0M checkpoint-1000-00001000.data-00000-of-00001
14.3K checkpoint-1000-00001000.index
3.6M checkpoint-1000-00001000.meta
581.0M checkpoint-2000-00002000.data-00000-of-00001
14.3K checkpoint-2000-00002000.index
3.6M checkpoint-2000-00002000.meta
581.0M checkpoint-3000-00003000.data-00000-of-00001
14.3K checkpoint-3000-00003000.index
3.6M checkpoint-3000-00003000.meta
581.0M checkpoint-4000-00004000.data-00000-of-00001
14.3K checkpoint-4000-00004000.index
3.6M checkpoint-4000-00004000.meta

Using get-files with the same PATTERN will download a zip file containing all files in the output of this command.

See File Patterns for more details on the PATTERN argument.

--no-size

Do not show file sizes. This can sometimes improve performance for very large directories.

login CLI_KEY

Log in to Engine ML. A CLI_KEY can be obtained from your settings page.

logout

Log out of Engine ML.

run RUN_CONFIG_PATH

engine run RUN_CONFIG_PATH is the primary way to launch runs on Engine ML. Runs can be launched locally or remotely (coming soon). The first argument to engine run is the file path to a run configuration file. You can generate a stub run config with engine config.

If you do not specify RUN_CONFIG_PATH, engine run defaults to a file called engine.yaml at the root of your repository.

$ engine run local.yaml
[ENGINE] Updating engine remote
[ENGINE] Your Run ID is: floor-hood
[ENGINE] Website: https://app.engineml.com/jobs/floor-hood
Train Epoch: 1/1 Loss: 2.309322
Train Epoch: 1/1 Loss: 2.304794
Train Epoch: 1/1 Loss: 2.288168
Train Epoch: 1/1 Loss: 2.286560
Train Epoch: 1/1 Loss: 2.313777
Train Epoch: 1/1 Loss: 2.248435
Train Epoch: 1/1 Loss: 2.203526
Train Epoch: 1/1 Loss: 2.172441
Test set: Average loss: 2.0197, Accuracy: 42.00%
[ENGINE] Your run is finished
[ENGINE] Syncing remaining logs to Engine ML...
[ENGINE] sending incremental file list
[ENGINE] Syncing remaining metrics to Engine ML...
[ENGINE] sending incremental file list
[ENGINE] Syncing remaining metadata to Engine ML...
[ENGINE] sending incremental file list
[ENGINE] ./
[ENGINE] eml_model/
[ENGINE] eml_model/config.json
[ENGINE]
[ENGINE] 164 100% 0.00kB/s 0:00:00
[ENGINE] 164 100% 0.00kB/s 0:00:00 (xfr#1, to-chk=0/3)
[ENGINE] Syncing remaining outputs to Engine ML...
[ENGINE] sending incremental file list

info

During an engine run, you may see lines prefixed with [ENGINE]. These messages are for informational purposes only and will not be captured.

Reproducibility and Analytics

Every engine run command tracks code, logs, system metrics, and files.

Engine ML uses git to track code changes. To reduce friction when launching a run, any unstaged changes are committed automatically. If you do not specify a commit message with the -m flag, one is generated automatically.

Example Generated Commit Message
commit 57ce99bf71c11c6d8569596d1d7ae9d8fca9c22a
Date: Wed Feb 10 12:00:00 2020 -0800
Modified files: pytorch/mnist/pytorch_mnist.py, tf/mnist/engine_build.sh

tip

Engine also supports standard git add and git commit workflows.

Existing projects on Engine can be cloned with git clone git@git.app.engineml.com/OWNER/TEAM.

After the commit, your branch is automatically pushed to a git remote where it is tagged with the Run ID. Common git commands continue to work with these Run IDs.

$ git diff metering-gasoline floor-hood
diff --git a/tf/mnist/engine.yaml b/tf/mnist/engine.yaml
index c39f2e6..a647e6d 100644
--- a/tf/mnist/engine.yaml
+++ b/tf/mnist/engine.yaml
@@ -1,7 +1,7 @@
apiVersion: "3.0.0"
build: engine_build.sh
-command: python tf/mnist/tensorflow_mnist.py --data-dir /data/mnist
+command: python pytorch/mnist/pytorch_mnist.py --data-dir /data/mnist
dataBucket: datasets.us-east-1.engineml.com
dataBucketSubdirectory: /mnist
environment:
MY_SECRET_ENV_VAR: foo
WIDGET_TYPE: gizmo

-o / --override KEY VALUE

Modify settings in your run configuration.

$ engine run -o command 'echo hello' local.yaml
[ENGINE] Updating engine remote
[ENGINE] Modified files: tf/mnist/engine.yaml
[ENGINE] Your Run ID is: speedy-stick
[ENGINE] Website: https://app.engineml.com/jobs/speedy-stick
hello
[ENGINE] Your run is finished
[ENGINE] Syncing remaining logs to Engine...
[ENGINE] sending incremental file list
[ENGINE] Syncing remaining metrics to Engine...
[ENGINE] sending incremental file list
[ENGINE] Syncing remaining metadata to Engine...
[ENGINE] sending incremental file list
[ENGINE] Syncing remaining outputs to Engine...
[ENGINE] sending incremental file list

To override nested settings like environment, use -o environment ENV=VAL.

-m / --commit-message STRING

Use a custom commit message instead of an autogenerated one.

stop RUN/NOTEBOOK/RESERVATION_ID

Stops remote runs (coming soon), notebooks (coming soon), and reservations (coming soon).

Local runs are placed in the canceled state but continue running. You can use this to force a local run into the canceled state.

sync-cache RUN_ID

Sync a run's logs, configuration metadata, system metrics, and files saved on the local machine to Engine.

tip

If you lose network connection when running locally, executing sync-cache after the run has completed will update the run's data stored on Engine.

Data is stored at ~/.engine/local/RUN_ID.

-a / --all

Syncs data for all runs cached on your local machine.

$ engine sync-cache -a
Updating the following runs:
ethical-voltage, strongest-magneto
✔ Syncing cache for ethical-voltage
✔ Syncing cache for strongest-magneto

--clean

After syncing, delete the local copies of the files.

tag RUN_ID TAG

Tags provide an alias for a group of experiments. For example, you might want a tag called best-model for your team's model with the lowest validation loss. You may also want to group models that share a common structure under a tag such as resnet-50 or inception-v2.

Experiments aren't restricted to a single tag. If your best model uses ResNet-50, you could tag the experiment with both best-model and resnet-50.

$ engine tag adaptive-strut my-tag
Tagging adaptive-strut with tag "my-tag"...
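Because a tag groups several experiments, you will often want to apply the same tag to a list of runs. The sketch below loops over run IDs using the tag command shown above. The function name tag_runs and the ENGINE_CMD variable are hypothetical and not part of the Engine CLI.

```shell
# Hypothetical helper: apply one tag to several runs.
# Usage: tag_runs TAG RUN_ID [RUN_ID...]
# ENGINE_CMD is an illustrative override (not an Engine setting) that lets you
# dry-run the loop, e.g. ENGINE_CMD=echo.
tag_runs() {
  tag="$1"
  shift
  for run_id in "$@"; do
    # Stop at the first failure so a mistyped run ID is noticed.
    "${ENGINE_CMD:-engine}" tag "$run_id" "$tag" || return 1
  done
}
```

For example, tag_runs resnet-50 adaptive-strut helpful-magneto tags both runs with resnet-50.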

-c / --color

The tag will appear as the specified color in the dashboard.

$ engine tag helpful-magneto red-tag -c red
Tagging helpful-magneto with tag "red-tag"...

For a full list of colors, run engine tag-colors.

-d / --delete

Deletes a tag, if it exists.

$ engine tag -d adaptive-strut my-tag
Deleting tag "my-tag" from run adaptive-strut...

tail RUN_ID

View your run's logs. By default, this command only shows the last 10 lines of the logs from replica 0. stderr will appear red unless the --no-color flag is enabled.

$ engine tail adaptive-strut
First layer weights: (3, 3, 3, 32) [-0.01514911 -0.09147684 -0.08772741 -0.05053544 0.09287488]
Step 4160, Total Loss 1.676, Time Elapsed 15.583 s
First layer weights: (3, 3, 3, 32) [-0.01512622 -0.09169471 -0.08770902 -0.05086816 0.09276406]
Step 4170, Total Loss 1.535, Time Elapsed 17.184 s
First layer weights: (3, 3, 3, 32) [-0.01529853 -0.09192348 -0.08704228 -0.05095764 0.09282026]
Step 4180, Total Loss 1.267, Time Elapsed 13.575 s
First layer weights: (3, 3, 3, 32) [-0.01554769 -0.09195881 -0.08685017 -0.05112816 0.09311653]
Step 4190, Total Loss 0.120, Time Elapsed 14.800 s
First layer weights: (3, 3, 3, 32) [-0.01578166 -0.09185258 -0.08646027 -0.05092932 0.09335227]
Step 4200, Total Loss 1.898, Time Elapsed 14.845 s

-n INTEGER

The number of lines to seek backwards before showing logs. If INTEGER is negative, logs are shown from the beginning.

--replica INTEGER

Stream the logs of a single replica. By default, stream logs from replica 0.

--download-all

Downloads all logs from all replicas to a folder. This option and the -n option are mutually exclusive.

tip

If your run crashes and replica 0 does not have any logs that indicate an error, one of your other replicas might have crashed.

$ engine tail adaptive-strut --download-all
Downloading logs for run adaptive-strut
Logs saved to directory "adaptive-strut-logs-190410_013642"
$ tail adaptive-strut-logs-190410_013642/*.log
==> adaptive-strut-logs-190410_013642/0.log <==
First layer weights: (3, 3, 3, 32) [-0.01514911 -0.09147684 -0.08772741 -0.05053544 0.09287488]
Step 4170, Total Loss 1.535, Time Elapsed 17.184 s
First layer weights: (3, 3, 3, 32) [-0.01529853 -0.09192348 -0.08704228 -0.05095764 0.09282026]
==> adaptive-strut-logs-190410_013642/1.log <==
First layer weights: (3, 3, 3, 32) [-0.01514911 -0.09147684 -0.08772741 -0.05053544 0.09287488]
Step 4170, Total Loss 1.521, Time Elapsed 17.124 s
First layer weights: (3, 3, 3, 32) [-0.01529853 -0.09192348 -0.08704228 -0.05095764 0.09282026]
Step 4180, Total Loss 1.267, Time Elapsed 13.565 s
First layer weights: (3, 3, 3, 32) [-0.01554769 -0.09195881 -0.08685017 -0.05112816 0.09311653]
Traceback (most recent call last):
File "train.py", line 101, in <module>
IndexError: list index out of range
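Once the logs are downloaded, finding the replica that failed can be scripted. This is a minimal sketch, assuming one N.log file per replica as in the listing above and a Python-style traceback marker in the logs. The function name find_failed_replicas is hypothetical and not part of the Engine CLI.

```shell
# Hypothetical helper: print the replica log files that contain a Python traceback.
# Usage: find_failed_replicas LOGS_DIR
find_failed_replicas() {
  logs_dir="$1"
  # -l prints only the names of matching files; -F matches the text literally.
  # grep exits non-zero when nothing matches, which is treated as "no failures".
  grep -lF -e 'Traceback (most recent call last)' "$logs_dir"/*.log 2>/dev/null || true
}
```

For the download shown above, find_failed_replicas adaptive-strut-logs-190410_013642 would print the path to 1.log.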

-f / --follow

Follow logs from a run as they are produced.

version

Prints the current CLI version.