How to Run dsgrid on Kestrel

  1. ssh to a login node and start a screen session (or similar, e.g., tmux):

$ screen -S dsgrid

  2. Follow the installation instructions at Installation.

  3. Create a dsgrid runtime config file:

$ dsgrid config create sqlite:////projects/dsgrid/standard-scenarios.db -N standard-scenarios --offline

  4. Start a Spark cluster with your desired number of compute nodes by following the instructions at How to Start a Spark Cluster on Kestrel.

  5. Run all CPU-intensive dsgrid commands from the first node in your HPC allocation like this:

$ spark-submit --master=spark://$(hostname):7077 $(which dsgrid-cli.py) [command] [options] [args]

  6. If your ssh connection drops for any reason, ssh to the same login node you used the first time and resume the screen session you started in step 1:

$ screen -r dsgrid
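
The spark-submit command above builds the Spark master URL from the hostname of the node it runs on, which is why it has to run on the first node of the allocation. As a rough sketch only, the URL construction could be wrapped in a small shell function so that the value can be inspected before submitting. The function name spark_master_url is hypothetical and is not part of dsgrid or Spark; the port 7077 is taken from the command above.

```shell
# Hypothetical helper, not part of dsgrid or Spark.
# Prints a Spark master URL of the form spark://HOST:PORT.
#   $1: master host (assumption: defaults to the current node's hostname)
#   $2: master port (defaults to 7077, the port used in the command above)
spark_master_url() {
    printf 'spark://%s:%s\n' "${1:-$(hostname)}" "${2:-7077}"
}

# Example use on the first node of the allocation (shown as a comment so that
# sourcing this snippet does not launch anything):
#   spark-submit --master="$(spark_master_url)" "$(which dsgrid-cli.py)" [command] [options] [args]
```

Calling spark_master_url with no arguments on the first node should print the same URL that the original command passes to --master. Passing a hostname explicitly is only useful for checking the value from another node.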