6. Virtual Environments

Using Virtual Environments

There are a few different types of virtual environments for running workflows on the Engaging cluster. The virtual environment you need for your workflow will depend on multiple factors, including but not limited to:

  • Language your workflow is written in (Python, Ruby, C++, Java, etc.)

  • Dependency packages your workflow needs

  • Size of your workflow

  • Permissions needed inside your workflow

Interactive Virtual Environments

Interactive virtual environments pair a GUI or web interface with the cluster’s command line. On the Engaging cluster, the interactive option is Jupyter Notebook, which provides a web interface for users to run their workflows.

Jupyter Notebooks

Jupyter Notebook extends the console-based approach to interactive computing in a qualitatively new direction, providing a web-based application suitable for capturing the whole computation process: developing, documenting, and executing code, as well as communicating the results.

Notebooks can be launched from the engaging-ood portal; instructions for accessing the portal are available here.

There are two forms of Jupyter Notebooks on Engaging OOD found under “Interactive Apps” or “My Interactive Sessions”:

  • Graph-tool Jupyter Notebook: This is a notebook that includes a graph tool and uses a small number of cores and a small amount of memory.

  • Jupyter Notebook: This is a more configurable Jupyter Notebook, with several customizable options available to expand its functionality:

    • Additional Modules: Allows you to preload modules into the Jupyter Notebook as it is launched from Engaging OOD
    • Anaconda Module Selection: Allows selection of system-wide Anaconda module usage with the Jupyter Notebook
    • Custom Conda Environment: Allows use of an existing conda environment inside the Jupyter Notebook
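Once a notebook is running, it can be useful to confirm which Python installation and environment it actually picked up. The following is a minimal sketch that could be pasted into a notebook cell; the function name `describe_environment` is hypothetical, and the check relies only on the Python standard library plus the `CONDA_DEFAULT_ENV` variable that conda sets when an environment is activated.

```python
import os
import sys


def describe_environment():
    """Return basic facts about the Python environment running this code."""
    return {
        # Full path of the interpreter executing this cell or script
        "executable": sys.executable,
        # Root directory of the active Python installation or environment
        "prefix": sys.prefix,
        # Inside a venv-style environment, sys.prefix differs from sys.base_prefix
        "in_venv": sys.prefix != sys.base_prefix,
        # Set by `conda activate`; None when no conda environment is active
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If the reported executable or conda environment is not the one selected in the launch form, the notebook is probably using a different kernel than intended.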

Scripted Virtual Environments

Scripted virtual environments on the Engaging cluster require a user’s workflow to be scripted; that is, the job the user runs is driven by a piece of code, whether written in Python, Ruby, C++, Java, or any other programming language.

  • Python Pip: Using Python’s pip feature is the most common and the simplest form of virtual environment available on the cluster. However, it can only be used with Python workflows, since pip is Python’s package installer.

  • Python Virtual Environments: A Python virtual environment is a cooperatively isolated runtime environment that allows users to install and upgrade Python distribution packages without interfering with the behaviour of other Python applications running on the same system. They essentially allow you to create a “virtual” isolated Python installation and install packages into that virtual installation. When you switch projects, you can simply create a new virtual environment and not have to worry about breaking the packages installed in the other environments. It is always recommended to use a virtual environment while developing Python applications.

  • Anaconda: Also referred to as ‘conda’, Anaconda is built on Python and creates a contained virtual environment in which users can run workflows that depend on software other than Python-based modules. Both Pip and Python Virtual Environments require the software to be Python modules or Python based, whereas Conda also allows software that is not Python based to be installed in a virtual environment.
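As an illustration of the Python virtual environment option described above, the sketch below creates an isolated environment with the standard library’s `venv` module. It is not a cluster-specific recipe: the helper name `create_venv` and the directory name `demo-env` are hypothetical, and the example writes to a temporary directory so it leaves nothing behind.

```python
import os
import sys
import tempfile
import venv
from pathlib import Path


def create_venv(env_dir, with_pip=False):
    """Create a virtual environment at env_dir and return its interpreter path."""
    # Use symlinks on POSIX systems, matching the default of `python -m venv`
    venv.create(env_dir, with_pip=with_pip, symlinks=(os.name != "nt"))
    if sys.platform == "win32":
        return Path(env_dir) / "Scripts" / "python.exe"
    return Path(env_dir) / "bin" / "python"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = Path(tmp) / "demo-env"  # hypothetical environment name
        python_path = create_venv(env_dir)
        print("Environment created at:", env_dir)
        print("Interpreter:", python_path)
        # pyvenv.cfg marks the directory as a virtual environment
        print("Has pyvenv.cfg:", (env_dir / "pyvenv.cfg").is_file())
```

To install packages into such an environment, one would typically create it with `with_pip=True` and then run that environment’s interpreter with `-m pip install`, so that packages land inside the environment rather than in the system-wide Python.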