Using VS Code on BMRC

BMRC would like to acknowledge Ellen Visscher for creating the initial version of this webpage.

 

Before we start, a few things to remember about the BMRC server.

  • When you log in with 2FA, you are on a login node. From there you submit jobs (interactive or batch), which run on the compute nodes. Running computationally intensive tasks on the login node is bad practice, as it can slow down everyone else using BMRC.
  • When you first log in to BMRC you are in your home directory, /users/<group>/<username>, which is limited to 10GB (see documentation). Your group home folder, /well/<group>/users/<username>, has a much higher storage allocation. Replace <group> and <username> with your own details.
  • If you are on the standard BMRC partition, you cannot SSH directly onto compute nodes, and compute nodes do not have access to the internet. GPU nodes do have access to select internet sites. Your group might have its own special BMRC partition, where neither restriction may apply.

Here is the VS Code documentation on remote development. It offers a few remote solutions, but due to the final point above, remote tunnels on the compute nodes are not available, nor is sshing directly onto the compute nodes.

 

VS Code Remote

In the Extensions panel of VS Code, install the Remote Development extension.

When you use it, VS Code creates an SSH connection (or piggybacks off an existing one, see below) and spins up a "VS Code server" on the remote login node. The server holds the backend processes that VS Code needs, while the frontend user interface runs on your local computer. Information passes between frontend and backend over the SSH connection, so from a user's perspective it behaves very similarly to working locally. The difference is that VS Code has access to all your files and code on the server, and anything you run executes on the server side.

When the SSH connection or VS Code window is closed, the VS Code server instance closes, so there is some startup overhead every time you leave and come back.

Setup

First ensure you have installed the Remote Development extension. Then open your settings JSON: press Command/Ctrl + Shift + P -> Preferences: Open User Settings (JSON).

Then add the following to the settings.json. This ensures that, on the BMRC server side, everything related to VS Code is installed in your group home directory, which has more space. cluster3 is preferred as the login node for VS Code.

{
    "remote.SSH.serverInstallPath": {
        "<username>@cluster3.bmrc.ox.ac.uk": "/well/<group>/users/<username>"
    }
}

Then navigate to the Remote Explorer panel, which should be in the left sidebar alongside the project tree, search and run/debug. Its symbol is a computer with a little circle at the bottom right.

In the top right of that panel you will see a drop-down that might be set to Dev Containers. Press the down arrow and select Remotes (Tunnels/SSH).

Then find <username>@cluster3.bmrc.ox.ac.uk and press the arrow. You will be prompted to enter your first and second factor through the VS Code UI (see the SSH config section below for tips on avoiding this). This should take you to your home directory on cluster3; you can do something similar with cluster2 if you prefer. If you do not see the prompt for 2FA, enable the remote.SSH.showLoginTerminal setting by following the instructions here.

If this is your first time using VS Code on the server, you will need to install some extensions on the server side. To do this, navigate to the Extensions panel and install those that are relevant (e.g. the Python extension pack, Jupyter, GitLens, Copilot) on the server.

You will then want to open a folder by pressing the Open button. The folders and folder structure listed are those on the BMRC servers, and due to permissions you will only be able to open your own group folders.

The integrated terminal is also on the server side, but remember that it is connected to the login node, not a compute node. Take care when running anything directly in the terminal unless you have followed the steps in the Run/Debug instructions below.
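A quick way to check where a terminal is running before launching anything heavy is sketched below. It assumes the scheduler is Slurm (as the srun commands later on this page suggest) and relies on Slurm setting SLURM_JOB_ID inside a job. The function name where_am_i is made up for illustration, and a shell without that variable is only probably a login node.

```shell
# Hypothetical helper: report whether this shell is inside a Slurm job.
# Slurm sets SLURM_JOB_ID for processes running inside a job (e.g. under srun).
where_am_i() {
    if [ -n "${SLURM_JOB_ID:-}" ]; then
        echo "inside Slurm job ${SLURM_JOB_ID} on $(hostname)"
    else
        echo "not inside a Slurm job on $(hostname) (probably a login node)"
    fi
}

where_am_i
```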

Optional SSH config things

You can change your SSH config so that you only ever create one SSH connection, and every other connection you make piggybacks off this first connection. If you often open many tabs connecting to the server in your native terminal, this means you will only have to do your 2FA once per day (unless your internet connection drops so that the initial connection fails).

This also means that VS Code remote can piggyback off this initial connection, so you won't have to enter the 2FA through the VS Code UI.

To do this, navigate to the .ssh folder in your home directory on your personal computer, and edit your config file (~/.ssh/config) to contain:

Host cluster3
  HostName cluster3.bmrc.ox.ac.uk
  # Replace <username> with your BMRC username
  User <username>

Host *
  ControlMaster auto
  ControlPath ~/.ssh/sockets/%r@%h-%p
  ControlPersist 1h
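One step this config relies on but does not perform: ssh will not create the directory named in ControlPath, and connection sharing will not work if it is missing (ssh typically reports that it cannot bind to the path). A minimal sketch, run once on your personal computer, assuming the ~/.ssh/sockets path used in the config:

```shell
# Create the directory that will hold the control sockets.
# -p makes this a no-op if the directory already exists.
mkdir -p "$HOME/.ssh/sockets"

# Keep both directories private to your user.
chmod 700 "$HOME/.ssh" "$HOME/.ssh/sockets"
```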

Back in your settings.json, also add an entry for the cluster3 alias:

{
    "remote.SSH.serverInstallPath": {
        "cluster3": "/well/<group>/users/<username>"
    }
}

Then, in your normal terminal, run ssh cluster3 and enter your 2FA. You have now established the master connection. If you run ssh cluster3 again in a new terminal tab, you will not have to re-enter your 2FA. Similarly, in VS Code you will not have to engage with the 2FA GUI.

To explain what the above is doing:

ControlMaster auto: Enables the sharing of multiple sessions over a single network connection. When set to "auto", SSH will try to use a master connection if one exists, but if not, it will create a new one.

ControlPersist: When used in conjunction with ControlMaster, this option specifies whether the master SSH process should remain in the background when the initial client connection has been closed. If set to "yes" or a time (in this case 1h), the master connection will stay alive for further client connections until it has remained idle for that duration.

ControlPath ~/.ssh/sockets/%r@%h-%p: Specifies the location of the control socket used for connection sharing. The use of percent placeholders (%r, %h, and %p) means the socket path will contain the remote login name, the target hostname, and the target port number, respectively. So, different connections (e.g if sshing onto different servers) will have different socket paths and therefore won't conflict with each other.
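To inspect or close the shared connection, OpenSSH provides control commands through its -O option. The wrappers below are only a sketch: the function names are hypothetical, and cluster3 is the alias from the config above.

```shell
# Hypothetical wrappers around OpenSSH's multiplexing control commands.
# The host defaults to the cluster3 alias defined in ~/.ssh/config.

# Report whether a master connection is currently running.
master_status() {
    ssh -O check "${1:-cluster3}"
}

# Ask the master process to exit.
master_close() {
    ssh -O exit "${1:-cluster3}"
}
```

For example, master_status can confirm that the first ssh cluster3 left a master running, and master_close ends it, after which the next ssh cluster3 will prompt for 2FA again.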

Note: If you do port forwarding/tunneling/listening and make a mistake in a command that listens on a certain port, you can't listen on that port again until the original SSH session is closed. This is because the port is still in use by the original, shared SSH session and is not released while that session stays open. If you don't know what port forwarding is, you can safely ignore this.

Second note: unfortunately, ControlMaster does not work on Windows.

Python using remote VS Code

As a first step, make sure that you have the Python extension pack, which includes the debugger, Python and Pylance extensions. Together these provide a range of useful features.

These features include code navigation (jumping to and from definitions, hovering to see a function definition, etc.). To navigate Python code properly, VS Code needs to know which Python interpreter you are using, so that it can see the installed libraries and find where functions are defined. You therefore need to set your Python interpreter. This is one of the first things you should do when you open a new Python project in VS Code.

Press Ctrl/Cmd + Shift + P -> Python: Select Interpreter

If using miniforge, the path to the interpreter is something like: /well/<group>/users/<username>/miniforge3/envs/<env_name>/bin/python.

You can find your python interpreter path in the terminal by activating your relevant environment and typing the command which python. If you have a virtual env for example, then it might instead simply be: venv/bin/python

You should also add the following to your settings.json:

{
    "terminal.integrated.env.linux": {
        "PYTHONPATH": "${workspaceFolder}"
    }
}

This ensures that anything run from the integrated terminal has the workspace folder on the PYTHONPATH, so Python knows where to look for your project's files. If you get module-not-found errors for files or functions defined in your own project, then the PYTHONPATH might be part of the issue (assuming you have included the relevant __init__.py files).
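When chasing a module-not-found error, it can help to confirm that the terminal really has the project folder on PYTHONPATH. A small sketch (the function name in_pythonpath is hypothetical), intended to be run from the integrated terminal while in the project folder:

```shell
# Hypothetical helper: succeed if directory $1 is an entry of PYTHONPATH.
# PYTHONPATH is a colon-separated list, so wrap both sides in colons
# to match whole entries only.
in_pythonpath() {
    case ":${PYTHONPATH:-}:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if in_pythonpath "$PWD"; then
    echo "PYTHONPATH contains $PWD"
else
    echo "PYTHONPATH does not contain $PWD"
fi
```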

Note, if the project you are in is a large python project, and the folder you are working in is not the sources root, then this blog can help you set the environment variables properly.

Jupyter using remote VS Code

If you would like to use jupyter notebooks without VS Code, follow the instructions here to use Open OnDemand.

It can be nice to run Jupyter in VS Code so that you get code completion, navigation and keyboard shortcuts. To do this, make sure you have the Jupyter extension pack installed. Then follow these steps in the integrated terminal (or a normal terminal on the login node):

tmux new  # or a screen session; optional, but keeps the job running even if your SSH connection drops
cd <project_folder>  # if not already there
srun --mem 10GB --cpus-per-task 1 --pty bash  # interactive job; adjust resources as necessary; takes you onto a compute node
conda activate <your_env>  # or however you activate your environment
jupyter notebook --no-browser --ip='*'  # quoted so the shell does not try to expand the *

Make sure to note which compute node you are on, e.g. compa012. The command will then print two links like these to the terminal:

http://localhost:8888/tree?token=f0cf7ca3197e85ad6d58cdfa94b6f66ecf06546967d898c3
http://127.0.0.1:8888/tree?token=f0cf7ca3197e85ad6d58cdfa94b6f66ecf06546967d898c3

Copy either of them, but replace the localhost or 127.0.0.1 part with the compute node that you are on. If you are on compa012, the link should look like:

http://compa012:8888/tree?token=f0cf7ca3197e85ad6d58cdfa94b6f66ecf06546967d898c3

Then inside your jupyter notebook file in VS Code, press Select Kernel -> Existing Jupyter Server -> paste in the above link -> Enter -> Enter (until GUI is gone). You are now connected to the jupyter kernel that is running on the compute node, and can continue your analysis as normal.

The nice thing about this is that even if your ssh connection fails, you can still connect to the jupyter server for as long as your compute node job is running.
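The host substitution described above can also be scripted. This sketch uses a hypothetical function, jupyter_url_for_node, that swaps localhost or 127.0.0.1 for a node name; the token in the example is a placeholder.

```shell
# Hypothetical helper: rewrite a Jupyter URL to point at a compute node.
#   $1  URL printed by Jupyter (host is localhost or 127.0.0.1)
#   $2  compute node name, e.g. compa012
jupyter_url_for_node() {
    printf '%s\n' "$1" |
        sed -e "s#//localhost:#//$2:#" -e "s#//127\.0\.0\.1:#//$2:#"
}

jupyter_url_for_node 'http://localhost:8888/tree?token=<token>' compa012
# prints http://compa012:8888/tree?token=<token>
```

On the compute node itself you could pass "$(hostname)" as the second argument, provided the name it reports is the short node name.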

Note: tmux has some different commands/special keys from normal bash; see here for some shortcuts, including mouse scrolling.

Second note: although Jupyter in VS Code has advantages, it is also sometimes quite buggy/slow, so try it out and decide what you prefer.

Debugging/Running on the compute nodes

Running code through VS Code on a compute node is relatively easy. This is because when you press run on a file, it just sends the run command to the current integrated terminal pane (this is the terminal you see in the VS Code window, at the bottom, if you can't see it, go to View -> Terminal). Hence, if you start an interactive job in your integrated terminal, then press run, your file will be running on a compute node.

Unfortunately, the debugger does not work in this way, so debugging on a compute node is more involved. It is, however, very useful. You need to create a debugging session on a compute node that spins up a debugpy server listening on a port for instructions. VS Code then connects to this port and sends instructions (e.g. breakpoints, step over, code to run) that the debugpy session on the compute node receives and executes. To do this:

In integrated terminal:

cd <project_folder>  # if not already there
srun --mem 10GB --cpus-per-task 1 --pty bash  # interactive job; adjust resources as necessary; takes you onto a compute node
conda activate <your_env>  # or however you activate your environment
python -m debugpy --listen 0.0.0.0:8000 --wait-for-client <path_to_script>.py

Make sure you note which compute node you're on, for example compa012. Then find or create your launch.json: press the play/little bug symbol in the left main sidebar, then either press "create a launch.json file" or press the settings symbol at the top right of the pane. If it exists, the launch.json is stored in a .vscode directory in your project folder (make sure to add this to your .gitignore). In the launch.json, add a configuration that looks like this:

 "configurations": [
        {
            "name": "Attach to Remote Python Debugger",
            "type": "debugpy",
            "request": "attach",
            "connect": {
                "host": "compa012",
                "port": 8000
            },
            "pathMappings": [
                {
                    "localRoot": "${workspaceFolder}",
                    "remoteRoot": "${workspaceFolder}"
                }
            ]
        }
 ]

Make sure the port 8000 aligns with what you listened to in your debugpy command above, and that the host corresponds to the compute node.
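To avoid typos in the host or port, you could print the attach configuration from the compute node and paste it into launch.json. This is only a sketch: attach_config is a hypothetical helper, and it assumes the node name reported by hostname is one that VS Code on the login node can reach.

```shell
# Hypothetical helper: print an attach configuration for launch.json.
#   $1  compute node name   $2  port debugpy is listening on
attach_config() {
    # \$ keeps ${workspaceFolder} literal so VS Code, not the shell, expands it.
    cat <<EOF
{
    "name": "Attach to Remote Python Debugger",
    "type": "debugpy",
    "request": "attach",
    "connect": { "host": "$1", "port": $2 },
    "pathMappings": [
        { "localRoot": "\${workspaceFolder}", "remoteRoot": "\${workspaceFolder}" }
    ]
}
EOF
}

attach_config "$(hostname)" 8000
```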

Then navigate to the <path_to_script>.py file -> press the little arrow next to the play/debug button -> debug using launch.json -> make sure to choose the option with the name "Attach to Remote Python Debugger". This should now create a debug session attached to the compute node, everything else should behave as normal.

A few points:

  1. If your debug session ends, you'll have to rerun the python -m debugpy --listen 0.0.0.0:8000 --wait-for-client <path_to_script>.py command. Make sure you're still on the compute node.
  2. If your interactive compute node session ends and you create a new one that takes you to a different compute node, you'll have to edit the configuration so that the host corresponds to the new compute node.
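If you lose track of which node an interactive job landed on, Slurm's squeue can list it. A sketch, assuming squeue is available where you run it (for example on the login node); my_job_nodes is a hypothetical name.

```shell
# Hypothetical helper: list your running jobs as "<job id> <job name> <node list>".
#   -u  only your jobs      -t RUNNING  only running jobs
#   -h  no header line      -o          output format (%i id, %j name, %N nodes)
my_job_nodes() {
    squeue -u "${USER:-$(id -un)}" -t RUNNING -h -o '%i %j %N'
}

# Only call squeue where it exists, so the snippet is harmless elsewhere.
if command -v squeue >/dev/null 2>&1; then
    my_job_nodes
else
    echo "squeue not found: run this on a BMRC login node"
fi
```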

Note: If you are debugging and notice duplicate file tabs being opened with a different path, i.e. /gpfs3/well/<group>/users/<username>/path_to_wherever instead of /well/<group>/users/<username>/path_to_wherever, this is because VS Code debugging does not resolve symbolic links, and internally on BMRC /well/<group>/users/<username>/path_to_wherever is actually a symbolic link to /gpfs3/well/<group>/users/<username>/path_to_wherever. To stop this behaviour when debugging, provide the absolute path beginning with /gpfs3 when you first open your folder on the remote. This is related to the following github issue and issue.
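To see the physical path that a /well path resolves to, you can ask the shell to resolve symbolic links. A minimal sketch; physical_path is a hypothetical name, and the /well path in the comment is a placeholder to replace with your own.

```shell
# Hypothetical helper: print the physical path of directory $1,
# with all symbolic links resolved.
physical_path() {
    # The subshell keeps the caller's working directory unchanged;
    # pwd -P prints the path without symbolic links.
    (cd "$1" && pwd -P)
}

# Example with the current directory; on BMRC you would pass
# something like /well/<group>/users/<username>/<project_folder>.
physical_path .
```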

Troubleshooting:

If you find that VS Code is being really slow, or jupyter is being really slow- try disabling GitHub Copilot (for that file or entirely), if you are using it. Due to network latency this might be part of the issue.

Furthermore, the VS Code python, pylance and jupyter extensions are all being constantly developed- which is great for the most part but can mean that bugs get introduced. If weird things start happening make sure to check their github issues, and you can try downgrading to earlier versions of the extensions to see if this fixes it. You can also contribute to open source development by filing your own issue!
