Launching Jobs
Set Up Your Own Conda/Mamba Environment
In computational research, ensuring that code runs consistently across different systems and over time is essential. This is where environment management tools like conda and mamba come in.
conda is a powerful package and environment manager that allows you to create isolated environments — each with its own versions of Python and packages. mamba is a faster, drop-in replacement for conda that significantly speeds up environment creation and dependency resolution.
Why It Matters: Reproducibility
In modern research, reproducibility isn’t optional — it’s a core requirement. By using conda or mamba environments, you can:
Isolate dependencies for specific projects
Avoid version conflicts between packages
Share exact software environments with collaborators
Ensure long-term reproducibility of analysis and results
Imagine you share a Jupyter notebook with a colleague. If you’ve used a conda/mamba environment and included an environment.yml file, your colleague can recreate your exact setup — no more “it works on my machine” problems.
Without proper environment management, research code that runs today may fail tomorrow due to subtle changes in software versions. By making conda or mamba environments part of your standard workflow, you are not just managing software — you are investing in the integrity, longevity, and reproducibility of your research.
🛠️ Basic Workflow
Create a new environment:
Python environment
module load mamba
mamba create -p /kellogg/proj/<your_netid>/envs/python_proj python=3.10
R environment with the optional dplyr package
module load mamba
mamba create -p /kellogg/proj/<your_netid>/envs/r_proj -c conda-forge r-base=4.4.0 r-dplyr
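The environment prefix is long and easy to mistype, so it can help to build it once and reuse it. The following is a minimal sketch: the function name env_prefix and the NetID abc123 are hypothetical placeholders, not part of KLC's tooling.

```shell
# Hypothetical helper: build an environment prefix under /kellogg/proj.
# $1 = your NetID, $2 = environment name
env_prefix() {
    printf '/kellogg/proj/%s/envs/%s\n' "$1" "$2"
}

# Placeholder NetID; replace abc123 with your own.
PY_ENV="$(env_prefix abc123 python_proj)"
echo "$PY_ENV"

# The create command shown above would then read:
# mamba create -p "$PY_ENV" python=3.10
```

Keeping the prefix in a variable also means the same value can be reused when activating the environment later.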
Activate the environment:
module load mamba
source activate /kellogg/proj/<your_netid>/envs/python_proj
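Before installing anything, it is worth confirming which environment is active. A minimal sketch, assuming the CONDA_PREFIX variable that conda-style activation normally sets; the function name active_env is hypothetical.

```shell
# Report the active environment, based on CONDA_PREFIX.
# Assumption: activation sets CONDA_PREFIX to the environment's path.
active_env() {
    if [ -n "${CONDA_PREFIX:-}" ]; then
        printf 'active environment: %s\n' "$CONDA_PREFIX"
    else
        printf 'no environment is active\n'
    fi
}

active_env
```

If the reported path is not the one you expect, packages installed in the next step would land in the wrong environment.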
Install packages:
mamba install pandas
Export environment (for sharing or archiving):
mamba env export > environment.yml
Recreate environment from a file:
mamba env create -f environment.yml
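For orientation, here is a sketch of what a small hand-written environment file might look like. The file name example_environment.yml and its contents are illustrative assumptions, not the output of a real export, which would usually list many more packages with pinned versions.

```shell
# Write a small example file (hypothetical contents, not a real export).
cat > example_environment.yml <<'EOF'
name: python_proj
channels:
  - conda-forge
dependencies:
  - python=3.10
  - pandas
EOF

# Show what was written.
cat example_environment.yml

# To build it at a specific location, conda-style "env create" accepts a
# prefix (the path below is a placeholder):
# mamba env create -f example_environment.yml -p /kellogg/proj/<your_netid>/envs/python_proj
```

A different file name is used here so that an existing environment.yml in the same directory is not overwritten.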
Leave an environment:
conda deactivate
Loading Software with Modules
KLC uses Environment Modules to give users access to the software installed on the cluster.
List available modules:
module avail
Load a module (e.g., for R or Python):
module load R
module load python/3.10
Check what you’ve loaded:
module list
Unload a module:
module unload R
Modules let you select the software version you need without affecting other users or other installed versions.
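In scripts, it can be useful to load an explicit set of modules from a clean state so that results do not depend on whatever happened to be loaded in the session. A minimal sketch, assuming the standard Environment Modules subcommands purge and load; the function name load_modules is hypothetical.

```shell
# Load an explicit set of modules; fail early if `module` is unavailable.
load_modules() {
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found (not on a cluster node?)" >&2
        return 1
    fi
    module purge                 # unload everything currently loaded
    for m in "$@"; do
        module load "$m"         # load each requested module in order
    done
}

# Example usage on KLC:
# load_modules R python/3.10
```

Naming the version explicitly (python/3.10 rather than python) keeps the script from changing behavior if the default version is updated.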
Long-Running Jobs
For long-running jobs, it’s important to avoid losing progress if your connection drops. You have two main options:
Use FastX, a graphical remote desktop environment that keeps your session running on the server even if you disconnect.
Use tmux, a terminal-based tool that lets you start a session, run your job, and safely disconnect. You can reconnect later and pick up right where you left off. Here is detailed information for using tmux on KLC.
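The start, disconnect, and reconnect cycle with tmux can be sketched as follows. This is a minimal illustration using standard tmux subcommands; the function name start_job_session, the session name myjob, and the script long_analysis.py are hypothetical.

```shell
# Start a detached tmux session that runs a command.
# $1 = session name, $2 = command line to run inside the session
start_job_session() {
    tmux new-session -d -s "$1" "$2"
}

# Example (script name is a placeholder):
# start_job_session myjob "python long_analysis.py > run.log 2>&1"
# tmux ls                  # list running sessions
# tmux attach -t myjob     # reconnect; detach again with Ctrl-b, then d
```

A session started this way ends when its command exits, so redirecting output to a log file, as in the example, preserves the results for later inspection.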