Setting up an LDAP system in an AWS cluster
Lightweight Directory Access Protocol (LDAP) is a protocol used for accessing and managing directory services over a network. It is commonly used for centralised authentication and authorisation in enterprise environments. By integrating LDAP with a PBS cluster, user authentication and management can be centralised, making it easier to handle user accounts and permissions across multiple nodes in the cluster. In the setup, LDAP has two types of nodes: LDAP server - Head node (

Joseph
2 days ago · 7 min read
Setting up a PBS scheduler in an AWS cluster
This is the fifth part of an eight-part series on how to set up an HPC cluster on AWS. This document explains how to set up an OpenPBS job scheduler in an AWS cluster. OpenPBS (Portable Batch System) is an open-source workload management and job scheduling system designed for HPC environments. It efficiently allocates compute resources across clusters by queuing, scheduling, and monitoring batch jobs submitted by users. OpenPBS and PBS will be used interchangeably in this doc

Joseph
2 days ago · 4 min read
Setting up a BeeGFS file system in an AWS cluster
This is the fourth part of an eight-part series on how to set up an HPC cluster on AWS. This document explains how to set up a BeeGFS file system in the AWS cluster. BeeGFS is a high-performance parallel file system designed for scalable storage in HPC and other data-intensive environments. It stripes data across multiple storage servers, enabling fast, concurrent access for many clients, which significantly boosts I/O performance. The first crucial step in installing BeeGFS

Joseph
2 days ago · 7 min read
Enabling passwordless SSH in an AWS cluster
This is the third part of an eight-part series on how to set up an HPC cluster on AWS. This document explains how to set up passwordless SSH across all the nodes in the cluster. Passwordless SSH is required in HPC clusters to allow nodes to communicate and execute tasks automatically, such as job scheduling, data transfers, and parallel computations, without repeated password prompts, ensuring efficient and seamless operation. Create the SSH keys: On each node, ensure the ~/

Joseph
2 days ago · 6 min read
Initial package installation in AWS instances
This is the second part of an eight-part series on how to set up an HPC cluster on AWS. This part explains the initial setup to install the packages we will need for the different components of the HPC cluster. The first thing we have to do is disable Security-Enhanced Linux (SELinux). It’s a security module built into the Linux kernel that provides mandatory access control (MAC), which is stricter than the discretionary access control (DAC) that standard Linux uses.

Joseph
2 days ago · 3 min read
Virtual Machine Setup for an HPC Cluster in AWS
This is the first part of an eight-part series on how to set up an HPC cluster on AWS. This document describes the components of the launched VMs and outlines the key design elements of the cluster. The cluster will have seven virtual machines (VMs): one head/control node, one login node, three compute nodes, and two storage nodes. All the VMs will run Rocky Linux 9.6 (Blue Onyx). AMI: ami-0f2425d4cce4e97dd. Instance type: t3.2xlarge. When the VMs are created, an SSH key, terrafor

Joseph
2 days ago · 5 min read
HPC cluster on AWS
This is an eight-part series on how to set up an HPC cluster on AWS. The main design elements of the cluster are as follows. The cluster will have seven virtual machines (VMs): one head/control node, one login node, three compute nodes, and two storage nodes. All the VMs will run Rocky Linux 9.6 (Blue Onyx). AMI: ami-0f2425d4cce4e97dd. Instance type: t3.2xlarge. When the VMs are created, an SSH key, terraform-user, is already added. This makes it easy to log in to the VM from the l

Joseph
2 days ago · 1 min read


Using LLaMa with VSCode
Download and install Ollama. There are multiple LLMs available for Ollama. In this case, we will be using Codellama, which can use text...

Joseph
Jul 6, 2024 · 1 min read


Creating a Singularity Container for a Linux Machine with GPU Support on an Apple Mac with Apple Silicon
Some HPC machines today use Singularity containers for their machine learning workflows. Once configured, Singularity containers can be...

Joseph
Jun 7, 2024 · 4 min read


Distributed-Dask with PBS
Dask is a popular Python library designed for scalable computing with dynamic task scheduling. A key strength of Dask lies in its...

Joseph
Aug 30, 2023 · 3 min read
Using Vim and Ctags to Manage Large Projects
The usual workflow in developing an HPC application is to develop the code on local machines and then run the completed application in an...

Joseph
Nov 9, 2022 · 2 min read
Automate Workflow Using VSCode
VSCode is a very popular tool for managing large projects in C/C++. One of the main advantages of VSCode is that we can automate workflows that...

Joseph
Jul 29, 2022 · 2 min read


Debugging MPI Programs Using Valgrind and GDB
Debugging a parallel program is not as straightforward as debugging a sequential program because it involves multiple processes with...

Joseph
Sep 25, 2020 · 4 min read