SGE 6.1 Interactive Queue on t3ce01



It is useful to introduce an interactive queue in the t3ce01 SGE configuration for two main purposes:

  • to allow users to develop software using the computational power of the WNs.
  • to inspect the /scratch directory hosted on each WN during and after a job execution.

To achieve this we need to modify the configuration on both the CE and the WN side and exploit the SGE-SSH integration. In response to an interactive queue request, the least loaded WN is selected and an sshd daemon is started there on a TCP port other than 22; the CE then opens an SSH connection to that (WN, TCP port) pair. First, prepare an executable script called qlogin.sh:

[root@t3ce01 n1ge6]# cat /swshare/sge/n1ge6/bin/lx24-amd64/qlogin.sh
#!/bin/sh
HOST=$1
PORT=$2
/usr/bin/ssh -XY -p "$PORT" "$HOST"
[root@t3ce01 n1ge6]#
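
A slightly more defensive variant of the wrapper could validate the port before calling ssh. The sketch below is not the script deployed on t3ce01; the helper name `valid_port` and the error message are assumptions added for illustration. It relies only on the calling convention visible in the script above (host as first argument, port as second).

```shell
#!/bin/sh
# Hypothetical hardened variant of qlogin.sh (not the deployed script).

# Succeed only if $1 is a decimal number between 1 and 65535.
valid_port() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

# Connect only when called with exactly two arguments: HOST PORT
if [ "$#" -eq 2 ]; then
    HOST=$1
    PORT=$2
    if ! valid_port "$PORT"; then
        echo "qlogin.sh: invalid port: $PORT" >&2
        exit 1
    fi
    # Same ssh options as the original wrapper.
    exec /usr/bin/ssh -XY -p "$PORT" "$HOST"
fi
```

Quoting "$PORT" and "$HOST" keeps an unexpected value from being split into extra ssh arguments.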

Interactive queue

Make sure that your interactive queue is really declared INTERACTIVE, as shown here:
[root@t3ce01 n1ge6]# qconf -sq interactive
qname                 interactive
...
qtype                 INTERACTIVE
...
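
The same check can be scripted. This is a minimal sketch, assuming the `qconf -sq` output has the layout shown above; the function name `queue_is_interactive` is hypothetical.

```shell
#!/bin/sh
# Read "qconf -sq <queue>" output on stdin and succeed only if the
# qtype line lists INTERACTIVE (possibly next to other types).
queue_is_interactive() {
    awk '$1 == "qtype" {
             for (i = 2; i <= NF; i++)
                 if ($i == "INTERACTIVE") found = 1
         }
         END { exit !found }'
}

# Usage on the CE (needs a working SGE environment):
#   qconf -sq interactive | queue_is_interactive && echo "queue is interactive"
```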

qconf tuning

Modify the global SGE configuration (for example with qconf -mconf) so that it contains the two qlogin lines, qlogin_command and qlogin_daemon, as reported below:
[root@t3ce01 n1ge6]# qconf -sconf 
global:
execd_spool_dir              /var/spool/sge
mailer                       /bin/mail
xterm                        /usr/bin/X11/xterm
load_sensor                  none
prolog                       none
epilog                       none
shell_start_mode             posix_compliant
login_shells                 sh,ksh,csh,tcsh
min_uid                      0
min_gid                      0
user_lists                   none
xuser_lists                  none
projects                     none
xprojects                    none
enforce_project              false
enforce_user                 auto
load_report_time             00:00:40
max_unheard                  00:05:00
reschedule_unknown           00:00:00
loglevel                     log_warning
administrator_mail           none
set_token_cmd                none
pag_cmd                      none
token_extend_time            none
shepherd_cmd                 none
qmaster_params               none
execd_params                 none
reporting_params             accounting=true reporting=true \
                             flush_time=00:00:15 joblog=true sharelog=00:00:00
finished_jobs                100
gid_range                    50700-50800
qlogin_command               /swshare/sge/n1ge6/bin/lx24-amd64/qlogin.sh
qlogin_daemon                /usr/sbin/sshd -f /etc/ssh/sshd_config_sge -i
rlogin_daemon                /usr/sbin/in.rlogind
max_aj_instances             2000
max_aj_tasks                 75000
max_u_jobs                   0
max_jobs                     0
auto_user_oticket            0
auto_user_fshare             100
auto_user_default_project    none
auto_user_delete_time        86400
delegated_file_staging       false
reprioritize                 0

[root@t3ce01 n1ge6]#
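
To check only the two qlogin settings instead of reading the whole listing, a small filter can help. This is a sketch; the function name `sge_conf_value` is hypothetical, and it does not handle values continued over several lines with a backslash (such as reporting_params above).

```shell
#!/bin/sh
# Print the value of one parameter from "qconf -sconf" output read on stdin.
# The parameter name and its value are separated by whitespace.
sge_conf_value() {
    awk -v key="$1" '$1 == key {
        sub(/^[^ \t]+[ \t]+/, "")   # strip the name and the padding after it
        print
        exit
    }'
}

# Usage on the CE (needs a working SGE environment):
#   qconf -sconf | sge_conf_value qlogin_command
#   qconf -sconf | sge_conf_value qlogin_daemon
```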

Specific sshd_config for SGE - WN side

Instead of using the global file /etc/ssh/sshd_config, it is worth using a separate file that specifies a different syslog facility, so that administrative SSH logins on TCP port 22 can be distinguished from SGE SSH logins. Here we selected LOCAL5:
[root@t3wn19 ~]# cat /etc/ssh/sshd_config_sge 
# /etc/ssh/sshd_config
# The default configuration provided by the ssh module.
Protocol 2
SyslogFacility LOCAL5
PasswordAuthentication yes
ChallengeResponseAuthentication no
GSSAPIAuthentication yes
GSSAPICleanupCredentials yes
UsePAM yes
X11Forwarding yes
Subsystem       sftp    /usr/libexec/openssh/sftp-server
[root@t3wn19 ~]#
Then configure syslogd to send LOCAL5 messages to a dedicated file such as /var/log/secure-sge.
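
One way to add the syslogd rule is sketched below. It assumes a classic syslogd that reads /etc/syslog.conf and an init script called syslog; both are assumptions about the WN setup, and the function name `add_local5_rule` is hypothetical.

```shell
#!/bin/sh
# Append a rule sending facility local5 to a log file, unless the
# configuration file already contains a local5 rule.
# Assumed defaults: /etc/syslog.conf and /var/log/secure-sge.
add_local5_rule() {
    conf=${1:-/etc/syslog.conf}
    log=${2:-/var/log/secure-sge}
    if grep -q '^local5\.' "$conf" 2>/dev/null; then
        return 0    # rule already present, nothing to do
    fi
    printf 'local5.*\t%s\n' "$log" >> "$conf"
}

# Usage on each WN, as root, followed by a syslogd restart:
#   add_local5_rule
#   /sbin/service syslog restart   # assumed init script name
```

Running the function twice leaves a single rule in the file, so it is safe to call from a configuration management tool.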

qlogin session

Now try a qlogin session:
[martinelli_f@t3ce01 ~]$ qlogin -q interactive
local configuration t3ce01.psi.ch not defined - using global configuration
Your job 684534 ("QLOGIN") has been submitted
waiting for interactive job to be scheduled ...
Your interactive job 684534 has been successfully scheduled.
Establishing /swshare/sge/n1ge6/bin/lx24-amd64/qlogin.sh session to host t3wn19.psi.ch ...
martinelli_f@t3wn19.psi.ch's password:
On the WN side you can see the sshd daemon started by SGE:
[root@t3wn19 ~]# ps fax | grep -A 2 -B 2 sshd
...
16238 ?        S      0:00  \_ sge_shepherd-684534 -bg
16239 ?        Ss     0:00      \_ sshd: martinelli_f [priv]                    
16240 ?        S      0:00          \_ sshd: martinelli_f [net] 
Note that this sshd process is not listening on any TCP port, which improves the WN security.
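
This can be cross-checked from the list of listening sockets. The sketch below assumes the usual column layout of `netstat -tlnp` on Linux; the function name `sshd_listening_ports` is hypothetical. Because the qlogin sshd is started with -i (inetd mode), one would expect only the administrative sshd on port 22 to show up.

```shell
#!/bin/sh
# Read "netstat -tlnp" output on stdin and print the TCP ports on which
# a process named sshd is listening, one per line, without duplicates.
sshd_listening_ports() {
    awk '$6 == "LISTEN" && $7 ~ /\/sshd$/ {
             n = split($4, part, ":")   # port is the last ":"-separated field
             print part[n]
         }' | sort -un
}

# Usage on a WN, as root (so that netstat can show process names):
#   netstat -tlnp | sshd_listening_ports
```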

Firewall on the WN + Hostbased Authentication ???

Users are thus allowed to log in to the WN by SSH through SGE, but there is also an sshd process listening on port 22, so a user could log in outside SGE control.

To prevent that, we can set up iptables on the WN ( or hosts.deny ?? ) to accept a SYN on TCP port 22 only from well-known hosts such as t3admin01 and the administrators' laptops.
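
A possible shape for the iptables rules is sketched below as a dry run: the function only prints the commands so they can be reviewed before being applied. The function name `print_ssh_firewall_rules` is hypothetical, and 192.0.2.10 is a documentation address standing in for an administrator laptop. The rules are appended with -A, so any earlier rule in the INPUT chain that already accepts port 22 would still take precedence.

```shell
#!/bin/sh
# Print (do not run) iptables commands that accept new SSH connections
# on TCP port 22 only from the given hosts and drop every other SYN.
print_ssh_firewall_rules() {
    for src in "$@"; do
        printf 'iptables -A INPUT -p tcp --dport 22 --syn -s %s -j ACCEPT\n' "$src"
    done
    printf 'iptables -A INPUT -p tcp --dport 22 --syn -j DROP\n'
}

# Usage: review the output first, then pipe it to sh as root on the WN.
#   print_ssh_firewall_rules t3admin01 192.0.2.10
#   print_ssh_firewall_rules t3admin01 192.0.2.10 | sh
```

Only port 22 is filtered, so the qlogin connections, which use a different TCP port, are not affected by these rules.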

Also, typing another password during a qlogin session annoys more than one user; we can set up SSH Hostbased Authentication UI => WN that is honoured only during the qlogin request.
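
The OpenSSH settings involved in host-based authentication are sketched below. This has not been tested on the t3 cluster; the function name `hostbased_fragment` is hypothetical. Putting the server-side line into /etc/ssh/sshd_config_sge rather than /etc/ssh/sshd_config is what would restrict it to qlogin sessions.

```shell
#!/bin/sh
# Print the configuration lines for SSH host-based authentication.
#   wn: line to add to /etc/ssh/sshd_config_sge on each WN
#   ui: lines to add to /etc/ssh/ssh_config on the host running qlogin
hostbased_fragment() {
    case $1 in
        wn)
            printf '%s\n' 'HostbasedAuthentication yes'
            ;;
        ui)
            printf '%s\n' 'HostbasedAuthentication yes' 'EnableSSHKeysign yes'
            ;;
        *)
            echo "usage: hostbased_fragment wn|ui" >&2
            return 1
            ;;
    esac
}

# In addition, each WN must trust the host running qlogin:
#   - its host name listed in /etc/ssh/shosts.equiv
#   - its public host key listed in /etc/ssh/ssh_known_hosts
# Usage:
#   hostbased_fragment wn
#   hostbased_fragment ui
```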

-- FabioMartinelli - 2011-03-18

Topic revision: r1 - 2011-03-18 - FabioMartinelli