
Requirements for new Configuration Management System

Reasons for changing (or rethinking our setup)

  • Our CFEngine installation is getting old. A new version is out that is not compatible with the old one, and the version we run no longer receives bug fixes or new features.
  • We have started losing control of what is inside. The complexity is too high, and it acts like another sysadmin doing things we don't know about. Every change we make takes more time and is riskier. It's a big unknown monster we have learned to play with.
  • It needs a cleanup. There are things inside that are not used anymore. There are also things that started out very simple but have grown too complex for that simplicity and need to be rewritten. In particular, the currently defined classes are not consistent with each other (e.g. separate arc01/arc02 classes, but a common LCG_CE).
  • We need to use the system of classes more intelligently, perhaps grouping them into meta-groups for clarity, taking into account a pre-production system, a cluster split, and hardware classes (a common use case across multiple project phases). Rewriting the classes would require an almost complete cfengine rewrite anyway.
  • Currently there is a separation between hardware and software configuration: cfengine manages software and the newmachine script manages hardware. This is undesirable, and it is not even done very well. We need a system where everything is in its place; the newmachine script should bring a system to a state able to run cfengine over the network, and ONLY THAT.
  • We want a mechanism that allows repeatable installations (bit-for-bit if possible). This calls for local repository mirrors for OS & gLite, STABLE & UNSTABLE.

Before choosing the right CM, we need to define what we need from it in order to select some candidates.

Classes redefinition

Here is a proposal for how the classes should look.
  • Each machine would have its own custom, individual features, like MACs, IPs... (defined within the CM).
  • It should include one (and just one) class from each group, from all groups (this would also allow for better "dsh groups" with common names); see the sketch after the group lists below.
  • Each group has disjoint class members.

Machine type

  • CE (LRMS, LCG, CREAM, ARC, ARGUS)
  • SE (CORE, POOL)
  • NFS
  • LUSTRE (MDS, OSS)
  • WN
  • XENHOST
  • BDII (TOP, SITE)

Cluster

  • PROD1, PROD2 *[1]
  • PPS1
  • PPS2
(note [1]: when we make a change, we may want to make it in the 'pre' tree first, and develop a script to commit/merge the changes into the 'prod' tree when we consider the change finished)

Hardware

  • SUN_X4170
  • IBM_1234
  • XEN

Container (optional?)

  • Rack3
  • Xen01
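
As an illustration of the "one class from each group" rule, here is a minimal sketch in Puppet syntax (the tool tried later on this page); the node name and class names are purely hypothetical.

  # Hypothetical sketch: a node includes exactly one class from each group
  # (machine type, cluster, hardware, container). All names are illustrative.
  node 'wn042.example.org' {
    class { 'type::wn': }           # machine type
    class { 'cluster::prod1': }     # cluster
    class { 'hw::sun_x4170': }      # hardware
    class { 'container::rack3': }   # container (optional)
  }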

Capabilities

Classes don't do all the work; in the end the CM has to configure real services. The idea is that we should not have to maintain the same code/config in two different places: if WNs and CEs both have to mount Lustre, we don't write the piece of code that configures a Lustre client twice. Also, to configure Lustre (to continue with this example) the code has to be class-aware to configure it the right way, for example:
  • Lustre is different on Xen guests (runs over TCP)
  • There can be more than one Lustre instance, for example pre-production and production, and if we split the MDSs/MDTs there would be two production clusters.
  • Lustre is different for clients and servers, so maybe there should be two different capabilities (or config items, whatever we call them): lustre-server and lustre-client, which are independent of each other (see the sketch after this list).
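
To make this concrete, here is a minimal sketch of such a capability as a parametrized Puppet class; the module name, parameters and the server address format are assumptions, not the final design.

  # Hypothetical sketch: one lustre-client capability, reused by WNs and CEs,
  # parametrized by instance (prod/pps) and network type (IB, or TCP on Xen guests).
  class lustre::client ($instance = 'prod', $network = 'o2ib') {
    package { 'lustre-client':
      ensure => installed,
    }
    mount { "/lustre/${instance}":
      ensure  => mounted,
      fstype  => 'lustre',
      device  => "mds-${instance}@${network}:/${instance}",   # placeholder server address
      require => Package['lustre-client'],
    }
  }

  # A Xen guest would declare the same class over TCP:
  #   class { 'lustre::client': instance => 'prod', network => 'tcp' }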

We should create a list of configuration items, or capabilities, that we may need to configure. We should list all possible things, so that we can choose a tool that can handle all kinds of situations:

NETWORK

  • IP(s), hostname(s), subnet(s), hosts, routes, ...

SECURITY

  • known_hosts, authorized_keys, sshd_config, iptables, ...

FILESYSTEMS

  • nfs exports, nfs mounts, xfs format, ext3 format, lvm, fstab, lustre-client, lustre-server, ...

VIRTUALIZATION

  • xen_host_filesystem, xen_guest_config, ...

DCACHE

  • dcache_headnode, make_pool, ...

... ETCETERA!

Requirements inside a machine setup

Installation

  • Ethernet MAC assignment (in case of VM)
  • Hostname / public IP
  • 10.10 IP assignment
  • DHCP
  • PXE boot
  • Kickstart file (with disk drivers for OS, and initial partition table)
  • Installation and setup of Cluster Configuration Management system (CCM)
Up to this point the machine is installed with the 10.10 network only, with no internet access (maybe only to download software, maybe not)

Either Installation or CCM

The following things can be done either from the bootstrap or later, preferably by the CCM.
  • Drivers for IB card
  • Assignment of Public IP
  • Security (basic firewall, ssh keys)
  • Partition / data

CCM only

  • Repositories
  • Packages
  • Service configuration
  • Yaim
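
A rough sketch of the CCM-only part in Puppet: a local repository mirror, a package pinned to a fixed version for reproducibility, and a service. The mirror URL and the package version are placeholders; Yaim would presumably be wrapped in its own module.

  # Hypothetical sketch: repository, pinned package version and service,
  # all managed by the CCM. URL and version are placeholders.
  yumrepo { 'cscs-stable':
    baseurl  => 'http://install.example.org/mirror/sl6/stable',   # placeholder mirror
    enabled  => 1,
    gpgcheck => 0,
  }

  package { 'ntp':
    ensure  => '4.2.4p8-3.el6',          # fixed version (placeholder) for reproducible installs
    require => Yumrepo['cscs-stable'],
  }

  service { 'ntpd':
    ensure  => running,
    enable  => true,
    require => Package['ntp'],
  }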

Testing Puppet

From the different good alternatives that are not image-based (Puppet, CFEngine 3, Quattor, Chef) we have decided to try Puppet, because it seems to fulfill the requirements (except for the installation, which will be done with kickstart), its learning curve is reasonable, and it has the biggest community around it.

The idea is to deploy a worker node fully controlled by Puppet that can be used in production. After ensuring this works well, we could then deploy the rest of the worker nodes with it, and continue with the rest of the systems when the time comes.

Preparation of a Worker Node

Here are the steps that need to be followed to achieve this. The idea is to have this finished by the end of September 2012.

The list is just a reference; it will surely grow as we start working on it.

This work will be first done inside a pre-production environment, with virtual machines. Then we need to move it to real hardware.

Prepare installation server (to be finished by 12th of March)

  • To host everything needed to kickstart a new basic machine DONE
  • With a copy of all repositories needed
  • With puppet running DONE
  • Dashboard running in VM (with sl6) DONE
  • Puppetize the KS file generation for each machine (see the sketch after this list)
  • Allow the puppet master to run against a user's local working copy, and clients to fetch it. DONE
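
To puppetize the KS file generation (item above), a possible sketch: the puppet master builds one kickstart file per machine from an ERB template and per-node data. The define name, paths and parameters are assumptions.

  # Hypothetical sketch: one kickstart file per machine, generated from a template.
  define installserver::kickstart ($mac, $ip, $disk_layout = 'default') {
    file { "/srv/install/ks/${name}.cfg":
      ensure  => file,
      owner   => 'root',
      group   => 'root',
      mode    => '0644',
      content => template('installserver/kickstart.cfg.erb'),
    }
  }

  # Usage, one resource per machine to be installed:
  #   installserver::kickstart { 'wn042.example.org':
  #     mac => '00:16:3e:aa:bb:cc',
  #     ip  => '10.10.0.42',
  #   }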

Prepare node with HW/OS layer (to be finished by 15th of April - before Phoenix moves)

  • Puppet client (may not be the same as with KS) DONE
  • Bring up public network (VM: ethernet only) keeping the 10.10 active. DONE
  • resolv.conf DONE
  • /etc/hosts
  • Time sync with public ntp servers DONE
  • SSH setup
    • Unique/shared known-hosts file
    • password-less configuration (and possibly hostbased, for WNs only?) DONE
    • fully-controlled authorized-keys (removing the config removes the key from the file; see the sketch after this list) DONE
  • Iptables (with the ability to add rules from other components) DONE
  • Mail agent DONE
  • Logrotate DONE
  • Syslog DONE
  • Repository setup and installation of packages (with fixed/reproducible rpm versions) DONE
  • Implement a system to replicate RPM sets across different machines (ideas: snapshotted repo, rpm_clone, or puppet RPM lists)
  • Shared scripts in /opt/cscs
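
For the fully-controlled authorized-keys item above, a minimal sketch of the usual Puppet pattern: declared keys are managed individually and anything not declared is purged. Key names and values are placeholders.

  # Hypothetical sketch: every allowed key is declared, and anything Puppet
  # does not know about is removed from authorized_keys.
  ssh_authorized_key { 'admin@mgmt':
    ensure => present,
    user   => 'root',
    type   => 'ssh-rsa',
    key    => 'AAAAB3Nza...placeholder...',
  }

  # Purge any ssh_authorized_key not declared in the manifests:
  resources { 'ssh_authorized_key':
    purge => true,
  }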

Prepare the upper layers: mount points, monitoring, middleware (to be finished by 1st of August)

  • Ganglia
  • Nagios
  • NFS mounts (see the sketch after this list)
  • GPFS client
  • Pbs_mom
  • Yaim
  • Glexec
  • Sudoers
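
For the NFS mounts item above, a minimal sketch with the built-in mount type; server, export and mount point are placeholders.

  # Hypothetical sketch: an NFS mount fully described in Puppet.
  file { '/experiment-software':
    ensure => directory,
  }

  mount { '/experiment-software':
    ensure  => mounted,
    device  => 'nfs01.example.org:/export/software',   # placeholder server/export
    fstype  => 'nfs',
    options => 'ro,hard,intr',
    atboot  => true,
    require => File['/experiment-software'],
  }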

Prepare physical machine (to be finished by 15th of September)

  • Disk Partitioning (this complicates the installation process)
  • Hardware and software RAID
  • Infiniband
  • Tuning sysctl.conf (see the sketch after this list)
  • Benchmarking (optional?)
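
For the sysctl.conf tuning item above, a minimal sketch: the whole file is managed from a template and re-applied when it changes. The template path is an assumption.

  # Hypothetical sketch: manage /etc/sysctl.conf and re-apply it on change.
  file { '/etc/sysctl.conf':
    ensure  => file,
    content => template('tuning/sysctl.conf.erb'),
    notify  => Exec['apply-sysctl'],
  }

  exec { 'apply-sysctl':
    command     => '/sbin/sysctl -p',
    refreshonly => true,
  }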

Node Classification

I have found that the node inheritance mechanism in Puppet is not really flexible and can lead to errors, because the way it works is not intuitive (you think a variable, a class or a tag is being declared, but that only holds for some parts of the manifests). It turns out that Puppet has almost ten different ways to deal with this (http://puppetlabs.com/blog/the-problem-with-separating-data-from-puppet-code/), but I found it can be narrowed down to three:
  • Use the built-in Puppet capabilities, with node inheritance, using parametrized classes when needed. This may be suitable for small or simple setups, but it presents the problems described above.
  • Use an External Node Classifier (ENC), possibly the one included in the Puppet Dashboard. This has some strong points, since it is easy to use from the Dashboard, but it is not yet ready for arrays and hashes (http://groups.google.com/group/puppet-users/browse_thread/thread/c0ec096d7daabbaf), and it depends on a database, which complicates working with stages (users running a copy of the classifier, modifying it, testing it, and merging it with the production one). The Foreman has another classifier of this kind (very similar to the Puppet Dashboard), and you could also build one yourself: http://docs.puppetlabs.com/guides/external_nodes.html
  • Use Hiera as an ENC (see the sketch below):
    • http://www.devco.net/archives/2011/06/11/puppet_backend_for_hiera_part_2.php
    • http://puppetlabs.com/blog/first-look-installing-and-using-hiera/
    • https://github.com/puppetlabs/hiera-puppet/tree/master/example
    • http://www.mail-archive.com/puppet-users@googlegroups.com/msg28134.html
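
A minimal sketch of the Hiera approach, assuming the hiera_include function from hiera-puppet and a per-node YAML data file; paths and class names are hypothetical.

  # site.pp stays minimal: the classes applied to a node come from Hiera data,
  # which can live in a version-controlled tree that users can branch and merge.
  node default {
    hiera_include('classes')
  }

  # A per-node data file in the Hiera datadir (e.g. wn042.example.org.yaml)
  # would then list the classes, for instance:
  #   classes:
  #     - type::wn
  #     - cluster::prod1
  #     - lustre::client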

References

Puppet function/configuration/type references:

Examples from other sites:

For production, check:

For making puppet remove the configuration when you remove a class, check this out (it's called the Truth Enforcer): https://github.com/jordansissel/puppet-examples/tree/master/nodeless-puppet

-- PabloFernandez - 2010-07-23
