Ray Butterworth — 2013 August 13 and August 21
Robyn Landers — Revised September 12
At MFCF's June 17 Management meeting, Wayne asked me to investigate getting rid of Xhier. The initial investigation showed that the functionality that Xhier provides is very pervasive throughout the various services that MFCF provides. If we wished to continue providing the same level of service, eliminating Xhier would be very expensive. A followup report on August 21 investigated the services we currently provide, many of which exist simply for historical reasons, having no significant client demand. Eliminating them would greatly reduce the impact of no longer being able to rely upon Xhier.
What follows is a combination of those two reports.
Does Xhier
refer to the various infrastructure tools
currently associated with that name,
or to the software that currently makes use of it,
or to the philosophy behind its design,
or to all of the above?
People have varying ideas as to what the term Xhier
refers to,
and conflation is a constant problem,
so it could be any or all of these.
It would help if we were all familiar with the Xhier philosophy.
If you haven't read it recently, please read
Xhier Philosophy and Goals (Theory),
https://www.math.uwaterloo.ca/mfcf/info/overviews/xhier-philosophy,
and
Xhier Overview (Practice),
https://www.math.uwaterloo.ca/mfcf/info/xhier.
Briefly, xhier is a comprehensive suite of tools providing hierarchical system configuration management and software packaging and distribution in a uniform way across a heterogeneous environment. Although its rich feature set was important to us in the past (which is why it was designed that way), it may be the case that some functionality is no longer needed.
There may be popular open-source alternatives to cover most, or perhaps all, of what we need.
This report reviews what services xhier provides, explains the reasons for those services, suggests how we might revise those services, and points out some problems. It also itemizes some features we still require. Appendices address what others on campus are doing, and provide a few initial comparisons to popular alternatives.
Some of the most common services provided by Xhier are:
Header files (e.g. <mfcf/libc/xxx.h>) that allow code to be written independently of the architecture on which it is to be built and run.
xhier regions, where several machines (often of different architecture) share a common set of home directories and user accounts (e.g. the student.math, general.math, and iqc.math regions).
The need for this service is created by our policy of making multiple architectures appear as similar as possible to our users.
The need for this service is created by our policy of allowing people to use a machine as if we hadn't made any changes to it. (It also reduces the work required to install system patches and perform OS upgrades.)
The need for this service is created by our policy of allowing users to have access to different versions of the same package on individual machines.
The need for this service is created by our policy of providing the same software on multiple architectures.
The need for this service is created by our policy of making administration of multiple software packages as easy to learn and manage as possible.
The need for this service is created by our policy of providing a hierarchical administration structure.
The need for this service is created by our policy of providing software on multiple hosts of the same architecture.
The need for this service is not policy, but a matter of staff wanting to do things in the most productive way possible. Maintaining ten machines should not require ten times as much work as maintaining one. Similarly, when there was a large enough number of people (critical mass) working with Xhier, all campus groups involved with it benefited from the work done by other groups.
Most, or perhaps all, of the above services are essential in order to fulfil past and present MFCF policies. They are not necessarily essential in order to fulfil our mandate, and in many cases are not essential in meeting the current needs of our clients. Even in cases where those needs exist, they might be met by other means.
Xhier was developed to meet a real need. At one time we had nearly 20 different architectures in our student labs and research environment. Users would have become very confused had they met with significantly different behaviour depending upon which terminal they happened to have sat down to use.
We now support only 4 major architectures (Mac, Windows, Solaris, Linux), and only two of them are similar enough that anyone might be confused by their minor differences. (This raises the issue of whether there is still any need to support both Solaris and Linux.)
Virtual machines are effectively free, easy to generate, and have a more pristine environment than what we are currently providing.
Again, without the policy of providing one large seemingly homogeneous environment, the need for providing multiple versions of software on the same machine goes away. Where there is a need for a specific old version, a virtual machine for instance could be set up specifically for that purpose.
(On the other hand, a more heterogeneous environment will
conflict with our more recent policy of presenting a simplified
view to our clients (e.g. linux.math rather than having to know
about specific cpu### names).)
And again, if we do not have a policy of providing a uniform environment, the need for building the same software on all architectures becomes much less. The amount of software that we require in multiple places might actually be quite small.
This is still a good goal, but given the large number of non-local packages that we now provide and support, even achieving perfection among the xhiered packages would not have a great overall effect.
Having IQC able to set their own software policies and configuration, within those set by MFCF, within those set by IST, has proven to be useful. But more and more, groups are setting up smaller self-contained computing that doesn't need the generic environment we provide.
Centralized configuration for the machines MFCF maintains is still a good and necessary goal, but without a policy that requires hierarchical administration, there are tools other than Xhier that provide this service.
While in some ways better than, and in some ways not as good as, Xhier, these same centralized configuration tools (e.g. Puppet) can provide, or greatly help with, the distribution of software.
Moving to a more heterogeneous environment will make the automation and scaling of some tasks more difficult.
Eventually, it's possible that we might be in a position to develop modules for some open-source packages that might benefit a community much larger than the current Xhier world.
If MFCF were to make large changes to many of our long-time policies regarding what software services we provide, as described above, most of the need for Xhier would go away.
We would support only one major version each of Windows, MacOS, and Linux. The first two are outside the scope of this report, but all Linux machines would be administered from one central place by one single authority (presumably MFCF).
Xhier hasn't outlived its usefulness; it has outlived the people who know how to use it.
Replacing all aspects of Xhier, while retaining a similar level of service to maintainers and clients, would require considerable work:
Updating shell startup files (e.g. .cshrc and .profile)
to know about the new locations of software.
We are aware of some external packages (e.g. Puppet, Bcfg2)
that can provide a part of the above
(configuration and to some extent distribution),
but no existing package or collection of packages
that can provide it all.
(E.g. see Building Source Code
in the Appendix.)
A large part of this task would effectively be
the reinvention of Xhier under a different name.
Furthermore, unless we can convince CSCF, IST, and other campus groups to follow our lead, we would lose support for the many xhiered packages that they currently maintain. (The xhiered packages currently on our Solaris systems, by maintainer: 80 MFCF, 50 CSCF, 40 IST, 50 other.)
A reasonable estimate of the time required to fully provide the current level of service to package maintainers and clients would be measured in man-years.
The benefit of such an undertaking would be nowhere near the cost. The effort would be much better spent teaching staff how to use Xhier, and perhaps even how to improve it. But this late in the game, that too would be a very expensive proposition.
We don't currently know what software our clients need that is dependent on Solaris and not provided by Linux. Once we have eliminated mail services, in particular IMAP, we should be able to get a much better view of how our student and research regions are actually being used by our clients. The usage statistics for October or November should give us a good sample to work with.
We do know that a significant number of programs and software packages currently relying on Xhier for compilation, configuration, distribution, etc. are used by MFCF itself.
Some of those are part of our centralized administration and accounting systems, and so really need to exist on only one host. But others are required on all our client hosts. In either case, the current source base will not build or install without the tools and environment provided by Xhier.
We might be able to reduce the amount of such software by eliminating or reducing the need for the services that they provide. This will require considerable study and further policy changes. Even then, for what remains as essential we will need to make large changes to our source and our procedures.
A survey of MFCF staff indicates that (at least) the following software packages, some currently highly dependent on Xhier, are needed.
Package | Purpose | Future |
---|---|---|
accounts_client | Manages user resources. | ??? |
accounts-master | Enables us to know safe-to-remove resource allocations. | Being partially reviewed/replaced by Wayne et al. |
batch | Serializes execution of background tasks. | ??? |
gnomeconfig | Simple configuration for the Gnome window manager. | ??? |
gnu | Provides modern versions to cover old Solaris software. | Not needed if Solaris goes away. |
hostselect | Select the 'best' host based on configured criteria. | ??? |
linux-extras | OS-specific package for Linux. | ??? |
lpr and friends | Print submission, access control, accounting, and printing. | ??? |
mfcf-accounting | Collects usage data. | Being reviewed/replaced by Wayne et al. |
mfcf-accounting-master | Analyzes data, produces bills and reports. | Being reviewed/replaced by Wayne et al. |
mathcpuserver | Configuration overseer package for CPU server machines in the Math region. | ??? |
mysql | Database and interface. | Available on Linux. |
nagios-* | System monitoring. | ??? |
nmap | Network exploration tool and security/port scanner. | Available on Linux. |
ntp-config | A "universal" xntp3 configuration package. | ??? |
perl | For various Perl scripts. | Available on Linux. |
printcap | Maintains and distributes printcap data. | ??? |
rcs | Revision control system. | ??? |
rt-math | Interface to MFCF and CSCF request tracking. | ??? |
security | Controls root access, reports unusual files, etc. | ??? |
sunos5 | SunOS5.* OS specific package. | Not needed if Solaris goes away. |
setpw | Updating and distributing passwd and group files. | ??? |
titrax | For tracking time usage. | ??? |
uid-registry | Central database server for UID and GID values. | ??? |
wwwdata_* | www.math and other web data and configuration. | ??? |
x11-mfcfenv | Provides basic TWM environment for X11 with connection to dtlogin in Solaris 10 CDE. | ??? |
In addition, a number of specific programs were requested.
Program | Purpose | Future |
---|---|---|
acroread | For displaying PDF documents. | Is available on Linux. |
align | For quick formatting of data. | ??? |
decomment | For extracting real data from config files etc. | ??? |
dump_inventory | Formatted dump of selected inventory fields and records. | ??? |
ffind | Cached database of filesystem. | Available on Linux as locate. |
gpdf, gsview | X11 graphic display of PDF and PostScript. | Available on Linux as evince. |
gvim | Graphical version of VIM | Is available on Linux. |
host | DNS lookup. | Is available on Linux. |
hostin | Information about hosts and relative attributes. | ??? |
hostinfo | DNS information and inventory data. | ??? |
lc | Nice format of directory contents. | ??? |
logsum | Summarizes system mail sent to root. | ??? |
lsof | Lists open files (debugging and security). | Is available on Linux. |
prsh | Runs command in parallel on multiple hosts. | ??? |
ssh, ping, traceroute, netstat, ntpdate | Various handy programs. | All available on Linux. |
truss | Debugging utility. | Linux has strace and ltrace. |
vi, cron, sort | Various handy programs. | All available on Linux. |
wget | Quick download of web data. | Available on Linux. |
Some people also suggested packages that they thought our clients might need.
Program | Purpose | Future |
---|---|---|
R, maple, matlab, sage, sas | Required for teaching. Packaged using Xhier. | ??? |
buildbot, gsl, mpitch, splus, stata, twisted, xanim, zope | Required for teaching. Packaged using Xhier. | ??? |
Responses have been received from Chris R., Jim, Lori, Ray, and Robyn.
For completeness and to eliminate personal bias, the tables above include all submissions, even those that might have immediately been omitted.
(More investigation is needed into why some of these packages and programs are needed.)
Many questions remain unanswered. Some are technical, requiring more investigation, while others are more a matter of policy and direction.
Once we have a better understanding of where we hope to be in a year or two, we can start working toward that goal.
A good place to start would be to construct a small environment of say one central server and a separate region of two hosts; to build and install our essential software on them; and to ensure that configuration, resource management, accounting, etc. all work as required.
Once that is working to our satisfaction, we can add more hosts into this new world, and phase out our current environment.
We already administer several Linux machines without the benefit of Xhier. Perhaps we could start building a new environment using them, eventually possibly incorporating resource management software that Christopher is developing.
Most xhiered source can be built with a one-line
xh-imakefile
for the package.
Similarly each program could have a one-line file
specifying the name of the program, its language,
and its intended use (e.g. general users or system maintainers),
often with additional lines specifying required
libraries or other compile-time values.
These xh-imakefiles are completely portable
across all supported architectures
(currently Solaris, IRIX, several versions of Linux, and MacOS).
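To make the idea concrete, here is a minimal Python sketch of a one-line program description being expanded into a platform-specific compile command. It is only an illustration: the `NAME LANGUAGE AUDIENCE [LIB...]` line format, the `ProgramSpec` type, and the `COMPILERS` table are all hypothetical and are not xh-imakefile syntax or any part of Xhier.

```python
from dataclasses import dataclass, field

# Hypothetical per-platform compiler table; a real site would configure this centrally.
COMPILERS = {
    "solaris": {"c": "cc", "c++": "CC"},
    "linux": {"c": "gcc", "c++": "g++"},
}

# Source-file suffix for each language handled by this sketch.
SUFFIXES = {"c": ".c", "c++": ".cc"}


@dataclass
class ProgramSpec:
    name: str
    language: str
    audience: str  # e.g. "users" or "maintainers"
    libraries: list = field(default_factory=list)


def parse_spec(line):
    """Parse a hypothetical one-line description: NAME LANGUAGE AUDIENCE [LIB...]."""
    name, language, audience, *libraries = line.split()
    return ProgramSpec(name, language, audience, libraries)


def compile_command(spec, platform):
    """Expand the portable description into an argv for the given platform."""
    compiler = COMPILERS[platform][spec.language]
    source = spec.name + SUFFIXES[spec.language]
    return [compiler, "-o", spec.name, source] + ["-l" + lib for lib in spec.libraries]


# The same one-line description yields a different command on each platform.
spec = parse_spec("hello c users m")
for platform in ("solaris", "linux"):
    print(platform, " ".join(compile_command(spec, platform)))
```

The point of the design is that everything platform-specific lives in one shared table, so the per-program description never has to change when an architecture is added.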
I'm not aware of any open-source or commercial product
that provides this capability.
Web searches turn up similar requests,
but the answers are generally
"No, build by hand using the vendor's native environment."
See examples below (emphasis mine).
It is possible, but for each individual package (and perhaps command) one would have to write a very package-specific and OS-specific configuration file. And each of the thousands of source files might need to be tailored to handle the various OS-specific differences.
Even within a specific architecture, such as Linux,
and even for pre-built software, there isn't any one
standard method of performing installation and updating.
E.g. RedHat has yum, Ubuntu has apt-get,
SuSE has zypper.
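As a small illustration of what this lack of a standard means in practice, the following Python sketch picks an install command according to which of the three tools is present on the host. The function name `install_command` and its `have` parameter are assumptions made for this example; only the three command names and their non-interactive install options are taken from the tools themselves.

```python
import shutil

# Install-command prefixes for the three tools named above.
INSTALLERS = {
    "yum": ["yum", "-y", "install"],
    "apt-get": ["apt-get", "-y", "install"],
    "zypper": ["zypper", "--non-interactive", "install"],
}


def install_command(package, have=shutil.which):
    """Return the argv that would install `package` with whichever tool is present.

    `have` is a lookup function (by default shutil.which) so that the
    selection logic can be exercised without the tools being installed.
    """
    for tool, prefix in INSTALLERS.items():
        if have(tool):
            return prefix + [package]
    raise RuntimeError("no known package manager found")
```

Even this hides a further difference: the same software often has a different package name on each distribution, so a real wrapper would also need a per-distribution name table.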
I know it's difficult to prove non-existence, but I'd say there simply is nothing approaching the simplicity or power of xhier's generic build-and-install xh-imakefile.
Below are typical responses to the question of using Puppet or CFEngine to maintain and build source code.
How can I use Puppet to build from source? (asked Nov 13 '12)
…
I have a webserver and I want to download, unpack, configure and compile and install apache. How can I get Puppet to do this for me only once?
…
answered Nov 28 '12
…
While you could have a series of exec{} resources that each check that the commands need to be run have been run and build relationships between them for ordering, you do not want to go down that road. All software installed on a system should be done through that OS's packaging software, not through compilation. Then you can just use a package{} resource. This also gives you the benefit of leveraging the packaging software that acts as a source of truth for installed packages and generally knows the files on disk for each package.
…
answered Dec 18 '12
Using native packages is almost always the sanest way to go, as other answers suggest. That said, Puppet as a framework is capable of supporting build-from-source style application deployments.
A generic stub of what a defined type for building from source might look like: https://gist.github.com/2597027
The gist is unfinished and most notably does not include "unless" statements or refreshonly parameters, but gets the idea across in a simple way.
Can CFEngine track packages installed from source (RedHat)
So I have searched this group as well as google and trudged through a portion of the documentation but I am unclear if CFEngine can track and report packages which are installed from source
…
CFEngine leverages native packaging systems. Source code has literally no uniformity or anything even remotely resembling standardization.
That being said, CFEngine can very easily manage source based packaging systems (e.g., BSD ports, Homebrew). If you can find (or devise) a standard convention for source based packages then CFEngine can manage it for you.
However, I would strongly urge you not to. Creating native packages on most systems is not that hard. And there's always fpm.
…
(From an August 1 meeting with Mike Borkowski, Jason Gorrie, and Shawn Winnington-Ball.)
Things are generally frozen as far as Xhier is concerned. Current packages and architectures will continue to be supported, but there is no development and little maintenance. Demand for the service is expected to continue to decline.
In the non-Xhiered world, IST mostly uses CFEngine, but they believe that Puppet is more useful and are moving toward using that instead.
Puppet is good at providing configuration and consistency checks on a large number of machines.
Rather than distributing software updates, Puppet reports where native software needs to be installed or updated. This process does not scale the way Xhier does, and so requires more individual attention for each machine. The number of supported machines is not expected to increase significantly, however, and in many cases massive updates can be made by periodic re-imaging and, where necessary, letting the owners redo their own customization.
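The report-only workflow described above can be pictured with a minimal Python sketch. It is not Puppet code and says nothing about how Puppet is implemented; the function name `drift_report`, the host names, and the version numbers are all hypothetical.

```python
def drift_report(desired, installed_by_host):
    """Return {host: {package: (installed_version_or_None, desired_version)}}.

    Only hosts with at least one missing or out-of-date package appear.
    Nothing is installed; the result is purely a report for a maintainer.
    """
    report = {}
    for host, installed in sorted(installed_by_host.items()):
        stale = {
            pkg: (installed.get(pkg), want)
            for pkg, want in desired.items()
            if installed.get(pkg) != want
        }
        if stale:
            report[host] = stale
    return report


# Hypothetical desired state and per-host inventories.
desired = {"perl": "5.16", "nmap": "6.00"}
installed_by_host = {
    "hostA": {"perl": "5.16", "nmap": "6.00"},
    "hostB": {"perl": "5.14"},
}

for host, stale in drift_report(desired, installed_by_host).items():
    for pkg, (have, want) in sorted(stale.items()):
        print(f"{host}: {pkg} has {have or 'nothing'}, wants {want}")
```

Each reported line still has to be acted on per machine, which is the scaling concern raised in the meeting.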
Unlike Xhier, Puppet does not provide an administration hierarchy. This is not a problem for IST itself, as it does not currently make use of the administration and region hierarchy that Xhier provides. Individual units, such as MFCF, or within it IQC, would have to run their own Puppet server, but could make use of modules provided by IST.
IST doesn't have much locally written software, and what they do have is distributed to only a small number of machines, all of the same architecture. Losing Xhier's ability to maintain portable source for multiple architectures will not be a significant issue for them.
(From an August 2 meeting with Fraser Gunn.)
Like MFCF, CSCF cares about this issue, but has fuzzy plans, and isn't doing much about it yet.
They would miss Xhier's ability to provide multiple versions of the same package, independently configured, on the same machine. This can't be done easily with many Linux packages. Xhier is also much better at ensuring that packages are properly installed before it makes changes to the system configuration.
Unlike MFCF, CSCF is well on the way to moving to a homogeneous environment with only one architecture. They are also arranging that their new packages are created in Debian package format.
Having everything on Debian/Ubuntu eliminates the need for portable code, though it does mean that they may end up with a lot of architecture-specific code that will eventually make moving to, or incorporating, any other system an expensive task.
Much of the need for many xhiered packages has gone away over the years due to improvements in vendor software. Previously it was very inconsistent from one architecture to another, was lacking in features (which had to be added locally), and was often buggy, with slow or nonexistent response for fixes.
But in terms of xhiered software, they are still very dependent on some locally written packages, such as the accounts management system.
CSCF has experimented with CFEngine and Bcfg2, but doesn't really like them. Puppet seems like a much more promising system, especially if used with Git for content management.