HEPiX TRIUMF Meeting Report

Minutes of the Meeting held April 10th - 12th, 1996

  1. Introduction

  2. Site Reports

  3. Experience with a Sendmail Substitute by R.Lauer/Yale
    The speaker had been investigating how to integrate various mail systems in a simple, easy and secure way. She started from a site where VMS currently acted as the mail host, making use of the VMS port of PMDF for Internet mail. Even UNIX users used VMS to send and receive mail.

    She wanted to move this mail hub function to UNIX.

    There appeared to be two options - PMDF on UNIX or sendmail.

    While sendmail was a single process running setuid root, PMDF was a set of cooperating processes, all but one running setuid pmdf. PMDF had many programs queuing and de-queuing messages, with a daemon-like control job scheduling the tasks to be done. The server was SMTP-compatible, multi-threaded and had both POP and IMAP support. In the speaker's opinion, this multi-process structure made it easier to debug problems such as lost mail.

    Another advantage she felt PMDF had over sendmail was that its configuration files were more modular and easier to understand than sendmail's single monolithic configuration file and its equally complicated alias file. For a full list of the differences, see the overheads at //osf.physics.yale.edu/www/hepix/yale_pmdf.ps.

    Her conclusions were that sendmail lacked some features found in PMDF and that the latter offered a better and easier solution.

  4. Certification of UNIX Operating System Releases by A.Silverman/CERN
    The UNIX support team at CERN had decided to implement a scheme to formally certify particular releases of the various UNIX operating systems which they supported. Releases would be "approved" in that both they and a list of CERN-recommended products were considered to be in a properly-working state. This should help users decide which version of an operating system to choose and when to think about updating an existing installation. Users would be given a level of expectation regarding support if they chose a particular release, and finally it would also help introduce some methodology into the way that software was released to users.

    It was accepted that some users did not need or did not want to be restricted to versions the support team recommended but full support would be concentrated on those users who followed the recommendations. It was also accepted that system updates were disturbing and that users must be given some flexibility on when they chose to upgrade.

    Operating system releases would move through the stages of being NEW (under test), PRO (the current recommendation), OLD (the previous recommendation but still supported) and DORMANT (no longer recommended or supported). Certification of a release included building and testing the internal tools used by the support team, for example SUE described in previous HEPiX meetings (Saclay and Prague), plus a number of public domain packages in ASIS (see also previous HEPiX meetings at SLAC and NIKHEF) considered necessary for the CERN UNIX environment, and of course the current release of the CERN Program Library. See URL http://wwwcn.cern.ch/hepix/www/meetings.html for an index to previous meetings.
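
    Purely as an illustrative sketch of the lifecycle described above (the stage names come from the talk; the release names and the rule that promoting a new release demotes the existing ones are assumptions for the example), the scheme can be pictured as a small state machine:

        # Illustrative sketch of the certification lifecycle described above.
        # The stages NEW, PRO, OLD and DORMANT come from the talk; the release
        # names and the promotion/demotion rule are assumptions for the example.

        class ReleaseTracker:
            """Track which release of one operating system is in which stage."""

            def __init__(self, platform):
                self.platform = platform
                self.stage = {}                        # release name -> stage

            def start_certification(self, release):
                """A release under test enters the NEW stage."""
                self.stage[release] = "NEW"

            def certify(self, release):
                """Promote a release to PRO, demoting the older ones."""
                for rel, st in self.stage.items():
                    if st == "PRO":
                        self.stage[rel] = "OLD"        # still supported
                    elif st == "OLD":
                        self.stage[rel] = "DORMANT"    # no longer supported
                self.stage[release] = "PRO"            # current recommendation

        tracker = ReleaseTracker("ExampleOS")          # hypothetical platform
        tracker.start_certification("1.0")
        tracker.certify("1.0")
        tracker.start_certification("1.1")
        tracker.certify("1.1")
        print(tracker.stage)    # {'1.0': 'OLD', '1.1': 'PRO'}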

    A major concern, still partly unresolved, was what to do about vendor-supplied patches. Should they be added "on the fly" or should they constitute a new release? Given the overhead of new releases on CERN staff performing the certification and on users "encouraged" to update to the latest release, it had been decided not to make a new release for patches unless they were deemed critical to system running. Of course, patches which could be of use to some users would be made available for individual system managers to install locally. Taken together, the target was to make no more than 2 certified operating system releases per year per platform.

  5. LINUX by Bob Bruen/MIT
    LINUX was described as a superset of POSIX running on PCs as well as Alpha and other chips. It had one main author but there were many people contributing their own drivers and applications. It was available with full source code. Various benchmarks and comparisons were presented, including comparisons of LINUX with both FreeBSD and Solaris running on Intel.

    The speaker gave lots of references and URLs for further information as well as where to obtain lots of useful software for LINUX. Of course there were already lots of books and magazine articles on the topic and various companies now offered commercial services for it.

    At the present time, LINUX was undergoing several evaluations, such as its use in high performance systems, in firewalls, etc. Its advantages, which included its zero cost, community support, etc., were compared to its perceived drawbacks - no central authority for control or support, and too many versions.

    In the discussion that followed, members of the audience discussed the merits and dangers of having the source code easily available and the risks and possibilities offered by being able to modify the source. It was stated that support for MP systems was coming and some people voiced the opinion that a PC plus LINUX was a good competitor to a RISC system running commercial UNIX.

  6. Report of the HEPiX AFS Working Group by S.Hanson/FNAL
    The HEPiX AFS Working Group met for the third time at the recent DECORUM (AFS Users) Meeting. A review was given of the topics treated to date (information gathering, establishing a repository of useful software, information exchange, lobbying of Transarc). The group felt it had achieved its objectives, although it accepted that the last of these activities had probably been less successful than had been hoped for. In return, Transarc had informed the Working Group of some of its future plans for AFS and for migration towards DFS.

    The Working Group heard a report from a representative of ESnet. It also decided to increase its lobbying of Transarc, pleading especially for more stability, as well as faster support for new operating system releases coming from vendors, better customer support and continuing support of older client versions.

    The original Working Group chairman, Matt Wicks, had moved to a new position and had therefore resigned from the group. Rainer Toebbicke of CERN was appointed as acting chairman. The group agreed to spend the next 6 months looking at DFS migration planning and agreed to meet at the next HEPiX meeting (Rome) to decide on its future status and directions. A list of DFS tasks had been drawn up (see the speaker's overheads at http://www-oss.fnal.gov:8000/hanson/docs/HEPIX/afswg.ps). The full minutes of the Working Group can be found at URL http://www.in2p3.fr/hepix-afs/0296/.

  7. Planning for Data Storage and Batch Farming at CEBAF by R. Chambers
    CEBAF were in a planning phase for future data storage and batch facilities. The design goals were, for the most prolific experiment, 10KB event size, 10 MBps sustained rate, 1TB per day. They wished to store one year's worth of data online as well as corresponding Monte Carlo data, some 300TB in all. They desired to automate where possible and forego the use of central operators. A central data silo was planned.
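
    The quoted figures are roughly self-consistent; as a minimal arithmetic check (assuming, for simplicity, that the 10 MBps sustained rate applies around the clock):

        # Rough consistency check of the planning figures quoted above, assuming
        # for simplicity that the 10 MBps sustained rate applies around the clock.

        event_size = 10e3                     # bytes per event (10 KB)
        rate       = 10e6                     # bytes per second sustained (10 MBps)

        events_per_second = rate / event_size           # 1000 events per second
        bytes_per_day     = rate * 86400                # about 0.86 TB, i.e. the
                                                        # "1 TB per day" quoted
        print(f"{events_per_second:.0f} events/s, {bytes_per_day / 1e12:.2f} TB/day")
        # A year of running at roughly this rate, plus the corresponding Monte
        # Carlo data, is of the order of the 300 TB to be kept online.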

    As the central data silo, they had selected an STK4410 system with 2 Redwood drives and OSM as the storage software. The reasons for this choice were explained in some detail and they noted that they had benefited from the work at DESY on OSM.

    Networking was part of the study and Cisco routers were being installed to permit Fast Ethernet for the backbone, since they had decided after tests that it was too early for ATM. However, they would keep watching ATM for the future.

    Meanwhile, to accommodate the data rates from the Hall B experiment (see above) they were planning to use fibre-connected RAID discs from SUN over a 700 metre fibre with local buffering in the Hall for interruptions. Tests of this were currently underway and they had seen 7.5MBps without OSM in use and 7.2MBps with it. Transfers from memory to Redwood drives had been measured at 11.1MBps.

    LSF had been selected as the batch system after an evaluation of both it and Loadleveler. Again, the reasons for the choice were explained. Estimates of the required batch capacity were already around 10K SPECint92 units for this year.

    Finally, the speaker drew some lessons she and her colleagues had learned from the whole planning exercise and stressed the importance of finding common solutions.

  8. Roundtable discussion on disk-space management led by John O'Neall/IN2P3
    John introduced the subject by pointing out that the more users converted to UNIX, the more disc space and the more servers would need to be managed, and that more client systems implied more distributed access. There were many possible approaches.

    He showed a few slides to illustrate what happened at IN2P3, a combination of AFS with its ACLs and delegation of group management. Corrie Kost reported that TRIUMF performed minimal support in this area, considering that the cost of buying more disc space as it was required was less than the cost of space management. However, this method was not without problems - file backup, searching for specific data, etc.
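
    For readers unfamiliar with the AFS approach mentioned for IN2P3, the sketch below shows the kind of ACL setting that delegates space management within a project directory to a group administrator. It assumes an AFS client with the standard fs command installed; the cell, directory and group names are invented for illustration.

        # Minimal sketch of AFS ACL delegation using the standard "fs" command.
        # Cell, directory and group names below are invented for illustration.
        import subprocess

        project_dir = "/afs/example.org/project/exp1"   # hypothetical project area
        admin_group = "exp1:admins"                     # hypothetical pts group

        # Give the group's administrators full rights on the directory, so that
        # they, rather than the central support team, manage who uses the space.
        subprocess.run(["fs", "setacl", project_dir, admin_group, "rlidwka"],
                       check=True)

        # Show the resulting access control list.
        subprocess.run(["fs", "listacl", project_dir], check=True)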

    P.Micout noted that DAPNIA used UNIX quotas in some places to control file space growth. CEBAF users had scratch areas as well as user quotas. Also CEBAF groups could buy their own group disc space. Yale today used quotas for accounting and SLAC had a number of accounting tools.

    FNAL used AFS quotas, as did CERN, which had continued the VM policy of a group administrator and project space; the CERN batch scheme (CORE) made use of a very large stage pool. RAL reported that they used staging but had implemented a fair-share staging scheme.

    The discussion then turned to space management: IN2P3 said that they normally overbooked file space by some 200%; CERN, RAL and SLAC offered file archiving facilities. It was noted that users often did not understand the difference between file backup and archiving, nor for how long saved files were guaranteed.

    A number of labs had looked at, or were looking at, HSM techniques - for example DESY had experimented with OSM, CEBAF had chosen this (see above) and SLAC were looking at it while currently using ADSM. ADSM was also being evaluated by CERN and IN2P3 for its HSM-type facilities. After an evaluation CERN had rejected OSM because of a number of problems, including concern for its future directions. Some of these problems had also been seen by CEBAF but the latter were optimistic that a fix would be provided by the vendor. FNAL used Unitree in one project. Several labs had looked at HPSS (e.g. CEBAF) but judged it not ready yet, as well as being very expensive.

  9. The Use of Wincenter from Unix workstations by Pierrick Micout/DAPNIA
    The French CEA organisation made heavy use of Windows 3.11, especially in administration, although this was gradually migrating to Windows NT. On the other hand, scientists used almost exclusively UNIX, often via an X terminal. The speaker wanted to bridge the two worlds, for example to prepare overhead foils with a good tool from his UNIX system.

    From the many products available (see the next talk), DAPNIA had chosen WinCenter and had installed a 15-user licence on a Pentium 133 MHz configuration. They had gone through several beta releases of the software and were now running a production version.

    Dedicated access to a number of Windows applications was available but usage remained low. Could their configuration really support 15 users? For this reason, tests were progressing slowly.

    TRIUMF were using the same product. Their configuration was larger, especially in memory, and much more popular; there were already some 40 accounts, mostly accessed from X terminals. TRIUMF understood that the vendor, NCD, was adding support for NT clusters as servers. TRIUMF also reported virtually no special network load, and current releases included support for Novell NetWare and for AppleTalk.

    RAL were running a WinDD service on a twin-CPU NT server and had seen up to 12 users without problems. It supported both NetWare and Windows 95.

  10. Comparing various methods to access Windows tool by A.Silverman/CERN
    As the previous speaker had explained, there existed on the market a number of tools to permit the users of UNIX workstations and, in some cases, X terminals to obtain occasional access to Windows applications. CERN had performed a comparative test of some of these and collected information on others.

    The oldest tool was SoftPC but it was very slow due to its purely emulation-based method and seemed to be rapidly falling out of favour. WABI and SoftWindows were similar to each other: they ran locally on a UNIX station, trapped and translated Windows calls into X11 calls and emulated the rest of the application. Thus, they were CPU-limited and not much faster than SoftPC. WABI suffered also from the fact that it did not include any Microsoft licence and was consequently limited in the number of applications it could support (fewer than 30 today). For individual users, however, both offered possibilities.

    The last 3 tools considered all ran on a PC server. Tektronix WinDD was the oldest by some months but used a proprietary protocol best understood by Tektronix devices. The HP Application Server 500 was based on SCO UNIX on the server and WinCenterPro was based on Intel/NT. For any kind of general service, these all offered their own advantages and drawbacks.

    The summary, expressed in a table, was that a new user wishing to access both Windows and UNIX should buy a PC and a PC X interface tool, while existing UNIX users should consider one of the tools listed depending on their situation, for example single user, general service, particular application needed, etc.

  11. Report of the CERN Connectivity Working Group by A.Miotto
    This working group, set up in the framework of the migration of CERN users off CERNVM, had studied, among many other possibilities, access to UNIX systems from other UNIX systems and from X terminals, both in the presence and absence of AFS at each end and both with access from on and from off the CERN site. Not all accesses were possible but they had as a target to produce a simple recipe for users to be able to start a remote interactive session or establish an X11 session.

    In fact, they had found that telnet worked in all cases since there were no variables to pass (e.g. DISPLAY, magic cookie), but passwords crossed the network in clear. The tools rsh and rlogin removed this drawback but were blocked by the CERN Internet firewall; the non-AFS client to AFS server case did not work; and even between AFS clients, rlogin failed in some cases.

    xrsh and xlogin worked better, the only restrictions found being from a non-AFS client to an AFS server and some traffic stopped by the CERN Internet firewall. In addition, the CERN security team recommended the use of mxconns for added security; it created a virtual X server whose owner was informed, for example, of new connections.

    The ideal requirement was to pass across the network the DISPLAY variable, the X11 Magic Cookie and the AFS token but not the password, unless encrypted. For non-AFS clients, this meant distributing and installing locally the token granting scheme used by AFS. Currently, investigations were being carried out on xrsh + arc (token scheme from R.Toebbicke) and the ssh (secure shell) product. ssh encrypted all network traffic, including X11 but required extra work for AFS support.

    Arc used Kerberos, which would need to be installed locally on a non-AFS node, non-trivial in many cases (for example, the local login and other images needed to be replaced to grant the token). The current status was that arc was good for internal use only and more work should be done on ssh. It should be modified to pass the three items listed above (DISPLAY, Magic Cookie and AFS token). Encryption was not thought important, at least inside CERN where mxconns gave enough security. Outside CERN, use of encryption was not legal in some countries without permission (e.g. France).
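
    As a very rough illustration of the first two of those three items (the DISPLAY variable and the X11 magic cookie), the sketch below copies the local cookie to a remote host with xauth over ssh and starts a remote X client with DISPLAY set. The host names are invented, the AFS token is not handled, and this is not the actual xrsh or arc implementation, simply the general technique.

        # Hedged sketch: propagate DISPLAY and the X11 magic cookie to a remote
        # host in the style of xrsh.  Host names are invented; the AFS token is
        # not handled and this is not the xrsh or arc implementation itself.
        import subprocess

        local_display  = "mypc.example.org:0"       # hypothetical local X display
        remote_host    = "remote.example.org"       # hypothetical remote UNIX host
        remote_command = "xterm"                    # X client to start remotely

        # Extract the magic cookie entry for the local display...
        cookie = subprocess.run(["xauth", "list", local_display],
                                capture_output=True, text=True,
                                check=True).stdout.strip()

        # ...register it in the remote .Xauthority and start the client with
        # DISPLAY pointing back at the local display.  ssh handles the login,
        # so no password crosses the network in clear.
        remote_script = (f"xauth add {cookie} && "
                         f"DISPLAY={local_display} {remote_command}")
        subprocess.run(["ssh", remote_host, remote_script], check=True)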

    [More information about the activities of this working group can be found on the web at URL http://wwwcn.cern.ch/umtf/meetings/net/]

  12. Experience with mbone videoconferencing by R.Lauer/Yale
    It was generally agreed that MBONE videoconferencing coordination needed to be centralised within a site, but the speaker had found by hard experience that installing the tools locally and making them work was not easy. Videoconferencing could be done either with special, usually expensive, equipment in a central facility (e.g. CODEC) or on a private workstation without special add-ons (e.g. Mbone). Mbone stood for Multicast Backbone; it was a virtual network running on multicast routers inside the Internet. Groups of receivers could enter or exit a running conference at will.

    Although not much was required to start accessing such videoconferencing, there were many options which could be tuned and a high speed link was essential. Among the problems met by the speaker were the lack of good documentation and the difficulty of chasing down the tools themselves, at least on the hardware platform with which she was concerned. In particular, there was no package of such tools, only individual constituents. Thus, although the tools were popular, making them work together on a given platform was not easy.

    Ms.Lauer gave a list of useful references she had found and a review of her experiences with the individual tools used. S.Hanson from FNAL stated that they had written a procedure to start the main video conferencing tools automatically and other labs were encouraged to start from there.
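
    The FNAL procedure itself was not shown; purely as an illustration of what such a start-up wrapper might look like, the sketch below launches the common MBONE audio and video tools (vat and vic) on a session address. The multicast address, port numbers and the assumption that these tools are installed and on the path are all part of the example, not of the talk.

        # Hypothetical start-up wrapper for common MBONE conferencing tools, in
        # the spirit of the FNAL procedure mentioned above (which was not shown).
        # The multicast address, ports and installed tool set are assumptions.
        import subprocess

        session_addr = "224.2.0.1"     # hypothetical multicast group for the session
        video_port   = 51482           # hypothetical port for video (vic)
        audio_port   = 3456            # hypothetical port for audio (vat)

        tools = [
            ["vic", f"{session_addr}/{video_port}"],    # video tool
            ["vat", f"{session_addr}/{audio_port}"],    # audio tool
        ]

        # Start both tools and wait for them to exit.
        procs = [subprocess.Popen(cmd) for cmd in tools]
        for p in procs:
            p.wait()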

    A number of current problems were also noted.

    In the discussion that followed, the point was made that Mbone should not be attempted on an X terminal as it generated very high X traffic, and good sound support, sometimes any sound support, was generally missing on these devices (modern NCD terminals were an exception).

  13. Use of GNATS by A.Lovell/CERN
    Gnats is a problem-report management system in the public domain. The speaker described its main features and the various utilities in the package. He illustrated the talk with actual pages from the CERN Gnats database, showing the problem-report fields filled in by the user (or person submitting the report), those generated by Gnats itself and those which the maintainer or person answering the problem should complete.
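
    To make the three classes of fields concrete, the sketch below groups the usual Gnats problem-report fields according to who completes them, following the description above; the example values are invented.

        # Illustrative grouping of Gnats problem-report fields by who completes
        # them, following the description above.  The example values are invented.

        submitter_fields = {          # filled in by the person submitting the report
            ">Category":      "printing",
            ">Synopsis":      "jobs sent to printer lw21 disappear",
            ">Severity":      "serious",
            ">Priority":      "medium",
            ">Class":         "sw-bug",
            ">Description":   "Jobs queued to lw21 never appear on the printer.",
            ">How-To-Repeat": "lpr -Plw21 test.ps",
        }

        gnats_fields = {              # generated by Gnats itself
            ">Number":        4711,
            ">Arrival-Date":  "Wed Apr 10 10:15:00 1996",
        }

        maintainer_fields = {         # completed by the person answering the problem
            ">Responsible":   "print-support",
            ">State":         "analyzed",
            ">Fix":           "Restart the print spooler on the server.",
        }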

    However, as delivered, Gnats was not considered user-friendly for the average user. CERN had taken the web interface developed at Caltech and made significant alterations for use at CERN. For example, although Gnats only supported a single flat file structure for storing problem reports, to make it usable by multiple CERN support teams a scheme of artificial domains was created, each with a standard web interface. These domains were in fact a single large domain, but by creating different web entry points this fact could be disguised, so that only problems relating to a specific field were shown to the corresponding support team or user community.

    Gnats had entered production last September and was currently receiving some 30 to 40 reports per week. It had amassed 37 categories of problems across its 7 domains. Studies had begun into the next version, using a client/server architecture. The new version also promised support for a database to store the problems.

    The speaker was urged to publicise his web interface for interested labs and sites.

  14. IMAP at Fermilab by S.Hanson/FNAL
    The goal was to implement distributed mail access, that is, access to mail from multiple clients, as opposed to delivery of mail to personal workstations, which raised questions of local backup of mail, keeping the stations running during vacations, etc. Access to mail from offsite was also desirable. Today, FNAL recommended the use of exmh as the user mail agent.

    Several options were looked at.

    IMAP was a powerful protocol with many interesting features but fewer user agents than POP today, although that situation promised to improve in the future. The most popular user agent was pine, although this was much too simple for a sizeable fraction of experienced UNIX users. IMAP had some interoperability problems but these too were expected to improve over time as the IMAP V4 standard firmed up.
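
    As a minimal illustration of what server-side mailbox access means in practice (the server name, account and credentials below are invented), the basic IMAP operations look like this from Python's standard imaplib; any IMAP-capable user agent such as pine performs equivalent operations:

        # Minimal illustration of IMAP server-side mailbox access using Python's
        # standard imaplib.  Server name, account and password are invented.
        import imaplib

        imap = imaplib.IMAP4("imap.example.org")    # hypothetical IMAP server
        imap.login("user", "password")              # placeholder credentials

        # Mail stays on the server, so every client sees the same folders, flags
        # and message state - the point of IMAP as opposed to POP delivery.
        imap.select("INBOX")
        status, data = imap.search(None, "UNSEEN")  # find unread messages
        print("unread message ids:", data[0].split())

        imap.logout()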

    There existed a so-called Cyrus IMAP project at CMU which was adding Kerberos user authentication to pine, support for ACLs, bulletin boards, Zephyr notification of new mail received and file quotas for file control.

    Fermilab's plan was to use IMAP to access FNAL newsgroups and mailboxes, to enable Zephyr notification of new mail, and to explore Kerberised clients and MIME and HTTP integration. A major obstacle in all this was the exmh recommendation, since the mh mail agent family only supported POP. The current central server used for the tests was a small SUN with a RAID 5 disc subsystem with failsafe disc mirroring. It would certainly be underpowered for an eventual production service.

  15. Tape Management System (TMS) at CERN by A.Maver/CERN
    The speaker described the evolution of TMS from its beginning in 1989 to today including its recent conversion to UNIX. It currently handled over 800K active tape volumes and made far easier the handling and tracking of tapes and statistics.

    It was organised by VID (virtual tape ID), with a series of one or more libraries, some of which might be archived, in different laboratories. Experiments could define their own libraries and move tapes as required, but tapes could only be mounted if they were in the VAULT or robot libraries. All tape moves had to be registered. Various utilities and TMS commands for use by administrators and by users were described.
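
    Purely as an illustrative model of the bookkeeping just described (not the actual TMS schema or command set), a volume record with its library and registered moves might look like this:

        # Illustrative model of the volume-oriented bookkeeping described above.
        # This is not the real TMS schema or command set; the VID, library and
        # group names are invented for the example.

        MOUNTABLE_LIBRARIES = {"VAULT", "ROBOT"}    # tapes elsewhere cannot be mounted

        class TapeVolume:
            def __init__(self, vid, library, owner_group):
                self.vid = vid                      # virtual tape ID
                self.library = library              # current library
                self.owner_group = owner_group
                self.moves = []                     # every move must be registered

            def move_to(self, new_library):
                """Register a move of the volume to another library."""
                self.moves.append((self.library, new_library))
                self.library = new_library

            def mountable(self):
                return self.library in MOUNTABLE_LIBRARIES

        vol = TapeVolume(vid="XY1234", library="ARCHIVE1", owner_group="exp1")
        print(vol.mountable())      # False: not in the VAULT or a robot
        vol.move_to("VAULT")
        print(vol.mountable())      # True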

    Tapes belonged to a group or a user, but the user was frequently a user ID created specially to manage the tapes and shared by several group administrators, rather than one individual.

    In the CERN implementation there was no protection against world read and the default was to permit all writes by group members although some users could be excluded and tapes could be "locked" by a user with write access. Access control lists similar to the UNIX protection scheme were used and group sharing was allowed. CERN extensions included defining generic names for media types and adding extra commands.

    The speaker closed with a summary of the advantages (better control of the tapes, regular statistics of tape use, easy to add new tapes) and its main limitations (all commands must pass through SYSREQ, volume-oriented rather than file-oriented). The "missing" queue handling from the UNIX SYSREQ implementation was replaced by having 10 TMS processes running. Today SYSREQ and TMS ran on separate nodes. There were plans in the future to consider merging the TMS keyword set as used by CERN and by IN2P3.

  16. SHIFT Tape Software in Saclay by P.Micout/Saclay
    The speaker described how he had installed CERN's SHIFT tape software at Saclay and why he had made some different choices. For example, with an automatic cartridge loader at his disposal, he preferred to add a cartridge instead of another disc when the need arose for more space. He had an IBM 3494 device with 210 slots and 2 Magstar drives. The candidates for control software for this device had included ADSM and EPOCH, but they had finally selected the SHIFT package, although he said that he had sometimes found some specific SHIFT choices hard to understand.

    The SHIFT tape server module had compiled without error but initially its interface to TMS had caused problems and this had eventually been turned off. Their Magstar (3590) drives had not been supported when a particular SHIFT release was installed but had been supported correctly in an earlier release.

    They had recently added Digital DLT robots but SHIFT did not support that model (a TL810), only the later version (the TL820). The current situation was that they believed they needed to install a tape management scheme; several options were under consideration.

  17. Roundtable discussion on Tape Interchange led by John Gordon/RAL
    The discussion was split into three sub-topics: an attempt to define a common interchange medium, tape formats, and tape labels.

  18. Batch Systems
Alan Silverman, 19 July 1996